Can we use the retweet_count field in the JSON to count retweets?
| You may not; only imported data should be counted. |
How are transitive retweets counted in hashtag popularity?
|
See the example below: tweet A says "I have a #hashtag", tweet B retweets A, and tweet C retweets B. In this case #hashtag has 2 retweets. |
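The example above can be sketched in code. This is only an illustration, not the required design: the `Tweet` class, its `retweetOf` field, and the `HashtagRetweets` helper are hypothetical names, and the sketch assumes a retweet chain is counted only when its original tweet is part of the imported data. Hashtag matching is a plain substring check, which a real solution would likely refine.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical minimal tweet: retweetOf is null for an original tweet. */
final class Tweet {
    final String id;
    final String text;
    final String retweetOf;

    Tweet(String id, String text, String retweetOf) {
        this.id = id;
        this.text = text;
        this.retweetOf = retweetOf;
    }
}

final class HashtagRetweets {
    /** Follows the retweet chain back as far as the imported data allows. */
    static Tweet rootOf(Tweet tweet, Map<String, Tweet> byId) {
        Tweet current = tweet;
        while (current.retweetOf != null && byId.containsKey(current.retweetOf)) {
            current = byId.get(current.retweetOf);
        }
        return current;
    }

    /** Counts direct and transitive retweets of originals containing the hashtag. */
    static int countRetweets(String hashtag, List<Tweet> imported) {
        Map<String, Tweet> byId = new HashMap<>();
        for (Tweet t : imported) {
            byId.put(t.id, t);
        }
        int count = 0;
        for (Tweet t : imported) {
            if (t.retweetOf == null) {
                continue; // an original tweet is not a retweet
            }
            Tweet root = rootOf(t, byId);
            // Assumption: chains whose original tweet was never imported are skipped.
            if (root.retweetOf == null && root.text.contains(hashtag)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Tweet> imported = Arrays.asList(
            new Tweet("A", "I have a #hashtag", null),
            new Tweet("B", "", "A"),
            new Tweet("C", "", "B"));
        System.out.println(countRetweets("#hashtag", imported)); // prints 2
    }
}
```

Both B (a direct retweet) and C (a retweet of a retweet) are attributed to A's hashtag, which gives the count of 2 from the example.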
Can we use subpackages of .impl and .api to place our code?
| Yes, you may |
Which part of our API will be exported to other teams in assignment 4?
| All source code in the .api package including its subpackages |
What is the expected size of the index our system should support?
|
Your system should support a million tweets on a machine with at most 4 GB of RAM. You can set the JVM's maximum heap size with the -Xmx flag and its initial heap size with the -Xms flag. |
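To confirm which heap limit the JVM actually picked up, a small check along the following lines may help. It is only a sketch: the `HeapCheck` class name is made up for illustration, and the values you pass to the heap flags (for example `java -Xmx3g HeapCheck`) are your own choice, not something prescribed here.

```java
final class HeapCheck {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects the upper bound the JVM will try to use (driven by -Xmx).
        System.out.println("max heap (MiB):       " + toMiB(rt.maxMemory()));
        // totalMemory() is the heap currently reserved; it starts near the -Xms value.
        System.out.println("committed heap (MiB): " + toMiB(rt.totalMemory()));
        // Used heap is the committed heap minus the free part of it.
        System.out.println("used heap (MiB):      " + toMiB(rt.totalMemory() - rt.freeMemory()));
    }
}
```

Printing the used heap after building the index gives a rough idea of how close the index is to the limit.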
What is the JSON field that signifies a retweet?
|
The field's name is 'retweeted_status'. Note that 'in_reply_to_status_id' deals with replies, not retweets. If in doubt, consult https://dev.twitter.com/docs/platform-objects/tweets |
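As a rough illustration of the distinction above, the check below treats a tweet as a retweet only when 'retweeted_status' is present. It assumes the tweet has already been parsed into a `Map<String, Object>` by whatever JSON library you use; the `RetweetCheck` class and that map representation are assumptions of this sketch, not part of the assignment API.

```java
import java.util.HashMap;
import java.util.Map;

final class RetweetCheck {
    /** Returns true when the parsed tweet carries a non-null 'retweeted_status' field. */
    static boolean isRetweet(Map<String, Object> tweet) {
        return tweet.get("retweeted_status") != null;
    }

    public static void main(String[] args) {
        Map<String, Object> reply = new HashMap<>();
        reply.put("in_reply_to_status_id", 123L); // a reply, not a retweet

        Map<String, Object> retweet = new HashMap<>();
        retweet.put("retweeted_status", new HashMap<String, Object>());

        System.out.println(isRetweet(reply));   // prints false
        System.out.println(isRetweet(retweet)); // prints true
    }
}
```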
What are the new time limitations for setupIndex?
| The new time limit is 10,000 ns per tweet, or 10 seconds for a million tweets. |
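The arithmetic behind that limit (10,000 ns × 1,000,000 tweets = 10,000,000,000 ns = 10 s) can be turned into a quick self-check. The `SetupBudget` class below is a hypothetical helper, and how you invoke your own setupIndex is left out; the sketch only shows one way to compare a measured time against the per-tweet budget.

```java
final class SetupBudget {
    /** Per-tweet budget stated in the FAQ answer above. */
    static final long NANOS_PER_TWEET = 10_000L;

    /** Total time budget, in nanoseconds, for indexing the given number of tweets. */
    static long budgetNanos(long tweetCount) {
        return NANOS_PER_TWEET * tweetCount;
    }

    /** True when the measured time fits within the budget for that many tweets. */
    static boolean withinBudget(long elapsedNanos, long tweetCount) {
        return elapsedNanos <= budgetNanos(tweetCount);
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // ... call your own index setup here ...
        long elapsed = System.nanoTime() - start;
        System.out.println("within budget: " + withinBudget(elapsed, 1_000_000L));
    }
}
```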
Is the input size counted towards the memory limitations?
|
No, it is not. Note that the JSON format is very verbose, and an input of 2 million tweets can already exceed the 4 GB limit. However, the size of your index (whichever part of it you load into memory) does count toward the memory limit, so make sure you store data efficiently. |
Should we treat the tweets embedded in retweeted_status as imported tweets?
| No |

