Which version of MongoDB should we install? | |
Please install MongoDB 3.0.8. With newer versions of MongoDB, Robomongo cannot display the actual collections. |
Can we use MapReduce? | |
No. |
What is the identifier of a query? | |
The identifier is 'query_num'. It is best understood as a functional dependency: 'query_num' -> 'query_text'. |
Should I assume the database already has a default collection called 'documents'? | |
Yes. |
What file should be submitted: wet.sh or wet.js? | |
Either one. We accept both .js and .sh files. |
Can we use JS code to calculate the query cover in question 2? | |
Yes. Text manipulation may be done in JS. |
What are the features of MongoDB we may use during the assignment? | |
You may use the following: avg, min, max, first, last, multiply, divide, count, sum, and, or, as well as any other feature of MongoDB that is not part of MapReduce. |
Can we add additional fields to the 'documents' collection? | |
Yes, you may if it helps. Note that both collections must be free of helper fields after each part of the assignment. |
Can you give an example of the calculation [count(top-document words that appear in the document) / count(document words)]? | |
Yes. Suppose 'query_num' is 2 and the top-ranked document for that query, denoted TOP, has the field 'content': 'Hello hello'. Another document with the same 'query_num' (2), denoted DOC, has the field 'content': 'Hello hello Hello hello'. Iterating over the tokens of TOP, we count the following: the word 'Hello' appears in DOC.content (+1); the word 'hello' appears in DOC.content (+1). DOC.content contains 4 words, so sim_rank = 2/4 = 0.5. |
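The calculation above can be sketched in plain JS. This is only a minimal sketch, not the reference solution: the names `tokenize` and `simRank` are hypothetical, and it assumes that tokens are separated by whitespace, that matching is case-sensitive, and that every token of TOP is checked (repeated tokens included). It sticks to `var` and function expressions so that it should also run in older shells.

```javascript
// Hypothetical helper: split a text into tokens on whitespace (an assumption).
function tokenize(text) {
  return text.split(/\s+/).filter(function (t) { return t.length > 0; });
}

// Hypothetical helper: count the tokens of TOP that appear in DOC,
// then divide by the number of tokens in DOC.
function simRank(topContent, docContent) {
  var topTokens = tokenize(topContent);
  var docTokens = tokenize(docContent);
  if (docTokens.length === 0) {
    return 0; // assumption: an empty document gets rank 0
  }
  var hits = 0;
  topTokens.forEach(function (token) {
    if (docTokens.indexOf(token) !== -1) {
      hits += 1;
    }
  });
  return hits / docTokens.length;
}

// The example from the answer above: 2 matching tokens / 4 words in DOC.
var rank = simRank('Hello hello', 'Hello hello Hello hello'); // 0.5
```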
Is it possible that two different authors have the same content? | |
Yes. |
Which MongoDB functions may be used? | |
You may use any function of MongoDB except MapReduce. Note that all of the parts can be done with what you have learned in the tutorial. |
What can be done using JS? | |
You may use JS for the parts that call for text manipulation (e.g. q2 and q7). For example, you do not need MongoDB to split a text into tokens; you may use JS functions for that. |
How can I check that my script is well formed and will run without problems when it is tested? | |
Run mongo.exe and then type load("name_file_of_my_script"). Afterwards, you can inspect the results using Robomongo. |
When sorting by "sim_rank" descending, should it also be sorted by "first_name" and "last_name" ascending when sim_ranks are equal? | |
Yes. |
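In the mongo shell, this ordering can be expressed as one sort specification, for example `db.documents.find().sort({ sim_rank: -1, first_name: 1, last_name: 1 })`, assuming these fields live on the 'documents' collection. The snippet below is a plain-JS sketch of the same ordering for an in-memory array; the names `compareStrings` and `compareDocs` and the sample data are hypothetical.

```javascript
// Hypothetical helper: plain ascending string comparison.
function compareStrings(a, b) {
  if (a < b) { return -1; }
  if (a > b) { return 1; }
  return 0;
}

// Hypothetical comparator: sim_rank descending, then first_name and
// last_name ascending as tie-breakers.
function compareDocs(a, b) {
  if (a.sim_rank !== b.sim_rank) {
    return b.sim_rank - a.sim_rank; // higher sim_rank first
  }
  var byFirst = compareStrings(a.first_name, b.first_name);
  if (byFirst !== 0) {
    return byFirst;
  }
  return compareStrings(a.last_name, b.last_name);
}

// Made-up sample data, for illustration only.
var docs = [
  { first_name: 'Dana', last_name: 'Levi', sim_rank: 0.5 },
  { first_name: 'Avi', last_name: 'Cohen', sim_rank: 0.5 },
  { first_name: 'Avi', last_name: 'Bar', sim_rank: 0.5 },
  { first_name: 'Noa', last_name: 'Paz', sim_rank: 1 }
];
docs.sort(compareDocs);
var order = docs.map(function (d) { return d.first_name + ' ' + d.last_name; });
```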
Can we add additional collections, i.e. helper collections? | |
No. |
The script runs in Robomongo but not in mongo.exe. What should I do? | |
You can choose between the two. Both are accepted. |
Can you explain A or B? | |
Let's focus on one document, denoted DOC. A = [number of documents related to the same query] - 'current_position' + 1. For example, if [number of documents related to the same query] = 5 and 'current_position'(DOC) = 2, then A = 5 - 2 + 1 = 4. B differs from A only in that its score is induced by 'sim_rank'. |
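The formula for A can be checked with a one-line JS function. This sketch covers only A; the name `scoreA` and its parameter names are hypothetical.

```javascript
// Hypothetical helper:
// A = (number of documents related to the same query) - current_position + 1.
function scoreA(numDocsForQuery, currentPosition) {
  return numDocsForQuery - currentPosition + 1;
}

// The example from the answer above: 5 documents, current_position = 2.
var a = scoreA(5, 2); // 5 - 2 + 1 = 4
```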