Q1: Indexing Statistics
Question 1
Enter the number of indexed documents
Answer:
Question 2
Enter the size of the vocabulary
Answer:
Question 3
Enter the number of tokens indexed
Answer:
Question 4
Enter the number of pointers
Answer:
Question 5
Enter the time it took Terrier to index the collection (in seconds)
Answer:

Q2: Simple TF*IDF
Question 6
Paste your Java method code for your Simple TF*IDF weighting model
public double score(Posting p) {
    // paste your code here
}
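For orientation only, here is a minimal stand-alone sketch of one common simple TF*IDF variant: raw term frequency multiplied by log10(N / df). It is an assumption, not the formula required by this exercise, which may specify a different logarithm base, smoothing or normalisation. The class name SimpleTfIdfSketch and its parameters are hypothetical and it does not use Terrier's WeightingModel API.

```java
/**
 * Hypothetical stand-alone sketch of a simple TF*IDF score for one query
 * term in one document. Illustrative only: not Terrier's WeightingModel
 * API and not necessarily the formula this exercise expects.
 */
final class SimpleTfIdfSketch {

    /**
     * @param tf      occurrences of the term in the document
     * @param df      number of documents that contain the term
     * @param numDocs total number of documents in the collection
     * @return tf * log10(numDocs / df), or 0 when the term is absent
     */
    static double score(double tf, double df, double numDocs) {
        if (tf <= 0 || df <= 0) {
            return 0.0; // term does not occur, so it contributes nothing
        }
        double idf = Math.log10(numDocs / df); // assumed base-10 logarithm
        return tf * idf;
    }

    public static void main(String[] args) {
        // 3 occurrences, term found in 10 of 1000 documents: 3 * log10(100)
        System.out.println(score(3, 10, 1000)); // prints 6.0
    }
}
```

This variant applies no document length normalisation, so a document's score grows linearly with the raw term frequency.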

Question 7
Consider a given document and two queries, namely:
• jury service
• jury service jury
Does your implemented Simple TF*IDF weighting model give the same score to this document for both queries? 
Select one:
True
False
Question 8
On your created index, run the following query using the Simple TF*IDF weighting scheme: jury service
What is the docno of the second ranked document returned by the system for this query?
Answer: 
Question 9
On your created index, run the following query using the PL2 weighting scheme: jury service
What is the docno of the second ranked document returned by the system for this query?

Answer: 
Question 10
Inspect the top retrieved documents by Simple TF*IDF and PL2 for the query: jury service
Explain the differences in the retrieved documents by both weighting models. 
Select one or more:
a. Simple TF*IDF favours longer documents
b. Simple TF*IDF will boost the documents with higher term frequencies of related terms
c. Simple TF*IDF will boost the documents with higher query term frequencies
d. Simple TF*IDF favours shorter documents

Q2: Vector Space TF*IDF
Question 11
Paste your Java method code for your Vector Space TF*IDF Model implementation.
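As a reminder of the underlying idea, the sketch below computes the cosine similarity between a query vector and a document vector, each held as a map from term to TF*IDF weight. It is a minimal illustration under stated assumptions: the class CosineSketch and its method names are hypothetical, it is independent of Terrier, and the exercise may define the term weights or the normalisation differently.

```java
import java.util.Map;

/**
 * Hypothetical sketch of vector space scoring: cosine similarity between
 * two sparse vectors stored as term -> weight maps. Not Terrier's API.
 */
final class CosineSketch {

    /** Cosine of the angle between the query and document vectors. */
    static double cosine(Map<String, Double> query, Map<String, Double> doc) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : query.entrySet()) {
            Double docWeight = doc.get(e.getKey());
            if (docWeight != null) {
                dot += e.getValue() * docWeight; // only shared terms contribute
            }
        }
        double norms = norm(query) * norm(doc);
        return norms == 0.0 ? 0.0 : dot / norms; // guard against empty vectors
    }

    /** Euclidean length of a sparse vector. */
    static double norm(Map<String, Double> vector) {
        double sumOfSquares = 0.0;
        for (double weight : vector.values()) {
            sumOfSquares += weight * weight;
        }
        return Math.sqrt(sumOfSquares);
    }
}
```

In this representation a repeated query term, as in "jury service jury", would show up as a larger weight for that term in the query map.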

Question 12
On your created index, run the following query using your Vector Space TF*IDF Model implementation:  
Consumer food advice
What is the docno of the second ranked document returned for this query?

Answer: 
Question 13
Consider the following two queries:
Query 1:  a history of American agriculture
Query 2:  american agriculture history in america
Using your Vector Space TF*IDF Model implementation, provide the top 2 ranked documents by the system for these two queries, separated by comma (without space), i.e.
Gxx-xx-xxxxxxx,Gyy-yy-yyyyyyy,Gaa-aa-aaaaaaa,Gbb-bb-bbbbbbb 
where Gxx-xx-xxxxxxx and Gyy-yy-yyyyyyy are the top two ranked documents for Query 1 and Gaa-aa-aaaaaaa and Gbb-bb-bbbbbbb are the top two ranked documents for Query 2.

Answer: 
Question 14
When instantiating a WeightingModel class in Terrier, what is the purpose of setEntryStatistics()?
Select one:
a. To provide information about the document length, which is important for normalisation
b. To inform the weighting model about the query, such as the length of the query
c. To inform the weighting model of the collection statistics, such as the number of documents
d. To inform the weighting model of the statistics of the term, such as the document frequency
Question 15
What is the term frequency of query term ‘jury’ in the 10th ranked document by your Vector Space Model implementation for the query: jury service
Answer:

Q3: Weighting Model Results (Simple TF*IDF)
Question 16
Enter the MAP performance of Simple TF*IDF on HP04 topics
Answer:
Question 17
Enter the MAP performance of Simple TF*IDF on NP04 topics
Answer:
Question 18
Enter the MAP performance of Simple TF*IDF on TD04 topics
Answer:
Question 19
Enter the average MAP performance of Simple TF*IDF across the HP04, NP04 and TD04 topics
Answer:
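The MAP figures requested in this part are normally read from the output of the evaluation tool. To make clear what the number means, the following sketch computes average precision for one ranked list and a plain mean over several values. The class MapSketch and its methods are hypothetical, and treating the "average MAP" across HP04, NP04 and TD04 as an unweighted mean of the three MAP values is an assumption, since the question does not define it.

```java
import java.util.List;
import java.util.Set;

/**
 * Hypothetical sketch of average precision (AP) and of a plain mean,
 * which gives MAP when applied to the AP values of a topic set.
 */
final class MapSketch {

    /** AP: mean of the precision values at each rank holding a relevant document. */
    static double averagePrecision(List<String> ranking, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        double sum = 0.0;
        for (int i = 0; i < ranking.size(); i++) {
            if (relevant.contains(ranking.get(i))) {
                hits++;
                sum += (double) hits / (i + 1); // precision at this rank
            }
        }
        // Relevant documents that were never retrieved count as zero precision.
        return sum / relevant.size();
    }

    /** Unweighted mean, e.g. of per-query AP values or of per-topic-set MAPs. */
    static double mean(double... values) {
        if (values.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }
}
```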

Q3: Weighting Model Results (Vector Space TF*IDF)
Question 20
Enter the MAP performance of Vector Space TF*IDF on HP04 topics
Answer:
Question 21
Enter the MAP performance of Vector Space TF*IDF on NP04 topics
Answer:
Question 22
Enter the MAP performance of Vector Space TF*IDF on TD04 topics
Answer:
Question 23
Enter the average MAP performance of Vector Space TF*IDF across the HP04, NP04 and TD04 topics
Answer:
Q3: Weighting Model Results (Terrier TF*IDF)
Question 24
Enter the MAP performance of Terrier’s TF*IDF on HP04 topics
Answer:
Question 25
Enter the MAP performance of Terrier’s TF*IDF on NP04 topics
Answer:
Question 26
Enter the MAP performance of Terrier’s TF*IDF on TD04 topics
Answer:
Question 27
Enter the average MAP performance of Terrier’s TF*IDF across the HP04, NP04 and TD04 topics
Answer:

Q3: Weighting Model Results (BM25)
Question 28
Enter the MAP performance of BM25 on HP04 topics
Answer:
Question 29
Enter the MAP performance of BM25 on NP04 topics
Answer:
Question 30
Enter the MAP performance of BM25 on TD04 topics
Answer:
Question 31
Enter the average MAP performance of BM25 across the HP04, NP04 and TD04 topics
Answer:

Q3: Weighting Model Results (PL2)
Question 32
Enter the MAP performance of PL2 on HP04 topics
Answer:
Question 33
Enter the MAP performance of PL2 on NP04 topics
Answer:
Question 34
Enter the MAP performance of PL2 on TD04 topics
Answer:
Question 35
Enter the average MAP performance of PL2 across the HP04, NP04 and TD04 topics
Answer:

Q3: Recall-Precision Graphs
Question 36
Upload your 3 Recall-Precision graphs, one for each of the HP04, NP04 and TD04 topic sets. Use a single PDF document to show the three graphs.
Maximum file size: 230MB, maximum number of files: 1
Question 37
Enter the precision of PL2 on the Homepage Finding Task (HP04) at the interpolated 0.2 Recall.
Answer:
Question 38
Enter the precision of Simple TF*IDF on the Named Page Finding Task (NP04) at the interpolated 0.5 Recall.
Answer:
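The values asked for here are read from the evaluation output, but the standard definition behind them is simple: the interpolated precision at recall level r is the highest precision observed at any rank whose recall is at least r. The sketch below illustrates that definition for a single query. The class InterpolatedPrecisionSketch is hypothetical and is not the evaluation tool's implementation, which also averages over all queries in a topic set.

```java
import java.util.List;
import java.util.Set;

/**
 * Hypothetical sketch of interpolated precision for a single query:
 * the maximum precision over all ranks whose recall >= recallLevel.
 */
final class InterpolatedPrecisionSketch {

    static double at(List<String> ranking, Set<String> relevant, double recallLevel) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        double best = 0.0;
        for (int i = 0; i < ranking.size(); i++) {
            if (relevant.contains(ranking.get(i))) {
                hits++;
            }
            double recall = (double) hits / relevant.size();
            double precision = (double) hits / (i + 1);
            // Keep the best precision among ranks that reach the recall level.
            if (recall >= recallLevel && precision > best) {
                best = precision;
            }
        }
        return best; // stays 0.0 if the recall level is never reached
    }
}
```

For example, with the ranking d2, d1, d3, d4 and relevant documents d1 and d3, the interpolated precision at recall 0.5 is 2/3, because the precision at rank 3 (2/3) exceeds the precision at rank 2 (1/2).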
Question 39
On the TD04 topic set, what is the best performing weighting model among the 5 evaluated models on the early interpolated recall values (i.e. recall >= 0.1 and <= 0.2)?
Select one:
A. BM25
B. Terrier's TF*IDF
C. PL2
D. Simple TF*IDF
E. Vector Space TF*IDF
Question 40
Identify the most effective weighting model in terms of MAP across the 3 topic sets
Select one:
1. PL2
2. Terrier's TF*IDF
3. BM25
4. Vector Space TF*IDF
5. Simple TF*IDF

Q4: Query Expansion with Best Weighting Model
Question 41
Enter the MAP performance of the identified best weighting model + Query Expansion on the HP04 topics
Answer:
Question 42
Enter the MAP performance of the identified best weighting model + Query Expansion on the TD04 topics
Answer:

Q4: Query Expansion with Simple TF*IDF
Question 43
Enter the MAP performance of Simple TF*IDF + Query Expansion on the HP04 topics
Answer:
Question 44
Enter the MAP performance of Simple TF*IDF + Query Expansion on the TD04 topics
Answer:

Q4: Query Expansion with Vector Space TF*IDF
Question 45
Enter the MAP performance of Vector Space TF*IDF + Query Expansion on the HP04 topics
Answer:
Question 46
Enter the MAP performance of Vector Space TF*IDF + Query Expansion on the TD04 topics
Answer:

Q4: Query-By-Query Histograms (QE vs No QE)
Question 47
Enter your 2 query-by-query histograms comparing your identified best weighting model in Q3 with and without query expansion on each of the topic sets (HP04, TD04). Upload a single PDF document showing the 2 histograms.
Maximum file size: 230MB, maximum number of files: 1
Question 48
Enter the number of queries whose performances have degraded after the application of query expansion on the HP04 topics
Answer:
Question 49
Enter the number of queries whose performances have improved after the application of query expansion on the TD04 topics
Answer:
Question 50
The application of query expansion on query ID 5 (American Music) of the TD04 topic set has improved its performance
Select one:
True
False

Q4: Analysis of QE vs No QE
Question 51
Consider the following query:
Pop stars who once worked at McDonald's
Based on your understanding of the course material, the application of query expansion using the Vector Space Model on this query will enhance the Average Precision of the query.
Select one:
True
False
Question 52
Based on your understanding of the lecture material, when is query expansion likely to work when applied with a Vector Space Model:
Select one or more:
a. The query is well formed with a clear information need
b. The query vector is somehow close to the vector representations of the documents the user desires
c. All the relevant documents use different vocabulary from the query
d. The query is full of misspellings and vague terms
e. The relevant documents are tightly clustered in the vector space
Question 53
Based on your understanding of the course material, your expectation was that the application of query expansion on the Homepage Finding topics (HP04) ...
Select one:
A. will improve the system's MAP performance
B. will not help to improve the system's MAP performance
Question 54
Based on your understanding of the course material, the application of query expansion on the topic distillation (TD04) topics was expected to improve the system's MAP performance?
Select one:
True
False

To Conclude ...
Question 55
Enter how many hours you spent on this exercise?
Answer:
Question 56
Do you have any additional feedback on the exercise? e.g. what did you find the most difficult?