CS代考

Computer Science 572 Exam

Prof. , April 23, 2018, 8:00am – 9:00am

Copyright By PowCoder代写 加微信 powcoder

Name: Student Id Number:

1. This is a closed book exam.
2. Please answer all questions.
3. There are a total of 25 questions. Question points may vary.
4. Place your answer immediately below the question. Limit answers to ONE SENTENCE unless more is

requested.

1. [4 pts] Define the contiguity hypothesis.

2. [4 pts] Below is a formula that occurs in the class notes.

Question 1: What does the formula define? Question 2: Define C, |C| and

3. [4 pts] Given N as the number of documents, is the time to train the documents according

to the Rocchio method O(N), O(N log N) or O(N2) ?

Watson’s DeepQA system for question answering has four phases:

(1) Question Processing, (2) Candidate Answer Generation, (3) Candidate Scoring, and

(4)Answer Merging and Confidence Scoring. Answer the following two questions about

4. [4 pts] In what phase does Named Entity tagging occur?

5. [4 pts] Define co-reference and in what phase does it occur?

6. [4 pts] In question answering, when several passages containing the query terms are

returned, there are six criteria used to rank the passages. Please name two of them.

7. [4 pts] Client applications use five fundamental operations to work with Solr using HTTP

requests and responses- Name any 2.

8. [4 pts] Lucene uses a Boolean and Vector space model to determine how relevant a

document is to a user’s query. How does the Vector Space Model score the document?

9. [4 pts] Given two documents below, doc1 and doc2, provide the mapper output if an

invertedIndex is run on the documents in a Hadoop cluster.

doc1 – USC Gould School of Law

doc2 – enacted the criminal statute

10. [4 pts] Given the two documents in the above question, doc1 and doc2, provide the

reducer output if an invertedIndex is run on the documents in a Hadoop cluster.

11. [4 pts] When using N-grams for spelling correction, if no match is found for value N,

then the algorithm will step back and look for a match with the (N-1)-grams, and again if

there is no match the algorithm backs up again. What is this algorithm called?

12. [4 pts] Given 100 documents divided into three clusters where cluster 1 has 10 related

documents, cluster 2 has 5 related documents and cluster 3 has 10 related documents,

what is the Purity Index of this clustering?

13. [4 pts] For the k-means algorithm, is the centroid necessarily a document in the set of

documents? Yes or No.

14. [4 pts] In Solr what file contains configuration for the data dictionary?

15. [4 pts] In Solr what file contains definitions of the field types and fields of a document?

16. [4 pts] What is the syntax for starting and stopping Solr?

17. [4 pts] There are three criteria that define a good clustering algorithm, describe one.

18. [4 pts] Given the two strings: “abcde” and “azced”, what is their minimum edit distance

assuming the operations (replace, delete, insert) all have a count of 1?

19. [4 pts] Suppose one advertiser bids $1.00 for his ad to be displayed and a second

advertiser bids $0.50 for his ad to be displayed and all other factors affecting ads are

identical. If the first advertiser’s ad is clicked on how much does he pay Google?

20. [4 pts] We discussed clustering and classification. One is an example of supervised

learning and the other is an example of unsupervised learning. Which one is supervised

and which one is unsupervised?

21. [4 pts] Is the graph provided to NetworkX directed or undirected?

22. [4 pts] This semester we examined two algorithms for clustering and two algorithms for

classification. Name all four.

23. [4 pts] Microsoft, Google and Yahoo agreed on a formalism for including rich snippets in web

pages. What website contains the specification of this formalism? What is the name of the

formalism?

24. [4 pts] Define: breadcrumbs

25. [4 pts] In our discussion of knowledgebases we discussed the need for instances, classes

and a taxonomic hierarchy. Wikipedia includes many instances. Does it also include

classes and a taxonomic hierarchy?

Monday, April 23, 2018, 8:00am – 9:00am

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com