Report guide 1:
All steps => write up the reasoning and steps for all five steps; the five steps are defined in this PDF
Report guide 2:
Introduction
• describe three models
(clearly explain the mechanisms of the three algorithms covered in the unit PDFs; the most relevant material is weeks 8 to 11):
(the mechanisms below must be covered for each of the three models in the code)
BM25 (how it decides whether a document is relevant or irrelevant; how the Python output separates relevant from irrelevant; a detailed explanation of the filtering-system mechanism)
Model_1 (where the PRM query comes from; how the PRM pseudo-labels are generated)
Model_2 (what the logistic regression training set is and where it comes from)
• Assumptions
• Algorithms
• Describe your development (which packages)
Results & Evaluation
• Use of multiple effectiveness measures
Discussion
• Analysis about your findings
Recommendation
• Results + statistical measures -> t-test (explain the t-test)
• must include a justification on your result
Limitations
Conclusion
Statement of Completeness
User Manual
• Explain how to execute the system
describe three models:
BM25: week 10 IRM
Model_1: week 10 PRM (pseudo-relevance model)
Model_2: also a PRM, but documents are represented with a bag-of-words model and classified with logistic regression
Algorithms:
BM25 (week 5), logistic regression
Describe your development (which packages):
The first two parts (BM25) mainly use the math library; Model_2 uses the sklearn machine-learning library, with pandas to read the data
solution for all steps
For the first two models, follow your earlier assignment work; the steps there are detailed.
Model_2: for all documents under a given query, encode them with CountVectorizer() to obtain features, then use a method similar to Model_1 to obtain pseudo-labels for the documents, train a logistic regression model on that data, and finally use the model's predicted probabilities as the basis for ranking (see the sketch below).
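A minimal sketch of this Model_2 idea, assuming the documents for one topic are already loaded as raw text strings and BM25 scores are available for deriving pseudo-labels; all function and variable names (rank_with_model2, doc_ids, doc_texts, bm25_scores, threshold) are hypothetical and not part of the assignment materials.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

def rank_with_model2(doc_ids, doc_texts, bm25_scores, threshold=1.0):
    # Bag-of-words features for all documents under this topic.
    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(doc_texts)

    # Pseudo-labels: documents whose BM25 score is above a threshold are
    # treated as relevant (1), the rest as non-relevant (0), as in Model_1.
    # The threshold should be chosen so that both classes actually occur.
    y = [1 if bm25_scores[d] > threshold else 0 for d in doc_ids]

    # Train logistic regression on the pseudo-labelled data, then use the
    # predicted probability of relevance as the ranking score.
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X, y)
    probs = clf.predict_proba(X)[:, 1]
    return sorted(zip(doc_ids, probs), key=lambda pair: pair[1], reverse=True)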
IFN647 – Assignment 2 Requirements
Weighting: 35% of the assessment for IFN647
Deliverable Items
1. A final report in PDF or word file, which includes
• Statement of completeness and your name(s) and student ID(s) on a cover page;
• Your solution for all steps (see more details in the “Requirements for each step” section);
• User manual section describing the structure of the data folder setup,
information on “importing packages” and how to execute Python code in
Terminal, IDLE or PyCharm; and
• An appendix listing the top 10 documents for all topics and all models.
2. Your source code for all steps, containing all .py files (using a zip file “code.zip” to
put them together) necessary to run the solution and perform the evaluation (source
code only, no executables); and
3. A poster and a demo, presented to your tutor during your week 13 workshop time.
Please zip all files (the final report, code, and poster) into a single zip file named “studentID_Asm2.zip” and submit it through Blackboard (a single submission per group) before the due date.
Please do not include the dataset folder generated by “DataCollection.zip” or
“RelevanceFeedback.zip” in your submission.
The following are the frameworks/libraries that you can use for assignment 2:
(a) sk-learn
(c) pandas
(e) Matplotlib
If you want to use another package or library, you need to get your tutor’s approval.
Due date of Blackboard Submission: Sunday week 13 (12th June 2022)
Group: You are required to work on this assignment in a group of three people.
Due to the difficulty of automatically obtaining user information requirements, most search
systems only use queries (or topics) instead of user information requirements. The first reason is
that users may not know how to represent the topics they are interested in. The second reason is
that users may not want to spend a lot of effort mining relevant documents from the hundreds of
thousands of candidates provided by the system.
This unit discusses several ways to extend query-based methods, such as pseudo-relevance
feedback, query expansion, or hybrid methods. In this assignment, for a given data collection
(including 50 datasets), you need to discover a good information filtering model that
recommends relevant documents to users on all 50 topics, where documents in each dataset are
collected for a corresponding topic. The methodology you will use for this assignment includes
the following steps:
• Step 1 – Design an IR-based baseline model (BM_IR) that ranks documents in each
dataset using the corresponding queries for all 50 datasets.
• Step 2 – Based on the knowledge you gained from this unit, design two different models
to rank documents in each dataset using corresponding queries from all 50 datasets. You
can call them Model_1 and Model_2, respectively.
• Step 3 – Use python to implement three models: BM_IR, Model_1 and Model_2, and
test them on the given data collection of 50 topics (50 datasets) and print out the top 10
documents for each dataset (put the output in the appendix of your final report).
• Step 4 – Choose three effectiveness measures to display testing results against the
selected effectiveness measures in tables or graphs.
• Step 5 – Recommend the best model based on significance test and your analysis.
Data Collection
It is a subset of RCV1 data collection. It is only for IFN647 students who will be supervised by
Prof. Li Yuefeng. Please do not release this data collection to others.
DataCollection.zip file – It includes 50 Datasets (folders “dataset101” to “dataset150”)
for topic R101 to topic R150.
“Topics.txt” file – It contains definitions for 50 topics (numbered from R101 to R150)
for the 50 datasets in the data collection, where each definition describes
a topic, including a topic number, a title, a description, and a narrative.
Example of topic R102 – “Convicts, repeat offenders” is defined as follows:
Description: Search for information pertaining to crimes committed by people who
have been previously convicted and later released or paroled from
Narrative: Relevant documents are those which cite actual crimes committed by
“repeat offenders” or ex-convicts. Documents which only generally
discuss the topic or efforts to prevent its occurrence with no
specific cases cited are irrelevant.
RelevanceFeedback.zip file – It includes relevance judgements (file “dataset101.txt” to
file “dataset150.txt”) for all documents used in the 50 datasets, where “1” in the third
column of each .txt file indicates that the document (the second column) is relevant to
the corresponding topic (the first column); and “0” means the document is non-relevant.
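A small sketch of how one judgement file could be read with pandas, assuming the .txt files are whitespace-separated with the three columns described above and no header row; the file path shown is only a placeholder.

import pandas as pd

# Read one relevance-judgement file: columns are topic, document ID and
# relevance flag (1 = relevant, 0 = non-relevant), whitespace-separated.
judgements = pd.read_csv("RelevanceFeedback/dataset102.txt", sep=r"\s+",
                         header=None, names=["topic", "doc_id", "relevant"])
relevant_ids = set(judgements.loc[judgements["relevant"] == 1, "doc_id"])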
Requirements for each step
Step 1: Design a BM25 based IR model as a baseline model (BM_IR).
You need to use the following equation
for all topics R101 to R150, where Q is the title of a topic.
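Assuming the equation referred to here is the standard BM25 scoring function covered in the unit, one common form is:

\[
\mathrm{BM25}(D, Q) = \sum_{t \in Q} \ln\!\left(\frac{N - n_t + 0.5}{n_t + 0.5}\right) \cdot \frac{f_{t,D}\,(k_1 + 1)}{f_{t,D} + k_1\left(1 - b + b \cdot \frac{|D|}{\mathit{avgdl}}\right)}
\]

where N is the number of documents in the dataset, n_t is the number of documents containing term t, f_{t,D} is the frequency of t in document D, |D| is the document length, avgdl is the average document length in the dataset, and k_1 and b are tuning parameters (commonly around k_1 = 1.2 and b = 0.75); whichever values you choose should be stated and justified in the report.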
Formally describe your design for BM_IR in an algorithm to rank documents in each dataset
using corresponding queries for all 50 datasets. You also need to determine the values for all
parameters used in the above equation, and how to rank documents in each dataset.
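The following is a minimal sketch of how BM_IR could rank the documents of one dataset, assuming the documents have already been parsed into term-frequency dictionaries; only the math library is used, and all names and default parameter values are illustrative only.

import math

def bm25_rank(doc_terms, query_terms, k1=1.2, b=0.75):
    # doc_terms maps docID -> {term: frequency} for one dataset.
    N = len(doc_terms)
    # Document frequency n_t for every query term.
    df = {t: sum(1 for terms in doc_terms.values() if t in terms) for t in query_terms}
    avg_dl = sum(sum(terms.values()) for terms in doc_terms.values()) / N

    scores = {}
    for doc_id, terms in doc_terms.items():
        dl = sum(terms.values())
        score = 0.0
        for t in query_terms:
            f = terms.get(t, 0)
            if f == 0:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5))
            score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * dl / avg_dl))
        scores[doc_id] = score
    # Descending order of BM25 score.
    return sorted(scores.items(), key=lambda x: x[1], reverse=True)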
Step 2. Design two different models Model_1 and Model_2
In this step, you can only use the 50 datasets (DataCollection.zip file) and topics (Topics.txt).
You cannot use relevance judgements (RelevanceFeedback.zip). You may design IR-based
models, pseudo relevance models or other hybrid methods.
Write your design (or ideas) for the two models as two algorithms. Your approach should be
generic, meaning it is feasible to use it for other topics. You also need to discuss the
differences between the three models.
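As one illustration of the kind of model this step allows (not a required design), a simple pseudo-relevance feedback approach could treat the top-k BM25 results as pseudo-relevant documents and then re-rank by term weights derived from that set; the sketch below reuses the bm25_rank sketch from Step 1 and all names are illustrative.

def prf_rank(doc_terms, query_terms, k=10):
    # Take the top-k BM25 results as the pseudo-relevant set.
    baseline = bm25_rank(doc_terms, query_terms)
    pseudo_relevant = [doc_id for doc_id, _ in baseline[:k]]

    # Weight each query term by how many pseudo-relevant documents contain it.
    term_weight = {
        t: sum(1 for d in pseudo_relevant if t in doc_terms[d])
        for t in query_terms
    }

    # Re-score every document by the total weight of the query terms it contains.
    scores = {
        doc_id: sum(term_weight[t] for t in query_terms if t in terms)
        for doc_id, terms in doc_terms.items()
    }
    return sorted(scores.items(), key=lambda x: x[1], reverse=True)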
Step 3. Implement three models: BM_IR, Model_1 and Model_2
Design python programs to implement these three models. You can use a .py file for each
model. Discuss the data structures used to represent a single document and a set of documents
for each model (you can use the same data structure for different models). You also need to test
the three models on the given data collection of 50 datasets for the 50 topics and print out the
top 10 documents for each dataset (in descending order). The output will be put in the appendix
of the final report.
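One possible (not required) data-structure choice for this step is to represent a single document as a term-frequency dictionary and a dataset as a dictionary keyed by document file name; the tokenisation shown below is deliberately simplistic and ignores any markup in the document files.

import os
import re

def parse_document(path):
    # A single document: {term: frequency}, built from a naive tokenisation.
    with open(path, encoding="utf-8") as f:
        words = re.findall(r"[a-z]+", f.read().lower())
    freqs = {}
    for w in words:
        freqs[w] = freqs.get(w, 0) + 1
    return freqs

def parse_dataset(folder):
    # A set of documents: {file name: term-frequency dictionary}.
    return {fname: parse_document(os.path.join(folder, fname))
            for fname in os.listdir(folder)}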
Below is the output format for all 50 topics for the model BM_IR only.
Topic R101:
Topic R102:
DocID Weight
73038 5.8987
26061 4.2736
65414 4.1414
57914 3.9671
58476 3.7084
76635 3.5867
12769 3.4341
12767 3.3521
25096 2.7646
78836 2.6823
Topic R103
Topic R150
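A sketch of printing one topic's results in the format above, assuming ranked is the list of (document ID, score) pairs returned by one of the models.

def print_top10(topic_id, ranked):
    # Print the ten highest-scoring documents for one topic.
    print(f"Topic {topic_id}:")
    print("DocID Weight")
    for doc_id, score in ranked[:10]:
        print(f"{doc_id} {score:.4f}")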
Step 4. Display test results on effectiveness measures
In this step, you need to use the relevance judgements (RelevanceFeedback.zip) to display the
test results for the selected effectiveness measures.
You need to choose three different effectiveness measures to evaluate the test results, such as
top-10 precision, MAP, F1, or interpolation. Evaluation results can be summarized in tables or
graphs (e.g., precision-recall curves). Below is an example summary table of F1 measure for the
three models.
Table 1. The performance of three models on F1 measure at position 25
Topic BM_IR Model_1 Model_2
R101 0.2100 0.2200 0.2300
R102 0.0320 0.0350 0.0370
R103 0.0765 0.0765 0.0787
R150 … … …
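As a sketch only, measures such as those named above could be computed per topic as follows, assuming ranked is a list of (document ID, score) pairs and relevant_ids is the set of relevant document IDs taken from the relevance judgements; MAP is then the mean of average_precision over the 50 topics.

def precision_at(ranked, relevant_ids, k=10):
    # Fraction of the top-k retrieved documents that are relevant.
    top = [doc_id for doc_id, _ in ranked[:k]]
    return sum(1 for d in top if d in relevant_ids) / k

def average_precision(ranked, relevant_ids):
    # Average of precision values at each rank where a relevant document appears.
    hits, total = 0, 0.0
    for i, (doc_id, _) in enumerate(ranked, start=1):
        if doc_id in relevant_ids:
            hits += 1
            total += hits / i
    return total / max(len(relevant_ids), 1)

def f1_at(ranked, relevant_ids, k=25):
    # Harmonic mean of precision and recall at cutoff k.
    top = [doc_id for doc_id, _ in ranked[:k]]
    tp = sum(1 for d in top if d in relevant_ids)
    if tp == 0:
        return 0.0
    precision = tp / k
    recall = tp / max(len(relevant_ids), 1)
    return 2 * precision * recall / (precision + recall)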
Step 5. Recommend the best model
You need a significance test to compare models. You can choose a t-test to perform a
significance test on the evaluation results you reported in step 4. You can compare models
between BM_IR and Model_1, BM_IR and Model_2, and/or Model_1 and Model_2. Based on
t-test results (p-value and t-statistic value), you can recommend the best model. You can
perform the t-test using a single effectiveness measure or multiple measures. Generally, using
more effectiveness measures provides stronger evidence against the null hypothesis.
Note that if the t-test is unsatisfactory, you can use the evaluation results to refine your model.
For example, you can adjust parameter settings or update your design.
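A minimal sketch of the paired t-test, assuming the per-topic scores of two models on the same effectiveness measure are collected into two equal-length lists; it uses SciPy's ttest_rel, which is not in the listed packages, so check with your tutor (or implement the paired t-test manually).

from scipy.stats import ttest_rel

def compare_models(scores_a, scores_b, alpha=0.05):
    # Paired t-test over per-topic scores; a positive t-statistic favours scores_a.
    t_stat, p_value = ttest_rel(scores_a, scores_b)
    return t_stat, p_value, (p_value < alpha and t_stat > 0)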
Please Note
• Your programs should be well laid out, easy to read and well commented.
• All items submitted should be clearly labelled with your name and student number.
• Marks will be awarded for design (algorithms), programs (correctness, programming style,
elegance, commenting) and evaluation results, according to the marking guide.
• You will lose marks for missing or inaccurate statements of completeness or user manual,
and for missing sections, files, or items.
• Your results do not need to be exactly the same as the sample output.
• We recommend that you use a fair workload distribution approach, such as one person per
model, but the baseline model is simple, so the person responsible for the baseline may do
more in the evaluation.
• If your group has team conflict issues, your individual contributions will be assessed at the
Week 13 workshop (you may be asked to do a peer review); otherwise, all group members
will participate equally in this assessment project.
• See the marking guide for more details.
END OF ASSIGNMENT 2