COMP3430 / COMP8430 Data wrangling
Interactive lecture week 8: Assignments, labs, and summary of week 8
(Lecturer: )

Lecture outline
● Administrative matters – Assignments and Labs
● Summary of week 8
● Q and A Session

Labs
● Lab 5 ran this week, continuing the record linkage project
● It covered different classification techniques
● Sample solutions for the lab will be released on Monday
● REMINDER: All of labs 3 to 7 are highly relevant to assignment 3

Labs next week
● There will be NO lab session on Monday 4th October, 11 am to 1 pm (public holiday)
● However, for next week only, all remaining lab sessions will be held in a public channel
 Channel name: Lab sessions 5th – 7th October
● So you are free to attend any lab session next week

Assignment 1
● All remark requests for assignment 1 received so far have been answered
● You have until 5 pm on Tuesday 5th October 2021 (next week) to query your marks. After that, your marks will be finalised and fixed
● You must carefully read the marking feedback document provided

Assignments 2 and 3, and Quiz 3
● Assignment 2: due on 8th October 2021, 11:55 PM (next Friday)
● Assignment 3: due on 22nd October 2021, 11:55 PM
● Quiz 3 will close on 4th October 2021, 11:55 PM (next Monday)

From last topic (Topic 7)
● Levenshtein edit distance
– Target (columns): g a m b l e
– Source (rows): g u m b o
– Costs: Substitute = 2, Delete = 1, Insert = 1

From last topic (Topic 7)
● Levenshtein edit distance: solution
– Rows are the source (gumbo), columns are the target (gamble)

            g   a   m   b   l   e
        0   1   2   3   4   5   6
    g   1   0   1   2   3   4   5
    u   2   1   2   3   4   5   6
    m   3   2   3   2   3   4   5
    b   4   3   4   3   2   3   4
    o   5   4   5   4   3   4   5

– Costs: Substitute = 2, Delete = 1, Insert = 1
– Each cell: min(ins + cost_i, del + cost_d, subs + cost_s)
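The solution matrix can be reproduced with a short dynamic programming routine. The following is a minimal sketch, not the course's reference implementation: the function name `edit_distance_matrix` and its parameter names are hypothetical, and the default costs are the ones stated on the slide (substitute = 2, delete = 1, insert = 1).

```python
def edit_distance_matrix(source, target, ins_cost=1, del_cost=1, sub_cost=2):
    """Return the full edit distance matrix (rows: source, columns: target).

    Hypothetical helper for illustration; costs default to the slide's values.
    """
    rows, cols = len(source) + 1, len(target) + 1
    d = [[0] * cols for _ in range(rows)]

    # First column: delete every source character seen so far.
    for i in range(1, rows):
        d[i][0] = d[i - 1][0] + del_cost
    # First row: insert every target character seen so far.
    for j in range(1, cols):
        d[0][j] = d[0][j - 1] + ins_cost

    for i in range(1, rows):
        for j in range(1, cols):
            # Matching characters cost nothing; otherwise pay the substitution cost.
            subs = 0 if source[i - 1] == target[j - 1] else sub_cost
            d[i][j] = min(
                d[i][j - 1] + ins_cost,   # insert (cell to the left)
                d[i - 1][j] + del_cost,   # delete (cell above)
                d[i - 1][j - 1] + subs,   # substitute or match (diagonal cell)
            )
    return d


matrix = edit_distance_matrix("gumbo", "gamble")
for row in matrix:
    print(row)
print("edit distance:", matrix[-1][-1])
```

The bottom-right cell holds the distance between the two full strings, which is 5 for this pair under these costs.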

Summary of week 8
● Evaluation
– Different measures are employed to evaluate the obtained record linkage results
– Achieving high linkage quality is the main goal of most record linkage projects
– Ground truth data is needed to measure linkage quality
– However, ground truth data is not always available, and when it is available its correctness cannot be guaranteed

Summary of week 8
● Evaluating linkage results with ground truth data
– Four possible outcomes: True positives, False positives, False negatives, True negatives
– Measuring linkage quality: Accuracy, Precision, Recall, and F-measure
– Measuring linkage complexity: Reduction ratio, Pairs completeness, Pairs quality, Runtime, Memory consumption
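As a rough illustration of how these measures are computed, the sketch below uses their standard definitions. The function names, parameter names, and all counts are made up for the example and do not come from the lecture or from any real linkage run.

```python
def quality_measures(tp, fp, fn, tn):
    """Linkage quality from the four outcome counts (hypothetical helper)."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure: harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def complexity_measures(num_a, num_b, candidate_pairs,
                        matches_in_candidates, total_true_matches):
    """Blocking/indexing measures for linking databases A and B (hypothetical helper)."""
    all_pairs = num_a * num_b  # full comparison space without blocking
    return {
        # Fraction of the comparison space removed by blocking.
        "reduction_ratio": 1.0 - candidate_pairs / all_pairs,
        # Fraction of all true matches that survive blocking.
        "pairs_completeness": matches_in_candidates / total_true_matches,
        # Fraction of candidate pairs that are true matches.
        "pairs_quality": matches_in_candidates / candidate_pairs,
    }


# Illustrative counts only (assumed, not real results).
quality = quality_measures(tp=80, fp=20, fn=20, tn=9880)
complexity = complexity_measures(num_a=1000, num_b=1000, candidate_pairs=50000,
                                 matches_in_candidates=450, total_true_matches=500)
print(quality)
print(complexity)
```

With these assumed counts, accuracy comes out far higher than precision or recall because true negatives dominate the total, which is one reason accuracy alone can be a misleading measure of linkage quality.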

Q and A Session
● Socrative
– https://b.socrative.com/login/student/
– Room Name: COMP3430