COMP3430_Sem2_2021: generate-student-datasets-rl.py

COMP3430/COMP8430 – Data Wrangling – Sem 2 2021

generate-student-datasets-rl.py

9KB text file, uploaded 24/08/21, 23:02

Click the generate-student-datasets-rl.py link to view the file.
