What is the subject about?
“DATA WRANGLING”
“data”: information we can process
“wrangling”: round up, herd, or take charge of

What does Data Wrangling entail?

Mapping
Formatting
Data integration
Aggregation
Publishing
Structuring
Organising
Data Wrangling
Data enrichment
Visualisation
Converting
Storage
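
A minimal sketch of how a few of these activities (structuring, formatting, converting, aggregation, publishing) might look in Python. The sample data, column names, and the `wrangle` function are hypothetical and chosen only for illustration; real pipelines would typically use richer tooling.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical raw input: inconsistent casing, stray whitespace, a missing value.
RAW_CSV = """city,date,sales
 Melbourne ,2016-03-13,120
melbourne,2016-03-14,
Sydney,2016-03-13,95
SYDNEY,2016-03-14,105
"""


def wrangle(raw_text):
    """Structure, format, convert and aggregate the raw records."""
    totals = defaultdict(float)
    skipped = 0
    for row in csv.DictReader(io.StringIO(raw_text)):  # structuring: text -> records
        city = row["city"].strip().title()              # formatting: one spelling per city
        sales = row["sales"].strip()
        if not sales:                                    # missing value: skip and count it
            skipped += 1
            continue
        totals[city] += float(sales)                     # converting + aggregation
    return dict(totals), skipped


totals, skipped = wrangle(RAW_CSV)
print(json.dumps(totals, sort_keys=True))  # publishing: emit the result as JSON
print("rows skipped:", skipped)
```

Skipping rows with missing values is only one possible choice; imputing a value or flagging the record are alternatives.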

Why is it important?

Who is a data wrangler?
Step forward the data scientist!

Data science
The ability to take data – to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it – that’s going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids. Because now we really do have essentially free and ubiquitous data. So the complementary scarce factor is the ability to understand that data and extract value from it.
, Chief Economist at Google The Mc , Jan 2009

Data and health
• Gene sequences
• Mobile data
• Electronic medical records
• Insurance claims
• Imaging results
• GP data
• Prescription data
• Social media
• …….
https://upload.wikimedia.org/wikipedia/commons/e/ee/MRI-Philips.JPG
https://upload.wikimedia.org/wikipedia/commons/8/80/DNA_methylation.jpg
https://upload.wikimedia.org/wikipedia/commons/b/b7/Thyroid_Clinic_plan.png
https://upload.wikimedia.org/wikipedia/commons/c/ca/Fitibit_Flex.jpg

Data and business (au.yahoo.com)

Data and education
• MOOCs – massive open online courses (>30 at Unimelb)
• Video viewing behaviour
• Quizzes
• Discussion forum
• Assignments
• Interventions to improve learning
Modelling discrete optimisation

Data and humanities
https://books.google.com/ngrams
– the more frequently an irregular verb is used, the less likely it is to be regularized over time (Aiden and Michel)

Data and food
Food analytics
• Analyse food samples
• Identify ingredients, look for contamination or other surprises
• Infer the change in health of the population.
https://www.fooddrinkireland.ie/Sectors/FDI/FDI.nsf/vPages/Publications~the-evolution-of-foodand-drink-in-ireland-2005-2017-20-02-2019/$file/The+evolution+of+food+and+drink+in+Ireland+2005+-+2017+-+Reformulation+and+Innovation+-+Supporting+Irish+diets.pdf

Data and science
CERN
Large Hadron Collider: 1000 terabytes/second

Data and sport
https://upload.wikimedia.org/wikipedia/commons/3/34/BDS_West_2010-11-26.jpg
• Video analysis
• Wearables, GPS tracking, heart rate
• Skin patch behind ear, mouthguard sensors

AI—“Intelligent machines and software”
[Screenshot: Netflix Kids home screen, 13/03/2016, showing rows of recommended titles: Recently watched, Top Picks for Kids, Popular, Action]

Data science landscape
Computing
• Data wrangling
• ML & NLP
• Databases
• Distributed computing, big data technology
Statistics
• Robust models and methods
• Sampling
• Hypothesis testing
• …
Domain expertise
• Health
• Finance
• Social science
• …

What is the subject all about?
Preprocessing & visualization (Chris): Weeks 1 – 6
– Data types and data formats
– Visualization methods with outlier detection
– Preprocessing, data exploration, data cleaning including missing data
– Text processing, crawling and scraping
Analysis (introductory) (Uwe): Weeks 7 – 10
– Correlations
– Basic prediction techniques
– Feature engineering & dimensionality reduction
– Record linkage and recommender systems
Social, ethical and privacy issues (Chris): Weeks 11 – 12
– K-anonymity, l-diversity
– Privacy & ethics
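
As a small preview of the privacy topic, here is a hedged sketch of checking k-anonymity: a table is k-anonymous if every record shares its quasi-identifier values with at least k − 1 other records. The records, the choice of quasi-identifiers, and the `k_anonymity` helper are hypothetical, not taken from the subject materials.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the same
    quasi-identifier values. The table is k-anonymous for any k up to this."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(groups.values()) if groups else 0


# Hypothetical records whose age and postcode have already been generalised.
RECORDS = [
    {"age": "20-29", "postcode": "30**", "condition": "flu"},
    {"age": "20-29", "postcode": "30**", "condition": "asthma"},
    {"age": "30-39", "postcode": "31**", "condition": "flu"},
    {"age": "30-39", "postcode": "31**", "condition": "diabetes"},
    {"age": "30-39", "postcode": "31**", "condition": "flu"},
]

k = k_anonymity(RECORDS, ["age", "postcode"])
print("smallest group size:", k)
```

Here the smallest group has two records, so this toy table is 2-anonymous with respect to age and postcode.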

Detailed Plan
• Week 1: Intro to Data Science, Python Programming
• Week 2: Visualisation, Data Formats
• Week 3: Text Search & Matching
• Week 4: Text Processing, Representation Models, Web Crawling
• Week 5: Recommender Systems
• Week 6: Record Linkage
• Week 7: Correlations
• Week 8: Regression, Classification
• Week 9: Clustering
• Week 10: Feature Engineering, Experimental Design
• Week 11: Privacy
• Week 12: Ethics & Summary

What the subject is:
• What the subject is:
• A broad coverage of the steps involved in the data pipeline
• What the subject is not:
• A deep dive into any one particular area of the data pipeline
• A programming subject

Image: Swami Chandrasekaran http://nirvacana.com/thoughts/2013/07/08/becoming-a-data-scientist/