Semester 2 2021
Lecture 3, Part I: Data Formats

Data Formats

Examples of Data Formats
• Unstructured: text files/documents, audio, video, social media data
• Semi-Structured: XML, JSON, webpages, CSV, NoSQL, …
• Structured: databases, tables, spreadsheets
Unstructured formats are more human readable; structured formats are more machine readable.
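To make the distinction concrete, here is a minimal sketch that reads the same two records from CSV text and from JSON text using only the Python standard library. The records and field names (name, age, tags) are made up for illustration and are not from the lecture.

```python
import csv
import io
import json

# Hypothetical sample data: the same two people in CSV and in JSON.
csv_text = "name,age\nAna,31\nBen,27\n"
json_text = '[{"name": "Ana", "age": 31}, {"name": "Ben", "age": 27, "tags": ["staff"]}]'

# CSV: every row has the same columns, and every value arrives as a string.
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

# JSON: values keep their types, and a record may carry extra or nested fields.
json_rows = json.loads(json_text)

print(csv_rows[0]["age"], type(csv_rows[0]["age"]).__name__)
print(json_rows[0]["age"], type(json_rows[0]["age"]).__name__)
print(json_rows[1].get("tags"))
```

The CSV reader returns the age as text, so type conversion is left to the caller, while the JSON parser preserves numbers and nested lists.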

Structured data
Relational databases

Relational Database
https://clockwise.software/blog/relational-vs-non-relational-databases-advantages-and-disadvantages/

Relational Algebra
https://medium.com/swlh/merging-dataframes-with-pandas-pd-merge-7764c7e2d46d

Joins in Python Pandas
import pandas as pd
pd.merge()
• on
• left_on, right_on
• how
  • inner
  • outer
  • left
  • right
[Figure: diagrams illustrating inner JOIN, left JOIN, right JOIN, and outer JOIN. Image from https://stackoverflow.com/questions/53645882/]
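As a rough illustration of the pd.merge() parameters listed above, the sketch below joins two small tables with each value of how. The tables, column names (student_id, sid, subject), and values are hypothetical and chosen only so that the four join types return different numbers of rows.

```python
import pandas as pd

# Hypothetical example tables (not from the lecture).
students = pd.DataFrame({
    "student_id": [1, 2, 3],
    "name": ["Ana", "Ben", "Chen"],
})
enrolments = pd.DataFrame({
    "sid": [2, 3, 4],
    "subject": ["Databases", "Statistics", "Networks"],
})

# The key columns have different names in the two tables,
# so left_on / right_on are used instead of on.
joined = {
    how: pd.merge(students, enrolments,
                  left_on="student_id", right_on="sid", how=how)
    for how in ["inner", "left", "right", "outer"]
}

for how, df in joined.items():
    print(how, "join:", len(df), "rows")
```

Only keys 2 and 3 appear in both tables, so the inner join keeps two rows; the left and right joins each keep three, filling unmatched columns with NaN; the outer join keeps all four distinct keys.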

Resources
• https://jakevdp.github.io/PythonDataScienceHandbook/03.07-merge-and-join.html
• https://jakevdp.github.io/PythonDataScienceHandbook/03.06-concat-and-append.html
• https://pandas.pydata.org/pandas-docs/version/0.22.0/merging.html

Challenges
• Once data is in a relational database, it is easier to wrangle.
• But it may be difficult to load it there in the first place …
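One way to picture the loading difficulty is the following sketch, which uses Python's built-in sqlite3 module and an in-memory database. The person table, its columns, the messy input rows, and the to_int_or_none helper are all hypothetical; the point is only that raw values must be cleaned to fit the schema before they can be inserted, after which querying is straightforward.

```python
import sqlite3

# Hypothetical raw input: everything is text, and some ages are blank or "n/a".
raw_rows = [("1", "Ana", "31"), ("2", "Ben", ""), ("3", "Chen", "n/a")]


def to_int_or_none(text):
    """Illustrative helper: return an int for digit-only text, otherwise None."""
    text = text.strip()
    return int(text) if text.isdigit() else None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)"
)

# Loading step: convert each raw field to the type the schema expects.
conn.executemany(
    "INSERT INTO person (id, name, age) VALUES (?, ?, ?)",
    [(int(i), name, to_int_or_none(age)) for i, name, age in raw_rows],
)
conn.commit()

# Once loaded, wrangling is a matter of writing queries.
known_ages = conn.execute(
    "SELECT name, age FROM person WHERE age IS NOT NULL ORDER BY id"
).fetchall()
missing = conn.execute(
    "SELECT COUNT(*) FROM person WHERE age IS NULL"
).fetchone()[0]

print(known_ages, missing)
```

Here the cleaning rule (treat anything non-numeric as missing) is an assumption; real data usually needs decisions like this for every column.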

More structured data – Spreadsheets
• Huge amounts of data are in spreadsheets
  • Businesses
  • Hospitals
  • …
• Microsoft (Excel), OpenOffice (Calc), Google Sheets