MET CS 689 B1 Designing and Implementing a Data Warehouse Andrew D Wolfe, Jr.
MET CS 689
Data Warehousing
Mary E. Letourneau
Python and Review
March 21, 2020
Python Demonstrations
Python Introduction
ETL with Python
Review
Lecture
What is the motivation behind data warehousing?
Definitions & Chronology:
Why are analytical SQL queries different from transactional SQL queries?
Compare/contrast OLTP/OLAP
Module notes
Overview of Big Data – Three V’s (volume, velocity, variety)
Information system topics – data mining, data science, sense-making, machine learning
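As a rough illustration of the transactional vs. analytical query question above, the sketch below runs both kinds of statement against one small table. The `orders` table, its columns, and the sample rows are hypothetical, and SQLite (from the Python standard library) stands in for whatever database the course uses.

```python
import sqlite3

# Hypothetical table and data, held in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (order_id, region, amount) VALUES (?, ?, ?)",
    [(1, "East", 100.0), (2, "East", 50.0), (3, "West", 75.0)],
)

# Transactional (OLTP) style: locate one row by its key and change it.
conn.execute("UPDATE orders SET amount = ? WHERE order_id = ?", (60.0, 2))
conn.commit()

# Analytical (OLAP) style: read many rows and summarize them.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

The transactional statement touches a single row through its primary key, while the analytical query scans the whole table to produce one summary row per region.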
Review (cont’d)
Module notes (cont’d)
Analytical functions
“Standard” SQL – GROUP BY, HAVING, ORDER BY clauses
Analytical – OVER (PARTITION BY … ORDER BY …)
Row numbering and ranking
Order by
Lead/Lag
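The analytical functions listed above can be pictured in plain Python. This is a minimal sketch of what row numbering and LAG compute within each partition, not how a database implements them; the `sales` rows and the `window` helper are hypothetical names made up for the illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows.
sales = [
    {"region": "East", "month": 1, "amount": 100},
    {"region": "East", "month": 2, "amount": 120},
    {"region": "West", "month": 1, "amount": 80},
    {"region": "West", "month": 2, "amount": 60},
]


def window(rows, partition_key, order_key, value_key):
    """Add row_number and lag columns to every row, in the spirit of
    OVER (PARTITION BY partition_key ORDER BY order_key)."""
    out = []
    # Sort by partition first, then by the ordering column within it.
    ordered = sorted(rows, key=itemgetter(partition_key, order_key))
    for _, group in groupby(ordered, key=itemgetter(partition_key)):
        previous = None  # LAG has no value on the first row of a partition
        for n, row in enumerate(group, start=1):
            out.append({**row, "row_number": n, "lag": previous})
            previous = row[value_key]
    return out


result = window(sales, "region", "month", "amount")
for row in result:
    print(row)
```

Unlike GROUP BY, which collapses each group to one row, the windowed result keeps every input row and attaches the computed columns to it.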
Review (cont’d)
Kimball reading
Kimball’s goals of DW & BI: accessible, consistent, adaptable, timely, secure, authoritative, accepted.
DW/BI manager responsibilities (Kimball):
Understand the business users
Deliver high-quality, relevant, and accessible information and analytics to the business users
Sustain the DW/BI environment
Describe/Compare & contrast: Star schema vs OLAP cube
Describe/Compare & contrast: Fact vs dimension
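To make the fact vs. dimension comparison concrete, here is a small star schema sketch. All table names, column names, and rows are hypothetical, and SQLite from the Python standard library is assumed only because it needs no setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive context used to filter and group.
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY, calendar_date TEXT, month TEXT
);
-- Fact table: numeric measurements plus a foreign key to each dimension.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product (product_key),
    date_key INTEGER REFERENCES dim_date (date_key),
    quantity INTEGER,
    sales_amount REAL
);
INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
INSERT INTO dim_date VALUES
    (20200301, '2020-03-01', '2020-03'),
    (20200302, '2020-03-02', '2020-03');
INSERT INTO fact_sales VALUES
    (1, 20200301, 2, 20.0), (1, 20200302, 1, 10.0), (2, 20200302, 3, 45.0);
""")

# A typical star-schema query: join the fact table to a dimension,
# group by a dimension attribute, and aggregate a fact measure.
by_category = conn.execute("""
    SELECT p.category, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(by_category)
```

The fact table holds the measurements (`quantity`, `sales_amount`); the dimension tables hold the attributes (`category`, `month`) that the measurements are grouped and filtered by.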
Review (cont’d)
Kimball reading (cont’d)
Four components of DW/BI environment:
Operational source system
ETL system
Data presentation area
Business intelligence applications
Alternative DW/BI Architectures
Independent data mart
Hub-and-Spoke Corporate Information Factory (Inmon architecture)
Hybrid
Review (cont’d)
Kimball reading (cont’d)
Five Dimensional Modeling Myths. Dimensional models:
are only for summary data
are departmental, not enterprise
are not scalable
are only for predictable usage
can’t be integrated
Python
Indenting
Pandas
numpy
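A minimal extract-transform-load sketch ties the Python topics back to the ETL demonstration. It assumes a hypothetical CSV layout and a hypothetical `sales` target table, and it uses only the standard library rather than pandas or numpy so that it runs without extra installs. Note that indentation alone marks the body of each function, loop, and `if`.

```python
import csv
import io
import sqlite3

# Hypothetical raw input with messy names and one row missing a customer.
RAW_CSV = """customer,amount
 alice ,10.50
BOB,4
,7.00
"""


def extract(text):
    """Extract: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: tidy names, convert amounts, drop rows with no customer."""
    clean = []
    for row in rows:
        name = row["customer"].strip().title()
        if not name:
            continue  # the indented block under `if` is skipped otherwise
        clean.append((name, float(row["amount"])))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT customer, amount FROM sales ORDER BY customer"
).fetchall()
print(loaded)
```

Keeping the three stages as separate functions makes each one easy to test on its own and to swap out, for example replacing `extract` with a reader for a different source.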
Have a Good Afternoon and a Great Weekend!
End of Presentation