DEPARTMENT OF DIGITAL BUSINESS MANAGEMENT
Course Syllabus 1/2021
MSM&E VISION
To be a distinguished business school with an entrepreneurial spirit and an international learning environment
MSM&E MISSION
Educating graduates with entrepreneurial spirit, global competency, and social responsibility.
By nurturing business knowledge and skills to develop creative business solutions;
By developing business communication skills and appreciation of diversity;
By fostering ethical awareness to act for the benefit of society at large.
COURSE ORGANIZATION
Course Title 1. BDM 3305 Big Data Analytics
2. MIS 4142 Big Data
Semester 1/2021
Credits 1. 3 (2-2-5)
2. 3 (2-2-5)
Pre‐requisite None
Description 1. BDM 3305 Big Data Analytics
Fundamental concepts and hands-on learning of big data analytics and cloud
computing technologies. Big data platforms, tools, collection and ingestion, storage,
analysis, preprocessing, processing, visualization, and deployment at scale.
2. MIS 4142 Big Data
Fundamental concepts of big data technology, data tools, organization, storage,
retrieval, analysis and knowledge discovery at scale. Cloud computing, data storage
systems, large-scale data analysis, self-descriptive data representations, and semi-
structured data models.
Objectives On completion of this subject, students should be able to:
1. Understand the fundamental concepts of the data science process
2. Gain hands-on experience with Google Cloud Platform
3. Gain hands-on experience with Google Compute Engine
4. Gain hands-on experience with Google Cloud Storage
5. Gain hands-on experience with Google Cloud SQL
6. Gain hands-on experience with Google BigQuery
7. Gain hands-on experience with Google Dataproc
8. Gain hands-on experience with Google Dataprep
9. Gain hands-on experience with Google Dataflow
10. Gain hands-on experience with Google Data Studio
11. Gain hands-on experience with Python
12. Gain hands-on experience with Python Machine Learning
13. Gain hands-on experience with Jupyter Notebook
14. Gain hands-on experience with Apache Spark Python (PySpark)
15. Gain hands-on experience with Apache Spark RDD
16. Gain hands-on experience with Apache Spark SQL
Marks Allocation Workshops 30 %
Term Project 20 %
Midterm Lab Examination 20 %
Final Lab Examination 30 %
Total: 100 %
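As a rough illustration of how the weights above combine, the following Python sketch computes a final mark from component scores. It assumes each component is scored on a 0-100 scale; the function name and the example scores are hypothetical and not part of the official grading procedure.

```python
# Assessment weights in percent, taken from the Marks Allocation above.
WEIGHTS = {
    "Workshops": 30,
    "Term Project": 20,
    "Midterm Lab Examination": 20,
    "Final Lab Examination": 30,
}


def final_mark(scores):
    """Combine component scores (assumed 0-100 each) into a mark out of 100."""
    # The weights must cover the whole course mark.
    assert sum(WEIGHTS.values()) == 100
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example_scores = {
    "Workshops": 80,
    "Term Project": 70,
    "Midterm Lab Examination": 60,
    "Final Lab Examination": 90,
}
print(final_mark(example_scores))  # 77.0
```

Here the weighted sum is (30×80 + 20×70 + 20×60 + 30×90) / 100 = 77.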
COURSE RESOURCES
Learning materials Ivan Marin, Ankit Shukla, Sarang VK, Big Data Analysis with Python, Packt, 2019.
Syed Muhammad Fahad Akhtar, Big Data Architect’s Handbook, Packt, 2018.
(Self Reading) Vijay Kotu, Bala Deshpande, Data Science Concepts and Practice, 2nd Edition,
Morgan Kaufmann, 2019.
References Anand Deshpande, Manish Kumar, Artificial Intelligence for Big Data, Packt, 2018.
Manuel Ignacio Franco Galeano, Big Data Processing with Apache Spark, Packt, 2018.
Sridhar Alla, Big Data Analytics with Hadoop 3, Packt, 2018.
Jiawei Han, Micheline Kamber, Jian Pei, Data Mining Concepts and Techniques, 3rd
Edition, Morgan Kaufmann, 2012.
Joel Grus, Data Science from Scratch, O’Reilly Media, 2015.
Course Website https://lms.msme.au.edu
https://www.piyabute.com/work/
Zoom Login PMI = 345 345 9345 Passcode = abacdbm
Lecturer Asst. Prof. Dr. Piyabute Fuangkhon
Certified Data Science Specialist, iTrain Asia
Certified Applications Use Cases Master, RapidMiner
Certified Data Engineering Master, RapidMiner
Certified Machine Learning Master, RapidMiner
Certified Platform Administration Master, RapidMiner
Class Schedule Mon. 9:00-10:30 SC 0701
Tue. 9:00-10:30 SC 0504
Consultation Wed./Thu./Fri. 9:00-16:30 (make an appointment via E-Mail)
Office SC 0403, Suvarnabhumi Campus
E‐Mail .edu
COURSE EXAMINATIONS
Midterm Lab Exam Date 3 August 2021 (9:00-11:00)
Topic Week 1-7
Final Lab Exam Date 6 October 2021 (13:00-16:00)
Topic Week 1-15
COURSE REQUIREMENTS
1. Students must attend at least 80% of classes to be eligible for the final lab examination.
The 20% absence allowance (three classes) is INCLUSIVE of all reasons, such as illness and accidents.
2. Students who arrive more than 15 minutes after the start of class are considered “LATE.”
Two late arrivals count as one absence.
3. Proper uniform is required in class; otherwise, attendance will not be recorded.
4. There will be no MAKE-UP classes, quizzes, or exams for those who fail to attend for any reason.
5. Students are expected to maintain a high level of responsibility concerning academic honesty. Academic
dishonesty includes copying another student’s work or submitting work that is not entirely
one’s own, and can result in disciplinary action under the University regulations.
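The attendance arithmetic in requirements 1 and 2 can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and it assumes that a single unpaired late arrival does not count as a partial absence and that exactly three absences still satisfy the 80% requirement.

```python
MAX_ABSENCES = 3        # the 20% allowance, stated in the syllabus as three classes
LATES_PER_ABSENCE = 2   # two late arrivals are counted as one absence


def effective_absences(absences, lates):
    """Total absences after converting late arrivals.

    Assumption: a leftover single late arrival is not counted.
    """
    return absences + lates // LATES_PER_ABSENCE


def eligible_for_final(absences, lates):
    """True if the student is still within the absence allowance."""
    return effective_absences(absences, lates) <= MAX_ABSENCES


# Hypothetical cases, for illustration only.
print(eligible_for_final(absences=2, lates=2))  # True: 2 + 1 = 3 absences
print(eligible_for_final(absences=2, lates=4))  # False: 2 + 2 = 4 absences
```

For borderline cases, the lecturer's attendance record remains the authority.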
COURSE CONTENTS AND TENTATIVE SCHEDULE
Week 1. Introduction to Big Data Analytics
Big Data Platform
Google Cloud Platform – https://cloud.google.com/docs/
Data Science – Introduction to Data Science
Week 2. Big Data Platform
Google Compute Engine – https://cloud.google.com/compute/docs/
Command‐line Interface
Google Cloud Shell – https://cloud.google.com/shell/docs/
Data Science – Data Science Process
Week 3. Big Data Platform
Google Cloud Storage – https://cloud.google.com/storage/docs/
Data Science – Data Exploration
Week 4. Big Data Query
Google Cloud SQL – https://cloud.google.com/sql/docs/mysql/
Data Science – Classification
Week 5. Big Data Query
Google BigQuery – https://cloud.google.com/bigquery/docs/
Data Science – Classification
Week 6. Big Data Cluster
Google Dataproc – https://cloud.google.com/dataproc/docs/
Data Science – Classification
Week 7. Big Data Preprocessing
Google Dataprep – https://cloud.google.com/dataprep/docs/
Data Science – Regression Methods
Week 8. Big Data Preprocessing
Google Dataflow – https://cloud.google.com/dataflow/docs/
Data Science – Association Analysis
Week 9. Big Data Visualization
Google Data Studio – https://support.google.com/datastudio/
Data Science – Clustering
Week 10. Data Analytics
Python Programming
Anaconda – https://www.anaconda.com/
Google Colab – https://colab.research.google.com/
Data Science – Model Evaluation
Week 11. Machine Learning
scikit-learn – Decision Tree – https://scikit-learn.org/stable/modules/tree.html
Data Science – Text Mining
Week 12. Big Data Analytics – Programming Environment
Jupyter Notebook – https://cloud.google.com/dataproc/docs/tutorials/jupyter-notebook/
Jupyter Notebook – https://cloud.google.com/ai-platform/notebooks/docs/create-new/
Big Data Analytics ‐ Apache Spark Python (PySpark)
Apache Spark Python – https://spark.apache.org/docs/latest/api/python/getting_started/quickstart.html
Data Science – Time Series Forecasting
Week 13. Big Data Analytics ‐ Apache Spark RDD
Apache Spark RDD – https://spark.apache.org/docs/latest/rdd-programming-guide.html
Data Science – Anomaly Detection
Week 14. Big Data Analytics ‐ Apache Spark SQL
Apache Spark SQL – https://spark.apache.org/docs/latest/sql-programming-guide.html
Data Science – Feature Selection
Week 15. Project Presentation