CISC 5950 Big Data Programming Fall 2020
Description: This course covers Apache Hadoop and Spark technologies and their ecosystems in the context of mining big data. It provides students with both theoretical background and hands-on computing techniques in big data analytics and its applications. Students will learn how to collect, query, and analyze data. Topics include Hadoop core technologies (HDFS, MapReduce, YARN), Spark Streaming, MLlib, Clustering, and Spark SQL. The Scala programming language will be taught as part of the Spark component.
Instructor: Dr. Ying Mao (JMH 336 / LL610H)
Email: ymao41 at fordham.edu
Office Hours: Email for private meetings.
Textbook: No textbook required. The reading assignments will be posted on the course website.
Course Web Resources: https://yingmao.github.io/cisc5950/
Grading Policy:
❖ Programming Labs (3) ---- 15%
❖ Projects (2) ---- 25%
❖ Midterm ---- 25%
❖ Final ---- 35%
Letter grades:
● A ( >= 90 )
● B ( >= 80 )
● C ( >= 70 )
● D ( >= 60 )
● F ( < 60 )
Policy: Attendance is strongly encouraged and required.
Notes:
● Late submission of labs/projects: 20% reduction each day.
● Disputes about grading must be resolved within two weeks of receiving your score.
Tentative Course Schedule
Week 1: Introduction and Overview of Big Data Systems
Week 2: Hadoop Distributed File Systems (HDFS)
Week 3: MapReduce Framework
Week 4: Resource Management in the cluster (YARN)
Week 5: Scala Basics
Week 6: No class due to Monday schedule
Week 7: Advanced Scala
Week 8: Resilient Distributed Datasets (RDD)
Week 9: Midterm
Week 10: Apache Spark
Week 11: Apache Spark SQL
Week 12: Data analytics on Hadoop / Spark
Week 13: Machine learning on Hadoop / Spark
Week 14: Spark Streaming
Week 15: Other Hadoop/Spark ecosystem components
Week 16: Final
Additional Remarks
● Academic Honesty
All work produced in this course should be your own unless it is specifically stated that you may work with others. You may discuss the homework problems with other students generally, but may not provide complete solutions to one another; copying of homework solutions is always unacceptable and will be considered a violation of Fordham's academic integrity policy. Violations of this policy will be handled in accordance with university policy, which can include automatic failure of the assignment and/or failure of the course. For more information, please refer to the Academic Integrity website.
● Makeup Exam
There will be no make-up exams given after the exam date. If you know in advance that you will have to miss an exam, you must check with me (in advance) to avoid getting a zero for that exam. In case of illness on an exam date, please contact me as soon as possible so that appropriate arrangements can be made.