Course Description: This course provides the fundamental knowledge needed to capture and analyze large-scale data from a variety of fields, such as human behavior, sensors, biological signals, and finance. It introduces platforms for data storage and distributed processing of large data sets, including Hadoop HDFS and MapReduce, Spark, and others, as well as different ways of running analytics algorithms on different platforms. Prerequisites: (COP 2034 Introduction to Programming Using Python or COP 3809 Advanced Topics in Programming or ISC 2310 Python for Data Analytics) and COP 3710 - Database 1.