This course covers modern cloud computing technologies and their impact on enterprise computing systems. It starts with major cloud computing concepts, economics, and Big Data. It also covers Infrastructure as a Service (IaaS) from Amazon, Google, and Microsoft. Serverless computing, storage, and middleware are introduced, followed by Big Data programming with Apache Hadoop and Apache Spark. The course then focuses on cloud storage services, including object storage systems, virtual hard drives, and virtual archival storage. It presents distributed key-value stores and NoSQL databases, as well as Spark SQL and distributed publish/subscribe systems with Kafka. Higher-end applications in the cloud are explored, including data analytics, graph processing, machine learning, and fast data systems like Apache Storm and Spark Streaming. Virtualization and containers are also covered in depth, including Docker, Kubernetes, and Infrastructure as Code. Finally, the course discusses future trends and includes an interview with a cloud architect.
There are 12 programming assignments, all autograded, to ensure students gain hands-on experience with many Cloud and Big Data technologies.
Open Course Syllabus
This course has a research-oriented syllabus and is composed of two components:
Component 1: Read and present 4 conference papers chosen from top computer systems conferences. The presentations are peer-reviewed and worth 40% of the course grade. Each student peer-reviews at least 3 other paper presentations; reading more is recommended. This activity is designed both to familiarize students with how academic research papers are written and structured and to bring them up to date with the cutting edge of the field.
Component 2: Group-based research project. The project includes proposing a novel research idea, surveying the literature, forming a hypothesis, designing and implementing a system, running experiments to validate the hypothesis, measuring system performance, and writing a high-quality paper for possible conference submission. The project is staff-graded across 5 milestones: course staff provide feedback and suggestions, and the milestones account for 80% of the grade, with the remaining 20% coming from teammate review. Feedback and grades are usually given within 1-2 weeks. The overall goal is to develop a potentially publishable paper by the end of the semester.
See a list of research projects from the Capstone
This is the first part of a two-course series on Cloud Computing and Big Data. In this course, we cover the concepts and technologies behind cloud computing, including virtualization, containers, and infrastructure as a service from major providers like Amazon, Google, and Microsoft. We also cover higher-level cloud services such as platform as a service, mobile backend as a service, and serverless architectures. Week three covers cloud middleware technologies such as RPC and REST, and metal as a service. In week four, we introduce higher-level cloud services with a focus on cloud storage, including Hive, HDFS, Ceph, cloud object storage systems, virtual hard drives, and virtual archival storage options. We end the course with a discussion on Dropbox cloud solutions.
Open in Coursera
This course is the second part of a two-course series that aims to provide a comprehensive view of Cloud Computing and Big Data. It covers major data analysis systems like Spark, HDFS, and MapReduce, large-scale data storage, consensus difficulties, distributed key-value stores, NoSQL databases, and distributed publish/subscribe systems like Kafka. The course also covers fast data streaming with Storm and Spark Streaming, graph processing, machine learning, and deep learning with examples from Pregel, Giraph, and Spark. It concludes by introducing deep learning technologies including Theano, TensorFlow, CNTK, MXNet, and Caffe on Spark.
Open in Coursera
The Data Mining Capstone course gives students who have already taken multiple topic courses in the general area of data mining an opportunity to extend their knowledge and skills by reading recent research papers and working on an open-ended, real-world data mining project.
Link to Syllabus
Copyright © 2023 · All Rights Reserved