Course: Cloud Computing and Storage Class

Project Information

Computer Science (401) 
14.09 Computer Engineering 

Course Objective and Description:

Using large-scale computing systems to solve data-intensive real-world problems has become indispensable for many scientific and engineering disciplines. This course provides a broad introduction to the fundamentals of cloud computing and storage, focusing on system architecture, programming models, algorithmic design, and application development. Selected scientific applications will be used as case studies.

Prerequisite: introduction to programming or data structures and algorithms (EEL4834 or equivalent), computer architecture (EEL5764 or equivalent), proficiency in Java, or instructor approval. 


Textbook:

  • Hadoop: The Definitive Guide (3rd Edition), Tom White, O'Reilly Media, 2012.

Other References: 

  • Many recent papers in leading conferences/journals will be discussed.
  • Data-Intensive Text Processing with MapReduce, Jimmy Lin and Chris Dyer, 2010. (PDF version available online)
  • Programming Amazon EC2, Jurg van Vliet and Flavia Paganelli, O'Reilly Media, 2011.
  • Distributed and Cloud Computing: From Parallel Processing to the Internet of Things, Kai Hwang, Jack Dongarra, and Geoffrey C. Fox, Morgan Kaufmann, 2011.
  • The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Luiz Andre Barroso and Urs Hoelzle, Morgan and Claypool Publishers, 2009.
  • The Grid: Blueprint for a New Computing Infrastructure (2nd Edition), Ian Foster, Carl Kesselman, Morgan Kaufmann/Elsevier, 2004.
  • The Fourth Paradigm: Data-Intensive Scientific Discovery, Tony Hey, Stewart Tansley, and Kristine Tolle, Microsoft Research, 2009. (PDF version available online)

Course Homepage:

Course Outline (tentative):

  1. Introduction and Overview
  2. Programming Paradigms
  3. Introduction to Hadoop 
  4. MapReduce Runtime Management
  5. Algorithm Design and Implementation in MapReduce 
  6. Consistency and Coordination
  7. Key-Value Structured Storage
  8. Enhancements to Hadoop/MapReduce
  9. Distributed File Systems
  10. Case Study

Intellectual Merit

The course takes a novel, programming-oriented approach to teaching cloud computing and storage. Some course projects may lead to novel ideas and publications.

Broader Impacts

Currently, 65 graduate students are enrolled, and enrollment may grow.

Project Contact

Project Lead
Andy Li (andyli) 
Project Manager
Min Li (minli) 
Project Members
Min Li, Sandeep Nuggehalli Lakshminarayana, Navina Ramesh, Yifeng Zhang, Meng Wang, Kaikai Liu, Mohan RamKarthik Selvamoorthy, David Stoker, Vikrant Sagar, Yue Bai, Xiaofei Ma, Pallami Bhattacharjee, Nikhilesh Reddy Chaduvula, Amruta Badami, Radhika Garg, Ankit Srivastava, Kushal Kewlani, Zhiyi Kang, Neha Bhatia, Binyan Li, Hemanth Karthik Kasibhatta, Avinash Kautham Subramaniam Ravi, Lakshmanan Velusamy, Rushabh Shah, Charan Hebri, Bharath Chandrasekhar, Naveen Chandra Gorijala, Qiuyuan Huang, Soham Mehta, Dapeng Wu, Madhumita Ramesh Babu, Lakshmi Priya Gopal, Saravanan Sathananda Manidas, Yashovardhan Agarwalla, Neha Uppal, Kumar Abhishek, Revanth Alampally, Siva Kolli, Deepak Dasarathan, Sampath Kumar Tulava, BharathKumar Pareek, Sai Kaushik Nampalli, Sujith Perla, Sharath Chandra Pilli, Saran Vellanki, Sandhya Tejaswi Komaragiri, Syam Sundara Rao Kolla, Sri Ramya Tangellamudi, Manas Gupta, Madhav Arora, Rishi Pathak, Murali Raman, Pratik Somanagoudar

Resource Requirements

Hardware Systems
  • alamo (Dell OptiPlex at TACC)
  • foxtrot (IBM iDataPlex at UF)
  • hotel (IBM iDataPlex at U Chicago)
  • india (IBM iDataPlex at IU)
  • sierra (IBM iDataPlex at SDSC)

Use of FutureGrid

About 65 graduate students are working on course projects. They will use FutureGrid mainly to run MapReduce-related jobs and conduct performance analysis.

Scale of Use

About 65 graduate students are working on about 20 course projects. Most projects will use no more than 8 VMs; a few may need slightly more.

Project Timeline

08/15/2012 - 12:32