MapReduce Scheduling in Cloud Environments


This project aims to develop modules for a MapReduce framework that can use the resources of a cloud environment efficiently. We will study the problems associated with existing MapReduce implementations, along with the components and trade-offs involved in scheduling and migrating tasks dynamically.

Intellectual Merit

This project will advance the state of the art in scheduling for large-scale, data-intensive applications by taking into account the heterogeneity of the infrastructure in large grids and clouds.

Broader Impact

The software modules resulting from this project will be released as open-source software to the HPC community.

Use of FutureGrid

We plan to use FutureGrid resources to scale our experiments from a few nodes to much larger node counts.

Scale of Use

100-300 nodes, with experiments run a few times a week over the next 6-12 months.


Renan DelValle
Binghamton University

Project Members

Jessica Hartog
John Weachock
Renan DelValle

FutureGrid Experts

Tak-Lon Wu