iShuffle: Improving Hadoop Performance with Shuffle-on-Write
Yanfei Guo, Jia Rao, and Xiaobo Zhou, University of Colorado, Colorado Springs
Awarded Best Paper!
Hadoop is a popular implementation of the MapReduce framework for running data-intensive jobs on clusters of commodity servers. Although Hadoop automatically parallelizes job execution with concurrent map and reduce tasks, we find that shuffle, the all-to-all input data fetching phase in a reduce task, can significantly affect job performance. We attribute the delay in job completion to the coupling of the shuffle phase and reduce tasks, which leaves the potential parallelism between multiple waves of map and reduce unexploited, fails to address data distribution skew among reduce tasks, and makes task scheduling inefficient. In this work, we propose to decouple shuffle from reduce tasks and convert it into a platform service provided by Hadoop. We present iShuffle, a user-transparent shuffle service that proactively pushes map output data to nodes via a novel shuffle-on-write operation and flexibly schedules reduce tasks considering workload balance. Experimental results with representative workloads show that iShuffle reduces job completion time by as much as 30.2%.
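The shuffle-on-write idea in the abstract can be illustrated with a toy, single-process sketch. This is not the authors' implementation and does not use any Hadoop API. The class `ShuffleOnWriteSketch`, the helper `partition_of`, the node names, and the greedy least-loaded placement rule are all assumptions made for illustration. The sketch only shows the general pattern: each map output record is pushed to the node that will host its partition at the moment it is written, so a reduce task finds its input already local instead of fetching it from every mapper.

```python
import zlib
from collections import defaultdict


def partition_of(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a deterministic hash.

    crc32 is used because Python's built-in hash() for strings is salted
    per process. (Hypothetical helper, not part of Hadoop.)
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class ShuffleOnWriteSketch:
    """Toy model of a shuffle service decoupled from reduce tasks."""

    def __init__(self, num_partitions, nodes):
        self.num_partitions = num_partitions
        self.nodes = list(nodes)
        self.placement = {}  # partition id -> node chosen to host it
        self.node_records = {n: 0 for n in self.nodes}  # records held per node
        self.store = {n: defaultdict(list) for n in self.nodes}

    def _place(self, partition):
        # Assumed policy: the first time a partition is seen, assign it to
        # the node currently holding the fewest records (ties broken by name).
        if partition not in self.placement:
            self.placement[partition] = min(
                self.nodes, key=lambda n: (self.node_records[n], n)
            )
        return self.placement[partition]

    def write(self, key, value):
        """Called for every map output record: push it at write time."""
        partition = partition_of(key, self.num_partitions)
        node = self._place(partition)
        self.store[node][partition].append((key, value))
        self.node_records[node] += 1
        return node

    def reduce_input(self, partition):
        """Return the node hosting a partition and the records already there."""
        node = self.placement[partition]
        return node, self.store[node][partition]


# Usage: word-count style map output pushed as it is produced.
service = ShuffleOnWriteSketch(num_partitions=4, nodes=["node-a", "node-b"])
for word in "a b a c b a".split():
    service.write(word, 1)

node, records = service.reduce_input(partition_of("a", 4))
total_a = sum(value for key, value in records if key == "a")
```

In this sketch the placement decision is made once per partition, when its first record arrives. A real service would need to handle placement across multiple map waves, partition size estimation, and fault tolerance, none of which is modeled here.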
author = {Yanfei Guo and Jia Rao and Xiaobo Zhou},
title = {{iShuffle}: Improving Hadoop Performance with {Shuffle-on-Write}},
booktitle = {10th International Conference on Autonomic Computing (ICAC 13)},
year = {2013},
isbn = {978-1-931971-02-7},
address = {San Jose, CA},
pages = {107--117},
url = {https://www.usenix.org/conference/icac13/technical-sessions/presentation/guo},
publisher = {USENIX Association},
month = jun
}