Author : K. Uma Maheswari
Date of Publication : 7th August 2016
Abstract: The MapReduce programming model simplifies large-scale data processing on commodity clusters by exploiting parallel map tasks and reduce tasks. Although many efforts have been made to improve the performance of MapReduce jobs, they ignore the network traffic generated in the shuffle phase, which plays a vital role in performance. Typically, a hash function is used to partition intermediate data among reduce tasks; this is not traffic-efficient because it considers neither the network topology nor the data size associated with each key. This work aims to reduce the network traffic cost of a MapReduce job by designing a new intermediate data partition scheme. It jointly considers the aggregator placement problem, where each aggregator can reduce the merged traffic from multiple map tasks. A decomposition-based distributed algorithm is proposed to handle the large-scale optimization problem for big data applications, and an online algorithm is designed to adjust data partition and aggregation dynamically. Finally, extensive simulation results show that the proposed schemes can considerably reduce network traffic cost in both offline and online cases.
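To make the contrast between hash partitioning and traffic-aware partitioning concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a toy cost model in which the shuffle cost of a key is the data size held at each map node multiplied by the hop distance to the chosen reduce node. All names (`hash_partition`, `traffic_aware_partition`, `traffic_cost`, `sizes`, `distance`) and the sample numbers are hypothetical, and the greedy per-key choice ignores reducer load balancing and aggregator placement.

```python
import zlib


def hash_partition(key, num_reducers):
    """Conventional partitioning: the reducer depends only on the key's hash."""
    # crc32 is used instead of hash() so the result is stable across runs.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def traffic_cost(assignment, sizes, distance):
    """Assumed cost model: sum of data size times mapper-to-reducer distance."""
    total = 0
    for (mapper, key), size in sizes.items():
        total += size * distance[mapper][assignment[key]]
    return total


def traffic_aware_partition(keys, sizes, distance, num_reducers):
    """Greedy sketch: send each key to the reducer with the lowest shuffle cost."""
    assignment = {}
    for key in keys:
        def cost_for(reducer, key=key):
            # Cost of shipping every mapper's share of this key to `reducer`.
            return sum(
                size * distance[m][reducer]
                for (m, k), size in sizes.items()
                if k == key
            )
        assignment[key] = min(range(num_reducers), key=cost_for)
    return assignment


# Hypothetical toy input: 2 map nodes, 2 reduce nodes.
# distance[m][r] is the assumed hop count from map node m to reduce node r.
distance = [[1, 3], [3, 1]]
# sizes[(m, key)] is the assumed amount of intermediate data for `key` on map node m.
sizes = {(0, "a"): 10, (1, "a"): 1, (0, "b"): 1, (1, "b"): 10}
keys = ["a", "b"]

hashed = {k: hash_partition(k, 2) for k in keys}
aware = traffic_aware_partition(keys, sizes, distance, 2)

hash_cost = traffic_cost(hashed, sizes, distance)
aware_cost = traffic_cost(aware, sizes, distance)
print("hash partition cost:", hash_cost)
print("traffic-aware cost:", aware_cost)
```

Under this cost model the greedy assignment can never cost more than the hash assignment, because each key is placed independently at its cheapest reducer. A realistic scheme would also have to balance reducer load and account for aggregators, which is what makes the full problem a large-scale optimization.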
Reference :