Abstract
MapReduce is a popular parallel programming model for large-scale data processing applications running on computer clusters. It has two main functions: Map and Reduce. The Map function transforms the data on each node into key-value pairs, and the Reduce function merges the values associated with the same key across the different nodes. However, typical MapReduce implementations suffer from load imbalance, where load is measured as the number of key-value pairs assigned to each node. This thesis proposes an Adaptive Load Balancing Algorithm to balance the load and implements it in X10. Systematic experimental results show that the algorithm achieves good load balance, reduces communication across compute nodes, and consequently improves overall performance.
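As a rough illustration of the model and of the imbalance described above, the following minimal sketch runs a word count with a plain hash partitioner. It is written in Python rather than X10, it is not the Adaptive Load Balancing Algorithm proposed in the thesis, and the names `map_words`, `partition`, and `reduce_counts` as well as the sample data are assumptions made for this example only.

```python
from collections import defaultdict

def map_words(chunk):
    """Map: turn one node's text chunk into (word, 1) key-value pairs."""
    return [(word, 1) for word in chunk.split()]

def partition(pairs, num_reducers):
    """Assign each pair to a reducer using a simple deterministic hash of its key.

    This is a plain hash partitioner (an assumption for illustration),
    not the adaptive scheme the thesis proposes.
    """
    buckets = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        index = sum(key.encode("utf-8")) % num_reducers
        buckets[index].append((key, value))
    return buckets

def reduce_counts(pairs):
    """Reduce: merge the values that share the same key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Hypothetical input: three chunks, one per map node, with a skewed key "a".
chunks = ["a a a a b", "a a c", "a b d"]
pairs = [pair for chunk in chunks for pair in map_words(chunk)]

buckets = partition(pairs, 2)
loads = [len(bucket) for bucket in buckets]          # key-value pairs per reducer
results = [reduce_counts(bucket) for bucket in buckets]

print(loads)    # [3, 8]
print(results)  # [{'b': 2, 'd': 1}, {'a': 7, 'c': 1}]
```

With this skewed input, one reducer receives 8 of the 11 pairs while the other receives 3, which is the kind of imbalance a load balancing scheme aims to correct.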





