Vol. 12, No. 1, January 30, 2018
DOI: 10.3837/tiis.2018.01.010
Abstract
Big data processing applications have gradually been migrated to the cloud because of the advantages of cloud computing. The Hadoop Distributed File System (HDFS) is one of the fundamental support systems for big data processing on MapReduce-like frameworks such as Hadoop and Spark. Because HDFS is not aware of the co-location of virtual machines in the cloud, its default block allocation scheme fits cloud environments poorly in two respects: loss of data reliability and performance degradation. In this paper, we present a novel location-aware data block allocation strategy (LDBAS). LDBAS jointly optimizes data reliability and performance for upper-layer applications by allocating data blocks according to the locations and the differing processing capacities of virtual nodes in the cloud. We apply LDBAS to two stages of data allocation in HDFS in the cloud (initial data allocation and data recovery) and design the corresponding algorithms. Finally, we implement LDBAS in an actual Hadoop cluster and evaluate its performance with the BigDataBench benchmark suite. The experimental results show that LDBAS guarantees the designed data reliability while reducing the job execution time of I/O-intensive applications in Hadoop by 8.9% on average, and by up to 11.2%, compared with the original Hadoop in the cloud.
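The abstract does not give the LDBAS algorithms themselves, so the following is only a minimal sketch of the general idea of location-aware placement, not the authors' method. It assumes that the physical host of each virtual node is known and that each node carries a relative capacity weight; the names `VirtualNode`, `place_block`, `host`, and `capacity` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class VirtualNode:
    name: str
    host: str         # physical machine the VM runs on (assumed to be known)
    capacity: float   # relative processing capacity (hypothetical weight)
    blocks: int = 0   # number of blocks already assigned to this node


def place_block(nodes, replicas=3):
    """Choose `replicas` virtual nodes for one block.

    Illustrative policy, not the LDBAS algorithm from the paper:
    - reliability: no two replicas share a physical host;
    - performance: prefer nodes with the lowest load relative to capacity.
    """
    chosen = []
    used_hosts = set()
    # Least-loaded (relative to capacity) first; name breaks ties deterministically.
    for node in sorted(nodes, key=lambda n: (n.blocks / n.capacity, n.name)):
        if node.host in used_hosts:
            continue  # a co-located VM already holds a replica of this block
        chosen.append(node)
        used_hosts.add(node.host)
        if len(chosen) == replicas:
            break
    if len(chosen) < replicas:
        raise ValueError("not enough distinct physical hosts for the replicas")
    for node in chosen:
        node.blocks += 1
    return [node.name for node in chosen]
```

In this sketch, two virtual nodes on the same physical host never receive replicas of the same block, so a single host failure cannot remove every copy, and higher-capacity nodes accumulate proportionally more blocks over repeated calls.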
Cite this article
[IEEE Style]
H. Xu, W. Liu, G. Shu, J. Li, "LDBAS: Location-aware Data Block Allocation Strategy for HDFS-based Applications in the Cloud," KSII Transactions on Internet and Information Systems, vol. 12, no. 1, pp. 204-226, 2018. DOI: 10.3837/tiis.2018.01.010.
[ACM Style]
Hua Xu, Weiqing Liu, Guansheng Shu, and Jing Li. 2018. LDBAS: Location-aware Data Block Allocation Strategy for HDFS-based Applications in the Cloud. KSII Transactions on Internet and Information Systems, 12, 1, (2018), 204-226. DOI: 10.3837/tiis.2018.01.010.
[BibTeX Style]
@article{tiis:21654, title="LDBAS: Location-aware Data Block Allocation Strategy for HDFS-based Applications in the Cloud", author="Hua Xu and Weiqing Liu and Guansheng Shu and Jing Li", journal="KSII Transactions on Internet and Information Systems", DOI={10.3837/tiis.2018.01.010}, volume={12}, number={1}, year="2018", month={January}, pages={204-226}}