KSII Transactions on Internet and Information Systems
Monthly Online Journal (eISSN: 1976-7277)

LDBAS: Location-aware Data Block Allocation Strategy for HDFS-based Applications in the Cloud

Vol. 12, No. 1, January 31, 2018
DOI: 10.3837/tiis.2018.01.010

Abstract

Big data processing applications have gradually migrated to the cloud to take advantage of cloud computing. The Hadoop Distributed File System (HDFS) is one of the fundamental support systems for big data processing on MapReduce-like frameworks such as Hadoop and Spark. Because HDFS is not aware of the co-location of virtual machines in the cloud, its default block allocation scheme fits cloud environments poorly in two respects: loss of data reliability and performance degradation. In this paper, we present a novel location-aware data block allocation strategy (LDBAS). LDBAS jointly optimizes data reliability and performance for upper-layer applications by allocating data blocks according to the locations and the differing processing capacities of virtual nodes in the cloud. We apply LDBAS to two stages of data allocation in HDFS in the cloud (initial data allocation and data recovery) and design the corresponding algorithms. Finally, we implement LDBAS in a real Hadoop cluster and evaluate its performance with the BigDataBench benchmark suite. The experimental results show that LDBAS guarantees the designed data reliability while reducing the job execution time of I/O-intensive applications in Hadoop by 8.9% on average, and by up to 11.2%, compared with the original Hadoop in the cloud.
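
The idea of placing replicas by location and processing capacity can be illustrated with a minimal sketch. It is not the LDBAS algorithm from the paper, whose details the abstract does not give. It assumes a simple greedy rule: never put two replicas of a block on virtual nodes that share a physical host, and prefer nodes with higher capacity. The names `VirtualNode`, `physical_host`, `capacity`, and `choose_replica_nodes` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VirtualNode:
    """A virtual DataNode (hypothetical model, not an HDFS class)."""
    name: str
    physical_host: str  # physical machine the VM runs on (assumed known)
    capacity: float     # relative processing capacity (assumed metric)


def choose_replica_nodes(nodes: List[VirtualNode],
                         replicas: int = 3) -> List[VirtualNode]:
    """Greedily pick `replicas` nodes on distinct physical hosts.

    Nodes are visited from highest to lowest capacity (ties broken by
    name), and a node is skipped if its physical host already holds a
    replica. This is an illustrative rule, not the paper's method.
    """
    chosen: List[VirtualNode] = []
    used_hosts = set()
    for node in sorted(nodes, key=lambda n: (-n.capacity, n.name)):
        if node.physical_host in used_hosts:
            continue  # co-located with an already chosen replica
        chosen.append(node)
        used_hosts.add(node.physical_host)
        if len(chosen) == replicas:
            return chosen
    raise ValueError("not enough distinct physical hosts for all replicas")
```

A placement that ignores location could put every replica of a block on virtual machines of one physical host, so a single hardware failure would lose the block. The distinct-host constraint in the sketch avoids that case, and the capacity ordering stands in for the performance side of the trade-off.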



Cite this article

[IEEE Style]
Hua Xu, Weiqing Liu, Guansheng Shu and Jing Li, "LDBAS: Location-aware Data Block Allocation Strategy for HDFS-based Applications in the Cloud," KSII Transactions on Internet and Information Systems, vol. 12, no. 1, pp. 204-226, 2018. DOI: 10.3837/tiis.2018.01.010

[ACM Style]
Xu, H., Liu, W., Shu, G., and Li, J. 2018. LDBAS: Location-aware Data Block Allocation Strategy for HDFS-based Applications in the Cloud. KSII Transactions on Internet and Information Systems, 12, 1, (2018), 204-226. DOI: 10.3837/tiis.2018.01.010