KSII Transactions on Internet and Information Systems
Monthly Online Journal (eISSN: 1976-7277)

An Empirical Performance Analysis on Hadoop via Optimizing the Network Heartbeat Period

Vol. 12, No. 11, November 29, 2018
DOI: 10.3837/tiis.2018.11.005

Abstract

To support large-scale clusters, Hadoop piggybacks important messages, including task scheduling and completion messages, on its heartbeat messages, which reduces the number of messages received by the NameNode. Although Hadoop is designed and optimized for high-throughput batch processing, real-time processing of large amounts of data in Hadoop is increasingly important. This paper evaluates Hadoop’s performance and costs when the heartbeat period is tuned to support latency-sensitive applications. Through an empirical study based on the Hadoop 2.0 (YARN) [1] architecture, we improve Hadoop’s I/O performance and application performance by up to 13 percent compared to the default configuration. We offer a guideline, based on simple equations, for predicting the performance, costs, and limitations of the total system as the heartbeat period changes. Through extensive experiments, we also show that Hive performance can be improved by tuning Hadoop’s heartbeat periods.
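
The trade-off described in the abstract can be illustrated with a back-of-the-envelope model. The sketch below is not the paper's set of equations; it is a toy model under two stated assumptions: every worker node sends exactly one heartbeat per period, and an event such as a task completion becomes ready at a uniformly random moment within the period, so it waits half a period on average before it can be piggybacked. The function name `heartbeat_tradeoff`, the 100-node cluster size, and the listed periods are hypothetical and chosen only for illustration.

```python
def heartbeat_tradeoff(num_nodes: int, period_s: float) -> dict:
    """Toy model of heartbeat cost vs. latency (an assumption, not the paper's equations).

    Cost:    the master receives one heartbeat per node per period.
    Latency: a piggybacked event waits for the next heartbeat; with a
             uniformly random ready time this is period/2 on average
             and at most one full period.
    """
    if num_nodes <= 0 or period_s <= 0:
        raise ValueError("num_nodes and period_s must be positive")
    return {
        "messages_per_second": num_nodes / period_s,
        "mean_wait_s": period_s / 2.0,
        "worst_wait_s": period_s,
    }


if __name__ == "__main__":
    # Hypothetical 100-node cluster; the periods are illustrative values only.
    for period in (3.0, 1.0, 0.5, 0.1):
        result = heartbeat_tradeoff(100, period)
        print(
            f"period={period:>4} s  "
            f"load={result['messages_per_second']:.1f} msg/s  "
            f"mean wait={result['mean_wait_s']:.3f} s"
        )
```

In this model, halving the period halves the mean piggybacking delay but doubles the message load on the master, which is the kind of cost and limitation a tuning guideline has to weigh.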




Cite this article

[IEEE Style]
J. Lee, J. Choi, H. Roh, J. S. Shin, "An Empirical Performance Analysis on Hadoop via Optimizing the Network Heartbeat Period," KSII Transactions on Internet and Information Systems, vol. 12, no. 11, pp. 5252-5268, 2018. DOI: 10.3837/tiis.2018.11.005.

[ACM Style]
Jaehwan Lee, June Choi, Hongchan Roh, and Ji Sun Shin. 2018. An Empirical Performance Analysis on Hadoop via Optimizing the Network Heartbeat Period. KSII Transactions on Internet and Information Systems, 12, 11, (2018), 5252-5268. DOI: 10.3837/tiis.2018.11.005.

[BibTeX Style]
@article{tiis:21917,
  title={An Empirical Performance Analysis on Hadoop via Optimizing the Network Heartbeat Period},
  author={Jaehwan Lee and June Choi and Hongchan Roh and Ji Sun Shin},
  journal={KSII Transactions on Internet and Information Systems},
  doi={10.3837/tiis.2018.11.005},
  volume={12},
  number={11},
  year={2018},
  month={November},
  pages={5252-5268}
}