• KSII Transactions on Internet and Information Systems
    Monthly Online Journal (eISSN: 1976-7277)

Tree-Pattern-Based Clone Detection with High Precision and Recall

Vol. 12, No. 5, May 30, 2018
DOI: 10.3837/tiis.2018.05.002

Abstract

This paper proposes a code-clone detection method that aims for the highest possible precision and recall, with little attention paid to efficiency and scalability. The goal is to automatically create a reliable reference corpus that can serve as a basis for evaluating the precision and recall of clone detection tools. The algorithm takes an abstract-syntax-tree representation of source code and exhaustively examines every possible pair of duplicate tree patterns in the tree, while avoiding unnecessary and repeated comparisons wherever possible. The largest possible duplicate patterns are then collected into a set of pattern clusters, which are used to identify code clones. The method is implemented and evaluated on a standard set of open-source Java applications. The experimental results show very high precision and recall. All false negatives (clones missed by our method) are non-contiguous clones. Finally, we propose the concept of neighbor patterns, which can improve recall by detecting non-contiguous and intertwined clones.
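The general idea of collecting the largest duplicated structures of a syntax tree into clusters can be illustrated with a small sketch. This is not the paper's algorithm: it matches whole subtrees rather than tree patterns, it uses Python's standard `ast` module instead of a Java front end, and the names `duplicate_clusters`, `min_size`, and `SRC` are hypothetical, introduced only for this example.

```python
import ast
from collections import Counter, defaultdict

def duplicate_clusters(source, min_size=5):
    """Group the largest structurally identical subtrees of a Python AST.

    Simplifying assumption: two subtrees are duplicates when their node
    types match position by position; identifier names and literal values
    are ignored.
    """
    tree = ast.parse(source)
    keys = {}  # id(node) -> structural key of the subtree rooted at node

    def key_of(node):
        children = tuple(key_of(c) for c in ast.iter_child_nodes(node))
        keys[id(node)] = (type(node).__name__, children)
        return keys[id(node)]

    def size(key):
        # Number of nodes in the subtree described by a structural key.
        return 1 + sum(size(c) for c in key[1])

    key_of(tree)
    counts = Counter(keys[id(n)] for n in ast.walk(tree))
    clusters = defaultdict(list)

    def visit(node):
        k = keys[id(node)]
        if counts[k] >= 2 and size(k) >= min_size:
            # Record the duplicate and stop descending, so that only the
            # largest duplicated subtree on this path is reported.
            clusters[k].append(getattr(node, "lineno", None))
            return
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(tree)
    # Keep clusters that still have at least two recorded occurrences.
    return sorted(sorted(lines) for lines in clusters.values() if len(lines) >= 2)

SRC = '''
def area(w, h):
    total = w * h
    return total + 1

def volume(a, b):
    result = a * b
    return result + 1

def other(x):
    return x
'''

print(duplicate_clusters(SRC))  # prints [[2, 6]]
```

The two functions on lines 2 and 6 differ only in identifier names, so they fall into one cluster, while `other` is left out. A top-down walk is used here so that nested duplicates inside an already reported clone are not listed separately; the pairwise tree-pattern comparison described in the abstract is more thorough than this shortcut.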



Cite this article

[IEEE Style]
H. Lee, M. Choi, and K. Doh, "Tree-Pattern-Based Clone Detection with High Precision and Recall," KSII Transactions on Internet and Information Systems, vol. 12, no. 5, pp. 1932-1950, 2018. DOI: 10.3837/tiis.2018.05.002.

[ACM Style]
Hyo-Sub Lee, Myung-Ryul Choi, and Kyung-Goo Doh. 2018. Tree-Pattern-Based Clone Detection with High Precision and Recall. KSII Transactions on Internet and Information Systems, 12, 5, (2018), 1932-1950. DOI: 10.3837/tiis.2018.05.002.

[BibTeX Style]
@article{tiis:21749, title={Tree-Pattern-Based Clone Detection with High Precision and Recall}, author={Hyo-Sub Lee and Myung-Ryul Choi and Kyung-Goo Doh}, journal={KSII Transactions on Internet and Information Systems}, doi={10.3837/tiis.2018.05.002}, volume={12}, number={5}, year={2018}, month={May}, pages={1932-1950}}