Scalable and Accurate Clones Detection Based on Metrics for Dependence Graph
Keywords:
Metrics, Dependency graph, Scalable, Code clones, Compiler, Bit vector
Abstract
This article describes a new method of code clone detection for the C/C++ programming languages. The method is based on metrics computed over the program dependence graph. For every node of the program dependence graph, a characteristic vector is constructed that encodes information about the node's neighbors. Each characteristic vector is represented as a 64-bit integer, which allows the similarity between two nodes to be determined in amortized constant time. As a result, projects with millions of lines of source code can be analyzed efficiently. High accuracy of the detected clones is achieved by checking the source code locations of the corresponding nodes. The paper also describes a new approach to dependence graph generation, which builds the graphs much faster than existing methods. The method was compared with several widely used tools and outperforms them in both execution time and accuracy.
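The abstract does not specify how neighbor information is packed into the 64-bit integer, so the following is only a minimal sketch under stated assumptions: each bit marks the presence of at least one neighbor of a given kind, and similarity is taken to be the Jaccard index of the two bit sets. The names `CharVec`, `makeCharVec`, and `similarity`, the kind-to-bit mapping, and the choice of Jaccard are all hypothetical and are not taken from the paper. The sketch is meant to show why a fixed-width bit vector makes a node comparison cost a handful of machine operations regardless of how many neighbors a node has.

```cpp
#include <bitset>
#include <cstdint>
#include <vector>

// Hypothetical encoding (an assumption, not the paper's): bit k is set when
// the node has at least one neighbor whose kind maps to slot k.
using CharVec = std::uint64_t;

// Build a characteristic vector from the kinds of a node's neighbors.
CharVec makeCharVec(const std::vector<unsigned>& neighborKinds) {
    CharVec v = 0;
    for (unsigned kind : neighborKinds) {
        v |= CharVec{1} << (kind % 64);  // fold arbitrary kinds into 64 slots
    }
    return v;
}

// Jaccard similarity of two characteristic vectors: shared bits divided by
// bits set in either vector. Two empty vectors are treated as identical.
double similarity(CharVec a, CharVec b) {
    const auto common = std::bitset<64>(a & b).count();
    const auto total = std::bitset<64>(a | b).count();
    return total == 0 ? 1.0 : static_cast<double>(common) / total;
}
```

Because the comparison is one AND, one OR, and two population counts, its cost is independent of node degree; a detector could use such a score as a cheap filter before the more expensive source-location check mentioned in the abstract.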
License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.