Scalable and Accurate Clones Detection Based on Metrics for Dependence Graph

Authors

  • Sevak S. Sargsyan Yerevan State University
  • Shamil F. Kurmangaleev Yerevan State University
  • Artiom V. Baloian Yerevan State University
  • Hayk K. Aslanyan Yerevan State University

Keywords:

Metrics, Dependency graph, Scalable, Code clones, Compiler, Bit vector

Abstract

The article describes a new method of code clone detection for the C and C++ programming languages. The method is based on metrics computed over the program dependence graph. For every node of the program dependence graph, a characteristic vector is constructed that contains information about the node's neighbors. These characteristic vectors are represented as 64-bit integers, which allows the similarity between two nodes to be determined in amortized constant time. This makes it possible to analyze million-line projects efficiently. High accuracy of the detected clones is achieved by checking the source code locations of the corresponding nodes. The paper also describes a new approach to dependence graph generation, which builds the graphs much faster than any of the existing methods. The method was compared with several widely used tools and performs better in both execution time and accuracy.
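
The idea of comparing 64-bit characteristic vectors in constant time can be illustrated with a minimal sketch. The abstract does not specify how neighbor information is encoded, so the encoding below (one bit per neighbor kind) and the names `characteristic_vector` and `similarity` are assumptions made for illustration only, not the authors' implementation.

```python
MASK64 = (1 << 64) - 1  # keep values within 64 bits


def characteristic_vector(neighbor_kinds):
    """Pack neighbor kinds (hypothetical small ints in 0..63) into one 64-bit integer.

    Assumption: each kind of neighboring node sets one bit. The real
    encoding used in the paper is not given in the abstract.
    """
    vec = 0
    for kind in neighbor_kinds:
        if not 0 <= kind < 64:
            raise ValueError("neighbor kind must be in the range 0..63")
        vec |= 1 << kind
    return vec


def similarity(a, b):
    """Return the fraction of the 64 bit positions on which a and b agree.

    XOR marks the differing bits; counting them takes a fixed amount of
    work because the width is bounded by 64 bits.
    """
    differing = bin((a ^ b) & MASK64).count("1")
    return 1.0 - differing / 64


if __name__ == "__main__":
    node_a = characteristic_vector([0, 3, 5])
    node_b = characteristic_vector([0, 3, 6])
    print(similarity(node_a, node_b))
```

Because each comparison reduces to one XOR and one bit count over a fixed-width word, the cost per node pair does not grow with the number of neighbors, which is consistent with the constant-time comparison described above.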

Author Biographies

Sevak S. Sargsyan, Yerevan State University

Laboratory of System Programming IT Educational and Research Centre

Shamil F. Kurmangaleev, Yerevan State University

Laboratory of System Programming IT Educational and Research Centre

Artiom V. Baloian, Yerevan State University

Laboratory of System Programming IT Educational and Research Centre

Hayk K. Aslanyan, Yerevan State University

Laboratory of System Programming IT Educational and Research Centre

Published

2021-12-10

How to Cite

Sargsyan, S. S., Kurmangaleev, S. F., Baloian, A. V., & Aslanyan, H. K. (2021). Scalable and Accurate Clones Detection Based on Metrics for Dependence Graph. Mathematical Problems of Computer Science, 42, 54–62. Retrieved from http://mpcs.sci.am/index.php/mpcs/article/view/215