Program clones detection as natural language texts fragments based on constructive-synthesizing modeling

Shynkarenko, Viktor I.; Kuropiatnyk, Olena S.; Demidovich, Inna M.

Files

paper2.pdf (2.71 MB)

Date

2025

Authors

Shynkarenko, Viktor I.

Kuropiatnyk, Olena S.

Demidovich, Inna M.

Publisher

CEUR Workshop Proceedings

Abstract

A method developed and tested for comparing the structure of natural language texts is adapted to the analysis of program texts. The method is based on stochastic grammars, including rules that describe the algorithmic structure of programs. The probability that a given structure appears is calculated as the product of the probabilities of its different program elements. Constructive-production modeling tools were used to form the rules. An experiment was conducted to verify whether the method can detect clones in program source code written in C++ and C#. Different types of tasks and their software implementations were studied: those that are equivalent in control flow but differ in calculations, and vice versa. The experiments showed that programs that solve different tasks but have almost identical algorithms have high similarity indicators. If the algorithms are similar but solve different tasks, the indicators are slightly lower. Low to medium similarity indicators are obtained when different tasks are solved with different algorithms; this residual similarity is due to the use of the syntax of a single programming language.

Description

This article is published under a Creative Commons Attribution 4.0 International (CC BY 4.0) licence. V. Shynkarenko: ORCID 0000-0001-8738-7225; O. Kuropiatnyk: ORCID 0000-0003-2286-884X; I. Demidovich: ORCID 0000-0002-3644-184X

Keywords

software, constructive-synthesizing modeling, natural language, formal language, program clone, information technology, КІТ

Citation

Shynkarenko, V., Kuropiatnyk, O., & Demidovich, I. (2025). Program clones detection as natural language texts fragments based on constructive-synthesizing modeling. In N. Khairova, V. Vysotska, N. Grabar, & T. Hamon (Eds.), Proceedings of the Computational Linguistics Workshop (CLW-CoLInS 2025) at the 9th International Conference on Computational Linguistics and Intelligent Systems (CoLInS 2025), Kharkiv, Ukraine, May 15–16, 2025 (Vol. 3976, pp. 11-22). CEUR-WS.org. https://ceur-ws.org/Vol-3976/

URI

https://ceur-ws.org/Vol-3976/
https://crust.ust.edu.ua/handle/123456789/21152

Collections

Articles (КІТ)

Creative Commons license

Except where otherwise noted, this item's license is described as Attribution 4.0 International (CC BY 4.0)
