Uncertainty-aware multi-objective refactoring for code duplication

Dmytro D. Kurinko; Viktoriia I. Kryvda

doi:10.15276/hait.08.2025.19

Published:
2025-09-26

Keywords:

Clone refactoring, artificial intelligence in software engineering, machine learning, deep learning, clone classification, open-set recognition, uncertainty estimation

How to Cite

Kurinko D. D.; Kryvda V. I. "Uncertainty-Aware Multi-Objective Refactoring for Code Duplication" Publ. Nauka i Tekhnika. Odesa: Ukraine. Herald of Advanced Information Technology 8 (3), 301–315. https://doi.org/10.15276/hait.08.2025.19.

Dmytro D. Kurinko

Odesa Polytechnic National University, 1, Shevchenko Ave. Odesa, 65044

https://orcid.org/0000-0001-8304-3257

Viktoriia I. Kryvda

Odesa Polytechnic National University, 1, Shevchenko Ave. Odesa, 65044

https://orcid.org/0000-0002-0930-1163

Abstract

Code clones are recurring code fragments that may hinder software maintainability if not properly managed. While many clone detection tools exist, they often stop at identification and provide no clear guidance on whether a detected clone group should be refactored, how to do so, or in what order. This paper presents a machine learning–based method for recommending clone refactorings with prioritization and confidence estimation. The proposed approach represents code fragments using abstract syntax trees, program dependency graphs, and semantic embeddings from a pre-trained CodeBERT model. In addition, version control data is used to extract evolutionary features such as churn, age, and co-change patterns. A multi-class classifier predicts refactoring types, while open-set recognition techniques identify uncertain cases and flag them as unknown. Effort and benefit estimation models help prioritize suggestions based on a cost-effectiveness ratio. We evaluated the method on four open-source Java projects using a manually labeled dataset of 600 clone groups. The system achieves a macro-F1 score of 0.76 on known refactoring types and an AUROC of 0.91 for unknown detection. Prioritized recommendation quality reaches NDCG@3 of 0.89, showing strong alignment with expert assessments. The results indicate that clone refactoring can be effectively supported through integrated code representation, uncertainty modeling, and prioritization. The approach transforms clone analysis from a passive task into an actionable process.
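
The prioritization and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Suggestion` fields, the 0.6 confidence threshold, and the use of a maximum-probability cutoff as the open-set rule are all assumptions made for illustration (the abstract does not say which open-set technique is used). The sketch flags low-confidence predictions as unknown, ranks the rest by a benefit-to-effort ratio, and computes NDCG@k with the standard log2 discount.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Suggestion:
    """Hypothetical record for one clone group (field names are assumptions)."""
    clone_group: str
    probs: Dict[str, float]  # class probabilities from some classifier
    benefit: float           # estimated benefit of refactoring
    effort: float            # estimated effort, assumed to be > 0


def label(s: Suggestion, threshold: float = 0.6) -> str:
    """Return the most probable refactoring type, or "unknown" when the
    top probability falls below the (assumed) confidence threshold."""
    best = max(s.probs, key=s.probs.get)
    return best if s.probs[best] >= threshold else "unknown"


def prioritize(suggestions: List[Suggestion], threshold: float = 0.6) -> List[Suggestion]:
    """Drop unknown cases and sort the rest by benefit/effort, highest first."""
    known = [s for s in suggestions if label(s, threshold) != "unknown"]
    return sorted(known, key=lambda s: s.benefit / s.effort, reverse=True)


def ndcg_at_k(relevances: List[float], k: int = 3) -> float:
    """NDCG@k for relevance scores listed in the order the system ranked them."""
    def dcg(rels: List[float]) -> float:
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


suggestions = [
    Suggestion("A", {"extract_method": 0.9, "pull_up_method": 0.1}, benefit=8.0, effort=2.0),
    Suggestion("B", {"extract_method": 0.5, "pull_up_method": 0.5}, benefit=9.0, effort=1.0),
    Suggestion("C", {"extract_method": 0.3, "pull_up_method": 0.7}, benefit=9.0, effort=1.0),
]
ranked = [s.clone_group for s in prioritize(suggestions)]
```

In this toy input, group B is withheld as unknown despite its high benefit-to-effort ratio, which is the point of combining uncertainty estimation with prioritization: an uncertain suggestion is not ranked at all.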

Issue

Vol. 8 No. 3 (2025): Herald of Advanced Information Technology

Section

Theoretical aspects of computer science, programming and data analysis

Author Biographies

Dmytro D. Kurinko, Odesa Polytechnic National University, 1, Shevchenko Ave. Odesa, 65044

PhD Student, Artificial Intelligence and Data Analysis Department

Viktoriia I. Kryvda, Odesa Polytechnic National University, 1, Shevchenko Ave. Odesa, 65044

PhD, Associate Professor, Department of Electricity and Energy Management, Head of Department of Postgraduate and Doctoral Studies
