Method for building ensemble classifiers of structured and unstructured data based on a unified approach

Olena O. Arsirii
Oleksandr K. Andronati

Abstract

Effective classification of heterogeneous data, both structured and unstructured, is essential in fields such as healthcare, finance, information security, and audio content analysis. This study aims to develop a unified approach for constructing ensemble classifiers that handle diverse data formats within a single framework, improving classification accuracy and robustness. The methodology integrates feature extraction and data preprocessing techniques that transform heterogeneous datasets into a standardized numerical format suitable for ensemble learning. Eight base classifiers (K-nearest neighbors, support vector machines, random forest, extreme gradient boosting, logistic regression, multilayer perceptron, convolutional neural networks, and long short-term memory networks) were trained with optimized hyperparameters. The ensemble classification uses stacking with several aggregation types (hard voting, soft voting, and soft voting using Gompertz fuzzy ranking) to combine model predictions while accounting for uncertainty and noise. Experimental evaluation across five datasets, covering medical diagnosis, credit risk, emotion recognition, music genres, and deepfake detection, demonstrates consistent improvement in accuracy and F1-score, with gains of up to 8 percent compared to the best individual classifiers. The approach proves particularly effective for unstructured audio data, where temporal and spectral dependencies pose significant challenges. The results underscore the versatility of the proposed unified ensemble methodology in addressing class imbalance and noise, offering a scalable solution adaptable to various domains. This work contributes a comprehensive framework that facilitates the development of robust classifiers for complex real-world data and paves the way for future research integrating heterogeneous data sources within cohesive predictive models.
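
As an illustration of two of the aggregation types named in the abstract, the following minimal sketch contrasts hard voting with soft voting for a single sample. It is not the authors' implementation: the function names `hard_vote` and `soft_vote`, the equal weighting of models, and the example probabilities are all assumptions made for illustration. The Gompertz fuzzy ranking variant is not reproduced, because the abstract does not give its formulation.

```python
from collections import Counter


def hard_vote(probas):
    """Majority vote over each model's top class.

    probas: one list of class probabilities per model (hypothetical input
    format, for a single sample). Ties go to the class voted for first.
    """
    votes = [max(range(len(p)), key=p.__getitem__) for p in probas]
    return Counter(votes).most_common(1)[0][0]


def soft_vote(probas):
    """Pick the class with the highest mean probability across models.

    All models are weighted equally here, which is an assumption.
    """
    n_classes = len(probas[0])
    mean = [sum(p[c] for p in probas) / len(probas) for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


# Illustrative probabilities from three hypothetical base classifiers.
sample = [
    [0.55, 0.45],  # weakly prefers class 0
    [0.60, 0.40],  # weakly prefers class 0
    [0.05, 0.95],  # strongly prefers class 1
]

print(hard_vote(sample), soft_vote(sample))
```

In this example the two rules disagree: two of three models vote for class 0, but the averaged probabilities favor class 1, because soft voting takes each model's confidence into account.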

Section

Theoretical aspects of computer science, programming and data analysis

Author Biographies

Olena O. Arsirii, Odesa Polytechnic National University. 1, Shevchenko Ave. Odesa, 65044, Ukraine

Doctor of Engineering Sciences, Professor, Head of the Department of Information Systems

Scopus Author ID 54419480900

Oleksandr K. Andronati, Odesa Polytechnic National University. 1, Shevchenko Ave. Odesa, 65044, Ukraine

Graduate student, Department of Information Systems

Scopus Author ID 58677655800
