Automation of the preparation process of weakly-structured multi-dimensional data of sociological surveys in the Data Mining system
DOI:
https://doi.org/10.15276/hait.01.2018.1
Keywords:
information technology, data mining, pre-processing, preparation process of data, sociological surveys
Abstract
In order to obtain knowledge about the target audience, the preparation process of weakly-structured multi-dimensional data of sociological surveys was automated. The following techniques have been developed for automating data preparation: machine representation of the data; preprocessing of the sociological survey data in order to clean and filter it; transformation of the data into a feature space based on a formalized research objective; and nonlinear dimensionality reduction and visualization of the multi-dimensional data. As the research has shown, the procedures associated with obtaining the primary and secondary feature spaces are the most significant. The Orange3 framework, which includes component-based data mining software and is used as a module for Python, was used to create an information technology for preparing weakly-structured multi-dimensional data of sociological surveys in the Data Mining system. Testing of the automated preparation process within the sociological surveys Data Mining system made it possible to increase the reliability of decisions about the respondents' lifestyle, compared with the assessments of a sociologist with a master's-level qualification and with the respondents' own answers.
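The cleaning, filtering, and feature-space steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use Orange3; it relies only on the Python standard library. The survey fields (`age`, `sport`, `smoking`), the `max_missing` threshold, and the mean-age imputation are all hypothetical choices made for illustration.

```python
# Hypothetical survey responses; None marks a skipped question.
RESPONSES = [
    {"age": "25", "sport": "often", "smoking": "no"},
    {"age": "41", "sport": None, "smoking": "yes"},
    {"age": "abc", "sport": "never", "smoking": "no"},
    {"age": None, "sport": None, "smoking": None},
    {"age": "33", "sport": "often", "smoking": None},
]


def clean(responses, max_missing=1):
    """Treat non-numeric ages as missing, then drop sparse respondents."""
    cleaned = []
    for row in responses:
        row = dict(row)  # copy so the raw data stays untouched
        if row["age"] is not None and not row["age"].isdigit():
            row["age"] = None  # invalid answer, assumed equivalent to "missing"
        missing = sum(value is None for value in row.values())
        if missing <= max_missing:  # hypothetical filtering threshold
            cleaned.append(row)
    return cleaned


def encode(rows, categorical=("sport", "smoking")):
    """Map cleaned rows to numeric vectors (a simple primary feature space)."""
    # Collect the observed answer levels for each categorical question.
    levels = {
        q: sorted({r[q] for r in rows if r[q] is not None}) for q in categorical
    }
    ages = [int(r["age"]) for r in rows if r["age"] is not None]
    mean_age = sum(ages) / len(ages)  # assumed imputation rule for missing ages
    names = ["age"] + [f"{q}={lvl}" for q in categorical for lvl in levels[q]]
    matrix = []
    for r in rows:
        vector = [float(r["age"]) if r["age"] is not None else mean_age]
        for q in categorical:
            # One-hot encoding; a skipped question yields all zeros.
            vector += [1.0 if r[q] == lvl else 0.0 for lvl in levels[q]]
        matrix.append(vector)
    return names, matrix


if __name__ == "__main__":
    feature_names, feature_matrix = encode(clean(RESPONSES))
    print(feature_names)
    for vector in feature_matrix:
        print(vector)
```

The resulting matrix is the kind of numeric input that a nonlinear dimensionality reduction step would consume; that step is omitted here because the abstract does not specify which method was used.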