Optimization of analysis and minimization of information losses in text mining

Authors

Olha O. Mezentseva, Anna S. Kolomiiets

DOI:

https://doi.org/10.15276/hait.01.2020.4

Keywords:

text analysis, annotation, text mining, software, algorithm, text data, natural language

Abstract

Information is one of the most important resources in today's business environment. It is difficult for any company to succeed without sufficient information about its customers, employees and other key stakeholders. Every day, companies receive unstructured and structured text from a variety of sources, such as survey results, tweets, call center notes, online customer reviews, recorded interactions, emails and other documents. These sources provide raw text that is difficult to understand without the right text analysis tool. Text analytics can be done manually, but the manual process is inefficient. Traditional systems rely on keywords and cannot read and understand the language used in emails, tweets, web pages and text documents. For this reason, companies use text analysis software to analyze large amounts of text data. The software helps users retrieve textual information and act on it. Manual annotation is currently the most common approach, which can be attributed to the high quality of the annotations and their “meaningfulness”. Typical disadvantages of manual annotation systems and textual information analysis systems are high material costs and inherently low speed of work. Therefore, this article explores methods for effectively annotating reviews of various products from the largest marketplace in Ukraine. The following tasks should be solved: to analyze modern approaches to data analysis and processing; to study basic algorithms for data analysis and processing; to build a program that collects data, designing its architecture for more efficient use on the basis of the latest technologies; to clean the data using techniques that minimize information loss; to analyze the collected data using data analysis and processing approaches; and to draw conclusions from the results of all the above work. There are many varieties of the listed tasks, as well as many methods of solving them, which again confirms the importance and relevance of the chosen topic. The purpose of the study is to examine the methods and means by which information losses can be minimized when analyzing and processing textual data. The object of the study is the process of minimizing information losses in the analysis and processing of textual data. In the course of the study, recent research on the analysis and processing of textual information was reviewed, and methods of textual information processing and Data Mining algorithms were analyzed.

Author Biographies

Olha O. Mezentseva, Taras Shevchenko National University of Kyiv, Volodymyrska Street, 60, Kyiv, Ukraine, 01033

Candidate of Economic Sciences, Assistant of the Technology Management Department

Anna S. Kolomiiets, Taras Shevchenko National University of Kyiv, Volodymyrska Street, 60, Kyiv, Ukraine, 01033

Candidate of Economic Sciences, Assistant of the Technology Management Department

Published

2020-02-19

How to Cite

Mezentseva, O. O., & Kolomiiets, A. S. (2020). Optimization of analysis and minimization of information losses in text mining. Herald of Advanced Information Technology, 3(1), 373–382. https://doi.org/10.15276/hait.01.2020.4
