Information system for analyzing public sentiment in web platforms based on machine learning
DOI:
https://doi.org/10.15276/hait.07.2024.14
Keywords:
Web platform, information system, public mood, propaganda, disinformation, fake, message, text, data mining, artificial intelligence, machine learning
Abstract
Existing systems for studying public sentiment on web platforms are analyzed. Tools and methods for determining the sentiment of textual data from web platforms are described, including the formalization of the social graph and the content graph. The process of classifying comments, which includes the systematization and categorization of statements, is investigated. Customer reviews and hotel ratings in Europe from the booking.com web platform are selected as the dataset for the study. Given the requirements of the information system and the results of the analysis, machine learning methods combined with natural language processing of text data are found to be the most appropriate way to determine the emotional connotation of user reviews and messages. Among the text vectorization methods studied, the Term Frequency-Inverse Document Frequency (TF-IDF) vectorizer proved the most effective. An architecture for the system is proposed, aimed at effective interaction between its components and modules. The LogisticRegression model is chosen to determine public mood. An information system has been developed that analyzes public sentiment about objects, uses machine learning to assess the emotional connotation of text comments, and provides users with insights into and analysis of the results.
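To illustrate the pairing of TF-IDF vectorization with logistic regression named in the abstract, the following is a minimal sketch in plain Python, not the authors' implementation. The four training reviews, the helper names (`fit_idf`, `tfidf_vector`, `train_logreg`, `sentiment`), the smoothed IDF variant, and the training hyperparameters are all assumptions chosen for illustration; a real system would more likely rely on a library such as scikit-learn and on the full booking.com dataset.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; real preprocessing would be richer.
    return text.lower().split()


def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}


def tfidf_vector(doc, idf):
    # Term frequency times IDF, then L2 normalization.
    # Terms unseen during fitting are ignored.
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(vectors, labels, epochs=200, lr=0.5):
    # Logistic regression fitted by plain stochastic gradient descent
    # on the log-loss (no regularization, for brevity).
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            z = bias + sum(weights.get(t, 0.0) * v for t, v in vec.items())
            err = sigmoid(z) - y
            bias -= lr * err
            for t, v in vec.items():
                weights[t] = weights.get(t, 0.0) - lr * err * v
    return weights, bias


# Hypothetical toy reviews: 1 = positive, 0 = negative.
train_texts = [
    "clean room and friendly staff",
    "great location and friendly service",
    "dirty room and rude staff",
    "terrible noise and rude service",
]
train_labels = [1, 1, 0, 0]

idf = fit_idf(train_texts)
train_vectors = [tfidf_vector(t, idf) for t in train_texts]
weights, bias = train_logreg(train_vectors, train_labels)


def sentiment(text):
    # Probability of the positive class, thresholded at 0.5.
    vec = tfidf_vector(text, idf)
    z = bias + sum(weights.get(t, 0.0) * v for t, v in vec.items())
    return "positive" if sigmoid(z) >= 0.5 else "negative"


print(sentiment("friendly"), sentiment("rude"))
```

In this sketch a word that occurs in every review, such as "and", receives the minimum IDF of 1.0 and therefore carries the least weight, while words that occur only in positive or only in negative reviews dominate the learned coefficients.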