Application Research on Semantic Analysis Using Latent Dirichlet Allocation and Collapsed Gibbs Sampling for Topic Discovery

Ogunwale, Yetunde Esther and Ajinaja, Micheal Olalekan (2023) Application Research on Semantic Analysis Using Latent Dirichlet Allocation and Collapsed Gibbs Sampling for Topic Discovery. Asian Journal of Research in Computer Science, 16 (4). pp. 445-452. ISSN 2581-8260

Ajinaja1642023AJRCOS110817.pdf - Published Version


Abstract

Topic discovery is the process of identifying the main topics present in a collection of documents. It is a crucial step in text mining, digital humanities, and information retrieval, because it allows meaningful information to be extracted from large volumes of unstructured text. The most widely used algorithm for topic discovery is Latent Dirichlet Allocation (LDA). LDA assumes that the words in each document are generated by a small number of underlying topics, and it learns those topics from the text automatically. One of the main problems of LDA is that the extracted topics are of poor quality when a document does not coherently belong to a single topic. Collapsed Gibbs sampling, however, operates word by word: each step updates the topic assignment of a single word, which allows it to be used on documents that cover a variety of topics. This paper presents application research on semantic analysis using Latent Dirichlet Allocation and collapsed Gibbs sampling for topic discovery.
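
The word-by-word update described in the abstract can be illustrated with a minimal sketch of collapsed Gibbs sampling for LDA. This is not the authors' implementation: the function name collapsed_gibbs_lda, the toy corpus, and the hyperparameter values alpha and beta are assumptions chosen for illustration, and words are assumed to be already mapped to integer ids. Each step removes one word's current topic assignment from the count tables, resamples a topic for that word from the standard collapsed conditional, and restores the counts.

```python
import random


def collapsed_gibbs_lda(docs, num_topics, vocab_size,
                        alpha=0.1, beta=0.01, iters=200, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA (hypothetical helper).

    docs is a list of documents, each a list of integer word ids.
    alpha and beta are symmetric Dirichlet hyperparameters (assumed values).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                 # n(d, k)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # n(k, w)
    topic_total = [0] * num_topics                               # n(k)
    assignments = []

    # Start from a random topic for every word occurrence.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove this single word's current assignment from the counts.
                k_old = assignments[d][i]
                doc_topic[d][k_old] -= 1
                topic_word[k_old][w] -= 1
                topic_total[k_old] -= 1

                # Unnormalized conditional probability of each topic
                # for this word, given all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                k_new = rng.choices(range(num_topics), weights=weights)[0]

                # Record the new assignment and restore the counts.
                assignments[d][i] = k_new
                doc_topic[d][k_new] += 1
                topic_word[k_new][w] += 1
                topic_total[k_new] += 1

    return assignments, doc_topic, topic_word


# Toy corpus (assumed): word ids 0-2 and 3-5 form two loose groups,
# and the third document mixes both.
docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3, 4], [0, 1, 3, 4, 2, 5]]
assignments, doc_topic, topic_word = collapsed_gibbs_lda(
    docs, num_topics=2, vocab_size=6)
print(doc_topic)  # per-document topic counts
```

Because topics are sampled per word rather than per document, the mixed third document can end up with words assigned to more than one topic, which is the property the abstract highlights.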

Item Type: Article
Subjects: STM Repository > Computer Science
Depositing User: Managing Editor
Date Deposited: 30 Dec 2023 07:43
Last Modified: 30 Dec 2023 07:43
URI: http://classical.goforpromo.com/id/eprint/4976
