Evaluating the Limits of the Current Evaluation Metrics for Topic Modeling

WebMedia '23: Proceedings of the 29th Brazilian Symposium on Multimedia and the Web (2023)

Abstract
Topic Modeling (TM) is a popular approach to extracting and organizing information from large amounts of textual data by discovering and representing the semantic topics underlying a document collection. In this paper, we investigate an important challenge in the TM context, namely topic evaluation, which drives advances in the field by assessing the overall quality of the topic generation process. Traditional TM metrics capture topic quality by strictly evaluating the words that constitute the topics, either syntactically (e.g., NPMI, TF-IDF Coherence) or semantically (e.g., WEP). Here, we investigate whether we are approaching the limits of what the current evaluation metrics can assess regarding topic quality. We performed a comprehensive experiment considering three data collections widely used in automatic classification, for which each document's topic (class) is known (ACM, 20News, and WebKB). We contrast the quality of the topics generated by four of the main TM techniques (LDA, NMF, CluWords, and BERTopic) against the known topic structure of each collection. Our results show that, despite their importance, the current metrics fail to capture some important idiosyncratic aspects of TM, indicating the need for new metrics that consider, for example, the structure and organization of the documents that comprise the topics.