An Ensemble Classification Algorithm for Imbalanced Text Data Streams

International Conference on Artificial Intelligence (2020)

Abstract
This paper proposes an ensemble classification algorithm for unbalanced text data streams. First, an improved resampling method is used to build balanced data subsets. Second, a topic model is applied to the balanced data subsets to build document-topic training subsets. Finally, an ensemble classifier is constructed using the WE ensemble model. The algorithm uses the difference between the F-values of neighboring data blocks, compared against a given threshold, as the criterion for updating the classifier. When the ensemble classifier is updated, the misclassified positive instances are added to the error set and the base classifier is then retrained. Experimental results show that the proposed algorithm achieves good classification performance both on the positive instances and on all instances, which indicates that it is an effective classification algorithm for unbalanced text data streams.
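The update criterion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `f_value`, `needs_update`, and `misclassified_positives`, the default threshold of 0.1, and the use of the absolute difference between adjacent blocks are all assumptions made for illustration, since the abstract does not give these details.

```python
def f_value(y_true, y_pred, positive=1):
    """F-value (F1) of the positive class on one data block."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def needs_update(prev_f, curr_f, threshold=0.1):
    """Assumed rule: update when adjacent-block F-values differ by more than threshold."""
    return abs(curr_f - prev_f) > threshold


def misclassified_positives(docs, y_true, y_pred, positive=1):
    """Positive instances the classifier got wrong (candidates for the error set)."""
    return [d for d, t, p in zip(docs, y_true, y_pred)
            if t == positive and p != positive]
```

In a streaming loop, one would compute `f_value` on each incoming block, compare it with the previous block's value via `needs_update`, and, when an update is triggered, extend the error set with `misclassified_positives` before retraining the base classifier.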
Keywords
text data streams, classification, unbalanced, resampling, ensemble model