Constructing comprehensive summaries of large event sequences

Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2008)

Abstract
Event sequences capture system and user activity over time. Prior research on sequence mining has mostly focused on discovering local patterns. Though interesting, these patterns reveal local associations and fail to give a comprehensive summary of the entire event sequence. Moreover, the number of patterns discovered can be large. In this paper, we take an alternative approach and build short summaries that describe the entire sequence, while revealing local associations among events. We formally define the summarization problem as an optimization problem that balances between shortness of the summary and accuracy of the data description. We show that this problem can be solved optimally in polynomial time by using a combination of two dynamic-programming algorithms. We also explore more efficient greedy alternatives and demonstrate that they work well on large datasets. Experiments on both synthetic and real datasets illustrate that our algorithms are efficient and produce high-quality results, and reveal interesting local structures in the data.
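To make the trade-off between summary length and description accuracy concrete, here is a minimal Python sketch of a segmentation dynamic program. It is an illustration under stated assumptions, not the paper's algorithm: the abstract mentions a combination of two dynamic programs, and this sketch covers only a single one that splits the timeline into segments. The cost function (a fixed per-segment penalty plus the negative log-likelihood of each event type under its own per-segment rate) and the names `segment_cost`, `summarize`, and `penalty` are all hypothetical.

```python
import math

def segment_cost(counts, length, penalty):
    """Bits for one segment: a fixed penalty (assumed model cost) plus the
    negative log-likelihood of each event type under its own rate."""
    bits = penalty
    for c in counts:
        p = c / length
        if 0 < p < 1:  # rates of exactly 0 or 1 describe the data with no extra bits
            bits -= c * math.log2(p) + (length - c) * math.log2(1 - p)
    return bits

def summarize(rows, penalty=4.0):
    """rows[e][t] is 1 if event type e occurs at time t, else 0.
    Returns (total_cost, cuts), where cuts are end-exclusive segment ends."""
    n = len(rows[0])
    # prefix[e][t] = number of occurrences of event type e in rows[e][:t]
    prefix = [[0] * (n + 1) for _ in rows]
    for e, row in enumerate(rows):
        for t, x in enumerate(row):
            prefix[e][t + 1] = prefix[e][t] + x
    best = [0.0] + [math.inf] * n  # best[j] = cheapest summary of the first j timestamps
    back = [0] * (n + 1)           # back[j] = start of the last segment in that summary
    for j in range(1, n + 1):
        for i in range(j):
            counts = [p[j] - p[i] for p in prefix]
            cost = best[i] + segment_cost(counts, j - i, penalty)
            if cost < best[j]:
                best[j], back[j] = cost, i
    cuts, j = [], n
    while j > 0:  # walk the back-pointers to recover the segment ends
        cuts.append(j)
        j = back[j]
    return best[n], cuts[::-1]
```

Under these assumptions, a larger `penalty` favors fewer, longer segments (a shorter summary), while a smaller one favors a closer fit to the data. The double loop evaluates every candidate segment, so the sketch is polynomial in the sequence length but makes no attempt at the efficiency of the greedy alternatives the abstract refers to.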
Keywords
local association, interesting local structure, local pattern, entire event sequence, entire sequence, event sequence, optimization problem, sequence mining, summarization problem, comprehensive summary, large event sequence