HyperLogLogLog: Cardinality Estimation With One Log More

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining(2022)

引用 6|浏览27
暂无评分
摘要
We present HyperLogLogLog, a practical compression of the HyperLogLog sketch that compresses the sketch from $O(młogłog n)$ bits down to $m łog_2łog_2łog_2 m + O(m+łogłog n)$ bits for estimating the number of distinct elements~n using m~registers. The algorithm works as a drop-in replacement that preserves all estimation properties of the HyperLogLog sketch, it is possible to convert back and forth between the compressed and uncompressed representations, and the compressed sketch maintains mergeability in the compressed domain. The compressed sketch can be updated in amortized constant time, assuming n is sufficiently larger than m. We provide a C++ implementation of the sketch, and show by experimental evaluation against well-known implementations by Google and Apache that our implementation provides small sketches while maintaining competitive update and merge times. Concretely, we observed approximately a 40% reduction in the sketch size. Furthermore, we obtain as a corollary a theoretical algorithm that compresses the sketch down to $młog_2łog_2łog_2łog_2 m+O(młogłogłog m/łogłog m+łogłog n)$ bits.
更多
查看译文
关键词
cardinality estimation,hyperlogloglog
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要