Evaluating List Intersection On Ssds For Parallel I/O Skipping

2021 IEEE 37TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2021)(2021)

引用 1|浏览23
暂无评分
摘要
List intersection is at the core of information retrieval systems. Existing disk-based intersection algorithms were optimized for hard disk drives (HDDs) since HDDs have dominated the storage market for decades. In particular, those HDDcentric algorithms read every relevant list entirely to memory to minimize expensive random reads by performing sequential reads, although many entries in the list may be useless. Such a tradeoff makes perfect sense on HDDs, because random reads are one to two orders of magnitude slower than sequential reads. However, fast solid state drives (SSDs) have changed this landscape by improving random I/O performance dramatically. More importantly, they are manufactured with multiple flash channels to support parallel I/Os. As a result, the performance gap between random and sequential reads becomes very small on SSDs. This means that HDD-optimized intersection algorithms might not be suitable on SSDs because the total amount of data accessed is unnecessarily high.To understand the impact of SSDs to list intersection, in this work, we tune existing in-memory intersection algorithms to be SSD-aware with the idea of parallel I/O skipping, and experimentally evaluate them on synthetic and real datasets. The results provide insights on how to design efficient SSD-optimized intersection algorithms.
更多
查看译文
关键词
list intersection,information retrieval systems,hard disk drives,HDD-centric algorithms,fast solid state drives,HDD-optimized intersection algorithms,in-memory intersection algorithms,parallel I/O skipping,disk-based intersection algorithms,SSD-optimized intersection algorithms
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要