P-Massive: A Real-Time Search Engine for a Multi-Terabyte Mass Spectrometry Database

SC22: International Conference for High Performance Computing, Networking, Storage and Analysis (2022)

Abstract
Queries of multi-TB Mass Spectrometry (MS) repositories provide deep insights into biological processes and pose challenging data processing problems. The key bottleneck for running these queries is the number of small random reads. Byte-addressable persistent main memory (PMEM) technologies enable real-time MS search systems by delivering low-latency, high-bandwidth storage. This work presents P-Massive, a real-time, multi-terabyte-scale MS search system. P-Massive takes advantage of PMEM and the underlying nature of its data access patterns to maximize performance. We evaluate P-Massive across various storage hierarchies and project forward over the next decade to understand how MS query systems might evolve. Our evaluation shows that P-Massive offers a cost-effective solution that achieves near-DRAM performance. A single query takes 1.7 seconds in P-Massive, 69× faster than the state-of-the-art implementation. In an end-to-end, user-facing application, P-Massive delivers a 90% shorter wait time than the latest MS search tool, returning results within seconds rather than minutes.
Keywords
Nonvolatile memory, Indexing, Bioinformatics, Mass Spectrometry, Search engines