Massive Indexed Directories in DeltaFS

Abutalib Aghayev, Joy Arulraj, Ben Blum, V. Parvathi Bhogaraju, Amirali Boroumand, Sol Boucher, Christopher Canel, Dominic Chen, Haoxian Chen, Malhar Chaudhari, Andrew Chung, Chris Fallin, Pratik Fegade, Ziqiang Feng, Samarth Gupta, Aaron Harlap, Kevin Hsieh, Fan Hu, Abhilasha Jain, Saksham Jain, Angela Jiang, Ellango Jothimurugesan, Saurabh Arun Kadekodi, Anuj Kalia, Rajat Kateja, Jin Kyu Kim, Thomas, Seth Copen Goldstein, Mor Harchol-Balter, Gauri Joshi, Todd Mowry, Onur Mutlu, Priya Narasimhan, David O’Hallaron, Andy Pavlo, Majd Sakr, George Amvrosiadis, David Andersen, Lujo Bauer, Nathan Beckmann, Daniel Berger, Chuck Cranor, Lorrie Cranor, Christos Faloutsos, Kayvon Fatahalian, Rajeev Gandhi, Saugata Ghose

Semantic Scholar (2018)

Abstract
Faster storage media, faster interconnection networks, and improvements in systems software have significantly mitigated the effect of I/O bottlenecks in HPC applications. Even so, applications that read and write data in small chunks are limited by the ability of both the hardware and the software to handle such workloads efficiently. Often, scientific applications partition their output using one file per process. This approach is a problem on HPC computers with hundreds of thousands of cores and will only worsen with exascale computers, which will be an order of magnitude larger. To avoid wasting time creating output files on such machines, scientific applications are forced to use libraries that combine multiple I/O streams into a single file. For many applications where output is produced out of order, this must be followed by a costly, massive data sorting operation. DeltaFS allows applications to write to an arbitrarily large number of files, while also guaranteeing efficient data access without requiring sorting.