Transparent Network Memory Storage for Efficient Container Execution in Big Data Clouds

IEEE BigData (2021)

Abstract
This paper presents a transparent Container Network Memory storage device, coined CNetMem, which addresses the open problem of unpredictable performance degradation when an application's working set no longer fits in container memory. First, CNetMem enables application tenants running in a container to park their working-set memory and files in faster network memory storage by organizing a group of remote nodes as remote memory donors. This allows CNetMem to exploit idle remote memory across the cluster before resorting to a slow local I/O subsystem such as the local disk, without any modification to the host OS or the application. Second, CNetMem provides a hybrid batching technique that removes or alleviates bottlenecks on the I/O-critical path for remote memory reads and writes, with replication or disk backup for fault tolerance. Third, CNetMem introduces a rank-based node selection algorithm to find the optimal node for placing remote memory blocks across the cluster, which helps CNetMem reduce the performance impact of remote memory eviction. Extensive experiments on three big data applications and four machine learning workloads show that CNetMem achieves up to 172x higher throughput than vanilla Linux and up to 5.9x shorter completion times than existing approaches.
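The abstract does not spell out how the rank-based node selection works. The following is a minimal sketch of one plausible reading, assuming the rank combines a donor node's idle memory, measured network latency, and recent eviction pressure; the class names, fields, and weights below are hypothetical illustrations, not the paper's actual algorithm.

```python
# Hypothetical sketch of rank-based donor selection; NOT CNetMem's actual code.
# The ranking criteria (free memory, RTT, eviction rate) and the weights are
# assumptions made for illustration only.

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DonorNode:
    """A remote memory donor advertised to the cluster (hypothetical fields)."""
    node_id: str
    free_memory_mb: float   # idle memory the node is willing to donate
    rtt_ms: float           # measured round-trip latency to this node
    eviction_rate: float    # recent evictions per second (lower is better)


def rank_score(node: DonorNode,
               w_mem: float = 0.5,
               w_rtt: float = 0.3,
               w_evict: float = 0.2) -> float:
    """Combine normalized criteria into a single score; higher is better."""
    mem_term = node.free_memory_mb                  # more idle memory -> higher
    rtt_term = 1.0 / (1.0 + node.rtt_ms)            # lower latency -> higher
    evict_term = 1.0 / (1.0 + node.eviction_rate)   # fewer evictions -> higher
    return w_mem * mem_term + w_rtt * rtt_term + w_evict * evict_term


def select_donor(nodes: List[DonorNode],
                 block_size_mb: float) -> Optional[DonorNode]:
    """Pick the highest-ranked donor that can still hold the memory block."""
    candidates = [n for n in nodes if n.free_memory_mb >= block_size_mb]
    if not candidates:
        return None  # fall back to the local I/O subsystem (e.g. local disk)
    return max(candidates, key=rank_score)


if __name__ == "__main__":
    donors = [
        DonorNode("node-a", free_memory_mb=8192, rtt_ms=0.4, eviction_rate=0.1),
        DonorNode("node-b", free_memory_mb=2048, rtt_ms=0.2, eviction_rate=0.0),
    ]
    chosen = select_donor(donors, block_size_mb=64)
    print(chosen.node_id if chosen else "no donor available")
```

In this reading, a block is only parked on a donor when one qualifies; otherwise the system falls back to local disk, which matches the abstract's description of resorting to the local I/O subsystem last.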
Keywords
container execution,big data clouds,CNetMem,container memory,application tenants,remote memory nodes,remote memory donors,remote idle memory,rank-based node selection algorithm,remote memory blocks,remote memory eviction,big data applications,network memory storage,transparent container network memory storage device,container performance degradation,working set memory,working set file,I/O subsystem,hybrid batching technique,performance bottlenecks,I/O performance critical path,remote memory read-write,disk backup,fault tolerance,machine learning workload,vanilla Linux