Communication lower bounds and optimal algorithms for multiple tensor-times-matrix computation

SIAM Journal on Matrix Analysis and Applications (2024)

Abstract
Multiple tensor-times-matrix (Multi-TTM) is a key computation in algorithms for computing and operating with the Tucker tensor decomposition, which is frequently used in multidimensional data analysis. We establish communication lower bounds that determine how much data movement is required (under mild conditions) to perform the Multi-TTM computation in parallel. The crux of the proof relies on analytically solving a constrained, nonlinear optimization problem. We also present a parallel algorithm to perform this computation that organizes the processors into a logical grid with twice as many modes as the input tensor. We show that, with correct choices of grid dimensions, the communication cost of the algorithm attains the lower bounds and is therefore communication optimal. Finally, we show that our algorithm can significantly reduce communication compared to the straightforward approach of expressing the computation as a sequence of tensor-times-matrix operations when the input and output tensors vary greatly in size.
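
To make the computation in the abstract concrete, the following is a minimal sequential sketch in Python with NumPy. It is not the paper's parallel algorithm and does not model communication. It only illustrates what a Multi-TTM computes for a 3-way tensor, once as a sequence of single tensor-times-matrix products and once as a single all-at-once contraction. The function names (`ttm`, `multi_ttm_sequence`, `multi_ttm_all_at_once`), the tensor sizes, and the convention that each matrix has shape (output size, input size) are assumptions made for this illustration.

```python
import numpy as np


def ttm(tensor, matrix, mode):
    """Single tensor-times-matrix product along axis `mode`.

    Assumed convention: `matrix` has shape (r, n) with n == tensor.shape[mode],
    so the result has size r along that axis.
    """
    # Contract the matrix columns with the chosen tensor axis; the new axis
    # of size r comes first in the result of tensordot.
    out = np.tensordot(matrix, tensor, axes=(1, mode))
    # Move the new axis back to the position of the contracted mode.
    return np.moveaxis(out, 0, mode)


def multi_ttm_sequence(tensor, matrices):
    """Multi-TTM expressed as one TTM per mode, applied in order."""
    result = tensor
    for mode, matrix in enumerate(matrices):
        result = ttm(result, matrix, mode)
    return result


def multi_ttm_all_at_once(tensor, matrices):
    """Multi-TTM for a 3-way tensor as a single contraction (illustrative)."""
    a, b, c = matrices
    return np.einsum("ijk,ai,bj,ck->abc", tensor, a, b, c)


# Hypothetical sizes: a 6 x 5 x 4 input tensor reduced to 2 x 3 x 2.
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 5, 4))
mats = [
    rng.standard_normal((2, 6)),
    rng.standard_normal((3, 5)),
    rng.standard_normal((2, 4)),
]

Y_seq = multi_ttm_sequence(X, mats)
Y_all = multi_ttm_all_at_once(X, mats)
print(Y_seq.shape, np.allclose(Y_seq, Y_all))
```

Both routines produce the same output tensor. The abstract's comparison concerns how much data a parallel version of each approach must move, which this sequential sketch does not measure.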
Keywords
communication lower bounds, Multi-TTM, tensor computations, parallel algorithms, HBL-inequalities