EPQuant: A Graph Neural Network compression approach based on product quantization

Neurocomputing (2022)

Abstract
Graph Neural Networks (GNNs) have been widely used in graph analysis due to their strong performance on a wide variety of tasks. Unfortunately, as graphs keep growing, large graphs can easily consume terabytes of memory, and training on such graphs may take days. The high memory footprint limits the usage of GNNs on resource-constrained devices like smartphones and IoT devices. Hence, reducing the storage cost, training time, and inference latency is highly desirable. In this work, we apply Product Quantization (PQ) to GNNs for the first time to achieve superior memory capacity reduction. To alleviate the processing burden caused by PQ and improve compression performance, we propose Enhanced Product Quantization (EPQ). It reduces the input graph data, which tends to dominate memory consumption, and accelerates clustering in PQ. Moreover, an efficient quantization framework for GNNs is proposed (GitHub repository: https://github.com/Lyun-Huang/EPQuant), which combines EPQ with Scalar Quantization (SQ) to achieve improved compression performance and computation acceleration on off-the-shelf hardware, enabling the deployment of GNNs on resource-constrained devices. In addition, the proposed quantization framework can be applied to existing GNNs with little porting effort. Extensive experimental results show that the proposed quantization scheme achieves 321.26× and 184.03× memory capacity compression for input graph data and overall storage, respectively, with an accuracy loss of less than 1%.
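For intuition on the underlying technique, the following is a minimal sketch of plain product quantization applied to a node feature matrix, which is the component the abstract reports as dominating memory consumption. It is not the authors' EPQ implementation; function names and parameters (num_subspaces, num_centroids) are illustrative assumptions. Each feature vector is split into sub-vectors, each subspace is clustered with k-means, and only the small codebooks plus one byte-sized centroid index per node and subspace are stored.

```python
import numpy as np
from sklearn.cluster import KMeans


def pq_compress(features, num_subspaces=4, num_centroids=256, seed=0):
    """Split features into subspaces, cluster each with k-means,
    and keep only codebooks plus per-node centroid indices."""
    n, d = features.shape
    assert d % num_subspaces == 0, "feature dim must divide evenly into subspaces"
    sub_dim = d // num_subspaces
    codebooks, codes = [], []
    for s in range(num_subspaces):
        sub = features[:, s * sub_dim:(s + 1) * sub_dim]
        km = KMeans(n_clusters=num_centroids, n_init=4, random_state=seed).fit(sub)
        codebooks.append(km.cluster_centers_)        # (num_centroids, sub_dim) floats
        codes.append(km.labels_.astype(np.uint8))    # one byte per node per subspace
    return codebooks, np.stack(codes, axis=1)


def pq_decompress(codebooks, codes):
    """Reconstruct approximate features by centroid lookup."""
    return np.concatenate(
        [codebooks[s][codes[:, s]] for s in range(codes.shape[1])], axis=1)


# Example: 10,000 nodes with 128-dim float32 features.
X = np.random.randn(10_000, 128).astype(np.float32)
books, idx = pq_compress(X)
X_hat = pq_decompress(books, idx)
# Storage drops from n*d floats to n*num_subspaces bytes plus small codebooks.
```

EPQ, as described in the abstract, additionally reduces the input graph data and accelerates the clustering step, and the full framework combines this with scalar quantization of the remaining tensors.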
Keywords
Graph Neural Network, Compression, Product quantization