Accelerable Lottery Tickets with the Mixed-Precision Quantization.

CVPR Workshops (2023)

Abstract
In recent years, the lottery ticket hypothesis has gained widespread popularity as a means of network compression. However, lottery tickets are difficult to apply to hardware acceleration in practice because of their element-wise, unstructured sparsity. In this paper, we argue that network pruning can be seen as a special case of network quantization, and we relax hard network pruning into unstructured mixed-precision quantization, which makes real hardware acceleration possible. We validate the wide existence of quantized lottery tickets, namely MPQ-tickets, that can match or even surpass the performance of the corresponding full-precision dense networks on various representative benchmarks. We also demonstrate that MPQ-tickets are much more flexible than vanilla lottery tickets and benefit substantially from pruning when compared to QNNs. Moreover, MPQ-tickets achieve up to 8× faster inference through hardware acceleration and 14× lower memory consumption than full-precision models.
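The claim that pruning is a special case of quantization can be illustrated with a small sketch: if each weight is assigned its own bit-width, a bit-width of 0 leaves no representable value other than zero, which is exactly what pruning does. The code below is only an illustration under stated assumptions, not the paper's method: the symmetric uniform quantizer, the clipping range `w_max`, the 32-bit full-precision baseline, and the function names `quantize`, `quantize_mixed`, and `compression_ratio` are all hypothetical and do not come from the abstract.

```python
def quantize(w, bits, w_max=1.0):
    """Quantize one weight to `bits` bits (assumed symmetric uniform quantizer).

    bits == 0 is treated as pruning: the weight becomes exactly zero.
    bits == 1 keeps only the sign (binary weight).
    """
    if bits == 0:
        return 0.0
    if bits == 1:
        return w_max if w >= 0 else -w_max
    levels = 2 ** (bits - 1) - 1          # number of positive quantization levels
    step = w_max / levels                 # spacing between adjacent levels
    clipped = max(-w_max, min(w_max, w))  # clip to the representable range
    return round(clipped / step) * step


def quantize_mixed(weights, bit_widths, w_max=1.0):
    """Apply an element-wise (unstructured) bit-width assignment."""
    return [quantize(w, b, w_max) for w, b in zip(weights, bit_widths)]


def compression_ratio(bit_widths, full_bits=32):
    """Weight-storage ratio versus a full-precision baseline (assumed 32-bit)."""
    avg_bits = sum(bit_widths) / len(bit_widths)
    return float("inf") if avg_bits == 0 else full_bits / avg_bits


weights = [0.37, 0.8, -0.3, 0.05]
bit_widths = [0, 2, 8, 4]                 # the first weight is pruned
quantized = quantize_mixed(weights, bit_widths)
ratio = compression_ratio([0, 2, 2, 4])   # average of 2 bits per weight
```

With the bit-widths `[0, 2, 2, 4]` the average is 2 bits per weight, so `ratio` is 16.0 relative to 32-bit storage. This counts weight bits only and ignores any per-weight metadata needed to store the bit-width assignment, so it is an upper bound on the saving in this toy setting.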
Keywords
8× hardware acceleration, accelerable lottery tickets, element-wise unstructured sparsity nature, hard network pruning, lottery tickets hypothesis, mixed-precision quantization, MPQ-tickets, network compression, network quantization, quantized lottery tickets, vanilla lottery tickets