QuadraNet V2: Efficient and Sustainable Training of High-Order Neural Networks with Quadratic Adaptation
arXiv (2024)
Abstract
Machine learning is evolving towards high-order models that necessitate
pre-training on extensive datasets, a process associated with significant
overheads. Traditional models, despite having pre-trained weights, are becoming
obsolete due to architectural differences that obstruct the effective transfer
and initialization of these weights. To address these challenges, we introduce
a novel framework, QuadraNet V2, which leverages quadratic neural networks to
create efficient and sustainable high-order learning models. Our method
initializes the primary term of the quadratic neuron using a standard neural
network, while the quadratic term is employed to adaptively enhance the
learning of data non-linearity or shifts. This integration of pre-trained
primary terms with quadratic terms, which possess advanced modeling
capabilities, significantly augments the information characterization capacity
of the high-order network. By utilizing existing pre-trained weights, QuadraNet
V2 reduces the required GPU hours for training by 90% to 98.4% compared to
training from scratch, demonstrating both efficiency and effectiveness.
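
The initialization idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the neuron's formula, so the form y = w·x + b + (wa·x)(wb·x), the zero initialization of one quadratic factor, and all names (QuadraticNeuron, wa, wb, primary) are assumptions made for illustration. Under those assumptions, the neuron reproduces the pre-trained linear neuron exactly at initialization, and the quadratic term only contributes once training moves wa away from zero.

```python
import random


def dot(u, v):
    """Inner product of two equal-length sequences of floats."""
    return sum(a * b for a, b in zip(u, v))


class QuadraticNeuron:
    """Hypothetical quadratic neuron: y = w.x + b + (wa.x) * (wb.x).

    The primary (linear) term is copied from pre-trained weights.
    The quadratic term is an assumed form, not taken from the paper.
    """

    def __init__(self, pretrained_w, pretrained_b, seed=0):
        rng = random.Random(seed)
        # Primary term: reuse the pre-trained weights and bias as-is.
        self.w = list(pretrained_w)
        self.b = pretrained_b
        n = len(self.w)
        # Assumption: one quadratic factor starts at zero so the quadratic
        # term vanishes at initialization; the other is small and random so
        # that the gradient with respect to wa is not identically zero.
        self.wa = [0.0] * n
        self.wb = [rng.gauss(0.0, 0.1) for _ in range(n)]

    def primary(self, x):
        """Output of the pre-trained linear neuron alone."""
        return dot(self.w, x) + self.b

    def forward(self, x):
        """Primary term plus the quadratic adaptation term."""
        return self.primary(x) + dot(self.wa, x) * dot(self.wb, x)


# Made-up weights standing in for a pre-trained linear neuron.
neuron = QuadraticNeuron([0.5, -1.0, 2.0], 0.1)
x = [1.0, 2.0, 3.0]

# At initialization the quadratic term is zero, so both outputs agree.
print(neuron.forward(x), neuron.primary(x))
```

Zero-initializing one factor is a design choice borrowed from common adapter practice: the adapted model starts from the pre-trained behavior rather than a perturbed one. Whether QuadraNet V2 uses this exact scheme is not stated in the abstract.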