How to guess a gradient
CoRR (2023)
Abstract
How much can you say about the gradient of a neural network without computing
a loss or knowing the label? This may sound like a strange question: surely the
answer is "very little." However, in this paper, we show that gradients are
more structured than previously thought. Gradients lie in a predictable
low-dimensional subspace which depends on the network architecture and incoming
features. Exploiting this structure can significantly improve gradient-free
optimization schemes based on directional derivatives, which have struggled to
scale beyond small networks trained on toy datasets. We study how to narrow the
gap in optimization performance between methods that calculate exact gradients
and those that use directional derivatives. Furthermore, we highlight new
challenges in overcoming the large gap between optimizing with exact gradients
and guessing the gradients.