U-Nets as Belief Propagation: Efficient Classification, Denoising, and Diffusion in Generative Hierarchical Models
arxiv(2024)
摘要
U-Nets are among the most widely used architectures in computer vision,
renowned for their exceptional performance in applications such as image
segmentation, denoising, and diffusion modeling. However, a theoretical
explanation of the U-Net architecture design has not yet been fully
established.
This paper introduces a novel interpretation of the U-Net architecture by
studying certain generative hierarchical models, which are tree-structured
graphical models extensively utilized in both language and image domains. With
their encoder-decoder structure, long skip connections, and pooling and
up-sampling layers, we demonstrate how U-Nets can naturally implement the
belief propagation denoising algorithm in such generative hierarchical models,
thereby efficiently approximating the denoising functions. This leads to an
efficient sample complexity bound for learning the denoising function using
U-Nets within these models. Additionally, we discuss the broader implications
of these findings for diffusion models in generative hierarchical models. We
also demonstrate that the conventional architecture of convolutional neural
networks (ConvNets) is ideally suited for classification tasks within these
models. This offers a unified view of the roles of ConvNets and U-Nets,
highlighting the versatility of generative hierarchical models in modeling
complex data distributions across language and image domains.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要