Autoregressive Diffusion Models (Machine Learning Research Paper Explained)

#machinelearning #ardm #generativemodels Diffusion models have made large advances in recent months as a new type of generative models. This paper introduces Autoregressive Diffusion Models (ARDMs), which are a mix between autoregressive generative models and diffusion models. ARDMs are trained to be agnostic to the order of autoregressive decoding and give the user a dynamic tradeoff between speed and performance at decoding time. This paper applies ARDMs to both text and image data, and as an extension, the models can also be used to perform lossless compression. OUTLINE: 0:00 - Intro & Overview 3:15 - Decoding Order in Autoregressive Models 6:15 - Autoregressive Diffusion Models 8:35 - Dependent and Independent Sampling 14:25 - Application to Character-Level Language Models 18:15 - How Sampling & Training Works 26:05 - Extension 1: Parallel Sampling 29:20 - Extension 2: Depth Upscaling 33:10 - Conclusion & Comments Paper:
Back to Top