Diffusion and flow matching
The prerequisite for this blog post is Journey till diffusion model.
Stable Diffusion model: uses cross-attention to allow conditional modelling (using text or a segmentation map, plus an image, to generate an image).
Mode collapse doesn't happen in likelihood-based models, and SD is a likelihood-based model.
High-frequency details: the fine details of an image (a detail-oriented view), such as textures and sharp edges.
Related work (previous work in this field): the same VQ-VAE and VQ-GAN.
What is the inductive bias that DMs inherit from the U-Net? ...