Exploring Generative AI: An Introduction to Large Language Models and Diffusion Models
About this resource
In this talk from the Machine Learning for Medical Imaging (ML4MI) community, Kangwook Lee, PhD, provides a comprehensive introduction to generative AI through the lens of Large Language Models (LLMs) and diffusion-based models.
Dr. Lee begins by tracing the historical developments that led to modern LLMs such as GPT, and explains how models trained only for next-word prediction can serve as universal interfaces to general intelligence. Through examples from software development and medical imaging, he demonstrates their potential across diverse domains.
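To make the next-word-prediction idea concrete, here is a minimal sketch using the Hugging Face transformers library and the openly available GPT-2 checkpoint, which stands in for the larger models discussed in the talk. The prompt text is illustrative only.

```python
# A minimal sketch of next-word prediction with a small pretrained LM.
# GPT-2 is used here purely for illustration; the talk's models are larger.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "A chest X-ray showing an enlarged cardiac silhouette suggests"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits            # shape: (1, seq_len, vocab_size)

# Probability distribution over the next token, given the prompt.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top5 = torch.topk(next_token_probs, k=5)
for prob, idx in zip(top5.values, top5.indices):
    print(f"{tokenizer.decode(int(idx))!r:>15}  p={prob.item():.3f}")
```

Everything the model does, from answering questions to writing code, is built on repeatedly sampling from this next-token distribution.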
The talk also delves into Language-Interfaced Fine-Tuning (LIFT), a method for adapting LLMs to non-language machine learning tasks, and highlights the Visual Instruction Tuning work by Haotian Liu (the basis of the LLaVA model), showing how these advances push the boundaries of multimodal learning.
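The key step in LIFT is serializing non-language data into text so that an LLM can be fine-tuned on it directly. The sketch below shows one way a tabular classification example might be converted into a prompt/completion pair; the feature names and template are illustrative and not taken from the talk or the LIFT paper.

```python
# A minimal sketch of the data-conversion step behind Language-Interfaced
# Fine-Tuning (LIFT): a tabular example becomes a natural-language
# prompt/completion pair suitable for fine-tuning a causal LLM.
def row_to_prompt(features, label=None):
    feature_text = ", ".join(f"{name} is {value}" for name, value in features.items())
    prompt = f"Given that {feature_text}, what is the diagnosis?"
    completion = f" {label}" if label is not None else ""
    return {"prompt": prompt, "completion": completion}

example = row_to_prompt(
    {"age": 63, "resting blood pressure": 145, "cholesterol": 233},
    label="heart disease",
)
print(example["prompt"])
# Given that age is 63, resting blood pressure is 145, cholesterol is 233,
# what is the diagnosis?
print(example["completion"])
#  heart disease
```

At inference time, unlabeled rows are serialized the same way and the model's generated completion is read back as the prediction.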
In the second part of the presentation, Dr. Lee introduces diffusion-based generative models, explaining how they synthesize data by progressively corrupting training examples with noise and learning to reverse that corruption step by step. He also briefly covers LoRA (Low-Rank Adaptation) as an example of techniques that make training and fine-tuning of large-scale models efficient.
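The sketch below illustrates the core diffusion idea under simplified assumptions: a linear noise schedule and a random tensor standing in for a training image. It shows the closed-form forward (noising) process; a real model would be trained to predict the added noise and then generate samples by denoising from pure noise.

```python
# A minimal sketch of the diffusion forward process, not the models from the talk.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)          # illustrative linear noise schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)      # cumulative product, \bar{alpha}_t

def add_noise(x0, t):
    """Sample x_t ~ q(x_t | x_0) in closed form."""
    noise = torch.randn_like(x0)
    x_t = alpha_bars[t].sqrt() * x0 + (1.0 - alpha_bars[t]).sqrt() * noise
    return x_t, noise

# Training signal: a network sees (x_t, t) and must predict `noise`;
# generation then starts from pure noise and denoises for t = T-1, ..., 0.
x0 = torch.randn(1, 3, 64, 64)                 # stand-in for a training image
x_t, noise = add_noise(x0, t=500)
print(x_t.shape, noise.shape)
```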
A NetID is required to view ML4MI videos: View 2023-09 ML4MI recording.
See also
- Application - Video: Exploring Model Sharing in the Age of Foundation Models: Learn more about the LLaVA model.