Exploring Generative AI: An Introduction to Large Language Models and Diffusion Models
About this resource
In this talk from the Machine Learning for Medical Imaging (ML4MI) community, Kangwook Lee, PhD, provides a comprehensive introduction to generative AI through the lens of Large Language Models (LLMs) and diffusion-based models.
Dr. Lee begins by tracing the historical developments that led to modern LLMs such as GPT, and explains how models trained only for next-word prediction can serve as universal interfaces to general intelligence. Through examples from software development and medical imaging, he demonstrates their potential across diverse domains.
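To make the next-word-prediction idea concrete, here is a minimal sketch using the Hugging Face transformers library and the openly available GPT-2 checkpoint, which stands in for the larger models discussed in the talk. The prompt text is illustrative only.

```python
# A minimal sketch of next-word prediction with a small pretrained LM.
# GPT-2 is used here purely for illustration; the talk's models are larger.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "A chest X-ray showing an enlarged cardiac silhouette suggests"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits            # shape: (1, seq_len, vocab_size)

# Probability distribution over the next token, given the prompt.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top5 = torch.topk(next_token_probs, k=5)
for prob, idx in zip(top5.values, top5.indices):
    print(f"{tokenizer.decode(int(idx))!r:>15}  p={prob.item():.3f}")
```

Everything the model does, from answering questions to writing code, is built on repeatedly sampling from this next-token distribution.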
The talk also delves into Language-Interfaced Fine-Tuning (LIFT), a method for adapting LLMs to non-language machine learning tasks, and highlights the Visual Instruction Tuning work by Haotian Liu (the basis of the LLaVA model), showing how these advances push the boundaries of multimodal learning.
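The key step in LIFT is serializing non-language data into text so that an LLM can be fine-tuned on it directly. The sketch below shows one way a tabular classification example might be converted into a prompt/completion pair; the feature names and template are illustrative and not taken from the talk or the LIFT paper.

```python
# A minimal sketch of the data-conversion step behind Language-Interfaced
# Fine-Tuning (LIFT): a tabular example becomes a natural-language
# prompt/completion pair suitable for fine-tuning a causal LLM.
def row_to_prompt(features, label=None):
    feature_text = ", ".join(f"{name} is {value}" for name, value in features.items())
    prompt = f"Given that {feature_text}, what is the diagnosis?"
    completion = f" {label}" if label is not None else ""
    return {"prompt": prompt, "completion": completion}

example = row_to_prompt(
    {"age": 63, "resting blood pressure": 145, "cholesterol": 233},
    label="heart disease",
)
print(example["prompt"])
# Given that age is 63, resting blood pressure is 145, cholesterol is 233,
# what is the diagnosis?
print(example["completion"])
#  heart disease
```

At inference time, unlabeled rows are serialized the same way and the model's generated completion is read back as the prediction.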
In the second part of the presentation, Dr. Lee introduces diffusion-based generative models, explaining how they synthesize data by progressively corrupting training examples with noise and learning to reverse that corruption step by step. He also briefly covers LoRA (Low-Rank Adaptation) as an example of techniques that make training and fine-tuning of large-scale models efficient.
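The sketch below illustrates the core diffusion idea under simplified assumptions: a linear noise schedule and a random tensor standing in for a training image. It shows the closed-form forward (noising) process; a real model would be trained to predict the added noise and then generate samples by denoising from pure noise.

```python
# A minimal sketch of the diffusion forward process, not the models from the talk.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)          # illustrative linear noise schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)      # cumulative product, \bar{alpha}_t

def add_noise(x0, t):
    """Sample x_t ~ q(x_t | x_0) in closed form."""
    noise = torch.randn_like(x0)
    x_t = alpha_bars[t].sqrt() * x0 + (1.0 - alpha_bars[t]).sqrt() * noise
    return x_t, noise

# Training signal: a network sees (x_t, t) and must predict `noise`;
# generation then starts from pure noise and denoises for t = T-1, ..., 0.
x0 = torch.randn(1, 3, 64, 64)                 # stand-in for a training image
x_t, noise = add_noise(x0, t=500)
print(x_t.shape, noise.shape)
```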
A NetID is required to view ML4MI videos: View 2023-09 ML4MI recording.
See also
- Application - Video: Exploring Model Sharing in the Age of Foundation Models: Learn more about the LLaVA model.