What are Transformer Models and how do they work?

This is the last of a series of 3 videos where we demystify Transformer models and explain them with visuals and friendly examples.

Video 1: The attention mechanism in high level
Video 2: The attention mechanism with math
Video 3 (This one): Transformer models

If you like this material, check out LLM University from Cohere!

Get the Grokking Machine Learning book!
Discount code (40%): serranoyt (Use the discount code on checkout)

00:00 Introduction
01:50 What is a transformer?
04:35 Generating one word at a time
08:59 Sentiment Analysis
13:05 Neural Networks
18:18 Tokenization
19:12 Embeddings
25:06 Positional encoding
27:54 Attention
32:29 Softmax
35:48 Architecture of a Transformer
39:00 Fine-tuning
42:20 Conclusion