This repository contains detailed code implementations from the "LLM from Scratch" YouTube series by Vizuara, based on the book *Build a Large Language Model (From Scratch)* by Sebastian Raschka.
- Covers how raw text is converted into tokens for processing by LLMs.
- Explains Byte Pair Encoding (BPE), WordPiece, and SentencePiece tokenization techniques.
- Includes an implementation in Python and PyTorch (see the sketch below).
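As a taste of what the tokenization material covers, here is a minimal sketch of BPE tokenization using OpenAI's `tiktoken` library; the choice of `tiktoken` and the GPT-2 vocabulary are assumptions here, as the series may also build the tokenizer by hand:

```python
import tiktoken

# Load the GPT-2 BPE tokenizer (the vocabulary GPT-2 was trained with).
enc = tiktoken.get_encoding("gpt2")

text = "LLMs process text as token IDs, not raw characters."
ids = enc.encode(text)      # text -> list of integer token IDs
print(ids)

decoded = enc.decode(ids)   # token IDs -> original text
assert decoded == text
```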
- Detailed breakdown of the attention mechanism, the core concept behind modern Transformers.
- Explains self-attention, scaled dot-product attention, and multi-head attention.
- Hands-on implementation of attention-score calculation (see the sketch below).
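For reference, a minimal sketch of scaled dot-product attention in PyTorch; the tensor shapes are illustrative assumptions, not necessarily the exact ones used in the videos:

```python
import torch

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (batch, seq_len, d_k)
    d_k = q.size(-1)
    # Attention scores: similarity of every query with every key,
    # scaled by sqrt(d_k) to keep the softmax well-behaved.
    scores = q @ k.transpose(-2, -1) / d_k**0.5  # (batch, seq_len, seq_len)
    weights = torch.softmax(scores, dim=-1)      # each row sums to 1
    return weights @ v                           # weighted sum of values

q = k = v = torch.randn(1, 4, 8)  # toy batch: 4 tokens, 8-dim embeddings
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 4, 8])
```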
- A complete step-by-step implementation of a Transformer model from scratch.
- Covers the encoder-decoder architecture, positional embeddings, and layer normalization.
- Includes PyTorch code for training a simple Transformer-based model (see the sketch below).
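As a rough sketch of the kind of building block the training code assembles (hyperparameters here are illustrative assumptions), a single Transformer encoder block combining multi-head attention, residual connections, layer normalization, and a feed-forward network:

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """One Transformer encoder block: self-attention + feed-forward,
    each wrapped in a residual connection and layer normalization."""

    def __init__(self, d_model=64, n_heads=4, d_ff=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        # Self-attention: queries, keys, and values all come from x.
        attn_out, _ = self.attn(x, x, x)
        x = self.norm1(x + attn_out)    # residual + layer norm
        x = self.norm2(x + self.ff(x))  # residual + layer norm
        return x

x = torch.randn(2, 10, 64)  # (batch, seq_len, d_model)
print(TransformerBlock()(x).shape)  # torch.Size([2, 10, 64])
```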
Feel free to explore the code files!