LLM Training

These are my notes from the highly recommended book https://www.manning.com/books/build-a-large-language-model-from-scratch, with some extra information added.

Basic Information

Start by reading this post, which covers the basic concepts you should know:

0. Basic LLM Concepts

1. Tokenization

2. Data Sampling

3. Token Embeddings

4. Attention Mechanisms

5. LLM Architecture

6. Pre-training & Loading models

7.0. LoRA Improvements in fine-tuning

7.1. Fine-Tuning for Classification

7.2. Fine-Tuning to follow instructions
