
LLM Training

These are my notes from the highly recommended book https://www.manning.com/books/build-a-large-language-model-from-scratch, along with some extra information.

Basic Information

Start by reading this post for the basic concepts you should be familiar with:

0. Basic LLM Concepts

1. Tokenization

2. Data Sampling

3. Token Embeddings

4. Attention Mechanisms

5. LLM Architecture

6. Pre-training & Loading models

7.0. LoRA Improvements in fine-tuning

7.1. Fine-Tuning for Classification

7.2. Fine-Tuning to follow instructions
