Generative AI Fundamentals Part 4 - Pre-training:
https://youtu.be/R75Sy88zSEI?si=be9PFTSr8N5cDtGV
What is Pre-training?
Pre-training is the process of teaching a model to understand and process language before it's fine-tuned for specific tasks. It involves exposing the model to vast amounts of text data.
How does Pre-training work?
During pre-training, the model learns to predict the next word in a sentence, absorbing grammar, facts, and the statistical patterns of language along the way. This is a form of self-supervised learning: the raw text provides its own training signal (each next word serves as the label for the words before it), so no human-annotated examples are needed.
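To make the objective concrete, here is a minimal sketch of next-word prediction in PyTorch (my choice of framework, not something from the video). A toy model is trained on a few sentences, with each word acting as the label for the word before it. Real pre-training uses the same loss on a deep Transformer and terabytes of text.

```python
# Minimal sketch of the pre-training objective: next-word prediction.
# The toy "language model" here is just an embedding + linear layer;
# real LLMs use deep Transformers, but the training signal is the same idea.
import torch
import torch.nn as nn

# Toy corpus; real pre-training uses terabytes of text.
text = "the cat sat on the mat the dog sat on the rug"
words = text.split()
vocab = sorted(set(words))
stoi = {w: i for i, w in enumerate(vocab)}
ids = torch.tensor([stoi[w] for w in words])

# Training pairs: each word is the "label" for the word before it,
# so the data labels itself (self-supervised learning).
x, y = ids[:-1], ids[1:]

class TinyLM(nn.Module):
    def __init__(self, vocab_size, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, tokens):
        # Returns logits over the vocabulary for the next word.
        return self.head(self.embed(tokens))

model = TinyLM(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=0.1)

for step in range(200):
    logits = model(x)
    loss = nn.functional.cross_entropy(logits, y)  # next-word prediction loss
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, the model assigns high probability to plausible next words.
probs = torch.softmax(model(torch.tensor([stoi["the"]])), dim=-1)
print(f"after 'the', the model predicts: {vocab[probs.argmax().item()]}")
```

Running this prints one of the words that actually followed "the" in the toy corpus. Scale the data and the model up by many orders of magnitude, and you get the base models described next.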
How to train your ChatGPT?
1. Download ~10TB of text.
2. Get a cluster of ~6,000 GPUs.
3. Compress the text into a neural network: pay ~$2M and wait ~12 days (see the rough arithmetic below the list).
4. Obtain base model.
(Numbers sourced from "Intro to LLMs" by Andrej Karpathy)
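As a sanity check, those figures roughly reconcile if you assume a cloud rate of about $1.20 per GPU-hour. Only the GPU count, duration, and total cost come from the talk; the hourly rate is my assumption.

```python
# Back-of-envelope check of the pre-training cost figures.
gpus = 6_000
days = 12
usd_per_gpu_hour = 1.20  # assumed cloud rate, not from the talk

gpu_hours = gpus * days * 24          # 1,728,000 GPU-hours
cost = gpu_hours * usd_per_gpu_hour   # ~$2.1M, matching the ~$2M figure

print(f"{gpu_hours:,} GPU-hours -> ~${cost / 1e6:.1f}M")
```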
In the next couple of videos, we'll cover Fine-Tuning and Retrieval-Augmented Generation (RAG). Thanks for tuning in!