The Synthetic Data Playbook: Generating Trillions of the Finest Tokens. Explore synthetic data experiments on a virtual bookshelf.
Porting nanochat to Transformers: an AI modeling history lesson. Learn about ML and Transformers through nanochat.
Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks. Evaluate multilingual models using FineTasks.
FineVision: Open Data is All You Need. A new open-source dataset for training VLMs.