Torchtune mechanics
Published on Feb 27, 2025
Last updated on May 10, 2025
Introduction
torchtune is Meta's library for post-training LLMs.
It builds directly on PyTorch and torchrun as a simple, efficient way to customize LLMs.
In my experience, it is faster, more robust, and easier to debug and customize than the Hugging Face transformers library.
Along with the OG transformers library, I've seen a few other competitive frameworks in the wild, including axolotl and llama-factory.
CLI
Tune structures its components - datasets, tokenizers, models, and optimizers - as composable, modular building blocks.
The library implements many of each component.
Before a run, each component is configured in a single YAML file that is passed to the tune CLI.
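As a sketch, a config wires components together declaratively (the field names and component paths below are illustrative and vary across torchtune releases):

```yaml
# Illustrative config sketch; exact component paths vary by torchtune release.
tokenizer:
  _component_: torchtune.models.llama3.llama3_tokenizer
  path: /tmp/Meta-Llama-3-8B-Instruct/original/tokenizer.model

model:
  _component_: torchtune.models.llama3.llama3_8b

dataset:
  _component_: torchtune.datasets.alpaca_dataset

optimizer:
  _component_: torch.optim.AdamW
  lr: 2e-5

epochs: 1
batch_size: 2
```

Each `_component_` key names a builder function or class; the recipe instantiates it with the sibling fields as keyword arguments, so swapping a model or dataset is a one-line config change.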
The training code is defined in a recipe file, of which there are many examples.
Recipes are usually about 1,000 lines of Python, implementing complex training methods like distributed training, knowledge distillation, and low-rank adaptation (LoRA).
The tune CLI is the entrypoint for downloading checkpoints and launching training runs.
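A typical session looks roughly like this (recipe and config names come from torchtune's built-ins and may differ across versions):

```shell
# Download model weights from the Hugging Face Hub (gated models need a token).
tune download meta-llama/Meta-Llama-3-8B-Instruct \
  --output-dir /tmp/Meta-Llama-3-8B-Instruct \
  --hf-token <HF_TOKEN>

# List the bundled recipes and their configs.
tune ls

# Copy a built-in config locally so it can be edited.
tune cp llama3/8B_lora_single_device my_config.yaml

# Launch a LoRA fine-tuning run; any config field can be overridden inline.
tune run lora_finetune_single_device --config my_config.yaml optimizer.lr=1e-4
```

The inline `key=value` overrides are handy for sweeps: the YAML stays the source of truth while individual runs vary one or two fields.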
How do dataset APIs interact with configs and recipes?
Workflows begin with any Hugging Face Hub dataset or a local dataset. An insightful dataset taxonomy is baked into the source code; it reads like a map of post-training workflow flavors.
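The core idea behind that taxonomy can be sketched in plain Python (this is a simplified illustration, not torchtune's actual API): each dataset flavor is essentially a transform that maps a raw row into role-tagged messages the tokenizer can template.

```python
# Simplified sketch of the dataset-taxonomy idea (not torchtune's actual API):
# each post-training flavor is a message transform over raw dataset rows.

def instruct_transform(row):
    """Instruct flavor: one user prompt, one assistant response."""
    return [
        {"role": "user", "content": row["instruction"]},
        {"role": "assistant", "content": row["output"]},
    ]

def chat_transform(row):
    """Chat flavor: the row already carries a multi-turn conversation."""
    return [{"role": m["role"], "content": m["content"]} for m in row["messages"]]

def preference_transform(row):
    """Preference flavor (DPO-style training): chosen vs. rejected responses."""
    prompt = {"role": "user", "content": row["prompt"]}
    return {
        "chosen": [prompt, {"role": "assistant", "content": row["chosen"]}],
        "rejected": [prompt, {"role": "assistant", "content": row["rejected"]}],
    }

if __name__ == "__main__":
    row = {"instruction": "Summarize torchtune.", "output": "A post-training library."}
    print(instruct_transform(row))
```

Picking a flavor in the config therefore amounts to picking which transform shapes your rows before tokenization, which is why the same recipe can train on wildly different datasets.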
