Torchtune mechanics
Published on Feb 27, 2025
Last updated on May 10, 2025
Introduction
torchtune is a library from Meta for post-training LLMs.
It builds directly on torch and torchrun as a simple and efficient way to customize LLMs.
In my experience, it is faster, more robust, and easier to debug and customize than the Hugging Face transformers library.
Along with the OG transformers library, I've seen a few other competitive frameworks in the wild, including axolotl and llama-factory.
CLI
torchtune structures its components - datasets, tokenizers, models, and optimizers - as composable, modular building blocks, and it ships many implementations of each.
Before a run, each component is configured in a single YAML file that is passed to the tune CLI.
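To make the shape of these configs concrete, here is a minimal sketch of one. torchtune configs select implementations with a `_component_` key; the specific component paths, file locations, and hyperparameters below are illustrative placeholders, not copied from a shipped config.

```yaml
# Hypothetical minimal torchtune config sketch.
# Each section picks a component via the `_component_` dotted path;
# remaining keys are passed as arguments to that component.
model:
  _component_: torchtune.models.llama3_2.llama3_2_1b

tokenizer:
  _component_: torchtune.models.llama3.llama3_tokenizer
  path: /tmp/Llama-3.2-1B/original/tokenizer.model  # placeholder path

dataset:
  _component_: torchtune.datasets.alpaca_dataset

optimizer:
  _component_: torch.optim.AdamW
  lr: 2e-5

epochs: 1
batch_size: 2
```

Because every component is named in one file, swapping the dataset or optimizer is a one-line config change rather than an edit to the training code.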
The training code is defined in a recipe file, of which the library ships many examples.
Recipes are usually about 1,000 lines of Python, implementing complex training methods like distributed training, knowledge distillation, and low-rank adaptation (LoRA).
The tune CLI is the entrypoint for downloading checkpoints and launching training runs.
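A typical session looks roughly like the following. The model name, output directory, and config names here are placeholders chosen for illustration; check `tune ls` for the recipes and configs your installed version actually provides.

```shell
# List the built-in recipes and their example configs.
tune ls

# Download model weights from the Hugging Face Hub
# (model name and output dir are placeholders).
tune download meta-llama/Llama-3.2-1B-Instruct --output-dir /tmp/Llama-3.2-1B-Instruct

# Copy a built-in config locally so it can be edited.
tune cp llama3_2/1B_lora_single_device my_config.yaml

# Launch a training run: a recipe plus a config.
tune run lora_finetune_single_device --config my_config.yaml
```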
How do dataset APIs interact with configs and recipes?
Workflows begin with any Hugging Face dataset or a local one. There is an insightful dataset taxonomy baked into the source code, like a map of post-training workflow flavors.
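One axis of that taxonomy is the raw data flavor, e.g. instruct-style rows versus chat transcripts. As a generic illustration of the idea (this is not torchtune's actual API; the function and field names are hypothetical), here is how an Alpaca-style instruct row is commonly normalized into chat messages before tokenization:

```python
# Illustrative sketch, not torchtune's API: normalizing an "instruct"-flavor
# row ({instruction, input, output}) into the chat-message format that
# tokenizers and prompt templates typically consume downstream.
def instruct_to_messages(row: dict) -> list[dict]:
    """Map an Alpaca-style row to a user/assistant message pair."""
    prompt = row["instruction"]
    if row.get("input"):
        # Optional free-form input is appended below the instruction.
        prompt += "\n\n" + row["input"]
    return [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": row["output"]},
    ]

row = {"instruction": "Translate to French.", "input": "Hello", "output": "Bonjour"}
messages = instruct_to_messages(row)
```

Chat-flavor datasets skip this step, since their rows already arrive as message lists; that difference is exactly the kind of fork the taxonomy encodes.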