Most new models need no code — a HuggingFace, vLLM, or LiteLLM model is just a config file (see the Models reference). Write a backend only when the inference engine or model interface is genuinely new. The repo ships a Claude Code skill that walks you through it.
The skill lives in the repo at .claude/skills/adding-a-model/. In Claude Code, type /adding-a-model; you can also read it as plain Markdown.

adding-a-model skill

Pick the right capability interface, register it, and document it.

The process

  1. Pick the capability interface: GenerativeModel (LLMs/VLMs), ZeroShotClassifier (CLIP-style), or SupervisedClassifier (timm). It decides which task_types the backend can serve.
  2. Implement and register — subclass the interface, implement only the batch hooks, and register it with @register_model and a cache-safe model_name. Declare honest capabilities so the evaluator rejects unsupported (model, task) pairs.
  3. Make it loadable — add a per-family config under mill/models/configs/, and gate optional dependencies with a clear ImportError plus a pyproject.toml extra.
  4. Validate and document — sanity-check on a benchmark its interface supports, then add a backend section to the Models reference.
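
Steps 2 and 3 can be sketched roughly as follows. This is a minimal, self-contained illustration and not the repo's actual API: the registry, the `GenerativeModel` base class, the `generate_batch` hook name, the `task_types` attribute, the `require_optional` helper, and the `echo-engine` backend are all hypothetical stand-ins defined inside the snippet so that it runs on its own. In a real backend you would import the interface and `@register_model` from the package and follow the signatures given in the skill.

```python
from __future__ import annotations

import importlib
from abc import ABC, abstractmethod

# Hypothetical stand-ins so the sketch runs on its own; the real
# interface and decorator come from the repo and may differ.
MODEL_REGISTRY: dict[str, type] = {}


def register_model(name: str):
    """Stand-in decorator: record the class under a unique name."""
    def decorator(cls):
        if name in MODEL_REGISTRY:
            raise ValueError(f"model {name!r} is already registered")
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator


class GenerativeModel(ABC):
    """Stand-in capability interface for text-generating backends."""

    # Task types this backend can serve (assumed attribute name).
    task_types: frozenset[str] = frozenset()

    @abstractmethod
    def generate_batch(self, prompts: list[str]) -> list[str]:
        """Batch hook (assumed name): return one output per prompt."""


def require_optional(module: str, extra: str):
    """Import an optional dependency or fail with an actionable message."""
    try:
        return importlib.import_module(module)
    except ImportError as exc:
        raise ImportError(
            f"{module!r} is required for this backend; "
            f"install the {extra!r} extra to enable it."
        ) from exc


@register_model("echo-engine")
class EchoBackend(GenerativeModel):
    """Toy backend that echoes prompts; replace with real inference."""

    # No path separators or spaces, so it is safe to use as a cache key.
    model_name = "echo-engine__v1"
    # Declare only what the backend really supports.
    task_types = frozenset({"generation"})

    def __init__(self):
        # A real backend would gate its engine here, for example:
        # self._engine = require_optional("my_engine", "my-engine")
        pass

    def generate_batch(self, prompts: list[str]) -> list[str]:
        return [f"echo: {p}" for p in prompts]


def can_serve(model_cls: type, task_type: str) -> bool:
    """Evaluator-style check that rejects unsupported (model, task) pairs."""
    return task_type in getattr(model_cls, "task_types", frozenset())
```

With this sketch, `can_serve(EchoBackend, "generation")` is true and `can_serve(EchoBackend, "classification")` is false, which is the kind of early rejection that honest capability declarations make possible. The `extra` named in the `ImportError` message is meant to match an entry you add under the optional dependencies in pyproject.toml.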