Evaluate models at a compute / fine-tuning cadence — Overdue

Google DeepMind’s Frontier Safety Framework (FSF) v1.0 stated an aim to evaluate models at every 6x increase in effective compute and after every three months of fine-tuning progress.

Committed 2024-05-17
Due every 6x effective compute / 3 months fine-tuning (FSF v1.0)
Evaluated —
Ruling —

Why this ruling

The 6x / 3-month wording is specific to FSF v1.0; FSF v2.0 (2025) replaced these specific numbers with more flexible criteria.

Source: Google DeepMind
As of: 2026-06-19

Cite this commitment

Overdue. "Evaluate models at a compute / fine-tuning cadence." Overdue, 2024. https://overduetracker.org/c/deepmind-fsf-eval-cadence (retrieved 2026-06-19).