DeepMind’s FSF v1.0 stated an aim to evaluate models for every 6x increase in effective compute and every three months of fine-tuning progress.
- Committed 2024-05-17
- Due every 6x effective compute / 3 months fine-tuning (FSF v1.0)
- Evaluated —
- Ruling —
Why this ruling
This 6x / 3-month wording is v1.0 language; FSF v2.0 (2025) replaced the specific numbers with more flexible criteria.
Cite this commitment
Overdue. "Evaluate models at a compute / fine-tuning cadence." Overdue, 2024. https://overduetracker.org/c/deepmind-fsf-eval-cadence (retrieved 2026-06-19).