hyperparameter optimization

3 articles · 6 co-occurring · 0 contradictions · 47 briefs

accumulates git commits to the training script as it finds better settings (of lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc." — Art

2026-W22: 3
2026-W21: 18
2026-W20: 21
2026-W19: 15
2026-W18: 21
2026-W17: 21
2026-W16: 21
2026-W15: 21

At just 30B parameters, it scores 87/120 on this year's Putnam" — Provides evidence that competitive mathematical reasoning capability can be achieved with relatively modest 30B parameter model size

adding constraints like single file to edit, single metric to track, a time limit, and a well-written program.md is what makes this work. that combination is what makes @karpathy autoresearch actually
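
The excerpts above describe a loop that edits one thing, tracks one metric, stops at a time limit, and keeps a change only when validation loss improves. A minimal sketch of that pattern follows, assuming a toy objective: validation_loss, propose, and search are hypothetical names for illustration, and the loss function is a stand-in for a real training run, not anything taken from the quoted sources.

```python
import random
import time


def validation_loss(config):
    # Hypothetical stand-in for a real training run: a smooth bowl whose
    # minimum sits at lr=0.01, dropout=0.1. Replace with actual training.
    return (config["lr"] - 0.01) ** 2 * 1e4 + (config["dropout"] - 0.1) ** 2


def propose(config, rng):
    # Perturb one setting at a time, clamped to a sane range.
    new = dict(config)
    key = rng.choice(sorted(new))
    new[key] = min(max(new[key] * rng.uniform(0.5, 1.5), 1e-5), 0.9)
    return new


def search(start, time_limit_s=0.2, max_trials=500, seed=0):
    rng = random.Random(seed)
    best, best_loss = dict(start), validation_loss(start)
    history = [(best, best_loss)]  # plays the role of accepted commits
    deadline = time.monotonic() + time_limit_s
    for _ in range(max_trials):
        if time.monotonic() > deadline:  # the time limit constraint
            break
        candidate = propose(best, rng)
        loss = validation_loss(candidate)
        if loss < best_loss:  # the single metric decides keep or discard
            best, best_loss = candidate, loss
            history.append((best, best_loss))
    return best, best_loss, history


best, best_loss, history = search({"lr": 0.1, "dropout": 0.5})
print(len(history), best_loss)
```

Each entry in history corresponds to one accepted change, so the list only ever records improvements, much as the quoted description has commits accumulating only when a better setting is found.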

query this concept
$ db.articles("hyperparameter-optimization")
$ db.cooccurrence("hyperparameter-optimization")
$ db.contradictions("hyperparameter-optimization")