hyperparameter optimization

3 articles · 6 co-occurring · 0 contradictions · 5 briefs

"…accumulates git commits to the training script as it finds better settings (of lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc." — Art

2026-W15

"At just 30B parameters, it scores 87/120 on this year's Putnam" — provides evidence that competitive mathematical reasoning capability can be achieved with a relatively modest 30B-parameter model

"adding constraints like a single file to edit, a single metric to track, a time limit, and a well-written program.md is what makes this work. that combination is what makes @karpathy autoresearch actually …"
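The loop these notes describe (one metric to track, a time budget, keep the best settings as you go) can be sketched as a simple random search. This is an illustrative stand-in, not the actual autoresearch setup: `train()` and the search space below are invented for the example.

```python
import random
import time

def train(config):
    # Stand-in for a real training run: returns a fake "validation loss"
    # that is lower when lr is near 1e-3 and the hidden size is larger.
    lr, hidden = config["lr"], config["hidden"]
    return abs(lr - 1e-3) * 100 + 1.0 / hidden

def random_search(time_limit_s=1.0, seed=0):
    """Random search under a time limit, tracking a single metric."""
    rng = random.Random(seed)
    best_cfg, best_loss = None, float("inf")
    deadline = time.monotonic() + time_limit_s
    while time.monotonic() < deadline:
        cfg = {
            "lr": 10 ** rng.uniform(-5, -1),          # log-uniform learning rate
            "hidden": rng.choice([64, 128, 256, 512]),  # architecture knob
        }
        loss = train(cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
            # A real loop might record the improvement here,
            # e.g. commit the updated training script to git.
    return best_cfg, best_loss
```

The constraints from the quote map directly onto the sketch: the single metric is the returned loss, the time limit is the `deadline`, and "commit on improvement" is the branch that updates `best_cfg`.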

query this concept
$ db.articles("hyperparameter-optimization")
$ db.cooccurrence("hyperparameter-optimization")
$ db.contradictions("hyperparameter-optimization")