Mid-training optimization

3 articles · 15 co-occurring · 1 contradiction · 5 briefs

Mid-training after long-context pretraining gives the largest gains in math, code, and science while preserving general reasoning. Mid-training at 8k context degrades long-context ability, but this ca…
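
To make the schedule concrete, here is a minimal sketch of the two recipes being contrasted, as hypothetical Python pseudo-configs. Only the 8k context figure comes from the summary above; the stage names, field names, long-context length, and data mixture are assumptions for illustration.

# Hypothetical pseudo-configs; only the 8k figure is from the summary above.
recipe_preserving = [
    {"stage": "pretrain",  "context_len": 32_768, "data": "general web"},
    {"stage": "mid-train", "context_len": 32_768,  # keep long context
     "data": "math + code + science"},
]
recipe_degrading = [
    {"stage": "pretrain",  "context_len": 32_768, "data": "general web"},
    {"stage": "mid-train", "context_len": 8_192,   # short-context mid-training
     "data": "math + code + science"},             # degrades long-context ability
]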

@realmcore_: "...representation discovery with AlphaFold is generalizable..."

[STRONG] "A couple of AI researchers have already contacted me about the results, interested in the idea that maybe fine-tunning isn't the only way towards improving math performance." — Article challenges conventional assumption that fine-tuning is primary lever for performance improvement; suggests representation discovery is orthogonal and equally or more effective.

2026-W15 · 15

"New research explores alternatives to fine-tuning and improving reproducibility, with open datasets supporting diverse languages" — Article documents emerging research into alternatives to fine-tuning…

Query this concept
$ db.articles("mid-training-optimization")
$ db.cooccurrence("mid-training-optimization")
$ db.contradictions("mid-training-optimization")
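
A minimal sketch of how these queries might be scripted, assuming a hypothetical Python client whose methods mirror the commands above; the module name, constructor, and return shapes are assumptions, not a documented API.

# Hypothetical client mirroring the commands above; names and return
# shapes are assumed for illustration only.
from concept_db import ConceptDB  # assumed module

db = ConceptDB()
concept = "mid-training-optimization"

# The 3 source articles linked to this concept.
for article in db.articles(concept):
    print(article["title"])

# The 15 co-occurring concepts, strongest association first.
for other, weight in db.cooccurrence(concept):
    print(f"{other}: {weight}")

# The 1 recorded contradiction, surfaced for manual review.
for pair in db.contradictions(concept):
    print(pair["claim_a"], "<->", pair["claim_b"])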