training optimization

2 articles · 3 co-occurring · 0 contradictions · 99 briefs

"Listwise loss vs. Pointwise BCE: +30.7pp (the latter fails in highly homogeneous pools)": the comparison between listwise loss and pointwise BCE shows a 30.7pp improvement and identifies a failure mode of BCE in homogeneous...
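To make the two loss families concrete, here is a minimal sketch in plain Python. The source does not say which listwise loss was used or why BCE fails, so the softmax cross-entropy form, the function names, and the toy scores below are all assumptions for illustration. The sketch shows one structural difference only: the listwise loss depends on scores relative to the rest of the candidate pool, while pointwise BCE scores each candidate in isolation.

```python
import math

def pointwise_bce(scores, labels):
    """Mean binary cross-entropy on raw logits; each candidate is judged alone."""
    total = 0.0
    for s, y in zip(scores, labels):
        # Numerically stable form of -[y*log(sigmoid(s)) + (1-y)*log(1-sigmoid(s))]
        total += max(s, 0.0) - y * s + math.log1p(math.exp(-abs(s)))
    return total / len(scores)

def listwise_softmax_ce(scores, labels):
    """Softmax cross-entropy over the whole candidate list (assumed listwise form)."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))  # log-sum-exp
    n_pos = sum(labels)
    return -sum(y * (s - log_z) for s, y in zip(scores, labels)) / n_pos

# Hypothetical pool: one relevant candidate among near-identical scores.
scores = [2.0, 1.5, 1.8]
labels = [1, 0, 0]
shifted = [s + 3.0 for s in scores]  # same ordering, same gaps, higher baseline

listwise_gap = abs(listwise_softmax_ce(scores, labels) - listwise_softmax_ce(shifted, labels))
pointwise_gap = abs(pointwise_bce(scores, labels) - pointwise_bce(shifted, labels))
```

Shifting every score by the same constant leaves the listwise loss unchanged but moves the pointwise loss, because BCE is tied to absolute score levels and not to the ranking within the pool.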

Related concepts

retrieval ranking pipeline (1) · ranking systems (1) · hyperparameter optimization (1)

Signal history

Weeks 2026-W19 to 2026-W30

Evidence chain (2 articles, showing 2)

@shao__meng: A two-stage retrieve-and-rerank pipeline with only 1.2B total parameters (0.6B encoder + 0.6B reranker), designed for consumer-grade hardware. extends

@vivek_2332: definitely agree. the concept of autoresearch isn't new, letting llms optimiz... extends

for weight matrices the frobenius norm gradient (what adam and sgd use) is geometrically wrong. the "correct" steepest descent direction for a weight matrix is the one that minimizes the loss subject
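The excerpt is cut off before it names the constraint. Assuming it refers to a spectral-norm constraint on the update (an assumption, not stated in the source), the steepest descent direction for a gradient G = U S V^T is -U V^T, that is, the gradient with all singular values set to 1. The sketch below approximates U V^T with a cubic Newton-Schulz iteration in plain Python. The function names, the step count, and the example matrix are hypothetical, and the iteration assumes a nonzero, full-rank gradient.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def orthogonalize(grad, steps=30):
    """Approximate U V^T for grad = U S V^T (assumes grad is nonzero and full rank)."""
    # Dividing by the Frobenius norm puts every singular value in (0, 1],
    # which is inside the convergence region of the iteration below.
    fro = math.sqrt(sum(v * v for row in grad for v in row))
    x = [[v / fro for v in row] for row in grad]
    for _ in range(steps):
        # X <- 1.5 X - 0.5 X X^T X pushes each singular value toward 1.
        xxtx = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * x[i][j] - 0.5 * xxtx[i][j] for j in range(len(x[0]))]
             for i in range(len(x))]
    return x

grad = [[2.0, -1.0], [1.0, 3.0]]      # hypothetical gradient of a 2x2 weight matrix
direction = orthogonalize(grad)        # approximately U V^T
update = [[-v for v in row] for row in direction]  # descent step before scaling by a learning rate
gram = matmul(direction, transpose(direction))     # close to the identity if orthogonal
```

For comparison, plain SGD would step along -grad itself, which is the steepest descent direction under the Frobenius norm.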

query this concept

$ db.articles("training-optimization")

$ db.cooccurrence("training-optimization")

$ db.contradictions("training-optimization")