mixture of experts

2 articles · 1 co-occurring · 0 contradictions · 5 briefs

"hybrid SSM Mixture of Experts architecture" — NVIDIA Nemotron 3 Nano demonstrates a novel hybrid SSM/MoE architecture with a favorable trade-off between accuracy and inference efficiency.

2026-W15 · 10

[INFERRED] "router is still sending every difficult token to the single expert that memorized Wikipedia" — Post describes routing behavior in a mixture-of-experts model where difficult tokens concentrate on a single expert.
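The collapse described above can be made concrete with a minimal top-1 routing sketch. This is an illustrative toy, not the referenced model's code: all names (`router_w`, `num_experts`, etc.) are hypothetical, and the router is just a random linear layer. Counting per-expert load after argmax routing shows how, without a load-balancing mechanism, tokens can pile onto one expert.

```python
# Minimal top-1 MoE routing sketch (hypothetical names; not any specific model's code).
# A linear router scores each token per expert; each token goes to its argmax expert.
# The per-expert load vector reveals routing imbalance.
import numpy as np

rng = np.random.default_rng(0)
num_tokens, d_model, num_experts = 64, 16, 4

tokens = rng.normal(size=(num_tokens, d_model))     # token representations
router_w = rng.normal(size=(d_model, num_experts))  # router weights

logits = tokens @ router_w                           # (tokens, experts) routing scores
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)            # softmax over experts
assignment = probs.argmax(axis=1)                    # top-1: each token -> one expert

# Fraction of tokens routed to each expert; a spike at one index is the
# "every difficult token to one expert" failure mode.
load = np.bincount(assignment, minlength=num_experts) / num_tokens
print("fraction of tokens per expert:", load)
```

In practice, MoE training adds an auxiliary load-balancing loss (e.g. penalizing the dot product of mean routing probability and actual per-expert load) precisely to discourage the collapse this sketch measures.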

query this concept
$ db.articles("mixture-of-experts")
$ db.cooccurrence("mixture-of-experts")
$ db.contradictions("mixture-of-experts")