mixture of experts
2 articles · 1 co-occurring · 0 contradictions · 5 briefs
"hybrid SSM Mixture of Experts architecture" — NVIDIA Nemotron 3 Nano demonstrates a novel hybrid SSM-MoE architecture that achieves a favorable trade-off between accuracy and inference efficiency
2026-W15 · 10
@code_star: "I don't know who this brand-new account belongs to but they are absolutely co..." — example_of
[INFERRED] "router is still sending every difficult token to the single expert that memorized Wikipedia" — Post describes routing behavior in a mixture-of-experts model where difficult tokens concentrate on a single expert
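The routing collapse described above can be sketched in a few lines. This is a hypothetical illustration, not Nemotron's actual router: a softmax top-1 MoE gate whose learned bias has collapsed onto one expert, so every token is sent to the same place regardless of content. All names (`route`, `w`, `bias`, expert count) are invented for the sketch.

```python
import math
import random

random.seed(0)

# Toy setup: 8 tokens, 16-dim embeddings, 4 experts.
n_tokens, d_model, n_experts = 8, 16, 4
tokens = [[random.gauss(0, 1) for _ in range(d_model)] for _ in range(n_tokens)]

# Tiny random router weights plus a large bias on expert 2: this mimics
# the failure mode where the gate's logits barely depend on the token.
w = [[random.gauss(0, 0.01) for _ in range(n_experts)] for _ in range(d_model)]
bias = [0.0, 0.0, 5.0, 0.0]

def route(tok):
    # Softmax over router logits, then top-1 expert selection.
    logits = [sum(tok[i] * w[i][e] for i in range(d_model)) + bias[e]
              for e in range(n_experts)]
    m = max(logits)
    probs = [math.exp(l - m) for l in logits]
    z = sum(probs)
    probs = [p / z for p in probs]
    return max(range(n_experts), key=lambda e: probs[e])

# Count how many tokens each expert receives.
load = [0] * n_experts
for tok in tokens:
    load[route(tok)] += 1
print(load)  # every token lands on the biased expert 2
```

Balanced MoE training typically counters exactly this with an auxiliary load-balancing loss that penalizes skewed per-expert token counts.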
query this concept
$ db.articles("mixture-of-experts")
$ db.cooccurrence("mixture-of-experts")
$ db.contradictions("mixture-of-experts")