HyperMoE: Paying Attention to Unselected Experts in Mixture of Experts via Dynamic Transfer.
Hao Zhao
Zihan Qiu
Huijia Wu
Zili Wang
Zhaofeng He
Jie Fu
Published in: CoRR (2024)
Keyphrases
paying attention
expert finding
mixture model
dynamically changing
real time
data sets
databases
neural network
video sequences