MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts.
Maciej Pióro, Kamil Ciebiera, Krystian Król, Jan Ludziejewski, Sebastian Jaszczur
Published in: CoRR (2024)
Keyphrases
- state space
- parameter estimation
- prior knowledge
- probabilistic model
- neural network
- feature selection
- statistical model
- exponential family
- data sets
- computational models
- Bayesian framework
- dynamical systems
- Gaussian mixture model
- optimal policy
- model selection
- maximum likelihood
- data structure
- reinforcement learning
- decision trees