Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts.

Huy Nguyen, Nhat Ho, Alessandro Rinaldo
Published in: CoRR (2024)
Keyphrases
  • mixture model
  • lightweight
  • database
  • data mining
  • artificial neural networks
  • semi-supervised
  • language model
  • image reconstruction
  • computationally expensive
  • belief networks
  • activation function
  • lung cancer