FFSplit: Split Feed-Forward Network For Optimizing Accuracy-Efficiency Trade-off in Language Model Inference.
Zirui Liu, Qingquan Song, Qiang Charles Xiao, Sathiya Keerthi Selvaraj, Rahul Mazumder, Aman Gupta, Xia Hu
Published in: CoRR (2024)
Keyphrases
- bayesian inference
- language model
- feed forward
- probabilistic model
- trade off
- recurrent networks
- language modeling
- back propagation
- spiking neural networks
- n gram
- artificial neural networks
- document retrieval
- neural network
- spiking neurons
- speech recognition
- neural nets
- retrieval model
- test collection
- mixture model
- information retrieval
- language modelling
- context sensitive
- query expansion
- ad hoc information retrieval
- hidden layer
- statistical language models
- network structure
- feed forward neural networks
- language model for information retrieval
- translation model
- relevance model
- activation function
- bayesian networks
- word clouds
- pseudo relevance feedback
- data fusion
- statistical machine translation
- query specific
- query terms
- data mining