An LPDDR-based CXL-PNM Platform for TCO-efficient Inference of Transformer-based Large Language Models.
Sangsoo Park, KyungSoo Kim, Jinin So, Jin Jung, Jonggeon Lee, Kyoungwan Woo, Nayeon Kim, Younghyun Lee, Hyungyo Kim, Yongsuk Kwon, Jinhyun Kim, Jieun Lee, YeonGon Cho, Yongmin Tai, Jeonghyeon Cho, Hoyoung Song, Jung Ho Ahn, Nam Sung Kim
Published in: HPCA (2024)
Keyphrases
- language model
- efficient inference
- language modeling
- probabilistic model
- probabilistic inference
- conditional random fields
- hidden variables
- markov random field
- n-gram
- fully connected
- information retrieval
- exact inference
- approximate inference
- structured prediction
- query expansion
- human pose estimation
- markov networks
- bayesian networks
- smoothing methods
- graph structure
- relevance model
- graphical models
- factor graphs
- similarity measure
- computer vision
- learning algorithm