Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer.

Qingru Zhang, Dhananjay Ram, Cole Hawkins, Sheng Zha, Tuo Zhao
Published in: CoRR (2023)
Keyphrases
  • long range
  • short range
  • conditional random fields
  • long range correlations
  • neural network
  • similarity measure
  • artificial neural networks
  • graphical models