Transformers generalize differently from information stored in context vs in weights.
Stephanie C. Y. Chan, Ishita Dasgupta, Junkyung Kim, Dharshan Kumaran, Andrew K. Lampinen, Felix Hill
Published in: CoRR (2022)
Keyphrases
- contextual information
- computer systems
- neural network
- global information
- information seeking
- information content
- information sharing
- context aware
- higher level
- end users
- information processing
- prior knowledge
- spatial information
- image sequences
- prior information
- decision making
- genetic algorithm
- information flow
- context dependent
- weighted sum
- database
- spatial context
- global context