Cognitive Dissonance: Why Do Language Model Outputs Disagree with Internal Representations of Truthfulness?
Kevin Liu, Stephen Casper, Dylan Hadfield-Menell, Jacob Andreas
Published in: CoRR (2023)
Keyphrases
- language model
- internal representations
- language modeling
- cognitive model
- cognitive processing
- sensory data
- probabilistic model
- n-gram
- retrieval model
- information retrieval
- query expansion
- test collection
- high level
- sensory information
- ad hoc information retrieval
- cognitive processes
- receptive fields
- mixture model
- computational models
- information processing
- smoothing methods
- decision trees
- cognitive science
- cognitive architecture
- mobile robot
- input data
- neural network
- dimensionality reduction
- mental models
- hidden units
- computer simulation