Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective.
Tom Everitt
Marcus Hutter
Ramana Kumar
Victoria Krakovna
Published in:
Synthese (2021)
Keyphrases
reinforcement learning
decision problems
influence diagrams
sequential decision making
sequential decision problems
learning algorithm
decision making
neural network
search algorithm
computational complexity
graphical models
partially observable