Goal Misgeneralization in Deep Reinforcement Learning

Lauro Langosco di Langosco, Jack Koch, Lee D. Sharkey, Jacob Pfau, David Krueger

Published in: ICML (2022)

Keyphrases

reinforcement learning
state space
evolutionary algorithm
data sets
data mining
search engine
website
database systems
markov decision processes
function approximation
optimal control
temporal difference
deep learning
autonomous learning
agent learns
robotic control