Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models and Amortized Policy Search.

Qi Wang, Herke van Hoof

Published in: ICML (2022)

Keyphrases

policy search
reinforcement learning
model free
reinforcement learning algorithms
continuous state
dynamic programming
worst case
reward function
random walk
function approximation
continuous action
markov decision processes
neural network
function approximators
policy gradient
state space
machine learning
transfer learning
heuristic search
partially observable markov decision processes
control policies
hidden state
supervised learning