Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models and Amortized Policy Search.
Qi Wang, Herke van Hoof
Published in: ICML (2022)
Keyphrases
- policy search
- reinforcement learning
- model free
- reinforcement learning algorithms
- continuous state
- dynamic programming
- worst case
- reward function
- random walk
- function approximation
- continuous action
- Markov decision processes
- neural network
- function approximators
- policy gradient
- state space
- machine learning
- transfer learning
- heuristic search
- partially observable Markov decision processes
- control policies
- hidden state
- supervised learning