Generalized Hidden Parameter MDPs: Transferable Model-Based RL in a Handful of Trials.
Christian F. Perez, Felipe Petroski Such, Theofanis Karaletsos
Published in: AAAI (2020)
Keyphrases
- reinforcement learning
- Markov decision processes
- model free
- state space
- optimal policy
- policy iteration
- Markov decision process
- reinforcement learning algorithms
- state and action spaces
- action space
- policy evaluation
- finite state
- partially observable
- average reward
- function approximation
- infinite horizon
- parameter values
- Markov decision problems
- multi agent
- policy search
- factored MDPs
- learning algorithm
- average cost
- control policy
- reward function
- optimal control
- decision diagrams
- parameter space
- reinforcement learning problems
- sufficient conditions
- machine learning