Probabilistic Policy Reuse for Safe Reinforcement Learning.
Javier García, Fernando Fernández
Published in: ACM Trans. Auton. Adapt. Syst. (2019)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- action selection
- Markov decision process
- function approximation
- action space
- Markov decision processes
- function approximators
- policy gradient
- generative model
- reinforcement learning problems
- policy iteration
- state and action spaces
- control policies
- state space
- Markov decision problems
- state action
- reinforcement learning algorithms
- uncertain data
- probabilistic model
- Bayesian networks
- reward function
- neural network
- transfer learning
- learning algorithm
- partially observable
- temporal difference learning
- approximate dynamic programming
- average reward
- control policy
- state dependent
- optimal control
- learning objects
- dynamic programming
- multi-agent
- decision making
- partially observable Markov decision processes
- asymptotically optimal
- finite state
- policy making
- actor-critic
- multi-agent systems
- partially observable domains