Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis.
Rohan Mitta, Hosein Hasanbeig, Jun Wang, Daniel Kroening, Yiannis Kantaros, Alessandro Abate
Published in: AAAI (2024)
Keyphrases
- control policy
- reinforcement learning
- exploration exploitation
- exploration strategy
- control policies
- approximate dynamic programming
- long run
- admission control
- function approximation
- action selection
- active exploration
- state space
- model free
- optimal policy
- batch mode
- reinforcement learning algorithms
- posterior probability
- multi agent
- Bayesian networks
- temporal difference
- average cost
- real time
- machine learning
- design space exploration
- model based reinforcement learning
- probabilistic model
- dynamic programming
- learning algorithm
- infinite horizon
- optimal control
- control strategy
- Markov chain
- mobile robot
- computational complexity
- balancing exploration and exploitation