Provably Safe Reinforcement Learning via Action Projection using Reachability Analysis and Polynomial Zonotopes.
Niklas Kochdumper, Hanna Krasowski, Xiao Wang, Stanley Bak, Matthias Althoff
Published in: CoRR (2022)
Keyphrases
- reachability analysis
- Markov decision processes
- reinforcement learning
- action space
- state space
- action selection
- model checking
- optimal policy
- reinforcement learning algorithms
- reward shaping
- state action
- partially observable domains
- finite state
- dynamic programming
- timed automata
- partially observable
- incremental algorithms
- function approximation
- policy iteration
- worst case
- initial state
- fitted Q iteration
- sensory inputs
- real time
- average cost
- infinite horizon
- learning algorithm
- decision processes
- transition model
- agent learns
- optimal control
- Markov chain
- machine learning