Wasserstein Auto-encoded MDPs: Formal Verification of Efficiently Distilled RL Policies with Many-sided Guarantees.
Florent Delgrange, Ann Nowé, Guillermo A. Pérez
Published in: ICLR (2023)
Keyphrases
- formal verification
- optimal policy
- Markov decision processes
- reinforcement learning
- Markov decision process
- model checking
- Markov decision problems
- policy search
- state space
- reward function
- model checker
- initial state
- finite horizon
- decision problems
- average cost
- automated verification
- policy iteration
- control policies
- dynamic programming
- average reward
- symbolic model checking
- semi-Markov decision process
- reinforcement learning algorithms
- long run
- partially observable Markov decision processes
- bounded model checking
- control policy
- decision processes
- state and action spaces
- factored MDPs
- multi-agent systems
- continuous state and action spaces
- sufficient conditions
- action sets
- optimal control
- Markov games
- discounted reward
- temporal logic
- action space
- pointwise
- RL algorithms
- infinite horizon
- stochastic games
- multiple agents
- model-free