Wasserstein Auto-encoded MDPs: Formal Verification of Efficiently Distilled RL Policies with Many-sided Guarantees.
Florent Delgrange, Ann Nowé, Guillermo A. Pérez
Published in: CoRR (2023)
Keyphrases
- formal verification
- optimal policy
- Markov decision processes
- reinforcement learning
- Markov decision process
- model checking
- state space
- policy search
- finite horizon
- Markov decision problems
- reward function
- average cost
- finite state
- initial state
- bounded model checking
- control policies
- model checker
- decision problems
- infinite horizon
- dynamic programming
- policy iteration
- action space
- automated verification
- discounted reward
- symbolic model checking
- decision processes
- reinforcement learning algorithms
- control policy
- long run
- decision theoretic planning
- program slicing
- learning algorithm
- average reward
- partially observable
- action selection
- sufficient conditions
- multi-agent
- web services