Conformal Off-Policy Evaluation in Markov Decision Processes.
Daniele Foffano, Alessio Russo, Alexandre Proutière
Published in: CoRR (2023)
Keyphrases
- policy evaluation
- markov decision processes
- policy iteration
- optimal policy
- finite state
- reinforcement learning
- state space
- dynamic programming
- average reward
- reinforcement learning algorithms
- decision processes
- partially observable
- infinite horizon
- planning under uncertainty
- average cost
- reward function
- markov decision process
- least squares
- model free
- markov decision problems