Statistical Inference of the Value Function for Reinforcement Learning in Infinite Horizon Settings.
Chengchun Shi, Sheng Zhang, Wenbin Lu, Rui Song
Published in: CoRR (2020)
Keyphrases
- infinite horizon
- statistical inference
- reinforcement learning
- optimal policy
- optimal control
- Markov decision processes
- partially observable
- Markov decision process
- finite horizon
- state space
- dynamic programming
- policy iteration
- policy evaluation
- model selection
- graphical models
- Bayesian inference
- single item
- production planning
- long run
- stochastic demand
- statistical learning
- machine learning
- function approximators
- lead time
- function approximation
- average cost
- policy gradient
- multi-agent
- Markov decision problems
- statistical methods
- multistage
- particle filter
- reward function
- reinforcement learning algorithms
- sufficient conditions
- inventory level
- feature selection