Online Parameter Estimation in Partially Observed Markov Decision Processes.
Sai Sumedh R. Hindupur, Vivek S. Borkar
Published in: Allerton (2023)
Keyphrases
- parameter estimation
- markov decision processes
- partially observed
- least squares
- expected reward
- maximum likelihood
- transition matrices
- policy iteration
- model selection
- markov random field
- optimal policy
- dynamic programming
- finite state
- state space
- parameter estimation algorithm
- random fields
- reinforcement learning
- planning under uncertainty
- decision theoretic planning
- model based reinforcement learning
- partially observable
- average reward
- markov decision process
- expectation maximization
- reachability analysis
- average cost
- em algorithm
- finite horizon
- approximate inference
- discounted reward
- infinite horizon
- action space
- structure learning
- action sets
- parameter estimates
- probability distribution
- pairwise