Reinforcement Learning for POMDP: Partitioned Rollout and Policy Iteration With Application to Autonomous Sequential Repair Problems.
Sushmita Bhattacharya, Sahil Badyal, Thomas Wheeler, Stephanie Gil, Dimitri P. Bertsekas
Published in: IEEE Robotics Autom. Lett. (2020)
Keyphrases
- reinforcement learning
- policy iteration
- Markov decision process
- Markov decision processes
- optimal policy
- model free
- policy evaluation
- Markov decision problems
- partially observable Markov decision processes
- control problems
- state and action spaces
- temporal difference
- function approximation
- state space
- approximate policy iteration
- stochastic approximation
- temporal difference learning
- average reward
- machine learning
- finite state
- partially observable
- action space
- reinforcement learning algorithms
- decision problems
- reinforcement learning methods
- infinite horizon
- approximate solutions
- action selection
- rl algorithms
- continuous state
- sample path
- least squares
- dynamic programming