Approximation Methods for Partially Observed Markov Decision Processes (POMDPs).
Caleb M. Bowyer. Published in: CoRR (2021)
Keyphrases
- approximation methods
- partially observed
- Markov decision processes
- belief state
- state space
- expected reward
- partially observable Markov decision processes
- partially observable
- reinforcement learning
- finite state
- optimal policy
- policy iteration
- planning under uncertainty
- function approximators
- average reward
- dynamic programming
- infinite horizon
- decision theoretic planning
- average cost
- policy iteration algorithm
- finite horizon
- reinforcement learning algorithms
- action space
- neural network
- heuristic search
- Markov decision process
- policy gradient
- search space
- reward function
- basis functions
- Markov chain
- learning algorithm
- machine learning
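
Several of the keyphrases above (belief state, expected reward, partially observable Markov decision processes) revolve around the belief-state update that underlies POMDP planning. Below is a minimal, hedged sketch of that update for a toy two-state POMDP; all matrices, probabilities, and names are illustrative assumptions and are not taken from the paper itself.

```python
import numpy as np

# Illustrative two-state POMDP; every number here is a made-up assumption.
# T[a][s, s'] : probability of moving from state s to s' under action a
T = {
    0: np.array([[0.9, 0.1],
                 [0.2, 0.8]]),
}

# O[a][s', o] : probability of observing o after landing in state s' under action a
O = {
    0: np.array([[0.8, 0.2],
                 [0.3, 0.7]]),
}

def belief_update(b, a, o):
    """Bayes-filter update: b'(s') is proportional to O(o | s', a) * sum_s T(s' | s, a) * b(s)."""
    predicted = T[a].T @ b            # predictive distribution over next states
    unnormalized = O[a][:, o] * predicted
    return unnormalized / unnormalized.sum()

if __name__ == "__main__":
    b0 = np.array([0.5, 0.5])         # uniform prior over the two hidden states
    b1 = belief_update(b0, a=0, o=1)  # take action 0, then observe observation 1
    print(b1)                         # posterior belief over the hidden states
```

Approximation methods of the kind surveyed in the paper operate on this belief simplex, for example by restricting value functions to a set of basis functions or by applying point-based value/policy iteration over a finite set of sampled beliefs.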