Sound Heuristic Search Value Iteration for Undiscounted POMDPs with Reachability Objectives.
Qi Heng Ho, Martin S. Feather, Federico Rossi, Zachary N. Sunberg, Morteza Lahijanian
Published in: CoRR (2024)
Keyphrases
- heuristic search
- state space
- markov decision processes
- markov decision problems
- partially observable markov decision processes
- belief state
- partially observable
- optimal policy
- average reward
- reinforcement learning
- policy iteration
- dynamic programming
- infinite horizon
- markov decision process
- planning problems
- finite state
- orders of magnitude
- belief space
- search problems
- search space
- markov chain
- decision theoretic planning
- planning under uncertainty
- heuristic function
- ai planning
- path finding
- action space
- search strategies
- reward function
- dynamical systems
- bidirectional search
- partial observability
- pattern databases
- dec pomdps
- initial state
- average cost
- automated planning
- probabilistic planning
- state space search
- linear programming