Strengthening Deterministic Policies for POMDPs.
Leonore Winterer, Ralf Wimmer, Nils Jansen, Bernd Becker
Published in: NFM (2020)
Keyphrases
- partially observable Markov decision processes
- optimal policy
- policy gradient methods
- reinforcement learning
- policy search
- stochastic domains
- finite state
- Markov decision processes
- predictive state representations
- Markov decision problems
- continuous state
- dynamic programming
- planning under uncertainty
- belief state
- stationary policies
- decision problems
- state space
- dynamical systems
- expected reward
- control policies
- Markov decision process
- multi-agent
- partially observable
- machine learning
- policy iteration algorithm
- distributed constraint optimization
- partial observability
- reward function
- belief space
- average reward
- decision processes