Reward-Bounded Reachability Probability for Uncertain Weighted MDPs.
Vahid Hashemi, Holger Hermanns, Lei Song
Published in: VMCAI (2016)
Keyphrases
- reinforcement learning
- state space
- Markov decision processes
- expected reward
- reward function
- average reward
- optimal policy
- discounted reward
- partially observed
- factored MDPs
- probability distribution
- dynamic programming
- transition probabilities
- partially observable
- decreasing function
- semi-Markov decision processes
- policy search
- possibility theory
- finite horizon
- finite state
- incomplete information
- stationary policies
- initial state
- reinforcement learning algorithms
- factored Markov decision processes