Robust Online Optimization of Reward-Uncertain MDPs.
Kevin Regan, Craig Boutilier
Published in: IJCAI (2011)
Keyphrases
- reinforcement learning
- Markov decision processes
- online learning
- average reward
- optimization problems
- robust stability
- reward function
- global optimization
- incomplete information
- decision making
- robust optimization
- real time
- optimal policy
- optimization algorithm
- optimization method
- computationally efficient
- state space
- constrained optimization
- possibility theory
- machine learning