Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes.
Chenlu Ye, Wei Xiong, Quanquan Gu, Tong Zhang
Published in: CoRR (2022)
Keyphrases
- markov decision processes
- policy iteration
- reachability analysis
- markov decision process
- dynamic programming
- optimal policy
- factored mdps
- reinforcement learning
- state space
- learning algorithm
- transition matrices
- stochastic shortest path
- decision theoretic planning
- planning under uncertainty
- finite horizon
- robust optimization
- infinite horizon
- finite state
- fixed point
- least squares
- multi agent