Regret-Based Optimization for Robust Reinforcement Learning.
Roman Belaire, Pradeep Varakantham, David Lo
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- optimization problems
- global optimization
- machine learning
- optimization algorithm
- genetic algorithm
- neural network
- function approximation
- robust optimization
- learning process
- lower bound
- linear programming
- online learning
- Markov decision processes
- optimization method
- optimization methods
- action selection
- reward function
- total reward