Bi-level Optimization Method for Automatic Reward Shaping of Reinforcement Learning.
Ludi Wang, Zhaolei Wang, Qinghai Gong
Published in: ICANN (3) (2022)
Keyphrases
- optimization method
- reward shaping
- bi-level
- reinforcement learning
- optimization algorithm
- reinforcement learning algorithms
- optimization methods
- simulated annealing
- genetic algorithm
- gray scale
- particle swarm
- function approximation
- complex domains
- differential evolution
- evolutionary algorithm
- markov decision problems
- metaheuristic
- nelder mead simplex
- state space
- model free
- markov decision processes
- temporal difference
- learning process
- learning algorithm
- machine learning
- particle swarm optimization
- supervised learning
- optimal control
- dynamic programming
- domain knowledge
- multi agent
- continuous state