Reward-Adaptive Reinforcement Learning: Dynamic Policy Gradient Optimization for Bipedal Locomotion.
Changxin Huang, Guangrun Wang, Zhibo Zhou, Ronghui Zhang, Liang Lin
Published in: IEEE Trans. Pattern Anal. Mach. Intell. (2023)
Keyphrases
- policy gradient
- reinforcement learning
- parametric optimization
- actor critic
- function approximation
- reinforcement learning algorithms
- policy search
- policy gradient methods
- gradient method
- model free reinforcement learning
- optimal control
- adaptive control
- state space
- single agent
- average reward
- optimization methods
- dynamic environments
- temporal difference
- state action
- machine learning
- function approximators
- optimization method
- partially observable markov decision processes
- control problems
- reinforcement learning methods
- rl algorithms
- model free
- approximate dynamic programming
- learning problems
- optimization algorithm
- mobile robot
- neural network