Trade-Off Between Robustness and Rewards Adversarial Training for Deep Reinforcement Learning Under Large Perturbations.
Jeffrey Huang, Ho Jin Choi, Nadia Figueroa
Published in: IEEE Robotics Autom. Lett. (2023)
Keyphrases
- reinforcement learning
- trade off
- multi agent
- Markov decision processes
- supervised learning
- optimal policy
- model free
- state space
- training algorithm
- optimal control
- function approximation
- online learning
- training set
- deep architectures
- learning algorithm
- neural network
- temporal difference
- reward shaping
- training process
- training examples
- learning process
- machine learning
- learning problems
- training phase
- partially observable
- function approximators
- control policy
- feature selection