Adversarial Policy Training against Deep Reinforcement Learning.
Xian Wu, Wenbo Guo, Hua Wei, Xinyu Xing
Published in: USENIX Security Symposium (2021)
Keyphrases
- reinforcement learning
- optimal policy
- multi agent
- action selection
- policy search
- markov decision process
- training set
- test set
- markov decision processes
- control policies
- function approximation
- dynamic programming
- control policy
- function approximators
- model free
- partially observable environments
- partially observable
- policy iteration
- decision problems
- supervised learning
- training process
- temporal difference
- deep architectures
- online learning
- state space
- approximate dynamic programming
- machine learning
- training phase
- reinforcement learning problems
- inverse reinforcement learning
- training data
- continuous state spaces
- policy evaluation
- markov decision problems
- learning process
- state action
- action space
- markov chain
- reinforcement learning algorithms
- optimal control