Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning.
Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, Adish Singla
Published in: CoRR (2020)
Keyphrases
- control policy
- reinforcement learning
- long run
- optimal policy
- learning process
- supervised learning
- online learning
- multi agent
- e-learning
- function approximation
- policy iteration
- partially observable
- training set
- state space
- dynamic environments
- complex environments
- mobile robot
- higher education
- Markov decision processes
- policy search
- learning community
- computer based training
- model free
- learning environment
- reward signal
- partially observable Markov decision process
- medical students
- rl algorithms
- online environment
- reinforcement learning problems
- neural network
- function approximators
- real robot
- action selection
- watermarking algorithm
- problem based learning
- access control
- dynamic programming
- social networks
- learning algorithm