Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning.
Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, Adish Singla
Published in: ICML (2020)
Keyphrases
- reinforcement learning
- optimal policy
- learning process
- multi-agent
- supervised learning
- learning environment
- policy search
- mobile robot
- Markov decision process
- online learning
- dynamic environments
- RoboCup soccer
- training set
- computer-based training
- infinite horizon
- e-learning
- agent receives
- machine learning
- reward signal
- medical students
- skill acquisition
- teacher training
- educational institutions
- partially observable
- action selection
- countermeasures
- problem-based learning
- learning community
- function approximation
- high school
- higher education
- dynamic programming