Towards Optimal Attacks on Reinforcement Learning Policies.
Alessio Russo, Alexandre Proutière
Published in: ACC (2021)
Keyphrases
- reinforcement learning
- optimal policy
- control policy
- control policies
- cooperative multi agent systems
- optimal control
- dynamic programming
- neural network
- function approximation
- approximate dynamic programming
- policy search
- state space
- markov decision process
- optimal solution
- average reward
- model free
- total reward
- supply chain
- markov decision processes
- finite horizon
- multi agent
- machine learning