Simultaneous Control and Human Feedback in the Training of a Robotic Agent with Actor-Critic Reinforcement Learning.
Kory W. Mathewson, Patrick M. Pilarski
Published in: CoRR (2016)
Keyphrases
- actor critic
- reinforcement learning
- optimal control
- control problems
- policy gradient
- temporal difference
- function approximation
- approximate dynamic programming
- reinforcement learning algorithms
- control system
- gradient method
- control strategy
- supervised learning
- neuro fuzzy
- control strategies
- control policy
- policy iteration
- dynamic programming
- average reward
- training set
- model free
- action selection
- learning algorithm
- control method
- control law
- optimal policy
- rl algorithms
- markov decision processes
- multi agent
- policy gradient methods