Actor-Critic Reinforcement Learning with Simultaneous Human Control and Feedback.

Kory W. Mathewson, Patrick M. Pilarski

Published in: CoRR (2017)

Keyphrases

actor critic
reinforcement learning
optimal control
control problems
policy gradient
temporal difference
function approximation
reinforcement learning algorithms
approximate dynamic programming
control strategies
gradient method
policy iteration
adaptive control
neuro fuzzy
control strategy
control policy
state space
model free
action selection
policy gradient methods
neural network
control method
Markov decision processes
transfer learning
control system
learning algorithm
temporal difference learning
supervised learning
dynamic programming