Reinforcement Learning with Trajectory Feedback.

Yonathan Efroni, Nadav Merlis, Shie Mannor

Published in: CoRR (2020)

Keyphrases

reinforcement learning
function approximation
state space
learning algorithm
multi-agent
model-free
relevance feedback
multi-agent reinforcement learning
reinforcement learning algorithms
machine learning
optimal policy
Markov decision processes
temporal difference learning
action selection
dynamic programming
feedback mechanisms
autonomous learning
robotic control
learning problems
learning agent
robot control
temporal difference
data sets
case study
neural network