End-to-End Robotic Reinforcement Learning without Reward Engineering.

Avi Singh, Larry Yang, Kristian Hartikainen, Chelsea Finn, Sergey Levine

Published in: CoRR (2019)

Keyphrases

end to end
reinforcement learning
real robot
state space
wireless ad hoc networks
multipath
reinforcement learning algorithms
learning algorithm
congestion control
ad hoc networks
markov decision processes
optimal policy
average reward
eligibility traces
model free
reward function
mobile robot
real time
high bandwidth
internet protocol
admission control
policy gradient
rate allocation
scalable video
packet loss rate