End-to-End Robotic Reinforcement Learning without Reward Engineering.
Avi Singh, Larry Yang, Kristian Hartikainen, Chelsea Finn, Sergey Levine
Published in: CoRR (2019)
Keyphrases
- end to end
- reinforcement learning
- real robot
- state space
- wireless ad hoc networks
- multipath
- reinforcement learning algorithms
- learning algorithm
- congestion control
- ad hoc networks
- markov decision processes
- optimal policy
- average reward
- eligibility traces
- model free
- reward function
- mobile robot
- real time
- high bandwidth
- internet protocol
- admission control
- policy gradient
- rate allocation
- scalable video
- packet loss rate