High Acceleration Reinforcement Learning for Real-World Juggling with Binary Rewards.
Kai Ploeger, Michael Lutter, Jan Peters
Published in: CoRR (2020)
Keyphrases
- reinforcement learning
- real world
- wide range
- Markov decision processes
- function approximation
- learning algorithm
- optimal control
- data sets
- data mining
- reinforcement learning algorithms
- model free
- state space
- dynamic programming
- multi agent
- machine learning
- robotic control
- Markov decision process
- reward function
- bandit problems
- multi class
- Hamming distance
- optimal policy
- synthetic data
- database
- neural network
- real time