High Acceleration Reinforcement Learning for Real-World Juggling with Binary Rewards.

Kai Ploeger, Michael Lutter, Jan Peters

Published in: CoRR (2020)

Keyphrases

reinforcement learning
real world
wide range
markov decision processes
function approximation
learning algorithm
optimal control
data sets
data mining
reinforcement learning algorithms
model free
state space
dynamic programming
multi agent
machine learning
robotic control
markov decision process
reward function
bandit problems
multi class
hamming distance
optimal policy
synthetic data
database
neural network
real time