Parallel reward and punishment control in humans and robots: Safe reinforcement learning using the MaxPain algorithm.

Stefan Elfwing, Ben Seymour

Published in: ICDL-EPIROB (2017)

Keyphrases

reinforcement learning
learning algorithm
dynamic programming
parallel implementation
detection algorithm
simulated annealing
multi-robot
k-means
preprocessing
objective function
particle swarm optimization
segmentation algorithm
optimal solution
lower bound
state space
expectation maximization
computational complexity
convergence rate
robot control
artificial agents