Parallel reward and punishment control in humans and robots: Safe reinforcement learning using the MaxPain algorithm.
Stefan Elfwing, Ben Seymour
Published in: ICDL-EPIROB (2017)
Keyphrases
- reinforcement learning
- learning algorithm
- dynamic programming
- parallel implementation
- detection algorithm
- simulated annealing
- multi-robot
- k-means
- preprocessing
- objective function
- particle swarm optimization
- segmentation algorithm
- optimal solution
- lower bound
- state space
- expectation maximization
- computational complexity
- convergence rate
- robot control
- artificial agents