Student-t policy in reinforcement learning to acquire global optimum of robot control.
Taisuke Kobayashi
Published in: Appl. Intell. (2019)
Keyphrases
- robot control
- global optimum
- reinforcement learning
- optimal policy
- optimization method
- simulated annealing
- policy search
- search space
- optimal solution
- global convergence
- global solution
- objective function
- learning process
- autonomous robots
- mobile robot
- step size
- Markov decision process
- action selection
- function approximation
- unstructured environments
- state space
- reward function
- model-free
- action space
- function approximators
- reinforcement learning algorithms
- motion control
- Markov decision processes
- convergence rate
- dynamic programming
- evolutionary algorithm
- PID controller