Reinforcement Learning by Stochastic Hill Climbing on Discounted Reward.
Hajime Kimura, Masayuki Yamamura, Shigenobu Kobayashi
Published in: ICML (1995)
Keyphrases
- hill climbing
- discounted reward
- reinforcement learning
- markov decision processes
- average reward
- state and action spaces
- policy iteration
- simulated annealing
- search space
- optimal policy
- hierarchical reinforcement learning
- state space
- search procedure
- search algorithm
- genetic algorithm (GA)
- reinforcement learning algorithms
- search strategy
- function approximation
- model free
- path finding
- optimality criterion
- steepest ascent
- finite state
- optimal control
- tabu search
- action space
- machine learning
- temporal difference
- monte carlo
- markov decision process
- long run
- partially observable
- average cost
- dynamic programming
- lower bound
- learning algorithm
- genetic algorithm
- neural network