An Alternative Softmax Operator for Reinforcement Learning.
Kavosh Asadi, Michael L. Littman
Published in: ICML (2017)
Keyphrases
- reinforcement learning
- temporal difference learning
- state space
- reinforcement learning algorithms
- function approximation
- learning algorithm
- optimal control
- robotic control
- stochastic approximation
- model free
- neural network
- multi agent reinforcement learning
- temporal difference
- action selection
- Markov decision processes
- least squares
- convergence rate
- learning problems
- learning process
- learning capabilities
- multi agent
- decision making
- genetic algorithm
- data sets