Using a Logarithmic Mapping to Enable Lower Discount Factors in Reinforcement Learning.
Harm van Seijen, Mehdi Fatemi, Arash Tavakoli
Published in: CoRR (2019)
Keyphrases
- reinforcement learning
- function approximation
- state space
- factors that affect
- factors that influence
- robotic control
- multi agent
- factors affecting
- machine learning
- model free
- learning capabilities
- optimal policy
- worst case
- empirical data
- optimal control
- temporal difference
- temporal difference learning
- prior studies
- policy search
- genetic algorithm