Reinforcement Learning with Non-Exponential Discounting.
Matthias Schultheis, Constantin A. Rothkopf, Heinz Koeppl
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- state space
- function approximation
- Markov decision processes
- reinforcement learning algorithms
- learning algorithm
- temporal difference
- machine learning
- robotic control
- multi-agent
- artificial intelligence
- policy search
- stochastic approximation
- case study
- model free
- action selection
- temporal difference learning
- data sets
- direct policy search
- learning agents
- policy iteration
- evaluation function
- transfer learning
- optimal policy
- dynamic environments