On the Role of Discount Factor in Offline Reinforcement Learning.
Hao Hu, Yiqin Yang, Qianchuan Zhao, Chongjie Zhang
Published in: ICML (2022)
Keyphrases
- reinforcement learning
- markov decision processes
- discount factor
- optimal policy
- markov decision problems
- learning algorithm
- partially observable
- function approximation
- machine learning
- model free
- optimal solution
- average reward
- multi agent
- markov chain
- dynamic programming
- optimal control
- temporal difference
- reinforcement learning algorithms