Reward Reports for Reinforcement Learning.
Thomas Krendl Gilbert, Sarah Dean, Nathan Lambert, Tom Zick, Aaron J. Snoswell
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- function approximation
- state space
- learning algorithm
- reward function
- model free
- reinforcement learning algorithms
- supervised learning
- partially observable environments
- temporal difference
- markov decision processes
- multi agent
- eligibility traces
- policy gradient
- reinforcement learning methods
- partially observable
- optimal policy
- machine learning
- action selection
- policy iteration
- learning agent
- dynamic programming
- decision making
- markov decision problems
- optimal control
- policy search
- learning problems