Rethinking Expected Cumulative Reward Formalism of Reinforcement Learning: A Micro-Objective Perspective.
Changjian Li, Krzysztof Czarnecki
Published in: CoRR (2019)
Keyphrases
- reinforcement learning
- total reward
- function approximation
- reinforcement learning algorithms
- state space
- eligibility traces
- Markov decision processes
- partially observable environments
- temporal difference
- average reward
- viewpoint
- knowledge representation
- optimal policy
- model free
- formal model
- reward function
- action selection
- learning algorithm
- machine learning
- neural network
- supervised learning
- multi-agent systems
- function approximators
- learning environment
- reinforcement learning methods
- policy evaluation
- multi-agent