Distributional Reinforcement Learning for Multi-Dimensional Reward Functions.
Pushi Zhang, Xiaoyu Chen, Li Zhao, Wei Xiong, Tao Qin, Tie-Yan Liu
Published in: CoRR (2021)
Keyphrases
- reward function
- multi-dimensional
- reinforcement learning
- reinforcement learning algorithms
- Markov decision processes
- optimal policy
- policy search
- state space
- partially observable
- Markov decision process
- inverse reinforcement learning
- dynamic programming
- transition model
- multiple agents
- function approximation
- transition probabilities
- simple examples
- model free
- multi agent
- initially unknown
- state action
- control policies
- machine learning
- learning agent
- generative model
- temporal difference
- state variables
- high dimensional
- learning algorithm
- average reward
- Markov decision problems
- continuous state
- infinite horizon