Distributional Reinforcement Learning for Multi-Dimensional Reward Functions.
Pushi Zhang, Xiaoyu Chen, Li Zhao, Wei Xiong, Tao Qin, Tie-Yan Liu
Published in: NeurIPS (2021)
Keyphrases
- reward function
- multi dimensional
- reinforcement learning
- reinforcement learning algorithms
- markov decision processes
- state space
- policy search
- optimal policy
- partially observable
- inverse reinforcement learning
- markov decision process
- multiple agents
- function approximation
- transition model
- learning agent
- state variables
- temporal difference
- transition probabilities
- high dimensional
- multi agent
- model free
- simple examples
- average reward
- dynamic programming
- function approximators
- learning agents
- control policies
- markov decision problems
- graphical models
- state action
- continuous state
- maximum likelihood
- particle filter
- generative model