A Cramér Distance perspective on Quantile Regression based Distributional Reinforcement Learning.
Alix Lheritier, Nicolas Bondoux
Published in: AISTATS (2022)
Keyphrases
- reinforcement learning
- viewpoint
- learning algorithm
- distance function
- optimal policy
- machine learning
- reinforcement learning algorithms
- function approximation
- Markov decision processes
- distance metric
- Euclidean distance
- distance measure
- neural network
- co-occurrence
- supervised learning
- state space
- dynamic programming
- model-free
- action selection
- temporal difference
- action space
- function approximators
- robotic control
- Rao bound
- regularized kernel