Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies.
Haanvid Lee, Tri Wahyu Guntara, Jongmin Lee, Yung-Kyun Noh, Kee-Eung Kim
Published in: CoRR (2024)
Keyphrases
- metric learning
- policy evaluation
- kernel matrix
- optimal policy
- reinforcement learning
- least squares
- policy iteration
- partially observable Markov decision processes
- temporal difference
- feature space
- model-free
- Markov decision processes
- Monte Carlo
- distance metric
- learning tasks
- function approximation
- Markov decision problems
- Markov decision process
- semi-supervised
- variance reduction
- kernel methods
- distance function
- pairwise
- state space
- multi-task
- semi-parametric
- kernel function
- sample size
- dimensionality reduction
- semi-supervised learning
- reinforcement learning algorithms
- supervised learning
- infinite horizon
- finite state
- dynamic programming
- machine learning
- learning problems
- support vector
- feature vectors
- input space
- principal component analysis
- distance measure
- learning algorithm