Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping.

Dongruo Zhou, Jiafan He, Quanquan Gu

Published in: ICML (2021)

Keyphrases

reinforcement learning
Markov decision processes
feature mapping
optimal policy
Markov decision process
dynamic programming
state space
reinforcement learning algorithms
state and action spaces
average reward
action space
finite horizon
policy iteration
average cost
temporal difference
manifold learning
reward function
data analysis
infinite horizon
data sets
high dimensional data
nearest neighbor
pairwise
learning algorithm
machine learning