Provably Efficient Neural Offline Reinforcement Learning via Perturbed Rewards.

Thanh Nguyen-Tang, Raman Arora

Published in: CoRR (2023)

Keyphrases

reinforcement learning
state space
temporal difference
Markov decision processes
neural network
learning algorithm
dynamic programming
model free
real time
data sets
database
function approximation
fitted Q iteration
state and action spaces
computationally expensive
cost effective
optimal policy
computationally efficient
supervised learning