Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding

Alizée Pace, Hugo Yèche, Bernhard Schölkopf, Gunnar Rätsch, Guy Tennenholtz

Published in: ICLR (2024)

Keyphrases

reinforcement learning
function approximation
temporal difference
reinforcement learning algorithms
markov decision processes
direct policy search
artificial intelligence
learning algorithm
robotic control
state space
real time
machine learning
case study
reinforcement learning methods
neural network
action selection
model free
information retrieval
multi agent
learning problems
partially observable
control problems
database