Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding.
Alizée Pace, Hugo Yèche, Bernhard Schölkopf, Gunnar Rätsch, Guy Tennenholtz
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- function approximation
- temporal difference
- state space
- real time
- reinforcement learning algorithms
- markov decision processes
- control problems
- data mining
- machine learning
- dynamic programming
- data sets
- model free
- optimal policy
- databases
- support vector
- objective function
- website
- artificial intelligence
- learning algorithm
- genetic algorithm
- partially observable
- robot control
- multi agent reinforcement learning
- robotic control