Doubly Inhomogeneous Reinforcement Learning.

Liyuan Hu, Mengbing Li, Chengchun Shi, Zhenke Wu, Piotr Fryzlewicz

Published in: CoRR (2022)

Keyphrases

reinforcement learning
function approximation
Markov chain
reinforcement learning algorithms
temporal difference
search algorithm
model-free
Markov decision processes
machine learning
temporal difference learning
state space
transfer learning
optimal policy
dynamic programming
data sets
learning classifier systems
learning process
multi-agent
website
control problems
learning algorithm
reinforcement learning methods
continuous state
transition model
direct policy search