Stabilizing Off-Policy Deep Reinforcement Learning from Pixels.

Edoardo Cetin, Philip J. Ball, Steve Roberts, Oya Çeliktutan

Published in: CoRR (2022)

Keyphrases

reinforcement learning
function approximation
input image
Markov decision processes
image pixels
optimal control
multi-agent
action selection
nonlinear systems
temporal difference
reinforcement learning algorithms
pixel values
intensity values
neighboring pixels
temporal difference learning
reinforcement learning methods
neural network
optimal policy
dynamic programming
data sets
model-free
pixel-wise
learning capabilities
homogeneous regions
pixel classification
genetic algorithm