Exploring Pessimism and Optimism Dynamics in Deep Reinforcement Learning.

Bahareh Tasdighi, Nicklas Werge, Yi-Shan Wu, Melih Kandemir

Published in: CoRR (2024)

Keyphrases

reinforcement learning
dynamic model
dynamical systems
function approximation
information retrieval
probabilistic model
optimal policy
partially observable
deep learning
collective behavior
learning algorithm
temporal difference learning
function approximators
reinforcement learning algorithms
action selection
nonlinear dynamics
data sets
radial basis function
Markov decision processes
sufficient conditions
supervised learning
state space
learning environment
search engine
machine learning
real world