The In-Sample Softmax for Offline Reinforcement Learning.

Chenjun Xiao, Han Wang, Yangchen Pan, Adam White, Martha White

Published in: ICLR (2023)

Keyphrases

reinforcement learning
temporal difference learning
function approximation
state space
reinforcement learning algorithms
real time
data sets
learning process
randomly selected
multi agent
activation function
robotic control
database
stochastic approximation
learning agents
data samples
game playing
transfer learning
sample size
dynamic programming
support vector