The In-Sample Softmax for Offline Reinforcement Learning.
Chenjun Xiao, Han Wang, Yangchen Pan, Adam White, Martha White
Published in: ICLR (2023)
Keyphrases
- reinforcement learning
- temporal difference learning
- function approximation
- state space
- reinforcement learning algorithms
- real time
- data sets
- learning process
- randomly selected
- multi agent
- activation function
- robotic control
- database
- stochastic approximation
- learning agents
- data samples
- game playing
- transfer learning
- sample size
- dynamic programming
- support vector