Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL.
Taku Yamagata, Ahmed Khalil, Raúl Santos-Rodríguez
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- dynamic programming
- optimal policy
- state space
- function approximation
- Markov decision processes
- decision problems
- reinforcement learning algorithms
- multi agent
- sequence alignment
- model free
- learning algorithm
- fuzzy logic
- optimal control
- action selection
- Markov decision problems
- linear programming
- cooperative
- decision making
- decision processes
- fault diagnosis
- multi agent reinforcement learning
- rl algorithms
- temporal difference
- real time
- policy iteration
- temporal difference methods
- temporal difference learning
- single agent
- infinite horizon
- influence diagrams
- multi agent systems
- decision rules