Extreme Q-Learning: MaxEnt RL without Entropy.
Divyansh Garg, Joey Hejna, Matthieu Geist, Stefano Ermon
Published in: ICLR (2023)
Keyphrases
- reinforcement learning
- function approximation
- reinforcement learning algorithms
- model free
- maximum entropy
- state space
- optimal policy
- action selection
- multi agent
- rl algorithms
- learning algorithm
- markov decision processes
- mutual information
- temporal difference learning
- reinforcement learning methods
- multi agent reinforcement learning
- information theoretic
- temporal difference methods
- information theory
- temporal difference
- optimal control
- continuous state spaces
- control problems
- reinforcement learning problems
- td learning
- learning agent
- policy iteration
- markov decision process
- actor critic
- decision problems
- policy gradient
- cooperative
- continuous state and action spaces
- state action
- average reward
- function approximators
- action space
- information entropy
- learning capabilities
- learning rate
- transfer learning
- dynamic programming
- search space