Multi-agent Off-policy Actor-Critic Reinforcement Learning for Partially Observable Environments.
Ainur Zhaikhan, Ali H. Sayed
Published in: CoRR (2024)
Keyphrases
- partially observable environments
- actor critic
- reinforcement learning
- reinforcement learning algorithms
- multi agent
- partially observable
- temporal difference
- partially observable markov decision processes
- policy gradient
- model free
- function approximation
- state space
- markov decision processes
- learning algorithm
- single agent
- stochastic games
- inverse reinforcement learning
- cooperative
- multi agent systems
- multiagent systems
- rl algorithms
- policy iteration
- optimal control
- transfer learning
- approximate dynamic programming
- average reward
- learning problems
- supervised learning
- reinforcement learning methods
- machine learning
- state action
- function approximators
- optimal policy
- reward function
- monte carlo
- action selection
- multiple agents
- autonomous agents