AlberDICE: Addressing Out-Of-Distribution Joint Actions in Offline Multi-Agent RL via Alternating Stationary Distribution Correction Estimation.
Daiki E. Matsunaga, Jongmin Lee, Jaeseok Yoon, Stefanos Leonardos, Pieter Abbeel, Kee-Eung Kim
Published in: CoRR (2023)
Keyphrases
- stationary distribution
- multi agent
- initial state
- markov chain
- reinforcement learning
- product form
- multiagent reinforcement learning
- random walk
- optimal policy
- queueing networks
- queueing model
- multiple agents
- transition probabilities
- state space
- queue length
- action selection
- state dependent
- multiagent systems
- situation calculus
- service times
- action space
- multi agent systems
- single agent
- neural network
- service rates
- sufficient conditions
- reward function
- finite state
- decision problems
- probability distribution
- infinite horizon
- markov decision processes
- parameter estimation
- learning algorithm