AlberDICE: Addressing Out-Of-Distribution Joint Actions in Offline Multi-Agent RL via Alternating Stationary Distribution Correction Estimation.
Daiki E. Matsunaga, Jongmin Lee, Jaeseok Yoon, Stefanos Leonardos, Pieter Abbeel, Kee-Eung Kim
Published in: NeurIPS (2023)
Keyphrases
- stationary distribution
- multi agent
- initial state
- markov chain
- reinforcement learning
- product form
- multiagent reinforcement learning
- random walk
- state space
- optimal policy
- queueing model
- state dependent
- transition probabilities
- queue length
- queueing networks
- multiple agents
- service times
- sufficient conditions
- action selection
- steady state
- multi agent systems
- service rates
- action space
- finite state
- markov decision processes
- single agent
- partially observable
- multiagent systems
- situation calculus
- parameter estimation
- average cost
- learning agent
- marginal distributions
- maximum likelihood
- higher order
- search algorithm