Distributed consensus-based multi-agent temporal-difference learning.

Milos S. Stankovic, Marko Beko, Srdjan S. Stankovic

Published in: Autom. (2023)

Keyphrases

multi agent
temporal difference learning
reinforcement learning
fixed point
function approximation
approximate value iteration
game playing
evaluation function
temporal difference
reinforcement learning algorithms
multi agent systems
Markov decision process
model selection
Monte Carlo
graphical models
state space
multiple agents
pairwise