Advantage Actor-Critic with Reasoner: Explaining the Agent's Behavior from an Exploratory Perspective.

Muzhe Guo, Feixu Yu, Tian Lan, Fang Jin

Published in: CoRR (2023)

Keyphrases

actor-critic
multi-agent
multi-agent systems
decision making
reinforcement learning
gradient method
multiple agents
policy gradient
approximate dynamic programming
function approximation
neural network
state space