E-HBA: Using Action Policies for Expert Advice and Agent Typification.

Stefano V. Albrecht, Jacob W. Crandall, Subramanian Ramamoorthy

Published in: CoRR (2019)

Keyphrases

expert advice
action selection
expected reward
discounted reward
multi agent systems
practical reasoning
multi agent
multiagent systems
internal state
Markov decision processes
reward function
optimal policy
Markov decision process
loss bounds
multiple agents
plan execution
state action
regret bounds
least squares