Chaining Value Functions for Off-Policy Learning.

Simon Schmitt, John Shawe-Taylor, Hado van Hasselt

Published in: AAAI (2022)

Keyphrases

learning process
learning algorithm
learning problems
learning systems
online learning
supervised learning
concept learning
prior knowledge
reinforcement learning
computer vision
user interface
multi-agent systems
learning environment
multi-class
neural network
collaborative learning
empirical studies
website
background knowledge
artificial intelligence
learning community
learning scheme
positive examples
computer programming