Goal Agnostic Learning and Planning without Reward Functions.

Christopher Kevin Robinson, Joshua Lancaster

Published in: Adv. Artif. Intell. Mach. Learn. (2023)

Keyphrases

agnostic learning
reward function
uniform distribution
Markov decision processes
noise tolerant
optimal policy
reinforcement learning
search algorithm
pairwise
state space
learning theory
state variables
multiple agents
membership queries