B-Pref: Benchmarking Preference-Based Reinforcement Learning.

Kimin Lee, Laura M. Smith, Anca D. Dragan, Pieter Abbeel

Published in: NeurIPS Datasets and Benchmarks (2021)

Keyphrases

reinforcement learning
function approximation
temporal difference
state space
reinforcement learning algorithms
Markov decision processes
multi-agent reinforcement learning
model-free
optimal policy
learning process
multi-agent
sufficient conditions
dynamic programming
information systems
learning problems
database systems
function approximators
learning agents
reinforcement learning methods
machine learning