A State Augmentation based approach to Reinforcement Learning from Human Preferences.

Mudit Verma, Subbarao Kambhampati

Published in: CoRR (2023)

Keyphrases

reinforcement learning
state space
machine learning
decision making
human interaction
perceptual aliasing
neural network
artificial intelligence
Markov decision processes
human behavior
temporal difference
human operators
action space
state abstraction
transition model