Cherry-Picking with Reinforcement Learning.

Yunchu Zhang, Liyiming Ke, Abhay Deshpande, Abhishek Gupta, Siddhartha S. Srinivasa

Published in: Robotics: Science and Systems (2023)

Keyphrases

reinforcement learning
function approximation
state space
optimal policy
learning algorithm
Markov decision processes
temporal difference
data sets
reinforcement learning algorithms
model-free
optimal control
robotic control
evolutionary learning
autonomous learning
temporal difference learning
action selection
database
learning process
partially observable
action space
artificial neural networks
multi-agent
machine learning
multi-agent reinforcement learning
relational reinforcement learning
neural network
partially observable domains