Cherry-Picking with Reinforcement Learning.
Yunchu Zhang, Liyiming Ke, Abhay Deshpande, Abhishek Gupta, Siddhartha S. Srinivasa
Published in: Robotics: Science and Systems (2023)
Keyphrases
- reinforcement learning
- function approximation
- state space
- optimal policy
- learning algorithm
- Markov decision processes
- temporal difference
- data sets
- reinforcement learning algorithms
- model-free
- optimal control
- robotic control
- evolutionary learning
- autonomous learning
- temporal difference learning
- action selection
- database
- learning process
- partially observable
- action space
- artificial neural networks
- multi-agent
- machine learning
- multi-agent reinforcement learning
- relational reinforcement learning
- neural network
- partially observable domains