Cherry-Picking with Reinforcement Learning.
Yunchu Zhang, Liyiming Ke, Abhay Deshpande, Abhishek Gupta, Siddhartha S. Srinivasa
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- function approximation
- learning algorithm
- model-free
- reinforcement learning algorithms
- state space
- optimal policy
- multi-agent reinforcement learning
- optimal control
- learning agents
- learning process
- temporal difference
- control problems
- action selection
- policy search
- temporal difference learning
- transition model
- multi-agent
- real world
- databases
- learning classifier systems
- learning problems
- least squares
- reward function
- dynamic programming
- control system
- computer vision
- reinforcement learning methods
- stochastic approximation
- autonomous learning
- data sets