Using Q-learning to select the best among functionally equivalent implementations.
Meggie van den Oever, Lauren E. Grimley, Richard Michael Veras
Published in: ARRAY@PLDI (2022)
Keyphrases
- reinforcement learning
- learning algorithm
- cooperative
- state space
- function approximation
- multi agent
- efficient implementation
- optimal policy
- stochastic approximation
- information systems
- model free
- least squares
- neural network
- dynamic programming
- expert systems
- learning rate
- action selection
- reinforcement learning algorithms
- software implementation
- multi agent reinforcement learning
- td learning