Using Q-learning to select the best among functionally equivalent implementations.

Meggie van den Oever, Lauren E. Grimley, Richard Michael Veras

Published in: ARRAY@PLDI (2022)

Keyphrases

reinforcement learning
learning algorithm
cooperative
state space
function approximation
multi-agent
efficient implementation
optimal policy
stochastic approximation
information systems
model-free
least squares
neural network
dynamic programming
expert systems
learning rate
action selection
reinforcement learning algorithms
software implementation
multi-agent reinforcement learning
TD learning