Switching the Loss Reduces the Cost in Batch Reinforcement Learning.
Alex Ayoub
Kaiwen Wang
Vincent Liu
Samuel Robertson
James McInerney
Dawen Liang
Nathan Kallus
Csaba Szepesvári
Published in:
CoRR (2024)
Keyphrases
reinforcement learning
function approximation
machine learning
high cost
total cost
Markov decision processes
optimal policy
neural network
state space
multi agent
learning algorithm
genetic algorithm
mobile robot
temporal difference
multi agent reinforcement learning
transition model
robotic control