Blending MPC & Value Function Approximation for Efficient Reinforcement Learning.

Mohak Bhardwaj, Sanjiban Choudhury, Byron Boots

Published in: ICLR (2021)

Keyphrases

reinforcement learning
state space
multi-agent
information systems
learning process
computationally efficient
cost effective
function approximation
temporal difference
neural network
closed loop
optimal control
model-free
temporal difference learning
multi-agent reinforcement learning