Softmax Deep Double Deterministic Policy Gradients.
Ling Pan
Qingpeng Cai
Longbo Huang
Published in:
CoRR (2020)
Keyphrases
optimal policy
fluid model
fully observable
black box
infinite horizon
action selection
asymptotically optimal
multi agent
Markov decision processes
learning rate
deep learning
control policy
randomized algorithms
relaxation algorithm