Temporal Consistency-Based Loss Function for Both Deep Q-Networks and Deep Deterministic Policy Gradients for Continuous Actions

Published in: Symmetry (2021)

Keyphrases

loss function
temporal consistency
pairwise
learning to rank
support vector
reproducing kernel Hilbert space
empirical risk
boosting framework
machine learning
optimal solution
least squares
action space
risk minimization
convex loss functions