Logarithmic Smoothing for Pessimistic Off-Policy Evaluation, Selection and Learning.

Otmane Sakhi, Imad Aouali, Pierre Alquier, Nicolas Chopin
Published in: CoRR (2024)
Keyphrases
  • learning algorithm
  • reinforcement learning
  • active learning
  • supervised learning