Efficient Off-Policy Safe Reinforcement Learning Using Trust Region Conditional Value at Risk.

Dohyeong Kim, Songhwai Oh
Published in: CoRR (2023)
Keyphrases
  • trust region
  • reinforcement learning
  • state space
  • dynamic programming
  • function approximation
  • machine learning
  • genetic algorithm
  • upper bound
  • multi view
  • global optimum
  • column generation
  • log likelihood