Policy Gradient Reinforcement Learning with Environmental Dynamics and Action-Values in Policies.

Seiji Ishihara, Harukazu Igarashi
Published in: KES (1) (2011)
Keyphrases
  • optimal policy
  • dynamical systems
  • dynamic model
  • fitted Q iteration
  • sufficient conditions
  • Markov decision processes
  • initial state
  • revenue management