LAVA: Latent Action Spaces via Variational Auto-encoding for Dialogue Policy Optimization.
Nurul Lubis, Christian Geishauser, Michael Heck, Hsien-Chin Lin, Marco Moresi, Carel van Niekerk, Milica Gasic
Published in: CoRR (2020)
Keyphrases
- action space
- state space
- markov decision processes
- state and action spaces
- real valued
- reinforcement learning
- control policies
- continuous state spaces
- reinforcement learning problems
- continuous state
- stochastic processes
- optimal policy
- action selection
- markov decision process
- machine learning
- reinforcement learning algorithms
- function approximators
- state action
- markov decision problems
- skill learning
- continuous action