Time-Efficient Reinforcement Learning with Stochastic Stateful Policies.
Firas Al-Hafez, Guoping Zhao, Jan Peters, Davide Tateo
Published in: ICLR (2024)
Keyphrases
- reinforcement learning
- optimal policy
- direct policy search
- policy search
- control policies
- function approximation
- markov decision processes
- monte carlo
- computationally expensive
- learning algorithm
- markov decision process
- learning automata
- computationally efficient
- model free
- stochastic processes
- stochastic approximation
- hidden markov models
- approximate dynamic programming
- multi agent
- bayesian networks