Jointly Learning Environments and Control Policies with Projected Stochastic Gradient Ascent.
Adrien Bolland, Ioannis Boukas, Mathias Berger, Damien Ernst
Published in: J. Artif. Intell. Res. (2022)
Keyphrases
- control policies
- gradient ascent
- reinforcement learning
- optimal policy
- control system
- control policy
- control strategies
- cross entropy
- expectation maximization
- action space
- motion control
- exponential family
- finite horizon
- average reward
- neural network
- sufficient conditions
- state space
- policy gradient
- maximum likelihood
- genetic algorithm