A surrogate model-based genetic algorithm for the optimal policy in cart-pole balancing environments.
Seung-Soo Shin, Yong-Hyuk Kim
Published in: GECCO Companion (2022)
Keyphrases
- optimal policy
- genetic algorithm
- average reward reinforcement learning
- Markov decision processes
- decision problems
- finite horizon
- state space
- reinforcement learning
- multistage
- finite state
- infinite horizon
- dynamic programming
- long run
- Markov decision process
- Bayesian reinforcement learning
- state dependent
- sufficient conditions
- average reward
- control policies
- model free
- serial inventory systems
- policy iteration
- lost sales
- search algorithm
- average cost
- Markov decision problems
- partially observable
- control strategies
- stochastic inventory control
- develop a mathematical model