Direct Random Search for Fine Tuning of Deep Reinforcement Learning Policies.
Sean Gillen, Asutay Ozmen, Katie Byl
Published in: CoRR (2021)
Keyphrases
- fine tuning
- random search
- reinforcement learning
- optimal policy
- policy search
- simulated annealing
- viable alternative
- genetic algorithm
- parameter optimization
- fine tune
- search space
- markov decision process
- fitted q iteration
- markov decision processes
- reward function
- control policy
- partially observable markov decision processes
- state space
- fine tuned
- hyperparameters
- higher order
- evolutionary algorithm
- learning algorithm