Novelty search for deep reinforcement learning policy network weights by action sequence edit metric distance.
Ethan C. Jackson, Mark Daley
Published in: GECCO (Companion) (2019)
Keyphrases
- reinforcement learning
- action selection
- optimal policy
- action space
- search algorithm
- distance metric
- distance function
- distance measure
- peer to peer
- function approximation
- markov decision process
- agent learns
- state action
- dynamic programming
- search space
- policy search
- euclidean distance
- hidden state
- transition model
- partially observable domains
- state space
- optimal control
- evaluation function
- network structure
- multi agent
- search result diversification
- neural network
- triangular inequality
- sensory inputs
- action sequences
- function approximators
- nearest neighbor
- machine learning