MDP Homomorphic Networks: Group Symmetries in Reinforcement Learning.
Elise van der Pol, Daniel E. Worrall, Herke van Hoof, Frans A. Oliehoek, Max Welling
Published in: CoRR (2020)
Keyphrases
- reinforcement learning
- markov decision processes
- optimal policy
- state space
- markov decision process
- reinforcement learning algorithms
- partially observable
- real time dynamic programming
- policy iteration
- reward function
- function approximation
- social networks
- action sets
- state and action spaces
- privacy preserving
- data exchange
- action space
- model free
- supervised learning
- temporal difference
- factored markov decision processes
- neural network
- bayesian networks
- linear programming
- utility function
- computer networks
- partially observable markov decision processes
- action selection
- dynamic programming
- finite state
- multi agent
- network structure
- bayesian reinforcement learning
- decision problems