Improving Generalization of Reinforcement Learning Using a Bilinear Policy Network.
Fen Fang, Wenyu Liang, Yan Wu, Qianli Xu, Joo-Hwee Lim
Published in: ICIP (2022)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- Markov decision process
- peer to peer
- network structure
- network model
- function approximators
- complex networks
- computer networks
- function approximation
- network architecture
- Bayesian networks
- machine learning
- infinite horizon
- model free
- temporal difference
- reward function
- reinforcement learning algorithms
- multi agent
- approximate dynamic programming
- continuous state spaces
- learning algorithm