Relaxation-Free Deep Hashing via Policy Gradient.
Xin Yuan, Liangliang Ren, Jiwen Lu, Jie Zhou. Published in: ECCV (4) (2018)
Keyphrases
- policy gradient
- actor critic
- function approximation
- parametric optimization
- reinforcement learning
- model free reinforcement learning
- reinforcement learning algorithms
- optimal control
- gradient method
- average reward
- variance reduction
- neural network
- learning algorithm
- single agent
- partially observable Markov decision processes
- approximation methods