Reward Prediction Error as an Exploration Objective in Deep RL.
Riley Simmons-Edler, Ben Eisner, Daniel Yang, Anthony Bisulco, Eric Mitchell, H. Sebastian Seung, Daniel D. Lee
Published in: IJCAI (2020)
Keyphrases
- prediction error
- reinforcement learning
- motion compensated
- exploration strategy
- exploration exploitation
- action selection
- motion vectors
- linear prediction
- bit rate
- inter frame
- linear predictors
- three dimensional
- exploration exploitation tradeoff
- autonomous learning
- model free
- state space
- function approximation
- reversible watermarking
- bandit problems
- multi agent
- machine learning
- image sequences
- power spectral density
- high quality
- empirical risk
- multiscale
- state action
- average reward
- image data
- reward function
- motion estimation
- video coding