Online Caching Policy with User Preferences and Time-Dependent Requests: A Reinforcement Learning Approach.
Mohammad Hatami, Markus Leinonen, Marian Codreanu
Published in: ACSSC (2019)
Keyphrases
- user preferences
- reinforcement learning
- optimal policy
- policy search
- recommender systems
- collaborative filtering
- user behavior
- user profiles
- recommendation systems
- Markov decision process
- action selection
- user feedback
- cache management
- function approximation
- control policy
- replacement policy
- function approximators
- user specific
- user behaviour
- prefetching
- preference model
- reward function
- Markov decision processes
- state space
- preference models
- temporal difference
- qualitative preferences
- exploration-exploitation tradeoff
- action space
- partially observable
- information retrieval
- dynamic programming
- reinforcement learning algorithms
- decision making