Keyphrases
- reinforcement learning
- optimal policy
- state and action spaces
- policy search
- function approximation
- Markov decision processes
- multi-agent
- small number
- machine learning
- metadata
- actor-critic
- state space
- action space
- Markov decision process
- neural network
- policy gradient
- multimedia
- model-free
- visual cortex
- policy iteration
- partially observable
- partially observable domains
- partially observable environments
- state-dependent
- partially observable Markov decision processes
- optimal control
- multimedia content
- web content
- learning problems
- website
- learning process
- web documents