Provable General Function Class Representation Learning in Multitask Bandits and MDP.
Rui Lu, Andrew Zhao, Simon S. Du, Gao Huang
Published in: NeurIPS (2022)
Keyphrases
- multi task
- learning tasks
- reinforcement learning
- multi task learning
- learning process
- learning algorithm
- multitask learning
- multi armed bandits
- multiple tasks
- markov decision processes
- online learning
- supervised learning
- optimal policy
- unsupervised learning
- learning problems
- latent variables
- k nearest neighbor
- state space
- prior knowledge