Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning.

Subhojyoti Mukherjee, Josiah P. Hanna, Qiaomin Xie, Robert D. Nowak
Published in: CoRR (2024)