Oracle-Efficient Pessimism: Offline Policy Optimization In Contextual Bandits.
Lequn Wang, Akshay Krishnamurthy, Alex Slivkins
Published in: AISTATS (2024)
Keyphrases
- optimization methods
- global optimization
- optimization problems
- real time
- cost effective
- optimization algorithm
- contextual information
- computationally efficient
- evolutionary algorithm
- machine learning
- database
- multi objective
- lightweight
- data structure
- reinforcement learning
- decision making
- neural network
- optimization process
- infinite horizon