Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage.
Jose H. Blanchet, Miao Lu, Tong Zhang, Han Zhong
Published in: NeurIPS (2023)
Keyphrases
- computationally efficient
- learning algorithm
- single pass
- parameter tuning
- detection algorithm
- reinforcement learning
- np hard
- computational complexity
- object tracking algorithm
- highly efficient
- dynamic programming
- experimental evaluation
- worst case
- algorithm is computationally efficient
- memory efficient
- optimization algorithm
- segmentation algorithm
- high accuracy
- preprocessing
- cost function
- convergence rate
- linear programming
- high efficiency
- computationally demanding
- numerically stable
- particle swarm optimization
- matching algorithm
- image matching
- recognition algorithm
- optimal solution
- objective function
- simulated annealing
- probabilistic model