Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage.
Jose Blanchet, Miao Lu, Tong Zhang, Han Zhong
Published in: CoRR (2023)
Keyphrases
- computationally efficient
- reinforcement learning
- parameter tuning
- single pass
- learning algorithm
- highly efficient
- worst case
- computational complexity
- search space
- preprocessing
- experimental evaluation
- algorithm is computationally efficient
- high efficiency
- NP-hard
- computational cost
- real time
- segmentation algorithm
- data structure
- cost function
- optimal solution
- path planning
- k-means
- object tracking algorithm
- computationally demanding
- pruning strategy
- similarity measure
- model free
- matching algorithm
- optimization algorithm
- simulated annealing
- state space
- probabilistic model