SPQR: Controlling Q-ensemble Independence with Spiked Random Model for Reinforcement Learning.
Dohyeok Lee, Seungyub Han, Taehyun Cho, Jungwoo Lee
Published in: NeurIPS (2023)
Keyphrases
- reinforcement learning
- computational model
- high level
- probabilistic model
- neural network
- probability distribution
- formal model
- statistical model
- mathematical model
- theoretical analysis
- management system
- training data
- Markov random field
- model selection
- prior knowledge
- multi agent
- knowledge base
- Markov decision processes
- machine learning