Open the Black Box: Step-based Policy Updates for Temporally-Correlated Episodic Reinforcement Learning.
Ge Li, Hongyi Zhou, Dominik Roth, Serge Thilges, Fabian Otto, Rudolf Lioutikov, Gerhard Neumann
Published in: ICLR (2024)
Keyphrases
- black box
- reinforcement learning
- optimal policy
- black boxes
- policy search
- action selection
- test cases
- hybrid systems
- state space
- white box
- markov decision process
- markov decision processes
- control policy
- integration testing
- reward function
- state and action spaces
- action space
- partially observable environments
- policy iteration
- white box testing
- function approximation
- software engineering
- dynamic programming
- temporal information
- neural network
- partially observable
- reinforcement learning algorithms
- state transition
- average reward
- policy evaluation
- artificial intelligence
- partially observable markov decision processes
- inverse reinforcement learning
- actor critic
- policy gradient
- rl algorithms
- learning algorithm