Leveraging Reward Consistency for Interpretable Feature Discovery in Reinforcement Learning.
Qisen Yang, Huanqian Wang, Mukun Tong, Wenjie Shi, Gao Huang, Shiji Song
Published in: CoRR (2023)
Keyphrases
- reinforcement learning
- function approximation
- state space
- model-free
- reinforcement learning algorithms
- image features
- optimal policy
- eligibility traces
- temporal difference
- Markov decision processes
- feature vectors
- learning problems
- reward shaping
- reward function
- data mining
- transfer learning
- learning algorithm
- optimal control
- classification rules
- knowledge discovery
- learning process
- learning capabilities
- learning agent
- policy search
- multi-agent
- total reward