Fuzzy Logic Guided Reward Function Variation: An Oracle for Testing Reinforcement Learning Programs.
Shiyu Zhang, Haoyang Song, Qixin Wang, Yu Pei
Published in: CoRR (2024)
Keyphrases
- reward function
- reinforcement learning
- fuzzy logic
- reinforcement learning algorithms
- markov decision processes
- state space
- policy search
- optimal policy
- partially observable
- inverse reinforcement learning
- transition model
- markov decision process
- neural network
- transition probabilities
- expert systems
- state action
- initially unknown
- model free
- hierarchical reinforcement learning
- control system
- function approximation
- oracle database
- decision making
- artificial intelligence
- multiple agents
- temporal difference
- state variables
- markov chain
- fuzzy sets
- multi agent
- multi criteria
- policy iteration
- learning agent
- markov decision problems