Mechanistic Interpretability for AI Safety - A Review.
Leonard Bereska, Efstratios Gavves
Published in: CoRR (2024)
Keyphrases
- artificial intelligence
- machine learning
- knowledge representation
- case based reasoning
- prediction accuracy
- ai community
- expert systems
- ai systems
- ai technologies
- databases
- artificial intelligence in education
- intelligent behavior
- knowledge representation and reasoning
- literature review
- rule base
- web intelligence
- general intelligence
- intelligent systems
- multi agent
- artificial intelligent
- road safety
- data sets