Mechanistic Interpretability for AI Safety - A Review.

Leonard Bereska, Efstratios Gavves

Published in: CoRR (2024)

Keyphrases

artificial intelligence
machine learning
knowledge representation
case-based reasoning
prediction accuracy
ai community
expert systems
ai systems
ai technologies
databases
artificial intelligence in education
intelligent behavior
knowledge representation and reasoning
literature review
rule base
web intelligence
general intelligence
intelligent systems
multi-agent
artificial intelligent
road safety
data sets