Is This the Subspace You Are Looking for? An Interpretability Illusion for Subspace Activation Patching.
Aleksandar Makelov, Georg Lange, Atticus Geiger, Neel Nanda
Published in: ICLR (2024)
Keyphrases
- low dimensional
- subspace clustering
- high dimensional data
- principal component analysis
- clustering high dimensional data
- high dimensional
- dimensionality reduction
- subspace learning
- linear subspace
- prediction accuracy
- lower dimensional
- subspace methods
- neural network
- denoising
- feature space
- feature extraction
- artificial intelligence
- real world
- null space