Lundberg & Lee (2017) introduced SHAP, a feature attribution method based on Shapley values. The BAIR blog explores how SHAP, data attribution, and mechanistic interpretability can jointly map interactions in LLMs. These insights help model builders debug and audit LLMs more effectively: by quantifying the influence of individual training examples, researchers can trace model biases back to their sources and improve safety.
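The SHAP library approximates Shapley values efficiently, but the underlying idea can be shown exactly on a toy model. The sketch below (the weights, inputs, and baseline are illustrative, not from the source) computes each feature's Shapley value by averaging its marginal contribution over all coalitions, with absent features set to a baseline:

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values: each feature's marginal contribution,
    averaged over all coalitions of the other features. Features
    outside the coalition are replaced by their baseline value."""
    n = len(x)
    phi = [0.0] * n
    features = list(range(n))
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            for S in combinations(others, size):
                # Shapley kernel weight for a coalition of this size
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i = [x[j] if (j in S or j == i) else baseline[j] for j in features]
                without_i = [x[j] if j in S else baseline[j] for j in features]
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi

# Toy linear model: for a linear f, the Shapley value of feature i
# reduces to w[i] * (x[i] - baseline[i]).
w = [2.0, -1.0, 0.5]
f = lambda v: sum(wi * vi for wi, vi in zip(w, v))
phi = shapley_values(f, x=[1.0, 3.0, 2.0], baseline=[0.0, 0.0, 0.0])
# phi == [2.0, -3.0, 1.0]; the values sum to f(x) - f(baseline)
```

This brute-force version is exponential in the number of features; SHAP's contribution is making the same quantity tractable for real models.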