Anthropic's CEO says the quiet part out loud: we don't fully understand how AI works. They're building tools to decode it, like an MRI for AI, aiming for safety before it gets too powerful.
#AI #Interpretability #Anthropic
Dario Amodei — The Urgency of Interpretability https://www.darioamodei.com/post/the-urgency-of-interpretability #AI #Anthropic #interpretability
New demo! Explore CLIP’s hidden concepts with SemanticLens.
Built on 16 sparse autoencoders (SAEs) from ViT Prisma (check out https://github.com/soniajoseph/ViT-Prisma)
Try it: https://semanticlens.hhi-research-insights.eu
Paper: https://arxiv.org/pdf/2501.05398
#AI #interpretability vs #explainability
"The explanations themselves can be difficult to convey to nonexperts, such as end users and line-of-business teams" https://www.techtarget.com/searchenterpriseai/feature/Interpretability-vs-explainability-in-AI-and-machine-learning
"Feature importance helps in understanding which features contribute most to the prediction"
A few lines with #sklearn: https://mljourney.com/sklearn-linear-regression-feature-importance/
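For the gist, a minimal sketch of coefficient-based feature importance for a linear regression (the diabetes dataset and ranking by absolute coefficient are my illustration, not necessarily what the linked post uses):

```python
# Minimal sketch: feature importance from a linear model's coefficients.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True, as_frame=True)

# Standardize features so coefficient magnitudes are comparable.
X_scaled = StandardScaler().fit_transform(X)

model = LinearRegression().fit(X_scaled, y)

# Rank features by absolute coefficient size.
importance = sorted(zip(X.columns, model.coef_),
                    key=lambda t: abs(t[1]), reverse=True)
for name, coef in importance:
    print(f"{name}: {coef:.2f}")
```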
"The following sections discuss several state-of-the-art interpretable and explainable #ML methods. The selection of works does not comprise an exhaustive survey of the literature. Instead, it is meant to illustrate the commonest properties and inductive biases behind interpretable models and [black-box] explanation methods using concrete instances."
https://wires.onlinelibrary.wiley.com/doi/full/10.1002/widm.1493#widm1493-sec-0010-title
Model "#interpretability and [black-box] #explainability, although not necessary in many straightforward applications, become instrumental when the problem definition is incomplete and in the presence of additional desiderata, such as trust, causality, or fairness."
https://wires.onlinelibrary.wiley.com/doi/full/10.1002/widm.1493
Heard I should write an #introduction? Ok.
Hi! I'm Jenn. I do research on #responsibleAI at Microsoft Research NYC. I'm in the FATE group and co-chair Microsoft's Aether working group on transparency.
My research background is in #machinelearning theory and algorithmic econ, but since my mid-career crisis I've focused on human-centered approaches to #transparency, #interpretability, & #fairness of AI.
I'm into #AI that augments, rather than replaces, human abilities.