
This 2024 paper introduces AttnLRP, a novel method for explaining the internal reasoning of transformer models, including Large Language Models (LLMs) and Vision Transformers (ViTs). It extends Layer-wise Relevance Propagation (LRP) with new propagation rules for non-linear operations inside attention layers, such as softmax and matrix multiplication, improving both faithfulness and computational efficiency over existing methods. The paper highlights AttnLRP's ability to attribute relevance to latent representations, enabling the identification and manipulation of "knowledge neurons" within these complex models. Experimental results show AttnLRP outperforming prior attribution methods across a range of benchmarks and model architectures.
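To make the underlying mechanism concrete, below is a minimal sketch of the standard LRP epsilon-rule for a single linear layer, the kind of layer-wise relevance redistribution that AttnLRP builds on. This is not the paper's AttnLRP rules for softmax or matrix multiplication (those are derived in the paper itself); the function name `lrp_epsilon` and the parameter `eps` are illustrative choices, not from the source.

```python
import numpy as np

def lrp_epsilon(W, b, x, R_out, eps=1e-6):
    """Epsilon-rule LRP for a linear layer y = W @ x + b.

    Redistributes the relevance R_out assigned to the outputs back
    onto the inputs x, in proportion to each input's contribution
    z_ij = W_ij * x_j to each output. A sketch only; AttnLRP adds
    further rules for softmax and bilinear matmul (see the paper).
    """
    z = W * x[None, :]                      # contributions z_ij, shape (out, in)
    denom = z.sum(axis=1) + b               # pre-activations y_i, shape (out,)
    denom = denom + eps * np.sign(denom)    # epsilon stabilizer avoids division by ~0
    s = R_out / denom                       # relevance per unit of pre-activation
    return (W * s[:, None]).sum(axis=0) * x  # R_j = x_j * sum_i W_ij * s_i

# Tiny usage example: relevance is (approximately) conserved layer to layer.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
b = np.zeros(3)
x = rng.normal(size=4)
R_out = np.abs(W @ x)                       # hypothetical relevance at the output
R_in = lrp_epsilon(W, b, x, R_out)
print(R_out.sum(), R_in.sum())              # sums match up to the eps stabilizer
```

The conservation property shown in the final print (input relevances summing to the output relevance) is the invariant that LRP-style methods maintain layer by layer; AttnLRP's contribution is extending such rules to the non-linear attention operations where the standard epsilon-rule does not directly apply.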
Source: https://arxiv.org/pdf/2402.05602