AI: AX - introspection
mcgrof
8 episodes
3 days ago
The art of looking into a model and understanding what is going on through introspection is referred to as AX.
Technology
Multi-Layer Sparse Autoencoders for Transformer Interpretation
14 minutes 1 second
3 months ago

This paper introduces the Multi-Layer Sparse Autoencoder (MLSAE), an approach for interpreting the internal representations of transformer language models. Unlike standard Sparse Autoencoders (SAEs), which are trained on activations from a single layer, an MLSAE is trained on the residual stream activations from all layers of a transformer, which makes it possible to study how information flows across layers. The authors find that an individual "latent" (a feature learned by the SAE) tends to be active at a single layer for a given input, but is active at multiple layers when aggregated over many inputs, and that this multi-layer activity increases in larger models. They also explore the effect of "tuned-lens" transformations on latent activations. Together, these results provide a new method for understanding how representations evolve within transformers.
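
To make the "one autoencoder for all layers" idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the weights are random rather than trained, the residual stream is synthetic Gaussian noise, the top-k activation is an assumption made for illustration, and every name (`pool_layers`, `topk_encode`, `layer_histogram`, and so on) is hypothetical. It only shows the data layout, where vectors from every layer are pooled into a single dataset for one shared encoder, and how per-layer latent activity can be tallied over many inputs.

```python
import random
from collections import defaultdict

def pool_layers(resid):
    """Flatten resid[layer][token] into (layer, token, vector) triples.

    A single-layer SAE would see only resid[l] for one fixed l; the
    multi-layer setup feeds vectors from every layer to the same SAE.
    """
    return [(l, t, vec) for l, layer in enumerate(resid)
            for t, vec in enumerate(layer)]

def topk_encode(x, W_enc, k):
    """ReLU pre-activations, keeping only the k largest positive ones.

    Returns a sparse dict {latent_index: activation}. Top-k is an
    assumption of this sketch, not something stated in the episode notes.
    """
    pre = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W_enc]
    keep = sorted(range(len(pre)), key=lambda i: pre[i], reverse=True)[:k]
    return {i: pre[i] for i in keep if pre[i] > 0.0}

def decode(latents, W_dec, d_model):
    """Reconstruct a d_model vector as a weighted sum of decoder rows."""
    out = [0.0] * d_model
    for i, a in latents.items():
        for j in range(d_model):
            out[j] += a * W_dec[i][j]
    return out

def layer_histogram(pooled, W_enc, k):
    """Count, for each latent, how often it is active at each layer."""
    hist = defaultdict(lambda: defaultdict(int))
    for layer, _token, vec in pooled:
        for i in topk_encode(vec, W_enc, k):
            hist[i][layer] += 1
    return hist

# Synthetic stand-ins (all sizes are arbitrary, chosen for illustration).
rng = random.Random(0)
n_layers, n_tokens, d_model, n_latents, k = 4, 16, 8, 32, 3
resid = [[[rng.gauss(0, 1) for _ in range(d_model)]
          for _ in range(n_tokens)] for _ in range(n_layers)]
W_enc = [[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(n_latents)]
W_dec = [[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(n_latents)]

pooled = pool_layers(resid)                 # one dataset spanning all layers
hist = layer_histogram(pooled, W_enc, k)    # latent -> layer -> count
multi_layer = [i for i, counts in hist.items() if len(counts) > 1]
print(f"{len(multi_layer)} of {len(hist)} active latents fire at more than one layer")
```

With random weights and random inputs the printed count says nothing about a real model. In an actual experiment, `resid` would hold residual stream activations captured from a transformer and the weights would be trained to minimize reconstruction error; the same histogram would then distinguish latents that fire at one layer per input from those that fire at several layers once many inputs are aggregated.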
