Get the Check
Anika, Maya, Priya
51 episodes
2 days ago
Tune in on Wednesday at 6 AM ET to hear the latest tech news and listen to guests from emerging tech companies.
Technology
All content for Get the Check is the property of Anika, Maya, Priya and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Inside the Viral Subliminal Learning AI Paper with author Minh Le
Get the Check
36 minutes 54 seconds
2 months ago

This week the pod sat down with Minh Le, one of the researchers behind the viral AI safety research study “Subliminal Learning: Language models transmit behavioral traits via hidden signals in data.”

The paper showed that when a teacher model with a love for owls teaches a student model a series of random numbers, the student also inherits a love for owls, as long as the two share the same base model. This means models can inherit misaligned traits from other models even when those traits are not observable in the training data.

The hosts dive deep into the paper’s methodology and ask about Minh’s strategy for filtering out numbers that might carry unintended associations, like 666 or 911, which are linked to evil or danger. Fun fact: the original plan was to use a love for eagles, but the team switched to owls because owls had fewer associations that could create noise. They also go over theories about why the teacher’s behavior is transmitted even though the data transferred is random and filtered. Spoiler: it probably wasn’t a secret code in the numbers, but rather the data distribution triggering emergent behaviors in the student model, like a love for owls.
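
For readers curious what the number filtering discussed above might look like in practice, here is a minimal Python sketch. It is an illustration only, not the authors' code: the blocklist contains just the two examples mentioned in the episode (666 and 911), and the names `BLOCKED_NUMBERS`, `is_clean`, and `filter_samples` are hypothetical. It assumes each training sample is a string of numbers separated by commas or spaces.

```python
import re

# Hypothetical blocklist: only the two examples named in the episode.
# The full list used in the paper is not given in these show notes.
BLOCKED_NUMBERS = {"666", "911"}


def is_clean(sample: str, blocked: set[str] = BLOCKED_NUMBERS) -> bool:
    """Return True if the sample contains only numbers and none is blocklisted."""
    # Split on commas and/or whitespace, dropping empty pieces.
    tokens = [t for t in re.split(r"[,\s]+", sample.strip()) if t]
    # Reject empty samples and anything that is not purely numeric.
    if not tokens or not all(t.isdigit() for t in tokens):
        return False
    # Reject samples containing a blocklisted number (exact match only).
    return not any(t in blocked for t in tokens)


def filter_samples(samples: list[str]) -> list[str]:
    """Keep only the samples that pass is_clean."""
    return [s for s in samples if is_clean(s)]


samples = ["12, 47, 305", "666, 14, 2", "8 911 23", "owl, 3, 4", "100, 200"]
kept = filter_samples(samples)
print(kept)
```

This sketch matches whole numbers only, so a value such as 6661 would pass; whether a real filter should be stricter is a design choice the show notes do not settle.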

The pod also gets into what the media got wrong about the paper, AI safety, and Minh’s hot take on why he doesn’t buy into p(doom) (the idea that AI leads to human extinction…). Minh also talks about how he went from being an independent researcher to the prestigious Anthropic Fellowship and now to a full-time role at Anthropic.

00:00 Minh's career journey

04:36 Deep dive into the subliminal learning study

26:23 Larger discussion about AI safety
