Home
Categories
EXPLORE
True Crime
Comedy
Society & Culture
Business
Sports
Technology
Health & Fitness
About Us
Contact Us
Copyright
© 2024 PodJoint
Podjoint Logo
US
00:00 / 00:00
Sign in

or

Don't have an account?
Sign up
Forgot password
https://is1-ssl.mzstatic.com/image/thumb/Podcasts211/v4/d8/b7/27/d8b72741-4a96-73a6-e98e-c6c2402e48ec/mza_11654084090888999774.jpg/600x600bb.jpg
LessWrong (Curated & Popular)
LessWrong
655 episodes
1 day ago
This is a link post. New Anthropic research (tweet, blog post, paper): We investigate whether large language models can introspect on their internal states. It is difficult to answer this question through conversation alone, as genuine introspection cannot be distinguished from confabulations. Here, we address this challenge by injecting representations of known concepts into a model's activations, and measuring the influence of these manipulations on the model's self-reported states. We f...
Show more...
Technology
Society & Culture,
Philosophy
RSS
All content for LessWrong (Curated & Popular) is the property of LessWrong and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
This is a link post. New Anthropic research (tweet, blog post, paper): We investigate whether large language models can introspect on their internal states. It is difficult to answer this question through conversation alone, as genuine introspection cannot be distinguished from confabulations. Here, we address this challenge by injecting representations of known concepts into a model's activations, and measuring the influence of these manipulations on the model's self-reported states. We f...
Show more...
Technology
Society & Culture,
Philosophy
https://is1-ssl.mzstatic.com/image/thumb/Podcasts211/v4/d8/b7/27/d8b72741-4a96-73a6-e98e-c6c2402e48ec/mza_11654084090888999774.jpg/600x600bb.jpg
“On Fleshling Safety: A Debate by Klurl and Trapaucius.” by Eliezer Yudkowsky
LessWrong (Curated & Popular)
2 hours 22 minutes
1 week ago
“On Fleshling Safety: A Debate by Klurl and Trapaucius.” by Eliezer Yudkowsky
(23K words; best considered as nonfiction with a fictional-dialogue frame, not a proper short story.) Prologue: Klurl and Trapaucius were members of the machine race. And no ordinary citizens they, but Constructors: licensed, bonded, and insured; proven, experienced, and reputed. Together Klurl and Trapaucius had collaborated on such famed artifices as the Eternal Clock, Silicon Sphere, Wandering Flame, and Diamond Book; and as individuals, both had constructed wonders too numerous to nu...
LessWrong (Curated & Popular)
This is a link post. New Anthropic research (tweet, blog post, paper): We investigate whether large language models can introspect on their internal states. It is difficult to answer this question through conversation alone, as genuine introspection cannot be distinguished from confabulations. Here, we address this challenge by injecting representations of known concepts into a model's activations, and measuring the influence of these manipulations on the model's self-reported states. We f...