Into AI Safety
Jacob Haimes
24 episodes
1 week ago
The Into AI Safety podcast aims to make it easier for everyone, regardless of background, to get meaningfully involved with the conversations surrounding the rules and regulations which should govern the research, development, deployment, and use of the technologies encompassed by the term "artificial intelligence" or "AI." For better-formatted show notes, additional resources, and more, go to https://kairos.fm/intoaisafety/
Technology, Science, Mathematics
All content for Into AI Safety is the property of Jacob Haimes and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Episodes (20/24)
Into AI Safety
Getting Agentic w/ Alistair Lowe-Norris

Alistair Lowe-Norris, Chief Responsible AI Officer at Iridius and co-host of The Agentic Insider podcast, joins me to discuss AI compliance standards, the importance of narrowly scoping systems, and how procurement requirements could encourage responsible AI adoption across industries. We explore the gap between the empty promises companies make and their actual safety practices, as well as the importance of vigilance and continuous oversight.

Listen to Alistair on his podcast, The Agentic Insider!

As part of my effort to make this whole podcasting thing more sustainable, I have created a Kairos.fm Patreon which includes an extended version of this episode. Supporting gets you access to these extended cuts, as well as other perks in development.

Chapters

  • (00:00) - Intro
  • (02:46) - Trustworthy AI and the Human Side of Change
  • (13:57) - This is Essentially Avatar, Right?
  • (23:00) - AI Call Centers
  • (49:38) - Standards, Audits, and Accountability
  • (01:04:11) - What Happens when Standards aren’t Met?

Links
  • Iridius website

GPT-5 Commentary

  • Where's Your Ed At blogpost - How Does GPT-5 Work?
  • Zvi LessWrong blogpost - GPT-5: The Reverse DeepSeek moment
  • Blood in the Machine article - GPT-5 Is a Joke. Will It Matter?
  • Futurism article - Evidence Grows That GPT-5 Is a Bit of a Dud
  • Gary Marcus substack - GPT-5: Overdue, overhyped and underwhelming. And that’s not the worst of it.

Customer Service and AI Adoption

  • Gartner press release - Gartner Survey Finds 64% of Customers Would Prefer That Companies Didn't Use AI for Customer Service
  • Preprint - Deploying Chatbots in Customer Service: Adoption Hurdles and Simple Remedies
  • KDD '25 paper - Retrieval And Structuring Augmented Generation with Large Language Models
  • Global Nerdy blogpost - Retrieval-augmented generation explained “Star Wars” style
  • The Security Cafe article - A Quick And Dirty Guide To Starting SOC2

Standards

  • ISO overview - AI management systems
  • ISO standard - ISO/IEC 42001
    • CyberZoni guide - ISO 42001 The Complete Guide
    • A-LIGN article - Understanding ISO 42001
  • ISO standard - ISO/IEC 27001
  • ISO standard - ISO/IEC 42005

Governance and Regulation

  • NIST framework - AI Risk Management Framework
  • EU AI Act article - Article 99: Penalties
  • Colorado Senate Bill 24-205 (Colorado AI Act) webpage
  • Utah Senate Bill 149 webpage

Microsoft AI Compliance

  • Schellman blogpost - Microsoft DPR AI Requirements and ISO 42001
  • Microsoft documentation - ISO/IEC 42001 AI Management System offering
  • Microsoft webpage - Responsible AI Principles and Approach
  • Microsoft Service Trust Portal documentation - Responsible AI Standard v2
  • Microsoft documentation - Supplier Security & Privacy Assurance Program Guide v11 April 2025
1 week ago
1 hour 11 minutes

Into AI Safety
Growing BlueDot's Impact w/ Li-Lian Ang

I'm joined by my good friend, Li-Lian Ang, first hire and product manager at BlueDot Impact. We discuss how BlueDot has evolved from their original course offerings to a new "defense-in-depth" approach, which focuses on three core threat models: reduced oversight in high risk scenarios (e.g. accelerated warfare), catastrophic terrorism (e.g. rogue actors with bioweapons), and the concentration of wealth and power (e.g. supercharged surveillance states). On top of that, we cover how BlueDot's strategies account for and reduce the negative impacts of common issues in AI safety, including exclusionary tendencies, elitism, and echo chambers.

2025.09.15: Learn more about how to design effective interventions to make AI go well, and potentially even get funded for it, on BlueDot Impact's AGI Strategy course! BlueDot is also hiring, so if you think you’d be a good fit, I definitely recommend applying; I had a great experience when I contracted as a course facilitator. If you do end up applying, let them know you found out about the opportunity from the podcast!


Follow Li-Lian on LinkedIn, and look at more of her work on her blog!

As part of my effort to make this whole podcasting thing more sustainable, I have created a Kairos.fm Patreon which includes an extended version of this episode. Supporting gets you access to these extended cuts, as well as other perks in development.


  • (03:23) - Meeting Through the Course
  • (05:46) - Eating Your Own Dog Food
  • (13:13) - Impact Acceleration
  • (22:13) - Breaking Out of the AI Safety Mold
  • (26:06) - BlueDot’s Risk Framework
  • (41:38) - Dangers of "Frontier" Models
  • (54:06) - The Need for AI Safety Advocates
  • (01:00:11) - Hot Takes and Pet Peeves

Links
  • BlueDot Impact website

Defense-in-Depth

  • BlueDot Impact blogpost - Our vision for comprehensive AI safety training
  • Engineering for Humans blogpost - The Swiss cheese model: Designing to reduce catastrophic losses
  • Open Journal of Safety Science and Technology article - The Evolution of Defense in Depth Approach: A Cross Sectorial Analysis

X-clusion and X-risk

  • Nature article - AI Safety for Everyone
  • Ben Kuhn blogpost - On being welcoming
  • Reflective Altruism blogpost - Belonging (Part 1: That Bostrom email)

AIxBio

  • RAND report - The Operational Risks of AI in Large-Scale Biological Attacks
  • OpenAI "publication" (press release) - Building an early warning system for LLM-aided biological threat creation
  • Anthropic Frontier AI Red Team blogpost - Why do we take LLMs seriously as a potential source of biorisk?
  • Kevin Esvelt preprint - Foundation models may exhibit staged progression in novel CBRN threat disclosure
  • Anthropic press release - Activating AI Safety Level 3 protections

Persuasive AI

  • Preprint - Lies, Damned Lies, and Distributional Language Statistics: Persuasion and Deception with Large Language Models
  • Nature Human Behavior article - On the conversational persuasiveness of GPT-4
  • Preprint - Large Language Models Are More Persuasive Than Incentivized Human Persuaders

AI, Anthropomorphization, and Mental Health

  • Western News article - Expert insight: Humanlike chatbots detract from developing AI for the human good
  • AI & Society article - Anthropomorphization and beyond: conceptualizing humanwashing of AI-enabled machines
  • Artificial Ignorance article - The Chatbot Trap
  • Making Noise and Hearing Things blogpost - Large language models cannot replace mental health professionals
  • Idealogo blogpost - 4 reasons not to turn ChatGPT into your therapist
  • Journal of Medical Society Editorial - Importance of informed consent in medical practice
  • Indian Journal of Medical Research article - Consent in psychiatry - concept, application & implications
  • MediaNama article - The Risk of Humanising AI Chatbots: Why ChatGPT Mimicking Feelings Can Backfire
  • Becker's Behavioral Health blogpost - OpenAI’s mental health roadmap: 5 things to know

Miscellaneous References

  • Carnegie Council blogpost - What Do We Mean When We Talk About "AI Democratization"?
  • Collective Intelligence Project policy brief - Four Approaches to Democratizing AI
  • BlueDot Impact blogpost - How Does AI Learn? A Beginner's Guide with Examples
  • BlueDot Impact blogpost - AI safety needs more public-facing advocacy

More Li-Lian Links

  • Humans of Minerva podcast website
  • Li-Lian's book - Purple is the Noblest Shroud

Relevant Podcasts from Kairos.fm

  • Scaling Democracy w/ Dr. Igor Krawczuk for AI safety exclusion and echo chambers
  • Getting into PauseAI w/ Will Petillo for AI in warfare and exclusion in AI safety
1 month ago
1 hour 7 minutes

Into AI Safety
Layoffs to Leadership w/ Andres Sepulveda Morales

Andres Sepulveda Morales joins me to discuss his journey from three tech layoffs to founding Red Mage Creative and leading the Fort Collins chapter of the Rocky Mountain AI Interest Group (RMAIIG). We explore the current tech job market, AI anxiety in nonprofits, dark patterns in AI systems, and building inclusive tech communities that welcome diverse perspectives.


Reach out to Andres on his LinkedIn, or check out the Red Mage Creative website!

For any listeners in Colorado, consider attending an RMAIIG event: Boulder; Fort Collins

  • (00:00) - Intro
  • (01:04) - Andres' Journey
  • (05:15) - Tech Layoff Cycle
  • (26:12) - Why AI?
  • (30:58) - What is Red Mage?
  • (36:12) - AI as a Tool
  • (41:55) - AInxiety
  • (47:26) - Dark Patterns and Critical Perspectives
  • (01:01:35) - RMAIIG
  • (01:10:09) - Inclusive Tech Education
  • (01:18:05) - Colorado AI Governance
  • (01:23:46) - Building Your Own Tech Community

Links

Tech Job Market

  • Layoff tracker website
  • The Big Newsletter article - Why Are We Pretending AI Is Going to Take All the Jobs?
  • METR preprint - Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity
  • AI Business blogpost - Debunking the AI Job Crisis (https://aibusiness.com/responsible-ai/debunking-the-ai-job-crisis)
  • Crunchbase article - Data: Tech Layoffs Remain Stubbornly High, With Big Tech Leading The Way
  • Computerworld article - Tech layoffs surge even as US unemployment remains stable
  • Apollo Technical blogpost - Ghost jobs in tech: Why companies are posting roles they don’t plan to fill
  • The HR Digest article - The Rise of Ghost Jobs Is Leaving Job Seekers Frustrated and Disappointed
  • A Life After Layoff video - The Tech Job Market Is Hot Trash Right Now
  • Economy Media video - Will The Tech Job Market Ever Recover?
  • Soleyman Shahir video - Tech CEO Explains: The Real Reason Behind AI Layoffs

Dark Patterns

  • Deceptive Design website
  • Journal of Legal Analysis article - Shining a Light on Dark Patterns
  • ICLR paper - DarkBench: Benchmarking Dark Patterns in Large Language Models
  • Computing Within Limits paper - Imposing AI: Deceptive design patterns against sustainability
  • Communications of the ACM blogpost - Dark Patterns
  • Preprint - A Comprehensive Study on Dark Patterns

Colorado AI Regulation

  • Senate Bill 24-205 (Colorado AI Act) bill and webpage
  • NAAG article - A Deep Dive into Colorado’s Artificial Intelligence Act
  • Colorado Sun article - Why Colorado’s artificial intelligence law is a big deal for the whole country
  • CFO Dive blogpost - ‘Heavy lift’: Colorado AI law sets high bar, analysts say
  • Denver 7 article - Colorado could lose federal funding as Trump administration targets AI regulations
  • America's AI Action Plan document

Other Sources

  • Concordia Framework report and repo
  • 80,000 Hours website
  • AI Incident Database website
3 months ago
1 hour 39 minutes

Into AI Safety
Getting Into PauseAI w/ Will Petillo

Will Petillo, onboarding team lead at PauseAI, joins me to discuss the grassroots movement advocating for a pause on frontier AI model development. We explore PauseAI's strategy, talk about common misconceptions Will hears, and dig into how diverse perspectives still converge on the need to slow down AI development.

Will's Links

  • Personal blog on AI
  • His mindmap of the AI x-risk debate
  • Game demos
  • AI focused YouTube channel


  • (00:00) - Intro
  • (03:36) - What is PauseAI
  • (10:10) - Will Petillo's journey into AI safety advocacy
  • (21:13) - Understanding PauseAI
  • (31:35) - Pursuing a pause
  • (40:06) - Balancing advocacy in a complex world
  • (45:54) - Why a pause on frontier models?
  • (54:48) - Diverse perspectives within PauseAI
  • (59:55) - PauseAI misconceptions
  • (01:16:40) - Ongoing AI governance efforts (SB1047)
  • (01:28:52) - The role of incremental progress
  • (01:35:16) - Safety-washing and corporate responsibility
  • (01:37:23) - Lessons from environmentalism
  • (01:41:59) - Will's superlatives

Links
  • PauseAI
  • PauseAI-US

Related Kairos.fm Episodes

  • Into AI Safety episode with Dr. Igor Krawczuk
  • muckrAIkers episode on SB1047

Exclusionary Tendencies

  • Jacobin article - Elite Universities Gave Us Effective Altruism, the Dumbest Idea of the Century
  • SSIR article - The Elitist Philanthropy of So-Called Effective Altruism
  • Persuasion blogpost - The Problem with Effective Altruism
  • Dark Markets blogpost - What's So Bad About Rationalism?
  • FEE blogpost - What’s Wrong With the Rationality Community?

AI in Warfare

  • Master's Thesis - The Evolution of Artificial Intelligence and Expert Computer Systems in the Army
  • International Journal of Intelligent Systems article - Artificial Intelligence in the Military: An Overview of the Capabilities, Applications, and Challenges
  • Preprint - Basic Research, Lethal Effects: Military AI Research Funding as Enlistment
  • AOAV Article - ‘Military Age Males’ in US Drone Strikes
  • The Conversation article - Gaza war: Israel using AI to identify human targets raising fears that innocents are being caught in the net
  • 972 article - ‘Lavender’: The AI machine directing Israel’s bombing spree in Gaza
  • IDF press release - The IDF's Use of Data Technologies in Intelligence Processing
  • Lieber Institute West Point article - Israel–Hamas 2024 Symposium
  • Verfassungsblog article - Gaza, Artificial Intelligence, and Kill Lists
  • RAND research report - Dr. Li Bicheng, or How China Learned to Stop Worrying and Love Social Media Manipulation
  • The Intercept article collection - The Drone Papers
  • AFIT faculty publication - On Large Language Models in National Security Applications
  • Nature article - Death by Metadata: The Bioinformationalisation of Life and the Transliteration of Algorithms to Flesh

Legislation

  • LegiScan page on SB1047
  • NY State Senate page on the RAISE Act
  • Congress page on the TAKE IT DOWN Act

The Gavernor

  • FastCompany article - Big Tech may be focusing its lobbying push on the California AI safety bill’s last stop: Gavin Newsom
  • POLITICO article - How California politics killed a nationally important AI bill
  • Newsom's veto message
  • Additional relevant lobbying documentation - [1], [2]
  • Jacobin article - With Newsom’s Veto, Big Tech Beats Democracy

Misc. Links

  • FLI Open Letter on an AI pause
  • Wikipedia article - Overton window
  • Daniel Schmachtenberger YouTube video - An Introduction to the Metacrisis
  • VAISU website (looks broken as of 2025.06.19)
  • AI Impacts report - Why Did Environmentalism Become Partisan?
4 months ago
1 hour 48 minutes

Into AI Safety
Making Your Voice Heard w/ Tristan Williams & Felix de Simone

I am joined by Tristan Williams and Felix de Simone to discuss their work on the potential of constituent communication, specifically in the context of AI legislation. These two worked as part of an AI Safety Camp team to understand whether or not it would be useful for more people to be sharing their experiences, concerns, and opinions with their government representative (hint, it is).

Check out the blogpost on their findings, "Talking to Congress: Can constituents contacting their legislator influence policy?" and the tool they created!

  • (01:53) - Introductions
  • (04:04) - Starting the project
  • (13:30) - Project overview
  • (16:36) - Understanding constituent communication
  • (28:50) - Literature review
  • (35:52) - Phase 2
  • (43:26) - Creating a tool for citizen engagement
  • (50:16) - Crafting your message
  • (59:40) - The game of advocacy
  • (01:15:19) - Difficulties on the project
  • (01:22:33) - Call to action
  • (01:32:30) - Outro

Links

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

  • AI Safety Camp
  • Pause AI
  • BlueDot Impact
  • TIME article - There’s an AI Lobbying Frenzy in Washington. Big Tech Is Dominating
  • Congressional Management Foundation study - Communicating with Congress: Perceptions of Citizen Advocacy on Capitol Hill
  • Congressional Management Foundation study - The Future of Citizen Engagement: Rebuilding the Democratic Dialogue
  • Tristan and Felix's blogpost - Talking to Congress: Can constituents contacting their legislator influence policy?
  • Wired article - What It Takes to Make Congress Actually Listen
  • American Journal of Political Science article - Congressional Representation: Accountability from the Constituent’s Perspective
  • Political Behavior article - Call Your Legislator: A Field Experimental Study of the Impact of a Constituency Mobilization Campaign on Legislative Voting
  • Guided Track website
  • The Tool
  • Holistic AI global regulatory tracker
  • White & Case global regulatory tracker
  • Steptoe US AI legislation tracker
  • Manatt US AIxHealth legislation tracker
  • Issue One article - Big Tech Cozies Up to New Administration After Spending Record Sums on Lobbying Last Year
  • Verfassungsblog article - BigTech’s Efforts to Derail the AI Act
  • MIT Technology Review article - OpenAI has upped its lobbying efforts nearly sevenfold
  • Open Secrets webpage - Issue Profile: Science & Technology
  • Statista data - Leading lobbying spenders in the United States in 2024
  • Global Justice Now report - Democracy at risk in Davos: new report exposes big tech lobbying and political interference
  • Ipsos article - Where Americans stand on AI
  • AP-NORC report - There Is Bipartisan Concern About the Use of AI in the 2024 Elections
  • AI Action Summit report - International AI Safety Report
  • YouGov article - Do Americans think AI will have a positive or negative impact on society?
5 months ago
1 hour 33 minutes

Into AI Safety
INTERVIEW: Scaling Democracy w/ (Dr.) Igor Krawczuk

The almost Dr. Igor Krawczuk joins me for what is the equivalent of 4 of my previous episodes. We get into all the classics: eugenics, capitalism, philosophical toads... Need I say more?

If you're interested in connecting with Igor, head on over to his website, or check out placeholder for thesis (it isn't published yet).

Because the full show notes have a whopping 115 additional links, I'll highlight some that I think are particularly worthwhile here:

  • The best article you'll ever read on Open Source AI
  • The best article you'll ever read on emergence in ML
  • Kate Crawford's Atlas of AI (Wikipedia)
  • On the Measure of Intelligence
  • Thomas Piketty's Capital in the Twenty-First Century (Wikipedia)
  • Yurii Nesterov's Introductory Lectures on Convex Optimization

Chapters

  • (02:32) - Introducing Igor
  • (10:11) - Aside on EY, LW, EA, etc., a.k.a. lettersoup
  • (18:30) - Igor on AI alignment
  • (33:06) - "Open Source" in AI
  • (41:20) - The story of infinite riches and suffering
  • (59:11) - On AI threat models
  • (01:09:25) - Representation in AI
  • (01:15:00) - Hazard fishing
  • (01:18:52) - Intelligence and eugenics
  • (01:34:38) - Emergence
  • (01:48:19) - Considering externalities
  • (01:53:33) - The shape of an argument
  • (02:01:39) - More eugenics
  • (02:06:09) - I'm convinced, what now?
  • (02:18:03) - AIxBio (round ??)
  • (02:29:09) - On open release of models
  • (02:40:28) - Data and copyright
  • (02:44:09) - Scientific accessibility and bullshit
  • (02:53:04) - Igor's point of view
  • (02:57:20) - Outro


Links

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance. All references, including those only mentioned in the extended version of this episode, are included.

  • Suspicious Machines Methodology, referred to as the "Rotterdam Lighthouse Report" in the episode
  • LIONS Lab at EPFL
  • The meme that Igor references
  • On the Hardness of Learning Under Symmetries
  • Course on the concept of equivariant deep learning
  • Aside on EY/EA/etc.
      • Sources on Eliezer Yudkowsky
      • Scholarly Community Encyclopedia
      • TIME100 AI
      • Yudkowsky's personal website
      • EY Wikipedia
      • A Very Literary Wiki - TIME article: Pausing AI Developments Isn’t Enough. We Need to Shut it All Down, documenting EY's ruminations on bombing datacenters; this comes up later in the episode but is included here because it is about EY.
    • LessWrong
      • LW Wikipedia
    • MIRI
    • Coverage on Nick Bostrom (being a racist)
      • The Guardian article: ‘Eugenics on steroids’: the toxic and contested legacy of Oxford’s Future of Humanity Institute
      • The Guardian article: Oxford shuts down institute run by Elon Musk-backed philosopher
    • Investigative piece on Émile Torres
    • On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜
    • NY Times article: We Teach A.I. Systems Everything, Including Our Biases
    • NY Times article: Google Researcher Says She Was Fired Over Paper Highlighting Bias in A.I.
    • Timnit Gebru's Wikipedia
    • The TESCREAL Bundle: Eugenics and the Promise of Utopia through Artificial General Intelligence
    • Sources on the environmental impact of LLMs
      • The Environmental Impact of LLMs
      • The Cost of Inference: Running the Models
      • Energy and Policy Considerations for Deep Learning in NLP
      • The Carbon Impact of AI vs Search Engines
  • Filling Gaps in Trustworthy Development of AI (Igor is an author on this one)
  • A Computational Turn in Policy Process Studies: Coevolving Network Dynamics of Policy Change
  • The Smoothed Possibility of Social Choice, an intro in social choice theory and how it overlaps with ML
  • Relating to Dan Hendrycks
    • Natural Selection Favors AIs over Humans
      • "One easy-to-digest source to highlight what he gets wrong [is] Social and Biopolitical Dimensions of Evolutionary Thinking" -Igor
    • Introduction to AI Safety, Ethics, and Society, recently published textbook
    • "Source to the section [of this paper] that makes Dan one of my favs from that crowd." -Igor
    • Twitter post referenced in the episode
1 year ago
2 hours 58 minutes

Into AI Safety
INTERVIEW: StakeOut.AI w/ Dr. Peter Park (3)
As always, the best things come in 3s: dimensions, musketeers, pyramids, and... 3 installments of my interview with Dr. Peter Park, an AI Existential Safety Post-doctoral Fellow working with Dr. Max Tegmark at MIT.

As you may have ascertained from the previous two segments of the interview, Dr. Park cofounded StakeOut.AI along with Harry Luk and one other cofounder whose name has been removed due to requirements of her current position. The non-profit had a simple but important mission: make the adoption of AI technology go well, for humanity, but unfortunately, StakeOut.AI had to dissolve in late February of 2024 because no granter would fund them. Although it certainly is disappointing that the organization is no longer functioning, all three cofounders continue to contribute positively towards improving our world in their current roles.

If you would like to investigate further into Dr. Park's work, view his website, Google Scholar, or follow him on Twitter.

  • (00:00:54) - Intro
  • (00:02:41) - Rapid development
  • (00:08:25) - Provable safety, safety factors, & CSAM
  • (00:18:50) - Litigation
  • (00:23:06) - Open/Closed Source
  • (00:38:52) - AIxBio
  • (00:47:50) - Scientific rigor in AI
  • (00:56:22) - AI deception
  • (01:02:45) - No takesies-backsies
  • (01:08:22) - StakeOut.AI's start
  • (01:12:53) - Sustainability & Agency
  • (01:18:21) - "I'm sold, next steps?" -you
  • (01:23:53) - Lessons from the amazing Spiderman
  • (01:33:15) - "I'm ready to switch careers, next steps?" -you
  • (01:40:00) - The most important question
  • (01:41:11) - Outro

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

  • StakeOut.AI
  • Pause AI
  • AI Governance Scorecard (go to Pg. 3)
  • CIVITAI
  • Article on CIVITAI and CSAM
  • Senate Hearing: Protecting Children Online
  • PBS Newshour Coverage
  • The Times Sues OpenAI and Microsoft Over A.I. Use of Copyrighted Work
  • Open Source/Weights/Release/Interpretation
  • Open Source Initiative
  • History of the OSI
  • Meta’s LLaMa 2 license is not Open Source
  • Is Llama 2 open source? No – and perhaps we need a new definition of open…
  • Apache License, Version 2.0
  • 3Blue1Brown: Neural Networks
  • Opening up ChatGPT: Tracking openness, transparency, and accountability in instruction-tuned text generators
  • The online table
  • Signal
  • Bloomz model on HuggingFace
  • Mistral website
  • NASA Tragedies
  • Challenger disaster on Wikipedia
  • Columbia disaster on Wikipedia
  • AIxBio Risk
  • Dual use of artificial-intelligence-powered drug discovery
  • Can large language models democratize access to dual-use biotechnology?
  • Open-Sourcing Highly Capable Foundation Models (sadly, I can't rename the article...)
  • Propaganda or Science: Open Source AI and Bioterrorism Risk
  • Exaggerating the risks (Part 15: Biorisk from LLMs)
  • Will releasing the weights of future large language models grant widespread access to pandemic agents?
  • On the Societal Impact of Open Foundation Models
  • Policy brief
  • Apart Research
  • Science
  • Cicero
  • Human-level play in the game of Diplomacy by combining language models with strategic reasoning
  • Cicero webpage
  • AI Deception: A Survey of Examples, Risks, and Potential Solutions
  • Open Sourcing the AI Revolution: Framing the debate on open source, artificial intelligence and regulation
  • AI Safety Camp
  • Into AI Safety Patreon
1 year ago
1 hour 42 minutes

Into AI Safety
INTERVIEW: StakeOut.AI w/ Dr. Peter Park (2)
Join me for round 2 with Dr. Peter Park, an AI Existential Safety Postdoctoral Fellow working with Dr. Max Tegmark at MIT. Dr. Park was a cofounder of StakeOut.AI, a non-profit focused on making AI go well for humans, along with Harry Luk and one other individual, whose name has been removed due to requirements of her current position.

In addition to the normal links, I wanted to include the links to the petitions that Dr. Park mentions during the podcast. Note that the nonprofit which began these petitions, StakeOut.AI, has been dissolved.

  • Right AI Laws, to Right Our Future: Support Artificial Intelligence Safety Regulations Now
  • Is Deepfake Illegal? Not Yet! Ban Deepfakes to Protect Your Family & Demand Deepfake Laws
  • Ban Superintelligence: Stop AI-Driven Human Extinction Risk

  • (00:00:54) - Intro
  • (00:02:34) - Battleground 1: Copyright
  • (00:06:28) - Battleground 2: Moral Critique of AI Collaborationists
  • (00:08:15) - Rich Sutton
  • (00:20:41) - OpenAI Drama
  • (00:34:28) - Battleground 3: Contract Negotiations for AI Ban Clauses
  • (00:37:57) - Tesla, Autopilot, and FSD
  • (00:40:02) - Recycling
  • (00:47:40) - Battleground 4: New Laws and Policies
  • (00:50:00) - Battleground 5: Whistleblower Protections
  • (00:53:07) - Whistleblowing on Microsoft
  • (00:54:43) - Andrej Karpathy & Exercises in Empathy
  • (01:05:57) - Outro

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

  • StakeOut.AI
  • The Times Sues OpenAI and Microsoft Over A.I. Use of Copyrighted Work
  • Susman Godfrey LLP
  • Rich Sutton
  • Reinforcement Learning: An Introduction (textbook)
  • AI Succession (presentation by Rich Sutton)
  • The Alberta Plan for AI Research
  • Moore's Law
  • The Future of Integrated Electronics (original paper)
  • Computer History Museum's entry on Moore's Law
  • Stochastic gradient descent (SGD) on Wikipedia
  • OpenAI Drama
  • Max Read's Substack post
  • Zvi Mowshowitz's Substack series, in order of posting
    • OpenAI: Facts from a Weekend
    • OpenAI: The Battle of the Board
    • OpenAI: Altman Returns
    • OpenAI: Leaks Confirm the Story ← best singular post in the series
    • OpenAI: The Board Expands
  • Official OpenAI announcement
  • WGA on Wikipedia
  • SAG-AFTRA on Wikipedia
  • Tesla's False Advertising
  • Tesla's response to the DMV's false-advertising allegations: What took so long?
  • Tesla Tells California DMV that FSD Is Not Capable of Autonomous Driving
  • What to Call Full Self-Driving When It Isn't Full Self-Driving?
  • Tesla fired an employee after he posted driverless tech reviews on YouTube
  • Tesla's page on Autopilot and Full Self-Driving
  • Recycling
  • Boulder County Recycling Center Stockpiles Accurately Sorted Recyclable Materials
  • Out of sight, out of mind
  • Boulder Eco-Cycle Recycling Guidelines
  • Divide-and-Conquer Dynamics in AI-Driven Disempowerment
  • Microsoft Whistleblower
  • Whistleblowers call out AI's flaws
  • Shane's LinkedIn post
  • Letters sent by Jones
  • Karpathy announces departure from OpenAI
1 year ago
1 hour 6 minutes

Into AI Safety
MINISODE: Restructure Vol. 2
UPDATE: Contrary to what I say in this episode, I won't be removing any episodes that are already published from the podcast RSS feed.

After getting some advice and reflecting more on my own personal goals, I have decided to shift the direction of the podcast towards accessible content regarding "AI" instead of the show's original focus. I will still be releasing what I am calling research ride-along content to my Patreon, but the show's feed will consist only of content that I aim to make as accessible as possible.

  • (00:35) - TL;DL
  • (01:12) - Advice from Pete
  • (03:10) - My personal goal
  • (05:39) - Reflection on refining my goal
  • (09:08) - Looking forward (logistics)
1 year ago
13 minutes

Into AI Safety
INTERVIEW: StakeOut.AI w/ Dr. Peter Park (1)
Dr. Peter Park is an AI Existential Safety Postdoctoral Fellow working with Dr. Max Tegmark at MIT. In conjunction with Harry Luk and one other cofounder, he founded StakeOut.AI, a non-profit focused on making AI go well for humans.

  • (00:54) - Intro
  • (03:15) - Dr. Park, x-risk, and AGI
  • (08:55) - StakeOut.AI
  • (12:05) - Governance scorecard
  • (19:34) - Hollywood webinar
  • (22:02) - Regulations.gov comments
  • (23:48) - Open letters
  • (26:15) - EU AI Act
  • (35:07) - Effective accelerationism
  • (40:50) - Divide and conquer dynamics
  • (45:40) - AI "art"
  • (53:09) - Outro

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

  • StakeOut.AI
  • AI Governance Scorecard (go to Pg. 3)
  • Pause AI
  • Regulations.gov
  • USCO StakeOut.AI Comment
  • OMB StakeOut.AI Comment
  • AI Treaty open letter
  • TAISC
  • Alpaca: A Strong, Replicable Instruction-Following Model
  • References on EU AI Act and Cedric O
  • Tweet from Cedric O
  • EU policymakers enter the last mile for Artificial Intelligence rulebook
  • AI Act: EU Parliament’s legal office gives damning opinion on high-risk classification ‘filters’
  • EU’s AI Act negotiations hit the brakes over foundation models
  • The EU AI Act needs Foundation Model Regulation
  • BigTech’s Efforts to Derail the AI Act
  • Open Sourcing the AI Revolution: Framing the debate on open source, artificial intelligence and regulation
  • Divide-and-Conquer Dynamics in AI-Driven Disempowerment
1 year ago
54 minutes

Into AI Safety
MINISODE: "LLMs, a Survey"
Take a trip with me through the paper Large Language Models, A Survey, published on February 9th of 2024. All figures and tables mentioned throughout the episode can be found on the Into AI Safety podcast website.

  • (00:36) - Intro and authors
  • (01:50) - My takes and paper structure
  • (04:40) - Getting to LLMs
  • (07:27) - Defining LLMs & emergence
  • (12:12) - Overview of PLMs
  • (15:00) - How LLMs are built
  • (18:52) - Limitations of LLMs
  • (23:06) - Uses of LLMs
  • (25:16) - Evaluations and Benchmarks
  • (28:11) - Challenges and future directions
  • (29:21) - Recap & outro

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

  • Large Language Models, A Survey
  • Meysam's LinkedIn Post
  • Claude E. Shannon
  • A symbolic analysis of relay and switching circuits (Master's Thesis)
  • Communication theory of secrecy systems
  • A mathematical theory of communication
  • Prediction and entropy of printed English
  • Future ML Systems Will Be Qualitatively Different
  • More Is Different
  • Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
  • Are Emergent Abilities of Large Language Models a Mirage?
  • Are Emergent Abilities of Large Language Models just In-Context Learning?
  • Attention is all you need
  • Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  • KTO: Model Alignment as Prospect Theoretic Optimization
  • Optimization by Simulated Annealing
  • Memory and new controls for ChatGPT
  • Hallucinations and related concepts—their conceptual background
1 year ago
30 minutes

Into AI Safety
FEEDBACK: Applying for Funding w/ Esben Kran
Esben reviews an application that I would soon submit for Open Philanthropy's Career Transition Funding opportunity. Although I didn't end up receiving the funding, I do think that this episode can be a valuable resource for both others and myself when applying for funding in the future.

Head over to Apart Research's website to check out their work, or the Alignment Jam website for information on upcoming hackathons.

A doc-capsule of the application at the time of this recording can be found at this link.

  • (01:38) - Interview starts
  • (05:41) - Proposal
  • (11:00) - Personal statement
  • (14:00) - Budget
  • (21:12) - CV
  • (22:45) - Application questions
  • (34:06) - Funding questions
  • (44:25) - Outro

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

  • AI governance talent profiles we’d like to see
  • The AI Governance Research Sprint
  • Reasoning Transparency
  • Places to look for funding
    • Open Philanthropy's Career development and transition funding
    • Long-Term Future Fund
    • Manifund
1 year ago
45 minutes

Into AI Safety
MINISODE: Reading a Research Paper
Before I begin with the paper-distillation based minisodes, I figured we would go over best practices for reading research papers. I go through the anatomy of typical papers, and some generally applicable advice.

  • (00:56) - Anatomy of a paper
  • (02:38) - Most common advice
  • (05:24) - Reading sparsity and path
  • (07:30) - Notes and motivation

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

  • Ten simple rules for reading a scientific paper
  • Best sources I found
    • Let's get critical: Reading academic articles
    • #GradHacks: A guide to reading research papers
    • How to read a scientific paper (presentation)
  • Some more sources
    • How to read a scientific article
    • How to read a research paper
    • Reading a scientific article
1 year ago
9 minutes

Into AI Safety
HACKATHON: Evals November 2023 (2)
Join our hackathon group for the second episode in the Evals November 2023 Hackathon subseries. In this episode, we solidify our goals for the hackathon after some preliminary experimentation and ideation.

Check out Stellaric's website, or follow them on Twitter.

  • (01:53) - Meeting starts
  • (05:05) - Pitch: extension of locked models
  • (23:23) - Pitch: retroactive holdout datasets
  • (34:04) - Preliminary results
  • (37:44) - Next steps
  • (42:55) - Recap

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

  • Evalugator library
  • Password Locked Model blogpost
  • TruthfulQA: Measuring How Models Mimic Human Falsehoods
  • BLEU: a Method for Automatic Evaluation of Machine Translation
  • BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
  • Detecting Pretraining Data from Large Language Models
1 year ago
48 minutes

Into AI Safety
MINISODE: Portfolios
I provide my thoughts and recommendations regarding personal professional portfolios.

  • (00:35) - Intro to portfolios
  • (01:42) - Modern portfolios
  • (02:27) - What to include
  • (04:38) - Importance of visual
  • (05:50) - The "About" page
  • (06:25) - Tools
  • (08:12) - Future of "Minisodes"

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

  • From Portafoglio to Eportfolio: The Evolution of Portfolio in Higher Education
  • GIMP
  • AlternativeTo
  • Jekyll
  • GitHub Pages
  • Minimal Mistakes
  • My portfolio
1 year ago
9 minutes

Into AI Safety
INTERVIEW: Polysemanticity w/ Dr. Darryl Wright
Darryl and I discuss his background, how he became interested in machine learning, and a project we are currently working on investigating the penalization of polysemanticity during the training of neural networks; a rough, illustrative sketch of what such a penalty could look like follows the links below. Check out a diagram of the decoder task used for our research!

  • (01:46) - Interview begins
  • (02:14) - Supernovae classification
  • (08:58) - Penalizing polysemanticity
  • (20:58) - Our "toy model"
  • (30:06) - Task description
  • (32:47) - Addressing hurdles
  • (39:20) - Lessons learned

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

  • Zooniverse
  • BlueDot Impact
  • AI Safety Support
  • Zoom In: An Introduction to Circuits
  • MNIST dataset on PapersWithCode
  • Clusterability in Neural Networks
  • CIFAR-10 dataset
  • Effective Altruism Global
  • CLIP (blog post)
  • Long Term Future Fund
  • Engineering Monosemanticity in Toy Models
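For listeners wondering what "penalizing polysemanticity" could mean in practice, here is a minimal, hypothetical sketch in PyTorch: a tiny tied-weight autoencoder with more features than hidden neurons, plus an auxiliary loss term that discourages any single hidden neuron from spreading its weight across many input features. The model, dimensions, and the specific penalty are illustrative assumptions only; this is not the decoder task or method used in the research discussed in this episode.

```python
# Hypothetical toy setup: NOT the episode's actual model or penalty.
import torch
import torch.nn as nn

n_features, n_hidden = 16, 4   # more features than neurons -> pressure toward superposition

W = nn.Parameter(torch.randn(n_features, n_hidden) * 0.1)  # tied encoder/decoder weights
b = nn.Parameter(torch.zeros(n_features))                  # decoder bias

def forward(x):
    h = torch.relu(x @ W)                 # encode: (batch, n_features) -> (batch, n_hidden)
    return torch.relu(h @ W.T + b), h     # decode back to feature space

def polysemanticity_penalty(W):
    # Column j of W describes which input features hidden neuron j responds to.
    # The L1/L2 ratio of a column is ~1 when one feature dominates and grows as
    # weight spreads over many features, so penalizing it nudges each neuron
    # toward representing a single feature (a crude, illustrative proxy).
    col_l1 = W.abs().sum(dim=0)
    col_l2 = W.norm(dim=0) + 1e-8
    return (col_l1 / col_l2).mean()

opt = torch.optim.Adam([W, b], lr=1e-2)
for step in range(1000):
    # Synthetic sparse features: each feature is active roughly 5% of the time.
    x = (torch.rand(256, n_features) < 0.05).float() * torch.rand(256, n_features)
    x_hat, h = forward(x)
    loss = ((x_hat - x) ** 2).mean() + 0.01 * polysemanticity_penalty(W)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Other proxies (for example, penalizing a neuron's activation for correlating with several known ground-truth features in a toy setting) would work just as well for illustration; the point is only that "penalizing polysemanticity" amounts to adding a term like this alongside the reconstruction loss.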
1 year ago
45 minutes

Into AI Safety
MINISODE: Starting a Podcast
A summary and reflections on the path I have taken to get this podcast started, including some resource recommendations for others who want to do something similar.

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

  • LessWrong
  • Spotify for Podcasters
  • Into AI Safety podcast website
  • Effective Altruism Global
  • Open Broadcaster Software (OBS)
  • Craig
  • Riverside
1 year ago
10 minutes

Into AI Safety
HACKATHON: Evals November 2023 (1)
This episode kicks off our first subseries, which will consist of recordings taken during my team's meetings for the AlignmentJams Evals Hackathon in November of 2023. Our team won first place, so you'll be listening to the process which, at the end of the day, turned out to be pretty good.

Check out Apart Research, the group that runs the AlignmentJams Hackathons.

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

  • Generalization Analogies: A Testbed for Generalizing AI Oversight to Hard-To-Measure Domains
  • New paper shows truthfulness & instruction-following don't generalize by default
  • Generalization Analogies Website
  • Discovering Language Model Behaviors with Model-Written Evaluations
  • Model-Written Evals Website
  • OpenAI Evals GitHub
  • METR (previously ARC Evals)
  • Goodharting on Wikipedia
  • From Instructions to Intrinsic Human Values, a Survey of Alignment Goals for Big Models
  • Fine Tuning Aligned Language Models Compromises Safety Even When Users Do Not Intend
  • Shadow Alignment: The Ease of Subverting Safely Aligned Language Models
  • Will Releasing the Weights of Future Large Language Models Grant Widespread Access to Pandemic Agents?
  • Building Less Flawed Metrics, Understanding and Creating Better Measurement and Incentive Systems
  • EleutherAI's Model Evaluation Harness
  • Evalugator Library
1 year ago
1 hour 8 minutes

Into AI Safety
MINISODE: Staying Up-to-Date in AI
In this minisode I give some tips for staying up-to-date in the ever-changing landscape of AI. I would like to point out that I am constantly iterating on these strategies, tools, and sources, so it is likely that I will make an update episode in the future.

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

Tools

  • Feedly
  • arXiv Sanity Lite
  • Zotero
  • AlternativeTo
  • My "Distilled AI" Folder
  • AI Explained YouTube channel
  • AI Safety newsletter
  • Data Machina newsletter
  • Import AI
  • Midwit Alignment

Honourable Mentions

  • AI Alignment Forum
  • LessWrong
  • Bounded Regret (Jacob Steinhardt's blog)
  • Cold Takes (Holden Karnofsky's blog)
  • Chris Olah's blog
  • Tim Dettmers' blog
  • Epoch blog
  • Apollo Research blog
1 year ago
13 minutes

Into AI Safety
INTERVIEW: Applications w/ Alice Rigg
Alice Rigg, a mechanistic interpretability researcher from Ottawa, Canada, joins me to discuss their path and the applications process for research/mentorship programs.

Join the Mech Interp Discord server and attend reading groups at 11:00am on Wednesdays (Mountain Time)! Check out Alice's website.

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

  • EleutherAI
  • Join the public EleutherAI discord server
  • Distill
  • Effective Altruism (EA)
  • MATS Retrospective Summer 2023 post
  • Ambitious Mechanistic Interpretability AISC research plan by Alice Rigg
  • SPAR
  • Stability AI
  • During their most recent fundraising round, Stability AI had a valuation of $4B (Bloomberg)
  • Mech Interp Discord Server
1 year ago
1 hour 10 minutes
