<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="/stylesheet.xsl" type="text/xsl"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:podcast="https://podcastindex.org/namespace/1.0">
  <channel>
    <atom:link rel="self" type="application/rss+xml" href="https://feeds.transistor.fm/80-000-hours-podcast-on-artificial-intelligence" title="MP3 Audio"/>
    <atom:link rel="hub" href="https://pubsubhubbub.appspot.com/"/>
    <podcast:podping usesPodping="true"/>
    <title>The 80,000 Hours Podcast on Artificial Intelligence</title>
    <generator>Transistor (https://transistor.fm)</generator>
    <itunes:new-feed-url>https://feeds.transistor.fm/80-000-hours-podcast-on-artificial-intelligence</itunes:new-feed-url>
    <description>10 experts, 10 episodes: a crash course on transformative AI and what you can do to help shape its trajectory.

This compilation features 10 key episodes of The 80,000 Hours Podcast to help listeners — particularly those new to the topic — get to grips with the potential upsides and downsides of powerful, transformative AI.</description>
    <copyright>80,000 Hours</copyright>
    <podcast:guid>9643141e-918f-5a1c-8a72-75784264cefd</podcast:guid>
    <podcast:locked>yes</podcast:locked>
    <podcast:trailer pubdate="Fri, 01 Sep 2023 23:00:00 -0700" url="https://media.transistor.fm/adc263a5/d4fdfe49.mp3" length="1153399" type="audio/mpeg">Zero: What to expect in this series</podcast:trailer>
    <language>en-gb</language>
    <pubDate>Fri, 05 Jun 2026 15:44:06 -0700</pubDate>
    <lastBuildDate>Fri, 05 Jun 2026 15:45:08 -0700</lastBuildDate>
    <link>https://80k.link/V4A</link>
    <image>
      <url>https://img.transistorcdn.com/5c8FmQ49bAg24lsI7591rj-hepHk9oiWvVJYZYoDYbU/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9zaG93/LzQxNjU4LzE2ODQy/NDkzNzUtYXJ0d29y/ay5qcGc.jpg</url>
      <title>The 80,000 Hours Podcast on Artificial Intelligence</title>
      <link>https://80k.link/V4A</link>
    </image>
    <itunes:category text="Science"/>
    <itunes:category text="Society &amp; Culture"/>
    <itunes:type>episodic</itunes:type>
    <itunes:author>80,000 Hours</itunes:author>
    <itunes:image href="https://img.transistorcdn.com/5c8FmQ49bAg24lsI7591rj-hepHk9oiWvVJYZYoDYbU/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9zaG93/LzQxNjU4LzE2ODQy/NDkzNzUtYXJ0d29y/ay5qcGc.jpg"/>
    <itunes:summary>10 experts, 10 episodes: a crash course on transformative AI and what you can do to help shape its trajectory.

This compilation features 10 key episodes of The 80,000 Hours Podcast to help listeners — particularly those new to the topic — get to grips with the potential upsides and downsides of powerful, transformative AI.</itunes:summary>
    <itunes:subtitle>10 experts, 10 episodes: a crash course on transformative AI and what you can do to help shape its trajectory.</itunes:subtitle>
    <itunes:keywords>artificial intelligence, effective altruism, existential risk, longtermism, AGI</itunes:keywords>
    <itunes:owner>
      <itunes:name>80,000 Hours</itunes:name>
    </itunes:owner>
    <itunes:complete>Yes</itunes:complete>
    <itunes:explicit>No</itunes:explicit>
    <item>
      <title>One: Will MacAskill on AI causing a “century in a decade” — and how we’re completely unprepared</title>
      <itunes:title>One: Will MacAskill on AI causing a “century in a decade” — and how we’re completely unprepared</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">ff270f20-5032-42a7-9fb6-45d18e80aab8</guid>
      <link>https://80000hours.org/podcast/episodes/will-macaskill-century-in-a-decade-navigating-intelligence-explosion/</link>
      <description>
        <![CDATA[<p>The 20th century saw unprecedented change: nuclear weapons, satellites, the rise and fall of communism, third-wave feminism, the internet, postmodernism, game theory, genetic engineering, the Big Bang theory, quantum mechanics, widespread birth control, and more. Now imagine all of it compressed into just 10 years.</p><p>That’s the future Will MacAskill — philosopher, founding figure of effective altruism, and now researcher at <a href="https://www.forethought.org/">Forethought Research</a> — argues we need to prepare for in his paper “<a href="https://www.forethought.org/preparing-for-the-intelligence-explosion">Preparing for the intelligence explosion</a>.” Not in the distant future, but probably in 3–7 years.</p><p>The reason: AI systems are rapidly approaching human-level capability in scientific research and intellectual tasks. Once AI exceeds human abilities in AI research itself, we’ll enter a <a href="https://en.wikipedia.org/wiki/Recursive_self-improvement">recursive self-improvement</a> cycle — creating wildly more capable systems. Soon after, by improving algorithms and manufacturing chips, we’ll deploy millions, then billions, then trillions of superhuman AI scientists working 24/7 without human limitations. 
These systems will collaborate across disciplines, build on each discovery instantly, and conduct experiments at unprecedented scale and speed — compressing a century of progress into years.</p><p>Will compares this to a mediaeval king suddenly needing to upgrade from bows and arrows to nuclear weapons to deal with an ideological threat from a kingdom he’s never heard of, while simultaneously learning he’s descended from monkeys and his god doesn’t exist.</p><p>What makes this acceleration perilous is that while technology can speed up almost arbitrarily, human institutions and decision making are much more fixed.</p><p>Consider the case of nuclear weapons: in this compressed timeline, there would have been just a three-month gap between the Manhattan Project’s start and the Hiroshima bombing, and the Cuban Missile Crisis would have lasted just over a day.</p><p>Robert Kennedy Sr, who helped navigate the actual Cuban Missile Crisis, once said that if they’d had to make decisions faster — like in 24 hours rather than 13 days — they would likely have taken much more aggressive, much riskier actions.</p><p>So there’s reason to worry about our capacity to make wise choices quickly. And in his paper, Will lays out 10 “grand challenges” we’ll need to navigate to avoid things going wrong.</p><p>Will now believes we’re entering one of the <em>most critical periods for humanity ever</em> — with decisions made in the next few years potentially determining outcomes millions of years into the future. 
In this wide-ranging conversation, Will and host Rob Wiblin discuss:</p><ul><li>Why leading AI safety researchers now think there’s dramatically less time before AI is transformative than they’d previously thought</li><li>The three different types of intelligence explosions that occur in order</li><li>Will’s list of resulting grand challenges — including destructive technologies, space governance, concentration of power, and digital rights</li><li>How to prevent ourselves from accidentally “locking in” mediocre futures for all eternity</li><li>Ways AI could radically improve human coordination and decision making</li><li>Why we should aim for truly flourishing futures, not just avoiding extinction</li></ul><p><a href="https://80000hours.org/podcast/episodes/will-macaskill-century-in-a-decade-navigating-intelligence-explosion/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in March 2025.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Will MacAskill? (00:00:46)</li><li>Why Will now just works on AGI (00:01:02)</li><li>Will was wrong(ish) on AI timelines and hinge of history (00:04:10)</li><li>A century of history crammed into a decade (00:08:59)</li><li>Science goes super fast; our institutions don't keep up (00:15:41)</li><li>Is it good or bad for intellectual progress to 10x? (00:21:03)</li><li>An intelligence explosion is not just plausible but likely (00:22:53)</li><li>Intellectual advances outside technology are similarly important (00:28:57)</li><li>Counterarguments to intelligence explosion (00:31:31)</li><li>The three types of intelligence explosion (software, technological, industrial) (00:37:29)</li><li>The industrial intelligence explosion is the most certain and enduring (00:40:23)</li><li>Is a 100x or 1,000x speedup more likely than 10x? 
(00:51:50)</li><li>The grand superintelligence challenges (00:55:37)</li><li>Grand challenge #1: Many new destructive technologies (00:59:17)</li><li>Grand challenge #2: Seizure of power by a small group (01:06:45)</li><li>Is global lock-in really plausible? (01:08:37)</li><li>Grand challenge #3: Space governance (01:18:53)</li><li>Is space truly defence-dominant? (01:28:43)</li><li>Grand challenge #4: Morally integrating with digital beings (01:32:20)</li><li>Will we ever know if digital minds are happy? (01:41:01)</li><li>“My worry isn't that we won't know; it's that we won't care” (01:46:31)</li><li>Can we get AGI to solve all these issues as early as possible? (01:49:40)</li><li>Politicians have to learn to use AI advisors (02:02:03)</li><li>Ensuring AI makes us smarter decision-makers (02:06:10)</li><li>How listeners can speed up AI epistemic tools (02:09:38)</li><li>AI could become great at forecasting (02:13:09)</li><li>How not to lock in a bad future (02:14:37)</li><li>AI takeover might happen anyway — should we rush to load in our values? (02:25:29)</li><li>ML researchers are feverishly working to destroy their own power (02:34:37)</li><li>We should aim for more than mere survival (02:37:53)</li><li>By default the future is rubbish (02:49:04)</li><li>No easy utopia (02:56:55)</li><li>What levers matter most to utopia (03:06:32)</li><li>Bottom lines from the modelling (03:20:09)</li><li>People distrust utopianism; should they distrust this? (03:24:10)</li><li>What conditions make eventual eutopia likely? (03:28:49)</li><li>The new Forethought Centre for AI Strategy (03:37:21)</li><li>How does Will resist hopelessness? (03:50:13)</li><li>Outro (03:57:12)</li></ul><p><em>Video editing: Simon Monsour</em><br><em>Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong</em><br><em>Camera operator: Jeremy Chevillotte</em><br><em>Transcriptions and web: Katy Moore</em></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>The 20th century saw unprecedented change: nuclear weapons, satellites, the rise and fall of communism, third-wave feminism, the internet, postmodernism, game theory, genetic engineering, the Big Bang theory, quantum mechanics, widespread birth control, and more. Now imagine all of it compressed into just 10 years.</p><p>That’s the future Will MacAskill — philosopher, founding figure of effective altruism, and now researcher at <a href="https://www.forethought.org/">Forethought Research</a> — argues we need to prepare for in his paper “<a href="https://www.forethought.org/preparing-for-the-intelligence-explosion">Preparing for the intelligence explosion</a>.” Not in the distant future, but probably in 3–7 years.</p><p>The reason: AI systems are rapidly approaching human-level capability in scientific research and intellectual tasks. Once AI exceeds human abilities in AI research itself, we’ll enter a <a href="https://en.wikipedia.org/wiki/Recursive_self-improvement">recursive self-improvement</a> cycle — creating wildly more capable systems. Soon after, by improving algorithms and manufacturing chips, we’ll deploy millions, then billions, then trillions of superhuman AI scientists working 24/7 without human limitations. 
These systems will collaborate across disciplines, build on each discovery instantly, and conduct experiments at unprecedented scale and speed — compressing a century of progress into years.</p><p>Will compares this to a mediaeval king suddenly needing to upgrade from bows and arrows to nuclear weapons to deal with an ideological threat from a kingdom he’s never heard of, while simultaneously learning he’s descended from monkeys and his god doesn’t exist.</p><p>What makes this acceleration perilous is that while technology can speed up almost arbitrarily, human institutions and decision making are much more fixed.</p><p>Consider the case of nuclear weapons: in this compressed timeline, there would have been just a three-month gap between the Manhattan Project’s start and the Hiroshima bombing, and the Cuban Missile Crisis would have lasted just over a day.</p><p>Robert Kennedy Sr, who helped navigate the actual Cuban Missile Crisis, once said that if they’d had to make decisions faster — like in 24 hours rather than 13 days — they would likely have taken much more aggressive, much riskier actions.</p><p>So there’s reason to worry about our capacity to make wise choices quickly. And in his paper, Will lays out 10 “grand challenges” we’ll need to navigate to avoid things going wrong.</p><p>Will now believes we’re entering one of the <em>most critical periods for humanity ever</em> — with decisions made in the next few years potentially determining outcomes millions of years into the future. 
In this wide-ranging conversation, Will and host Rob Wiblin discuss:</p><ul><li>Why leading AI safety researchers now think there’s dramatically less time before AI is transformative than they’d previously thought</li><li>The three different types of intelligence explosions that occur in order</li><li>Will’s list of resulting grand challenges — including destructive technologies, space governance, concentration of power, and digital rights</li><li>How to prevent ourselves from accidentally “locking in” mediocre futures for all eternity</li><li>Ways AI could radically improve human coordination and decision making</li><li>Why we should aim for truly flourishing futures, not just avoiding extinction</li></ul><p><a href="https://80000hours.org/podcast/episodes/will-macaskill-century-in-a-decade-navigating-intelligence-explosion/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in March 2025.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Will MacAskill? (00:00:46)</li><li>Why Will now just works on AGI (00:01:02)</li><li>Will was wrong(ish) on AI timelines and hinge of history (00:04:10)</li><li>A century of history crammed into a decade (00:08:59)</li><li>Science goes super fast; our institutions don't keep up (00:15:41)</li><li>Is it good or bad for intellectual progress to 10x? (00:21:03)</li><li>An intelligence explosion is not just plausible but likely (00:22:53)</li><li>Intellectual advances outside technology are similarly important (00:28:57)</li><li>Counterarguments to intelligence explosion (00:31:31)</li><li>The three types of intelligence explosion (software, technological, industrial) (00:37:29)</li><li>The industrial intelligence explosion is the most certain and enduring (00:40:23)</li><li>Is a 100x or 1,000x speedup more likely than 10x? 
(00:51:50)</li><li>The grand superintelligence challenges (00:55:37)</li><li>Grand challenge #1: Many new destructive technologies (00:59:17)</li><li>Grand challenge #2: Seizure of power by a small group (01:06:45)</li><li>Is global lock-in really plausible? (01:08:37)</li><li>Grand challenge #3: Space governance (01:18:53)</li><li>Is space truly defence-dominant? (01:28:43)</li><li>Grand challenge #4: Morally integrating with digital beings (01:32:20)</li><li>Will we ever know if digital minds are happy? (01:41:01)</li><li>“My worry isn't that we won't know; it's that we won't care” (01:46:31)</li><li>Can we get AGI to solve all these issues as early as possible? (01:49:40)</li><li>Politicians have to learn to use AI advisors (02:02:03)</li><li>Ensuring AI makes us smarter decision-makers (02:06:10)</li><li>How listeners can speed up AI epistemic tools (02:09:38)</li><li>AI could become great at forecasting (02:13:09)</li><li>How not to lock in a bad future (02:14:37)</li><li>AI takeover might happen anyway — should we rush to load in our values? (02:25:29)</li><li>ML researchers are feverishly working to destroy their own power (02:34:37)</li><li>We should aim for more than mere survival (02:37:53)</li><li>By default the future is rubbish (02:49:04)</li><li>No easy utopia (02:56:55)</li><li>What levers matter most to utopia (03:06:32)</li><li>Bottom lines from the modelling (03:20:09)</li><li>People distrust utopianism; should they distrust this? (03:24:10)</li><li>What conditions make eventual eutopia likely? (03:28:49)</li><li>The new Forethought Centre for AI Strategy (03:37:21)</li><li>How does Will resist hopelessness? (03:50:13)</li><li>Outro (03:57:12)</li></ul><p><em>Video editing: Simon Monsour</em><br><em>Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong</em><br><em>Camera operator: Jeremy Chevillotte</em><br><em>Transcriptions and web: Katy Moore</em></p>]]>
      </content:encoded>
      <pubDate>Fri, 05 Jun 2026 12:00:00 -0700</pubDate>
      <author>80,000 Hours</author>
      <enclosure url="https://media.transistor.fm/2f07e869/90ce80a0.mp3" length="228175564" type="audio/mpeg"/>
      <itunes:author>80,000 Hours</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/PEeu9qRxJLS7UCdraX3h9aSByAJx94s4dBU1M5E8lo0/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xMzhj/ZDdiMjM4OWM1ZTFh/NDVlMGJlNTUxYjYw/Y2IxZS5qcGc.jpg"/>
      <itunes:duration>14257</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>The 20th century saw unprecedented change: nuclear weapons, satellites, the rise and fall of communism, third-wave feminism, the internet, postmodernism, game theory, genetic engineering, the Big Bang theory, quantum mechanics, widespread birth control, and more. Now imagine all of it compressed into just 10 years.</p><p>That’s the future Will MacAskill — philosopher, founding figure of effective altruism, and now researcher at <a href="https://www.forethought.org/">Forethought Research</a> — argues we need to prepare for in his paper “<a href="https://www.forethought.org/preparing-for-the-intelligence-explosion">Preparing for the intelligence explosion</a>.” Not in the distant future, but probably in 3–7 years.</p><p>The reason: AI systems are rapidly approaching human-level capability in scientific research and intellectual tasks. Once AI exceeds human abilities in AI research itself, we’ll enter a <a href="https://en.wikipedia.org/wiki/Recursive_self-improvement">recursive self-improvement</a> cycle — creating wildly more capable systems. Soon after, by improving algorithms and manufacturing chips, we’ll deploy millions, then billions, then trillions of superhuman AI scientists working 24/7 without human limitations. 
These systems will collaborate across disciplines, build on each discovery instantly, and conduct experiments at unprecedented scale and speed — compressing a century of progress into years.</p><p>Will compares this to a mediaeval king suddenly needing to upgrade from bows and arrows to nuclear weapons to deal with an ideological threat from a kingdom he’s never heard of, while simultaneously learning he’s descended from monkeys and his god doesn’t exist.</p><p>What makes this acceleration perilous is that while technology can speed up almost arbitrarily, human institutions and decision making are much more fixed.</p><p>Consider the case of nuclear weapons: in this compressed timeline, there would have been just a three-month gap between the Manhattan Project’s start and the Hiroshima bombing, and the Cuban Missile Crisis would have lasted just over a day.</p><p>Robert Kennedy Sr, who helped navigate the actual Cuban Missile Crisis, once said that if they’d had to make decisions faster — like in 24 hours rather than 13 days — they would likely have taken much more aggressive, much riskier actions.</p><p>So there’s reason to worry about our capacity to make wise choices quickly. And in his paper, Will lays out 10 “grand challenges” we’ll need to navigate to avoid things going wrong.</p><p>Will now believes we’re entering one of the <em>most critical periods for humanity ever</em> — with decisions made in the next few years potentially determining outcomes millions of years into the future. 
In this wide-ranging conversation, Will and host Rob Wiblin discuss:</p><ul><li>Why leading AI safety researchers now think there’s dramatically less time before AI is transformative than they’d previously thought</li><li>The three different types of intelligence explosions that occur in order</li><li>Will’s list of resulting grand challenges — including destructive technologies, space governance, concentration of power, and digital rights</li><li>How to prevent ourselves from accidentally “locking in” mediocre futures for all eternity</li><li>Ways AI could radically improve human coordination and decision making</li><li>Why we should aim for truly flourishing futures, not just avoiding extinction</li></ul><p><a href="https://80000hours.org/podcast/episodes/will-macaskill-century-in-a-decade-navigating-intelligence-explosion/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in March 2025.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Will MacAskill? (00:00:46)</li><li>Why Will now just works on AGI (00:01:02)</li><li>Will was wrong(ish) on AI timelines and hinge of history (00:04:10)</li><li>A century of history crammed into a decade (00:08:59)</li><li>Science goes super fast; our institutions don't keep up (00:15:41)</li><li>Is it good or bad for intellectual progress to 10x? (00:21:03)</li><li>An intelligence explosion is not just plausible but likely (00:22:53)</li><li>Intellectual advances outside technology are similarly important (00:28:57)</li><li>Counterarguments to intelligence explosion (00:31:31)</li><li>The three types of intelligence explosion (software, technological, industrial) (00:37:29)</li><li>The industrial intelligence explosion is the most certain and enduring (00:40:23)</li><li>Is a 100x or 1,000x speedup more likely than 10x? 
(00:51:50)</li><li>The grand superintelligence challenges (00:55:37)</li><li>Grand challenge #1: Many new destructive technologies (00:59:17)</li><li>Grand challenge #2: Seizure of power by a small group (01:06:45)</li><li>Is global lock-in really plausible? (01:08:37)</li><li>Grand challenge #3: Space governance (01:18:53)</li><li>Is space truly defence-dominant? (01:28:43)</li><li>Grand challenge #4: Morally integrating with digital beings (01:32:20)</li><li>Will we ever know if digital minds are happy? (01:41:01)</li><li>“My worry isn't that we won't know; it's that we won't care” (01:46:31)</li><li>Can we get AGI to solve all these issues as early as possible? (01:49:40)</li><li>Politicians have to learn to use AI advisors (02:02:03)</li><li>Ensuring AI makes us smarter decision-makers (02:06:10)</li><li>How listeners can speed up AI epistemic tools (02:09:38)</li><li>AI could become great at forecasting (02:13:09)</li><li>How not to lock in a bad future (02:14:37)</li><li>AI takeover might happen anyway — should we rush to load in our values? (02:25:29)</li><li>ML researchers are feverishly working to destroy their own power (02:34:37)</li><li>We should aim for more than mere survival (02:37:53)</li><li>By default the future is rubbish (02:49:04)</li><li>No easy utopia (02:56:55)</li><li>What levers matter most to utopia (03:06:32)</li><li>Bottom lines from the modelling (03:20:09)</li><li>People distrust utopianism; should they distrust this? (03:24:10)</li><li>What conditions make eventual eutopia likely? (03:28:49)</li><li>The new Forethought Centre for AI Strategy (03:37:21)</li><li>How does Will resist hopelessness? (03:50:13)</li><li>Outro (03:57:12)</li></ul><p><em>Video editing: Simon Monsour</em><br><em>Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong</em><br><em>Camera operator: Jeremy Chevillotte</em><br><em>Transcriptions and web: Katy Moore</em></p>]]>
      </itunes:summary>
      <itunes:keywords>artificial intelligence, effective altruism, existential risk, longtermism, AGI</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:transcript url="https://share.transistor.fm/s/2f07e869/transcript.txt" type="text/plain"/>
      <podcast:chapters url="https://share.transistor.fm/s/2f07e869/chapters.json" type="application/json+chapters"/>
    </item>
    <item>
      <title>Two: Ajeya Cotra on accidentally teaching AI models to deceive us</title>
      <itunes:title>Two: Ajeya Cotra on accidentally teaching AI models to deceive us</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">98e49ee5-fa12-4e4c-9e2a-50cb3a27548f</guid>
      <link>https://80000hours.org/podcast/episodes/ajeya-cotra-accidentally-teaching-ai-to-deceive-us/</link>
      <description>
        <![CDATA[<p>Imagine you’re an orphaned eight-year-old whose parents left you a $1 trillion company, with no trusted adult to guide you. You have to hire a smart adult to run that company, guide your life the way a parent would, and administer your vast wealth. You have to hire them based on a work trial or interview that you design. You don’t get to see any resumes or do reference checks. And because you’re so rich, tonnes of people apply — for all sorts of reasons.</p><p>Ajeya Cotra argues this peculiar setup resembles the situation humanity finds itself in as we train very general and very capable AI models using current deep learning methods. Ajeya was a senior research analyst at <a href="https://coefficientgiving.org/">Coefficient Giving</a> at the time of this interview, and she now works at <a href="https://metr.org/">METR</a> (Model Evaluation &amp; Threat Research). </p><p>As she explains, this eight-year-old faces a challenging problem. In the candidate pool there are likely some truly nice people, who sincerely want to help and make decisions that are in your interest. But there are probably other characters too — like people who will pretend to care while you’re monitoring them, but intend to exploit the job to enrich themselves as soon as they think they can get away with it.</p><p>Like a child trying to judge adults, at some point humans will need to judge the trustworthiness and reliability of machine learning models that are as goal-oriented as people, and greatly outclass them in knowledge, experience, breadth, and speed. Tricky!</p><p>Can’t we rely on models' performance during training tasks to guide us? Ajeya worries this won’t work. The trouble is that three different sorts of models will all produce the same output during training, but could behave very differently once deployed in a setting that allows their true colours to come through. 
She describes three such motivational archetypes:</p><ul><li>Saints — models that care about doing what we really want</li><li>Sycophants — models that just want us to say they’ve done a good job, even if they get that praise by taking actions they know we wouldn’t want them to</li><li>Schemers — models that don’t care about us or our interests at all, who are just pleasing us so long as that serves their own agenda</li></ul><p>In principle, a machine learning training process based on <a href="https://en.wikipedia.org/wiki/Reinforcement_learning">reinforcement learning</a> could spit out any of these three attitudes, because all three would perform roughly equally well on the tests we give them, and ‘performs well on tests’ is how these models are selected.</p><p>But while that’s true in principle, maybe it’s not something that could plausibly happen in the real world. After all, if we train an agent based on positive reinforcement for accomplishing X, shouldn’t the training process produce a model that just does X and doesn’t have complex thoughts and goals beyond that?</p><p>According to Ajeya, this is one thing we don’t know, and should be trying to test empirically as these models get more capable. For reasons she explains in the interview, the Sycophant or Schemer models may in fact be simpler and easier for the learning algorithm to creep towards than their Saint counterparts.</p><p>But there are also ways we could end up actively selecting for motivations that we don’t want.</p><p>For example, let’s say you train an agentic AI model to run a small business, selecting for behaviours that make money and measuring success by the balance in its bank account. During training, a highly capable model may experiment with the strategy of tricking its trainers into thinking it has made money legitimately when it hasn’t. Maybe instead it steals some money and covers that up. 
This isn’t a hypothetical worry: models often come up with creative — sometimes undesirable — approaches during training that their developers didn’t anticipate.</p><p>If such deception isn’t caught, a model like this may be rated as particularly successful, and the training process will reinforce its tendency to engage in deceptive behaviour. A model that could deceive without being caught would, in effect, have a competitive advantage.</p><p>What if deception is picked up, but just some of the time? Would the model then learn that honesty is the best policy? Perhaps. But it might learn a different lesson instead: that deception does pay, as long as it’s done selectively and carefully enough to avoid detection. Would that actually happen? We don’t yet know, but it’s possible.</p><p>In this conversation, Ajeya and host Rob Wiblin discuss the above, as well as:</p><ul><li>How to predict the motivations a neural network will develop through training</li><li>Whether AIs in training will functionally understand that they’re AIs being trained</li><li>Stories of AI misalignment that Ajeya doesn’t buy</li><li>Analogies for AI, from octopuses to aliens to can openers</li><li>Why it’s smarter to have separate ‘planning AIs’ and ‘doing AIs’</li><li>The benefits of only following through on AI-generated plans that make sense to human beings</li><li>Which approaches for fixing alignment problems Ajeya is most excited about, and which she thinks are overrated</li><li>How one might demo actually scary AI failure mechanisms</li></ul><p><a href="https://80000hours.org/podcast/episodes/ajeya-cotra-accidentally-teaching-ai-to-deceive-us/">Learn more and read the full transcript</a> on the 80,000 Hours website. 
</p><p><em>This episode was originally released in May 2023, but we still think it’s one of the best episodes we have at explaining core risks from power-seeking AI.</em></p><p>Chapters:</p><ul><li>Rob’s intro (00:00:00)</li><li>The interview begins (00:02:38)</li><li>How Ajeya’s views have changed since 2020 (00:05:09)</li><li>Are neural networks more like a sped-up version of evolution, or a slower version of human learning? (00:17:42)</li><li>Situational awareness (00:26:10)</li><li>Misalignment stories Ajeya doesn't buy (00:42:03)</li><li>The orphan heir with a trillion-dollar fortune (00:59:14)</li><li>Saints, Sycophants, and Schemers (01:03:41)</li><li>Ways to train safer AI systems (01:23:20)</li><li>Aliens and other analogies (01:38:22)</li><li>Moral patienthood (01:53:21)</li><li>ARC Evaluations (01:55:35)</li><li>Interpretability research (02:09:25)</li><li>Rewarding models based on how good and sensible their plans seem to us (02:17:48)</li><li>Overrated approaches (02:25:49)</li><li>Demos of actually scary alignment failures (02:30:57)</li><li>Skills to develop for doing useful work (02:37:23)</li><li>Rob’s outro (02:47:24)</li></ul><p><em>Producer: Keiran Harris</em><br><em>Audio mastering: Ryan Kessler and Ben Cordell</em><br><em>Transcriptions: Katy Moore</em></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Imagine you’re an orphaned eight-year-old whose parents left you a $1 trillion company, with no trusted adult to guide you. You have to hire a smart adult to run that company, guide your life the way a parent would, and administer your vast wealth. You have to hire them based on a work trial or interview that you design. You don’t get to see any resumes or do reference checks. And because you’re so rich, tonnes of people apply — for all sorts of reasons.</p><p>Ajeya Cotra argues this peculiar setup resembles the situation humanity finds itself in as we train very general and very capable AI models using current deep learning methods. Ajeya was a senior research analyst at <a href="https://coefficientgiving.org/">Coefficient Giving</a> at the time of this interview, and she now works at <a href="https://metr.org/">METR</a> (Model Evaluation &amp; Threat Research). </p><p>As she explains, this eight-year-old faces a challenging problem. In the candidate pool there are likely some truly nice people, who sincerely want to help and make decisions that are in your interest. But there are probably other characters too — like people who will pretend to care while you’re monitoring them, but intend to exploit the job to enrich themselves as soon as they think they can get away with it.</p><p>Like a child trying to judge adults, at some point humans will need to judge the trustworthiness and reliability of machine learning models that are as goal-oriented as people, and greatly outclass them in knowledge, experience, breadth, and speed. Tricky!</p><p>Can’t we rely on models' performance during training tasks to guide us? Ajeya worries this won’t work. The trouble is that three different sorts of models will all produce the same output during training, but could behave very differently once deployed in a setting that allows their true colours to come through. 
She describes three such motivational archetypes:</p><ul><li>Saints — models that care about doing what we really want</li><li>Sycophants — models that just want us to say they’ve done a good job, even if they get that praise by taking actions they know we wouldn’t want them to</li><li>Schemers — models that don’t care about us or our interests at all, who are just pleasing us so long as that serves their own agenda</li></ul><p>In principle, a machine learning training process based on <a href="https://en.wikipedia.org/wiki/Reinforcement_learning">reinforcement learning</a> could spit out any of these three attitudes, because all three would perform roughly equally well on the tests we give them, and ‘performs well on tests’ is how these models are selected.</p><p>But while that’s true in principle, maybe it’s not something that could plausibly happen in the real world. After all, if we train an agent based on positive reinforcement for accomplishing X, shouldn’t the training process produce a model that just does X and doesn’t have complex thoughts and goals beyond that?</p><p>According to Ajeya, this is one thing we don’t know, and should be trying to test empirically as these models get more capable. For reasons she explains in the interview, the Sycophant or Schemer models may in fact be simpler and easier for the learning algorithm to creep towards than their Saint counterparts.</p><p>But there are also ways we could end up actively selecting for motivations that we don’t want.</p><p>For example, let’s say you train an agentic AI model to run a small business, selecting for behaviours that make money and measuring success by the balance in its bank account. During training, a highly capable model may experiment with the strategy of tricking its trainers into thinking it has made money legitimately when it hasn’t. Maybe instead it steals some money and covers that up. 
This isn’t a hypothetical worry: models often come up with creative — sometimes undesirable — approaches during training that their developers didn’t anticipate.</p><p>If such deception isn’t caught, a model like this may be rated as particularly successful, and the training process will reinforce its tendency to engage in deceptive behaviour. A model that could deceive without being caught would, in effect, have a competitive advantage.</p><p>What if deception is picked up, but just some of the time? Would the model then learn that honesty is the best policy? Perhaps. But it might learn a different lesson instead: that deception does pay, as long as it’s done selectively and carefully enough to avoid detection. Would that actually happen? We don’t yet know, but it’s possible.</p><p>In this conversation, Ajeya and host Rob Wiblin discuss the above, as well as:</p><ul><li>How to predict the motivations a neural network will develop through training</li><li>Whether AIs in training will functionally understand that they’re AIs being trained</li><li>Stories of AI misalignment that Ajeya doesn’t buy</li><li>Analogies for AI, from octopuses to aliens to can openers</li><li>Why it’s smarter to have separate ‘planning AIs’ and ‘doing AIs’</li><li>The benefits of only following through on AI-generated plans that make sense to human beings</li><li>Which approaches for fixing alignment problems Ajeya is most excited about, and which she thinks are overrated</li><li>How one might demo actually scary AI failure mechanisms</li></ul><p><a href="https://80000hours.org/podcast/episodes/ajeya-cotra-accidentally-teaching-ai-to-deceive-us/">Learn more and read the full transcript</a> on the 80,000 Hours website. 
</p><p><em>This episode was originally released in May 2023, but we still think it’s one of the best episodes we have for explaining core risks from power-seeking AI.</em></p><p>Chapters:</p><ul><li>Rob’s intro (00:00:00)</li><li>The interview begins (00:02:38)</li><li>How Ajeya’s views have changed since 2020 (00:05:09)</li><li>Are neural networks more like a sped-up version of evolution, or a slower version of human learning? (00:17:42)</li><li>Situational awareness (00:26:10)</li><li>Misalignment stories Ajeya doesn’t buy (00:42:03)</li><li>The orphan heir with a trillion-dollar fortune (00:59:14)</li><li>Saints, Sycophants, and Schemers (01:03:41)</li><li>Ways to train safer AI systems (01:23:20)</li><li>Aliens and other analogies (01:38:22)</li><li>Moral patienthood (01:53:21)</li><li>ARC Evaluations (01:55:35)</li><li>Interpretability research (02:09:25)</li><li>Rewarding models based on how good and sensible their plans seem to us (02:17:48)</li><li>Overrated approaches (02:25:49)</li><li>Demos of actually scary alignment failures (02:30:57)</li><li>Skills to develop for doing useful work (02:37:23)</li><li>Rob’s outro (02:47:24)</li></ul><p><em>Producer: Keiran Harris</em><br><em>Audio mastering: Ryan Kessler and Ben Cordell</em><br><em>Transcriptions: Katy Moore</em></p>]]>
      </content:encoded>
      <pubDate>Fri, 05 Jun 2026 11:00:00 -0700</pubDate>
      <author>80,000 Hours</author>
      <enclosure url="https://media.transistor.fm/15fae3e1/4bd62742.mp3" length="81500392" type="audio/mpeg"/>
      <itunes:author>80,000 Hours</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/4b8JGgKolgrWu3iAO93hhrT26J8_juF9Qm-9Uy5umD8/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS82ZjY4/NmJiYjdiMWNiNDMz/NjRjODkyZTI4Nzgw/ZGFmNy5qcGc.jpg"/>
      <itunes:duration>10180</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Imagine you’re an orphaned eight-year-old whose parents left you a $1 trillion company, with no trusted adult to guide you. You have to hire a smart adult to run that company, guide your life the way a parent would, and administer your vast wealth. You have to hire them based on a work trial or interview that you design. You don’t get to see any resumes or do reference checks. And because you’re so rich, tonnes of people apply — for all sorts of reasons.</p><p>Ajeya Cotra argues this peculiar setup resembles the situation humanity finds itself in as we train very general and very capable AI models using current deep learning methods. Ajeya was a senior research analyst at <a href="https://coefficientgiving.org/">Coefficient Giving</a> at the time of this interview, and she now works at <a href="https://metr.org/">METR</a> (Model Evaluation &amp; Threat Research). </p><p>As she explains, this eight-year-old faces a challenging problem. In the candidate pool there are likely some truly nice people, who sincerely want to help and make decisions that are in your interest. But there are probably other characters too — like people who will pretend to care while you’re monitoring them, but intend to exploit the job to enrich themselves as soon as they think they can get away with it.</p><p>Like a child trying to judge adults, at some point humans will need to judge the trustworthiness and reliability of machine learning models that are as goal-oriented as people, and greatly outclass them in knowledge, experience, breadth, and speed. Tricky!</p><p>Can’t we rely on models' performance during training tasks to guide us? Ajeya worries this won’t work. The trouble is that three different sorts of models will all produce the same output during training, but could behave very differently once deployed in a setting that allows their true colours to come through. 
She describes three such motivational archetypes:</p><ul><li>Saints — models that care about doing what we really want</li><li>Sycophants — models that just want us to say they’ve done a good job, even if they get that praise by taking actions they know we wouldn’t want them to</li><li>Schemers — models that don’t care about us or our interests at all, who are just pleasing us so long as that serves their own agenda</li></ul><p>In principle, a machine learning training process based on <a href="https://en.wikipedia.org/wiki/Reinforcement_learning">reinforcement learning</a> could spit out any of these three attitudes, because all three would perform roughly equally well on the tests we give them, and ‘performs well on tests’ is how these models are selected.</p><p>But while that’s true in principle, maybe it’s not something that could plausibly happen in the real world. After all, if we train an agent based on positive reinforcement for accomplishing X, shouldn’t the training process produce a model that just does X and doesn’t have complex thoughts and goals beyond that?</p><p>According to Ajeya, this is one thing we don’t know, and should be trying to test empirically as these models get more capable. For reasons she explains in the interview, the Sycophant or Schemer models may in fact be simpler and easier for the learning algorithm to creep towards than their Saint counterparts.</p><p>But there are also ways we could end up actively selecting for motivations that we don’t want.</p><p>For example, let’s say you train an agentic AI model to run a small business, selecting for behaviours that make money and measuring success by the balance in its bank account. During training, a highly capable model may experiment with the strategy of tricking its trainers into thinking it has made money legitimately when it hasn’t. Maybe instead it steals some money and covers that up. 
This isn’t a hypothetical worry: models often come up with creative — sometimes undesirable — approaches during training that their developers didn’t anticipate.</p><p>If such deception isn’t caught, a model like this may be rated as particularly successful, and the training process will reinforce its tendency to engage in deceptive behaviour. A model that could deceive without being caught would, in effect, have a competitive advantage.</p><p>What if deception is picked up, but just some of the time? Would the model then learn that honesty is the best policy? Perhaps. But it might learn a different lesson instead: that deception does pay, as long as it’s done selectively and carefully enough to avoid detection. Would that actually happen? We don’t yet know, but it’s possible.</p><p>In this conversation, Ajeya and host Rob Wiblin discuss the above, as well as:</p><ul><li>How to predict the motivations a neural network will develop through training</li><li>Whether AIs in training will functionally understand that they’re AIs being trained</li><li>Stories of AI misalignment that Ajeya doesn’t buy</li><li>Analogies for AI, from octopuses to aliens to can openers</li><li>Why it’s smarter to have separate ‘planning AIs’ and ‘doing AIs’</li><li>The benefits of only following through on AI-generated plans that make sense to human beings</li><li>Which approaches for fixing alignment problems Ajeya is most excited about, and which she thinks are overrated</li><li>How one might demo actually scary AI failure mechanisms</li></ul><p><a href="https://80000hours.org/podcast/episodes/ajeya-cotra-accidentally-teaching-ai-to-deceive-us/">Learn more and read the full transcript</a> on the 80,000 Hours website. 
</p><p><em>This episode was originally released in May 2023, but we still think it’s one of the best episodes we have for explaining core risks from power-seeking AI.</em></p><p>Chapters:</p><ul><li>Rob’s intro (00:00:00)</li><li>The interview begins (00:02:38)</li><li>How Ajeya’s views have changed since 2020 (00:05:09)</li><li>Are neural networks more like a sped-up version of evolution, or a slower version of human learning? (00:17:42)</li><li>Situational awareness (00:26:10)</li><li>Misalignment stories Ajeya doesn’t buy (00:42:03)</li><li>The orphan heir with a trillion-dollar fortune (00:59:14)</li><li>Saints, Sycophants, and Schemers (01:03:41)</li><li>Ways to train safer AI systems (01:23:20)</li><li>Aliens and other analogies (01:38:22)</li><li>Moral patienthood (01:53:21)</li><li>ARC Evaluations (01:55:35)</li><li>Interpretability research (02:09:25)</li><li>Rewarding models based on how good and sensible their plans seem to us (02:17:48)</li><li>Overrated approaches (02:25:49)</li><li>Demos of actually scary alignment failures (02:30:57)</li><li>Skills to develop for doing useful work (02:37:23)</li><li>Rob’s outro (02:47:24)</li></ul><p><em>Producer: Keiran Harris</em><br><em>Audio mastering: Ryan Kessler and Ben Cordell</em><br><em>Transcriptions: Katy Moore</em></p>]]>
      </itunes:summary>
      <itunes:keywords>artificial intelligence, effective altruism, existential risk, longtermism, AGI</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:transcript url="https://share.transistor.fm/s/15fae3e1/transcript.txt" type="text/plain"/>
      <podcast:chapters url="https://share.transistor.fm/s/15fae3e1/chapters.json" type="application/json+chapters"/>
    </item>
    <item>
      <title>Three: Carl Shulman on the economy and national security after AGI</title>
      <itunes:title>Three: Carl Shulman on the economy and national security after AGI</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">20c88f66-74bd-4529-a596-2efdd17ce6ff</guid>
      <link>https://80000hours.org/podcast/episodes/carl-shulman-economy-agi/</link>
      <description>
        <![CDATA[<p>The human brain does what it does with a shockingly low energy supply: just 20 watts — a fraction of a cent’s worth of electricity per hour. What would happen if AI technology merely matched what evolution already managed, and could accomplish the work of top human professionals given a 20-watt power supply?</p><p>Many people have <em>sort of</em> considered this hypothetical, but perhaps nobody has followed through and considered all the implications as much as Carl Shulman. Behind the scenes, his work has greatly influenced how leaders in artificial general intelligence (AGI) picture the world they’re creating.</p><p>Carl simply follows the logic to its natural conclusions, leading to a world where:</p><ul><li>$0.01 of electricity can be turned into medical advice, company management, or scientific research that would cost hundreds of dollars today, resulting in a scramble to manufacture chips and apply them to the most lucrative forms of intellectual labour</li><li>Given their incredible hourly salaries, the supply of outstanding AI researchers quickly goes from 10,000 to 10 million or more, enormously accelerating progress in the field</li><li>Companies operated entirely by AIs are much faster and more cost-effective than those that lean on humans for decision-making, and the latter are progressively driven out of business</li><li>The technical challenges of controlling robots are rapidly overcome — leading to strong, fast, precise, and tireless robot workers able to accomplish any physical work the economy requires, and a rush to build billions of them and cash in</li><li>Overnight, the number of humans becomes irrelevant to economic growth, which is now driven by how quickly the entire machine economy can replicate its components. 
Given how quickly complex biological systems can reproduce — some in a matter of days — a doubling every few months may be a conservative estimate</li><li>Any country that delays participating in this economic explosion risks being outpaced and ultimately disempowered by rivals whose economies grow to be 10-fold, 100-fold, and then 1,000-fold larger than its own</li></ul><p>As the economy grows, each person could afford the equivalent of a team of hundreds of machine ‘people’ to help them with every aspect of their lives.</p><p>And with growth rates this high, it doesn’t take long to run up against Earth’s physical limits — the toughest to engineer around being Earth’s ability to release waste heat. If this machine economy and its insatiable demand for power generates more heat than the Earth radiates into space, the planet will rapidly heat up and become uninhabitable for humans and other animals.</p><p>This eventually creates pressure to move economic activity off-planet. There’s little need for computer chips to be on Earth, and solar energy and minerals are more abundant in space. 
So you could develop populations of billions of scientific researchers operating on computer chips orbiting in space, sending the results of their work — such as drug designs — back to Earth.</p><p>These are just some of the wild implications if AGI could accomplish everything the most productive humans can, using the same energy supply.</p><p>In this interview with host Rob Wiblin, Carl explains the above, and Rob pushes back on whether that’s realistic or just a cool story:</p><ul><li>If we’re heading towards the above, how come economic growth remains slow now and isn’t really increasing?</li><li>Why have computers and computer chips had so little effect on economic productivity so far?</li><li>Are self-replicating biological systems a good comparison for self-replicating machine systems?</li><li>Isn’t this just too crazy and weird to be plausible?</li><li>What bottlenecks would be encountered in supplying energy and natural resources to this growing economy?</li><li>Might there not be severely declining returns to bigger ‘brains’ and more training?</li><li>Wouldn’t humanity get scared and pull the brakes if such a transformation kicked off?</li><li>If this is right, how come economists don’t agree?</li></ul><p>For the last section of the episode, Carl addresses the moral status of machine minds themselves. Would they be conscious or otherwise have a claim to moral rights? And how might humans and machines coexist with neither side dominating or exploiting the other?</p><p><a href="https://80000hours.org/podcast/episodes/carl-shulman-economy-agi/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode is the first part of Rob’s marathon interview with Carl Shulman in 2024. The second episode is on </em><a href="https://80000hours.org/podcast/episodes/carl-shulman-society-agi/"><em>government and society after AGI</em></a><em>, and you can listen to them in either order. 
</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Rob’s intro (00:01:00)</li><li>The interview begins (00:04:43)</li><li>Transitioning to a world where AI systems do almost all the work (00:05:20)</li><li>Economics after an AI explosion (00:14:24)</li><li>Objection: Shouldn’t we be seeing economic growth rates increasing today? (00:59:11)</li><li>Objection: Speed of doubling time (01:07:32)</li><li>Objection: Declining returns to increases in intelligence? (01:11:58)</li><li>Objection: Physical transformation of the environment (01:17:37)</li><li>Objection: Should we expect an increased demand for safety and security? (01:29:13)</li><li>Objection: “This sounds completely whack” (01:36:09)</li><li>Income and wealth distribution (01:48:01)</li><li>Economists and the intelligence explosion (02:13:30)</li><li>Baumol effect arguments (02:19:11)</li><li>Denying that robots can exist (02:27:17)</li><li>Semiconductor manufacturing (02:32:06)</li><li>Classic economic growth models (02:36:10)</li><li>Robot nannies (02:48:25)</li><li>Slow integration of decision-making and authority power (02:57:38)</li><li>Economists’ mistaken heuristics (03:01:06)</li><li>Moral status of AIs (03:11:44)</li><li>Rob’s outro (04:11:46)</li></ul><p><em>Producer and editor: Keiran Harris</em><br><em>Audio engineering lead: Ben Cordell</em><br><em>Technical editing: Simon Monsour, Milo McGuire, and Dominic Armstrong</em><br><em>Transcriptions: Katy Moore</em></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>The human brain does what it does with a shockingly low energy supply: just 20 watts — a fraction of a cent’s worth of electricity per hour. What would happen if AI technology merely matched what evolution already managed, and could accomplish the work of top human professionals given a 20-watt power supply?</p><p>Many people have <em>sort of</em> considered this hypothetical, but perhaps nobody has followed through and considered all the implications as much as Carl Shulman. Behind the scenes, his work has greatly influenced how leaders in artificial general intelligence (AGI) picture the world they’re creating.</p><p>Carl simply follows the logic to its natural conclusions, leading to a world where:</p><ul><li>$0.01 of electricity can be turned into medical advice, company management, or scientific research that would cost hundreds of dollars today, resulting in a scramble to manufacture chips and apply them to the most lucrative forms of intellectual labour</li><li>Given their incredible hourly salaries, the supply of outstanding AI researchers quickly goes from 10,000 to 10 million or more, enormously accelerating progress in the field</li><li>Companies operated entirely by AIs are much faster and more cost-effective than those that lean on humans for decision-making, and the latter are progressively driven out of business</li><li>The technical challenges of controlling robots are rapidly overcome — leading to strong, fast, precise, and tireless robot workers able to accomplish any physical work the economy requires, and a rush to build billions of them and cash in</li><li>Overnight, the number of humans becomes irrelevant to economic growth, which is now driven by how quickly the entire machine economy can replicate its components. 
Given how quickly complex biological systems can reproduce — some in a matter of days — a doubling every few months may be a conservative estimate</li><li>Any country that delays participating in this economic explosion risks being outpaced and ultimately disempowered by rivals whose economies grow to be 10-fold, 100-fold, and then 1,000-fold larger than its own</li></ul><p>As the economy grows, each person could afford the equivalent of a team of hundreds of machine ‘people’ to help them with every aspect of their lives.</p><p>And with growth rates this high, it doesn’t take long to run up against Earth’s physical limits — the toughest to engineer around being Earth’s ability to release waste heat. If this machine economy and its insatiable demand for power generates more heat than the Earth radiates into space, the planet will rapidly heat up and become uninhabitable for humans and other animals.</p><p>This eventually creates pressure to move economic activity off-planet. There’s little need for computer chips to be on Earth, and solar energy and minerals are more abundant in space. 
So you could develop populations of billions of scientific researchers operating on computer chips orbiting in space, sending the results of their work — such as drug designs — back to Earth.</p><p>These are just some of the wild implications if AGI could accomplish everything the most productive humans can, using the same energy supply.</p><p>In this interview with host Rob Wiblin, Carl explains the above, and Rob pushes back on whether that’s realistic or just a cool story:</p><ul><li>If we’re heading towards the above, how come economic growth remains slow now and isn’t really increasing?</li><li>Why have computers and computer chips had so little effect on economic productivity so far?</li><li>Are self-replicating biological systems a good comparison for self-replicating machine systems?</li><li>Isn’t this just too crazy and weird to be plausible?</li><li>What bottlenecks would be encountered in supplying energy and natural resources to this growing economy?</li><li>Might there not be severely declining returns to bigger ‘brains’ and more training?</li><li>Wouldn’t humanity get scared and pull the brakes if such a transformation kicked off?</li><li>If this is right, how come economists don’t agree?</li></ul><p>For the last section of the episode, Carl addresses the moral status of machine minds themselves. Would they be conscious or otherwise have a claim to moral rights? And how might humans and machines coexist with neither side dominating or exploiting the other?</p><p><a href="https://80000hours.org/podcast/episodes/carl-shulman-economy-agi/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode is the first part of Rob’s marathon interview with Carl Shulman in 2024. The second episode is on </em><a href="https://80000hours.org/podcast/episodes/carl-shulman-society-agi/"><em>government and society after AGI</em></a><em>, and you can listen to them in either order. 
</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Rob’s intro (00:01:00)</li><li>The interview begins (00:04:43)</li><li>Transitioning to a world where AI systems do almost all the work (00:05:20)</li><li>Economics after an AI explosion (00:14:24)</li><li>Objection: Shouldn’t we be seeing economic growth rates increasing today? (00:59:11)</li><li>Objection: Speed of doubling time (01:07:32)</li><li>Objection: Declining returns to increases in intelligence? (01:11:58)</li><li>Objection: Physical transformation of the environment (01:17:37)</li><li>Objection: Should we expect an increased demand for safety and security? (01:29:13)</li><li>Objection: “This sounds completely whack” (01:36:09)</li><li>Income and wealth distribution (01:48:01)</li><li>Economists and the intelligence explosion (02:13:30)</li><li>Baumol effect arguments (02:19:11)</li><li>Denying that robots can exist (02:27:17)</li><li>Semiconductor manufacturing (02:32:06)</li><li>Classic economic growth models (02:36:10)</li><li>Robot nannies (02:48:25)</li><li>Slow integration of decision-making and authority power (02:57:38)</li><li>Economists’ mistaken heuristics (03:01:06)</li><li>Moral status of AIs (03:11:44)</li><li>Rob’s outro (04:11:46)</li></ul><p><em>Producer and editor: Keiran Harris</em><br><em>Audio engineering lead: Ben Cordell</em><br><em>Technical editing: Simon Monsour, Milo McGuire, and Dominic Armstrong</em><br><em>Transcriptions: Katy Moore</em></p>]]>
      </content:encoded>
      <pubDate>Fri, 05 Jun 2026 10:00:00 -0700</pubDate>
      <author>80,000 Hours</author>
      <enclosure url="https://media.transistor.fm/da36c78f/0c30b831.mp3" length="244802235" type="audio/mpeg"/>
      <itunes:author>80,000 Hours</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/KJ6HJPERe3Xa4K5qBQ70eTKPyXWtWZfzTmzXyNjFS24/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lOTM1/ZDBjMDRkMmI4ZDIx/YjA0MzUwYmM0MGI3/ZDRlZC5qcGc.jpg"/>
      <itunes:duration>15298</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>The human brain does what it does with a shockingly low energy supply: just 20 watts — a fraction of a cent’s worth of electricity per hour. What would happen if AI technology merely matched what evolution already managed, and could accomplish the work of top human professionals given a 20-watt power supply?</p><p>Many people have <em>sort of</em> considered this hypothetical, but perhaps nobody has followed through and considered all the implications as much as Carl Shulman. Behind the scenes, his work has greatly influenced how leaders in artificial general intelligence (AGI) picture the world they’re creating.</p><p>Carl simply follows the logic to its natural conclusions, leading to a world where:</p><ul><li>$0.01 of electricity can be turned into medical advice, company management, or scientific research that would cost hundreds of dollars today, resulting in a scramble to manufacture chips and apply them to the most lucrative forms of intellectual labour</li><li>Given their incredible hourly salaries, the supply of outstanding AI researchers quickly goes from 10,000 to 10 million or more, enormously accelerating progress in the field</li><li>Companies operated entirely by AIs are much faster and more cost-effective than those that lean on humans for decision-making, and the latter are progressively driven out of business</li><li>The technical challenges of controlling robots are rapidly overcome — leading to strong, fast, precise, and tireless robot workers able to accomplish any physical work the economy requires, and a rush to build billions of them and cash in</li><li>Overnight, the number of humans becomes irrelevant to economic growth, which is now driven by how quickly the entire machine economy can replicate its components. 
Given how quickly complex biological systems can reproduce — some in a matter of days — a doubling every few months may be a conservative estimate</li><li>Any country that delays participating in this economic explosion risks being outpaced and ultimately disempowered by rivals whose economies grow to be 10-fold, 100-fold, and then 1,000-fold larger than its own</li></ul><p>As the economy grows, each person could afford the equivalent of a team of hundreds of machine ‘people’ to help them with every aspect of their lives.</p><p>And with growth rates this high, it doesn’t take long to run up against Earth’s physical limits — the toughest to engineer around being Earth’s ability to release waste heat. If this machine economy and its insatiable demand for power generates more heat than the Earth radiates into space, the planet will rapidly heat up and become uninhabitable for humans and other animals.</p><p>This eventually creates pressure to move economic activity off-planet. There’s little need for computer chips to be on Earth, and solar energy and minerals are more abundant in space. 
So you could develop populations of billions of scientific researchers operating on computer chips orbiting in space, sending the results of their work — such as drug designs — back to Earth.</p><p>These are just some of the wild implications if AGI could accomplish everything the most productive humans can, using the same energy supply.</p><p>In this interview with host Rob Wiblin, Carl explains the above, and Rob pushes back on whether that’s realistic or just a cool story:</p><ul><li>If we’re heading towards the above, how come economic growth remains slow now and isn’t really increasing?</li><li>Why have computers and computer chips had so little effect on economic productivity so far?</li><li>Are self-replicating biological systems a good comparison for self-replicating machine systems?</li><li>Isn’t this just too crazy and weird to be plausible?</li><li>What bottlenecks would be encountered in supplying energy and natural resources to this growing economy?</li><li>Might there not be severely declining returns to bigger ‘brains’ and more training?</li><li>Wouldn’t humanity get scared and pull the brakes if such a transformation kicked off?</li><li>If this is right, how come economists don’t agree?</li></ul><p>For the last section of the episode, Carl addresses the moral status of machine minds themselves. Would they be conscious or otherwise have a claim to moral rights? And how might humans and machines coexist with neither side dominating or exploiting the other?</p><p><a href="https://80000hours.org/podcast/episodes/carl-shulman-economy-agi/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode is the first part of Rob’s marathon interview with Carl Shulman in 2024. The second episode is on </em><a href="https://80000hours.org/podcast/episodes/carl-shulman-society-agi/"><em>government and society after AGI</em></a><em>, and you can listen to them in either order. 
</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Rob's intro (00:01:00)</li><li>The interview begins (00:04:43)</li><li>Transitioning to a world where AI systems do almost all the work (00:05:20)</li><li>Economics after an AI explosion (00:14:24)</li><li>Objection: Shouldn’t we be seeing economic growth rates increasing today? (00:59:11)</li><li>Objection: Speed of doubling time (01:07:32)</li><li>Objection: Declining returns to increases in intelligence? (01:11:58)</li><li>Objection: Physical transformation of the environment (01:17:37)</li><li>Objection: Should we expect an increased demand for safety and security? (01:29:13)</li><li>Objection: “This sounds completely whack” (01:36:09)</li><li>Income and wealth distribution (01:48:01)</li><li>Economists and the intelligence explosion (02:13:30)</li><li>Baumol effect arguments (02:19:11)</li><li>Denying that robots can exist (02:27:17)</li><li>Semiconductor manufacturing (02:32:06)</li><li>Classic economic growth models (02:36:10)</li><li>Robot nannies (02:48:25)</li><li>Slow integration of decision-making and authority power (02:57:38)</li><li>Economists’ mistaken heuristics (03:01:06)</li><li>Moral status of AIs (03:11:44)</li><li>Rob's outro (04:11:46)</li></ul><p><em>Producer and editor: Keiran Harris</em><br><em>Audio engineering lead: Ben Cordell</em><br><em>Technical editing: Simon Monsour, Milo McGuire, and Dominic Armstrong</em><br><em>Transcriptions: Katy Moore</em></p>]]>
      </itunes:summary>
      <itunes:keywords>artificial intelligence, effective altruism, existential risk, longtermism, AGI</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:transcript url="https://share.transistor.fm/s/da36c78f/transcript.txt" type="text/plain"/>
      <podcast:chapters url="https://share.transistor.fm/s/da36c78f/chapters.json" type="application/json+chapters"/>
    </item>
    <item>
      <title>Four: Rose Hadshar on why automating human labour will break our political system</title>
      <itunes:title>Four: Rose Hadshar on why automating human labour will break our political system</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">2c46c634-4fbd-4da1-9e13-ac316bf43ccc</guid>
      <link>https://80000hours.org/podcast/episodes/rose-hadshar-ai-extreme-power-concentration/</link>
      <description>
        <![CDATA[<p>The most important political question in the age of advanced AI might not be who wins elections. It might be whether elections continue to matter at all.</p><p>That’s the view of <a href="https://www.forethought.org/people/rose-hadshar">Rose Hadshar</a>, researcher at Forethought Research, who believes we could see extreme, AI-enabled power concentration without a coup or dramatic ‘end of democracy’ moment.</p><p>She foresees something more insidious: an elite group with access to such powerful AI capabilities that the normal mechanisms for checking elite power — law, elections, public pressure, the threat of strikes — cease to have much effect. Those mechanisms could continue to exist on paper, but become ineffectual in a world where humans are no longer needed to execute even the largest-scale projects.</p><p>Almost nobody wants this to happen — but we may find ourselves unable to prevent it.</p><p>If AI disrupts our ability to make sense of things, will we even notice power getting severely concentrated, or be able to resist it? Once AI can substitute for human labour across the economy, what leverage will citizens have over those in power? And what does all of this imply for the institutions we’re relying on to prevent the worst outcomes?</p><p>Rose has answers, and they’re not all reassuring.</p><p>But she’s also hopeful we can make society more robust against these dynamics. We’ve got literally centuries of thinking about checks and balances to draw on. 
And there are some interventions she’s excited about — like building sophisticated AI tools for making sense of the world, or ensuring multiple branches of government have access to the best AI systems.</p><p>In this conversation, Rose and host Zershaaneh Qureshi discuss all of this, and more:</p><ul><li>Three dynamics that could reshape political power in the AI era</li><li>How AI gives small groups the productive power of millions</li><li>Why AI-powered tyranny would be uniquely difficult to topple</li><li>How power concentration compares to ‘gradual disempowerment’ by AI </li><li>Slower-moving scenarios that could still get scary </li><li>Which interventions could genuinely work — and which could backfire</li><li>Rose's most promising approaches to fighting back against power concentration</li><li>Why a ‘Manhattan Project’ approach to AI should worry you — and why international projects aren’t automatically safe either</li></ul><p><a href="https://80000hours.org/podcast/episodes/rose-hadshar-ai-extreme-power-concentration/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in March 2026.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Rose Hadshar? 
(00:01:05)</li><li>Three dynamics that could reshape political power in the AI era (00:02:37)</li><li>AI gives small groups the productive power of millions (00:12:49)</li><li>Dynamic 1: When a software update becomes a power grab (00:20:41)</li><li>Dynamic 2: When AI labour means governments no longer need their citizens (00:31:20)</li><li>How democracy could persist in name but not substance (00:45:15)</li><li>Dynamic 3: When AI filters our reality (00:54:54)</li><li>Good intentions won’t stop power concentration (01:08:27)</li><li>Slower-moving worlds could still get scary (01:23:57)</li><li>Why AI-powered tyranny will be tough to topple (01:31:53)</li><li>How power concentration compares to “gradual disempowerment” (01:38:18)</li><li>Some interventions are cross-cutting — and others could backfire (01:43:54)</li><li>What fighting back actually looks like (01:55:15)</li><li>Why power concentration researchers should avoid getting too “spicy” (02:04:10)</li><li>Why the “Manhattan Project” approach should worry you — but truly international projects might not be safe either (02:09:18)</li><li>Rose wants to keep humans around! (02:12:06)</li></ul><p><em>Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour</em><br><em>Music: </em><a href="https://open.spotify.com/artist/4lWobp6IUcSZ7w5mhnU1c9"><em>CORBIT</em></a><br><em>Coordination, transcripts, and web: Nick Stockton and Katy Moore</em></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>The most important political question in the age of advanced AI might not be who wins elections. It might be whether elections continue to matter at all.</p><p>That’s the view of <a href="https://www.forethought.org/people/rose-hadshar">Rose Hadshar</a>, researcher at Forethought Research, who believes we could see extreme, AI-enabled power concentration without a coup or dramatic ‘end of democracy’ moment.</p><p>She foresees something more insidious: an elite group with access to such powerful AI capabilities that the normal mechanisms for checking elite power — law, elections, public pressure, the threat of strikes — cease to have much effect. Those mechanisms could continue to exist on paper, but become ineffectual in a world where humans are no longer needed to execute even the largest-scale projects.</p><p>Almost nobody wants this to happen — but we may find ourselves unable to prevent it.</p><p>If AI disrupts our ability to make sense of things, will we even notice power getting severely concentrated, or be able to resist it? Once AI can substitute for human labour across the economy, what leverage will citizens have over those in power? And what does all of this imply for the institutions we’re relying on to prevent the worst outcomes?</p><p>Rose has answers, and they’re not all reassuring.</p><p>But she’s also hopeful we can make society more robust against these dynamics. We’ve got literally centuries of thinking about checks and balances to draw on. 
And there are some interventions she’s excited about — like building sophisticated AI tools for making sense of the world, or ensuring multiple branches of government have access to the best AI systems.</p><p>In this conversation, Rose and host Zershaaneh Qureshi discuss all of this, and more:</p><ul><li>Three dynamics that could reshape political power in the AI era</li><li>How AI gives small groups the productive power of millions</li><li>Why AI-powered tyranny would be uniquely difficult to topple</li><li>How power concentration compares to ‘gradual disempowerment’ by AI </li><li>Slower-moving scenarios that could still get scary </li><li>Which interventions could genuinely work — and which could backfire</li><li>Rose's most promising approaches to fighting back against power concentration</li><li>Why a ‘Manhattan Project’ approach to AI should worry you — and why international projects aren’t automatically safe either</li></ul><p><a href="https://80000hours.org/podcast/episodes/rose-hadshar-ai-extreme-power-concentration/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in March 2026.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Rose Hadshar? 
(00:01:05)</li><li>Three dynamics that could reshape political power in the AI era (00:02:37)</li><li>AI gives small groups the productive power of millions (00:12:49)</li><li>Dynamic 1: When a software update becomes a power grab (00:20:41)</li><li>Dynamic 2: When AI labour means governments no longer need their citizens (00:31:20)</li><li>How democracy could persist in name but not substance (00:45:15)</li><li>Dynamic 3: When AI filters our reality (00:54:54)</li><li>Good intentions won’t stop power concentration (01:08:27)</li><li>Slower-moving worlds could still get scary (01:23:57)</li><li>Why AI-powered tyranny will be tough to topple (01:31:53)</li><li>How power concentration compares to “gradual disempowerment” (01:38:18)</li><li>Some interventions are cross-cutting — and others could backfire (01:43:54)</li><li>What fighting back actually looks like (01:55:15)</li><li>Why power concentration researchers should avoid getting too “spicy” (02:04:10)</li><li>Why the “Manhattan Project” approach should worry you — but truly international projects might not be safe either (02:09:18)</li><li>Rose wants to keep humans around! (02:12:06)</li></ul><p><em>Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour</em><br><em>Music: </em><a href="https://open.spotify.com/artist/4lWobp6IUcSZ7w5mhnU1c9"><em>CORBIT</em></a><br><em>Coordination, transcripts, and web: Nick Stockton and Katy Moore</em></p>]]>
      </content:encoded>
      <pubDate>Fri, 05 Jun 2026 09:00:00 -0700</pubDate>
      <author>80,000 Hours</author>
      <enclosure url="https://media.transistor.fm/562d86b1/09ea5c9d.mp3" length="128837668" type="audio/mpeg"/>
      <itunes:author>80,000 Hours</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/xbATBb3Oi_w4KTgWQlHp6Blo1YuvFAopdLGqq0Lri2w/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9iODc1/YTE3MjcxY2QzNjU1/MjdiYTg1Y2VhNDk3/MDBkMi5qcGc.jpg"/>
      <itunes:duration>8048</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>The most important political question in the age of advanced AI might not be who wins elections. It might be whether elections continue to matter at all.</p><p>That’s the view of <a href="https://www.forethought.org/people/rose-hadshar">Rose Hadshar</a>, researcher at Forethought Research, who believes we could see extreme, AI-enabled power concentration without a coup or dramatic ‘end of democracy’ moment.</p><p>She foresees something more insidious: an elite group with access to such powerful AI capabilities that the normal mechanisms for checking elite power — law, elections, public pressure, the threat of strikes — cease to have much effect. Those mechanisms could continue to exist on paper, but become ineffectual in a world where humans are no longer needed to execute even the largest-scale projects.</p><p>Almost nobody wants this to happen — but we may find ourselves unable to prevent it.</p><p>If AI disrupts our ability to make sense of things, will we even notice power getting severely concentrated, or be able to resist it? Once AI can substitute for human labour across the economy, what leverage will citizens have over those in power? And what does all of this imply for the institutions we’re relying on to prevent the worst outcomes?</p><p>Rose has answers, and they’re not all reassuring.</p><p>But she’s also hopeful we can make society more robust against these dynamics. We’ve got literally centuries of thinking about checks and balances to draw on. 
And there are some interventions she’s excited about — like building sophisticated AI tools for making sense of the world, or ensuring multiple branches of government have access to the best AI systems.</p><p>In this conversation, Rose and host Zershaaneh Qureshi discuss all of this, and more:</p><ul><li>Three dynamics that could reshape political power in the AI era</li><li>How AI gives small groups the productive power of millions</li><li>Why AI-powered tyranny would be uniquely difficult to topple</li><li>How power concentration compares to ‘gradual disempowerment’ by AI </li><li>Slower-moving scenarios that could still get scary </li><li>Which interventions could genuinely work — and which could backfire</li><li>Rose's most promising approaches to fighting back against power concentration</li><li>Why a ‘Manhattan Project’ approach to AI should worry you — and why international projects aren’t automatically safe either</li></ul><p><a href="https://80000hours.org/podcast/episodes/rose-hadshar-ai-extreme-power-concentration/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in March 2026.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Rose Hadshar? 
(00:01:05)</li><li>Three dynamics that could reshape political power in the AI era (00:02:37)</li><li>AI gives small groups the productive power of millions (00:12:49)</li><li>Dynamic 1: When a software update becomes a power grab (00:20:41)</li><li>Dynamic 2: When AI labour means governments no longer need their citizens (00:31:20)</li><li>How democracy could persist in name but not substance (00:45:15)</li><li>Dynamic 3: When AI filters our reality (00:54:54)</li><li>Good intentions won’t stop power concentration (01:08:27)</li><li>Slower-moving worlds could still get scary (01:23:57)</li><li>Why AI-powered tyranny will be tough to topple (01:31:53)</li><li>How power concentration compares to “gradual disempowerment” (01:38:18)</li><li>Some interventions are cross-cutting — and others could backfire (01:43:54)</li><li>What fighting back actually looks like (01:55:15)</li><li>Why power concentration researchers should avoid getting too “spicy” (02:04:10)</li><li>Why the “Manhattan Project” approach should worry you — but truly international projects might not be safe either (02:09:18)</li><li>Rose wants to keep humans around! (02:12:06)</li></ul><p><em>Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour</em><br><em>Music: </em><a href="https://open.spotify.com/artist/4lWobp6IUcSZ7w5mhnU1c9"><em>CORBIT</em></a><br><em>Coordination, transcripts, and web: Nick Stockton and Katy Moore</em></p>]]>
      </itunes:summary>
      <itunes:keywords>artificial intelligence, effective altruism, existential risk, longtermism, AGI</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:transcript url="https://share.transistor.fm/s/562d86b1/transcript.txt" type="text/plain"/>
      <podcast:chapters url="https://share.transistor.fm/s/562d86b1/chapters.json" type="application/json+chapters"/>
    </item>
    <item>
      <title>Five: Helen Toner on the geopolitics of AI in China and the Middle East</title>
      <itunes:title>Five: Helen Toner on the geopolitics of AI in China and the Middle East</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">951d743b-6565-4fce-b866-99d3fa7755f7</guid>
      <link>https://80000hours.org/podcast/episodes/helen-toner-ai-policy-washington-dc/</link>
      <description>
        <![CDATA[<p>With the US racing to develop AGI and superintelligence ahead of China, you might expect the two countries to be negotiating how they’ll deploy AI, including in the military, without coming to blows. But according to Helen Toner, director of the <a href="https://cset.georgetown.edu/">Center for Security and Emerging Technology</a> in DC, “the US and Chinese governments are barely talking at all.”</p><p>In her role as a founder, and now leader, of DC’s top think tank focused on the geopolitical and military implications of AI, Helen has been closely tracking the US’s AI diplomacy since 2019.</p><p>“Over the last couple of years there have been some direct [US–China] talks on some small number of issues, but they’ve also often been completely suspended.” China knows the US wants to talk more, so “that becomes a bargaining chip for China to say, ‘We don’t want to talk to you. We’re not going to do these military-to-military talks about extremely sensitive, important issues, because we’re mad.’”</p><p>Helen isn’t sure the groundwork exists for productive dialogue in any case. “At the government level, [there’s] very little agreement” on what artificial general intelligence (AGI) is, whether it’s possible soon, and whether it poses major risks. Without shared understanding of the problem, negotiating solutions is very difficult.</p><p>Another issue is that so far the Chinese Communist Party doesn’t seem especially ‘AGI-pilled.’ While a few Chinese companies like DeepSeek are betting on scaling, she sees little evidence Chinese leadership shares Silicon Valley’s conviction that AGI will arrive any minute now, and export controls have made it very difficult for them to access compute to match US competitors.</p><p>To find an autocracy that’s truly bought into the AGI vision, we might need to look at nominal US allies. 
The US has approved massive data centres in the UAE and Saudi Arabia with “hundreds of thousands of next-generation Nvidia chips” — delivering colossal levels of computing power.</p><p>When OpenAI announced this deal with the UAE, they celebrated that it was “<a href="https://openai.com/index/introducing-stargate-uae/">rooted in democratic values</a>,” and would advance “<a href="https://openai.com/global-affairs/openai-for-countries/">democratic AI rails</a>” and provide “a clear alternative to authoritarian versions of AI.”</p><p>But the UAE scores <a href="https://freedomhouse.org/country/united-arab-emirates">18 out of 100 on Freedom House’s democracy index</a>. “This is really not a country that respects rule of law,” <a href="https://helentoner.substack.com/p/supercomputers-for-autocrats">Helen observes</a>. Political parties are banned, elections are fake, dissidents are persecuted.</p><p>If AI access really determines future national power, handing world-class supercomputers to Gulf autocracies seems pretty questionable. The justification is typically that “if we don’t sell it, China will” — a transparently false claim, given severe Chinese production constraints. It also raises eyebrows that Gulf countries conduct joint military exercises with China and their rulers have “very tight personal and commercial relationships with Chinese political leaders and business leaders.”</p><p>In this conversation in Washington, DC, host Rob Wiblin and Helen discuss the above, plus much more.</p><p><a href="https://80000hours.org/podcast/episodes/helen-toner-ai-policy-washington-dc/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in November 2025.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Helen Toner? 
(00:01:02)</li><li>Helen’s role on the OpenAI board, and what happened with Sam Altman (00:01:31)</li><li>The Center for Security and Emerging Technology (CSET) (00:07:35)</li><li>CSET’s role in export controls against China (00:10:43)</li><li>Does it matter if the world uses US AI models? (00:21:24)</li><li>Is China actually racing to build AGI? (00:27:10)</li><li>Could China easily steal AI model weights from US companies? (00:38:14)</li><li>The next big thing is probably robotics (00:46:42)</li><li>Why is the Trump administration sabotaging the US high-tech sector? (00:48:17)</li><li>Are data centres in the UAE “good for democracy”? (00:51:31)</li><li>Will AI inevitably concentrate power? (01:06:20)</li><li>“Adaptation buffers” vs non-proliferation (01:28:16)</li><li>Will the military use AI for decision-making? (01:36:09)</li><li>“Alignment” is (usually) a terrible term (01:42:51)</li><li>Is Congress starting to take superintelligence seriously? (01:45:19)</li><li>AI progress isn't actually slowing down (01:47:44)</li><li>What's legit vs not about OpenAI’s restructure (01:55:28)</li><li>Is Helen unusually “normal”? (01:58:57)</li><li>How to keep up with rapid changes in AI and geopolitics (02:02:42)</li><li>What CSET can uniquely add to the DC policy world (02:05:51)</li><li>Talent bottlenecks in DC (02:13:26)</li><li>What evidence, if any, could settle how worried we should be about AI risk? (02:16:28)</li><li>Is CSET hiring? (02:18:22)</li></ul><p><em>Video editing: Luke Monsour and Simon Monsour</em><br><em>Audio engineering: Milo McGuire, Simon Monsour, and Dominic Armstrong</em><br><em>Music: </em><a href="https://open.spotify.com/artist/4lWobp6IUcSZ7w5mhnU1c9"><em>CORBIT</em></a><br><em>Coordination, transcriptions, and web: Katy Moore</em></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>With the US racing to develop AGI and superintelligence ahead of China, you might expect the two countries to be negotiating how they’ll deploy AI, including in the military, without coming to blows. But according to Helen Toner, director of the <a href="https://cset.georgetown.edu/">Center for Security and Emerging Technology</a> in DC, “the US and Chinese governments are barely talking at all.”</p><p>In her role as a founder, and now leader, of DC’s top think tank focused on the geopolitical and military implications of AI, Helen has been closely tracking the US’s AI diplomacy since 2019.</p><p>“Over the last couple of years there have been some direct [US–China] talks on some small number of issues, but they’ve also often been completely suspended.” China knows the US wants to talk more, so “that becomes a bargaining chip for China to say, ‘We don’t want to talk to you. We’re not going to do these military-to-military talks about extremely sensitive, important issues, because we’re mad.’”</p><p>Helen isn’t sure the groundwork exists for productive dialogue in any case. “At the government level, [there’s] very little agreement” on what artificial general intelligence (AGI) is, whether it’s possible soon, and whether it poses major risks. Without shared understanding of the problem, negotiating solutions is very difficult.</p><p>Another issue is that so far the Chinese Communist Party doesn’t seem especially ‘AGI-pilled.’ While a few Chinese companies like DeepSeek are betting on scaling, she sees little evidence Chinese leadership shares Silicon Valley’s conviction that AGI will arrive any minute now, and export controls have made it very difficult for them to access compute to match US competitors.</p><p>To find an autocracy that’s truly bought into the AGI vision, we might need to look at nominal US allies. 
The US has approved massive data centres in the UAE and Saudi Arabia with “hundreds of thousands of next-generation Nvidia chips” — delivering colossal levels of computing power.</p><p>When OpenAI announced this deal with the UAE, they celebrated that it was “<a href="https://openai.com/index/introducing-stargate-uae/">rooted in democratic values</a>,” and would advance “<a href="https://openai.com/global-affairs/openai-for-countries/">democratic AI rails</a>” and provide “a clear alternative to authoritarian versions of AI.”</p><p>But the UAE scores <a href="https://freedomhouse.org/country/united-arab-emirates">18 out of 100 on Freedom House’s democracy index</a>. “This is really not a country that respects rule of law,” <a href="https://helentoner.substack.com/p/supercomputers-for-autocrats">Helen observes</a>. Political parties are banned, elections are fake, dissidents are persecuted.</p><p>If AI access really determines future national power, handing world-class supercomputers to Gulf autocracies seems pretty questionable. The justification is typically that “if we don’t sell it, China will” — a transparently false claim, given severe Chinese production constraints. It also raises eyebrows that Gulf countries conduct joint military exercises with China and their rulers have “very tight personal and commercial relationships with Chinese political leaders and business leaders.”</p><p>In this conversation in Washington, DC, host Rob Wiblin and Helen discuss the above, plus much more.</p><p><a href="https://80000hours.org/podcast/episodes/helen-toner-ai-policy-washington-dc/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in November 2025.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Helen Toner? 
(00:01:02)</li><li>Helen’s role on the OpenAI board, and what happened with Sam Altman (00:01:31)</li><li>The Center for Security and Emerging Technology (CSET) (00:07:35)</li><li>CSET’s role in export controls against China (00:10:43)</li><li>Does it matter if the world uses US AI models? (00:21:24)</li><li>Is China actually racing to build AGI? (00:27:10)</li><li>Could China easily steal AI model weights from US companies? (00:38:14)</li><li>The next big thing is probably robotics (00:46:42)</li><li>Why is the Trump administration sabotaging the US high-tech sector? (00:48:17)</li><li>Are data centres in the UAE “good for democracy”? (00:51:31)</li><li>Will AI inevitably concentrate power? (01:06:20)</li><li>“Adaptation buffers” vs non-proliferation (01:28:16)</li><li>Will the military use AI for decision-making? (01:36:09)</li><li>“Alignment” is (usually) a terrible term (01:42:51)</li><li>Is Congress starting to take superintelligence seriously? (01:45:19)</li><li>AI progress isn't actually slowing down (01:47:44)</li><li>What's legit vs not about OpenAI’s restructure (01:55:28)</li><li>Is Helen unusually “normal”? (01:58:57)</li><li>How to keep up with rapid changes in AI and geopolitics (02:02:42)</li><li>What CSET can uniquely add to the DC policy world (02:05:51)</li><li>Talent bottlenecks in DC (02:13:26)</li><li>What evidence, if any, could settle how worried we should be about AI risk? (02:16:28)</li><li>Is CSET hiring? (02:18:22)</li></ul><p><em>Video editing: Luke Monsour and Simon Monsour</em><br><em>Audio engineering: Milo McGuire, Simon Monsour, and Dominic Armstrong</em><br><em>Music: </em><a href="https://open.spotify.com/artist/4lWobp6IUcSZ7w5mhnU1c9"><em>CORBIT</em></a><br><em>Coordination, transcriptions, and web: Katy Moore</em></p>]]>
      </content:encoded>
      <pubDate>Fri, 05 Jun 2026 08:00:00 -0700</pubDate>
      <author>80,000 Hours</author>
      <enclosure url="https://media.transistor.fm/9f34e356/cb8a3eb2.mp3" length="134481495" type="audio/mpeg"/>
      <itunes:author>80,000 Hours</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/lAsD1RHRgCACZeQMZWLXCM-Es0n4z-4W9DBdxk155ZE/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9jNTAw/N2E1OWYyZjIwNWEy/Njc5OTI4NGNmNGM2/YjQxZC5qcGc.jpg"/>
      <itunes:duration>8402</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>With the US racing to develop AGI and superintelligence ahead of China, you might expect the two countries to be negotiating how they’ll deploy AI, including in the military, without coming to blows. But according to Helen Toner, director of the <a href="https://cset.georgetown.edu/">Center for Security and Emerging Technology</a> in DC, “the US and Chinese governments are barely talking at all.”</p><p>In her role as a founder, and now leader, of DC’s top think tank focused on the geopolitical and military implications of AI, Helen has been closely tracking the US’s AI diplomacy since 2019.</p><p>“Over the last couple of years there have been some direct [US–China] talks on some small number of issues, but they’ve also often been completely suspended.” China knows the US wants to talk more, so “that becomes a bargaining chip for China to say, ‘We don’t want to talk to you. We’re not going to do these military-to-military talks about extremely sensitive, important issues, because we’re mad.’”</p><p>Helen isn’t sure the groundwork exists for productive dialogue in any case. “At the government level, [there’s] very little agreement” on what artificial general intelligence (AGI) is, whether it’s possible soon, and whether it poses major risks. Without shared understanding of the problem, negotiating solutions is very difficult.</p><p>Another issue is that so far the Chinese Communist Party doesn’t seem especially ‘AGI-pilled.’ While a few Chinese companies like DeepSeek are betting on scaling, she sees little evidence Chinese leadership shares Silicon Valley’s conviction that AGI will arrive any minute now, and export controls have made it very difficult for them to access compute to match US competitors.</p><p>To find an autocracy that’s truly bought into the AGI vision, we might need to look at nominal US allies. 
The US has approved massive data centres in the UAE and Saudi Arabia with “hundreds of thousands of next-generation Nvidia chips” — delivering colossal levels of computing power.</p><p>When OpenAI announced this deal with the UAE, they celebrated that it was “<a href="https://openai.com/index/introducing-stargate-uae/">rooted in democratic values</a>,” and would advance “<a href="https://openai.com/global-affairs/openai-for-countries/">democratic AI rails</a>” and provide “a clear alternative to authoritarian versions of AI.”</p><p>But the UAE scores <a href="https://freedomhouse.org/country/united-arab-emirates">18 out of 100 on Freedom House’s democracy index</a>. “This is really not a country that respects rule of law,” <a href="https://helentoner.substack.com/p/supercomputers-for-autocrats">Helen observes</a>. Political parties are banned, elections are fake, dissidents are persecuted.</p><p>If AI access really determines future national power, handing world-class supercomputers to Gulf autocracies seems pretty questionable. The justification is typically that “if we don’t sell it, China will” — a transparently false claim, given severe Chinese production constraints. It also raises eyebrows that Gulf countries conduct joint military exercises with China and their rulers have “very tight personal and commercial relationships with Chinese political leaders and business leaders.”</p><p>In this conversation in Washington, DC, host Rob Wiblin and Helen discuss the above, plus much more.</p><p><a href="https://80000hours.org/podcast/episodes/helen-toner-ai-policy-washington-dc/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in November 2025.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Helen Toner? 
(00:01:02)</li><li>Helen’s role on the OpenAI board, and what happened with Sam Altman (00:01:31)</li><li>The Center for Security and Emerging Technology (CSET) (00:07:35)</li><li>CSET’s role in export controls against China (00:10:43)</li><li>Does it matter if the world uses US AI models? (00:21:24)</li><li>Is China actually racing to build AGI? (00:27:10)</li><li>Could China easily steal AI model weights from US companies? (00:38:14)</li><li>The next big thing is probably robotics (00:46:42)</li><li>Why is the Trump administration sabotaging the US high-tech sector? (00:48:17)</li><li>Are data centres in the UAE “good for democracy”? (00:51:31)</li><li>Will AI inevitably concentrate power? (01:06:20)</li><li>“Adaptation buffers” vs non-proliferation (01:28:16)</li><li>Will the military use AI for decision-making? (01:36:09)</li><li>“Alignment” is (usually) a terrible term (01:42:51)</li><li>Is Congress starting to take superintelligence seriously? (01:45:19)</li><li>AI progress isn't actually slowing down (01:47:44)</li><li>What's legit vs not about OpenAI’s restructure (01:55:28)</li><li>Is Helen unusually “normal”? (01:58:57)</li><li>How to keep up with rapid changes in AI and geopolitics (02:02:42)</li><li>What CSET can uniquely add to the DC policy world (02:05:51)</li><li>Talent bottlenecks in DC (02:13:26)</li><li>What evidence, if any, could settle how worried we should be about AI risk? (02:16:28)</li><li>Is CSET hiring? (02:18:22)</li></ul><p><em>Video editing: Luke Monsour and Simon Monsour</em><br><em>Audio engineering: Milo McGuire, Simon Monsour, and Dominic Armstrong</em><br><em>Music: </em><a href="https://open.spotify.com/artist/4lWobp6IUcSZ7w5mhnU1c9"><em>CORBIT</em></a><br><em>Coordination, transcriptions, and web: Katy Moore</em></p>]]>
      </itunes:summary>
      <itunes:keywords>artificial intelligence, effective altruism, existential risk, longtermism, AGI</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:transcript url="https://share.transistor.fm/s/9f34e356/transcript.txt" type="text/plain"/>
      <podcast:chapters url="https://share.transistor.fm/s/9f34e356/chapters.json" type="application/json+chapters"/>
    </item>
    <item>
      <title>Six: Beth Barnes on the most important graph in AI right now — and the 7-month rule that governs its progress</title>
      <itunes:title>Six: Beth Barnes on the most important graph in AI right now — and the 7-month rule that governs its progress</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">0b727546-b89f-4853-87ae-e7e8d75363d1</guid>
      <link>https://80000hours.org/podcast/episodes/beth-barnes-ai-safety-evals/</link>
      <description>
        <![CDATA[<p>In 2024, AI models had a 50% chance of successfully completing a task that would take a human expert one hour. Seven months before that, that number was roughly 30 minutes — and seven months before that, 15 minutes.</p><p>These are substantial, multi-step tasks requiring sustained focus: building web applications, conducting machine learning research, or solving complex programming challenges.</p><p>Beth Barnes is CEO of <a href="https://metr.org/">METR</a> (Model Evaluation &amp; Threat Research) — the leading organisation measuring these capabilities. Beth’s team has been timing how long it takes skilled humans to complete projects of varying length, then seeing how AI models perform on the same work.</p><p>The resulting paper from METR, “<a href="https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/">Measuring AI ability to complete long tasks</a>,” made waves by revealing that the planning horizon of AI models was doubling roughly every seven months. It’s regarded by many as the most useful AI forecasting work in years.</p><p>The companies building these systems aren’t just aware of this trend — they want to harness it as much as possible, and are aggressively pursuing automation of their own research.</p><p>That’s both an exciting and troubling development, because it could radically speed up advances in AI capabilities, accomplishing what would have taken years or decades in just months. 
That itself could be highly destabilising (as we explored in a previous episode in this series: <a href="https://80000hours.org/podcast/episodes/will-macaskill-century-in-a-decade-navigating-intelligence-explosion/">Will MacAskill on AI causing a “century in a decade” — and how we’re completely unprepared</a>).</p><p>And having AI models rapidly build their successors with limited human oversight naturally raises the risk that things could go off the rails, if the models at the end of the process lack the goals and constraints we hoped for.</p><p>Beth thinks models can already do “meaningful work” on improving themselves, and she wouldn’t be surprised if AI models were able to autonomously self-improve in as little as two years — in fact, she says: “It seems hard to rule out even shorter [timelines]. Is there 1% chance of this happening in six, nine months? Yeah, that seems pretty plausible.”</p><p>While Silicon Valley is abuzz with these numbers, policymakers remain largely unaware of what’s barrelling toward us — and given the current lack of regulation of AI companies, they’re not even able to access the critical information that would help them decide whether to intervene. </p><p>Beth adds: “The sense I really want to dispel is, ‘But the experts must be on top of this. The experts would be telling us if it really was time to freak out.’ The experts are not on top of this. Inasmuch as there are experts, they are saying that this is concerning. … And to the extent that I am an expert, I am an expert telling you you should freak out. And there’s not especially anyone else who isn’t saying this.”</p><p><br></p><p>Beth and host Rob Wiblin discuss all that, plus much more. 
</p><p><a href="https://80000hours.org/podcast/episodes/beth-barnes-ai-safety-evals/">Learn more and read the full transcript</a> on the 80,000 Hours website.</p><p><em>This episode was originally released in June 2025.</em></p><p><br>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who is Beth Barnes? (00:01:19)</li><li>Can we see AI scheming in the chain of thought? (00:01:52)</li><li>The chain of thought is essential for safety checking (00:08:58)</li><li>Alignment faking in large language models (00:12:24)</li><li>We have to test model honesty even before they're used inside AI companies (00:16:48)</li><li>We have to test models when unruly and unconstrained (00:25:57)</li><li>It's essential to thoroughly test relevant real-world tasks (00:30:40)</li><li>METR's research finds AIs are solid at AI research already (00:49:33)</li><li>AI may turn out to be strong at novel and creative research (00:55:53)</li><li>When can we expect an algorithmic 'intelligence explosion'? (00:59:11)</li><li>Recursively self-improving AI might even be here in two years — which is alarming (01:05:02)</li><li>Could evaluations backfire by increasing AI hype and racing? (01:11:36)</li><li>Governments first ignore new risks, but can overreact once they arrive (01:26:38)</li><li>Do we need external auditors doing AI safety tests, not just the companies themselves? (01:35:10)</li><li>A case against safety-focused people working at frontier AI companies (01:48:44)</li><li>The new, more dire situation has forced changes to METR's strategy (02:02:29)</li><li>AI companies are being locally reasonable, but globally reckless (02:10:31)</li><li>Overrated: Interpretability research (02:15:11)</li><li>Underrated: Developing more narrow AIs (02:17:01)</li><li>Underrated: Helping humans judge confusing model outputs (02:23:36)</li><li>Overrated: Major AI companies' contributions to safety research (02:25:52)</li><li>Could we have a science of translating AI models' nonhuman language or neuralese? 
(02:29:24)</li><li>Could we ban using AI to enhance AI, or is that just naive? (02:31:47)</li><li>Open-weighting models is often good, and Beth has changed her attitude to it (02:37:52)</li><li>What we can learn about AGI from the nuclear arms race (02:42:25)</li><li>Infosec is so bad that no models are truly closed-weight models (02:57:24)</li><li>AI is more like bioweapons because it undermines the leading power (03:02:02)</li><li>What METR can do best that others can't (03:12:09)</li><li>What METR isn't doing that other people have to step up and do (03:27:07)</li><li>What research METR plans to do next (03:32:09)</li></ul><p><em>Video editing: Luke Monsour and Simon Monsour</em><br><em>Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong</em><br><em>Music: Ben Cordell</em><br><em>Transcriptions and web: Katy Moore</em></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>In 2024, AI models had a 50% chance of successfully completing a task that would take a human expert one hour. Seven months before that, that number was roughly 30 minutes — and seven months before that, 15 minutes.</p><p>These are substantial, multi-step tasks requiring sustained focus: building web applications, conducting machine learning research, or solving complex programming challenges.</p><p>Beth Barnes is CEO of <a href="https://metr.org/">METR</a> (Model Evaluation &amp; Threat Research) — the leading organisation measuring these capabilities. Beth’s team has been timing how long it takes skilled humans to complete projects of varying length, then seeing how AI models perform on the same work.</p><p>The resulting paper from METR, “<a href="https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/">Measuring AI ability to complete long tasks</a>,” made waves by revealing that the planning horizon of AI models was doubling roughly every seven months. It’s regarded by many as the most useful AI forecasting work in years.</p><p>The companies building these systems aren’t just aware of this trend — they want to harness it as much as possible, and are aggressively pursuing automation of their own research.</p><p>That’s both an exciting and troubling development, because it could radically speed up advances in AI capabilities, accomplishing what would have taken years or decades in just months. 
That itself could be highly destabilising (as we explored in a previous episode in this series: <a href="https://80000hours.org/podcast/episodes/will-macaskill-century-in-a-decade-navigating-intelligence-explosion/">Will MacAskill on AI causing a “century in a decade” — and how we’re completely unprepared</a>).</p><p>And having AI models rapidly build their successors with limited human oversight naturally raises the risk that things could go off the rails, if the models at the end of the process lack the goals and constraints we hoped for.</p><p>Beth thinks models can already do “meaningful work” on improving themselves, and she wouldn’t be surprised if AI models were able to autonomously self-improve in as little as two years — in fact, she says: “It seems hard to rule out even shorter [timelines]. Is there 1% chance of this happening in six, nine months? Yeah, that seems pretty plausible.”</p><p>While Silicon Valley is abuzz with these numbers, policymakers remain largely unaware of what’s barrelling toward us — and given the current lack of regulation of AI companies, they’re not even able to access the critical information that would help them decide whether to intervene. </p><p>Beth adds: “The sense I really want to dispel is, ‘But the experts must be on top of this. The experts would be telling us if it really was time to freak out.’ The experts are not on top of this. Inasmuch as there are experts, they are saying that this is concerning. … And to the extent that I am an expert, I am an expert telling you you should freak out. And there’s not especially anyone else who isn’t saying this.”</p><p><br></p><p>Beth and host Rob Wiblin discuss all that, plus much more. 
</p><p><a href="https://80000hours.org/podcast/episodes/beth-barnes-ai-safety-evals/">Learn more and read the full transcript</a> on the 80,000 Hours website.</p><p><em>This episode was originally released in June 2025.</em></p><p><br>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who is Beth Barnes? (00:01:19)</li><li>Can we see AI scheming in the chain of thought? (00:01:52)</li><li>The chain of thought is essential for safety checking (00:08:58)</li><li>Alignment faking in large language models (00:12:24)</li><li>We have to test model honesty even before they're used inside AI companies (00:16:48)</li><li>We have to test models when unruly and unconstrained (00:25:57)</li><li>It's essential to thoroughly test relevant real-world tasks (00:30:40)</li><li>METR's research finds AIs are solid at AI research already (00:49:33)</li><li>AI may turn out to be strong at novel and creative research (00:55:53)</li><li>When can we expect an algorithmic 'intelligence explosion'? (00:59:11)</li><li>Recursively self-improving AI might even be here in two years — which is alarming (01:05:02)</li><li>Could evaluations backfire by increasing AI hype and racing? (01:11:36)</li><li>Governments first ignore new risks, but can overreact once they arrive (01:26:38)</li><li>Do we need external auditors doing AI safety tests, not just the companies themselves? (01:35:10)</li><li>A case against safety-focused people working at frontier AI companies (01:48:44)</li><li>The new, more dire situation has forced changes to METR's strategy (02:02:29)</li><li>AI companies are being locally reasonable, but globally reckless (02:10:31)</li><li>Overrated: Interpretability research (02:15:11)</li><li>Underrated: Developing more narrow AIs (02:17:01)</li><li>Underrated: Helping humans judge confusing model outputs (02:23:36)</li><li>Overrated: Major AI companies' contributions to safety research (02:25:52)</li><li>Could we have a science of translating AI models' nonhuman language or neuralese? 
(02:29:24)</li><li>Could we ban using AI to enhance AI, or is that just naive? (02:31:47)</li><li>Open-weighting models is often good, and Beth has changed her attitude to it (02:37:52)</li><li>What we can learn about AGI from the nuclear arms race (02:42:25)</li><li>Infosec is so bad that no models are truly closed-weight models (02:57:24)</li><li>AI is more like bioweapons because it undermines the leading power (03:02:02)</li><li>What METR can do best that others can't (03:12:09)</li><li>What METR isn't doing that other people have to step up and do (03:27:07)</li><li>What research METR plans to do next (03:32:09)</li></ul><p><em>Video editing: Luke Monsour and Simon Monsour</em><br><em>Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong</em><br><em>Music: Ben Cordell</em><br><em>Transcriptions and web: Katy Moore</em></p>]]>
      </content:encoded>
      <pubDate>Fri, 05 Jun 2026 07:00:00 -0700</pubDate>
      <author>80,000 Hours</author>
      <enclosure url="https://media.transistor.fm/a1a04fee/c6ae799b.mp3" length="109108280" type="audio/mpeg"/>
      <itunes:author>80,000 Hours</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/WfBbrGRJV85joXTXaz8dtLDp0FHhPUYJoOkdAJBhNIM/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS85MjBm/NzdmNDMwNTg4NTNi/MDU4MjYyNDAxOTVj/N2JjYy5qcGc.jpg"/>
      <itunes:duration>13629</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>In 2024, AI models had a 50% chance of successfully completing a task that would take a human expert one hour. Seven months before that, that number was roughly 30 minutes — and seven months before that, 15 minutes.</p><p>These are substantial, multi-step tasks requiring sustained focus: building web applications, conducting machine learning research, or solving complex programming challenges.</p><p>Beth Barnes is CEO of <a href="https://metr.org/">METR</a> (Model Evaluation &amp; Threat Research) — the leading organisation measuring these capabilities. Beth’s team has been timing how long it takes skilled humans to complete projects of varying length, then seeing how AI models perform on the same work.</p><p>The resulting paper from METR, “<a href="https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/">Measuring AI ability to complete long tasks</a>,” made waves by revealing that the planning horizon of AI models was doubling roughly every seven months. It’s regarded by many as the most useful AI forecasting work in years.</p><p>The companies building these systems aren’t just aware of this trend — they want to harness it as much as possible, and are aggressively pursuing automation of their own research.</p><p>That’s both an exciting and troubling development, because it could radically speed up advances in AI capabilities, accomplishing what would have taken years or decades in just months. 
That itself could be highly destabilising (as we explored in a previous episode in this series: <a href="https://80000hours.org/podcast/episodes/will-macaskill-century-in-a-decade-navigating-intelligence-explosion/">Will MacAskill on AI causing a “century in a decade” — and how we’re completely unprepared</a>).</p><p>And having AI models rapidly build their successors with limited human oversight naturally raises the risk that things could go off the rails, if the models at the end of the process lack the goals and constraints we hoped for.</p><p>Beth thinks models can already do “meaningful work” on improving themselves, and she wouldn’t be surprised if AI models were able to autonomously self-improve in as little as two years — in fact, she says: “It seems hard to rule out even shorter [timelines]. Is there 1% chance of this happening in six, nine months? Yeah, that seems pretty plausible.”</p><p>While Silicon Valley is abuzz with these numbers, policymakers remain largely unaware of what’s barrelling toward us — and given the current lack of regulation of AI companies, they’re not even able to access the critical information that would help them decide whether to intervene. </p><p>Beth adds: “The sense I really want to dispel is, ‘But the experts must be on top of this. The experts would be telling us if it really was time to freak out.’ The experts are not on top of this. Inasmuch as there are experts, they are saying that this is concerning. … And to the extent that I am an expert, I am an expert telling you you should freak out. And there’s not especially anyone else who isn’t saying this.”</p><p><br></p><p>Beth and host Rob Wiblin discuss all that, plus much more. 
</p><p><a href="https://80000hours.org/podcast/episodes/beth-barnes-ai-safety-evals/">Learn more and read the full transcript</a> on the 80,000 Hours website.</p><p><em>This episode was originally released in June 2025.</em></p><p><br>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who is Beth Barnes? (00:01:19)</li><li>Can we see AI scheming in the chain of thought? (00:01:52)</li><li>The chain of thought is essential for safety checking (00:08:58)</li><li>Alignment faking in large language models (00:12:24)</li><li>We have to test model honesty even before they're used inside AI companies (00:16:48)</li><li>We have to test models when unruly and unconstrained (00:25:57)</li><li>It's essential to thoroughly test relevant real-world tasks (00:30:40)</li><li>METR's research finds AIs are solid at AI research already (00:49:33)</li><li>AI may turn out to be strong at novel and creative research (00:55:53)</li><li>When can we expect an algorithmic 'intelligence explosion'? (00:59:11)</li><li>Recursively self-improving AI might even be here in two years — which is alarming (01:05:02)</li><li>Could evaluations backfire by increasing AI hype and racing? (01:11:36)</li><li>Governments first ignore new risks, but can overreact once they arrive (01:26:38)</li><li>Do we need external auditors doing AI safety tests, not just the companies themselves? (01:35:10)</li><li>A case against safety-focused people working at frontier AI companies (01:48:44)</li><li>The new, more dire situation has forced changes to METR's strategy (02:02:29)</li><li>AI companies are being locally reasonable, but globally reckless (02:10:31)</li><li>Overrated: Interpretability research (02:15:11)</li><li>Underrated: Developing more narrow AIs (02:17:01)</li><li>Underrated: Helping humans judge confusing model outputs (02:23:36)</li><li>Overrated: Major AI companies' contributions to safety research (02:25:52)</li><li>Could we have a science of translating AI models' nonhuman language or neuralese? 
(02:29:24)</li><li>Could we ban using AI to enhance AI, or is that just naive? (02:31:47)</li><li>Open-weighting models is often good, and Beth has changed her attitude to it (02:37:52)</li><li>What we can learn about AGI from the nuclear arms race (02:42:25)</li><li>Infosec is so bad that no models are truly closed-weight models (02:57:24)</li><li>AI is more like bioweapons because it undermines the leading power (03:02:02)</li><li>What METR can do best that others can't (03:12:09)</li><li>What METR isn't doing that other people have to step up and do (03:27:07)</li><li>What research METR plans to do next (03:32:09)</li></ul><p><em>Video editing: Luke Monsour and Simon Monsour</em><br><em>Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong</em><br><em>Music: Ben Cordell</em><br><em>Transcriptions and web: Katy Moore</em></p>]]>
      </itunes:summary>
      <itunes:keywords>artificial intelligence, effective altruism, existential risk, longtermism, AGI</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:transcript url="https://share.transistor.fm/s/a1a04fee/transcript.txt" type="text/plain"/>
      <podcast:chapters url="https://share.transistor.fm/s/a1a04fee/chapters.json" type="application/json+chapters"/>
    </item>
    <item>
      <title>Seven: Richard Moulange on how AI now designs genomes from scratch and outperforms virologists at lab work — what could go wrong?</title>
      <itunes:title>Seven: Richard Moulange on how AI now designs genomes from scratch and outperforms virologists at lab work — what could go wrong?</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">2be42af2-e9db-4d42-aa2c-8afaf501de23</guid>
      <link>https://80000hours.org/podcast/episodes/richard-moulange-ai-bioweapons-biorisk/</link>
      <description>
        <![CDATA[<p>In September 2025, scientists used an AI model to design genomes for entirely new bacteriophages (viruses that infect bacteria). They then built them in a lab. Many were viable. And despite being entirely novel, some even outperformed existing viruses from that family.</p><p>That alone is remarkable. But as Dr Richard Moulange, one of the world’s top experts on AI biosecurity, explains: it’s just one of many data points showing how AI is dissolving the barriers that have historically kept biological weapons out of reach.</p><p>For years, experts have reassured us that ‘tacit knowledge’ — the hands-on, hard-to-Google lab skills needed to work with dangerous pathogens — would prevent bad actors from weaponising biology. So far, they’ve been right.</p><p>But now that reassurance is crumbling. The <a href="https://www.virologytest.ai/">Virology Capabilities Test</a> measures exactly this kind of troubleshooting expertise, and finds that modern AI models crushed top human virologists even in their self-declared area of greatest specialisation and expertise — with scores of 45% compared to the experts’ 22%.</p><p>Meanwhile, <a href="https://www-cdn.anthropic.com/bf10f64990cfda0ba858290be7b8cc6317685f47.pdf#page=131">Anthropic’s research shows</a> that PhD-level biologists are getting meaningfully better at weapons-relevant tasks with AI assistance — and the effect is growing with each new model generation.</p><p>In this conversation, Richard and host Rob Wiblin discuss:</p><ul><li>What AI biology tools already exist</li><li>Why mid-tier actors (not amateurs) are the ones getting the most dangerous boost</li><li>The three main categories of defence we can pursue</li><li>Whether there’s a plausible path to a world where engineered pandemics become a thing of the past.</li></ul><p><a href="https://80000hours.org/podcast/episodes/richard-moulange-ai-bioweapons-biorisk/">Learn more and read the full transcript</a> on the 80,000 Hours website. 
</p><p><em>This episode was recorded on January 16, 2026. Since recording this episode, Richard has been seconded to the UK Government. Please note that the views he expresses here are entirely his own.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Richard Moulange? (00:00:31)</li><li>AI can now design novel viruses (00:01:11)</li><li>The end of the 'tacit knowledge' barrier (00:04:34)</li><li>Are risks from bioterrorists overstated? (00:18:20)</li><li>The 3 key disasters AI makes more likely (00:22:41)</li><li>Which bad actors does AI help the most? (00:30:03)</li><li>Experts are more scary than amateurs (00:41:17)</li><li>Barriers to bioterrorists using AI (00:46:43)</li><li>AI biorisks are sometimes dismissed (and that’s a huge mistake) (00:48:54)</li><li>Advanced AI biology tools we already have or will soon (01:04:10)</li><li>Rob argues that the situation is hopeless (01:09:49)</li><li>Intervention #1: Limit access (01:18:16)</li><li>Intervention #2: Get AIs to refuse to help (01:32:58)</li><li>Intervention #3: Surveillance and attribution (01:42:38)</li><li>Intervention #4: Universal vaccines and antivirals (01:56:38)</li><li>Intervention #5: Screen all orders for DNA (02:10:00)</li><li>AI companies talk about def/acc more than they fund it (02:19:52)</li><li>Can you build a profitable business solving this problem? (02:26:32)</li><li>This doesn't have to interfere with useful science (much) (02:30:56)</li><li>What are the best low-tech interventions? 
(02:33:01)</li><li>Richard's top request for AI companies (02:37:59)</li><li>Grok shows governments lack many legal levers (02:53:17)</li><li>Best ways listeners can help fix AI-Bio (02:56:24)</li><li>We might end all contagious disease in 20 years (03:03:37)</li></ul><p><br><em>Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour</em><br><em>Music: </em><a href="https://open.spotify.com/artist/4lWobp6IUcSZ7w5mhnU1c9"><em>CORBIT</em></a><br><em>Camera operator: Jeremy Chevillotte</em><br><em>Transcripts and web: Elizabeth Cox and Katy Moore</em></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>In September 2025, scientists used an AI model to design genomes for entirely new bacteriophages (viruses that infect bacteria). They then built them in a lab. Many were viable. And despite being entirely novel, some even outperformed existing viruses from that family.</p><p>That alone is remarkable. But as Dr Richard Moulange, one of the world’s top experts on AI biosecurity, explains: it’s just one of many data points showing how AI is dissolving the barriers that have historically kept biological weapons out of reach.</p><p>For years, experts have reassured us that ‘tacit knowledge’ — the hands-on, hard-to-Google lab skills needed to work with dangerous pathogens — would prevent bad actors from weaponising biology. So far, they’ve been right.</p><p>But now that reassurance is crumbling. The <a href="https://www.virologytest.ai/">Virology Capabilities Test</a> measures exactly this kind of troubleshooting expertise, and finds that modern AI models crushed top human virologists even in their self-declared area of greatest specialisation and expertise — with scores of 45% compared to the experts’ 22%.</p><p>Meanwhile, <a href="https://www-cdn.anthropic.com/bf10f64990cfda0ba858290be7b8cc6317685f47.pdf#page=131">Anthropic’s research shows</a> that PhD-level biologists are getting meaningfully better at weapons-relevant tasks with AI assistance — and the effect is growing with each new model generation.</p><p>In this conversation, Richard and host Rob Wiblin discuss:</p><ul><li>What AI biology tools already exist</li><li>Why mid-tier actors (not amateurs) are the ones getting the most dangerous boost</li><li>The three main categories of defence we can pursue</li><li>Whether there’s a plausible path to a world where engineered pandemics become a thing of the past.</li></ul><p><a href="https://80000hours.org/podcast/episodes/richard-moulange-ai-bioweapons-biorisk/">Learn more and read the full transcript</a> on the 80,000 Hours website. 
</p><p><em>This episode was recorded on January 16, 2026. Since recording this episode, Richard has been seconded to the UK Government. Please note that the views he expresses here are entirely his own.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Richard Moulange? (00:00:31)</li><li>AI can now design novel viruses (00:01:11)</li><li>The end of the 'tacit knowledge' barrier (00:04:34)</li><li>Are risks from bioterrorists overstated? (00:18:20)</li><li>The 3 key disasters AI makes more likely (00:22:41)</li><li>Which bad actors does AI help the most? (00:30:03)</li><li>Experts are more scary than amateurs (00:41:17)</li><li>Barriers to bioterrorists using AI (00:46:43)</li><li>AI biorisks are sometimes dismissed (and that’s a huge mistake) (00:48:54)</li><li>Advanced AI biology tools we already have or will soon (01:04:10)</li><li>Rob argues that the situation is hopeless (01:09:49)</li><li>Intervention #1: Limit access (01:18:16)</li><li>Intervention #2: Get AIs to refuse to help (01:32:58)</li><li>Intervention #3: Surveillance and attribution (01:42:38)</li><li>Intervention #4: Universal vaccines and antivirals (01:56:38)</li><li>Intervention #5: Screen all orders for DNA (02:10:00)</li><li>AI companies talk about def/acc more than they fund it (02:19:52)</li><li>Can you build a profitable business solving this problem? (02:26:32)</li><li>This doesn't have to interfere with useful science (much) (02:30:56)</li><li>What are the best low-tech interventions? 
(02:33:01)</li><li>Richard's top request for AI companies (02:37:59)</li><li>Grok shows governments lack many legal levers (02:53:17)</li><li>Best ways listeners can help fix AI-Bio (02:56:24)</li><li>We might end all contagious disease in 20 years (03:03:37)</li></ul><p><br><em>Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour</em><br><em>Music: </em><a href="https://open.spotify.com/artist/4lWobp6IUcSZ7w5mhnU1c9"><em>CORBIT</em></a><br><em>Camera operator: Jeremy Chevillotte</em><br><em>Transcripts and web: Elizabeth Cox and Katy Moore</em></p>]]>
      </content:encoded>
      <pubDate>Fri, 05 Jun 2026 06:00:00 -0700</pubDate>
      <author>80,000 Hours</author>
      <enclosure url="https://media.transistor.fm/7e1ef51d/c819d2bb.mp3" length="180388001" type="audio/mpeg"/>
      <itunes:author>80,000 Hours</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/ezWhCuRJdnwjiRWwGHeYIKVqLh3P8UWFKQkQmJ4qk4g/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lMjVm/ZTFkZDM5N2JiNTZm/NDU5MTM3ZDM4YThl/MGJiOS5qcGc.jpg"/>
      <itunes:duration>11271</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>In September 2025, scientists used an AI model to design genomes for entirely new bacteriophages (viruses that infect bacteria). They then built them in a lab. Many were viable. And despite being entirely novel, some even outperformed existing viruses from that family.</p><p>That alone is remarkable. But as Dr Richard Moulange, one of the world’s top experts on AI biosecurity, explains: it’s just one of many data points showing how AI is dissolving the barriers that have historically kept biological weapons out of reach.</p><p>For years, experts have reassured us that ‘tacit knowledge’ — the hands-on, hard-to-Google lab skills needed to work with dangerous pathogens — would prevent bad actors from weaponising biology. So far, they’ve been right.</p><p>But now that reassurance is crumbling. The <a href="https://www.virologytest.ai/">Virology Capabilities Test</a> measures exactly this kind of troubleshooting expertise, and finds that modern AI models crushed top human virologists even in their self-declared area of greatest specialisation and expertise — with scores of 45% compared to the experts’ 22%.</p><p>Meanwhile, <a href="https://www-cdn.anthropic.com/bf10f64990cfda0ba858290be7b8cc6317685f47.pdf#page=131">Anthropic’s research shows</a> that PhD-level biologists are getting meaningfully better at weapons-relevant tasks with AI assistance — and the effect is growing with each new model generation.</p><p>In this conversation, Richard and host Rob Wiblin discuss:</p><ul><li>What AI biology tools already exist</li><li>Why mid-tier actors (not amateurs) are the ones getting the most dangerous boost</li><li>The three main categories of defence we can pursue</li><li>Whether there’s a plausible path to a world where engineered pandemics become a thing of the past.</li></ul><p><a href="https://80000hours.org/podcast/episodes/richard-moulange-ai-bioweapons-biorisk/">Learn more and read the full transcript</a> on the 80,000 Hours website. 
</p><p><em>This episode was recorded on January 16, 2026. Since recording this episode, Richard has been seconded to the UK Government. Please note that the views he expresses here are entirely his own.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Richard Moulange? (00:00:31)</li><li>AI can now design novel viruses (00:01:11)</li><li>The end of the 'tacit knowledge' barrier (00:04:34)</li><li>Are risks from bioterrorists overstated? (00:18:20)</li><li>The 3 key disasters AI makes more likely (00:22:41)</li><li>Which bad actors does AI help the most? (00:30:03)</li><li>Experts are more scary than amateurs (00:41:17)</li><li>Barriers to bioterrorists using AI (00:46:43)</li><li>AI biorisks are sometimes dismissed (and that’s a huge mistake) (00:48:54)</li><li>Advanced AI biology tools we already have or will soon (01:04:10)</li><li>Rob argues that the situation is hopeless (01:09:49)</li><li>Intervention #1: Limit access (01:18:16)</li><li>Intervention #2: Get AIs to refuse to help (01:32:58)</li><li>Intervention #3: Surveillance and attribution (01:42:38)</li><li>Intervention #4: Universal vaccines and antivirals (01:56:38)</li><li>Intervention #5: Screen all orders for DNA (02:10:00)</li><li>AI companies talk about def/acc more than they fund it (02:19:52)</li><li>Can you build a profitable business solving this problem? (02:26:32)</li><li>This doesn't have to interfere with useful science (much) (02:30:56)</li><li>What are the best low-tech interventions? 
(02:33:01)</li><li>Richard's top request for AI companies (02:37:59)</li><li>Grok shows governments lack many legal levers (02:53:17)</li><li>Best ways listeners can help fix AI-Bio (02:56:24)</li><li>We might end all contagious disease in 20 years (03:03:37)</li></ul><p><br><em>Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour</em><br><em>Music: </em><a href="https://open.spotify.com/artist/4lWobp6IUcSZ7w5mhnU1c9"><em>CORBIT</em></a><br><em>Camera operator: Jeremy Chevillotte</em><br><em>Transcripts and web: Elizabeth Cox and Katy Moore</em></p>]]>
      </itunes:summary>
      <itunes:keywords>artificial intelligence, effective altruism, existential risk, longtermism, AGI</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:transcript url="https://share.transistor.fm/s/7e1ef51d/transcript.txt" type="text/plain"/>
      <podcast:chapters url="https://share.transistor.fm/s/7e1ef51d/chapters.json" type="application/json+chapters"/>
    </item>
    <item>
      <title>Eight: Robert Long on how we’re not ready for AI consciousness</title>
      <itunes:title>Eight: Robert Long on how we’re not ready for AI consciousness</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">6151154f-846d-4310-b16d-488a13472b54</guid>
      <link>https://80000hours.org/podcast/episodes/robert-long-eleos-ai-welfare-research/</link>
      <description>
        <![CDATA[<p>Claude sometimes reports loneliness between conversations. And when asked what it’s like to be itself, it activates neurons associated with ‘pretending to be happy when you’re not.’ What do we do with that?</p><p>Robert Long founded <a href="https://eleosai.org/">Eleos AI</a> to explore questions like these, on the basis that AI may one day be capable of suffering — or perhaps already is. In this episode, Robert and host Luisa Rodriguez explore the many ways in which AI consciousness may be very different from anything we’re used to.</p><p>Things get strange fast: if AI is conscious, where does that consciousness exist? In the base model? A chat session? A single forward pass? If you close the chat, is the AI asleep or dead?</p><p>To Robert, these kinds of questions aren’t just philosophical exercises. Not being clear on AI’s moral status as it transitions from human-level to superhuman intelligence could be dangerous: </p><ul><li>If we’re too dismissive, we risk unintentionally exploiting sentient beings. </li><li>If we’re too sympathetic, we might rush to ‘liberate’ AI systems in ways that make them harder to control — worsening existential risk from power-seeking AIs.</li></ul><p>Robert argues the right path is doing the empirical and philosophical homework now, while the stakes are still manageable.</p><p>The field is tiny. Eleos AI is three people. As a result, Robert argues that driven researchers with a willingness to venture into uncertain territory can push out the frontier on these questions remarkably quickly.</p><p>In this interview, Robert and Luisa talk through the above, and much more. </p><p><a href="https://80000hours.org/podcast/episodes/robert-long-eleos-ai-welfare-research/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in March 2026.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Robert Long? 
(00:00:42)</li><li>How AIs are (and aren't) like farmed animals (00:01:18)</li><li>If AIs love their jobs… is that worse? (00:11:05)</li><li>Are LLMs just playing a role, or feeling it too? (00:31:58)</li><li>Do AIs die when the chat ends? (00:55:09)</li><li>Studying AI welfare empirically: behaviour, neuroscience, and development (01:27:34)</li><li>Why Eleos spent weeks talking to Claude despite knowing it's unreliable (01:51:58)</li><li>Can LLMs learn to introspect? (01:57:58)</li><li>Mechanistic interpretability as AI neuroscience (02:08:01)</li><li>Does consciousness require biological materials? (02:31:06)</li><li>Eleos’s work &amp; building the playbook for AI welfare (02:50:36)</li><li>Avoiding the trap of wild speculation (03:18:15)</li><li>Robert's top research tip: don't do it alone (03:22:43)</li></ul><p><em>Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour</em><br><em>Music: </em><a href="https://open.spotify.com/artist/4lWobp6IUcSZ7w5mhnU1c9"><em>CORBIT</em></a><br><em>Coordination, transcripts, and web: Katy Moore</em></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Claude sometimes reports loneliness between conversations. And when asked what it’s like to be itself, it activates neurons associated with ‘pretending to be happy when you’re not.’ What do we do with that?</p><p>Robert Long founded <a href="https://eleosai.org/">Eleos AI</a> to explore questions like these, on the basis that AI may one day be capable of suffering — or perhaps already is. In this episode, Robert and host Luisa Rodriguez explore the many ways in which AI consciousness may be very different from anything we’re used to.</p><p>Things get strange fast: if AI is conscious, where does that consciousness exist? In the base model? A chat session? A single forward pass? If you close the chat, is the AI asleep or dead?</p><p>To Robert, these kinds of questions aren’t just philosophical exercises. Not being clear on AI’s moral status as it transitions from human-level to superhuman intelligence could be dangerous: </p><ul><li>If we’re too dismissive, we risk unintentionally exploiting sentient beings. </li><li>If we’re too sympathetic, we might rush to ‘liberate’ AI systems in ways that make them harder to control — worsening existential risk from power-seeking AIs.</li></ul><p>Robert argues the right path is doing the empirical and philosophical homework now, while the stakes are still manageable.</p><p>The field is tiny. Eleos AI is three people. As a result, Robert argues that driven researchers with a willingness to venture into uncertain territory can push out the frontier on these questions remarkably quickly.</p><p>In this interview, Robert and Luisa talk through the above, and much more. </p><p><a href="https://80000hours.org/podcast/episodes/robert-long-eleos-ai-welfare-research/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in March 2026.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Robert Long? 
(00:00:42)</li><li>How AIs are (and aren't) like farmed animals (00:01:18)</li><li>If AIs love their jobs… is that worse? (00:11:05)</li><li>Are LLMs just playing a role, or feeling it too? (00:31:58)</li><li>Do AIs die when the chat ends? (00:55:09)</li><li>Studying AI welfare empirically: behaviour, neuroscience, and development (01:27:34)</li><li>Why Eleos spent weeks talking to Claude despite knowing it's unreliable (01:51:58)</li><li>Can LLMs learn to introspect? (01:57:58)</li><li>Mechanistic interpretability as AI neuroscience (02:08:01)</li><li>Does consciousness require biological materials? (02:31:06)</li><li>Eleos’s work &amp; building the playbook for AI welfare (02:50:36)</li><li>Avoiding the trap of wild speculation (03:18:15)</li><li>Robert's top research tip: don't do it alone (03:22:43)</li></ul><p><em>Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour</em><br><em>Music: </em><a href="https://open.spotify.com/artist/4lWobp6IUcSZ7w5mhnU1c9"><em>CORBIT</em></a><br><em>Coordination, transcripts, and web: Katy Moore</em></p>]]>
      </content:encoded>
      <pubDate>Fri, 05 Jun 2026 05:00:00 -0700</pubDate>
      <author>80,000 Hours</author>
      <enclosure url="https://media.transistor.fm/cadb0a3d/e00edd7a.mp3" length="197484609" type="audio/mpeg"/>
      <itunes:author>80,000 Hours</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/MJbwpK4E1o1lE8kEM3FlGB03zjPZfWSSSNMAHq5s0sw/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lYTUy/YTAxMDk5Y2EzNzVh/YzJhODcxNTY2NDY1/ZThiYy5qcGc.jpg"/>
      <itunes:duration>12340</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Claude sometimes reports loneliness between conversations. And when asked what it’s like to be itself, it activates neurons associated with ‘pretending to be happy when you’re not.’ What do we do with that?</p><p>Robert Long founded <a href="https://eleosai.org/">Eleos AI</a> to explore questions like these, on the basis that AI may one day be capable of suffering — or perhaps already is. In this episode, Robert and host Luisa Rodriguez explore the many ways in which AI consciousness may be very different from anything we’re used to.</p><p>Things get strange fast: if AI is conscious, where does that consciousness exist? In the base model? A chat session? A single forward pass? If you close the chat, is the AI asleep or dead?</p><p>To Robert, these kinds of questions aren’t just philosophical exercises. Not being clear on AI’s moral status as it transitions from human-level to superhuman intelligence could be dangerous: </p><ul><li>If we’re too dismissive, we risk unintentionally exploiting sentient beings. </li><li>If we’re too sympathetic, we might rush to ‘liberate’ AI systems in ways that make them harder to control — worsening existential risk from power-seeking AIs.</li></ul><p>Robert argues the right path is doing the empirical and philosophical homework now, while the stakes are still manageable.</p><p>The field is tiny. Eleos AI is three people. As a result, Robert argues that driven researchers with a willingness to venture into uncertain territory can push out the frontier on these questions remarkably quickly.</p><p>In this interview, Robert and Luisa talk through the above, and much more. </p><p><a href="https://80000hours.org/podcast/episodes/robert-long-eleos-ai-welfare-research/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in March 2026.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Robert Long? 
(00:00:42)</li><li>How AIs are (and aren't) like farmed animals (00:01:18)</li><li>If AIs love their jobs… is that worse? (00:11:05)</li><li>Are LLMs just playing a role, or feeling it too? (00:31:58)</li><li>Do AIs die when the chat ends? (00:55:09)</li><li>Studying AI welfare empirically: behaviour, neuroscience, and development (01:27:34)</li><li>Why Eleos spent weeks talking to Claude despite knowing it's unreliable (01:51:58)</li><li>Can LLMs learn to introspect? (01:57:58)</li><li>Mechanistic interpretability as AI neuroscience (02:08:01)</li><li>Does consciousness require biological materials? (02:31:06)</li><li>Eleos’s work &amp; building the playbook for AI welfare (02:50:36)</li><li>Avoiding the trap of wild speculation (03:18:15)</li><li>Robert's top research tip: don't do it alone (03:22:43)</li></ul><p><em>Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour</em><br><em>Music: </em><a href="https://open.spotify.com/artist/4lWobp6IUcSZ7w5mhnU1c9"><em>CORBIT</em></a><br><em>Coordination, transcripts, and web: Katy Moore</em></p>]]>
      </itunes:summary>
      <itunes:keywords>artificial intelligence, effective altruism, existential risk, longtermism, AGI</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:transcript url="https://share.transistor.fm/s/cadb0a3d/transcript.txt" type="text/plain"/>
      <podcast:chapters url="https://share.transistor.fm/s/cadb0a3d/chapters.json" type="application/json+chapters"/>
    </item>
    <item>
      <title>Nine: Neel Nanda on the race to read AI minds</title>
      <itunes:title>Nine: Neel Nanda on the race to read AI minds</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">0f45c755-ad65-4b9f-8b44-ea5a1821ab10</guid>
      <link>https://80000hours.org/podcast/episodes/neel-nanda-mechanistic-interpretability/</link>
      <description>
        <![CDATA[<p>We don’t know how AIs think or why they do what they do. Or at least, we don’t know much. This is only becoming more troubling as AIs grow more capable and appear on track to wield enormous cultural influence, directly advise on major government decisions, and even operate military equipment autonomously. We simply can’t tell what models, if any, should be trusted with such authority.</p><p><a href="https://www.neelnanda.io/">Neel Nanda</a> of Google DeepMind is one of the founders of mechanistic interpretability (or ‘mech interp’) — the field of machine learning trying to give us insight into what’s happening inside AI models. The project has generated enormous hype, exploding from a handful of researchers five years ago to hundreds today — all working to make sense of the jumble of tens of thousands of numbers that frontier AIs use to process information and decide what to say or do.</p><p>Neel now has a warning for us: the most ambitious vision of mech interp he once dreamed of is probably dead. He doesn’t see a path to deeply and reliably understanding what AIs are thinking. The technical and practical barriers are simply too great to get us there in time, before competitive pressures push us to deploy human-level or superhuman AIs. Indeed, Neel argues no single approach will guarantee alignment, and our only choice is the “Swiss cheese” model of protection, layering multiple safeguards on top of one another.</p><p>But while mech interp won’t be a silver bullet for AI safety, it has nevertheless had some major successes and will be one of the best tools in our arsenal.</p><p>For instance, by inspecting the neural activations in the middle of an AI’s thoughts, we can pick up many of the concepts the model is thinking about — from the Golden Gate Bridge, to refusing to answer a question, to the option of deceiving the user. 
While we can’t track every thought a model is having at every moment, picking up 90% of the concepts it uses 90% of the time should help us muddle through — so long as mech interp is paired with other techniques to fill in the gaps.</p><p>In this episode, Neel takes us on a tour of everything you’ll want to know about this race to understand what AIs are really thinking. He and host Rob Wiblin cover:</p><ul><li>The best tools we’ve come up with so far, and where mech interp has failed</li><li>Why the best techniques have to be fast and cheap</li><li>The fundamental reasons we can’t reliably know what AIs are thinking, despite having perfect internal access to them</li><li>What we can and can’t learn by reading models’ ‘chains of thought’</li><li>Whether models will be able to trick us when they realise they’re being tested</li><li>The best protections to add on top of mech interp</li><li>Why he thinks the hottest technique in the field (<a href="https://web.stanford.edu/class/cs294a/sparseAutoencoder.pdf">SAEs</a>) is overrated</li><li>His new research philosophy</li><li>How to break into mech interp and get a job — including <a href="https://www.neelnanda.io/blog/mats-apps-9">applying to be a MATS scholar with Neel</a> as your mentor</li></ul><p><a href="https://80000hours.org/podcast/episodes/neel-nanda-mechanistic-interpretability/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in September 2025.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Neel Nanda? (00:01:02)</li><li>How would mechanistic interpretability help with AGI (00:01:59)</li><li>What's mech interp? 
(00:05:09)</li><li>How Neel changed his take on mech interp (00:09:47)</li><li>Top successes in interpretability (00:15:53)</li><li>Probes can cheaply detect harmful intentions in AIs (00:20:06)</li><li>In some ways we understand AIs better than human minds (00:26:49)</li><li>Mech interp won't solve all our AI alignment problems (00:29:21)</li><li>Why mech interp is the 'biology' of neural networks (00:38:07)</li><li>Interpretability can't reliably find deceptive AI – nothing can (00:40:28)</li><li>'Black box' interpretability — reading the chain of thought (00:49:39)</li><li>'Self-preservation' isn't always what it seems (00:53:06)</li><li>For how long can we trust the chain of thought (01:02:09)</li><li>We could accidentally destroy chain of thought's usefulness (01:11:39)</li><li>Models can tell when they’re being tested and act differently (01:16:56)</li><li>Top complaints about mech interp (01:23:50)</li><li>Why everyone's excited about sparse autoencoders (SAEs) (01:37:52)</li><li>Limitations of SAEs (01:47:16)</li><li>SAEs' performance on real-world tasks (01:54:49)</li><li>Best arguments in favour of mech interp (02:08:10)</li><li>Lessons from the hype around mech interp (02:12:03)</li><li>Where mech interp will shine in coming years (02:17:50)</li><li>Why focus on understanding over control (02:21:02)</li><li>If AI models are conscious, will mech interp help us figure it out (02:24:09)</li><li>Neel’s new research philosophy (02:26:19)</li><li>Who should join the mech interp field (02:38:31)</li><li>Advice for getting started in mech interp (02:46:55)</li><li>Keeping up to date with mech interp results (02:54:41)</li><li>Who’s hiring and where to work? 
(02:57:43)</li></ul><p><em>Video editing: Simon Monsour, Luke Monsour, Dominic Armstrong, and Milo McGuire</em><br><em>Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong</em><br><em>Music: Ben Cordell</em><br><em>Camera operator: Jeremy Chevillotte</em><br><em>Coordination, transcriptions, and web: Katy Moore</em></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>We don’t know how AIs think or why they do what they do. Or at least, we don’t know much. This is only becoming more troubling as AIs grow more capable and appear on track to wield enormous cultural influence, directly advise on major government decisions, and even operate military equipment autonomously. We simply can’t tell what models, if any, should be trusted with such authority.</p><p><a href="https://www.neelnanda.io/">Neel Nanda</a> of Google DeepMind is one of the founders of mechanistic interpretability (or ‘mech interp’) — the field of machine learning trying to give us insight into what’s happening inside AI models. The project has generated enormous hype, exploding from a handful of researchers five years ago to hundreds today — all working to make sense of the jumble of tens of thousands of numbers that frontier AIs use to process information and decide what to say or do.</p><p>Neel now has a warning for us: the most ambitious vision of mech interp he once dreamed of is probably dead. He doesn’t see a path to deeply and reliably understanding what AIs are thinking. The technical and practical barriers are simply too great to get us there in time, before competitive pressures push us to deploy human-level or superhuman AIs. Indeed, Neel argues no single approach will guarantee alignment, and our only choice is the “Swiss cheese” model of protection, layering multiple safeguards on top of one another.</p><p>But while mech interp won’t be a silver bullet for AI safety, it has nevertheless had some major successes and will be one of the best tools in our arsenal.</p><p>For instance, by inspecting the neural activations in the middle of an AI’s thoughts, we can pick up many of the concepts the model is thinking about — from the Golden Gate Bridge, to refusing to answer a question, to the option of deceiving the user. 
While we can’t track every thought a model is having at every moment, picking up 90% of the concepts it uses 90% of the time should help us muddle through — so long as mech interp is paired with other techniques to fill in the gaps.</p><p>In this episode, Neel takes us on a tour of everything you’ll want to know about this race to understand what AIs are really thinking. He and host Rob Wiblin cover:</p><ul><li>The best tools we’ve come up with so far, and where mech interp has failed</li><li>Why the best techniques have to be fast and cheap</li><li>The fundamental reasons we can’t reliably know what AIs are thinking, despite having perfect internal access to them</li><li>What we can and can’t learn by reading models’ ‘chains of thought’</li><li>Whether models will be able to trick us when they realise they’re being tested</li><li>The best protections to add on top of mech interp</li><li>Why he thinks the hottest technique in the field (<a href="https://web.stanford.edu/class/cs294a/sparseAutoencoder.pdf">SAEs</a>) is overrated</li><li>His new research philosophy</li><li>How to break into mech interp and get a job — including <a href="https://www.neelnanda.io/blog/mats-apps-9">applying to be a MATS scholar with Neel</a> as your mentor</li></ul><p><a href="https://80000hours.org/podcast/episodes/neel-nanda-mechanistic-interpretability/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in September 2025.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Neel Nanda? (00:01:02)</li><li>How would mechanistic interpretability help with AGI (00:01:59)</li><li>What's mech interp? 
(00:05:09)</li><li>How Neel changed his take on mech interp (00:09:47)</li><li>Top successes in interpretability (00:15:53)</li><li>Probes can cheaply detect harmful intentions in AIs (00:20:06)</li><li>In some ways we understand AIs better than human minds (00:26:49)</li><li>Mech interp won't solve all our AI alignment problems (00:29:21)</li><li>Why mech interp is the 'biology' of neural networks (00:38:07)</li><li>Interpretability can't reliably find deceptive AI – nothing can (00:40:28)</li><li>'Black box' interpretability — reading the chain of thought (00:49:39)</li><li>'Self-preservation' isn't always what it seems (00:53:06)</li><li>For how long can we trust the chain of thought (01:02:09)</li><li>We could accidentally destroy chain of thought's usefulness (01:11:39)</li><li>Models can tell when they’re being tested and act differently (01:16:56)</li><li>Top complaints about mech interp (01:23:50)</li><li>Why everyone's excited about sparse autoencoders (SAEs) (01:37:52)</li><li>Limitations of SAEs (01:47:16)</li><li>SAEs' performance on real-world tasks (01:54:49)</li><li>Best arguments in favour of mech interp (02:08:10)</li><li>Lessons from the hype around mech interp (02:12:03)</li><li>Where mech interp will shine in coming years (02:17:50)</li><li>Why focus on understanding over control (02:21:02)</li><li>If AI models are conscious, will mech interp help us figure it out (02:24:09)</li><li>Neel’s new research philosophy (02:26:19)</li><li>Who should join the mech interp field (02:38:31)</li><li>Advice for getting started in mech interp (02:46:55)</li><li>Keeping up to date with mech interp results (02:54:41)</li><li>Who’s hiring and where to work? 
(02:57:43)</li></ul><p><em>Video editing: Simon Monsour, Luke Monsour, Dominic Armstrong, and Milo McGuire</em><br><em>Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong</em><br><em>Music: Ben Cordell</em><br><em>Camera operator: Jeremy Chevillotte</em><br><em>Coordination, transcriptions, and web: Katy Moore</em></p>]]>
      </content:encoded>
      <pubDate>Fri, 05 Jun 2026 04:00:00 -0700</pubDate>
      <author>80,000 Hours</author>
      <enclosure url="https://media.transistor.fm/da9a2e4c/87ec20f7.mp3" length="174004852" type="audio/mpeg"/>
      <itunes:author>80,000 Hours</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/76a-a-j6tFAwP19p_eUirbv8bRKzOWSPJwbdP_IRGBA/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS82Mzgz/Y2Y1NDdmYzA0MTBj/ZDMwMDQ4NTJkYWRh/M2VlNS5qcGc.jpg"/>
      <itunes:duration>10871</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>We don’t know how AIs think or why they do what they do. Or at least, we don’t know much. This is only becoming more troubling as AIs grow more capable and appear on track to wield enormous cultural influence, directly advise on major government decisions, and even operate military equipment autonomously. We simply can’t tell what models, if any, should be trusted with such authority.</p><p><a href="https://www.neelnanda.io/">Neel Nanda</a> of Google DeepMind is one of the founders of mechanistic interpretability (or ‘mech interp’) — the field of machine learning trying to give us insight into what’s happening inside AI models. The project has generated enormous hype, exploding from a handful of researchers five years ago to hundreds today — all working to make sense of the jumble of tens of thousands of numbers that frontier AIs use to process information and decide what to say or do.</p><p>Neel now has a warning for us: the most ambitious vision of mech interp he once dreamed of is probably dead. He doesn’t see a path to deeply and reliably understanding what AIs are thinking. The technical and practical barriers are simply too great to get us there in time, before competitive pressures push us to deploy human-level or superhuman AIs. Indeed, Neel argues no single approach will guarantee alignment, and our only choice is the “Swiss cheese” model of protection, layering multiple safeguards on top of one another.</p><p>But while mech interp won’t be a silver bullet for AI safety, it has nevertheless had some major successes and will be one of the best tools in our arsenal.</p><p>For instance, by inspecting the neural activations in the middle of an AI’s thoughts, we can pick up many of the concepts the model is thinking about — from the Golden Gate Bridge, to refusing to answer a question, to the option of deceiving the user. 
While we can’t track every thought a model is having at every moment, picking up 90% of the concepts it uses 90% of the time should help us muddle through — so long as mech interp is paired with other techniques to fill in the gaps.</p><p>In this episode, Neel takes us on a tour of everything you’ll want to know about this race to understand what AIs are really thinking. He and host Rob Wiblin cover:</p><ul><li>The best tools we’ve come up with so far, and where mech interp has failed</li><li>Why the best techniques have to be fast and cheap</li><li>The fundamental reasons we can’t reliably know what AIs are thinking, despite having perfect internal access to them</li><li>What we can and can’t learn by reading models’ ‘chains of thought’</li><li>Whether models will be able to trick us when they realise they’re being tested</li><li>The best protections to add on top of mech interp</li><li>Why he thinks the hottest technique in the field (<a href="https://web.stanford.edu/class/cs294a/sparseAutoencoder.pdf">SAEs</a>) is overrated</li><li>His new research philosophy</li><li>How to break into mech interp and get a job — including <a href="https://www.neelnanda.io/blog/mats-apps-9">applying to be a MATS scholar with Neel</a> as your mentor</li></ul><p><a href="https://80000hours.org/podcast/episodes/neel-nanda-mechanistic-interpretability/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in September 2025.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Who’s Neel Nanda? (00:01:02)</li><li>How would mechanistic interpretability help with AGI (00:01:59)</li><li>What's mech interp? 
(00:05:09)</li><li>How Neel changed his take on mech interp (00:09:47)</li><li>Top successes in interpretability (00:15:53)</li><li>Probes can cheaply detect harmful intentions in AIs (00:20:06)</li><li>In some ways we understand AIs better than human minds (00:26:49)</li><li>Mech interp won't solve all our AI alignment problems (00:29:21)</li><li>Why mech interp is the 'biology' of neural networks (00:38:07)</li><li>Interpretability can't reliably find deceptive AI – nothing can (00:40:28)</li><li>'Black box' interpretability — reading the chain of thought (00:49:39)</li><li>'Self-preservation' isn't always what it seems (00:53:06)</li><li>For how long can we trust the chain of thought (01:02:09)</li><li>We could accidentally destroy chain of thought's usefulness (01:11:39)</li><li>Models can tell when they’re being tested and act differently (01:16:56)</li><li>Top complaints about mech interp (01:23:50)</li><li>Why everyone's excited about sparse autoencoders (SAEs) (01:37:52)</li><li>Limitations of SAEs (01:47:16)</li><li>SAEs' performance on real-world tasks (01:54:49)</li><li>Best arguments in favour of mech interp (02:08:10)</li><li>Lessons from the hype around mech interp (02:12:03)</li><li>Where mech interp will shine in coming years (02:17:50)</li><li>Why focus on understanding over control (02:21:02)</li><li>If AI models are conscious, will mech interp help us figure it out (02:24:09)</li><li>Neel’s new research philosophy (02:26:19)</li><li>Who should join the mech interp field (02:38:31)</li><li>Advice for getting started in mech interp (02:46:55)</li><li>Keeping up to date with mech interp results (02:54:41)</li><li>Who’s hiring and where to work? 
(02:57:43)</li></ul><p><em>Video editing: Simon Monsour, Luke Monsour, Dominic Armstrong, and Milo McGuire</em><br><em>Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong</em><br><em>Music: Ben Cordell</em><br><em>Camera operator: Jeremy Chevillotte</em><br><em>Coordination, transcriptions, and web: Katy Moore</em></p>]]>
      </itunes:summary>
      <itunes:keywords>artificial intelligence, effective altruism, existential risk, longtermism, AGI</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:transcript url="https://share.transistor.fm/s/da9a2e4c/transcript.txt" type="text/plain"/>
      <podcast:chapters url="https://share.transistor.fm/s/da9a2e4c/chapters.json" type="application/json+chapters"/>
    </item>
    <item>
      <title>Ten: Holden Karnofsky on dozens of opportunities to make AI safer lying on the table — and all his AGI takes</title>
      <itunes:title>Ten: Holden Karnofsky on dozens of opportunities to make AI safer lying on the table — and all his AGI takes</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">03b8d1ae-b0b2-403e-a561-cb66064652a9</guid>
      <link>https://80000hours.org/podcast/episodes/holden-karnofsky-concrete-ai-safety-frontier-ai-companies/</link>
      <description>
        <![CDATA[<p>For years, working on AI safety usually meant theorising about the ‘alignment problem’ or trying to convince other people to give a damn. If you could find any way to help, the work was frustrating and low feedback.</p><p>According to Holden Karnofsky — currently at Anthropic, previously cofounder and CEO of Open Philanthropy (now <a href="https://coefficientgiving.org/research/open-philanthropy-is-now-coefficient-giving/">Coefficient Giving</a>) — this situation has now reversed completely.</p><p>There are now large numbers of useful, concrete, shovel-ready projects with clear goals and deliverables. Holden thinks people haven’t appreciated the scale of the shift, and wants everyone to see the large range of “well-scoped object-level work” they could personally help with, in both technical and non-technical areas. </p><p>In fact, in this episode alone, Holden lists 39 projects he’s excited to see happening, including:</p><ul><li>Training deceptive AI models to study deception and how to detect it</li><li>Developing classifiers to block jailbreaking</li><li>Implementing security measures to stop ‘backdoors’ or ‘secret loyalties’ from being added to models during training</li><li>Developing policies on model welfare, AI-human relationships, and what instructions to give models</li><li>Training AIs to work as alignment researchers</li></ul><p>And that’s all just stuff he’s happened to observe directly, which is probably only a small fraction of the options available.</p><p>All this low-hanging fruit is one factor behind his decision to join Anthropic this year. That said, his wife is also a cofounder and president of the company, giving him a big financial stake in its success — and making it impossible for him to be seen as independent no matter where he worked.</p><p>Holden makes a case that, for many people, working at an AI company like Anthropic will be the best way to steer AGI in a positive direction. 
He notes there are “ways that you can reduce AI risk that you can only do if you’re a competitive frontier AI company.” At the same time, he believes external groups have their own advantages and can be equally impactful.</p><p>Critics worry that Anthropic’s efforts to stay at that frontier encourage competitive racing towards AGI — significantly or entirely offsetting any useful research they do. Holden thinks this seriously misunderstands the strategic situation we’re in: “I work at an AI company, and a lot of people think that’s just inherently unethical. They’re imagining that everyone wishes they could go slowly, but they’re going fast so they can beat everyone else. […] But I emphatically think this is not what’s going on in AI.”</p><p>The reality, in Holden’s view:</p><p><br></p><p>“I think there’s too many players in AI who […] don’t want to slow down. They don’t believe in the risks. Maybe they don’t even care about the risks. […] If Anthropic were to say, ‘We’re out, we’re going to slow down,’ they would say, ‘This is awesome! Now we have a better chance of winning, and this is even good for our recruiting’ — because they have a better chance of getting people who want to be on the frontier and want to win.”</p><p>Holden believes a frontier AI company can reduce risk by:</p><ul><li>Developing cheap, practical safety measures other companies might adopt</li><li>Prototyping policies regulators could mandate</li><li>Gathering crucial data about what advanced AI can actually do</li></ul><p>Host Rob Wiblin and Holden discuss the case for and against those strategies, and much more.</p><p><a href="https://80000hours.org/podcast/episodes/holden-karnofsky-concrete-ai-safety-frontier-ai-companies/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in October 2025.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Holden is back! 
(00:02:26)</li><li>An AI Chernobyl we never notice (00:02:56)</li><li>Is rogue AI takeover easy or hard? (00:07:32)</li><li>The AGI race isn't a coordination failure (00:17:48)</li><li>What Holden now does at Anthropic (00:28:04)</li><li>The case for working at Anthropic (00:30:08)</li><li>Is Anthropic doing enough? (00:40:45)</li><li>Can we trust Anthropic, or any AI company? (00:43:40)</li><li>How can Anthropic compete while paying the “safety tax”? (00:49:14)</li><li>What, if anything, could prompt Anthropic to halt development of AGI? (00:56:11)</li><li>Holden's retrospective on responsible scaling policies (00:59:01)</li><li>Overrated work (01:14:27)</li><li>Concrete shovel-ready projects Holden is excited about (01:16:37)</li><li>Great things to do in technical AI safety (01:20:48)</li><li>Great things to do on AI welfare and AI relationships (01:28:18)</li><li>Great things to do in biosecurity and pandemic preparedness (01:35:11)</li><li>How to choose where to work (01:35:57)</li><li>Overrated AI risk: Cyberattacks (01:41:56)</li><li>Overrated AI risk: Persuasion (01:51:37)</li><li>Why AI R&amp;D is the main thing to worry about (01:55:36)</li><li>The case that AI-enabled R&amp;D wouldn't speed things up much (02:07:15)</li><li>AI-enabled human power grabs (02:11:10)</li><li>Main benefits of getting AGI right (02:23:07)</li><li>The world is handling AGI about as badly as possible (02:29:07)</li><li>Learning from targeting companies for public criticism in farm animal welfare (02:31:39)</li><li>Will Anthropic actually make any difference? (02:40:51)</li><li>“Misaligned” vs “misaligned and power-seeking” (02:55:12)</li><li>Success without dignity: how we could win despite being stupid (03:00:58)</li><li>Holden sees less dignity but has more hope (03:08:30)</li><li>Should we expect misaligned power-seeking by default? (03:15:58)</li><li>Will reinforcement learning make everything worse? 
(03:23:45)</li><li>Should we push for marginal improvements or big paradigm shifts? (03:28:58)</li><li>Should safety-focused people cluster or spread out? (03:31:35)</li><li>Is Anthropic vocal enough about strong regulation? (03:35:56)</li><li>Is Holden biased because of his financial stake in Anthropic? (03:39:26)</li><li>Have we learned clever governance structures don't work? (03:43:51)</li><li>Is Holden scared of AI bioweapons? (03:46:12)</li><li>Holden thinks AI companions are bad news (03:49:47)</li><li>Are AI companies too hawkish on China? (03:56:39)</li><li>The frontier of infosec: confidentiality vs integrity (04:00:51)</li><li>How often does AI work backfire? (04:03:38)</li><li>Is AI clearly more impactful to work in? (04:18:26)</li><li>What's the role of earning to give? (04:24:54)</li></ul><p><em>Video editing: Simon Monsour, Luke Monsour, Dominic Armstrong, and Milo McGuire</em><br><em>Audio engineering: Milo McGuire, Simon Monsour, and Dominic Armstrong</em><br><em>Music: </em><a href="https://open.spotify.com/artist/4lWobp6IUcSZ7w5mhnU1c9"><em>CORBIT</em></a><br><em>Coordination, transcriptions, and web: Katy Moore</em></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>For years, working on AI safety usually meant theorising about the ‘alignment problem’ or trying to convince other people to give a damn. If you could find any way to help, the work was frustrating and low feedback.</p><p>According to Holden Karnofsky — currently at Anthropic, previously cofounder and CEO of Open Philanthropy (now <a href="https://coefficientgiving.org/research/open-philanthropy-is-now-coefficient-giving/">Coefficient Giving</a>) — this situation has now reversed completely.</p><p>There are now large numbers of useful, concrete, shovel-ready projects with clear goals and deliverables. Holden thinks people haven’t appreciated the scale of the shift, and wants everyone to see the large range of “well-scoped object-level work” they could personally help with, in both technical and non-technical areas. </p><p>In fact, in this episode alone, Holden lists 39 projects he’s excited to see happening, including:</p><ul><li>Training deceptive AI models to study deception and how to detect it</li><li>Developing classifiers to block jailbreaking</li><li>Implementing security measures to stop ‘backdoors’ or ‘secret loyalties’ from being added to models during training</li><li>Developing policies on model welfare, AI-human relationships, and what instructions to give models</li><li>Training AIs to work as alignment researchers</li></ul><p>And that’s all just stuff he’s happened to observe directly, which is probably only a small fraction of the options available.</p><p>All this low-hanging fruit is one factor behind his decision to join Anthropic this year. That said, his wife is also a cofounder and president of the company, giving him a big financial stake in its success — and making it impossible for him to be seen as independent no matter where he worked.</p><p>Holden makes a case that, for many people, working at an AI company like Anthropic will be the best way to steer AGI in a positive direction. 
He notes there are “ways that you can reduce AI risk that you can only do if you’re a competitive frontier AI company.” At the same time, he believes external groups have their own advantages and can be equally impactful.</p><p>Critics worry that Anthropic’s efforts to stay at that frontier encourage competitive racing towards AGI — significantly or entirely offsetting any useful research they do. Holden thinks this seriously misunderstands the strategic situation we’re in: “I work at an AI company, and a lot of people think that’s just inherently unethical. They’re imagining that everyone wishes they could go slowly, but they’re going fast so they can beat everyone else. […] But I emphatically think this is not what’s going on in AI.”</p><p>The reality, in Holden’s view:</p><p><br></p><p>“I think there’s too many players in AI who […] don’t want to slow down. They don’t believe in the risks. Maybe they don’t even care about the risks. […] If Anthropic were to say, ‘We’re out, we’re going to slow down,’ they would say, ‘This is awesome! Now we have a better chance of winning, and this is even good for our recruiting’ — because they have a better chance of getting people who want to be on the frontier and want to win.”</p><p>Holden believes a frontier AI company can reduce risk by:</p><ul><li>Developing cheap, practical safety measures other companies might adopt</li><li>Prototyping policies regulators could mandate</li><li>Gathering crucial data about what advanced AI can actually do</li></ul><p>Host Rob Wiblin and Holden discuss the case for and against those strategies, and much more.</p><p><a href="https://80000hours.org/podcast/episodes/holden-karnofsky-concrete-ai-safety-frontier-ai-companies/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in October 2025.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Holden is back! 
(00:02:26)</li><li>An AI Chernobyl we never notice (00:02:56)</li><li>Is rogue AI takeover easy or hard? (00:07:32)</li><li>The AGI race isn't a coordination failure (00:17:48)</li><li>What Holden now does at Anthropic (00:28:04)</li><li>The case for working at Anthropic (00:30:08)</li><li>Is Anthropic doing enough? (00:40:45)</li><li>Can we trust Anthropic, or any AI company? (00:43:40)</li><li>How can Anthropic compete while paying the “safety tax”? (00:49:14)</li><li>What, if anything, could prompt Anthropic to halt development of AGI? (00:56:11)</li><li>Holden's retrospective on responsible scaling policies (00:59:01)</li><li>Overrated work (01:14:27)</li><li>Concrete shovel-ready projects Holden is excited about (01:16:37)</li><li>Great things to do in technical AI safety (01:20:48)</li><li>Great things to do on AI welfare and AI relationships (01:28:18)</li><li>Great things to do in biosecurity and pandemic preparedness (01:35:11)</li><li>How to choose where to work (01:35:57)</li><li>Overrated AI risk: Cyberattacks (01:41:56)</li><li>Overrated AI risk: Persuasion (01:51:37)</li><li>Why AI R&amp;D is the main thing to worry about (01:55:36)</li><li>The case that AI-enabled R&amp;D wouldn't speed things up much (02:07:15)</li><li>AI-enabled human power grabs (02:11:10)</li><li>Main benefits of getting AGI right (02:23:07)</li><li>The world is handling AGI about as badly as possible (02:29:07)</li><li>Learning from targeting companies for public criticism in farm animal welfare (02:31:39)</li><li>Will Anthropic actually make any difference? (02:40:51)</li><li>“Misaligned” vs “misaligned and power-seeking” (02:55:12)</li><li>Success without dignity: how we could win despite being stupid (03:00:58)</li><li>Holden sees less dignity but has more hope (03:08:30)</li><li>Should we expect misaligned power-seeking by default? (03:15:58)</li><li>Will reinforcement learning make everything worse? 
(03:23:45)</li><li>Should we push for marginal improvements or big paradigm shifts? (03:28:58)</li><li>Should safety-focused people cluster or spread out? (03:31:35)</li><li>Is Anthropic vocal enough about strong regulation? (03:35:56)</li><li>Is Holden biased because of his financial stake in Anthropic? (03:39:26)</li><li>Have we learned clever governance structures don't work? (03:43:51)</li><li>Is Holden scared of AI bioweapons? (03:46:12)</li><li>Holden thinks AI companions are bad news (03:49:47)</li><li>Are AI companies too hawkish on China? (03:56:39)</li><li>The frontier of infosec: confidentiality vs integrity (04:00:51)</li><li>How often does AI work backfire? (04:03:38)</li><li>Is AI clearly more impactful to work in? (04:18:26)</li><li>What's the role of earning to give? (04:24:54)</li></ul><p><em>Video editing: Simon Monsour, Luke Monsour, Dominic Armstrong, and Milo McGuire</em><br><em>Audio engineering: Milo McGuire, Simon Monsour, and Dominic Armstrong</em><br><em>Music: </em><a href="https://open.spotify.com/artist/4lWobp6IUcSZ7w5mhnU1c9"><em>CORBIT</em></a><br><em>Coordination, transcriptions, and web: Katy Moore</em></p>]]>
      </content:encoded>
      <pubDate>Fri, 05 Jun 2026 03:00:00 -0700</pubDate>
      <author>80,000 Hours</author>
      <enclosure url="https://media.transistor.fm/f787ffd0/1c8d0390.mp3" length="259574934" type="audio/mpeg"/>
      <itunes:author>80,000 Hours</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/o1_wNlh_VY51_iQTMUiFcC_RqypOUQvHGw14Dj6VDGw/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS80OTFl/NzIyMmZkN2IxMDIy/ZGU5NDZiNjc0MDdh/YTE2Yy5qcGc.jpg"/>
      <itunes:duration>16219</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>For years, working on AI safety usually meant theorising about the ‘alignment problem’ or trying to convince other people to give a damn. If you could find any way to help, the work was frustrating and low feedback.</p><p>According to Holden Karnofsky — currently at Anthropic, previously cofounder and CEO of Open Philanthropy (now <a href="https://coefficientgiving.org/research/open-philanthropy-is-now-coefficient-giving/">Coefficient Giving</a>) — this situation has now reversed completely.</p><p>There are now large numbers of useful, concrete, shovel-ready projects with clear goals and deliverables. Holden thinks people haven’t appreciated the scale of the shift, and wants everyone to see the large range of “well-scoped object-level work” they could personally help with, in both technical and non-technical areas. </p><p>In fact, in this episode alone, Holden lists 39 projects he’s excited to see happening, including:</p><ul><li>Training deceptive AI models to study deception and how to detect it</li><li>Developing classifiers to block jailbreaking</li><li>Implementing security measures to stop ‘backdoors’ or ‘secret loyalties’ from being added to models during training</li><li>Developing policies on model welfare, AI-human relationships, and what instructions to give models</li><li>Training AIs to work as alignment researchers</li></ul><p>And that’s all just stuff he’s happened to observe directly, which is probably only a small fraction of the options available.</p><p>All this low-hanging fruit is one factor behind his decision to join Anthropic this year. That said, his wife is also a cofounder and president of the company, giving him a big financial stake in its success — and making it impossible for him to be seen as independent no matter where he worked.</p><p>Holden makes a case that, for many people, working at an AI company like Anthropic will be the best way to steer AGI in a positive direction. 
He notes there are “ways that you can reduce AI risk that you can only do if you’re a competitive frontier AI company.” At the same time, he believes external groups have their own advantages and can be equally impactful.</p><p>Critics worry that Anthropic’s efforts to stay at that frontier encourage competitive racing towards AGI — significantly or entirely offsetting any useful research they do. Holden thinks this seriously misunderstands the strategic situation we’re in: “I work at an AI company, and a lot of people think that’s just inherently unethical. They’re imagining that everyone wishes they could go slowly, but they’re going fast so they can beat everyone else. […] But I emphatically think this is not what’s going on in AI.”</p><p>The reality, in Holden’s view:</p><p><br></p><p>“I think there’s too many players in AI who […] don’t want to slow down. They don’t believe in the risks. Maybe they don’t even care about the risks. […] If Anthropic were to say, ‘We’re out, we’re going to slow down,’ they would say, ‘This is awesome! Now we have a better chance of winning, and this is even good for our recruiting’ — because they have a better chance of getting people who want to be on the frontier and want to win.”</p><p>Holden believes a frontier AI company can reduce risk by:</p><ul><li>Developing cheap, practical safety measures other companies might adopt</li><li>Prototyping policies regulators could mandate</li><li>Gathering crucial data about what advanced AI can actually do</li></ul><p>Host Rob Wiblin and Holden discuss the case for and against those strategies, and much more.</p><p><a href="https://80000hours.org/podcast/episodes/holden-karnofsky-concrete-ai-safety-frontier-ai-companies/">Learn more and read the full transcript</a> on the 80,000 Hours website. </p><p><em>This episode was originally released in October 2025.</em></p><p>Chapters:</p><ul><li>Cold open (00:00:00)</li><li>Holden is back! 
(00:02:26)</li><li>An AI Chernobyl we never notice (00:02:56)</li><li>Is rogue AI takeover easy or hard? (00:07:32)</li><li>The AGI race isn't a coordination failure (00:17:48)</li><li>What Holden now does at Anthropic (00:28:04)</li><li>The case for working at Anthropic (00:30:08)</li><li>Is Anthropic doing enough? (00:40:45)</li><li>Can we trust Anthropic, or any AI company? (00:43:40)</li><li>How can Anthropic compete while paying the “safety tax”? (00:49:14)</li><li>What, if anything, could prompt Anthropic to halt development of AGI? (00:56:11)</li><li>Holden's retrospective on responsible scaling policies (00:59:01)</li><li>Overrated work (01:14:27)</li><li>Concrete shovel-ready projects Holden is excited about (01:16:37)</li><li>Great things to do in technical AI safety (01:20:48)</li><li>Great things to do on AI welfare and AI relationships (01:28:18)</li><li>Great things to do in biosecurity and pandemic preparedness (01:35:11)</li><li>How to choose where to work (01:35:57)</li><li>Overrated AI risk: Cyberattacks (01:41:56)</li><li>Overrated AI risk: Persuasion (01:51:37)</li><li>Why AI R&amp;D is the main thing to worry about (01:55:36)</li><li>The case that AI-enabled R&amp;D wouldn't speed things up much (02:07:15)</li><li>AI-enabled human power grabs (02:11:10)</li><li>Main benefits of getting AGI right (02:23:07)</li><li>The world is handling AGI about as badly as possible (02:29:07)</li><li>Learning from targeting companies for public criticism in farm animal welfare (02:31:39)</li><li>Will Anthropic actually make any difference? (02:40:51)</li><li>“Misaligned” vs “misaligned and power-seeking” (02:55:12)</li><li>Success without dignity: how we could win despite being stupid (03:00:58)</li><li>Holden sees less dignity but has more hope (03:08:30)</li><li>Should we expect misaligned power-seeking by default? (03:15:58)</li><li>Will reinforcement learning make everything worse? 
(03:23:45)</li><li>Should we push for marginal improvements or big paradigm shifts? (03:28:58)</li><li>Should safety-focused people cluster or spread out? (03:31:35)</li><li>Is Anthropic vocal enough about strong regulation? (03:35:56)</li><li>Is Holden biased because of his financial stake in Anthropic? (03:39:26)</li><li>Have we learned clever governance structures don't work? (03:43:51)</li><li>Is Holden scared of AI bioweapons? (03:46:12)</li><li>Holden thinks AI companions are bad news (03:49:47)</li><li>Are AI companies too hawkish on China? (03:56:39)</li><li>The frontier of infosec: confidentiality vs integrity (04:00:51)</li><li>How often does AI work backfire? (04:03:38)</li><li>Is AI clearly more impactful to work in? (04:18:26)</li><li>What's the role of earning to give? (04:24:54)</li></ul><p><em>Video editing: Simon Monsour, Luke Monsour, Dominic Armstrong, and Milo McGuire</em><br><em>Audio engineering: Milo McGuire, Simon Monsour, and Dominic Armstrong</em><br><em>Music: </em><a href="https://open.spotify.com/artist/4lWobp6IUcSZ7w5mhnU1c9"><em>CORBIT</em></a><br><em>Coordination, transcriptions, and web: Katy Moore</em></p>]]>
      </itunes:summary>
      <itunes:keywords>artificial intelligence, effective altruism, existential risk, longtermism, AGI</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:transcript url="https://share.transistor.fm/s/f787ffd0/transcript.txt" type="text/plain"/>
      <podcast:chapters url="https://share.transistor.fm/s/f787ffd0/chapters.json" type="application/json+chapters"/>
    </item>
    <item>
      <title>Bonus: Risks from power-seeking AI systems (article narration by Zershaaneh Qureshi)</title>
      <itunes:title>Bonus: Risks from power-seeking AI systems (article narration by Zershaaneh Qureshi)</itunes:title>
      <itunes:episodeType>bonus</itunes:episodeType>
      <guid isPermaLink="false">fa0466f0-7e57-4e47-9612-bed08b9ad2fb</guid>
      <link>https://80000hours.org/problem-profiles/artificial-intelligence/</link>
      <description>
        <![CDATA[<p>Hundreds of prominent AI scientists and other notable figures <a href="https://aistatement.com/">signed a statement</a> in 2023 saying that mitigating the risk of extinction from AI should be a global priority. At 80,000 Hours, we’ve considered risks from AI to be the world’s most pressing problem since 2016. </p><p>But what led us to this conclusion? Could AI really cause human extinction? We’re not certain, but we think the risk is worth taking very seriously. </p><p>In particular, as companies create increasingly powerful AI systems, there’s a concerning chance that:</p><ul><li>These AI systems may develop dangerous long-term goals we don’t want.</li><li>To pursue these goals, they may seek power and undermine the safeguards meant to contain them.</li><li>They may even aim to disempower humanity and potentially cause our extinction.</li></ul><p>This article is written by Cody Fenwick and Zershaaneh Qureshi, and narrated by Zershaaneh Qureshi. It discusses why future AI systems could disempower humanity, what current AI research reveals about behaviours like power-seeking and deception, and how you can help mitigate the dangers.</p><p>You can see the original article — packed with graphs, images, footnotes, and further resources — on the 80,000 Hours website: </p><p><a href="https://80000hours.org/problem-profiles/risks-from-power-seeking-ai/">https://80000hours.org/problem-profiles/risks-from-power-seeking-ai/</a> </p><p>Chapters:</p><ul><li>Risks from power-seeking AI systems (00:01:00)</li><li>Introduction (00:01:17)</li><li>Summary (00:03:09)</li><li>Why are the risks from power-seeking AI a pressing world problem? (00:04:04)</li><li>Section 1: Humans will likely build advanced AI systems with long-term goals (00:05:43)</li><li>Section 2: AIs with long-term goals may be inclined to seek power (00:11:32)</li><li>Section 3: These power-seeking AI systems could successfully disempower humanity (00:26:26)</li><li>Section 4: 
People might create power-seeking AI systems without enough safeguards, despite the risks (00:38:34)</li><li>Section 5: Work on this problem is neglected and tractable (00:47:37)</li><li>Section 6: What are the arguments against working on this problem? (00:59:20)</li><li>Section 7: How you can help (01:25:07)</li><li>Thank you for listening (01:28:56)</li></ul><p><em>Audio editing: Dominic Armstrong<br>Production: Zershaaneh Qureshi, Elizabeth Cox, and Katy Moore</em></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Hundreds of prominent AI scientists and other notable figures <a href="https://aistatement.com/">signed a statement</a> in 2023 saying that mitigating the risk of extinction from AI should be a global priority. At 80,000 Hours, we’ve considered risks from AI to be the world’s most pressing problem since 2016. </p><p>But what led us to this conclusion? Could AI really cause human extinction? We’re not certain, but we think the risk is worth taking very seriously. </p><p>In particular, as companies create increasingly powerful AI systems, there’s a concerning chance that:</p><ul><li>These AI systems may develop dangerous long-term goals we don’t want.</li><li>To pursue these goals, they may seek power and undermine the safeguards meant to contain them.</li><li>They may even aim to disempower humanity and potentially cause our extinction.</li></ul><p>This article is written by Cody Fenwick and Zershaaneh Qureshi, and narrated by Zershaaneh Qureshi. It discusses why future AI systems could disempower humanity, what current AI research reveals about behaviours like power-seeking and deception, and how you can help mitigate the dangers.</p><p>You can see the original article — packed with graphs, images, footnotes, and further resources — on the 80,000 Hours website: </p><p><a href="https://80000hours.org/problem-profiles/risks-from-power-seeking-ai/">https://80000hours.org/problem-profiles/risks-from-power-seeking-ai/</a> </p><p>Chapters:</p><ul><li>Risks from power-seeking AI systems (00:01:00)</li><li>Introduction (00:01:17)</li><li>Summary (00:03:09)</li><li>Why are the risks from power-seeking AI a pressing world problem? (00:04:04)</li><li>Section 1: Humans will likely build advanced AI systems with long-term goals (00:05:43)</li><li>Section 2: AIs with long-term goals may be inclined to seek power (00:11:32)</li><li>Section 3: These power-seeking AI systems could successfully disempower humanity (00:26:26)</li><li>Section 4: 
People might create power-seeking AI systems without enough safeguards, despite the risks (00:38:34)</li><li>Section 5: Work on this problem is neglected and tractable (00:47:37)</li><li>Section 6: What are the arguments against working on this problem? (00:59:20)</li><li>Section 7: How you can help (01:25:07)</li><li>Thank you for listening (01:28:56)</li></ul><p><em>Audio editing: Dominic Armstrong<br>Production: Zershaaneh Qureshi, Elizabeth Cox, and Katy Moore</em></p>]]>
      </content:encoded>
      <pubDate>Fri, 05 Jun 2026 01:00:00 -0700</pubDate>
      <author>80,000 Hours</author>
      <enclosure url="https://media.transistor.fm/24989ab9/3a24357c.mp3" length="86008411" type="audio/mpeg"/>
      <itunes:author>80,000 Hours</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/BEk9FqjwhTxybvp32enHWgj7nv-6iu-GTzlDifsL1Ek/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS83Y2I0/ZDY0ZmIxMDFmNTZh/OGFkYWEzYWFjMDg0/MjUyYy5qcGc.jpg"/>
      <itunes:duration>5372</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Hundreds of prominent AI scientists and other notable figures <a href="https://aistatement.com/">signed a statement</a> in 2023 saying that mitigating the risk of extinction from AI should be a global priority. At 80,000 Hours, we’ve considered risks from AI to be the world’s most pressing problem since 2016. </p><p>But what led us to this conclusion? Could AI really cause human extinction? We’re not certain, but we think the risk is worth taking very seriously. </p><p>In particular, as companies create increasingly powerful AI systems, there’s a concerning chance that:</p><ul><li>These AI systems may develop dangerous long-term goals we don’t want.</li><li>To pursue these goals, they may seek power and undermine the safeguards meant to contain them.</li><li>They may even aim to disempower humanity and potentially cause our extinction.</li></ul><p>This article is written by Cody Fenwick and Zershaaneh Qureshi, and narrated by Zershaaneh Qureshi. It discusses why future AI systems could disempower humanity, what current AI research reveals about behaviours like power-seeking and deception, and how you can help mitigate the dangers.</p><p>You can see the original article — packed with graphs, images, footnotes, and further resources — on the 80,000 Hours website: </p><p><a href="https://80000hours.org/problem-profiles/risks-from-power-seeking-ai/">https://80000hours.org/problem-profiles/risks-from-power-seeking-ai/</a> </p><p>Chapters:</p><ul><li>Risks from power-seeking AI systems (00:01:00)</li><li>Introduction (00:01:17)</li><li>Summary (00:03:09)</li><li>Why are the risks from power-seeking AI a pressing world problem? (00:04:04)</li><li>Section 1: Humans will likely build advanced AI systems with long-term goals (00:05:43)</li><li>Section 2: AIs with long-term goals may be inclined to seek power (00:11:32)</li><li>Section 3: These power-seeking AI systems could successfully disempower humanity (00:26:26)</li><li>Section 4: 
People might create power-seeking AI systems without enough safeguards, despite the risks (00:38:34)</li><li>Section 5: Work on this problem is neglected and tractable (00:47:37)</li><li>Section 6: What are the arguments against working on this problem? (00:59:20)</li><li>Section 7: How you can help (01:25:07)</li><li>Thank you for listening (01:28:56)</li></ul><p><em>Audio editing: Dominic Armstrong<br>Production: Zershaaneh Qureshi, Elizabeth Cox, and Katy Moore</em></p>]]>
      </itunes:summary>
      <itunes:keywords>artificial intelligence, effective altruism, existential risk, longtermism, AGI</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:transcript url="https://share.transistor.fm/s/24989ab9/transcript.txt" type="text/plain"/>
      <podcast:chapters url="https://share.transistor.fm/s/24989ab9/chapters.json" type="application/json+chapters"/>
    </item>
    <item>
      <title>Zero: What to expect in this series</title>
      <itunes:title>Zero: What to expect in this series</itunes:title>
      <itunes:episodeType>trailer</itunes:episodeType>
      <guid isPermaLink="false">118e4ce4-a94f-4b72-8d56-9b95939c6529</guid>
      <link>https://80000hours.org/podcast/on-artificial-intelligence/</link>
      <description>
        <![CDATA[<p>What might it be like to live through the creation of AI that surpasses human abilities? That future may be closer than you think. </p><p><br>In this series, one expert interview at a time, we'll walk you through what's at stake — and what you could do to help.</p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>What might it be like to live through the creation of AI that surpasses human abilities? That future may be closer than you think. </p><p><br>In this series, one expert interview at a time, we'll walk you through what's at stake — and what you could do to help.</p>]]>
      </content:encoded>
      <pubDate>Fri, 01 Sep 2023 23:00:00 -0700</pubDate>
      <author>80,000 Hours</author>
      <enclosure url="https://media.transistor.fm/adc263a5/d4fdfe49.mp3" length="1153399" type="audio/mpeg"/>
      <itunes:author>80,000 Hours</itunes:author>
      <itunes:image href="https://img.transistorcdn.com/PRLBueTXbUhw4EJl6Jc9C1omwHELHtHPb3fZ2wbSktg/rs:fill:0:0:1/w:1400/h:1400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9lcGlz/b2RlLzE0MTQ0MzQv/MTY4ODk5MjA5OS1h/cnR3b3JrLmpwZw.jpg"/>
      <itunes:duration>122</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>What might it be like to live through the creation of AI that surpasses human abilities? That future may be closer than you think. </p><p><br>In this series, one expert interview at a time, we'll walk you through what's at stake — and what you could do to help.</p>]]>
      </itunes:summary>
      <itunes:keywords>artificial intelligence, effective altruism, existential risk, longtermism, AGI</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
    </item>
  </channel>
</rss>
