<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="https://feeds.transistor.fm/stylesheet.xsl" type="text/xsl"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:podcast="https://podcastindex.org/namespace/1.0">
  <channel>
    <atom:link rel="self" type="application/atom+xml" href="https://feeds.transistor.fm/intoaisafety" title="MP3 Audio"/>
    <atom:link rel="hub" href="https://pubsubhubbub.appspot.com/"/>
    <podcast:podping usesPodping="true"/>
    <title>Into AI Safety</title>
    <generator>Transistor (https://transistor.fm)</generator>
    <itunes:new-feed-url>https://feeds.transistor.fm/intoaisafety</itunes:new-feed-url>
    <description>The Into AI Safety podcast aims to make it easier for everyone, regardless of background, to get meaningfully involved with the conversations surrounding the rules and regulations which should govern the research, development, deployment, and use of the technologies encompassed by the term "artificial intelligence" or "AI".

For better formatted show notes, additional resources, and more, go to https://into-ai-safety.github.io
For even more content and community engagement, head over to my Patreon at https://www.patreon.com/IntoAISafety</description>
    <copyright>© Kairos.fm</copyright>
    <podcast:locked owner="listen@kairos.fm">no</podcast:locked>
    <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
    <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
    <language>en-us</language>
    <pubDate>Mon, 30 Dec 2024 17:19:55 -0700</pubDate>
    <lastBuildDate>Mon, 30 Dec 2024 17:20:09 -0700</lastBuildDate>
    <link>https://kairos.fm/intoaisafety/</link>
    <image>
      <url>https://img.transistor.fm/Dx6WjxGUnscbtCfCV6aiRIOyUf4pu56y9syxIutOKcE/rs:fill:3000:3000:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8yOTE5/MTU3M2U5NDAzYjJk/OTcwZTc5NzNkYzE3/ZWFjNC5qcGVn.jpg</url>
      <title>Into AI Safety</title>
      <link>https://kairos.fm/intoaisafety/</link>
    </image>
    <itunes:category text="Technology"/>
    <itunes:category text="Science">
      <itunes:category text="Mathematics"/>
    </itunes:category>
    <itunes:type>episodic</itunes:type>
    <itunes:author>Jacob Haimes</itunes:author>
    <itunes:image href="https://img.transistor.fm/Dx6WjxGUnscbtCfCV6aiRIOyUf4pu56y9syxIutOKcE/rs:fill:3000:3000:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8yOTE5/MTU3M2U5NDAzYjJk/OTcwZTc5NzNkYzE3/ZWFjNC5qcGVn.jpg"/>
    <itunes:summary>The Into AI Safety podcast aims to make it easier for everyone, regardless of background, to get meaningfully involved with the conversations surrounding the rules and regulations which should govern the research, development, deployment, and use of the technologies encompassed by the term "artificial intelligence" or "AI".

For better formatted show notes, additional resources, and more, go to https://into-ai-safety.github.io
For even more content and community engagement, head over to my Patreon at https://www.patreon.com/IntoAISafety</itunes:summary>
    <itunes:subtitle>The Into AI Safety podcast aims to make it easier for everyone, regardless of background, to get meaningfully involved with the conversations surrounding the rules and regulations which should govern the research, development, deployment, and use of the technologies encompassed by the term "artificial intelligence" or "AI".

For better formatted show notes, additional resources, and more, go to https://into-ai-safety.github.io
For even more content and community engagement, head over to my Patreon at https://www.patreon.com/IntoAISafety.</itunes:subtitle>
    <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
    <itunes:owner>
      <itunes:name>Jacob Haimes</itunes:name>
      <itunes:email>listen@kairos.fm</itunes:email>
    </itunes:owner>
    <itunes:complete>No</itunes:complete>
    <itunes:explicit>No</itunes:explicit>
    <item>
      <title>INTERVIEW: Scaling Democracy w/ (Dr.) Igor Krawczuk</title>
      <itunes:episode>19</itunes:episode>
      <podcast:episode>19</podcast:episode>
      <itunes:title>INTERVIEW: Scaling Democracy w/ (Dr.) Igor Krawczuk</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">81c52538-5af2-40aa-abd7-02c08088f43e</guid>
      <link>https://kairos.fm/intoaisafety/e019</link>
      <description>
        <![CDATA[<p>The <em>almost</em> Dr. Igor Krawczuk joins me for what is the equivalent of 4 of my previous episodes. We get into all the classics: eugenics, capitalism, philosophical toads... Need I say more?</p><p>If you're interested in connecting with Igor, head on over to his <a href="https://krawczuk.eu/">website</a>, or check out <a href="https://github.com/into-ai-safety/into-ai-safety.github.io/blob/master/_posts">placeholder for thesis</a> (it isn't published yet).</p><p>Because the full show notes have a whopping 115 additional links, I'll highlight some that I think are particularly worthwhile here:</p><ul><li>The best article you'll ever read on <a href="https://jacob-haimes.github.io/independent/Open-Source-AI-is-a-lie/">Open Source AI</a></li><li>The best article you'll ever read on <a href="https://www.odysseaninstitute.org/post/let-s-talk-about-emergence">emergence in ML</a></li><li>Kate Crawford's <a href="https://yalebooks.yale.edu/book/9780300264630/atlas-of-ai/"><em>Atlas of AI</em></a> (<a href="https://en.wikipedia.org/wiki/Atlas_of_AI">Wikipedia</a>)</li><li><a href="https://arxiv.org/abs/1911.01547">On the Measure of Intelligence</a></li><li>Thomas Piketty's <a href="https://www.hup.harvard.edu/books/9780674430006"><em>Capital in the Twenty-First Century</em></a> (<a href="https://en.wikipedia.org/wiki/Capital_in_the_Twenty-First_Century">Wikipedia</a>)</li><li>Yurii Nesterov's <a href="https://books.google.com/books?hl=en&amp;lr=&amp;id=2-ElBQAAQBAJ&amp;oi=fnd&amp;pg=PA1&amp;dq=info:JTiRBrZ_LZMJ:scholar.google.com&amp;ots=wnpRdsxjjv&amp;sig=1Oa-5P-zZZ_MX_2MFKv5cq2fx48#v=onepage&amp;q&amp;f=false"><em>Introductory Lectures on Convex Optimization</em></a></li></ul><p><strong>Chapters<br></strong></p><ul><li>(02:32) - Introducing Igor</li>
<li>(10:11) - Aside on EY, LW, EA, etc., a.k.a. lettersoup</li>
<li>(18:30) - Igor on AI alignment</li>
<li>(33:06) - "Open Source" in AI</li>
<li>(41:20) - The story of infinite riches and suffering</li>
<li>(59:11) - On AI threat models</li>
<li>(01:09:25) - Representation in AI</li>
<li>(01:15:00) - Hazard fishing</li>
<li>(01:18:52) - Intelligence and eugenics</li>
<li>(01:34:38) - Emergence</li>
<li>(01:48:19) - Considering externalities</li>
<li>(01:53:33) - The shape of an argument</li>
<li>(02:01:39) - More eugenics</li>
<li>(02:06:09) - I'm convinced, what now?</li>
<li>(02:18:03) - AIxBio (round ??)</li>
<li>(02:29:09) - On open release of models</li>
<li>(02:40:28) - Data and copyright</li>
<li>(02:44:09) - Scientific accessibility and bullshit</li>
<li>(02:53:04) - Igor's point of view</li>
<li>(02:57:20) - Outro</li>
</ul><p><strong><br>Links</strong></p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance. All references, including those only mentioned in the extended version of this episode, are included.</p><ul><li><a href="https://www.lighthousereports.com/suspicion-machines-methodology/">Suspicion Machines Methodology</a>, referred to as the "Rotterdam Lighthouse Report" in the episode</li><li><a href="https://www.epfl.ch/labs/lions/">LIONS Lab</a> at EPFL</li><li>The <a href="https://pbs.twimg.com/media/D53Q_MYW4AA-wRK.jpg">meme</a> that Igor references</li><li><a href="https://arxiv.org/abs/2401.01869">On the Hardness of Learning Under Symmetries</a></li><li><a href="https://uvagedl.github.io/">Course</a> on the concept of equivariant deep learning</li><li>Aside on EY/EA/etc.<ul><li>Sources on Eliezer Yudkowsky<ul><li><a href="https://encyclopedia.pub/entry/33978">Scholarly Community Encyclopedia</a></li><li><a href="https://time.com/collection/time100-ai/6309037/eliezer-yudkowsky/">TIME100 AI</a></li><li>Yudkowsky's personal <a href="https://www.yudkowsky.net/">website</a></li><li><a href="https://en.wikipedia.org/wiki/Eliezer_Yudkowsky">EY Wikipedia</a></li><li><a href="https://whatshouldiread.fandom.com/wiki/Eliezer_Yudkowsky#cite_note-1">A Very Literary Wiki</a> - TIME article: <a href="https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/">Pausing AI Developments Isn’t Enough. We Need to Shut it All Down</a> documenting EY's ruminations on bombing datacenters; this comes up later in the episode but is included here because it is about EY.</li></ul></li><li><a href="https://www.lesswrong.com/">LessWrong</a><ul><li><a href="https://en.wikipedia.org/wiki/LessWrong">LW Wikipedia</a></li></ul></li><li><a href="https://intelligence.org/">MIRI</a></li><li>Coverage on Nick Bostrom (being a racist)<ul><li>The Guardian article: <a href="https://www.theguardian.com/technology/2024/apr/28/nick-bostrom-controversial-future-of-humanity-institute-closure-longtermism-affective-altruism">‘Eugenics on steroids’: the toxic and contested legacy of Oxford’s Future of Humanity Institute</a></li><li>The Guardian article: <a href="https://www.theguardian.com/technology/2024/apr/19/oxford-future-of-humanity-institute-closes">Oxford shuts down institute run by Elon Musk-backed philosopher</a></li></ul></li><li>Investigative <a href="https://markfuentes1.substack.com/p/emile-p-torress-history-of-dishonesty">piece</a> on Émile Torres</li><li><a href="https://dl.acm.org/doi/10.1145/3442188.3445922">On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜</a></li><li>NY Times article: <a href="https://www.nytimes.com/2019/11/11/technology/artificial-intelligence-bias.html">We Teach A.I. 
Systems Everything, Including Our Biases</a></li><li>NY Times article: <a href="https://www.nytimes.com/2020/12/03/technology/google-researcher-timnit-gebru.html">Google Researcher Says She Was Fired Over Paper Highlighting Bias in A.I.</a></li><li>Timnit Gebru's <a href="https://en.wikipedia.org/wiki/Timnit_Gebru">Wikipedia</a></li><li><a href="https://firstmonday.org/ojs/index.php/fm/article/view/13636">The TESCREAL Bundle: Eugenics and the Promise of Utopia through Artificial General Intelligence</a></li><li>Sources on the environmental impact of LLMs<ul><li><a href="https://analyticsindiamag.com/the-environmental-impact-of-llms/">The Environmental Impact of LLMs</a></li><li><a href="https://tinyml.substack.com/p/the-cost-of-inference-running-the">The Cost of Inference: Running the Models</a></li><li><a href="https://arxiv.org/abs/1906.02243">Energy and Policy Considerations for Deep Learning in NLP</a></li><li><a href="https://weareyard.com/insights/the-carbon-impact-of-ai-vs-search-engines">The Carbon Impact of AI vs Search Engines</a></li></ul></li></ul></li><li><a href="https://www.science.org/doi/full/10.1126/science.abi7176?casa_token=2txe0r_jjhQAAAAA%3ALJa__HZL9COyj9EUpdILZdtnMKLyggfFe7Zpvv0tNze62rLO0CoQHCCJiXfruxUeBLj3YBZ33F8OOv0u">Filling Gaps in Trustworthy Development of AI </a>(Igor is an author on this one)</li><li><a href="https://www.hindawi.com/journals/complexity/2022/8210732/">A Computational Turn in Policy Process Studies: Coevolving Network Dynamics of Policy Change</a></li><li><a href="https://proceedings.neurips.cc/paper_files/paper/2020/file/7e05d6f828574fbc975a896b25bb011e-Paper.pdf">The Smoothed Possibility of Social Choice</a>, an intro to social choice theory and how it overlaps with ML</li><li>Relating to Dan Hendrycks<ul><li><a href="https://arxiv.org/abs/2303.16200">Natural Selection Favors AIs over Humans</a><ul><li>"One easy-to-digest source to highlight what he gets wrong [is] <a href="https://pressbooks.calstate.edu/explorationsbioanth2/chapter/17/">Social and Biopolitical Dimensions of Evolutionary Thinking</a>" -Igor</li></ul></li><li><a href="https://www.aisafetybook.com/">Introduction to AI Safety, Ethics, and Society</a>, recently published textbook</li><li>"<a href="https://arxiv.org/pdf/2306.12001#page=10.19">Source</a> to the section [of this paper] that makes Dan one of my favs from that crowd." -Igor</li><li><a href="https://twitter.com/DanHendrycks/status/1710312043503321141">Twitter post</a> referenced in the episode&lt;...</li></ul></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>The <em>almost</em> Dr. Igor Krawczuk joins me for what is the equivalent of 4 of my previous episodes. We get into all the classics: eugenics, capitalism, philosophical toads... Need I say more?</p><p>If you're interested in connecting with Igor, head on over to his <a href="https://krawczuk.eu/">website</a>, or check out <a href="https://github.com/into-ai-safety/into-ai-safety.github.io/blob/master/_posts">placeholder for thesis</a> (it isn't published yet).</p><p>Because the full show notes have a whopping 115 additional links, I'll highlight some that I think are particularly worthwhile here:</p><ul><li>The best article you'll ever read on <a href="https://jacob-haimes.github.io/independent/Open-Source-AI-is-a-lie/">Open Source AI</a></li><li>The best article you'll ever read on <a href="https://www.odysseaninstitute.org/post/let-s-talk-about-emergence">emergence in ML</a></li><li>Kate Crawford's <a href="https://yalebooks.yale.edu/book/9780300264630/atlas-of-ai/"><em>Atlas of AI</em></a> (<a href="https://en.wikipedia.org/wiki/Atlas_of_AI">Wikipedia</a>)</li><li><a href="https://arxiv.org/abs/1911.01547">On the Measure of Intelligence</a></li><li>Thomas Piketty's <a href="https://www.hup.harvard.edu/books/9780674430006"><em>Capital in the Twenty-First Century</em></a> (<a href="https://en.wikipedia.org/wiki/Capital_in_the_Twenty-First_Century">Wikipedia</a>)</li><li>Yurii Nesterov's <a href="https://books.google.com/books?hl=en&amp;lr=&amp;id=2-ElBQAAQBAJ&amp;oi=fnd&amp;pg=PA1&amp;dq=info:JTiRBrZ_LZMJ:scholar.google.com&amp;ots=wnpRdsxjjv&amp;sig=1Oa-5P-zZZ_MX_2MFKv5cq2fx48#v=onepage&amp;q&amp;f=false"><em>Introductory Lectures on Convex Optimization</em></a></li></ul><p><strong>Chapters<br></strong></p><ul><li>(02:32) - Introducing Igor</li>
<li>(10:11) - Aside on EY, LW, EA, etc., a.k.a. lettersoup</li>
<li>(18:30) - Igor on AI alignment</li>
<li>(33:06) - "Open Source" in AI</li>
<li>(41:20) - The story of infinite riches and suffering</li>
<li>(59:11) - On AI threat models</li>
<li>(01:09:25) - Representation in AI</li>
<li>(01:15:00) - Hazard fishing</li>
<li>(01:18:52) - Intelligence and eugenics</li>
<li>(01:34:38) - Emergence</li>
<li>(01:48:19) - Considering externalities</li>
<li>(01:53:33) - The shape of an argument</li>
<li>(02:01:39) - More eugenics</li>
<li>(02:06:09) - I'm convinced, what now?</li>
<li>(02:18:03) - AIxBio (round ??)</li>
<li>(02:29:09) - On open release of models</li>
<li>(02:40:28) - Data and copyright</li>
<li>(02:44:09) - Scientific accessibility and bullshit</li>
<li>(02:53:04) - Igor's point of view</li>
<li>(02:57:20) - Outro</li>
</ul><p><strong><br>Links</strong></p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance. All references, including those only mentioned in the extended version of this episode, are included.</p><ul><li><a href="https://www.lighthousereports.com/suspicion-machines-methodology/">Suspicion Machines Methodology</a>, referred to as the "Rotterdam Lighthouse Report" in the episode</li><li><a href="https://www.epfl.ch/labs/lions/">LIONS Lab</a> at EPFL</li><li>The <a href="https://pbs.twimg.com/media/D53Q_MYW4AA-wRK.jpg">meme</a> that Igor references</li><li><a href="https://arxiv.org/abs/2401.01869">On the Hardness of Learning Under Symmetries</a></li><li><a href="https://uvagedl.github.io/">Course</a> on the concept of equivariant deep learning</li><li>Aside on EY/EA/etc.<ul><li>Sources on Eliezer Yudkowsky<ul><li><a href="https://encyclopedia.pub/entry/33978">Scholarly Community Encyclopedia</a></li><li><a href="https://time.com/collection/time100-ai/6309037/eliezer-yudkowsky/">TIME100 AI</a></li><li>Yudkowsky's personal <a href="https://www.yudkowsky.net/">website</a></li><li><a href="https://en.wikipedia.org/wiki/Eliezer_Yudkowsky">EY Wikipedia</a></li><li><a href="https://whatshouldiread.fandom.com/wiki/Eliezer_Yudkowsky#cite_note-1">A Very Literary Wiki</a> - TIME article: <a href="https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/">Pausing AI Developments Isn’t Enough. We Need to Shut it All Down</a> documenting EY's ruminations on bombing datacenters; this comes up later in the episode but is included here because it is about EY.</li></ul></li><li><a href="https://www.lesswrong.com/">LessWrong</a><ul><li><a href="https://en.wikipedia.org/wiki/LessWrong">LW Wikipedia</a></li></ul></li><li><a href="https://intelligence.org/">MIRI</a></li><li>Coverage on Nick Bostrom (being a racist)<ul><li>The Guardian article: <a href="https://www.theguardian.com/technology/2024/apr/28/nick-bostrom-controversial-future-of-humanity-institute-closure-longtermism-affective-altruism">‘Eugenics on steroids’: the toxic and contested legacy of Oxford’s Future of Humanity Institute</a></li><li>The Guardian article: <a href="https://www.theguardian.com/technology/2024/apr/19/oxford-future-of-humanity-institute-closes">Oxford shuts down institute run by Elon Musk-backed philosopher</a></li></ul></li><li>Investigative <a href="https://markfuentes1.substack.com/p/emile-p-torress-history-of-dishonesty">piece</a> on Émile Torres</li><li><a href="https://dl.acm.org/doi/10.1145/3442188.3445922">On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜</a></li><li>NY Times article: <a href="https://www.nytimes.com/2019/11/11/technology/artificial-intelligence-bias.html">We Teach A.I. 
Systems Everything, Including Our Biases</a></li><li>NY Times article: <a href="https://www.nytimes.com/2020/12/03/technology/google-researcher-timnit-gebru.html">Google Researcher Says She Was Fired Over Paper Highlighting Bias in A.I.</a></li><li>Timnit Gebru's <a href="https://en.wikipedia.org/wiki/Timnit_Gebru">Wikipedia</a></li><li><a href="https://firstmonday.org/ojs/index.php/fm/article/view/13636">The TESCREAL Bundle: Eugenics and the Promise of Utopia through Artificial General Intelligence</a></li><li>Sources on the environmental impact of LLMs<ul><li><a href="https://analyticsindiamag.com/the-environmental-impact-of-llms/">The Environmental Impact of LLMs</a></li><li><a href="https://tinyml.substack.com/p/the-cost-of-inference-running-the">The Cost of Inference: Running the Models</a></li><li><a href="https://arxiv.org/abs/1906.02243">Energy and Policy Considerations for Deep Learning in NLP</a></li><li><a href="https://weareyard.com/insights/the-carbon-impact-of-ai-vs-search-engines">The Carbon Impact of AI vs Search Engines</a></li></ul></li></ul></li><li><a href="https://www.science.org/doi/full/10.1126/science.abi7176?casa_token=2txe0r_jjhQAAAAA%3ALJa__HZL9COyj9EUpdILZdtnMKLyggfFe7Zpvv0tNze62rLO0CoQHCCJiXfruxUeBLj3YBZ33F8OOv0u">Filling Gaps in Trustworthy Development of AI </a>(Igor is an author on this one)</li><li><a href="https://www.hindawi.com/journals/complexity/2022/8210732/">A Computational Turn in Policy Process Studies: Coevolving Network Dynamics of Policy Change</a></li><li><a href="https://proceedings.neurips.cc/paper_files/paper/2020/file/7e05d6f828574fbc975a896b25bb011e-Paper.pdf">The Smoothed Possibility of Social Choice</a>, an intro to social choice theory and how it overlaps with ML</li><li>Relating to Dan Hendrycks<ul><li><a href="https://arxiv.org/abs/2303.16200">Natural Selection Favors AIs over Humans</a><ul><li>"One easy-to-digest source to highlight what he gets wrong [is] <a href="https://pressbooks.calstate.edu/explorationsbioanth2/chapter/17/">Social and Biopolitical Dimensions of Evolutionary Thinking</a>" -Igor</li></ul></li><li><a href="https://www.aisafetybook.com/">Introduction to AI Safety, Ethics, and Society</a>, recently published textbook</li><li>"<a href="https://arxiv.org/pdf/2306.12001#page=10.19">Source</a> to the section [of this paper] that makes Dan one of my favs from that crowd." -Igor</li><li><a href="https://twitter.com/DanHendrycks/status/1710312043503321141">Twitter post</a> referenced in the episode&lt;...</li></ul></li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 03 Jun 2024 10:11:00 -0600</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/b8225038/75bd8800.mp3" length="171643841" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>10726</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>The <em>almost</em> Dr. Igor Krawczuk joins me for what is the equivalent of 4 of my previous episodes. We get into all the classics: eugenics, capitalism, philosophical toads... Need I say more?</p><p>If you're interested in connecting with Igor, head on over to his <a href="https://krawczuk.eu/">website</a>, or check out <a href="https://github.com/into-ai-safety/into-ai-safety.github.io/blob/master/_posts">placeholder for thesis</a> (it isn't published yet).</p><p>Because the full show notes have a whopping 115 additional links, I'll highlight some that I think are particularly worthwhile here:</p><ul><li>The best article you'll ever read on <a href="https://jacob-haimes.github.io/independent/Open-Source-AI-is-a-lie/">Open Source AI</a></li><li>The best article you'll ever read on <a href="https://www.odysseaninstitute.org/post/let-s-talk-about-emergence">emergence in ML</a></li><li>Kate Crawford's <a href="https://yalebooks.yale.edu/book/9780300264630/atlas-of-ai/"><em>Atlas of AI</em></a> (<a href="https://en.wikipedia.org/wiki/Atlas_of_AI">Wikipedia</a>)</li><li><a href="https://arxiv.org/abs/1911.01547">On the Measure of Intelligence</a></li><li>Thomas Piketty's <a href="https://www.hup.harvard.edu/books/9780674430006"><em>Capital in the Twenty-First Century</em></a> (<a href="https://en.wikipedia.org/wiki/Capital_in_the_Twenty-First_Century">Wikipedia</a>)</li><li>Yurii Nesterov's <a href="https://books.google.com/books?hl=en&amp;lr=&amp;id=2-ElBQAAQBAJ&amp;oi=fnd&amp;pg=PA1&amp;dq=info:JTiRBrZ_LZMJ:scholar.google.com&amp;ots=wnpRdsxjjv&amp;sig=1Oa-5P-zZZ_MX_2MFKv5cq2fx48#v=onepage&amp;q&amp;f=false"><em>Introductory Lectures on Convex Optimization</em></a></li></ul><p><strong>Chapters<br></strong></p><ul><li>(02:32) - Introducing Igor</li>
<li>(10:11) - Aside on EY, LW, EA, etc., a.k.a. lettersoup</li>
<li>(18:30) - Igor on AI alignment</li>
<li>(33:06) - "Open Source" in AI</li>
<li>(41:20) - The story of infinite riches and suffering</li>
<li>(59:11) - On AI threat models</li>
<li>(01:09:25) - Representation in AI</li>
<li>(01:15:00) - Hazard fishing</li>
<li>(01:18:52) - Intelligence and eugenics</li>
<li>(01:34:38) - Emergence</li>
<li>(01:48:19) - Considering externalities</li>
<li>(01:53:33) - The shape of an argument</li>
<li>(02:01:39) - More eugenics</li>
<li>(02:06:09) - I'm convinced, what now?</li>
<li>(02:18:03) - AIxBio (round ??)</li>
<li>(02:29:09) - On open release of models</li>
<li>(02:40:28) - Data and copyright</li>
<li>(02:44:09) - Scientific accessibility and bullshit</li>
<li>(02:53:04) - Igor's point of view</li>
<li>(02:57:20) - Outro</li>
</ul><p><strong><br>Links</strong></p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance. All references, including those only mentioned in the extended version of this episode, are included.</p><ul><li><a href="https://www.lighthousereports.com/suspicion-machines-methodology/">Suspicion Machines Methodology</a>, referred to as the "Rotterdam Lighthouse Report" in the episode</li><li><a href="https://www.epfl.ch/labs/lions/">LIONS Lab</a> at EPFL</li><li>The <a href="https://pbs.twimg.com/media/D53Q_MYW4AA-wRK.jpg">meme</a> that Igor references</li><li><a href="https://arxiv.org/abs/2401.01869">On the Hardness of Learning Under Symmetries</a></li><li><a href="https://uvagedl.github.io/">Course</a> on the concept of equivariant deep learning</li><li>Aside on EY/EA/etc.<ul><li>Sources on Eliezer Yudkowsky<ul><li><a href="https://encyclopedia.pub/entry/33978">Scholarly Community Encyclopedia</a></li><li><a href="https://time.com/collection/time100-ai/6309037/eliezer-yudkowsky/">TIME100 AI</a></li><li>Yudkowsky's personal <a href="https://www.yudkowsky.net/">website</a></li><li><a href="https://en.wikipedia.org/wiki/Eliezer_Yudkowsky">EY Wikipedia</a></li><li><a href="https://whatshouldiread.fandom.com/wiki/Eliezer_Yudkowsky#cite_note-1">A Very Literary Wiki</a> - TIME article: <a href="https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/">Pausing AI Developments Isn’t Enough. We Need to Shut it All Down</a> documenting EY's ruminations on bombing datacenters; this comes up later in the episode but is included here because it is about EY.</li></ul></li><li><a href="https://www.lesswrong.com/">LessWrong</a><ul><li><a href="https://en.wikipedia.org/wiki/LessWrong">LW Wikipedia</a></li></ul></li><li><a href="https://intelligence.org/">MIRI</a></li><li>Coverage on Nick Bostrom (being a racist)<ul><li>The Guardian article: <a href="https://www.theguardian.com/technology/2024/apr/28/nick-bostrom-controversial-future-of-humanity-institute-closure-longtermism-affective-altruism">‘Eugenics on steroids’: the toxic and contested legacy of Oxford’s Future of Humanity Institute</a></li><li>The Guardian article: <a href="https://www.theguardian.com/technology/2024/apr/19/oxford-future-of-humanity-institute-closes">Oxford shuts down institute run by Elon Musk-backed philosopher</a></li></ul></li><li>Investigative <a href="https://markfuentes1.substack.com/p/emile-p-torress-history-of-dishonesty">piece</a> on Émile Torres</li><li><a href="https://dl.acm.org/doi/10.1145/3442188.3445922">On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜</a></li><li>NY Times article: <a href="https://www.nytimes.com/2019/11/11/technology/artificial-intelligence-bias.html">We Teach A.I. 
Systems Everything, Including Our Biases</a></li><li>NY Times article: <a href="https://www.nytimes.com/2020/12/03/technology/google-researcher-timnit-gebru.html">Google Researcher Says She Was Fired Over Paper Highlighting Bias in A.I.</a></li><li>Timnit Gebru's <a href="https://en.wikipedia.org/wiki/Timnit_Gebru">Wikipedia</a></li><li><a href="https://firstmonday.org/ojs/index.php/fm/article/view/13636">The TESCREAL Bundle: Eugenics and the Promise of Utopia through Artificial General Intelligence</a></li><li>Sources on the environmental impact of LLMs<ul><li><a href="https://analyticsindiamag.com/the-environmental-impact-of-llms/">The Environmental Impact of LLMs</a></li><li><a href="https://tinyml.substack.com/p/the-cost-of-inference-running-the">The Cost of Inference: Running the Models</a></li><li><a href="https://arxiv.org/abs/1906.02243">Energy and Policy Considerations for Deep Learning in NLP</a></li><li><a href="https://weareyard.com/insights/the-carbon-impact-of-ai-vs-search-engines">The Carbon Impact of AI vs Search Engines</a></li></ul></li></ul></li><li><a href="https://www.science.org/doi/full/10.1126/science.abi7176?casa_token=2txe0r_jjhQAAAAA%3ALJa__HZL9COyj9EUpdILZdtnMKLyggfFe7Zpvv0tNze62rLO0CoQHCCJiXfruxUeBLj3YBZ33F8OOv0u">Filling Gaps in Trustworthy Development of AI </a>(Igor is an author on this one)</li><li><a href="https://www.hindawi.com/journals/complexity/2022/8210732/">A Computational Turn in Policy Process Studies: Coevolving Network Dynamics of Policy Change</a></li><li><a href="https://proceedings.neurips.cc/paper_files/paper/2020/file/7e05d6f828574fbc975a896b25bb011e-Paper.pdf">The Smoothed Possibility of Social Choice</a>, an intro to social choice theory and how it overlaps with ML</li><li>Relating to Dan Hendrycks<ul><li><a href="https://arxiv.org/abs/2303.16200">Natural Selection Favors AIs over Humans</a><ul><li>"One easy-to-digest source to highlight what he gets wrong [is] <a href="https://pressbooks.calstate.edu/explorationsbioanth2/chapter/17/">Social and Biopolitical Dimensions of Evolutionary Thinking</a>" -Igor</li></ul></li><li><a href="https://www.aisafetybook.com/">Introduction to AI Safety, Ethics, and Society</a>, recently published textbook</li><li>"<a href="https://arxiv.org/pdf/2306.12001#page=10.19">Source</a> to the section [of this paper] that makes Dan one of my favs from that crowd." -Igor</li><li><a href="https://twitter.com/DanHendrycks/status/1710312043503321141">Twitter post</a> referenced in the episode&lt;...</li></ul></li></ul>]]>
      </itunes:summary>
      <itunes:keywords>AI, machine learning, democracy, AI safety</itunes:keywords>
      <itunes:explicit>Yes</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
      <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
      <podcast:person role="Guest" href="https://krawczuk.eu" img="https://img.transistor.fm/VJHrDED4yyNE9cAwg5Qkc13UxtWcUraWj_NfKbV2HIc/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9jZTE4/YTI3ZTgyYjY4YzFh/OTY0NDkzNmZjZTUz/YjVkYi5qcGVn.jpg">Igor Krawczuk</podcast:person>
      <podcast:chapters url="https://share.transistor.fm/s/b8225038/chapters.json" type="application/json+chapters"/>
    </item>
    <item>
      <title>INTERVIEW: StakeOut.AI w/ Dr. Peter Park (3)</title>
      <itunes:episode>18</itunes:episode>
      <podcast:episode>18</podcast:episode>
      <itunes:title>INTERVIEW: StakeOut.AI w/ Dr. Peter Park (3)</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">0ca91074-f16c-428a-9408-550bfb7ceb4f</guid>
      <link>https://kairos.fm/intoaisafety/e018</link>
      <description>
        <![CDATA[<p>As always, the best things come in 3s: dimensions, musketeers, pyramids, and... 3 installments of my interview with Dr. Peter Park, an AI Existential Safety Post-doctoral Fellow working with Dr. Max Tegmark at MIT.</p><p>As you may have ascertained from the previous two segments of the interview, Dr. Park cofounded <a href="https://www.stakeout.ai">StakeOut.AI</a> along with Harry Luk and one other cofounder whose name has been removed due to requirements of her current position. The non-profit had a simple but important mission: make the adoption of AI technology go well, for humanity. Unfortunately, StakeOut.AI had to dissolve in late February of 2024 because no grantmaker would fund them. Although it certainly is disappointing that the organization is no longer functioning, all three cofounders continue to contribute positively towards improving our world in their current roles.</p><p>If you would like to investigate further into Dr. Park's work, view his <a href="https://scholar.harvard.edu/pspark">website</a>, <a href="https://scholar.google.com/citations?user=5lMAPEoAAAAJ&amp;hl=en">Google Scholar</a>, or follow him on <a href="https://twitter.com/dr_park_phd">Twitter</a>.</p><p>00:00:54 ❙ Intro<br>00:02:41 ❙ Rapid development<br>00:08:25 ❙ Provable safety, safety factors, &amp; CSAM<br>00:18:50 ❙ Litigation<br>00:23:06 ❙ Open/Closed Source<br>00:38:52 ❙ AIxBio<br>00:47:50 ❙ Scientific rigor in AI<br>00:56:22 ❙ AI deception<br>01:02:45 ❙ No takesies-backsies<br>01:08:22 ❙ StakeOut.AI's start<br>01:12:53 ❙ Sustainability &amp; Agency<br>01:18:21 ❙ "I'm sold, next steps?" -you<br>01:23:53 ❙ Lessons from the amazing Spiderman<br>01:33:15 ❙ "I'm ready to switch careers, next steps?" -you<br>01:40:00 ❙ The most important question<br>01:41:11 ❙ Outro</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://www.stakeout.ai">StakeOut.AI</a></li><li><a href="https://pauseai.info">Pause AI</a></li><li><a href="https://futureoflife.org/wp-content/uploads/2023/11/FLI_Governance_Scorecard_and_Framework.pdf">AI Governance Scorecard</a> (go to Pg. 3)</li><li><a href="https://civitai.com">CIVITAI</a><ul><li><a href="https://www.404media.co/a16z-funded-ai-platform-generated-images-that-could-be-categorized-as-child-pornography-leaked-documents-show/">Article on CIVITAI and CSAM</a></li></ul></li><li><a href="https://www.judiciary.senate.gov/protecting-children-online">Senate Hearing: Protecting Children Online</a><ul><li><a href="https://www.pbs.org/newshour/politics/watch-live-ceos-of-meta-tiktok-x-and-other-social-media-companies-testify-in-senate-hearing">PBS Newshour Coverage</a></li></ul></li><li><a href="https://www.nytimes.com/2023/12/27/business/media/new-york-times-open-ai-microsoft-lawsuit.html">The Times Sues OpenAI and Microsoft Over A.I. Use of Copyrighted Work</a></li><li>Open Source/Weights/Release/Interpretation<ul><li><a href="https://opensource.org">Open Source Initiative</a><ul><li><a href="https://opensource.org/history">History of the OSI</a></li><li><a href="https://opensource.org/blog/metas-llama-2-license-is-not-open-source">Meta’s LLaMa 2 license is not Open Source</a></li></ul></li><li><a href="https://opensourceconnections.com/blog/2023/07/19/is-llama-2-open-source-no-and-perhaps-we-need-a-new-definition-of-open/">Is Llama 2 open source? 
No – and perhaps we need a new definition of open…</a></li><li><a href="https://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a></li><li><a href="https://www.3blue1brown.com/topics/neural-networks">3Blue1Brown: Neural Networks</a></li><li><a href="https://dl.acm.org/doi/10.1145/3571884.3604316">Opening up ChatGPT: Tracking openness, transparency, and accountability in instruction-tuned text generators</a><ul><li>The online <a href="https://opening-up-chatgpt.github.io">table</a></li></ul></li></ul></li><li><a href="https://www.signal.org">Signal</a></li><li><a href="https://huggingface.co/bigscience/bloomz">Bloomz</a> model on HuggingFace</li><li><a href="https://mistral.ai">Mistral</a> website</li><li>NASA Tragedies<ul><li><a href="https://en.wikipedia.org/wiki/Space_Shuttle_Challenger_disaster">Challenger disaster</a> on Wikipedia</li><li><a href="https://en.wikipedia.org/wiki/Space_Shuttle_Columbia_disaster">Columbia disaster</a> on Wikipedia</li></ul></li><li>AIxBio Risk<ul><li><a href="https://www.nature.com/articles/s42256-022-00465-9">Dual use of artificial-intelligence-powered drug discovery</a></li><li><a href="https://arxiv.org/abs/2306.03809">Can large language models democratize access to dual-use biotechnology?</a></li><li><a href="https://www.governance.ai/research-paper/open-sourcing-highly-capable-foundation-models">Open-Sourcing Highly Capable Foundation Models</a> <em>(sadly, I can't rename the article...)</em></li><li><a href="https://1a3orn.com/sub/essays-propaganda-or-science.html">Propaganda or Science: Open Source AI and Bioterrorism Risk</a></li><li><a href="https://ineffectivealtruismblog.com/2024/03/09/exaggerating-the-risks-part-14-biorisk-from-llms/">Exaggerating the risks (Part 15: Biorisk from LLMs)</a></li><li><a href="https://arxiv.org/abs/2310.18233">Will releasing the weights of future large language models grant widespread access to pandemic agents?</a></li><li><a href="https://crfm.stanford.edu/open-fms/">On the Societal Impact of Open Foundation Models</a><ul><li><a href="https://hai.stanford.edu/sites/default/files/2023-12/Governing-Open-Foundation-Models.pdf">Policy brief</a></li></ul></li></ul></li><li><a href="https://www.apartresearch.com">Apart Research</a></li><li><a href="https://www.science.org">Science</a></li><li>Cicero<ul><li><a href="https://www.science.org/doi/10.1126/science.ade9097">Human-level play in the game of Diplomacy by combining language models with strategic reasoning</a></li><li><a href="https://ai.meta.com/research/cicero/">Cicero</a> webpage</li><li><a href="https://arxiv.org/abs/2308.14752">AI Deception: A Survey of Examples, Risks, and Potential Solutions</a></li></ul></li><li><a href="https://demos.co.uk/research/open-sourcing-the-ai-revolution-framing-the-debate-on-open-source-artificial-intelligence-and-regulation/">Open Sourcing the AI Revolution: Framing the debate on open source, artificial intelligence and regulation</a></li><li><a href="https://aisafety.camp">AI Safety Camp</a></li><li><a href="https://www.patreon.com/IntoAISafety">Into AI Safety Patreon</a></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>As always, the best things come in 3s: dimensions, musketeers, pyramids, and... 3 installments of my interview with Dr. Peter Park, an AI Existential Safety Post-doctoral Fellow working with Dr. Max Tegmark at MIT.</p><p>As you may have ascertained from the previous two segments of the interview, Dr. Park cofounded <a href="https://www.stakeout.ai">StakeOut.AI</a> along with Harry Luk and one other cofounder whose name has been removed due to requirements of her current position. The non-profit had a simple but important mission: make the adoption of AI technology go well, for humanity. Unfortunately, StakeOut.AI had to dissolve in late February of 2024 because no grantmaker would fund them. Although it certainly is disappointing that the organization is no longer functioning, all three cofounders continue to contribute positively towards improving our world in their current roles.</p><p>If you would like to investigate further into Dr. Park's work, view his <a href="https://scholar.harvard.edu/pspark">website</a>, <a href="https://scholar.google.com/citations?user=5lMAPEoAAAAJ&amp;hl=en">Google Scholar</a>, or follow him on <a href="https://twitter.com/dr_park_phd">Twitter</a>.</p><p>00:00:54 ❙ Intro<br>00:02:41 ❙ Rapid development<br>00:08:25 ❙ Provable safety, safety factors, &amp; CSAM<br>00:18:50 ❙ Litigation<br>00:23:06 ❙ Open/Closed Source<br>00:38:52 ❙ AIxBio<br>00:47:50 ❙ Scientific rigor in AI<br>00:56:22 ❙ AI deception<br>01:02:45 ❙ No takesies-backsies<br>01:08:22 ❙ StakeOut.AI's start<br>01:12:53 ❙ Sustainability &amp; Agency<br>01:18:21 ❙ "I'm sold, next steps?" -you<br>01:23:53 ❙ Lessons from the amazing Spiderman<br>01:33:15 ❙ "I'm ready to switch careers, next steps?" -you<br>01:40:00 ❙ The most important question<br>01:41:11 ❙ Outro</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://www.stakeout.ai">StakeOut.AI</a></li><li><a href="https://pauseai.info">Pause AI</a></li><li><a href="https://futureoflife.org/wp-content/uploads/2023/11/FLI_Governance_Scorecard_and_Framework.pdf">AI Governance Scorecard</a> (go to Pg. 3)</li><li><a href="https://civitai.com">CIVITAI</a><ul><li><a href="https://www.404media.co/a16z-funded-ai-platform-generated-images-that-could-be-categorized-as-child-pornography-leaked-documents-show/">Article on CIVITAI and CSAM</a></li></ul></li><li><a href="https://www.judiciary.senate.gov/protecting-children-online">Senate Hearing: Protecting Children Online</a><ul><li><a href="https://www.pbs.org/newshour/politics/watch-live-ceos-of-meta-tiktok-x-and-other-social-media-companies-testify-in-senate-hearing">PBS Newshour Coverage</a></li></ul></li><li><a href="https://www.nytimes.com/2023/12/27/business/media/new-york-times-open-ai-microsoft-lawsuit.html">The Times Sues OpenAI and Microsoft Over A.I. Use of Copyrighted Work</a></li><li>Open Source/Weights/Release/Interpretation<ul><li><a href="https://opensource.org">Open Source Initiative</a><ul><li><a href="https://opensource.org/history">History of the OSI</a></li><li><a href="https://opensource.org/blog/metas-llama-2-license-is-not-open-source">Meta’s LLaMa 2 license is not Open Source</a></li></ul></li><li><a href="https://opensourceconnections.com/blog/2023/07/19/is-llama-2-open-source-no-and-perhaps-we-need-a-new-definition-of-open/">Is Llama 2 open source? 
No – and perhaps we need a new definition of open…</a></li><li><a href="https://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a></li><li><a href="https://www.3blue1brown.com/topics/neural-networks">3Blue1Brown: Neural Networks</a></li><li><a href="https://dl.acm.org/doi/10.1145/3571884.3604316">Opening up ChatGPT: Tracking openness, transparency, and accountability in instruction-tuned text generators</a><ul><li>The online <a href="https://opening-up-chatgpt.github.io">table</a></li></ul></li></ul></li><li><a href="https://www.signal.org">Signal</a></li><li><a href="https://huggingface.co/bigscience/bloomz">Bloomz</a> model on HuggingFace</li><li><a href="https://mistral.ai">Mistral</a> website</li><li>NASA Tragedies<ul><li><a href="https://en.wikipedia.org/wiki/Space_Shuttle_Challenger_disaster">Challenger disaster</a> on Wikipedia</li><li><a href="https://en.wikipedia.org/wiki/Space_Shuttle_Columbia_disaster">Columbia disaster</a> on Wikipedia</li></ul></li><li>AIxBio Risk<ul><li><a href="https://www.nature.com/articles/s42256-022-00465-9">Dual use of artificial-intelligence-powered drug discovery</a></li><li><a href="https://arxiv.org/abs/2306.03809">Can large language models democratize access to dual-use biotechnology?</a></li><li><a href="https://www.governance.ai/research-paper/open-sourcing-highly-capable-foundation-models">Open-Sourcing Highly Capable Foundation Models</a> <em>(sadly, I can't rename the article...)</em></li><li><a href="https://1a3orn.com/sub/essays-propaganda-or-science.html">Propaganda or Science: Open Source AI and Bioterrorism Risk</a></li><li><a href="https://ineffectivealtruismblog.com/2024/03/09/exaggerating-the-risks-part-14-biorisk-from-llms/">Exaggerating the risks (Part 15: Biorisk from LLMs)</a></li><li><a href="https://arxiv.org/abs/2310.18233">Will releasing the weights of future large language models grant widespread access to pandemic agents?</a></li><li><a href="https://crfm.stanford.edu/open-fms/">On the Societal Impact of Open Foundation Models</a><ul><li><a href="https://hai.stanford.edu/sites/default/files/2023-12/Governing-Open-Foundation-Models.pdf">Policy brief</a></li></ul></li></ul></li><li><a href="https://www.apartresearch.com">Apart Research</a></li><li><a href="https://www.science.org">Science</a></li><li>Cicero<ul><li><a href="https://www.science.org/doi/10.1126/science.ade9097">Human-level play in the game of Diplomacy by combining language models with strategic reasoning</a></li><li><a href="https://ai.meta.com/research/cicero/">Cicero</a> webpage</li><li><a href="https://arxiv.org/abs/2308.14752">AI Deception: A Survey of Examples, Risks, and Potential Solutions</a></li></ul></li><li><a href="https://demos.co.uk/research/open-sourcing-the-ai-revolution-framing-the-debate-on-open-source-artificial-intelligence-and-regulation/">Open Sourcing the AI Revolution: Framing the debate on open source, artificial intelligence and regulation</a></li><li><a href="https://aisafety.camp">AI Safety Camp</a></li><li><a href="https://www.patreon.com/IntoAISafety">Into AI Safety Patreon</a></li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 25 Mar 2024 09:00:00 -0600</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/8c7eea4d/fac829ef.mp3" length="97914928" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>6120</itunes:duration>
      <itunes:summary>As always, the best things come in 3s: dimensions, musketeers, pyramids, and... 3 installments of my interview with Dr. Peter Park, an AI Existential Safety Post-doctoral Fellow working with Dr. Max Tegmark at MIT.

As you may have ascertained from the previous two segments of the interview, Dr. Park cofounded StakeOut.AI along with Harry Luk and one other cofounder whose name has been removed due to requirements of her current position. The non-profit had a simple but important mission: make the adoption of AI technology go well, for humanity. Unfortunately, StakeOut.AI had to dissolve in late February of 2024 because no grantmaker would fund them. Although it certainly is disappointing that the organization is no longer functioning, all three cofounders continue to contribute positively towards improving our world in their current roles.

If you would like to investigate further into Dr. Park's work, view his website, Google Scholar, or follow him on Twitter.

00:00:54 ❙ Intro
00:02:41 ❙ Rapid development
00:08:25 ❙ Provable safety, safety factors, &amp; CSAM
00:18:50 ❙ Litigation
00:23:06 ❙ Open/Closed Source
00:38:52 ❙ AIxBio
00:47:50 ❙ Scientific rigor in AI
00:56:22 ❙ AI deception
01:02:45 ❙ No takesies-backsies
01:08:22 ❙ StakeOut.AI's start
01:12:53 ❙ Sustainability &amp; Agency
01:18:21 ❙ "I'm sold, next steps?" -you
01:23:53 ❙ Lessons from the amazing Spiderman
01:33:15 ❙ "I'm ready to switch careers, next steps?" -you
01:40:00 ❙ The most important question
01:41:11 ❙ Outro

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.
StakeOut.AI
Pause AI
AI Governance Scorecard (go to Pg. 3)
CIVITAI
Article on CIVITAI and CSAM
Senate Hearing: Protecting Children Online
PBS Newshour Coverage
The Times Sues OpenAI and Microsoft Over A.I. Use of Copyrighted Work
Open Source/Weights/Release/Interpretation
Open Source Initiative
History of the OSI
Meta’s LLaMa 2 license is not Open Source
Is Llama 2 open source? No – and perhaps we need a new definition of open…
Apache License, Version 2.0
3Blue1Brown: Neural Networks
Opening up ChatGPT: Tracking openness, transparency, and accountability in instruction-tuned text generators
The online table
Signal
Bloomz model on HuggingFace
Mistral website
NASA Tragedies
Challenger disaster on Wikipedia
Columbia disaster on Wikipedia
AIxBio Risk
Dual use of artificial-intelligence-powered drug discovery
Can large language models democratize access to dual-use biotechnology?
Open-Sourcing Highly Capable Foundation Models (sadly, I can't rename the article...)
Propaganda or Science: Open Source AI and Bioterrorism Risk
Exaggerating the risks (Part 15: Biorisk from LLMs)
Will releasing the weights of future large language models grant widespread access to pandemic agents?
On the Societal Impact of Open Foundation Models
Policy brief
Apart Research
Science
Cicero
Human-level play in the game of Diplomacy by combining language models with strategic reasoning
Cicero webpage
AI Deception: A Survey of Examples, Risks, and Potential Solutions
Open Sourcing the AI Revolution: Framing the debate on open source, artificial intelligence and regulation
AI Safety Camp
Into AI Safety Patreon</itunes:summary>
      <itunes:subtitle>As always, the best things come in 3s: dimensions, musketeers, pyramids, and... 3 installments of my interview with Dr. Peter Park, an AI Existential Safety Post-doctoral Fellow working with Dr. Max Tegmark at MIT. As you may have ascertained from the prev</itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
      <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
      <podcast:person role="Guest" href="https://scholar.harvard.edu/pspark" img="https://img.transistor.fm/fTHAJzIGL_6s2TRpoZXuV1BjZUfMbdgdaI3yJUCCzRI/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9mNjIz/ZmM0ZGJlMDU0NDZj/NDQ4NGVjNjE0YjFm/NDE1MC5qcGc.jpg">Dr. Peter S. Park</podcast:person>
    </item>
    <item>
      <title>INTERVIEW: StakeOut.AI w/ Dr. Peter Park (2)</title>
      <itunes:episode>17</itunes:episode>
      <podcast:episode>17</podcast:episode>
      <itunes:title>INTERVIEW: StakeOut.AI w/ Dr. Peter Park (2)</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">ecb55f43-ac84-4293-b93c-0b9562b236a4</guid>
      <link>https://kairos.fm/intoaisafety/e017</link>
      <description>
        <![CDATA[<p>Join me for round 2 with Dr. Peter Park, an AI Existential Safety Postdoctoral Fellow working with Dr. Max Tegmark at MIT. Dr. Park was a cofounder of <a href="https://www.stakeout.ai">StakeOut.AI</a>, a non-profit focused on making AI go well <em>for humans</em>, along with Harry Luk and one other individual, whose name has been removed due to requirements of her current position.</p><p>In addition to the normal links, I wanted to include the links to the petitions that Dr. Park mentions during the podcast. Note that the nonprofit which began these petitions, StakeOut.AI, has been dissolved.<br><a href="https://www.change.org/p/right-ai-laws-to-right-our-future-support-artificial-intelligence-safety-regulations-now">Right AI Laws, to Right Our Future: Support Artificial Intelligence Safety Regulations Now</a><br><a href="https://www.change.org/p/is-deepfake-illegal-not-yet-ban-deepfakes-to-protect-your-family-demand-deepfake-laws">Is Deepfake Illegal? Not Yet! Ban Deepfakes to Protect Your Family &amp; Demand Deepfake Laws</a><br><a href="https://www.change.org/p/ban-superintelligence-stop-ai-driven-human-extinction-risk">Ban Superintelligence: Stop AI-Driven Human Extinction Risk </a></p><p>00:00:54 - Intro<br>00:02:34 - Battleground 1: Copyright<br>00:06:28 - Battleground 2: Moral Critique of AI Collaborationists<br>00:08:15 - Rich Sutton<br>00:20:41 - OpenAI <em>Drama</em><br>00:34:28 - Battleground 3: Contract Negotiations for AI Ban Clauses<br>00:37:57 - Tesla, Autopilot, and FSD<br>00:40:02 - Recycling<br>00:47:40 - Battleground 4: New Laws and Policies<br>00:50:00 - Battleground 5: Whistleblower Protections<br>00:53:07 - Whistleblowing on Microsoft<br>00:54:43 - Andrej Karpathy &amp; Exercises in Empathy<br>01:05:57 - Outro</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://www.stakeout.ai">StakeOut.AI</a></li><li><a href="https://www.nytimes.com/2023/12/27/business/media/new-york-times-open-ai-microsoft-lawsuit.html">The Times Sues OpenAI and Microsoft Over A.I. 
Use of Copyrighted Work</a></li><li><a href="https://www.susmangodfrey.com">Susman Godfrey LLP</a></li><li>Rich Sutton<ul><li><a href="http://incompleteideas.net/book/RLbook2020trimmed.pdf">Reinforcement Learning: An Introduction</a> (textbook)</li><li><a href="https://www.youtube.com/watch?v=NgHFMolXs3U">AI Succession</a> (presentation by Rich Sutton)</li><li><a href="https://arxiv.org/abs/2208.11173">The Alberta Plan for AI Research</a></li></ul></li><li>Moore's Law<ul><li><a href="https://www.computerhistory.org/collections/catalog/102770836">The Future of Integrated Electronics</a> (original paper)</li><li><a href="https://www.computerhistory.org/siliconengine/moores-law-predicts-the-future-of-integrated-circuits/">Computer History Museum's entry on Moore's Law</a></li></ul></li><li><a href="https://en.wikipedia.org/wiki/Stochastic_gradient_descent">Stochastic gradient descent (SGD)</a> on Wikipedia</li><li>OpenAI <em>Drama</em><ul><li><a href="https://maxread.substack.com/p/the-interested-normies-guide-to-openai">Max Read's Substack post</a></li><li>Zvi Mowshowitz's Substack series, in order of posting<ul><li><a href="https://thezvi.substack.com/p/openai-facts-from-a-weekend?utm_source=%2Fsearch%2FOpenAI&amp;utm_medium=reader2">OpenAI: Facts from a Weekend</a></li><li><a href="https://thezvi.substack.com/p/openai-the-battle-of-the-board?utm_source=%2Fsearch%2FOpenAI&amp;utm_medium=reader2">OpenAI: The Battle of the Board</a></li><li><a href="https://thezvi.substack.com/p/openai-altman-returns?utm_source=%2Fsearch%2FOpenAI&amp;utm_medium=reader2">OpenAI: Altman Returns</a></li><li><a href="https://thezvi.substack.com/p/openai-leaks-confirm-the-story?utm_source=%2Fsearch%2FOpenAI&amp;utm_medium=reader2">OpenAI: Leaks Confirm the Story</a> ← best singular post in the series</li><li><a href="https://thezvi.substack.com/p/openai-the-board-expands?utm_source=%2Fsearch%2FOpenAI&amp;utm_medium=reader2">OpenAI: The Board Expands</a></li></ul></li><li><a href="https://openai.com/blog/openai-announces-new-members-to-board-of-directors">Official OpenAI announcement</a></li></ul></li><li><a href="https://en.wikipedia.org/wiki/Writers_Guild_of_America">WGA</a> on Wikipedia</li><li><a href="https://en.wikipedia.org/wiki/SAG-AFTRA">SAG-AFTRA</a> on Wikipedia</li><li>Tesla's False Advertising<ul><li><a href="https://www.latimes.com/business/story/2023-12-11/tesla-dmv-false-advertising-charges">Tesla's response to the DMV's false-advertising allegations: What took so long?</a></li><li><a href="https://www.caranddriver.com/news/a35785277/tesla-fsd-california-self-driving/">Tesla Tells California DMV that FSD Is Not Capable of Autonomous Driving</a></li><li><a href="https://www.caranddriver.com/features/columns/a35769759/carty-self-driving-cars/">What to Call Full Self-Driving When It Isn't Full Self-Driving?</a></li><li><a href="https://www.cnbc.com/2022/03/15/tesla-fired-employee-who-posted-fsd-beta-videos-as-ai-addict-on-youtube.html">Tesla fired an employee after he posted driverless tech reviews on YouTube</a></li><li><a href="https://www.tesla.com/support/autopilot">Tesla's page</a> on Autopilot and Full Self-Driving</li></ul></li><li>Recycling<ul><li><a href="https://www.longmontleader.com/regional-news/boulder-county-recycling-center-stockpiles-accurately-sorted-recyclable-materials-2385879">Boulder County Recycling Center Stockpiles Accurately Sorted Recyclable Materials</a></li><li><a href="https://boulderweekly.com/news/out-of-sight-out-of-mind/">Out of sight, out of mind</a></li><li><a 
href="https://ecocycle.org/content/uploads/2022/06/2024-Recycling-Guidelines-for-Boulder-County.pdf">Boulder Eco-Cycle Recycling Guidelines</a></li></ul></li><li><a href="https://arxiv.org/abs/2310.06009">Divide-and-Conquer Dynamics in AI-Driven Disempowerment</a></li><li>Microsoft Whistleblower<ul><li><a href="https://www.axios.com/2024/03/08/ai-whistleblowers-microsoft-copilot-designer-google-gemini">Whistleblowers call out AI's flaws</a></li><li><a href="https://www.linkedin.com/feed/update/urn:li:activity:7171135079702753280/">Shane's LinkedIn post</a><ul><li><a href="https://media.licdn.com/dms/document/media/D561FAQFfYnpLbIn2Xg/feedshare-document-pdf-analyzed/0/1709731400675?e=1711584000&amp;v=beta&amp;t=W3CLljNWJ8YCQSEuMCR9bt7UoIxsGo24Epj9GddfN8U">Letters sent by Jones</a></li></ul></li></ul></li><li><a href="https://twitter.com/karpathy/status/1757600075281547344">Karpathy announces departure from OpenAI</a></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Join me for round 2 with Dr. Peter Park, an AI Existential Safety Postdoctoral Fellow working with Dr. Max Tegmark at MIT. Dr. Park was a cofounder of <a href="https://www.stakeout.ai">StakeOut.AI</a>, a non-profit focused on making AI go well <em>for humans</em>, along with Harry Luk and one other individual, whose name has been removed due to requirements of her current position.</p><p>In addition to the normal links, I wanted to include the links to the petitions that Dr. Park mentions during the podcast. Note that the nonprofit which began these petitions, StakeOut.AI, has been dissolved.<br><a href="https://www.change.org/p/right-ai-laws-to-right-our-future-support-artificial-intelligence-safety-regulations-now">Right AI Laws, to Right Our Future: Support Artificial Intelligence Safety Regulations Now</a><br><a href="https://www.change.org/p/is-deepfake-illegal-not-yet-ban-deepfakes-to-protect-your-family-demand-deepfake-laws">Is Deepfake Illegal? Not Yet! Ban Deepfakes to Protect Your Family &amp; Demand Deepfake Laws</a><br><a href="https://www.change.org/p/ban-superintelligence-stop-ai-driven-human-extinction-risk">Ban Superintelligence: Stop AI-Driven Human Extinction Risk </a></p><p>00:00:54 - Intro<br>00:02:34 - Battleground 1: Copyright<br>00:06:28 - Battleground 2: Moral Critique of AI Collaborationists<br>00:08:15 - Rich Sutton<br>00:20:41 - OpenAI <em>Drama</em><br>00:34:28 - Battleground 3: Contract Negotiations for AI Ban Clauses<br>00:37:57 - Tesla, Autopilot, and FSD<br>00:40:02 - Recycling<br>00:47:40 - Battleground 4: New Laws and Policies<br>00:50:00 - Battleground 5: Whistleblower Protections<br>00:53:07 - Whistleblowing on Microsoft<br>00:54:43 - Andrej Karpathy &amp; Exercises in Empathy<br>01:05:57 - Outro</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://www.stakeout.ai">StakeOut.AI</a></li><li><a href="https://www.nytimes.com/2023/12/27/business/media/new-york-times-open-ai-microsoft-lawsuit.html">The Times Sues OpenAI and Microsoft Over A.I. 
Use of Copyrighted Work</a></li><li><a href="https://www.susmangodfrey.com">Susman Godfrey LLP</a></li><li>Rich Sutton<ul><li><a href="http://incompleteideas.net/book/RLbook2020trimmed.pdf">Reinforcement Learning: An Introduction</a> (textbook)</li><li><a href="https://www.youtube.com/watch?v=NgHFMolXs3U">AI Succession</a> (presentation by Rich Sutton)</li><li><a href="https://arxiv.org/abs/2208.11173">The Alberta Plan for AI Research</a></li></ul></li><li>Moore's Law<ul><li><a href="https://www.computerhistory.org/collections/catalog/102770836">The Future of Integrated Electronics</a> (original paper)</li><li><a href="https://www.computerhistory.org/siliconengine/moores-law-predicts-the-future-of-integrated-circuits/">Computer History Museum's entry on Moore's Law</a></li></ul></li><li><a href="https://en.wikipedia.org/wiki/Stochastic_gradient_descent">Stochastic gradient descent (SGD)</a> on Wikipedia</li><li>OpenAI <em>Drama</em><ul><li><a href="https://maxread.substack.com/p/the-interested-normies-guide-to-openai">Max Read's Substack post</a></li><li>Zvi Mowshowitz's Substack series, in order of posting<ul><li><a href="https://thezvi.substack.com/p/openai-facts-from-a-weekend?utm_source=%2Fsearch%2FOpenAI&amp;utm_medium=reader2">OpenAI: Facts from a Weekend</a></li><li><a href="https://thezvi.substack.com/p/openai-the-battle-of-the-board?utm_source=%2Fsearch%2FOpenAI&amp;utm_medium=reader2">OpenAI: The Battle of the Board</a></li><li><a href="https://thezvi.substack.com/p/openai-altman-returns?utm_source=%2Fsearch%2FOpenAI&amp;utm_medium=reader2">OpenAI: Altman Returns</a></li><li><a href="https://thezvi.substack.com/p/openai-leaks-confirm-the-story?utm_source=%2Fsearch%2FOpenAI&amp;utm_medium=reader2">OpenAI: Leaks Confirm the Story</a> ← best singular post in the series</li><li><a href="https://thezvi.substack.com/p/openai-the-board-expands?utm_source=%2Fsearch%2FOpenAI&amp;utm_medium=reader2">OpenAI: The Board Expands</a></li></ul></li><li><a href="https://openai.com/blog/openai-announces-new-members-to-board-of-directors">Official OpenAI announcement</a></li></ul></li><li><a href="https://en.wikipedia.org/wiki/Writers_Guild_of_America">WGA</a> on Wikipedia</li><li><a href="https://en.wikipedia.org/wiki/SAG-AFTRA">SAG-AFTRA</a> on Wikipedia</li><li>Tesla's False Advertising<ul><li><a href="https://www.latimes.com/business/story/2023-12-11/tesla-dmv-false-advertising-charges">Tesla's response to the DMV's false-advertising allegations: What took so long?</a></li><li><a href="https://www.caranddriver.com/news/a35785277/tesla-fsd-california-self-driving/">Tesla Tells California DMV that FSD Is Not Capable of Autonomous Driving</a></li><li><a href="https://www.caranddriver.com/features/columns/a35769759/carty-self-driving-cars/">What to Call Full Self-Driving When It Isn't Full Self-Driving?</a></li><li><a href="https://www.cnbc.com/2022/03/15/tesla-fired-employee-who-posted-fsd-beta-videos-as-ai-addict-on-youtube.html">Tesla fired an employee after he posted driverless tech reviews on YouTube</a></li><li><a href="https://www.tesla.com/support/autopilot">Tesla's page</a> on Autopilot and Full Self-Driving</li></ul></li><li>Recycling<ul><li><a href="https://www.longmontleader.com/regional-news/boulder-county-recycling-center-stockpiles-accurately-sorted-recyclable-materials-2385879">Boulder County Recycling Center Stockpiles Accurately Sorted Recyclable Materials</a></li><li><a href="https://boulderweekly.com/news/out-of-sight-out-of-mind/">Out of sight, out of mind</a></li><li><a 
href="https://ecocycle.org/content/uploads/2022/06/2024-Recycling-Guidelines-for-Boulder-County.pdf">Boulder Eco-Cycle Recycling Guidelines</a></li></ul></li><li><a href="https://arxiv.org/abs/2310.06009">Divide-and-Conquer Dynamics in AI-Driven Disempowerment</a></li><li>Microsoft Whistleblower<ul><li><a href="https://www.axios.com/2024/03/08/ai-whistleblowers-microsoft-copilot-designer-google-gemini">Whistleblowers call out AI's flaws</a></li><li><a href="https://www.linkedin.com/feed/update/urn:li:activity:7171135079702753280/">Shane's LinkedIn post</a><ul><li><a href="https://media.licdn.com/dms/document/media/D561FAQFfYnpLbIn2Xg/feedshare-document-pdf-analyzed/0/1709731400675?e=1711584000&amp;v=beta&amp;t=W3CLljNWJ8YCQSEuMCR9bt7UoIxsGo24Epj9GddfN8U">Letters sent by Jones</a></li></ul></li></ul></li><li><a href="https://twitter.com/karpathy/status/1757600075281547344">Karpathy announces departure from OpenAI</a></li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 18 Mar 2024 09:00:00 -0600</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/58dfc367/2d88b642.mp3" length="63715835" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>3983</itunes:duration>
      <itunes:summary>Join me for round 2 with Dr. Peter Park, an AI Existential Safety Postdoctoral Fellow working with Dr. Max Tegmark at MIT. Dr. Park was a cofounder of StakeOut.AI, a non-profit focused on making AI go well for humans, along with Harry Luk and one other individual, whose name has been removed due to requirements of her current position.In addition to the normal links, I wanted to include the links to the petitions that Dr. Park mentions during the podcast. Note that the nonprofit which began these petitions, StakeOut.AI, has been dissolved.Right AI Laws, to Right Our Future: Support Artificial Intelligence Safety Regulations NowIs Deepfake Illegal? Not Yet! Ban Deepfakes to Protect Your Family &amp;amp; Demand Deepfake LawsBan Superintelligence: Stop AI-Driven Human Extinction Risk
00:00:54 - Intro00:02:34 - Battleground 1: Copyright00:06:28 - Battleground 2: Moral Critique of AI Collaborationists00:08:15 - Rich Sutton00:20:41 - OpenAI Drama00:34:28 - Battleground 3: Contract Negotiations for AI Ban Clauses00:37:57 - Tesla, Autopilot, and FSD00:40:02 - Recycling00:47:40 - Battleground 4: New Laws and Policies00:50:00 - Battleground 5: Whistleblower Protections00:53:07 - Whistleblowing on Microsoft00:54:43 - Andrej Karpathy &amp;amp; Exercises in Empathy01:05:57 - OutroLinks to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.StakeOut.AIThe Times Sues OpenAI and Microsoft Over A.I. Use of Copyrighted WorkSusman Godfrey LLPRich SuttonReinforcement Learning: An Introduction (textbook)AI Succession (presentation by Rich Sutton)The Alberta Plan for AI Research Moore's LawThe Future of Integrated Electronics (original paper)Computer History Museum's entry on Moore's LawStochastic gradient descent (SGD) on WikipediaOpenAI DramaMax Read's Substack postZvi Mowshowitz's Substack series, in order of postingOpenAI: Facts from a WeekendOpenAI: The Battle of the BoardOpenAI: Altman ReturnsOpenAI: Leaks Confirm the Story ← best singular post in the seriesOpenAI: The Board ExpandsOfficial OpenAI announcementWGA on WikipediaSAG-AFTRA on WikipediaTesla's False AdvertisingTesla's response to the DMV's false-advertising allegations: What took so long?Tesla Tells California DMV that FSD Is Not Capable of Autonomous DrivingWhat to Call Full Self-Driving When It Isn't Full Self-Driving?Tesla fired an employee after he posted driverless tech reviews on YouTubeTesla's page on Autopilot and Full Self-DrivingRecyclingBoulder County Recycling Center Stockpiles Accurately Sorted Recyclable MaterialsOut of sight, out of mindBoulder Eco-Cycle Recycling GuidelinesDivide-and-Conquer Dynamics in AI-Driven DisempowermentMicrosoft WhistleblowerWhistleblowers call out AI's flawsShane's LinkedIn postLetters sent by JonesKarpathy announces departure from OpenAI</itunes:summary>
      <itunes:subtitle>Join me for round 2 with Dr. Peter Park, an AI Existential Safety Postdoctoral Fellow working with Dr. Max Tegmark at MIT. Dr. Park was a cofounder of StakeOut.AI, a non-profit focused on making AI go well for humans, along with Harry Luk and one other in</itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
      <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
      <podcast:person role="Guest" href="https://scholar.harvard.edu/pspark" img="https://img.transistor.fm/fTHAJzIGL_6s2TRpoZXuV1BjZUfMbdgdaI3yJUCCzRI/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9mNjIz/ZmM0ZGJlMDU0NDZj/NDQ4NGVjNjE0YjFm/NDE1MC5qcGc.jpg">Dr. Peter S. Park</podcast:person>
    </item>
    <item>
      <title>MINISODE: Restructure Vol. 2</title>
      <itunes:episode>16</itunes:episode>
      <podcast:episode>16</podcast:episode>
      <itunes:title>MINISODE: Restructure Vol. 2</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">6dc7d6e0-46ac-49ce-a60f-fe7c86e3f648</guid>
      <link>https://kairos.fm/intoaisafety/e016</link>
      <description>
        <![CDATA[<p>UPDATE: Contrary to what I say in this episode, I won't be removing <em>any</em> episodes that are already published from the podcast RSS feed.</p><p>After getting some advice and reflecting more on my own personal goals, I have decided to shift the direction of the podcast towards accessible content regarding "AI" instead of the show's original focus. I will still be releasing what I am calling research ride-along content to my <a href="https://www.patreon.com/IntoAISafety">Patreon</a>, but the show's feed will consist only of content that I aim to make as accessible as possible.</p><p>00:35 - TL;DL<br>01:12 - Advice from Pete<br>03:10 - My personal goal<br>05:39 - Reflection on refining my goal<br>09:08 - Looking forward (logistics)</p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>UPDATE: Contrary to what I say in this episode, I won't be removing <em>any</em> episodes that are already published from the podcast RSS feed.</p><p>After getting some advice and reflecting more on my own personal goals, I have decided to shift the direction of the podcast towards accessible content regarding "AI" instead of the show's original focus. I will still be releasing what I am calling research ride-along content to my <a href="https://www.patreon.com/IntoAISafety">Patreon</a>, but the show's feed will consist only of content that I aim to make as accessible as possible.</p><p>00:35 - TL;DL<br>01:12 - Advice from Pete<br>03:10 - My personal goal<br>05:39 - Reflection on refining my goal<br>09:08 - Looking forward (logistics)</p>]]>
      </content:encoded>
      <pubDate>Mon, 11 Mar 2024 09:00:00 -0600</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/fa6df270/5048eeef.mp3" length="12611516" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>789</itunes:duration>
      <itunes:summary>UPDATE: Contrary to what I say in this episode, I won't be removing any episodes that are already published from the podcast RSS feed.
After getting some advice and reflecting more on my own personal goals, I have decided to shift the direction of the podcast towards accessible content regarding "AI" instead of the show's original focus. I will still be releasing what I am calling research ride-along content to my Patreon, but the show's feed will consist only of content that I aim to make as accessible as possible.
00:35 - TL;DL01:12 - Advice from Pete03:10 - My personal goal05:39 - Reflection on refining my goal09:08 - Looking forward (logistics)</itunes:summary>
      <itunes:subtitle>UPDATE: Contrary to what I say in this episode, I won't be removing any episodes that are already published from the podcast RSS feed.
After getting some advice and reflecting more on my own personal goals, I have decided to shift the direction of the po</itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
      <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
    </item>
    <item>
      <title>INTERVIEW: StakeOut.AI w/ Dr. Peter Park (1)</title>
      <itunes:episode>15</itunes:episode>
      <podcast:episode>15</podcast:episode>
      <itunes:title>INTERVIEW: StakeOut.AI w/ Dr. Peter Park (1)</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">cd0206b8-6690-44c7-97e3-240c2b082096</guid>
      <link>https://kairos.fm/intoaisafety/e015</link>
      <description>
        <![CDATA[<p>Dr. Peter Park is an AI Existential Safety Postdoctoral Fellow working with Dr. Max Tegmark at MIT. In conjunction with Harry Luk and one other cofounder, he founded <a href="https://www.stakeout.ai/">StakeOut.AI</a>, a non-profit focused on making AI go well <em>for humans</em>.</p><p>00:54 - Intro<br>03:15 - Dr. Park, x-risk, and AGI<br>08:55 - StakeOut.AI<br>12:05 - Governance scorecard<br>19:34 - Hollywood webinar<br>22:02 - Regulations.gov comments<br>23:48 - Open letters<br>26:15 - EU AI Act<br>35:07 - Effective accelerationism<br>40:50 - Divide and conquer dynamics<br>45:40 - AI "art"<br>53:09 - Outro</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://www.stakeout.ai">StakeOut.AI</a></li><li><a href="https://futureoflife.org/wp-content/uploads/2023/11/FLI_Governance_Scorecard_and_Framework.pdf">AI Governance Scorecard</a> (go to Pg. 3)</li><li><a href="https://pauseai.info">Pause AI</a></li><li><a href="https://www.regulations.gov">Regulations.gov</a><ul><li><a href="https://www.regulations.gov/comment/COLC-2023-0006-10077">USCO StakeOut.AI Comment</a></li><li><a href="https://www.regulations.gov/comment/OMB-2023-0020-0170">OMB StakeOut.AI Comment</a></li></ul></li><li><a href="https://aitreaty.org">AI Treaty open letter</a></li><li><a href="https://taisc.org">TAISC</a></li><li><a href="https://crfm.stanford.edu/2023/03/13/alpaca.html">Alpaca: A Strong, Replicable Instruction-Following Model</a></li><li>References on EU AI Act and Cedric O<ul><li><a href="https://twitter.com/cedric_o/status/1728724005459235052">Tweet from Cedric O</a></li><li><a href="https://www.euractiv.com/section/artificial-intelligence/news/eu-policymakers-enter-the-last-mile-for-artificial-intelligence-rulebook/">EU policymakers enter the last mile for Artificial Intelligence rulebook</a></li><li><a href="https://www.euractiv.com/section/artificial-intelligence/news/ai-act-eu-parliaments-legal-office-gives-damning-opinion-on-high-risk-classification-filters/">AI Act: EU Parliament’s legal office gives damning opinion on high-risk classification ‘filters’</a></li><li><a href="https://www.euractiv.com/section/artificial-intelligence/news/eus-ai-act-negotiations-hit-the-brakes-over-foundation-models/">EU’s AI Act negotiations hit the brakes over foundation models</a></li><li><a href="https://www.foundation-models.eu">The EU AI Act needs Foundation Model Regulation</a></li><li><a href="https://verfassungsblog.de/bigtechs-efforts-to-derail-the-ai-act/">BigTech’s Efforts to Derail the AI Act</a></li></ul></li><li><a href="https://demos.co.uk/research/open-sourcing-the-ai-revolution-framing-the-debate-on-open-source-artificial-intelligence-and-regulation/">Open Sourcing the AI Revolution: Framing the debate on open source, artificial intelligence and regulation</a></li><li><a href="https://arxiv.org/abs/2310.06009">Divide-and-Conquer Dynamics in AI-Driven Disempowerment</a></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Dr. Peter Park is an AI Existential Safety Postdoctoral Fellow working with Dr. Max Tegmark at MIT. In conjunction with Harry Luk and one other cofounder, he founded <a href="https://www.stakeout.ai/">StakeOut.AI</a>, a non-profit focused on making AI go well <em>for humans</em>.</p><p>00:54 - Intro<br>03:15 - Dr. Park, x-risk, and AGI<br>08:55 - StakeOut.AI<br>12:05 - Governance scorecard<br>19:34 - Hollywood webinar<br>22:02 - Regulations.gov comments<br>23:48 - Open letters<br>26:15 - EU AI Act<br>35:07 - Effective accelerationism<br>40:50 - Divide and conquer dynamics<br>45:40 - AI "art"<br>53:09 - Outro</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://www.stakeout.ai">StakeOut.AI</a></li><li><a href="https://futureoflife.org/wp-content/uploads/2023/11/FLI_Governance_Scorecard_and_Framework.pdf">AI Governance Scorecard</a> (go to Pg. 3)</li><li><a href="https://pauseai.info">Pause AI</a></li><li><a href="https://www.regulations.gov">Regulations.gov</a><ul><li><a href="https://www.regulations.gov/comment/COLC-2023-0006-10077">USCO StakeOut.AI Comment</a></li><li><a href="https://www.regulations.gov/comment/OMB-2023-0020-0170">OMB StakeOut.AI Comment</a></li></ul></li><li><a href="https://aitreaty.org">AI Treaty open letter</a></li><li><a href="https://taisc.org">TAISC</a></li><li><a href="https://crfm.stanford.edu/2023/03/13/alpaca.html">Alpaca: A Strong, Replicable Instruction-Following Model</a></li><li>References on EU AI Act and Cedric O<ul><li><a href="https://twitter.com/cedric_o/status/1728724005459235052">Tweet from Cedric O</a></li><li><a href="https://www.euractiv.com/section/artificial-intelligence/news/eu-policymakers-enter-the-last-mile-for-artificial-intelligence-rulebook/">EU policymakers enter the last mile for Artificial Intelligence rulebook</a></li><li><a href="https://www.euractiv.com/section/artificial-intelligence/news/ai-act-eu-parliaments-legal-office-gives-damning-opinion-on-high-risk-classification-filters/">AI Act: EU Parliament’s legal office gives damning opinion on high-risk classification ‘filters’</a></li><li><a href="https://www.euractiv.com/section/artificial-intelligence/news/eus-ai-act-negotiations-hit-the-brakes-over-foundation-models/">EU’s AI Act negotiations hit the brakes over foundation models</a></li><li><a href="https://www.foundation-models.eu">The EU AI Act needs Foundation Model Regulation</a></li><li><a href="https://verfassungsblog.de/bigtechs-efforts-to-derail-the-ai-act/">BigTech’s Efforts to Derail the AI Act</a></li></ul></li><li><a href="https://demos.co.uk/research/open-sourcing-the-ai-revolution-framing-the-debate-on-open-source-artificial-intelligence-and-regulation/">Open Sourcing the AI Revolution: Framing the debate on open source, artificial intelligence and regulation</a></li><li><a href="https://arxiv.org/abs/2310.06009">Divide-and-Conquer Dynamics in AI-Driven Disempowerment</a></li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 04 Mar 2024 09:00:00 -0700</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/81a6fb5c/cb1fde4e.mp3" length="52004201" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>3251</itunes:duration>
      <itunes:summary>Dr. Peter Park is an AI Existential Safety Postdoctoral Fellow working with Dr. Max Tegmark at MIT. In conjunction with Harry Luk and one other cofounder, he founded 
⁠StakeOut.AI, a non-profit focused on making AI go well for humans.
00:54 - Intro03:15 - Dr. Park, x-risk, and AGI08:55 - StakeOut.AI12:05 - Governance scorecard19:34 - Hollywood webinar22:02 - Regulations.gov comments23:48 - Open letters 26:15 - EU AI Act35:07 - Effective accelerationism40:50 - Divide and conquer dynamics45:40 - AI "art"53:09 - Outro
Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

 StakeOut.AI
 AI Governance Scorecard (go to Pg. 3)
 Pause AI
 Regulations.gov
  USCO StakeOut.AI Comment
  OMB StakeOut.AI Comment
 AI Treaty open letter
 TAISC
 Alpaca: A Strong, Replicable Instruction-Following Model
 References on EU AI Act and Cedric O
  Tweet from Cedric O
  EU policymakers enter the last mile for Artificial Intelligence rulebook
  AI Act: EU Parliament’s legal office gives damning opinion on high-risk classification ‘filters’
  EU’s AI Act negotiations hit the brakes over foundation models
  The EU AI Act needs Foundation Model Regulation
  BigTech’s Efforts to Derail the AI Act
 Open Sourcing the AI Revolution: Framing the debate on open source, artificial intelligence and regulation
 Divide-and-Conquer Dynamics in AI-Driven Disempowerment</itunes:summary>
      <itunes:subtitle>Dr. Peter Park is an AI Existential Safety Postdoctoral Fellow working with Dr. Max Tegmark at MIT. In conjunction with Harry Luk and one other cofounder, he founded 
⁠StakeOut.AI, a non-profit focused on making AI go well for humans.
00:54 - Intro03:15</itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
      <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
      <podcast:person role="Guest" href="https://scholar.harvard.edu/pspark" img="https://img.transistor.fm/fTHAJzIGL_6s2TRpoZXuV1BjZUfMbdgdaI3yJUCCzRI/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9mNjIz/ZmM0ZGJlMDU0NDZj/NDQ4NGVjNjE0YjFm/NDE1MC5qcGc.jpg">Dr. Peter S. Park</podcast:person>
    </item>
    <item>
      <title>MINISODE: "LLMs, a Survey"</title>
      <itunes:episode>14</itunes:episode>
      <podcast:episode>14</podcast:episode>
      <itunes:title>MINISODE: "LLMs, a Survey"</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">2d7e4db8-5798-4148-9e19-53a5143c4919</guid>
      <link>https://kairos.fm/intoaisafety/e014</link>
      <description>
        <![CDATA[<p>Take a trip with me through the paper <a href="https://arxiv.org/abs/2402.06196">Large Language Models, A Survey</a>, published on February 9th of 2024. All figures and tables mentioned throughout the episode can be found on the Into AI Safety <a href="https://into-ai-safety.github.io">podcast website</a>.</p><p>00:36 - Intro and authors<br>01:50 - My takes and paper structure<br>04:40 - Getting to LLMs<br>07:27 - Defining LLMs &amp; emergence<br>12:12 - Overview of PLMs<br>15:00 - How LLMs are built<br>18:52 - Limitations of LLMs<br>23:06 - Uses of LLMs<br>25:16 - Evaluations and Benchmarks<br>28:11 - Challenges and future directions<br>29:21 - Recap &amp; outro</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://arxiv.org/abs/2402.06196">Large Language Models, A Survey</a></li><li><a href="https://www.linkedin.com/posts/meysam-ac_i-am-delighted-to-share-that-our-most-recent-activity-7162768857827377152-wiLu/?utm_source=share&amp;utm_medium=member_desktop">Meysam's LinkedIn Post</a></li><li>Claude E. Shannon<ul><li><a href="https://dspace.mit.edu/handle/1721.1/11173">A symbolic analysis of relay and switching circuits</a> (Master's Thesis)</li><li><a href="https://ieeexplore.ieee.org/document/6769090">Communication theory of secrecy systems</a></li><li><a href="https://ieeexplore.ieee.org/document/6773024">A mathematical theory of communication</a></li><li><a href="https://ieeexplore.ieee.org/document/6773263">Prediction and entropy of printed English</a></li></ul></li><li><a href="https://bounded-regret.ghost.io/future-ml-systems-will-be-qualitatively-different/">Future ML Systems Will Be Qualitatively Different</a></li><li><a href="https://www.science.org/doi/10.1126/science.177.4047.393?ref=bounded-regret.ghost.io">More Is Different</a></li><li><a href="https://arxiv.org/abs/2401.05566">Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training</a></li><li><a href="https://arxiv.org/abs/2304.15004">Are Emergent Abilities of Large Language Models a Mirage?</a></li><li><a href="https://arxiv.org/abs/2309.01809">Are Emergent Abilities of Large Language Models just In-Context Learning?</a></li><li><a href="https://arxiv.org/abs/1706.03762">Attention is all you need</a></li><li><a href="https://arxiv.org/abs/2305.18290">Direct Preference Optimization: Your Language Model is Secretly a Reward Model</a></li><li><a href="https://arxiv.org/abs/2402.01306">KTO: Model Alignment as Prospect Theoretic Optimization</a></li><li><a href="https://www2.stat.duke.edu/~scs/Courses/Stat376/Papers/TemperAnneal/KirkpatrickAnnealScience1983.pdf">Optimization by Simulated Annealing</a></li><li><a href="https://openai.com/blog/memory-and-new-controls-for-chatgpt">Memory and new controls for ChatGPT</a></li><li><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4515540/">Hallucinations and related concepts—their conceptual background</a></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Take a trip with me through the paper <a href="https://arxiv.org/abs/2402.06196">Large Language Models, A Survey</a>, published on February 9th of 2024. All figures and tables mentioned throughout the episode can be found on the Into AI Safety <a href="https://into-ai-safety.github.io">podcast website</a>.</p><p>00:36 - Intro and authors<br>01:50 - My takes and paper structure<br>04:40 - Getting to LLMs<br>07:27 - Defining LLMs &amp; emergence<br>12:12 - Overview of PLMs<br>15:00 - How LLMs are built<br>18:52 - Limitations of LLMs<br>23:06 - Uses of LLMs<br>25:16 - Evaluations and Benchmarks<br>28:11 - Challenges and future directions<br>29:21 - Recap &amp; outro</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://arxiv.org/abs/2402.06196">Large Language Models, A Survey</a></li><li><a href="https://www.linkedin.com/posts/meysam-ac_i-am-delighted-to-share-that-our-most-recent-activity-7162768857827377152-wiLu/?utm_source=share&amp;utm_medium=member_desktop">Meysam's LinkedIn Post</a></li><li>Claude E. Shannon<ul><li><a href="https://dspace.mit.edu/handle/1721.1/11173">A symbolic analysis of relay and switching circuits</a> (Master's Thesis)</li><li><a href="https://ieeexplore.ieee.org/document/6769090">Communication theory of secrecy systems</a></li><li><a href="https://ieeexplore.ieee.org/document/6773024">A mathematical theory of communication</a></li><li><a href="https://ieeexplore.ieee.org/document/6773263">Prediction and entropy of printed English</a></li></ul></li><li><a href="https://bounded-regret.ghost.io/future-ml-systems-will-be-qualitatively-different/">Future ML Systems Will Be Qualitatively Different</a></li><li><a href="https://www.science.org/doi/10.1126/science.177.4047.393?ref=bounded-regret.ghost.io">More Is Different</a></li><li><a href="https://arxiv.org/abs/2401.05566">Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training</a></li><li><a href="https://arxiv.org/abs/2304.15004">Are Emergent Abilities of Large Language Models a Mirage?</a></li><li><a href="https://arxiv.org/abs/2309.01809">Are Emergent Abilities of Large Language Models just In-Context Learning?</a></li><li><a href="https://arxiv.org/abs/1706.03762">Attention is all you need</a></li><li><a href="https://arxiv.org/abs/2305.18290">Direct Preference Optimization: Your Language Model is Secretly a Reward Model</a></li><li><a href="https://arxiv.org/abs/2402.01306">KTO: Model Alignment as Prospect Theoretic Optimization</a></li><li><a href="https://www2.stat.duke.edu/~scs/Courses/Stat376/Papers/TemperAnneal/KirkpatrickAnnealScience1983.pdf">Optimization by Simulated Annealing</a></li><li><a href="https://openai.com/blog/memory-and-new-controls-for-chatgpt">Memory and new controls for ChatGPT</a></li><li><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4515540/">Hallucinations and related concepts—their conceptual background</a></li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 26 Feb 2024 09:00:00 -0700</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/24982632/d6945322.mp3" length="29667591" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>1855</itunes:duration>
      <itunes:summary>Take a trip with me through the paper Large Language Models, A Survey, published on February 9th of 2024. All figures and tables mentioned throughout the episode can be found on the Into AI Safety podcast website.
00:36 - Intro and authors01:50 - My takes and paper structure04:40 - Getting to LLMs07:27 - Defining LLMs &amp;amp; emergence12:12 - Overview of PLMs15:00 - How LLMs are built18:52 - Limitations of LLMs23:06 - Uses of LLMs25:16 - Evaluations and Benchmarks28:11 - Challenges and future directions29:21 - Recap &amp;amp; outro
Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.Large Language Models, A SurveyMeysam's LinkedIn PostClaude E. ShannonA symbolic analysis of relay and switching circuits (Master's Thesis)Communication theory of secrecy systemsA mathematical theory of communicationPrediction and entropy of printed EnglishFuture ML Systems Will Be Qualitatively DifferentMore Is DifferentSleeper Agents: Training Deceptive LLMs that Persist Through Safety TrainingAre Emergent Abilities of Large Language Models a Mirage?Are Emergent Abilities of Large Language Models just In-Context Learning?Attention is all you needDirect Preference Optimization: Your Language Model is Secretly a Reward ModelKTO: Model Alignment as Prospect Theoretic OptimizationOptimization by Simulated AnnealingMemory and new controls for ChatGPTHallucinations and related concepts—their conceptual background</itunes:summary>
      <itunes:subtitle>Take a trip with me through the paper Large Language Models, A Survey, published on February 9th of 2024. All figures and tables mentioned throughout the episode can be found on the Into AI Safety podcast website.
00:36 - Intro and authors01:50 - My take</itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
      <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
    </item>
    <item>
      <title>FEEDBACK: Applying for Funding w/ Esben Kran</title>
      <itunes:episode>13</itunes:episode>
      <podcast:episode>13</podcast:episode>
      <itunes:title>FEEDBACK: Applying for Funding w/ Esben Kran</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">9c9288c4-d48b-48fa-aec6-b11cc5803619</guid>
      <link>https://kairos.fm/intoaisafety/e013</link>
      <description>
        <![CDATA[<p>Esben reviews an application that I would soon submit for Open Philanthropy's Career Transition Funding opportunity. Although I didn't end up receiving the funding, I do think that this episode can be a valuable resource for both others and myself when applying for funding in the future.</p><p>Head over to Apart Research's <a href="https://apartresearch.com">website</a> to check out their work, or the Alignment Jam <a href="https://alignmentjam.com">website</a> for information on upcoming hackathons.</p><p>A doc-capsule of the application at the time of this recording can be found <a href="https://docs.google.com/document/d/1ofk5VLvNeJytd5Rpx18zui-JfV900DZv5nZShZcrgfY/edit?usp=sharing">at this link</a>.</p><p>01:38 - Interview starts<br>05:41 - Proposal<br>11:00 - Personal statement<br>14:00 - Budget<br>21:12 - CV<br>22:45 - Application questions<br>34:06 - Funding questions<br>44:25 - Outro</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://www.openphilanthropy.org/research/ai-governance-talent-profiles-wed-like-to-see/">AI governance talent profiles we’d like to see</a></li><li><a href="https://alignmentjam.com/jam/governance">The AI Governance Research Sprint</a></li><li><a href="https://www.openphilanthropy.org/research/reasoning-transparency/">Reasoning Transparency</a></li><li>Places to look for funding<ul><li>Open Philanthropy's <a href="https://www.openphilanthropy.org/career-development-and-transition-funding/">Career development and transition funding</a></li><li><a href="https://funds.effectivealtruism.org/funds/far-future">Long-Term Future Fund</a></li><li><a href="https://manifund.org">Manifund</a></li></ul></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Esben reviews an application that I would soon submit for Open Philanthropy's Career Transition Funding opportunity. Although I didn't end up receiving the funding, I do think that this episode can be a valuable resource for both others and myself when applying for funding in the future.</p><p>Head over to Apart Research's <a href="https://apartresearch.com">website</a> to check out their work, or the Alignment Jam <a href="https://alignmentjam.com">website</a> for information on upcoming hackathons.</p><p>A doc-capsule of the application at the time of this recording can be found <a href="https://docs.google.com/document/d/1ofk5VLvNeJytd5Rpx18zui-JfV900DZv5nZShZcrgfY/edit?usp=sharing">at this link</a>.</p><p>01:38 - Interview starts<br>05:41 - Proposal<br>11:00 - Personal statement<br>14:00 - Budget<br>21:12 - CV<br>22:45 - Application questions<br>34:06 - Funding questions<br>44:25 - Outro</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://www.openphilanthropy.org/research/ai-governance-talent-profiles-wed-like-to-see/">AI governance talent profiles we’d like to see</a></li><li><a href="https://alignmentjam.com/jam/governance">The AI Governance Research Sprint</a></li><li><a href="https://www.openphilanthropy.org/research/reasoning-transparency/">Reasoning Transparency</a></li><li>Places to look for funding<ul><li>Open Philanthropy's <a href="https://www.openphilanthropy.org/career-development-and-transition-funding/">Career development and transition funding</a></li><li><a href="https://funds.effectivealtruism.org/funds/far-future">Long-Term Future Fund</a></li><li><a href="https://manifund.org">Manifund</a></li></ul></li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 19 Feb 2024 09:00:00 -0700</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/aa6ca492/29c49e43.mp3" length="43395914" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>2713</itunes:duration>
      <itunes:summary>Esben reviews an application that I would soon submit for Open Philanthropy's Career Transition Funding opportunity. Although I didn't end up receiving the funding, I do think that this episode can be a valuable resource for both others and myself when applying for funding in the future.Head over to Apart Research's website to check out their work, or the Alignment Jam website for information on upcoming hackathons.A doc-capsule of the application at the time of this recording can be found at this link.01:38 - Interview starts05:41 - Proposal11:00 - Personal statement14:00 - Budget21:12 - CV22:45 - Application questions34:06 - Funding questions44:25 - OutroLinks to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.AI governance talent profiles we’d like to seeThe AI Governance Research SprintReasoning TransparencyPlaces to look for fundingOpen Philanthropy's Career development and transition fundingLong-Term Future FundManifund</itunes:summary>
      <itunes:subtitle>Esben reviews an application that I would soon submit for Open Philanthropy's Career Transition Funding opportunity. Although I didn't end up receiving the funding, I do think that this episode can be a valuable resource for both others and myself when </itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
      <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
    </item>
    <item>
      <title>MINISODE: Reading a Research Paper</title>
      <itunes:episode>12</itunes:episode>
      <podcast:episode>12</podcast:episode>
      <itunes:title>MINISODE: Reading a Research Paper</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">37fd4eb5-ba2f-419e-aa24-9a1425a69051</guid>
      <link>https://kairos.fm/intoaisafety/e012</link>
      <description>
        <![CDATA[<p>Before I begin with the paper-distillation based minisodes, I figured we would go over best practices for reading research papers. I go through the anatomy of typical papers, and some generally applicable advice.</p><p>00:56 - Anatomy of a paper<br>02:38 - Most common advice<br>05:24 - Reading sparsity and path<br>07:30 - Notes and motivation</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7392212/">Ten simple rules for reading a scientific paper</a></li><li>Best sources I found<ul><li><a href="https://subjectguides.york.ac.uk/critical/articles">Let's get critical: Reading academic articles</a></li><li><a href="https://www.scientifica.uk.com/neurowire/gradhacks-a-guide-to-reading-research-papers">#GradHacks: A guide to reading research papers</a></li><li><a href="https://www.lib.purdue.edu/sites/default/files/libraries/engr/Tutorials/Newest%20Scientific%20Paper.pdf">How to read a scientific paper</a> (presentation)</li></ul></li><li>Some more sources<ul><li><a href="https://www.owlnet.rice.edu/~cainproj/courses/HowToReadSciArticle.pdf">How to read a scientific article</a></li><li><a href="https://www.eecs.harvard.edu/~michaelm/postscripts/ReadPaper.pdf">How to read a research paper</a></li><li><a href="https://resources.nu.edu/researchprocess/readingscientificarticle">Reading a scientific article</a></li></ul></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Before I begin with the paper-distillation based minisodes, I figured we would go over best practices for reading research papers. I go through the anatomy of typical papers, and some generally applicable advice.</p><p>00:56 - Anatomy of a paper<br>02:38 - Most common advice<br>05:24 - Reading sparsity and path<br>07:30 - Notes and motivation</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7392212/">Ten simple rules for reading a scientific paper</a></li><li>Best sources I found<ul><li><a href="https://subjectguides.york.ac.uk/critical/articles">Let's get critical: Reading academic articles</a></li><li><a href="https://www.scientifica.uk.com/neurowire/gradhacks-a-guide-to-reading-research-papers">#GradHacks: A guide to reading research papers</a></li><li><a href="https://www.lib.purdue.edu/sites/default/files/libraries/engr/Tutorials/Newest%20Scientific%20Paper.pdf">How to read a scientific paper</a> (presentation)</li></ul></li><li>Some more sources<ul><li><a href="https://www.owlnet.rice.edu/~cainproj/courses/HowToReadSciArticle.pdf">How to read a scientific article</a></li><li><a href="https://www.eecs.harvard.edu/~michaelm/postscripts/ReadPaper.pdf">How to read a research paper</a></li><li><a href="https://resources.nu.edu/researchprocess/readingscientificarticle">Reading a scientific article</a></li></ul></li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 12 Feb 2024 09:00:00 -0700</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/36d075b9/8a9a30fe.mp3" length="9027946" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>565</itunes:duration>
      <itunes:summary>Before I begin with the paper-distillation based minisodes, I figured we would go over best practices for reading research papers. I go through the anatomy of typical papers, and some generally applicable advice.00:56 - Anatomy of a paper02:38 - Most common advice05:24 - Reading sparsity and path07:30 - Notes and motivationLinks to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.Ten simple rules for reading a scientific paperBest sources I foundLet's get critical: Reading academic articles#GradHacks: A guide to reading research papersHow to read a scientific paper (presentation)Some more sourcesHow to read a scientific articleHow to read a research paperReading a scientific article</itunes:summary>
      <itunes:subtitle>Before I begin with the paper-distillation based minisodes, I figured we would go over best practices for reading research papers. I go through the anatomy of typical papers, and some generally applicable advice.00:56 - Anatomy of a paper02:38 - Most comm</itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
      <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
    </item>
    <item>
      <title>HACKATHON: Evals November 2023 (2)</title>
      <itunes:episode>11</itunes:episode>
      <podcast:episode>11</podcast:episode>
      <itunes:title>HACKATHON: Evals November 2023 (2)</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">5e003e1c-4e32-487d-905f-9672cc241a26</guid>
      <link>https://kairos.fm/intoaisafety/e011</link>
      <description>
        <![CDATA[<p>Join our hackathon group for the second episode in the Evals November 2023 Hackathon subseries. In this episode, we solidify our goals for the hackathon after some preliminary experimentation and ideation.</p><p>Check out Stellaric's <a href="https://stellaric.pw/">website</a>, or follow them on <a href="https://twitter.com/stellaricpw">Twitter</a>.</p><p>01:53 - Meeting starts<br>05:05 - Pitch: extension of locked models<br>23:23 - Pitch: retroactive holdout datasets<br>34:04 - Preliminary results<br>37:44 - Next steps<br>42:55 - Recap</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://github.com/LRudL/evalugator">Evalugator</a> library</li><li><a href="https://www.alignmentforum.org/posts/rZs6ddqNnW8LXuJqA/password-locked-models-a-stress-case-for-capabilities">Password Locked Model</a> blogpost</li><li><a href="https://arxiv.org/abs/2109.07958">TruthfulQA: Measuring How Models Mimic Human Falsehoods</a></li><li><a href="https://aclanthology.org/P02-1040.pdf">BLEU: a Method for Automatic Evaluation of Machine Translation</a></li><li><a href="https://arxiv.org/abs/1905.10044">BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions</a></li><li><a href="https://arxiv.org/abs/2310.16789">Detecting Pretraining Data from Large Language Models</a></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Join our hackathon group for the second episode in the Evals November 2023 Hackathon subseries. In this episode, we solidify our goals for the hackathon after some preliminary experimentation and ideation.</p><p>Check out Stellaric's <a href="https://stellaric.pw/">website</a>, or follow them on <a href="https://twitter.com/stellaricpw">Twitter</a>.</p><p>01:53 - Meeting starts<br>05:05 - Pitch: extension of locked models<br>23:23 - Pitch: retroactive holdout datasets<br>34:04 - Preliminary results<br>37:44 - Next steps<br>42:55 - Recap</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://github.com/LRudL/evalugator">Evalugator</a> library</li><li><a href="https://www.alignmentforum.org/posts/rZs6ddqNnW8LXuJqA/password-locked-models-a-stress-case-for-capabilities">Password Locked Model</a> blogpost</li><li><a href="https://arxiv.org/abs/2109.07958">TruthfulQA: Measuring How Models Mimic Human Falsehoods</a></li><li><a href="https://aclanthology.org/P02-1040.pdf">BLEU: a Method for Automatic Evaluation of Machine Translation</a></li><li><a href="https://arxiv.org/abs/1905.10044">BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions</a></li><li><a href="https://arxiv.org/abs/2310.16789">Detecting Pretraining Data from Large Language Models</a></li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 05 Feb 2024 09:00:00 -0700</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/b1493509/6a5ceac9.mp3" length="46699824" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>2919</itunes:duration>
      <itunes:summary>Join our hackathon group for the second episode in the Evals November 2023 Hackathon subseries. In this episode, we solidify our goals for the hackathon after some preliminary experimentation and ideation.Check out Stellaric's website, or follow them on Twitter.01:53 - Meeting starts05:05 - Pitch: extension of locked models23:23 - Pitch: retroactive holdout datasets34:04 - Preliminary results37:44 - Next steps42:55 - RecapLinks to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.Evalugator libraryPassword Locked Model blogpostTruthfulQA: Measuring How Models Mimic Human FalsehoodsBLEU: a Method for Automatic Evaluation of Machine TranslationBoolQ: Exploring the Surprising Difficulty of Natural Yes/No QuestionsDetecting Pretraining Data from Large Language Models</itunes:summary>
      <itunes:subtitle>Join our hackathon group for the second episode in the Evals November 2023 Hackathon subseries. In this episode, we solidify our goals for the hackathon after some preliminary experimentation and ideation.Check out Stellaric's website, or follow them on T</itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
      <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
    </item>
    <item>
      <title>MINISODE: Portfolios</title>
      <itunes:episode>10</itunes:episode>
      <podcast:episode>10</podcast:episode>
      <itunes:title>MINISODE: Portfolios</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">d42a5525-a155-4891-9cd1-98b1af663463</guid>
      <link>https://kairos.fm/intoaisafety/e010</link>
      <description>
        <![CDATA[<p>I provide my thoughts and recommendations regarding personal professional portfolios.</p><p>00:35 - Intro to portfolios<br>01:42 - Modern portfolios<br>02:27 - What to include<br>04:38 - Importance of visual<br>05:50 - The "About" page<br>06:25 - Tools<br>08:12 - Future of "Minisodes"</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://jime.open.ac.uk/articles/10.5334/jime.574">From Portafoglio to Eportfolio: The Evolution of Portfolio in Higher Education</a></li><li><a href="https://www.gimp.org">GIMP</a></li><li><a href="https://alternativeto.net">AlternativeTo</a></li><li><a href="https://jekyllrb.com">Jekyll</a></li><li><a href="https://pages.github.com">GitHub Pages</a></li><li><a href="https://mmistakes.github.io/minimal-mistakes/">Minimal Mistakes</a></li><li><a href="https://jacob-haimes.github.io">My portfolio</a></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>I provide my thoughts and recommendations regarding personal professional portfolios.</p><p>00:35 - Intro to portfolios<br>01:42 - Modern portfolios<br>02:27 - What to include<br>04:38 - Importance of visual<br>05:50 - The "About" page<br>06:25 - Tools<br>08:12 - Future of "Minisodes"</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://jime.open.ac.uk/articles/10.5334/jime.574">From Portafoglio to Eportfolio: The Evolution of Portfolio in Higher Education</a></li><li><a href="https://www.gimp.org">GIMP</a></li><li><a href="https://alternativeto.net">AlternativeTo</a></li><li><a href="https://jekyllrb.com">Jekyll</a></li><li><a href="https://pages.github.com">GitHub Pages</a></li><li><a href="https://mmistakes.github.io/minimal-mistakes/">Minimal Mistakes</a></li><li><a href="https://jacob-haimes.github.io">My portfolio</a></li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 29 Jan 2024 09:00:00 -0700</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/0258007f/e8a8e309.mp3" length="9251905" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>579</itunes:duration>
      <itunes:summary>I provide my thoughts and recommendations regarding personal professional portfolios.
00:35 - Intro to portfolios01:42 - Modern portfolios02:27 - What to include04:38 - Importance of visual05:50 - The "About" page06:25 - Tools08:12 - Future of "Minisodes"
Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.From Portafoglio to Eportfolio: The Evolution of Portfolio in Higher EducationGIMPAlternativeToJekyllGitHub PagesMinimal MistakesMy portfolio</itunes:summary>
      <itunes:subtitle>I provide my thoughts and recommendations regarding personal professional portfolios.
00:35 - Intro to portfolios01:42 - Modern portfolios02:27 - What to include04:38 - Importance of visual05:50 - The "About" page06:25 - Tools08:12 - Future of "Minisodes</itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
      <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
    </item>
    <item>
      <title>INTERVIEW: Polysemanticity w/ Dr. Darryl Wright</title>
      <itunes:episode>9</itunes:episode>
      <podcast:episode>9</podcast:episode>
      <itunes:title>INTERVIEW: Polysemanticity w/ Dr. Darryl Wright</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">f4ca1b13-2c6d-4c18-a458-9131448382b3</guid>
      <link>https://kairos.fm/intoaisafety/e009</link>
      <description>
        <![CDATA[<p>Darryl and I discuss his background, how he became interested in machine learning, and a project we are currently working on investigating the penalization of polysemanticity during the training of neural networks.</p><p>Check out a <a href="https://into-ai-safety.github.io/episode/interview/episode-8/">diagram</a> of the decoder task used for our research!</p><p>01:46 - Interview begins<br>02:14 - Supernovae classification<br>08:58 - Penalizing polysemanticity<br>20:58 - Our "toy model"<br>30:06 - Task description<br>32:47 - Addressing hurdles<br>39:20 - Lessons learned</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://www.zooniverse.org">Zooniverse</a></li><li><a href="https://bluedot.org">BlueDot Impact</a></li><li><a href="https://www.aisafetysupport.org">AI Safety Support</a></li><li><a href="https://distill.pub/2020/circuits/zoom-in/">Zoom In: An Introduction to Circuits</a></li><li><a href="https://paperswithcode.com/dataset/mnist">MNIST</a> dataset on PapersWithCode</li><li><a href="https://arxiv.org/abs/2103.03386">Clusterability in Neural Networks</a></li><li><a href="https://www.cs.toronto.edu/~kriz/cifar.html">CIFAR-10</a> dataset</li><li><a href="https://www.effectivealtruism.org/ea-global">Effective Altruism Global</a></li><li><a href="https://openai.com/research/clip">CLIP</a> (blog post)</li><li><a href="https://funds.effectivealtruism.org/funds/far-future">Long Term Future Fund</a></li><li><a href="https://arxiv.org/abs/2211.09169">Engineering Monosemanticity in Toy Models</a></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Darryl and I discuss his background, how he became interested in machine learning, and a project we are currently working on investigating the penalization of polysemanticity during the training of neural networks.</p><p>Check out a <a href="https://into-ai-safety.github.io/episode/interview/episode-8/">diagram</a> of the decoder task used for our research!</p><p>01:46 - Interview begins<br>02:14 - Supernovae classification<br>08:58 - Penalizing polysemanticity<br>20:58 - Our "toy model"<br>30:06 - Task description<br>32:47 - Addressing hurdles<br>39:20 - Lessons learned</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://www.zooniverse.org">Zooniverse</a></li><li><a href="https://bluedot.org">BlueDot Impact</a></li><li><a href="https://www.aisafetysupport.org">AI Safety Support</a></li><li><a href="https://distill.pub/2020/circuits/zoom-in/">Zoom In: An Introduction to Circuits</a></li><li><a href="https://paperswithcode.com/dataset/mnist">MNIST</a> dataset on PapersWithCode</li><li><a href="https://arxiv.org/abs/2103.03386">Clusterability in Neural Networks</a></li><li><a href="https://www.cs.toronto.edu/~kriz/cifar.html">CIFAR-10</a> dataset</li><li><a href="https://www.effectivealtruism.org/ea-global">Effective Altruism Global</a></li><li><a href="https://openai.com/research/clip">CLIP</a> (blog post)</li><li><a href="https://funds.effectivealtruism.org/funds/far-future">Long Term Future Fund</a></li><li><a href="https://arxiv.org/abs/2211.09169">Engineering Monosemanticity in Toy Models</a></li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 22 Jan 2024 09:00:00 -0700</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/0afbbf3c/a966e060.mp3" length="43331933" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>2709</itunes:duration>
      <itunes:summary>Darryl and I discuss his background, how he became interested in machine learning, and a project we are currently working on investigating the penalization of polysemanticity during the training of neural networks.
Check out a diagram of the decoder task used for our research!
01:46 - Interview begins
02:14 - Supernovae classification
08:58 - Penalizing polysemanticity
20:58 - Our "toy model"
30:06 - Task description
32:47 - Addressing hurdles
39:20 - Lessons learned
Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

 Zooniverse
 BlueDot Impact
 AI Safety Support
 Zoom In: An Introduction to Circuits
  MNIST dataset on PapersWithCode
  Clusterability in Neural Networks
  CIFAR-10 dataset
  Effective Altruism Global
  CLIP (blog post)
  Long Term Future Fund
  Engineering Monosemanticity in Toy Models</itunes:summary>
      <itunes:subtitle>Darryl and I discuss his background, how he became interested in machine learning, and a project we are currently working on investigating the penalization of polysemanticity during the training of neural networks.
Check out a diagram of the decoder task</itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
      <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
    </item>
    <item>
      <title>MINISODE: Starting a Podcast</title>
      <itunes:episode>8</itunes:episode>
      <podcast:episode>8</podcast:episode>
      <itunes:title>MINISODE: Starting a Podcast</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">24d72baa-2a3e-4ae1-8d64-2af3b7cb34bb</guid>
      <link>https://kairos.fm/intoaisafety/e008</link>
      <description>
        <![CDATA[<p>A summary and reflections on the path I have taken to get this podcast started, including some resource recommendations for others who want to do something similar.</p><p><br></p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://www.lesswrong.com">LessWrong</a></li><li><a href="https://podcasters.spotify.com/">Spotify for Podcasters</a></li><li><a href="https://into-ai-safety.github.io">Into AI Safety podcast website</a></li><li><a href="https://www.effectivealtruism.org/ea-global">Effective Altruism Global</a></li><li><a href="https://obsproject.com">Open Broadcaster Software (OBS)</a></li><li><a href="https://craig.chat">Craig</a></li><li><a href="https://riverside.fm">Riverside</a></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>A summary and reflections on the path I have taken to get this podcast started, including some resource recommendations for others who want to do something similar.</p><p><br></p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://www.lesswrong.com">LessWrong</a></li><li><a href="https://podcasters.spotify.com/">Spotify for Podcasters</a></li><li><a href="https://into-ai-safety.github.io">Into AI Safety podcast website</a></li><li><a href="https://www.effectivealtruism.org/ea-global">Effective Altruism Global</a></li><li><a href="https://obsproject.com">Open Broadcaster Software (OBS)</a></li><li><a href="https://craig.chat">Craig</a></li><li><a href="https://riverside.fm">Riverside</a></li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 15 Jan 2024 09:00:00 -0700</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/4823b96d/02a05da4.mp3" length="10148026" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>635</itunes:duration>
      <itunes:summary>A summary and reflections on the path I have taken to get this podcast started, including some resource recommendations for others who want to do something similar.
Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.
 LessWrong
 Spotify for Podcasters
 Into AI Safety podcast website
 Effective Altruism Global
 Open Broadcaster Software (OBS)
 Craig
 Riverside</itunes:summary>
      <itunes:subtitle>A summary and reflections on the path I have taken to get this podcast started, including some resource recommendations for others who want to do something similar. Links to all articles/papers which are mentioned throughout the episode can be found below</itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
      <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
    </item>
    <item>
      <title>HACKATHON: Evals November 2023 (1)</title>
      <itunes:episode>7</itunes:episode>
      <podcast:episode>7</podcast:episode>
      <itunes:title>HACKATHON: Evals November 2023 (1)</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">cbde84b2-3937-4512-8d71-fee6c23e2ec6</guid>
      <link>https://kairos.fm/intoaisafety/e007</link>
      <description>
        <![CDATA[<p>This episode kicks off our first subseries, which will consist of recordings taken during my team's meetings for the AlignmentJams Evals Hackathon in November of 2023. Our team won first place, so you'll be listening to the process which, at the end of the day, turned out to be pretty good.</p><p><br></p><p>Check out <a href="https://apartresearch.com">Apart Research</a>, the group that runs the <a href="https://alignmentjam.com">AlignmentJams Hackathons</a>.</p><p><br></p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://arxiv.org/abs/2311.07723">Generalization Analogies: A Testbed for Generalizing AI Oversight to Hard-To-Measure Domains</a><ul><li><a href="https://www.lesswrong.com/posts/Yio4nmD8JMttx9o9S/new-paper-shows-truthfulness-and-instruction-following-don-t">New paper shows truthfulness &amp; instruction-following don't generalize by default</a></li><li><a href="https://joshuaclymer.github.io/generalization-analogies-website/">Generalization Analogies Website</a></li></ul></li><li><a href="https://arxiv.org/abs/2212.09251">Discovering Language Model Behaviors with Model-Written Evaluations</a><ul><li><a href="https://www.evals.anthropic.com">Model-Written Evals Website</a></li></ul></li><li><a href="https://github.com/openai/evals/tree/main/evals/elsuite/make_me_say">OpenAI Evals GitHub</a></li><li><a href="https://metr.org">METR</a> (previously ARC Evals)</li><li><a href="https://en.wikipedia.org/wiki/Goodhart%27s_law">Goodharting on Wikipedia</a></li><li><a href="https://arxiv.org/abs/2308.12014">From Instructions to Intrinsic Human Values, a Survey of Alignment Goals for Big Models</a></li><li><a href="https://arxiv.org/abs/2310.03693">Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!</a></li><li><a href="https://arxiv.org/abs/2310.02949">Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models</a></li><li><a href="https://arxiv.org/abs/2310.18233">Will Releasing the Weights of Future Large Language Models Grant Widespread Access to Pandemic Agents?</a></li><li><a href="https://www.sciencedirect.com/science/article/pii/S2666389923002210">Building Less Flawed Metrics, Understanding and Creating Better Measurement and Incentive Systems</a></li><li><a href="https://github.com/EleutherAI/lm-evaluation-harness">EleutherAI's Model Evaluation Harness</a></li><li><a href="https://github.com/danbraunai/evalugator/tree/main">Evalugator Library</a></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>This episode kicks off our first subseries, which will consist of recordings taken during my team's meetings for the AlignmentJams Evals Hackathon in November of 2023. Our team won first place, so you'll be listening to the process which, at the end of the day, turned out to be pretty good.</p><p><br></p><p>Check out <a href="https://apartresearch.com">Apart Research</a>, the group that runs the <a href="https://alignmentjam.com">AlignmentJams Hackathons</a>.</p><p><br></p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li><a href="https://arxiv.org/abs/2311.07723">Generalization Analogies: A Testbed for Generalizing AI Oversight to Hard-To-Measure Domains</a><ul><li><a href="https://www.lesswrong.com/posts/Yio4nmD8JMttx9o9S/new-paper-shows-truthfulness-and-instruction-following-don-t">New paper shows truthfulness &amp; instruction-following don't generalize by default</a></li><li><a href="https://joshuaclymer.github.io/generalization-analogies-website/">Generalization Analogies Website</a></li></ul></li><li><a href="https://arxiv.org/abs/2212.09251">Discovering Language Model Behaviors with Model-Written Evaluations</a><ul><li><a href="https://www.evals.anthropic.com">Model-Written Evals Website</a></li></ul></li><li><a href="https://github.com/openai/evals/tree/main/evals/elsuite/make_me_say">OpenAI Evals GitHub</a></li><li><a href="https://metr.org">METR</a> (previously ARC Evals)</li><li><a href="https://en.wikipedia.org/wiki/Goodhart%27s_law">Goodharting on Wikipedia</a></li><li><a href="https://arxiv.org/abs/2308.12014">From Instructions to Intrinsic Human Values, a Survey of Alignment Goals for Big Models</a></li><li><a href="https://arxiv.org/abs/2310.03693">Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!</a></li><li><a href="https://arxiv.org/abs/2310.02949">Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models</a></li><li><a href="https://arxiv.org/abs/2310.18233">Will Releasing the Weights of Future Large Language Models Grant Widespread Access to Pandemic Agents?</a></li><li><a href="https://www.sciencedirect.com/science/article/pii/S2666389923002210">Building Less Flawed Metrics, Understanding and Creating Better Measurement and Incentive Systems</a></li><li><a href="https://github.com/EleutherAI/lm-evaluation-harness">EleutherAI's Model Evaluation Harness</a></li><li><a href="https://github.com/danbraunai/evalugator/tree/main">Evalugator Library</a></li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 08 Jan 2024 09:00:00 -0700</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/04e9b1a9/e76d06ac.mp3" length="65892090" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>4119</itunes:duration>
      <itunes:summary>This episode kicks off our first subseries, which will consist of recordings taken during my team's meetings for the AlignmentJams Evals Hackathon in November of 2023. Our team won first place, so you'll be listening to the process which, at the end of the day, turned out to be pretty good.
Check out Apart Research, the group that runs the AlignmentJams Hackathons.
Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.
 Generalization Analogies: A Testbed for Generalizing AI Oversight to Hard-To-Measure Domains
  New paper shows truthfulness &amp; instruction-following don't generalize by default
  Generalization Analogies Website
 Discovering Language Model Behaviors with Model-Written Evaluations
  Model-Written Evals Website
 OpenAI Evals GitHub
 METR (previously ARC Evals)
 Goodharting on Wikipedia
 From Instructions to Intrinsic Human Values, a Survey of Alignment Goals for Big Models
 Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
 Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models
 Will Releasing the Weights of Future Large Language Models Grant Widespread Access to Pandemic Agents?
 Building Less Flawed Metrics, Understanding and Creating Better Measurement and Incentive Systems
 EleutherAI's Model Evaluation Harness
 Evalugator Library</itunes:summary>
      <itunes:subtitle>This episode kicks off our first subseries, which will consist of recordings taken during my team's meetings for the AlignmentJams Evals Hackathon in November of 2023. Our team won first place, so you'll be listening to the process which, at the end of th</itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
      <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
    </item>
    <item>
      <title>MINISODE: Staying Up-to-Date in AI</title>
      <itunes:episode>6</itunes:episode>
      <podcast:episode>6</podcast:episode>
      <itunes:title>MINISODE: Staying Up-to-Date in AI</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">affbff67-b344-480f-bdf4-d62c1ee3e82c</guid>
      <link>https://kairos.fm/intoaisafety/e006</link>
      <description>
        <![CDATA[<p>In this minisode I give some tips for staying up-to-date in the ever-changing landscape of AI. I would like to point out that I am constantly iterating on these strategies, tools, and sources, so it is likely that I will make an update episode in the future.</p><p><br></p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li>Tools<ul><li><a href="https://feedly.com">Feedly</a></li><li><a href="https://arxiv-sanity-lite.com">arXiv Sanity Lite</a></li><li><a href="https://www.zotero.org">Zotero</a></li><li><a href="https://alternativeto.net">AlternativeTo</a></li></ul></li><li>My "Distilled AI" Folder<ul><li><a href="https://www.youtube.com/channel/UCNJ1Ymd5yFuUPtn21xtRbbw">AI Explained</a> YouTube channel</li><li><a href="https://newsletter.safe.ai">AI Safety newsletter</a></li><li><a href="https://datamachina.com">Data Machina</a> newsletter</li><li><a href="https://importai.substack.com">Import AI</a></li><li><a href="https://midwitalignment.substack.com">Midwit Alignment</a></li></ul></li><li>Honourable Mentions<ul><li><a href="https://www.alignmentforum.org">AI Alignment Forum</a></li><li><a href="https://www.lesswrong.com">LessWrong</a></li><li><a href="https://bounded-regret.ghost.io">Bounded Regret</a> (Jacob Steinhardt's blog)</li><li><a href="https://www.cold-takes.com">Cold Takes</a> (Holden Karnofsky's blog)</li><li><a href="https://colah.github.io">Chris Olah's blog</a></li><li><a href="https://timdettmers.com">Tim Dettmers' blog</a></li><li><a href="https://epochai.org/blog">Epoch blog</a></li><li><a href="https://www.apolloresearch.ai/blog">Apollo Research blog</a></li></ul></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>In this minisode I give some tips for staying up-to-date in the ever-changing landscape of AI. I would like to point out that I am constantly iterating on these strategies, tools, and sources, so it is likely that I will make an update episode in the future.</p><p><br></p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.</p><ul><li>Tools<ul><li><a href="https://feedly.com">Feedly</a></li><li><a href="https://arxiv-sanity-lite.com">arXiv Sanity Lite</a></li><li><a href="https://www.zotero.org">Zotero</a></li><li><a href="https://alternativeto.net">AlternativeTo</a></li></ul></li><li>My "Distilled AI" Folder<ul><li><a href="https://www.youtube.com/channel/UCNJ1Ymd5yFuUPtn21xtRbbw">AI Explained</a> YouTube channel</li><li><a href="https://newsletter.safe.ai">AI Safety newsletter</a></li><li><a href="https://datamachina.com">Data Machina</a> newsletter</li><li><a href="https://importai.substack.com">Import AI</a></li><li><a href="https://midwitalignment.substack.com">Midwit Alignment</a></li></ul></li><li>Honourable Mentions<ul><li><a href="https://www.alignmentforum.org">AI Alignment Forum</a></li><li><a href="https://www.lesswrong.com">LessWrong</a></li><li><a href="https://bounded-regret.ghost.io">Bounded Regret</a> (Jacob Steinhardt's blog)</li><li><a href="https://www.cold-takes.com">Cold Takes</a> (Holden Karnofsky's blog)</li><li><a href="https://colah.github.io">Chris Olah's blog</a></li><li><a href="https://timdettmers.com">Tim Dettmers' blog</a></li><li><a href="https://epochai.org/blog">Epoch blog</a></li><li><a href="https://www.apolloresearch.ai/blog">Apollo Research blog</a></li></ul></li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 01 Jan 2024 09:00:00 -0700</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/8dc3abe2/7c94b204.mp3" length="12554492" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>785</itunes:duration>
      <itunes:summary>In this minisode I give some tips for staying up-to-date in the ever-changing landscape of AI. I would like to point out that I am constantly iterating on these strategies, tools, and sources, so it is likely that I will make an update episode in the future.

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.
 Tools
  Feedly
  arXiv Sanity Lite
  Zotero
  AlternativeTo
 My "Distilled AI" Folder
  AI Explained YouTube channel
  AI Safety newsletter
  Data Machina newsletter
  Import AI
  Midwit Alignment
 Honourable Mentions
  AI Alignment Forum
  LessWrong
  Bounded Regret (Jacob Steinhardt's blog)
  Cold Takes (Holden Karnofsky's blog)
  Chris Olah's blog
  Tim Dettmers' blog
  Epoch blog
  Apollo Research blog</itunes:summary>
      <itunes:subtitle>In this minisode I give some tips for staying up-to-date in the ever-changing landscape of AI. I would like to point out that I am constantly iterating on these strategies, tools, and sources, so it is likely that I will make an update episode in the futur</itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
      <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
    </item>
    <item>
      <title>INTERVIEW: Applications w/ Alice Rigg</title>
      <itunes:episode>5</itunes:episode>
      <podcast:episode>5</podcast:episode>
      <itunes:title>INTERVIEW: Applications w/ Alice Rigg</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">b2215bea-18f8-497a-b76f-4bb095c79cf0</guid>
      <link>https://kairos.fm/intoaisafety/e005</link>
      <description>
        <![CDATA[<p>Alice Rigg, a mechanistic interpretability researcher from Ottawa, Canada, joins me to discuss their path and the applications process for research/mentorship programs.</p><p>Join the <a href="https://discord.gg/KNxJwaytTG">Mech Interp Discord server</a> and attend reading groups at 11:00am on Wednesdays (Mountain Time)!</p><p>Check out Alice's <a href="https://woog97.github.io">website</a>.</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance. </p><ul><li><a href="https://www.eleuther.ai">EleutherAI</a><br> <ul><li>Join the public <a href="https://www.discord.gg/eleutherai">EleutherAI Discord server</a></li></ul></li><li><a href="https://distill.pub">Distill</a></li><li><a href="https://www.effectivealtruism.org">Effective Altruism</a> (EA)</li><li><a href="https://www.lesswrong.com/posts/zwf68YaySvXhWYCdh/mats-summer-2023-retrospective">MATS Retrospective Summer 2023</a> post</li><li><a href="https://docs.google.com/document/d/1jce3f64Fz7PXmdCEyd9i0PTmcFaiP1pZdcBn5ye5sxY/edit">Ambitious Mechanistic Interpretability</a> AISC research plan by Alice Rigg</li><li><a href="https://berkeleyaisafety.com/spar">SPAR</a></li><li><a href="https://stability.ai">Stability AI</a><br> <ul><li>During their most recent fundraising round, Stability AI had a valuation of $4B <a href="https://www.bloomberg.com/news/articles/2023-03-03/openai-rival-stable-diffusion-maker-seeks-to-raise-funds-at-4-billion-valuation">(Bloomberg)</a></li></ul></li><li><a href="https://discord.gg/KNxJwaytTG">Mech Interp Discord Server</a></li></ul><p><br></p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Alice Rigg, a mechanistic interpretability researcher from Ottawa, Canada, joins me to discuss their path and the applications process for research/mentorship programs.</p><p>Join the <a href="https://discord.gg/KNxJwaytTG">Mech Interp Discord server</a> and attend reading groups at 11:00am on Wednesdays (Mountain Time)!</p><p>Check out Alice's <a href="https://woog97.github.io">website</a>.</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance. </p><ul><li><a href="https://www.eleuther.ai">EleutherAI</a><br> <ul><li>Join the public <a href="https://www.discord.gg/eleutherai">EleutherAI Discord server</a></li></ul></li><li><a href="https://distill.pub">Distill</a></li><li><a href="https://www.effectivealtruism.org">Effective Altruism</a> (EA)</li><li><a href="https://www.lesswrong.com/posts/zwf68YaySvXhWYCdh/mats-summer-2023-retrospective">MATS Retrospective Summer 2023</a> post</li><li><a href="https://docs.google.com/document/d/1jce3f64Fz7PXmdCEyd9i0PTmcFaiP1pZdcBn5ye5sxY/edit">Ambitious Mechanistic Interpretability</a> AISC research plan by Alice Rigg</li><li><a href="https://berkeleyaisafety.com/spar">SPAR</a></li><li><a href="https://stability.ai">Stability AI</a><br> <ul><li>During their most recent fundraising round, Stability AI had a valuation of $4B <a href="https://www.bloomberg.com/news/articles/2023-03-03/openai-rival-stable-diffusion-maker-seeks-to-raise-funds-at-4-billion-valuation">(Bloomberg)</a></li></ul></li><li><a href="https://discord.gg/KNxJwaytTG">Mech Interp Discord Server</a></li></ul><p><br></p>]]>
      </content:encoded>
      <pubDate>Mon, 18 Dec 2023 09:00:00 -0700</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/2889532d/02620130.mp3" length="67843965" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>4241</itunes:duration>
      <itunes:summary>Alice Rigg, a mechanistic interpretability researcher from Ottawa, Canada, joins me to discuss their path and the applications process for research/mentorship programs.
Join the Mech Interp Discord server and attend reading groups at 11:00am on Wednesdays (Mountain Time)!
Check out Alice's website.
Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance. 

 EleutherAI
 Join the public EleutherAI discord server

Distill
 Effective Altruism (EA)
 MATS Retrospective Summer 2023 post
 Ambitious Mechanistic Interpretability AISC research plan by Alice Rigg
 SPAR
  Stability AI
  During their most recent fundraising round, Stability AI had a valuation of $4B (Bloomberg)

Mech Interp Discord Server</itunes:summary>
      <itunes:subtitle>Alice Rigg, a mechanistic interpretability researcher from Ottawa, Canada, joins me to discuss their path and the applications process for research/mentorship programs.
Join the Mech Interp Discord server and attend reading groups at 11:00am on Wednesday</itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
      <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
    </item>
    <item>
      <title>MINISODE: Program Applications (Winter 2024)</title>
      <itunes:episode>4</itunes:episode>
      <podcast:episode>4</podcast:episode>
      <itunes:title>MINISODE: Program Applications (Winter 2024)</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">2f5c7fd0-e665-4850-a48c-b4d20f05f4af</guid>
      <link>https://kairos.fm/intoaisafety/e004</link>
      <description>
        <![CDATA[<p>We're back after a month-long hiatus with a podcast refactor and advice on the applications process for research/mentorship programs.</p><p>Check out the <a href="https://into-ai-safety.github.io/about/">About</a> page on the <a href="https://into-ai-safety.github.io">Into AI Safety</a> website for a summary of the logistics updates.</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance. </p><ul><li><a href="https://www.matsprogram.org">MATS</a></li><li><a href="https://www.constellation.org/programs/astra-fellowship">ASTRA Fellowship</a></li><li><a href="https://www.arena.education">ARENA</a></li><li><a href="https://aisafety.camp">AI Safety Camp</a></li><li><a href="https://www.bluedotimpact.org">BlueDot Impact</a></li><li><a href="https://www.techwithtim.net">Tech with Tim</a></li><li>Fast.AI's <a href="https://course.fast.ai">Practical Deep Learning for Coders</a></li><li><a href="https://www.kaggle.com">Kaggle</a></li><li><a href="https://alignmentjam.com">AlignmentJams</a></li><li><a href="https://www.lesswrong.com">LessWrong</a></li><li><a href="https://www.alignmentforum.org">AI Alignment Forum</a></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>We're back after a month-long hiatus with a podcast refactor and advice on the applications process for research/mentorship programs.</p><p>Check out the <a href="https://into-ai-safety.github.io/about/">About</a> page on the <a href="https://into-ai-safety.github.io">Into AI Safety</a> website for a summary of the logistics updates.</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance. </p><ul><li><a href="https://www.matsprogram.org">MATS</a></li><li><a href="https://www.constellation.org/programs/astra-fellowship">ASTRA Fellowship</a></li><li><a href="https://www.arena.education">ARENA</a></li><li><a href="https://aisafety.camp">AI Safety Camp</a></li><li><a href="https://www.bluedotimpact.org">BlueDot Impact</a></li><li><a href="https://www.techwithtim.net">Tech with Tim</a></li><li>Fast.AI's <a href="https://course.fast.ai">Practical Deep Learning for Coders</a></li><li><a href="https://www.kaggle.com">Kaggle</a></li><li><a href="https://alignmentjam.com">AlignmentJams</a></li><li><a href="https://www.lesswrong.com">LessWrong</a></li><li><a href="https://www.alignmentforum.org">AI Alignment Forum</a></li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 11 Dec 2023 09:00:00 -0700</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/6a54c881/0ef512b8.mp3" length="17356182" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>1085</itunes:duration>
      <itunes:summary>We're back after a month-long hiatus with a podcast refactor and advice on the applications process for research/mentorship programs.
Check out the About page on the Into AI Safety website for a summary of the logistics updates.
Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.
 MATS
 ASTRA Fellowship
 ARENA
 AI Safety Camp
 BlueDot Impact
 Tech with Tim
 Fast.AI's Practical Deep Learning for Coders
 Kaggle
 AlignmentJams
 LessWrong
 AI Alignment Forum</itunes:summary>
      <itunes:subtitle>We're back after a month-long hiatus with a podcast refactor and advice on the applications process for research/mentorship programs.
Check out the About page on the Into AI Safety website for a summary of the logistics updates.
Links to all articles/pa</itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
      <podcast:person role="Editor" href="https://intoaisafety.transistor.fm/people/chase-precopia">Chase Precopia</podcast:person>
    </item>
    <item>
      <title>MINISODE: EAG Takeaways (Boston 2023)</title>
      <itunes:episode>3</itunes:episode>
      <podcast:episode>3</podcast:episode>
      <itunes:title>MINISODE: EAG Takeaways (Boston 2023)</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">dc2fc55a-dca7-44c6-bf9e-be0c4e9eb971</guid>
      <link>https://kairos.fm/intoaisafety/e003</link>
      <description>
        <![CDATA[<p>This episode is a brief overview of the major takeaways I had from attending EAG Boston 2023, and an update on my plans for the podcast moving forward.</p><p><strong>TL;DL</strong></p><ul><li>Starting in early December (2023), I will be uploading episodes on a biweekly basis (day TBD).</li><li>I won't be releasing another episode until then, so that I can build up a cache of episodes.</li><li>During this month (November 2023), I'll also try to get the podcast up on more platforms, set up comments on more platforms, and create an anonymous feedback form.</li></ul><p><strong>Links</strong> Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance. </p><ul><li><a href="https://forum.effectivealtruism.org/posts/8R2NffQiCsn3F7hpv/how-to-generate-research-proposals">How to generate research proposals</a></li><li><a href="https://www.effectivealtruism.org/articles/karolina-sarek-how-to-do-research-that-matters">Karolina Sarek: How to do research that matters</a></li><li><a href="https://arxiv.org/abs/2310.18233">Will releasing the weights of future large language models grant widespread access to pandemic agents?</a></li></ul><p>Like the show? Think it could be improved? Fill out <a href="https://docs.google.com/forms/d/e/1FAIpQLSdXqpGGb0uWgpQd8CUvKP6g2Ki8FrDsEBzFfQBrBoSZPlxjDQ/viewform?usp=sf_link">this anonymous feedback form</a> to let me know!</p><p>Please email all inquiries to intoaisafety@gmail.com.</p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>This episode is a brief overview of the major takeaways I had from attending EAG Boston 2023, and an update on my plans for the podcast moving forward.</p><p><strong>TL;DL</strong></p><ul><li>Starting in early December (2023), I will be uploading episodes on a biweekly basis (day TBD).</li><li>I won't be releasing another episode until then, so that I can build up a cache of episodes.</li><li>During this month (November 2023), I'll also try to get the podcast up on more platforms, set up comments on more platforms, and create an anonymous feedback form.</li></ul><p><strong>Links</strong> Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance. </p><ul><li><a href="https://forum.effectivealtruism.org/posts/8R2NffQiCsn3F7hpv/how-to-generate-research-proposals">How to generate research proposals</a></li><li><a href="https://www.effectivealtruism.org/articles/karolina-sarek-how-to-do-research-that-matters">Karolina Sarek: How to do research that matters</a></li><li><a href="https://arxiv.org/abs/2310.18233">Will releasing the weights of future large language models grant widespread access to pandemic agents?</a></li></ul><p>Like the show? Think it could be improved? Fill out <a href="https://docs.google.com/forms/d/e/1FAIpQLSdXqpGGb0uWgpQd8CUvKP6g2Ki8FrDsEBzFfQBrBoSZPlxjDQ/viewform?usp=sf_link">this anonymous feedback form</a> to let me know!</p><p>Please email all inquiries to intoaisafety@gmail.com.</p>]]>
      </content:encoded>
      <pubDate>Mon, 04 Dec 2023 09:00:00 -0700</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/101ea918/e54e49cc.mp3" length="9228951" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>577</itunes:duration>
      <itunes:summary>This episode is a brief overview of the major takeaways I had from attending EAG Boston 2023, and an update on my plans for the podcast moving forward.
TL;DL
 Starting in early December (2023), I will be uploading episodes on a biweekly basis (day TBD).
 I won't be releasing another episode until then, so that I can build up a cache of episodes.
 During this month (November 2023), I'll also try to get the podcast up on more platforms, set up comments on more platforms, and create an anonymous feedback form.
Links
Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.
 How to generate research proposals
 Karolina Sarek: How to do research that matters
 Will releasing the weights of future large language models grant widespread access to pandemic agents?

Like the show? Think it could be improved? Fill out this anonymous feedback form to let me know!
Please email all inquiries to intoaisafety@gmail.com.</itunes:summary>
      <itunes:subtitle>This episode is a brief overview of the major takeaways I had from attending EAG Boston 2023, and an update on my plans for the podcast moving forward.
TL;DL Starting in early December (2023), I will be uploading episodes on a biweekly basis (day TBD).I </itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
    </item>
    <item>
      <title>FEEDBACK: AISC Proposal w/ Remmelt Ellen</title>
      <itunes:episode>2</itunes:episode>
      <podcast:episode>2</podcast:episode>
      <itunes:title>FEEDBACK: AISC Proposal w/ Remmelt Ellen</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">084356f7-066c-49d9-b8d9-a45f7162f138</guid>
      <link>https://kairos.fm/intoaisafety/e002</link>
      <description>
        <![CDATA[<p>In this episode I discuss my initial research proposal for the 2024 Winter AI Safety Camp with one of the individuals who helps facilitate the program, Remmelt Ellen.</p><p>The proposal is titled The Effect of Machine Learning on Bioengineered Pandemic Risk. A doc-capsule of the proposal at the time of this recording can be found <a href="https://docs.google.com/document/d/1bbFDNc_hzhzYqN6pn1jYvPhSRp2CyYez4UP0ly_mMRM/edit?usp=sharing">at this link</a>.</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance. </p><ul><li><a href="https://chemrxiv.org/engage/chemrxiv/article-details/61551803d1fc335b7cf8fd45">MegaSyn: Integrating Generative Molecule Design, Automated Analog Designer and Synthetic Viability Prediction</a></li><li><a href="https://www.nature.com/articles/s42256-022-00465-9?fbclid=IwAR11_V1cd9SUxEvUfwrWMA7TUcroyYIY1nBDUL3KaS-8B4rG5MIqZCmjm0M">Dual use of artificial-intelligence-powered drug discovery</a></li><li><a href="https://arxiv.org/abs/2306.13952">Artificial intelligence and biological misuse: Differentiating risks of language models and biological design tools</a></li><li><a href="https://www.alignmentforum.org/posts/ChDH335ckdvpxXaXX/model-organisms-of-misalignment-the-case-for-a-new-pillar-of-1">Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research</a></li><li><a href="https://arxiv.org/abs/2310.02949">Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models</a></li><li><a href="https://arxiv.org/abs/2310.03693">Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!</a></li><li><a href="https://www.alignmentforum.org/posts/3eqHYxfWb5x4Qfz8C/unrlhf-efficiently-undoing-llm-safeguards">unRLHF - Efficiently undoing LLM safeguards</a></li></ul>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>In this episode I discuss my initial research proposal for the 2024 Winter AI Safety Camp with one of the individuals who helps facilitate the program, Remmelt Ellen.</p><p>The proposal is titled The Effect of Machine Learning on Bioengineered Pandemic Risk. A doc-capsule of the proposal at the time of this recording can be found <a href="https://docs.google.com/document/d/1bbFDNc_hzhzYqN6pn1jYvPhSRp2CyYez4UP0ly_mMRM/edit?usp=sharing">at this link</a>.</p><p>Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance. </p><ul><li><a href="https://chemrxiv.org/engage/chemrxiv/article-details/61551803d1fc335b7cf8fd45">MegaSyn: Integrating Generative Molecule Design, Automated Analog Designer and Synthetic Viability Prediction</a></li><li><a href="https://www.nature.com/articles/s42256-022-00465-9?fbclid=IwAR11_V1cd9SUxEvUfwrWMA7TUcroyYIY1nBDUL3KaS-8B4rG5MIqZCmjm0M">Dual use of artificial-intelligence-powered drug discovery</a></li><li><a href="https://arxiv.org/abs/2306.13952">Artificial intelligence and biological misuse: Differentiating risks of language models and biological design tools</a></li><li><a href="https://www.alignmentforum.org/posts/ChDH335ckdvpxXaXX/model-organisms-of-misalignment-the-case-for-a-new-pillar-of-1">Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research</a></li><li><a href="https://arxiv.org/abs/2310.02949">Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models</a></li><li><a href="https://arxiv.org/abs/2310.03693">Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!</a></li><li><a href="https://www.alignmentforum.org/posts/3eqHYxfWb5x4Qfz8C/unrlhf-efficiently-undoing-llm-safeguards">unRLHF - Efficiently undoing LLM safeguards</a></li></ul>]]>
      </content:encoded>
      <pubDate>Mon, 27 Nov 2023 09:00:00 -0700</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/57154a54/62466372.mp3" length="54368967" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>3398</itunes:duration>
      <itunes:summary>In this episode I discuss my initial research proposal for the 2024 Winter AI Safety Camp with one of the individuals who helps facilitate the program, Remmelt Ellen.
The proposal is titled The Effect of Machine Learning on Bioengineered Pandemic Risk. A doc-capsule of the proposal at the time of this recording can be found at this link.
Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.
 MegaSyn: Integrating Generative Molecule Design, Automated Analog Designer and Synthetic Viability Prediction
 Dual use of artificial-intelligence-powered drug discovery
 Artificial intelligence and biological misuse: Differentiating risks of language models and biological design tools
 Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research
 Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models
 Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
 unRLHF - Efficiently undoing LLM safeguards</itunes:summary>
      <itunes:subtitle>In this episode I discuss my initial research proposal for the 2024 Winter AI Safety Camp with one of the individuals who helps facilitate the program, Remmelt Ellen.
The proposal is titled The Effect of Machine Learning on Bioengineered Pandemic Risk. A</itunes:subtitle>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
    </item>
    <item>
      <title>MINISODE: Introduction and Motivation</title>
      <itunes:episode>1</itunes:episode>
      <podcast:episode>1</podcast:episode>
      <itunes:title>MINISODE: Introduction and Motivation</itunes:title>
      <itunes:episodeType>full</itunes:episodeType>
      <guid isPermaLink="false">4a709b26-c999-4113-b23f-daeac781b2df</guid>
      <link>https://kairos.fm/intoaisafety/e001</link>
      <description>
        <![CDATA[<p>Welcome to the <strong>Into AI Safety</strong> podcast! In this episode I provide reasoning for why I am starting this podcast, what I am trying to accomplish with it, and a little bit of background on how I got here.</p><p>Please email all inquiries and suggestions to intoaisafety@gmail.com.</p>]]>
      </description>
      <content:encoded>
        <![CDATA[<p>Welcome to the <strong>Into AI Safety</strong> podcast! In this episode I provide reasoning for why I am starting this podcast, what I am trying to accomplish with it, and a little bit of background on how I got here.</p><p>Please email all inquiries and suggestions to intoaisafety@gmail.com.</p>]]>
      </content:encoded>
      <pubDate>Mon, 13 Nov 2023 09:00:00 -0700</pubDate>
      <author>Jacob Haimes</author>
      <enclosure url="https://op3.dev/e/media.transistor.fm/2b9dbb68/7bb95cf1.mp3" length="9326333" type="audio/mpeg"/>
      <itunes:author>Jacob Haimes</itunes:author>
      <itunes:duration>583</itunes:duration>
      <itunes:summary>
        <![CDATA[<p>Welcome to the <strong>Into AI Safety</strong> podcast! In this episode I provide reasoning for why I am starting this podcast, what I am trying to accomplish with it, and a little bit of background on how I got here.</p><p>Please email all inquiries and suggestions to intoaisafety@gmail.com.</p>]]>
      </itunes:summary>
      <itunes:keywords>technology, AI, artificial intelligence, machine learning, safety</itunes:keywords>
      <itunes:explicit>No</itunes:explicit>
      <podcast:person role="Host" href="https://jacob-haimes.github.io" img="https://img.transistor.fm/6X2qc0h5t6L29JagHPBUMkc0dLKnOVeeQpxv1FIfF4s/rs:fill:800:800:1/q:60/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8xZmEw/MTM2M2RhZjQwNDRm/M2MxZTY3MWIyZDNj/MmYyOS5wbmc.jpg">Jacob Haimes</podcast:person>
    </item>
  </channel>
</rss>
