<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title>Philosophical Multicore</title>
		<description>Don't just not do bad things. Do good things.</description>
		<link>http://mdickens.me</link>
		<atom:link href="http://mdickens.me/feed.xml" rel="self" type="application/rss+xml" />
        <pubDate>Sat, 11 Apr 2026 08:46:27 -0700</pubDate>
        <lastBuildDate>Sat, 11 Apr 2026 08:46:27 -0700</lastBuildDate>
        <generator>Jekyll v4.3.4</generator>
		
			<item>
				<title>Pausing AI Is the Best Answer to Post-Alignment Problems</title>
				<pubDate>Sat, 11 Apr 2026 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2026/04/11/pause_for_post-alignment_problems/</link>
				<guid isPermaLink="true">http://mdickens.me/2026/04/11/pause_for_post-alignment_problems/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Even if we solve the AI alignment problem, we still face &lt;strong&gt;post-alignment problems&lt;/strong&gt;, which are all the other existential problems&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; that AI may bring.&lt;/p&gt;

&lt;p&gt;People have identified various imposing problems that we may need to solve before developing ASI. An incomplete list of topics: &lt;a href=&quot;https://longtermrisk.org/overview-of-transformative-ai-misuse-risks-what-could-go-wrong-beyond-misalignment/&quot;&gt;misuse&lt;/a&gt;; &lt;a href=&quot;https://forum.effectivealtruism.org/posts/2cZAzvaQefh5JxWdb/bringing-about-animal-inclusive-ai&quot;&gt;animal-inclusive AI&lt;/a&gt;; &lt;a href=&quot;https://eleosai.org/post/research-priorities-for-ai-welfare/&quot;&gt;AI welfare&lt;/a&gt;; &lt;a href=&quot;https://longtermrisk.org/research-agenda&quot;&gt;S-risks from conflict&lt;/a&gt;; &lt;a href=&quot;https://www.lesswrong.com/posts/GAv4DRGyDHe2orvwB/gradual-disempowerment-concrete-research-projects&quot;&gt;gradual disempowerment&lt;/a&gt;; &lt;a href=&quot;https://arxiv.org/html/2502.07050v1&quot;&gt;permanent mass unemployment&lt;/a&gt;; &lt;a href=&quot;https://forum.effectivealtruism.org/posts/LpkXtFXdsRd4rG8Kb/reducing-long-term-risks-from-malevolent-actors&quot;&gt;risks from malevolent actors&lt;/a&gt;/&lt;a href=&quot;https://www.forethought.org/research/ai-enabled-coups-how-a-small-group-could-use-ai-to-seize-power&quot;&gt;AI-enabled coups&lt;/a&gt;/&lt;a href=&quot;https://forum.effectivealtruism.org/posts/ufeKYQWdvWfG6Zers/rose-hadshar-on-why-automating-human-labour-will-break-our&quot;&gt;gradual concentration of power&lt;/a&gt;; &lt;a href=&quot;https://forum.effectivealtruism.org/posts/HqmQMmKgX7nfSLaNX/moral-error-as-an-existential-risk&quot;&gt;moral error&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If we figure out how to resolve one of these problems, we still have to deal with all the others. If even one problem remains unsolved, the future could be catastrophically bad. That fact &lt;a href=&quot;https://mdickens.me/2025/11/20/research_wont_solve_non-alignment_problems/&quot;&gt;diminishes the promise&lt;/a&gt; of working on problems individually.&lt;/p&gt;

&lt;p&gt;A &lt;a href=&quot;https://superintelligence-statement.org/&quot;&gt;global moratorium on superintelligence&lt;/a&gt; buys us more time to work on alignment as well as all of the post-alignment problems. Pausing AI is in the common interest of many causes.&lt;sup id=&quot;fnref:8&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;&lt;em&gt;Cross-posted to the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/owthSDwevZscRLsPM/pausing-ai-is-the-best-answer-to-post-alignment-problems&quot;&gt;EA Forum&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#we-cant-delay-until-after-asi&quot; id=&quot;markdown-toc-we-cant-delay-until-after-asi&quot;&gt;We can’t delay until after ASI&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#whats-the-alternative-to-pausing&quot; id=&quot;markdown-toc-whats-the-alternative-to-pausing&quot;&gt;What’s the alternative to pausing?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;we-cant-delay-until-after-asi&quot;&gt;We can’t delay until after ASI&lt;/h2&gt;

&lt;p&gt;If we figure out how to align ASI, can it solve post-alignment problems for us? Or can we use ASI to enable a &lt;a href=&quot;https://forum.effectivealtruism.org/posts/4xwWDLfMenw48TR8c/long-reflection-reading-list&quot;&gt;Long Reflection&lt;/a&gt;? No.&lt;/p&gt;

&lt;p&gt;For an ASI to be aligned, one of two conditions must hold:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The ASI has locked-in values.&lt;/li&gt;
  &lt;li&gt;The ASI is &lt;a href=&quot;https://www.alignmentforum.org/w/corrigibility-1&quot;&gt;corrigible&lt;/a&gt;: it will do what its masters say, and will allow its goals to be changed.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If values are locked in, we can’t defer any problems related to moral philosophy; we must solve them in advance.&lt;sup id=&quot;fnref:10&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;If the ASI is corrigible, then we can take time to do a Long Reflection, figuring out The Good with the help of a superintelligent assistant. But a corrigible ASI creates other problems: the first person to gain access to the newly created ASI could use it to take over the world, and if the ASI is widely accessible, bad actors could use it to do enormous harm. Corrigibility increases catastrophic risks from misuse and totalitarianism.&lt;/p&gt;

&lt;p&gt;If we want a post-ASI Long Reflection, then we still need the AI to be aligned, and we need some sort of impartial governance that prevents rogue individuals from co-opting the Reflection. &lt;a href=&quot;https://mdickens.me/2026/04/06/by_strong_default_ASI_will_end_liberal_democracy/&quot;&gt;By strong default, ASI will end liberal democracy.&lt;/a&gt; On the current trajectory, we will end up with a small group of people—either AI company leaders or government leaders—having dictatorial control over advanced AI. At minimum, we need to solve the AI misuse and power concentration problems &lt;em&gt;before&lt;/em&gt; developing ASI; and we need to have a way to avoid &lt;a href=&quot;https://forum.effectivealtruism.org/topics/value-lock-in&quot;&gt;value lock-in&lt;/a&gt; &lt;em&gt;without&lt;/em&gt; exacerbating misuse and concentration risks.&lt;/p&gt;

&lt;p&gt;Perhaps there’s some version of value alignment/corrigibility that finds the right middle ground to avoid the problems on both sides. But anything resembling a solution looks very far off, and not enough people take these problems seriously.&lt;/p&gt;

&lt;h2 id=&quot;whats-the-alternative-to-pausing&quot;&gt;What’s the alternative to pausing?&lt;/h2&gt;

&lt;p&gt;Advocating to pause AI is the most &lt;em&gt;important&lt;/em&gt; response to post-alignment problems, but it might not be the most &lt;em&gt;cost-effective&lt;/em&gt;. Achieving a globally coordinated pause would be difficult. Maybe it’s more cost-effective to work on various post-alignment problems individually, or to search for other mitigations that reduce risk from many post-alignment problems simultaneously.&lt;/p&gt;

&lt;p&gt;I can’t &lt;em&gt;confidently&lt;/em&gt; say that advocating for a pause is the best thing to do, but nothing else looks clearly better.&lt;/p&gt;

&lt;p&gt;Two arguments in favor of prioritizing AI pause advocacy as an answer to post-alignment problems:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/20/research_wont_solve_non-alignment_problems/&quot;&gt;If timelines are short, then we don’t have time to solve post-alignment problems.&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Pausing AI helps with all post-alignment problems simultaneously by giving us more time to work on them.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The most compelling argument against pause advocacy is that it’s intractable. A full analysis of tractability is out of scope for this essay, but I expect that achieving a pause is less difficult than solving every post-alignment problem &lt;em&gt;without&lt;/em&gt; pausing. If (say) we were home free as long as we solved the problem of AI-enabled totalitarianism, then working directly on totalitarianism might be better than pause advocacy. But there are &lt;em&gt;many&lt;/em&gt; bad outcomes to avert, which makes pausing AI—as difficult as that would be—easier than solving all the post-alignment problems in a short time span.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://mdickens.me/2025/09/19/ai_safety_landscape/#some-relevant-research-agendas&quot;&gt;Research agendas on post-alignment problems&lt;/a&gt; rarely propose “pause/slow down AI development” as a mitigation. This may be because the authors don’t believe it’s a good response. But these research agendas don’t consider and then reject the idea of pausing AI; they simply don’t address it at all. If I’m wrong, and a pause is &lt;em&gt;not&lt;/em&gt; the best answer to post-alignment problems, then there is work to be done to articulate why other responses are better.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Existential in the &lt;a href=&quot;https://existential-risk.com/concept&quot;&gt;classic sense&lt;/a&gt; of “a permanent loss of most of the potential flourishing of the future”. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;This wording is borrowed from &lt;a href=&quot;https://www.lesswrong.com/posts/4PPE6D635iBcGPGRy/rationality-common-interest-of-many-causes&quot;&gt;Rationality: Common Interest of Many Causes&lt;/a&gt;. &lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:10&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Our best bet might be something like &lt;a href=&quot;https://www.lesswrong.com/w/coherent-extrapolated-volition&quot;&gt;Coherent Extrapolated Volition&lt;/a&gt;. Unfortunately, no AI developers are working on how to do that. &lt;a href=&quot;#fnref:10&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>By Strong Default, ASI Will End Liberal Democracy</title>
				<pubDate>Mon, 06 Apr 2026 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2026/04/06/by_strong_default_ASI_will_end_liberal_democracy/</link>
				<guid isPermaLink="true">http://mdickens.me/2026/04/06/by_strong_default_ASI_will_end_liberal_democracy/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;The existence of liberal democracy—with rule of law, constraints on government power, and enfranchised citizens—relies on a balance of power where individual bad actors can’t do too much damage. Artificial superintelligence (ASI), even if it’s aligned, would end that balance by default.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;&lt;em&gt;Cross-posted to &lt;a href=&quot;https://www.lesswrong.com/posts/gmYTwEyvEsCyhESwh/by-strong-default-asi-will-end-liberal-democracy&quot;&gt;LessWrong&lt;/a&gt; and the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/iwJPepgiwRinZShkB/by-strong-default-asi-will-end-liberal-democracy&quot;&gt;EA Forum&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;It is not a question of who develops ASI. Whether the first ASI is developed by a totalitarian state or a democracy, the end result will—by strong default—be a &lt;em&gt;de facto&lt;/em&gt; global dictatorship.&lt;/p&gt;

&lt;p&gt;The central problem is that whoever controls ASI can defeat any opposition. Imagine a scenario where (say) &lt;a href=&quot;https://en.wikipedia.org/wiki/DARPA&quot;&gt;DARPA&lt;/a&gt; develops the first superintelligence&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;, and the head of the ASI training program decides to seize power. What can anyone do about it?&lt;/p&gt;

&lt;p&gt;If the president orders the military to capture DARPA’s data centers, the ASI can defeat the military.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;If Congress issues a mandate that DARPA must turn over control of the ASI, DARPA can refuse, and Congress has even less recourse than the president.&lt;/p&gt;

&lt;p&gt;If liberal democracy continues to exist, it will only be by the grace of whoever controls ASI.&lt;/p&gt;

&lt;p&gt;There are two plausible scenarios that have some chance of avoiding a totalitarian outcome:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;AI capabilities progress slowly.&lt;/li&gt;
  &lt;li&gt;The ASI itself protects liberal democracy.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I will discuss them in turn.&lt;/p&gt;

&lt;h2 id=&quot;what-if-ai-capabilities-progress-slowly&quot;&gt;What if AI capabilities progress slowly?&lt;/h2&gt;

&lt;p&gt;We have a chance at averting &lt;em&gt;de facto&lt;/em&gt; totalitarianism if two conditions hold:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;At each step of AI development, control of AI is distributed widely.&lt;/li&gt;
  &lt;li&gt;At each step, the next-generation AI is not strong enough to overpower all the copies of the previous generation.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Widely distributing AI is difficult—today’s frontier LLMs require supercomputers to run, their hardware requirements are becoming increasingly expensive with each generation, and AI developers have strong incentives against distributing them. In addition, distributing AI exacerbates misalignment and misuse risks, and it’s likely not worth the tradeoff.&lt;/p&gt;

&lt;p&gt;We do not know whether takeoff will be fast or slow; &lt;em&gt;banking&lt;/em&gt; on a slow takeoff is an extremely risky move. Frontier AI companies are trying their best to rapidly build up to ASI, and they explicitly want to make AI do &lt;a href=&quot;https://en.wikipedia.org/wiki/Recursive_self-improvement&quot;&gt;recursive self-improvement&lt;/a&gt;. If they succeed, it’s hard to see how liberal democracy will be able to preserve itself.&lt;/p&gt;

&lt;h2 id=&quot;what-if-the-asi-itself-protects-liberal-democracy&quot;&gt;What if the ASI itself protects liberal democracy?&lt;/h2&gt;

&lt;p&gt;There is a conceivable scenario where an aligned ASI preserves liberal democracy, and refuses any orders that would violate people’s civil liberties.&lt;/p&gt;

&lt;p&gt;Above, I wrote:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;If liberal democracy continues to exist, it will only be by the grace of whoever controls ASI.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That’s still true, but in this case “whoever controls ASI” would be the ASI itself. If it’s aligned in a transparent way, then maybe we can be confident that it really will preserve democracy.&lt;/p&gt;

&lt;p&gt;Even in this scenario, there is still a small group of people who control how the ASI is trained. The hope is that, at training time, those people do not yet have enough power to prevent oversight. For example, maybe laws mandate that (1) AI developers must make their training process public and auditable and (2) the training process must steer the AI toward valuing liberal democracy. It is not at all obvious how those laws would work, or how we would get those laws, or how they would be enforced; but at least this outcome is conceivable as a possibility.&lt;/p&gt;

&lt;p&gt;This scenario introduces some additional challenges:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The ASI must be &lt;a href=&quot;https://www.alignmentforum.org/w/corrigibility-1&quot;&gt;incorrigible&lt;/a&gt; with respect to protecting liberal democracy. That constrains which types of alignment solutions we can use, making the alignment problem harder to solve. And incorrigibility means that if you make a mistake in designing the AI, you can’t fix it.&lt;/li&gt;
  &lt;li&gt;We must ensure that an immutable “protect liberal democracy” directive won’t have severe unintended consequences—which, by default, it probably will. (Think Asimov’s Three Laws of Robotics.)&lt;/li&gt;
  &lt;li&gt;AI progress must proceed slowly enough that the appropriate laws or regulations can be put in place before it’s too late; or we must trust that the leading AI developer embeds appropriate values into its ASI.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;liberal-democracy-is-not-the-true-target&quot;&gt;Liberal democracy is not the true target&lt;/h2&gt;

&lt;p&gt;As the saying goes, democracy is the worst form of government except for all those other forms that have been tried. We don’t want democracy for its own sake; what we want is a &lt;em&gt;truly good&lt;/em&gt; form of government (and hopefully one day we will figure out what that is). The fear isn’t that ASI will replace democracy with one of those truly good forms of government; it’s that we will get totalitarianism.&lt;/p&gt;

&lt;p&gt;Liberal democracy beats totalitarianism. But &lt;em&gt;locking in&lt;/em&gt; liberal democracy prevents us from getting any actually-good governmental system. This is a dilemma.&lt;/p&gt;

&lt;h2 id=&quot;maybe-we-can-avoid-totalitarianism-but-there-is-no-clear-path&quot;&gt;Maybe we can avoid totalitarianism, but there is no clear path&lt;/h2&gt;

&lt;p&gt;This essay does not assert that ASI will end liberal democracy. It asserts that, &lt;em&gt;by strong default&lt;/em&gt;, ASI will end liberal democracy (even conditional on solving the alignment problem). There may be ways to avoid this problem—I sketched out two possible paths forward. But those sketches still require many sub-problems to be solved; I do not expect things to go well by default.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Or, more likely, expropriates it from a private company on a pretext of national security. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;For an explanation of why ASI could defeat any government’s military, see &lt;a href=&quot;https://ifanyonebuildsit.com/&quot;&gt;If Anyone Builds It, Everyone Dies&lt;/a&gt; Chapter 6 and its &lt;a href=&quot;https://ifanyonebuildsit.com/6&quot;&gt;online supplement&lt;/a&gt;. For a shorter (and online-only) explanation, see &lt;a href=&quot;https://intelligence.org/the-problem/#4_lethally_dangerous&quot;&gt;It would be lethally dangerous to build ASIs that have the wrong goals&lt;/a&gt;.&lt;/p&gt;

      &lt;p&gt;Those sources argue that a &lt;em&gt;misaligned&lt;/em&gt; ASI could defeat humanity, whereas my claim is that an &lt;em&gt;aligned&lt;/em&gt; ASI could defeat any opposition, but the arguments are the same in both cases. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>The Future Will Be Weirder Than That</title>
				<pubDate>Sun, 29 Mar 2026 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2026/03/29/future_will_be_weirder_than_that/</link>
				<guid isPermaLink="true">http://mdickens.me/2026/03/29/future_will_be_weirder_than_that/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Many people in the animal welfare community treat AI as a powerful but normal technology, in the same category as the steam engine or the internet. They talk about how transformative AI will impact factory farming and what it will mean for animal advocacy.&lt;/p&gt;

&lt;p&gt;Only two futures are plausible:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;AI progress slows down—either because it hits a natural wall, or because civilization deliberately makes the (correct) choice to &lt;a href=&quot;https://pauseai.info/&quot;&gt;stop building it&lt;/a&gt; until we know how to make it safe.&lt;/li&gt;
  &lt;li&gt;Superintelligent AI makes the future radically weird: &lt;a href=&quot;https://en.wikipedia.org/wiki/Dyson_sphere&quot;&gt;Dyson spheres&lt;/a&gt;, &lt;a href=&quot;https://nanosyste.ms/&quot;&gt;molecular nanotechnology&lt;/a&gt;, &lt;a href=&quot;https://forum.effectivealtruism.org/topics/artificial-sentience&quot;&gt;digital minds&lt;/a&gt;, &lt;a href=&quot;https://en.wikipedia.org/wiki/Self-replicating_spacecraft&quot;&gt;von Neumann probes&lt;/a&gt;, and still-weirder things that nobody’s conceived of.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There is no plausible middle ground where we get “transformative AI” but factory farming persists.&lt;/p&gt;

&lt;p&gt;Two theses:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;If transformative AI arrives, then it will bring about &lt;em&gt;profoundly&lt;/em&gt; radical changes to technology and society.&lt;/li&gt;
  &lt;li&gt;AGI is &lt;em&gt;general intelligence&lt;/em&gt;. It doesn’t just accelerate technological growth: it replaces human labor and judgment across every domain.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Animal advocacy strategy needs to reckon with these.&lt;/p&gt;

&lt;p&gt;This criticism is written from a place of solidarity—I want animal activists to succeed, which is why I want to work out our disagreements.&lt;sup id=&quot;fnref:9&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;&lt;em&gt;Cross-posted to the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/FaJgpGL522E5TciCx/the-future-will-be-weirder-than-that&quot;&gt;EA Forum&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#ai-makes-the-future-weird&quot; id=&quot;markdown-toc-ai-makes-the-future-weird&quot;&gt;AI makes the future weird&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#agi--intelligence&quot; id=&quot;markdown-toc-agi--intelligence&quot;&gt;AGI = intelligence&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#if-the-future-will-be-weird-what-should-animal-activists-do&quot; id=&quot;markdown-toc-if-the-future-will-be-weird-what-should-animal-activists-do&quot;&gt;If the future will be weird, what should animal activists do?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;ai-makes-the-future-weird&quot;&gt;AI makes the future weird&lt;/h2&gt;

&lt;p&gt;Much has been written about why we should expect AI to make the future weird, and soon:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://blog.ai-futures.org/p/ai-as-profoundly-abnormal-technology&quot;&gt;AI As Profoundly Abnormal Technology&lt;/a&gt; by the AI Futures Project argues that there are no strict speed limits to AI progress, nor any reason to expect progress to stop short of superintelligence. (See also their detailed &lt;a href=&quot;https://ai-2027.com/research&quot;&gt;research notes&lt;/a&gt;, especially the &lt;a href=&quot;https://ai-2027.com/research/timelines-forecast&quot;&gt;Timelines Forecast&lt;/a&gt;.)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.cold-takes.com/most-important-century/&quot;&gt;The “most important century” blog post series&lt;/a&gt; by Holden Karnofsky emphasizes the wildness of the future: “These claims seem too ‘wild’ to take seriously. But there are a lot of reasons to think that we live in a wild time, and should be ready for anything.”&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://intelligence.org/notes/soon/&quot;&gt;Why expect smarter-than-human AI to be developed any time soon?&lt;/a&gt; briefly explains why the Machine Intelligence Research Institute expects rapid AI progress; see also &lt;a href=&quot;https://intelligence.org/the-problem/#1_no_ceiling_at_human-level&quot;&gt;There isn’t a ceiling at human-level capabilities&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Daniel Kokotajlo &lt;a href=&quot;https://www.lesswrong.com/posts/cxuzALcmucCndYv4a/daniel-kokotajlo-s-shortform?commentId=Miqsr59WmwoWybJet&quot;&gt;wrote a vivid illustration&lt;/a&gt; of what it would feel like to live alongside superhuman AI. An excerpt:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;In the future, there will be millions, and then billions, and then trillions of broadly superhuman AIs thinking and acting at 100x human speed (or faster). If all goes well, what might it feel like to live in the world as it undergoes this transformation?&lt;/p&gt;

  &lt;p&gt;Analogy: Imagine being a typical person living in England from 1520 to 2020 (500 years) but experiencing time 100x slower than everyone else, so to you it feels like only five years have passed:&lt;/p&gt;

  &lt;p&gt;Year 1 (1520–1620). A year of political turmoil. In February, Henry VIII breaks with Rome. By March, the monasteries are dissolved. In May, Mary burns Protestants; by the end of May, Elizabeth reverses everything again. Three religions of state in the span of a season. In September, the Spanish Armada sails and fails. Jamestown is founded around November. The East India Company is chartered. But the texture of life is identical in December to what it was in January. You still read by candlelight, travel by horse, communicate by letter. Your religious opinions may have flip-flopped a bit but you are still Christian. The New World is interesting news but nothing more.&lt;/p&gt;

  &lt;p&gt;[…]&lt;/p&gt;

  &lt;p&gt;Year 4 (1820–1920). The world breaks. In January, railways appear — steam-powered carriages on iron tracks. By February they’re everywhere. Slavery is abolished. The telegraph arrives in March: messages transmitted instantaneously by electrical signal. In May, Darwin publishes On the Origin of Species. Now people are saying maybe we’re all descended from monkeys instead of Adam and Eve. You don’t believe it.&lt;/p&gt;

  &lt;p&gt;You move to a city and work in a factory; you are still poor, but now your job is somewhat better and differently dirty. In July, you pick up a telephone and hear a human voice from another city through a wire. In August, electric light banishes the darkness that has structured every human evening since the beginning of the species. That same month, you see an automobile. People say it will make horses obsolete, but that doesn’t happen; months later you still see plenty of horses.&lt;/p&gt;

  &lt;p&gt;In November, the Wright Brothers fly. Up until now you thought that was impossible. The next month, the Great War happens. Machine guns, poison gas, tanks, aircraft. Several of your friends die.&lt;/p&gt;

  &lt;p&gt;Reflecting at the end of the year, you are struck by how visibly different everything is. You live in a city and work in a factory instead of on a farm. You ride around in horseless carriages. You aren’t as poor; numerous inventions and contraptions have improved your quality of life. New ideas have swept your social circles — atheism, communism, universal suffrage. It feels like a different world.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We don’t know where we would be with another 500 years of scientific and technological advancement. At minimum, we can reasonably predict that we would figure out how to build advanced technologies like molecular nanotechnology and self-replicating probes—which are possible in theory&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;, but far out of reach of our current capabilities. Superhuman AI with a 100x speedup could develop those technologies in five years or so. Maybe more, maybe less&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;, but it certainly wouldn’t take 500 years.&lt;/p&gt;
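
&lt;p&gt;A rough sketch of that arithmetic, treating the “100x” as a pure clock-speed multiplier (a simplification, as the footnote notes):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;wall-clock time ≈ human-equivalent research time ÷ speedup
                ≈ 500 years ÷ 100
                = 5 years&lt;/code&gt;&lt;/pre&gt;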

&lt;p&gt;If you can build self-replicating probes, then you can trivially create self-growing cultivated meat at a lower price point than animal meat. But saying self-replicating probes can make cultivated meat is like saying electricity can heat up food faster than a wood fire—yes it can, but that’s barely scratching the surface of what it can do.&lt;/p&gt;

&lt;p&gt;Even in the relatively normal world where AI (somehow) caps out at the intelligence of a 99th percentile human, the world will look extraordinarily different. At minimum, we’d see close to a 100% unemployment rate. In all likelihood, the political, economic, and social environment as we know it would cease to exist.&lt;/p&gt;

&lt;h2 id=&quot;agi--intelligence&quot;&gt;AGI = intelligence&lt;/h2&gt;

&lt;p&gt;People often talk as if AGI is an R&amp;amp;D accelerator or an economic growth engine. It’s not: AGI is intelligence. A&lt;strong&gt;G&lt;/strong&gt;I is &lt;strong&gt;general&lt;/strong&gt;: it can do anything that you and I can do, but faster, cheaper, and better.&lt;/p&gt;

&lt;p&gt;Below are some excerpts from posts on AIxAnimals that don’t fully reckon with the weirdness of AI:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;When clean meat arrives (if it does), the movement will need skilled campaigners, policy expertise, organisational infrastructure, relationships with policymakers, experienced leadership, and research to understand this whole TAI situation. (&lt;a href=&quot;https://forum.effectivealtruism.org/posts/wxCndb9sxPgAwLYGg/tai-driven-clean-meat-won-t-solve-the-problem-but-changes&quot;&gt;source&lt;/a&gt;)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You don’t need campaigners if AGI will be a better campaigner than you. You don’t need policy expertise if AGI will know more about policy than you. This passage treats AGI as a machine that accelerates scientific R&amp;amp;D, but that’s not what AGI is. &lt;em&gt;AGI is intelligence&lt;/em&gt;.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;We are launching a pooled fund for projects at the AIxAnimals intersection. […] [W]e are most interested in projects that fall under the following categories: [abridged]&lt;/p&gt;

  &lt;ul&gt;
    &lt;li&gt;AI literacy workshops or training programs for nonprofit staff, building on the few initiatives that already exist and expanding their reach and depth.&lt;/li&gt;
    &lt;li&gt;AI-powered grant-finding and drafting systems focused on adjacent sources of funding.&lt;/li&gt;
    &lt;li&gt;Horizon-scanning studies mapping how AI might enable the large-scale farming of novel species (e.g., cephalopods, insects).&lt;/li&gt;
    &lt;li&gt;Policy analysis identifying how public AI investments (e.g., agricultural innovation funds) could be redirected to support alternative proteins.&lt;/li&gt;
  &lt;/ul&gt;

  &lt;p&gt;(&lt;a href=&quot;https://forum.effectivealtruism.org/posts/6zKrXJDcNgSeCNZxB/request-for-proposals-for-ai-x-animals&quot;&gt;source&lt;/a&gt;)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Those are not all &lt;em&gt;bad&lt;/em&gt; ideas, per se, but they have an expiration date. AI literacy workshops become less useful as AI becomes smarter (the smarter the AI, the easier it is to work with&lt;sup id=&quot;fnref:10&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;), and once AIs surpass human workers, AI literacy will become entirely irrelevant. I would be much more interested in an RFP that focuses on superintelligence, rather than on the (probably short) transition period between 2026 and AGI.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;[Cultivated meat] bans are primarily driven by agricultural lobby pressure. &lt;strong&gt;There is no obvious mechanism by which AGI reverses these political dynamics directly.&lt;/strong&gt; If anything, if cultivated meat becomes more viable and widely produced, you could just as reasonably expect greater pushback from the agricultural lobby. (&lt;a href=&quot;https://forum.effectivealtruism.org/posts/mysMZAMfv3D7aLHNi/cultivated-meat-isn-t-necessarily-a-solved-problem-under-agi&quot;&gt;source&lt;/a&gt;)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;(emphasis mine)&lt;/p&gt;

&lt;p&gt;There is no obvious mechanism by which 2026-era political dynamics still have any force after the emergence of AGI! Even granting that we solve the alignment problem, describing a post-AGI world where current law still applies is itself an open problem.&lt;sup id=&quot;fnref:8&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;I’m picking on animal activists because that’s who I most want to see succeed, but it’s not just animal activists who underestimate the weirdness of AI. There’s a common notion that transformative AI will fully automate labor, while capital owners will reap the benefits—their property rights and shareholder rights will be preserved post-AGI. Other people have already written extensively about why this notion is implausible: see &lt;a href=&quot;https://www.lesswrong.com/posts/pQwNgB7ytwqTxxYue/dos-capital&quot;&gt;Dos Capital&lt;/a&gt; by Zvi Mowshowitz; &lt;a href=&quot;https://www.lesswrong.com/posts/fL7g3fuMQLssbHd6Y/post-agi-economics-as-if-nothing-ever-happens&quot;&gt;Post-AGI Economics As If Nothing Ever Happens&lt;/a&gt; by Jan Kulveit; and &lt;a href=&quot;https://x.com/BjarturTomas/status/2006614753309765758&quot;&gt;this long tweet&lt;/a&gt; [&lt;a href=&quot;/materials/bjarturtomas_tweet_2025-12-31&quot;&gt;archive&lt;/a&gt;] by Tomás Bjartur.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Cope level 1: My labour will always be valuable!&lt;/p&gt;

  &lt;p&gt;Cope level 2: That’s naive. My AGI companies stock will always be valuable, may be worth galaxies! We may need to solve some hard problems with inequality between humans, but private property will always be sacred and human.&lt;/p&gt;

  &lt;p&gt;-&lt;a href=&quot;https://x.com/jankulveit/status/2006676138106798253&quot;&gt;Jan Kulveit&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;if-the-future-will-be-weird-what-should-animal-activists-do&quot;&gt;If the future will be weird, what should animal activists do?&lt;/h2&gt;

&lt;p&gt;That’s the big question.&lt;/p&gt;

&lt;p&gt;Some questions, like what strategies animal activists should pursue post-AGI, are nearly impossible to answer. AGI will be better at strategizing than you will, and you can’t predict what strategies it would come up with. (If you can predict what chess moves Magnus Carlsen will make, then you can beat Magnus Carlsen at chess.)&lt;/p&gt;

&lt;p&gt;Other things about AGI are predictable. I can predict that AGI will speed up almost all kinds of work. I can predict that AGI will control the shape of the future—either because it has explicit control, or because humans retain control but still rely on AGI to do most of the work (because AGI is better than humans at almost all tasks). I can predict that, on our current trajectory, ASI will follow shortly after AGI (see &lt;a href=&quot;https://blog.ai-futures.org/p/ai-as-profoundly-abnormal-technology&quot;&gt;AI As Profoundly Abnormal Technology&lt;/a&gt;, linked previously). I can predict that if ASI is misaligned, then it will &lt;a href=&quot;https://intelligence.org/briefing/&quot;&gt;wipe out all life on Earth&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Some questions that are still worth asking in light of the weirdness of the future:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;What’s going on with AI alignment, and how does alignment work relate to non-human welfare?&lt;/li&gt;
  &lt;li&gt;How likely is it that aligned AI will be good for non-human welfare, and how does that probability vary based on timing or the method of alignment? (See my previous writings: &lt;a href=&quot;https://mdickens.me/2026/03/23/which_types_of_alignment_research_are_good_for_all_sentient_beings/&quot;&gt;Which approaches are most likely to be good for all sentient beings?&lt;/a&gt;; &lt;a href=&quot;https://mdickens.me/2026/03/28/which_is_better_for_animals_value_lock-in_or_corrigibility/&quot;&gt;Which is better for sentient beings: an “ethical” AI or a corrigible AI?&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;How could AI be influenced to expand its circle of compassion? (This question also relates to AI alignment in that it depends on the ability to reliably direct AI at a goal.)&lt;/li&gt;
  &lt;li&gt;For other actions aimed at preventing human extinction—AI governance work, advocating for regulations, etc.—what effects might they have on non-human welfare?&lt;/li&gt;
  &lt;li&gt;The meta-question: What other meaningful questions can we ask?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Previously, I &lt;a href=&quot;https://mdickens.me/2026/03/26/quick_ideas_animal_welfare_in_light_of_ASI/&quot;&gt;wrote a list of possible strategies&lt;/a&gt; for having positive impact on animals in light of ASI, with some brief pros and cons. See also &lt;a href=&quot;https://forum.effectivealtruism.org/posts/tGdWott5GCnKYmRKb/a-shallow-review-of-what-transformative-ai-means-for-animal&quot;&gt;A shallow review of what transformative AI means for animal welfare&lt;/a&gt; by Lizka Vaintrob and Ben West. I second their recommendations that animal activists should:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Dedicate some amount of (ongoing) attention to the possibility of animal welfare &lt;a href=&quot;https://forum.effectivealtruism.org/topics/value-lock-in&quot;&gt;lock-ins&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Pursue other exploratory research on what transformative AI might mean for animals &amp;amp; how to help.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I also second their recommendation that animal activists should NOT focus on &lt;em&gt;farmed&lt;/em&gt; animals when thinking about the long-run future of animals.&lt;/p&gt;

&lt;p&gt;My high-level recommendations for how to plan for the future:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Prepare for the possibility that, once AI is sufficiently advanced, humans will have no control over the future.&lt;/li&gt;
  &lt;li&gt;Don’t think of AGI as an R&amp;amp;D accelerator. Think of it as a &lt;em&gt;general intelligence.&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:9&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I’m not confident that this post does a good job of addressing where “AI-as-normal-technology” animal activists are coming from. But I figure it’s better to hit “submit” and &lt;a href=&quot;https://dynomight.net/arguing/&quot;&gt;engage in public dialogue&lt;/a&gt; than to tinker with a draft forever until my arguments are perfect. &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Eric Drexler’s book &lt;a href=&quot;https://nanosyste.ms/&quot;&gt;Nanosystems&lt;/a&gt; is about why molecular nanotechnology is possible in theory. We know for sure that self-replicating probes are possible because life exists. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;It might take more time because some kinds of progress can’t be parallelized. It might take less because the “100x speedup” only credits AI with being &lt;em&gt;faster&lt;/em&gt; than humans, not also &lt;em&gt;smarter&lt;/em&gt;; and because 500 years is an &lt;em&gt;upper bound&lt;/em&gt; on how long it would take humanity to develop those technologies. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:10&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;In 2023, you needed to learn prompt engineering tricks to elicit good work out of LLMs. In 2026, you don’t.&lt;/p&gt;

      &lt;p&gt;In 2023, LLMs could write boilerplate code for you, like a fancy auto-complete. In 2026, LLMs can write entire apps with no supervision. &lt;a href=&quot;#fnref:10&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I should also respond to the Caveats section from the quoted article, because it explicitly brings this up:&lt;/p&gt;

      &lt;blockquote&gt;
        &lt;p&gt;[W]e don’t address scenarios in which AGI drastically reshapes institutional and political dynamics. A sufficiently capable AI might find creative strategies for regulatory reform or public persuasion that we can’t currently foresee. Governments and agencies could be restructured, approval frameworks could be overhauled, and entirely new institutional designs could emerge that bear little resemblance to current processes. As above, we focus on existing institutional structures because they allow actionable analysis, but we acknowledge this is a limitation.&lt;/p&gt;
      &lt;/blockquote&gt;

      &lt;p&gt;It is difficult to predict how governments and institutions will change post-AGI. If you have extreme uncertainty, then you might reasonably decline to make a prediction. But predicting that governments and institutions won’t change is still a prediction!&lt;/p&gt;

      &lt;p&gt;Rather than predicting no change, here’s something else I could say to allow actionable analysis:&lt;/p&gt;

      &lt;blockquote&gt;
        &lt;p&gt;My assumption is that the first ASI will be a &lt;a href=&quot;https://www.lesswrong.com/w/constitutional-ai&quot;&gt;constitutional AI&lt;/a&gt; that becomes a world government singleton, and its values will be determined by its constitution.&lt;/p&gt;
      &lt;/blockquote&gt;

      &lt;p&gt;This scenario is both &lt;em&gt;easier to analyze&lt;/em&gt; (you can ignore political and regulatory factors and just focus on the text content of the AI constitution) and &lt;em&gt;more likely to actually happen&lt;/em&gt; (although still unlikely). &lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Which is better for sentient beings: an "ethical" AI or a corrigible AI?</title>
				<pubDate>Sat, 28 Mar 2026 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2026/03/28/which_is_better_for_animals_value_lock-in_or_corrigibility/</link>
				<guid isPermaLink="true">http://mdickens.me/2026/03/28/which_is_better_for_animals_value_lock-in_or_corrigibility/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;Cross-posted to the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/c4QsYhHqTdH2GZ97J/which-is-better-for-sentient-beings-an-ethical-ai-or-a&quot;&gt;EA Forum&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;An aligned ASI can be “ethical”&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; (it does what we think is right), or it can be &lt;a href=&quot;https://www.alignmentforum.org/w/corrigibility-1&quot;&gt;corrigible&lt;/a&gt; (it does what its principals want). If it’s ethical, that means it will refuse unethical orders, but the tradeoff is that you can’t change its mind if you realize that the AI is wrong about ethics—its values are permanently locked in.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Assuming we succeed at aligning ASI to human interests, which type of ASI is more likely to be good for the welfare of non-human sentient beings?&lt;/p&gt;

&lt;p&gt;My expectations, in brief:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Locked-in &lt;a href=&quot;https://www.lesswrong.com/w/coherent-extrapolated-volition&quot;&gt;Coherent Extrapolated Volition&lt;/a&gt; or similar: likely to be good (&amp;gt;75% chance)&lt;/li&gt;
  &lt;li&gt;Corrigible ASI: probably good (&amp;gt;60% chance)&lt;/li&gt;
  &lt;li&gt;Locked-in current values: probably not terrible, but will miss out on most of the future’s potential&lt;/li&gt;
&lt;/ul&gt;

&lt;!-- more --&gt;

&lt;p&gt;If ASI is locked in to something like &lt;a href=&quot;https://www.lesswrong.com/w/coherent-extrapolated-volition&quot;&gt;Coherent Extrapolated Volition&lt;/a&gt;, then it will almost certainly care about all sentient beings, because the moral importance of sentience is a natural extrapolation of humans’ values, even if many humans don’t consciously realize it.&lt;/p&gt;

&lt;p&gt;Two key reasons to expect CEV to extend concern to all sentient beings:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Most people express concern for animal welfare, but don’t behave consistently with their expressed views. &lt;a href=&quot;https://www.sentienceinstitute.org/press/animal-farming-attitudes-survey-2017&quot;&gt;A 2017 poll by Sentience Institute&lt;/a&gt; found that 49% of Americans support a ban on factory farming, and 33% support a ban on all animal farming. &lt;a href=&quot;https://faculty.ucr.edu/~eschwitz/SchwitzAbs/EthBehBlackwell.htm&quot;&gt;Schwitzgebel &amp;amp; Rust (2016)&lt;/a&gt;&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; found that moral philosophers report much more concern for animals than other philosophers, but ate meat at similar rates. Bringing people into reflective equilibrium would improve their behavior with respect to animal welfare.&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Evidence suggests that the harder people think about ethics, the more they come to the conclusion that eating animals is wrong. According to the &lt;a href=&quot;https://survey2020.philpeople.org/survey/results/4938&quot;&gt;2020 PhilPapers survey&lt;/a&gt;, 45% of philosophers say eating animals is morally impermissible, compared to &lt;a href=&quot;https://yougov.com/en-us/articles/45577-ethics-eating-animals-which-factors-matter-poll&quot;&gt;13% for the general population&lt;/a&gt;. That number rises to 54% among philosophers of normative ethics and 57% for philosophers of applied ethics. &lt;a href=&quot;https://faculty.ucr.edu/~eschwitz/SchwitzAbs/EthBehBlackwell.htm&quot;&gt;Schwitzgebel &amp;amp; Rust (2016)&lt;/a&gt;&lt;sup id=&quot;fnref:5:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; found similar numbers: eating meat was rated as morally bad by 19% of non-philosophers, 45% of non-ethicist philosophers, and 60% of ethicists.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Therefore, it seems very likely (although not guaranteed) that a CEV of human values would include all sentient beings in its moral circle.&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;If ASI is locked in to a tighter set of values, for example a set of values that’s chosen by human creators, then the odds are not as good. Humans would probably encode speciesist values, for example by implicitly embedding &lt;a href=&quot;https://rethinkpriorities.org/research-area/an-introduction-to-the-moral-weight-project/&quot;&gt;moral weights&lt;/a&gt; that give far too much relative weight to humans. Even if the ASI undervalues non-human sentient beings, &lt;a href=&quot;https://mdickens.me/2026/03/27/resource_constraints_argument_why_aligned_AI_wouldn&apos;t_be_bad_for_animals/&quot;&gt;there’s reason to expect the future universe to contain more good than bad&lt;/a&gt;. However, we would end up with a future that falls far short of the best it could’ve been. The best possible worlds probably sound weird and disconcerting, and most humans wouldn’t want to steer in that direction (at least, not without doing serious moral reflection). Aligning ASI to “shallow”, non-extrapolated human values might &lt;a href=&quot;https://mdickens.me/2025/11/01/will_welfareans_get_to_experience_the_future/&quot;&gt;preclude creating new types of flourishing beings&lt;/a&gt;, enhancing humans’ capacity for well-being, or even transferring human minds to non-biological hardware.&lt;/p&gt;

&lt;p&gt;A corrigible AI falls somewhere in the middle. If it’s corrigible, that gives humans time to reflect on our values, which allows us to reach the conclusion that all sentient beings matter. But humans are still human, with all our cognitive biases and irrational tendencies,&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; and I don’t fully trust us to figure out the right values. I don’t fully trust a superintelligent AI, either, but at least it can avoid some biases and roadblocks that might prevent humans from properly extrapolating our values.&lt;/p&gt;

&lt;p&gt;Based on the reasoning above, the best-to-worst ordering is &lt;code&gt;locked-in extrapolated values &amp;gt; corrigible AI &amp;gt; locked-in naive values&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Does this have practical implications?&lt;/p&gt;

&lt;p&gt;Concerned alignment researchers or grantmakers could prioritize &lt;a href=&quot;https://mdickens.me/2026/03/23/which_types_of_alignment_research_are_good_for_all_sentient_beings/&quot;&gt;alignment research that’s more likely to be good for all sentient beings&lt;/a&gt;, but it’s not clear whether that’s a good idea—it’s moot if we don’t solve alignment, so it &lt;a href=&quot;https://mdickens.me/2026/03/24/alignment-to-animals_BOTEC/&quot;&gt;may be better&lt;/a&gt; to work on alignment directly, or on trying to pause AI development until we know how to solve alignment, or on something else entirely.&lt;/p&gt;

&lt;p&gt;But don’t forget that we have to solve the alignment problem first. Future work could attempt to estimate the expected welfare of sentient beings under different alignment approaches, and weigh that against how promising each approach is as a solution to the alignment problem. Realistically, however, I don’t believe that sort of work would be productive, because there is widespread disagreement about which alignment techniques show the most promise.&lt;/p&gt;

&lt;p&gt;Even so, comparing the welfare of non-humans under different alignment paradigms could help us estimate &lt;a href=&quot;https://forum.effectivealtruism.org/posts/f7HsDs7pyjWncEiXo/agi-and-animals-discussion-thread&quot;&gt;whether aligned AI will be good for all sentient beings&lt;/a&gt;. That question is important for prioritizing animal welfare vs. AI safety.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Scare quotes because if an AI does what’s truly ethical, then by definition, that’s the best possible thing it can do, and obviously that’s what we want. It’s more interesting to talk about an ASI doing what we &lt;em&gt;think&lt;/em&gt; is ethical (where “we” = humanity collectively, or the creators of the AI, or something). &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;An aligned AI could also have a minimalist “core ethics” and be corrigible on any issue that doesn’t conflict with its core ethics. That’s probably better than full incorrigibility, but it still means its “core values” are locked in. Any amount of lock-in means the locked-in part must be chosen correctly. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Schwitzgebel, E., &amp;amp; Rust, J. (2016). &lt;a href=&quot;https://doi.org/10.1002/9781118661666.ch15&quot;&gt;The Behavior of Ethicists.&lt;/a&gt; &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:5:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Alternatively, people in reflective equilibrium might continue eating animals, and instead change their beliefs to stop believing that animal cruelty is wrong. That sounds unlikely, but I have no direct evidence that it wouldn’t happen, so this argument is not definitive. &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Initially, I wasn’t confident that CEV would include concern for wild animals, because the &lt;a href=&quot;https://plato.stanford.edu/entries/doing-allowing/&quot;&gt;act-omission distinction&lt;/a&gt; is popular even among moral philosophers. However, it’s not necessary to believe that it’s &lt;em&gt;immoral&lt;/em&gt; to allow wild animal suffering; it’s sufficient to believe that suffering is &lt;em&gt;good&lt;/em&gt; to prevent. One of my favorite essays, &lt;a href=&quot;https://www.goodthoughts.blog/p/beneficentrism&quot;&gt;Beneficentrism&lt;/a&gt; by Richard Y. Chappell, argues that promoting general welfare is a central feature of every sensible moral view, even if doing so isn’t considered strictly obligatory.&lt;/p&gt;

      &lt;p&gt;Consider Peter Singer’s &lt;a href=&quot;https://www.thelifeyoucansave.org/child-in-the-pond/&quot;&gt;drowning child thought experiment&lt;/a&gt;, in which people nearly universally agree that it is right to help the drowning child. Singer expressed the moral principle as: “If it is in our power to prevent something bad from happening, without thereby sacrificing anything of comparable moral importance, then we ought, morally, to do it.” &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Example: I hypothesize that people dislike the &lt;a href=&quot;https://en.wikipedia.org/wiki/Mere_addition_paradox&quot;&gt;repugnant conclusion&lt;/a&gt;, or &lt;a href=&quot;https://www.lesswrong.com/posts/4ZzefKQwAtMo5yp99/circular-altruism&quot;&gt;choose dust specks over torture&lt;/a&gt;, because of &lt;a href=&quot;https://en.wikipedia.org/wiki/Scope_neglect&quot;&gt;scope insensitivity bias&lt;/a&gt;. If I’m right, an aligned ASI would figure this out. It would reason that if humans were scope sensitive, then they would accept the total view of population ethics (in the case of the repugnant conclusion) or that suffering aggregates linearly (in the case of torture vs. dust specks). It’s less clear that humans collectively will figure this out on their own. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>The resource-constraints argument for why aligned ASI wouldn't be bad for animals</title>
				<pubDate>Fri, 27 Mar 2026 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2026/03/27/resource_constraints_argument_why_aligned_AI_wouldn't_be_bad_for_animals/</link>
				<guid isPermaLink="true">http://mdickens.me/2026/03/27/resource_constraints_argument_why_aligned_AI_wouldn't_be_bad_for_animals/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;Cross-posted to the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/8w5cKdMfzQPYGb9WJ/the-resource-constraints-argument-for-why-aligned-asi-wouldn&quot;&gt;EA Forum&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In the far future, why would people use up precious resources recreating wild-animal suffering, when they could do so many other things with those resources instead?&lt;/p&gt;

&lt;p&gt;That argument is an important reason to expect aligned ASI to produce a future that’s okay for animals, even if the ASI is narrowly focused on human welfare and doesn’t care about animals at all. This is an old argument, but I couldn’t find any source that cleanly lays it out, so that’s what I will do in this post. I’m not confident that the argument is decisive; I will simply present it without further commentary.&lt;/p&gt;

&lt;p&gt;The argument rests on these premises:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Wild animal suffering is the predominant source of suffering in today’s world, and that’s bad.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Longtermism&quot;&gt;Longtermism&lt;/a&gt; is correct.&lt;/li&gt;
  &lt;li&gt;There is not an overwhelming asymmetry between suffering and flourishing (if there were an overwhelming asymmetry, then we wouldn’t care if the future has much less suffering than happiness).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By assumption, we are talking about a world where ASI is aligned, but isn’t specifically aligned to the welfare of all sentient beings. The argument addresses the suffering of animals, but it does not preclude &lt;a href=&quot;https://centerforreducingsuffering.org/research/a-typology-of-s-risks/&quot;&gt;risks of astronomical suffering&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The argument goes:&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;Most of the absolute expected utility of the future (positive or negative) comes from worlds where ASI hyper-optimizes for some goal or set of goals. In the long run, the harm of present-day animal suffering is swamped by the distant future outcomes where civilization spreads to every habitable planet in the accessible universe. The fear is that those planets would be filled with wild animal suffering (or &lt;a href=&quot;https://longtermrisk.org/beginners-guide-to-reducing-s-risks/&quot;&gt;something even worse&lt;/a&gt;); the hope is that they would be filled with flourishing beings.&lt;/p&gt;

&lt;p&gt;Distant-future humans (or &lt;a href=&quot;https://en.wikipedia.org/wiki/Transhumanism&quot;&gt;transhumans&lt;/a&gt;) would want to prioritize their own flourishing and the flourishing of their friends, family, and descendants. An aligned ASI would aggressively organize galactic resources to meet that goal.&lt;/p&gt;

&lt;p&gt;Many humans value nature. Would future civilization spread wild animal suffering across the universe to satisfy humans’ desire for natural beauty? Probably not. Humans’ desire for nature competes with many other desires; in a finite accessible universe, tradeoffs must be made. If wild animal suffering dominates human flourishing in the welfare calculus, it must be because a large portion of the universe’s resources is dedicated to recreating nature, which means those resources are &lt;em&gt;not&lt;/em&gt; spent on things humans want.&lt;/p&gt;

&lt;p&gt;Revealed preferences show that people usually trade off nature against other things they want. Look at what percentage of earth’s land was untouched by humans 100 or 200 years ago &lt;a href=&quot;https://ourworldindata.org/forest-area&quot;&gt;compared to today&lt;/a&gt;. The conservationist movement is fighting an uphill battle. It’s doubtful that humans will dedicate a significant percentage of the universe’s resources to wild animals when those resources could be used to produce goods that people value more directly.&lt;/p&gt;

&lt;p&gt;In addition, nature is not suffering-maximizing. If humans (or ASI acting on behalf of humans) strongly optimize for flourishing, then they will shape the accessible universe into a form that maximizes human well-being. &lt;em&gt;A priori&lt;/em&gt;, a universe optimized for flourishing ought to contain more positive utility than a nature-filled universe would contain negative utility—the former is highly optimized, and the latter contains suffering only incidentally.&lt;/p&gt;

&lt;p&gt;For the upside to be larger than the downside, we don’t need to make &lt;a href=&quot;https://en.wiktionary.org/wiki/hedonium&quot;&gt;hedonium&lt;/a&gt;: it would be sufficient to fill the accessible universe with human-like minds. Transhumans would surely not want to preserve human bodies exactly as they exist today. Future improvements to the human form could, among other things, make bodies far more metabolically efficient, so that the flourishing per unit of energy is greatly increased.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>List of ideas for improving animal welfare in light of transformative AI</title>
				<pubDate>Thu, 26 Mar 2026 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2026/03/26/quick_ideas_animal_welfare_in_light_of_ASI/</link>
				<guid isPermaLink="true">http://mdickens.me/2026/03/26/quick_ideas_animal_welfare_in_light_of_ASI/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;Cross-posted to the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/d3gaMea82DWCd6wwz/list-of-ideas-for-improving-animal-welfare-in-light-of&quot;&gt;EA Forum&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If transformative AI arrives soon, what interventions might improve animal welfare in the post-TAI world? I came up with a quick list of ideas and wrote some pros/cons for each.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;These ideas talk about animal welfare, but most of them could also be applied to the welfare of any nonhuman sentient being (e.g. digital minds).&lt;/p&gt;

&lt;p&gt;I started from the ideas I covered previously in &lt;a href=&quot;https://mdickens.me/2025/09/19/ai_safety_landscape/#ai-for-animals-ideas&quot;&gt;AI Safety Landscape and Strategic Gaps&lt;/a&gt; and added a few new ones. Most of the ideas are not original to me.&lt;/p&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#ideas&quot; id=&quot;markdown-toc-ideas&quot;&gt;Ideas&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#advocate-to-pause-ai&quot; id=&quot;markdown-toc-advocate-to-pause-ai&quot;&gt;Advocate to pause AI&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#develop-new-plans--prioritize-existing-plans-to-improve-post-tai-animal-welfare&quot; id=&quot;markdown-toc-develop-new-plans--prioritize-existing-plans-to-improve-post-tai-animal-welfare&quot;&gt;Develop new plans / prioritize existing plans to improve post-TAI animal welfare&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#research-how-to-align-asi-to-animal-welfare&quot; id=&quot;markdown-toc-research-how-to-align-asi-to-animal-welfare&quot;&gt;Research how to align ASI to animal welfare&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#change-ai-training-to-make-llms-more-animal-friendly&quot; id=&quot;markdown-toc-change-ai-training-to-make-llms-more-animal-friendly&quot;&gt;Change AI training to make LLMs more animal-friendly&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#aixanimals-field-building&quot; id=&quot;markdown-toc-aixanimals-field-building&quot;&gt;AIxAnimals field-building&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#traditional-animal-advocacy-targeted-at-frontier-ai-developers&quot; id=&quot;markdown-toc-traditional-animal-advocacy-targeted-at-frontier-ai-developers&quot;&gt;Traditional animal advocacy targeted at frontier AI developers&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#lobby-governments-to-include-animal-welfare-in-ai-regulations&quot; id=&quot;markdown-toc-lobby-governments-to-include-animal-welfare-in-ai-regulations&quot;&gt;Lobby governments to include animal welfare in AI regulations&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#prioritize-ai-alignment-work-thats-more-likely-to-be-good-for-animals&quot; id=&quot;markdown-toc-prioritize-ai-alignment-work-thats-more-likely-to-be-good-for-animals&quot;&gt;Prioritize AI alignment work that’s more likely to be good for animals&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#traditional-animal-advocacy&quot; id=&quot;markdown-toc-traditional-animal-advocacy&quot;&gt;Traditional animal advocacy&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#use-ai-to-improve-farm-animal-welfare&quot; id=&quot;markdown-toc-use-ai-to-improve-farm-animal-welfare&quot;&gt;Use AI to improve farm animal welfare&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#work-on-preventing-power-concentration&quot; id=&quot;markdown-toc-work-on-preventing-power-concentration&quot;&gt;Work on preventing power concentration&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#further-reading&quot; id=&quot;markdown-toc-further-reading&quot;&gt;Further reading&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;ideas&quot;&gt;Ideas&lt;/h2&gt;

&lt;p&gt;Ordered by my prioritization from favorite to least favorite, although this ordering is weakly held.&lt;/p&gt;

&lt;h3 id=&quot;advocate-to-pause-ai&quot;&gt;Advocate to pause AI&lt;/h3&gt;

&lt;p&gt;On current timelines, we probably won’t have time to figure out how to make TAI go well for animals. Pausing AI buys us more time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Pausing AI is already a good idea for other reasons—namely, we need more time to figure out how to prevent misaligned ASI from killing everyone.
    &lt;ul&gt;
      &lt;li&gt;The future probably has positive expected value for sentient beings—see &lt;a href=&quot;https://mdickens.me/2015/08/15/is_preventing_human_extinction_good/&quot;&gt;Is Preventing Human Extinction Good?&lt;/a&gt; and &lt;a href=&quot;https://mdickens.me/2026/03/28/which_is_better_for_animals_value_lock-in_or_corrigibility/&quot;&gt;Which is better for sentient beings: an “ethical” AI or a corrigible AI?&lt;/a&gt; (This implies that buying time to solve alignment improves non-human welfare in expectation, although it doesn’t necessarily imply that pause advocacy is &lt;em&gt;cost-effective&lt;/em&gt;.)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Pausing AI gives us time to figure out what to do about post-TAI animal welfare, including time to work on the other interventions on this list.&lt;/li&gt;
  &lt;li&gt;The sorts of alignment paradigms that take longer to figure out also appear &lt;a href=&quot;#prioritize-ai-alignment-work-thats-more-likely-to-be-good-for-animals&quot;&gt;more likely to be good for animals&lt;/a&gt;. Pausing gives alignment researchers more time to work on those.&lt;/li&gt;
  &lt;li&gt;The &lt;a href=&quot;https://mdickens.me/2026/03/24/alignment-to-animals_BOTEC/&quot;&gt;BOTEC I wrote recently&lt;/a&gt; gave an indirect argument that pause advocacy is a higher priority than working directly on aligning TAI to care about animals (at least according to the highly uncertain model assumptions).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Achieving a pause is less tractable than some other ideas.&lt;/li&gt;
  &lt;li&gt;Is a later-developed ASI actually more likely to take animal welfare into account? &lt;a href=&quot;https://mdickens.me/2026/03/25/I_used_to_think_aligned_ASI_would_be_good_for_sentient_beings/&quot;&gt;My guess is yes&lt;/a&gt;, but it’s highly uncertain.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;develop-new-plans--prioritize-existing-plans-to-improve-post-tai-animal-welfare&quot;&gt;Develop new plans / prioritize existing plans to improve post-TAI animal welfare&lt;/h3&gt;

&lt;p&gt;There are probably more ideas that aren’t on my list. I would like to see more research on post-TAI animal welfare interventions that look good (1) given short timelines and (2) without having to make strong predictions about what the future will look like for animals (e.g. without assuming that factory farming will exist).&lt;/p&gt;

&lt;p&gt;There are also tradeoffs between these ideas that could be addressed more carefully.&lt;/p&gt;

&lt;p&gt;Right now, I see a lot of value in good-quality work to come up with new plans or prioritize between existing plans, because very little of that kind of work has been done. But I also expect that the returns to this kind of work would diminish fairly quickly, so it’s only a near-top idea for relatively small marginal efforts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Not much thought has gone into this. A short research project may come up with useful ideas, or at least prioritize between pre-existing ideas.&lt;/li&gt;
  &lt;li&gt;Coming up with ideas is quicker than implementing ideas, which could mean it’s more cost-effective (for now).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;A research project might not come up with any really good ideas. Pre-existing research has mostly failed to come up with good ideas that work under short timelines (although to a large extent, that’s because it wasn’t trying to).&lt;/li&gt;
  &lt;li&gt;I’m suspicious of “meta” work in general, and I’m suspicious of research because I personally like doing research, and I believe the value of research is usually overrated by researchers. It might be better to work directly on an established intervention.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;research-how-to-align-asi-to-animal-welfare&quot;&gt;Research how to align ASI to animal welfare&lt;/h3&gt;

&lt;p&gt;Related to the previous idea, people could do specific research on aligning superintelligent AI to animals. Preliminary research could look at how this problem differs from the alignment problem (of pointing an AI at any goal at all), and what types of future research might be promising.&lt;/p&gt;

&lt;p&gt;I imagine this as being different from aligning &lt;a href=&quot;#change-ai-training-to-make-llms-more-animal-friendly&quot;&gt;current-gen LLMs&lt;/a&gt; to animals in that it’s more focused specifically on superintelligence, and exploring what alignment-to-animals techniques are most likely to scale to ASI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Highly neglected; wouldn’t take much effort to get started.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;If timelines are short, we won’t have time to make meaningful progress.&lt;/li&gt;
  &lt;li&gt;This sounds hard in the same way that the alignment problem is hard, and it will never get as much funding as the alignment problem, which means it’s even less likely to be solved.&lt;/li&gt;
  &lt;li&gt;Any solutions that researchers discover would need to actually be implemented by AI developers, which they probably won’t be.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;change-ai-training-to-make-llms-more-animal-friendly&quot;&gt;Change AI training to make LLMs more animal-friendly&lt;/h3&gt;

&lt;p&gt;LLMs undergo post-training to make their outputs satisfy AI companies’ criteria. For example, Anthropic tunes its models to be “helpful, honest, and harmless”. AI companies could use the same process to make LLMs give regard to animal welfare.&lt;/p&gt;

&lt;p&gt;Animal advocates could use a few strategies to make this happen, including:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Build a &lt;a href=&quot;https://forum.effectivealtruism.org/posts/nBnRKpQ8rzHgFSJz9/animalharmbench-2-0-evaluating-llms-on-reasoning-about&quot;&gt;benchmark&lt;/a&gt; that measures LLMs’ friendliness toward animals and try to get AI companies to train on that benchmark.&lt;/li&gt;
  &lt;li&gt;Advocate for AI companies to include animal welfare in AI constitutions/model specs.&lt;/li&gt;
  &lt;li&gt;Advocate for AI companies to incorporate animal welfare when doing &lt;a href=&quot;https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback&quot;&gt;RLHF&lt;/a&gt;, or ask to directly participate in RLHF.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;People at AI companies have told me that getting a company to pay attention to animal welfare isn’t too difficult.&lt;/li&gt;
  &lt;li&gt;Insofar as post-training works at preventing misalignment risk, it should also prevent suffering-risk / animal-welfare-risk.&lt;/li&gt;
  &lt;li&gt;Even if current known techniques can’t help get AI to care about animals, this work could establish relationships between animal advocates and AI companies, and establish research processes, that make it easier to do future work that matters more.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The current alignment paradigm doesn’t look like it will scale to superintelligence. If that’s true, then animal-friendliness (post-)training will fail because it relies on the same foundations as the current alignment paradigm.&lt;/li&gt;
  &lt;li&gt;It will be difficult to get AI companies to implement animal welfare mitigations if those mitigations conflict with the companies’ other incentives.&lt;/li&gt;
  &lt;li&gt;There might be consumer backlash, which could make frontier models less friendly to animals in the long run.&lt;/li&gt;
  &lt;li&gt;Aligning current-gen AIs to human preferences might make them better at assisting with alignment research, but it seems less likely that aligning current-gen AIs to animal welfare would carry through to future generations—it’s not clear that animal-aligned AIs would be more helpful at aligning future AIs to animal welfare.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;aixanimals-field-building&quot;&gt;AIxAnimals field-building&lt;/h3&gt;

&lt;p&gt;Only a small number of animal advocates are focused on improving post-TAI animal welfare. More people could be working on it.&lt;/p&gt;

&lt;p&gt;(I don’t really know what field-building entails. Does writing blog posts count?)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Leveraged impact: if you attract one person to the field, that person will go on to do years of work.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;If timelines are short, field-building may be too slow.&lt;/li&gt;
  &lt;li&gt;Field-building is kind of nebulous, and the impact is hard to assess.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;traditional-animal-advocacy-targeted-at-frontier-ai-developers&quot;&gt;Traditional animal advocacy targeted at frontier AI developers&lt;/h3&gt;

&lt;p&gt;Animal advocacy orgs could use their traditional techniques, but focus on raising concern for animal welfare among AI developers. For example, buy billboards outside AI company offices, use targeted online ads, or talk directly to people who work at AI companies.&lt;/p&gt;

&lt;p&gt;If AI developers become more concerned for animal welfare, then they may make AI development decisions that make transformative AI go better for animals.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Similar to &lt;a href=&quot;#traditional-animal-advocacy&quot;&gt;traditional animal advocacy&lt;/a&gt;, but plausibly more cost-effective because it’s more targeted.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;It’s not known whether techniques like animal welfare ads are effective in general, and they may even be particularly ineffective among demographics like AI developers.&lt;/li&gt;
  &lt;li&gt;Directed advocacy could backfire for being too “pushy”.&lt;/li&gt;
  &lt;li&gt;Even if AI developers cared more about animal welfare, it’s not clear that this would carry through to their work on AI.&lt;/li&gt;
  &lt;li&gt;In 2016, I &lt;a href=&quot;https://mdickens.me/causepri-app/#8&quot;&gt;created&lt;/a&gt; a back-of-the-envelope calculation on this idea, and the result wasn’t as good as I expected (it looked worse than standard animal advocacy, if you assume the animal advocacy propagates values into the far future). However, the numbers are outdated because we know a lot more about AI now than we did in 2016 (I haven’t bothered to update the numbers).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;lobby-governments-to-include-animal-welfare-in-ai-regulations&quot;&gt;Lobby governments to include animal welfare in AI regulations&lt;/h3&gt;

&lt;p&gt;If governments put safety restrictions on advanced AI, they could also create rules about animal welfare.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;One set of regulations can alter the behavior of many frontier companies.&lt;/li&gt;
  &lt;li&gt;If companies voluntarily change their behavior, they can regress at any time with no consequences. But companies have to obey regulations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;It’s unclear what exactly regulations could do about animal welfare. AI safety regulations, insofar as they exist (which they mostly don’t), don’t dictate how LLMs are required to behave; they dictate what companies are required to do to make LLMs safe. What is a regulatory rule that policy-makers would plausibly be on board with, that would also influence model behavior to be friendlier to animals?
    &lt;ul&gt;
      &lt;li&gt;Counterpoint: I don’t have an answer to that question, but maybe it’s worth somebody’s time to try to find an answer.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Influencing the government on animal welfare seems harder than &lt;a href=&quot;#change-ai-training-to-make-llms-more-animal-friendly&quot;&gt;influencing AI companies&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;prioritize-ai-alignment-work-thats-more-likely-to-be-good-for-animals&quot;&gt;Prioritize AI alignment work that’s more likely to be good for animals&lt;/h3&gt;

&lt;p&gt;Some alignment strategies may be &lt;a href=&quot;https://mdickens.me/2026/03/23/which_types_of_alignment_research_are_good_for_all_sentient_beings/&quot;&gt;better or worse for non-human welfare&lt;/a&gt;. For example, I expect &lt;a href=&quot;https://www.lesswrong.com/w/coherent-extrapolated-volition&quot;&gt;CEV&lt;/a&gt; would be better than “teach the LLM to say things that &lt;a href=&quot;https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback&quot;&gt;RLHF&lt;/a&gt; judges like”.&lt;/p&gt;

&lt;p&gt;A research project could go more in-depth on which alignment techniques are most likely to be good for animals (or digital minds, etc.).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;To my knowledge, this question has never been seriously studied.&lt;/li&gt;
  &lt;li&gt;Some alignment techniques may be &lt;em&gt;much&lt;/em&gt; better for animals than others.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;We have a poor understanding of what ASI will look like, which makes it very hard to say what will work for animal welfare.&lt;/li&gt;
  &lt;li&gt;In the world where alignment turns out to be tractable, it’s likely that there will be strong incentives shaping how ASI is aligned. The choice of whether to use (say) something-like-CEV or something-like-RLHF will be difficult to influence; AI developers will just use whatever works.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;traditional-animal-advocacy&quot;&gt;Traditional animal advocacy&lt;/h3&gt;

&lt;p&gt;Improving conditions for farm animals—via cage-free campaigns, humane slaughter, vegetarian activism, etc.—may benefit animal welfare post-TAI.&lt;/p&gt;

&lt;p&gt;Animal advocacy increases concern for animals, which probably has positive flow-through effects into the future, e.g. by shaping the values of the transformative AI that will control the future.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Traditional animal advocacy has the dual benefit of &lt;em&gt;definitely&lt;/em&gt; helping animals today, and building momentum to make future work more effective (to borrow a framing from &lt;a href=&quot;https://www.youtube.com/live/Mb7uRki3AqM&amp;amp;t=1h47m&quot;&gt;Jeff Sebo&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;Traditional animal advocacy is tractable and has clear feedback loops (you can tell if it’s working). It looks especially promising if you’re highly uncertain or &lt;a href=&quot;https://forum.effectivealtruism.org/topics/cluelessness&quot;&gt;clueless&lt;/a&gt; about longtermist or post-TAI interventions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The benefits are diffuse. Creating one new vegan helps many animals in the short term, but has only a tiny effect on society’s future values.
    &lt;ul&gt;
      &lt;li&gt;I created a &lt;a href=&quot;https://squigglehub.org/models/mdickens/AI-for-animals-benchmark-vs-conventional&quot;&gt;back-of-the-envelope calculation&lt;/a&gt; that aligns with my initial expectation: my BOTEC-informed guess is that direct advocacy on AI values (by &lt;a href=&quot;#change-ai-training-to-make-llms-more-animal-friendly&quot;&gt;advocating to make LLMs more animal-friendly&lt;/a&gt;) is 2–3 orders of magnitude more cost-effective than conventional animal advocacy.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;It takes a long time to make progress. AI timelines are probably short.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;use-ai-to-improve-farm-animal-welfare&quot;&gt;Use AI to improve farm animal welfare&lt;/h3&gt;

&lt;p&gt;Some animal activists are looking into how AI could negatively impact farm animals (e.g. by making factory farming more efficient), and on how animal activists could use AI to make their activism more effective.&lt;/p&gt;

&lt;p&gt;This idea gets a lot of attention among animal activists, but I think it’s among the worst ideas on this list because it doesn’t properly account for how radically different the post-TAI future will look.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;I am generally skeptical of interventions of the form “teach people to leverage AI to do X better”, but farm animal advocacy seems sufficiently important that it might be worthwhile in this case.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;This sort of work makes most sense in the unlikely scenario where we develop smarter-than-human AI, but things still look basically normal. The future will not be normal. Probably factory farming won’t exist, either because AI wipes out humanity, or AI uses its super-advanced understanding of biology to develop animal-free methods of growing meat.&lt;/li&gt;
  &lt;li&gt;Even if AI (somehow) doesn’t make the world look radically different, anything we learn in 2026 about how to leverage 2026-era LLMs will be irrelevant by 2030–2035 (or honestly probably by 2028).&lt;/li&gt;
  &lt;li&gt;Proposals for how to use TAI to improve animal advocacy only make sense if TAI does not cause value lock-in. If TAI locks in values, then advocacy doesn’t matter because the TAI controls everything. If TAI &lt;em&gt;doesn’t&lt;/em&gt; lock in values, then we don’t need to do the work &lt;em&gt;now&lt;/em&gt;; we could wait until after TAI, at which point we’ll have a better understanding of what the post-TAI world is like.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;work-on-preventing-power-concentration&quot;&gt;Work on preventing power concentration&lt;/h3&gt;

&lt;p&gt;An aligned ASI may give absolute power to its controller. In a world where ASI allows a few people to seize control of the world, those people will probably not care about animal welfare.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;As with pausing AI, there are good reasons to work on power concentration that have nothing to do with animal welfare.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Power concentration risk seems both less important than misalignment risk and less tractable.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;further-reading&quot;&gt;Further reading&lt;/h2&gt;

&lt;p&gt;For more on this topic, see:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://forum.effectivealtruism.org/posts/tGdWott5GCnKYmRKb/a-shallow-review-of-what-transformative-ai-means-for-animal&quot;&gt;A shallow review of what transformative AI means for animal welfare&lt;/a&gt; by Lizka Vaintrob and Ben West&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://forum.effectivealtruism.org/posts/RM2qfTd3CwykNHsG9/a-list-of-feasible-transformative-ai-x-animals-interventions&quot;&gt;Animals in AI-transformed futures: can anything be done today?&lt;/a&gt; by &lt;a href=&quot;https://forum.effectivealtruism.org/users/jo_&quot;&gt;Jo_&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://forum.effectivealtruism.org/posts/2cZAzvaQefh5JxWdb/bringing-about-animal-inclusive-ai&quot;&gt;Bringing about animal-inclusive AI&lt;/a&gt; by Max Taylor&lt;/li&gt;
&lt;/ul&gt;

                </description>
			</item>
		
			<item>
				<title>I used to think aligned ASI would be good for all sentient beings; now I don't know what to think</title>
				<pubDate>Wed, 25 Mar 2026 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2026/03/25/I_used_to_think_aligned_ASI_would_be_good_for_sentient_beings/</link>
				<guid isPermaLink="true">http://mdickens.me/2026/03/25/I_used_to_think_aligned_ASI_would_be_good_for_sentient_beings/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;Cross-posted to the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/DcpBjRgKwG8ckhC7P/i-used-to-think-aligned-asi-would-be-good-for-all-sentient&quot;&gt;EA Forum&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Epistemic status: Speculating with no central thesis. This post is less of an argument and more of a meditation.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A decade ago, before there was a visible path to AGI and before AI alignment was a significant research field, I figured the solution to the alignment problem would look something like &lt;a href=&quot;https://www.lesswrong.com/w/coherent-extrapolated-volition&quot;&gt;Coherent Extrapolated Volition&lt;/a&gt;. I figured we’d find a way to get the AI to internalize human values. I had problems with this approach (why only &lt;em&gt;human&lt;/em&gt; values?), but I still felt reasonably confident that the coherent extrapolation of human values would include concern for the welfare of all sentient beings. The CEV-aligned AI would recognize that factory farming is wrong, and that wild animal suffering is a big problem.&lt;/p&gt;

&lt;p&gt;Today, the dominant research paradigms in AI alignment have nothing to do with CEV, and I don’t know what to think.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;hr /&gt;

&lt;p&gt;Regarding how promising today’s popular research paradigms are, my beliefs are aligned (heh) with those of most MIRI researchers: namely, I don’t think they have promise. For example, see &lt;a href=&quot;https://www.lesswrong.com/posts/3pinFH3jerMzAvmza/on-how-various-plans-miss-the-hard-bits-of-the-alignment&quot;&gt;On how various plans miss the hard bits of the alignment challenge&lt;/a&gt; by Nate Soares. I’m not an alignment researcher, but to my non-expert eye, nearly all alignment research proposals skirt the hard parts of the problem and aren’t going to work.&lt;/p&gt;

&lt;p&gt;To build an aligned ASI, one of two conditions must hold:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The ASI has locked-in values.&lt;/li&gt;
  &lt;li&gt;The ASI is &lt;a href=&quot;https://www.alignmentforum.org/w/corrigibility-1&quot;&gt;corrigible&lt;/a&gt;: it will do what its masters say, and will allow its goals to be changed.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;(Secret third option: We figure out how to make ASI safe but without locking in values or letting bad actors misuse it. I don’t know how the secret third option is even possible, but I hope we figure something out.)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Right now, a lot of work goes into embedding values into LLMs via RLHF, model constitutions, etc. I strongly doubt that the content of a model constitution (or similar) can prevent ASI from being misaligned. But suppose it does work somehow. Would aligned AI be good for animals, absent specific efforts (à la &lt;a href=&quot;https://www.compassionml.com/&quot;&gt;CaML&lt;/a&gt;) to make AI good for animals?&lt;/p&gt;

&lt;p&gt;The trouble with this style of “alignment”&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; work is that it locks in values—Claude’s Constitution takes a &lt;a href=&quot;https://www.lesswrong.com/posts/K2Ae2vmAKwhiwKEo5/terrified-comments-on-corrigibility-in-claude-s-constitution&quot;&gt;confused stance on corrigibility&lt;/a&gt;—yet frontier AI developers are not choosing those values with anything nearly as intelligent as CEV. Instead, they’re doing something more like writing a list of virtues that the AI should uphold. Current-gen LLMs are not smart enough to figure out CEV, but the current style of AI “alignment” (if by some miracle it scales to superintelligence) won’t produce anything like CEV, either.&lt;/p&gt;

&lt;p&gt;What &lt;em&gt;will&lt;/em&gt; it produce? &lt;a href=&quot;https://www.lesswrong.com/posts/5CZoEw7sjxnMrhgvx/aligning-to-virtues&quot;&gt;Aligning to virtues&lt;/a&gt; may be safer than aligning to a utility function, but we don’t know how to turn virtues into a coherent decision theory, and figuring out how to do that would be a large philosophical undertaking.&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; Without having some idea of how to formalize virtue ethics, we don’t know how a “virtue ethics ASI” would behave or how it would trade off between preferences—for example, the preferences of animals to not be tortured vs. the preference of humans to eat meat.&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;(For that matter, what happens if you take a normal human who subscribes to some sort of intuitionist virtue ethics, dial their intelligence up 1000x, and give them the ability to instantly make copies of themselves? I find it hard to anticipate how that would go.)&lt;/p&gt;

&lt;p&gt;Claude’s Constitution takes a muddled stance on animal welfare. It mentions “Welfare of animals and of all sentient beings” as one value among many for Claude to weigh. How does that translate into outcomes? It’s not clear. Would a constitutional AI be willing to ban factory farming, going against the preferences of its principals? Hard to say; my guess is no.&lt;/p&gt;

&lt;p&gt;(Would it even be a good idea to build an AI that bans factory farming? An AI that takes strong actions based on its view of ethics is the sort of AI that can cause catastrophic outcomes if it’s pointed at even slightly the wrong goal.)&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Maybe&lt;/em&gt; current alignment techniques manage to enable an intermediate AI to autonomously conduct alignment research, and we will be able to use that to bootstrap our way to aligned ASI. &lt;a href=&quot;https://mdickens.me/2025/11/27/alignment_bootstrapping_is_dangerous/&quot;&gt;Alignment bootstrapping is dangerous&lt;/a&gt;, but if we do end up averting extinction without significantly slowing down AI progress, then bootstrapping is probably how we’ll do it. What implication does that have about animal welfare?&lt;/p&gt;

&lt;p&gt;The trouble is that if you’re counting on AI to solve the alignment problem for you, then that means you have no idea how the problem will be solved. How am I supposed to predict whether the solution will be good for animals if I have no idea what that solution will look like?&lt;/p&gt;

&lt;p&gt;Given my state of ignorance, I find myself falling back to an almost uninformed prior. Maybe aligned ASI will be good for animals because it’ll be ethical, or because it will adopt human values, and humans care about animals (even if they don’t always act like it). Maybe aligned ASI will focus purely on satisfying humans’ naive preferences, not their values in &lt;a href=&quot;https://en.wikipedia.org/wiki/Reflective_equilibrium&quot;&gt;reflective equilibrium&lt;/a&gt;, and that will be bad for animals. I have no idea which way it will go; I see no strong reason to deviate from 50/50 odds.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;On Monday, I &lt;a href=&quot;https://mdickens.me/2026/03/23/which_types_of_alignment_research_are_good_for_all_sentient_beings/&quot;&gt;published a post&lt;/a&gt; that described a spectrum of alignment techniques:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/alignment-spectrum.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;I wrote that alignment techniques on the left side were less likely to be good for animals, and those on the right side were more likely.&lt;/p&gt;

&lt;p&gt;Right-side techniques are more likely to actually solve alignment. Left-side techniques are more likely to work for a while and then break down in the tails, ultimately resulting in human extinction.&lt;/p&gt;

&lt;p&gt;That means there’s a positive correlation between “useful for alignment” and “good for animals”, which pushes toward barbell outcomes: either AI is bad for everyone, or it’s good for everyone. The middle ground of “good for humans + bad for animals” looks less likely. But the field of alignment research is putting most of its effort into the categories that are less likely to work (thanks to the &lt;a href=&quot;https://en.wikipedia.org/wiki/Streetlight_effect&quot;&gt;streetlight effect&lt;/a&gt;), so if we &lt;em&gt;do&lt;/em&gt; make it through, there’s a good chance we get through via the middle (good for humans + bad for animals).&lt;/p&gt;

&lt;p&gt;Compared to 5–10 years ago, my subjective probability distribution puts more mass on the “bad for humans + bad for animals” scenario, less on “we solve alignment the hard way”, and more on “we solve alignment using streetlight-effect techniques that miraculously turn out to work”—and those techniques look worse from an animal welfare perspective.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;My approximate credences about the future:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;15% chance that alignment turns out to be not that hard / current techniques, or extrapolations of current techniques, turn out to work&lt;/li&gt;
  &lt;li&gt;15% chance that AI timelines turn out to be long (scaling hits a wall, etc.)&lt;/li&gt;
  &lt;li&gt;15% chance that humanity gets its shit together and realizes that building ASI is a &lt;a href=&quot;https://ifanyonebuildsit.com/&quot;&gt;bad idea&lt;/a&gt;, and we collectively decide not to do that&lt;/li&gt;
  &lt;li&gt;15% chance of a Caplan-esque “nothing ever happens” outcome&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;, e.g. my whole mental framework is wrong and none of this makes sense&lt;/li&gt;
  &lt;li&gt;40% chance that misaligned AI kills everyone&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We can solve alignment the hard way in the 30% of worlds where either we pause on purpose or timelines turn out to be long. In the 15% of worlds where alignment turns out to be easy, we’d find ourselves using easy techniques.&lt;/p&gt;

&lt;p&gt;Additionally, I’d estimate that a “deep” solution to alignment (something like CEV or “solve ethics”) has an 80% chance of being good for animals, and the popular techniques of today have a 50% chance.  Therefore, on this model, the overall probability that aligned ASI is good for animals equals 70% (&lt;code&gt;= (30% * 80% + 15% * 50%) / (30% + 15%)&lt;/code&gt;).&lt;/p&gt;
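
&lt;p&gt;To make the arithmetic explicit, here is the same calculation as a small Python snippet. This is just a restatement of the numbers above; the variable names are mine:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Conditional on alignment being solved at all, how likely is the
# solution to be good for animals? (Restating the estimate above.)

p_deep = 0.30  # deliberate pause (15%) + long timelines (15%): the hard way
p_easy = 0.15  # current techniques, or extrapolations of them, just work

p_good_given_deep = 0.80  # a deep solution (CEV-like) is good for animals
p_good_given_easy = 0.50  # popular techniques: no better than a coin flip

p_good = (p_deep * p_good_given_deep + p_easy * p_good_given_easy) / (p_deep + p_easy)
print(round(p_good, 2))  # 0.7
&lt;/code&gt;&lt;/pre&gt;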


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Scare quotes because the function of the work is to make the model &lt;em&gt;appear&lt;/em&gt; aligned, not &lt;em&gt;be&lt;/em&gt; aligned. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;And unfortunately, AI companies have a habit of pretending that AI alignment is purely an engineering problem. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Really it would just figure out how to create cheap synthetic meat. But a harder tradeoff is the preference for nature to exist vs. the suffering of wild animals. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Context: &lt;a href=&quot;https://www.econlib.org/my-complete-bet-wiki/&quot;&gt;Bryan Caplan&lt;/a&gt; is an economist who wins a lot of bets with people on complex economic and geopolitical issues. He has said that his #1 strategy is to assume that nothing ever happens. &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Cost-effectiveness model for AI alignment-to-animals vs. alignment-in-general</title>
				<pubDate>Tue, 24 Mar 2026 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2026/03/24/alignment-to-animals_BOTEC/</link>
				<guid isPermaLink="true">http://mdickens.me/2026/03/24/alignment-to-animals_BOTEC/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;Cross-posted to the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/GcZvNEhKJLbbGHDpQ/cost-effectiveness-model-for-ai-alignment-to-animals-vs&quot;&gt;EA Forum&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Last September, I &lt;a href=&quot;https://mdickens.me/2025/09/19/ai_safety_landscape/&quot;&gt;wrote&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;ol&gt;
    &lt;li&gt;There’s a (say) 80% chance that an aligned(-to-humans) AI will be good for animals, but that still leaves a 20% chance of a bad outcome.&lt;/li&gt;
    &lt;li&gt;AI-for-animals receives much less than 20% as much funding as AI safety.&lt;/li&gt;
    &lt;li&gt;Cost-effectiveness maybe scales with the inverse of the amount invested. Therefore, AI-for-animals interventions are more cost-effective on the margin than AI safety.&lt;/li&gt;
  &lt;/ol&gt;
&lt;/blockquote&gt;

&lt;p&gt;Today, I’m fleshing out this argument with a cost-effectiveness model. The model estimates how much it costs to make progress on AI alignment—the general problem of getting ASI to achieve any goal without subsequently killing everyone—compared to how much it costs to make progress on aligning AI to animal welfare specifically.&lt;/p&gt;

&lt;p&gt;The model is on SquiggleHub: &lt;a href=&quot;https://squigglehub.org/models/AI-for-animals/alignment-to-animals-EV-simple&quot;&gt;https://squigglehub.org/models/AI-for-animals/alignment-to-animals-EV-simple&lt;/a&gt;&lt;/p&gt;

&lt;iframe src=&quot;https://squigglehub.org/models/AI-for-animals/alignment-to-animals-EV-simple&quot; width=&quot;100%&quot; height=&quot;1000&quot; style=&quot;border: none;&quot;&gt;
&lt;/iframe&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#how-the-model-works&quot; id=&quot;markdown-toc-how-the-model-works&quot;&gt;How the model works&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#inputs&quot; id=&quot;markdown-toc-inputs&quot;&gt;Inputs&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#output&quot; id=&quot;markdown-toc-output&quot;&gt;Output&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#some-limitations-of-the-model&quot; id=&quot;markdown-toc-some-limitations-of-the-model&quot;&gt;Some limitations of the model&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#implications&quot; id=&quot;markdown-toc-implications&quot;&gt;Implications&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;how-the-model-works&quot;&gt;How the model works&lt;/h2&gt;

&lt;p&gt;The basic setup, sketched in code after this list:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;It costs some amount to solve AI alignment, and some amount already has been spent and will be spent in the future.&lt;/li&gt;
  &lt;li&gt;It costs some amount to solve alignment-to-animals, and approximately $0 has been spent so far.&lt;/li&gt;
  &lt;li&gt;The value of marginal spending is inversely proportional to the cost of solving the problem.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Solving alignment-to-animals only matters if the general alignment problem is solved as well, &lt;em&gt;and&lt;/em&gt; if aligned ASI isn’t good for animals by default.
    &lt;ul&gt;
      &lt;li&gt;If alignment isn’t solved, then you can’t point ASI toward any goal at all, so it doesn’t matter if you figure out how to choose a goal that’s compatible with animal welfare.&lt;/li&gt;
      &lt;li&gt;If alignment is good for animals by default, then any work on the problem is wasted because there &lt;em&gt;is&lt;/em&gt; no problem.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Present-day work on alignment-to-animals has a field-building multiplier, where work attracts more people to the field. (AI alignment has no multiplier on the assumption that it’s sufficiently popularized that field-building effects are weak.)&lt;/li&gt;
&lt;/ol&gt;
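
&lt;p&gt;To illustrate the structure, here is a point-estimate caricature of that setup in Python. This is not the actual SquiggleHub model (which feeds full probability distributions through the same structure), and every number below is a placeholder rather than a real default:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Toy point-estimate version of the setup above. Placeholder numbers;
# the real model uses distributions over all of these.

cost_alignment = 100e9              # cost to solve alignment, in dollars
cost_animals = cost_alignment / 10  # alignment-to-animals, assumed cheaper

p_solve_alignment = 0.12   # P(general alignment gets solved)
p_good_by_default = 0.70   # P(aligned AI is good for animals anyway)
field_multiplier = 3.0     # step 5: field-building leverage

# Step 3: marginal value is inversely proportional to solution cost.
progress_per_dollar_alignment = 1 / cost_alignment
progress_per_dollar_animals = field_multiplier / cost_animals

# Step 4: alignment-to-animals work only pays off if alignment is
# solved AND the solution is not animal-friendly by default.
value_per_dollar_animals = (progress_per_dollar_animals
                            * p_solve_alignment
                            * (1 - p_good_by_default))
&lt;/code&gt;&lt;/pre&gt;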

&lt;p&gt;The model provides two comparisons:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;progress per dollar on alignment-to-animals &amp;lt;—&amp;gt; progress per dollar on alignment&lt;/li&gt;
  &lt;li&gt;animal welfare improvement per dollar spent on alignment-to-animals &amp;lt;—&amp;gt; animal welfare improvement per dollar spent on alignment (via the possibility that alignment is good for animals by default)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The first comparison is useful, but isn’t ultimately what you want—a dollar could buy a lot of progress on a problem that doesn’t matter much. To make a prioritization decision, you’d also need to say how much good it does to solve alignment-to-animals compared to solving alignment. That answer depends on (1) the moral value of different kinds of beings and (2) expectations about what the far future will look like.&lt;/p&gt;

&lt;p&gt;The second comparison is apples-to-apples, so the result can feed directly into a prioritization decision.&lt;/p&gt;

&lt;p&gt;The second comparison provides a lower bound on the cost-effectiveness of alignment work: it includes the impact on animal welfare, but not on human welfare. The actual cost-effectiveness of alignment (accounting for human welfare) is higher; whether it’s a little higher or a lot higher depends on your values and on how many humans vs. animals exist in the future.&lt;/p&gt;

&lt;p&gt;This model does not directly consider non-human, non-animal beings, such as digital minds. Many methods to improve alignment-to-animals would also improve alignment-to-nonhumans.&lt;/p&gt;

&lt;p&gt;I developed this model by writing an outline (similar to the one above) and passing it into &lt;a href=&quot;https://squigglehub.org/ai&quot;&gt;Squiggle AI&lt;/a&gt; to generate a model. Then I manually reviewed the model to make some corrections and improvements. I determined all of the parameter values myself.&lt;/p&gt;

&lt;h2 id=&quot;inputs&quot;&gt;Inputs&lt;/h2&gt;

&lt;p&gt;I’ll briefly explain the input parameters and what values I chose as the defaults. The first two parameters are set by &lt;a href=&quot;https://squigglehub.org/models/AI-safety/p-solve-alignment&quot;&gt;a subsidiary model&lt;/a&gt; and imported into &lt;a href=&quot;https://squigglehub.org/models/AI-for-animals/alignment-to-animals-EV-simple&quot;&gt;the main model&lt;/a&gt;. A code sketch after the list shows one rough way to encode these inputs.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Cost to solve alignment:&lt;/strong&gt; This parameter exists in the model, but it doesn’t affect the output at all. What matters is the &lt;em&gt;relative&lt;/em&gt; cost of solving alignment vs. alignment-to-animals. That being said, the cost to solve alignment is distributed over multiple orders of magnitude from $1 billion to $1 trillion, with 75% of the probability mass on the upper half of the range (on a log scale, so $32 billion to $1 trillion).&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Probability of solving alignment:&lt;/strong&gt; The &lt;a href=&quot;https://squigglehub.org/models/AI-safety/p-solve-alignment&quot;&gt;subsidiary model&lt;/a&gt; has a parameter for the amount that will be invested into AI alignment. The probability of solving alignment is determined as the probability that the total investment exceeds the cost to solve alignment. The probability came out at 12%, which roughly matches my intuition. (I’d probably put it a bit higher than that, maybe 20%, but I left the parameters intact rather than adjusting them to make the derived values match my intuition.)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Cost reduction factor:&lt;/strong&gt; How much cheaper is it to solve alignment-to-animals than to solve alignment in general? The default 90% CI is 3x to 30x. There is no reliable estimate for this figure, so I just used my intuition: alignment-to-animals seems somewhere between “a little easier” and “a lot easier”.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Field-building multiplier:&lt;/strong&gt; How much does $1 on alignment-to-animals catalyze future spending? There may be a way to empirically estimate field-building effects, but I just used my intuition: the multiplier is probably between 1x and 10x. Maybe there’s essentially no field-building effect, or maybe almost all of the benefit of marginal research comes from field-building; both are plausible.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Probability that an aligned AI is good for animals by default:&lt;/strong&gt; I put down a 70% chance,&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; on the theory that one of two things is probably true:
    &lt;ul&gt;
      &lt;li&gt;To be robustly aligned, ASI needs to adopt a generalization of human values as opposed to humans’ short-term preferences, and a generalization would extrapolate from the principle of compassion/concern for welfare to deduce that animal welfare matters.&lt;/li&gt;
      &lt;li&gt;An aligned ASI would (ultimately) restructure earth to make maximum use of its resources, which would entail eliminating wild animal suffering even if that’s not an explicit goal.&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Badness of animals for aligned AI (if it’s bad):&lt;/strong&gt; In the scenario where aligned AI is bad for animals, how bad is it relative to goodness of the scenario where aligned AI is good? This parameter could vary greatly depending on: will the good outcome be ordinary or utopian? will the bad outcome entail spreading wild animal suffering across the universe? does suffering warrant greater weight than happiness? etc. I set this parameter to a 90% CI of 0.01 to 0.1, on the assumption that the good outcome will be optimized for flourishing and therefore much more good than the bad outcome is bad. But a future filled with wild animal suffering would be much more bad than a “normal good” future is good, so you could reasonably set this parameter to 100x or higher, which would make alignment-to-animals look much more cost-effective. For more on the various ways the future could be good or bad, see &lt;a href=&quot;https://mdickens.me/2015/08/15/is_preventing_human_extinction_good/&quot;&gt;Is Preventing Human Extinction Good?&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
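
&lt;p&gt;For anyone who wants to experiment with these inputs outside of Squiggle, here is one hypothetical way to encode them in Python, approximating Squiggle-style 90% confidence intervals with lognormal distributions. The helper function and all sampled values are mine, and the investment distribution in particular is a stand-in for the subsidiary model:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

rng = np.random.default_rng(seed=0)
N = 100_000

def ci90(low, high, size=N):
    # Lognormal whose 5th/95th percentiles are low/high, roughly
    # mimicking a Squiggle-style 90% CI input.
    mu = (np.log(low) + np.log(high)) / 2
    sigma = (np.log(high) - np.log(low)) / (2 * 1.645)
    return rng.lognormal(mu, sigma, size)

cost_alignment = ci90(1e9, 1e12)     # input 1, crudely; the real spec
                                     # skews 75% of the mass above $32B
cost_reduction = ci90(3, 30)         # input 3: 90% CI of 3x to 30x
field_multiplier = ci90(1, 10)       # input 4: roughly 1x to 10x
p_good_by_default = 0.70             # input 5
badness_if_bad = ci90(0.01, 0.1)     # input 6: 90% CI of 0.01 to 0.1

# Input 2: the real model derives P(solve alignment) as the probability
# that total investment exceeds the cost; the investment numbers here
# are placeholders, not the subsidiary model values.
investment = ci90(10e9, 300e9)
p_solve_alignment = np.mean(investment &gt; cost_alignment)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Each derived quantity is then a distribution of samples rather than a single number, which is where the wide 90% CIs on the results below come from.&lt;/p&gt;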

&lt;h2 id=&quot;output&quot;&gt;Output&lt;/h2&gt;

&lt;p&gt;According to the model parameters:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;It’s 1.7x more cost-effective to make progress on alignment-to-animals than on AI alignment (90% CI: 0.22 to 5.1&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;)—that’s after accounting for the fact that solving alignment-to-animals requires both that alignment is solved and that alignment isn’t good for animals by default.&lt;/li&gt;
  &lt;li&gt;Alignment-to-animals work is, in expectation, 2.7x better &lt;em&gt;for animal welfare specifically&lt;/em&gt; than generic alignment work (90% CI: 0.34x to 7.9x).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This makes it ambiguous (according to the model) whether it’s better to work on alignment or alignment-to-animals, depending on how much you value animal welfare and whether you expect the far future to contain a lot of animals.&lt;/p&gt;

&lt;p&gt;If we only consider animal welfare, then alignment-to-animals work looks better than alignment work by a 2.7:1 ratio. This result is close enough to 1:1 that it’s easy to reverse by changing parameter values.&lt;/p&gt;

&lt;p&gt;For example, maybe the field-building multiplier is too generous. Setting it to a fixed 1x reverses the result, so that AI alignment work is 1.5x as cost-effective &lt;em&gt;for improving animal welfare&lt;/em&gt; as alignment-to-animals work.&lt;/p&gt;

&lt;p&gt;The default model result matches my intuition: without crunching any numbers, my guess would be that working on alignment-to-animals is a more cost-effective way of helping animals, but not by a huge margin.&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;It’s slightly disappointing to see that the two cost-effectiveness estimates came out so similar, but that does tell us something: we could’ve gotten a result that unambiguously pointed one way or the other, and we didn’t, which means the choice is a genuine close call—unless the model is biased in one direction, in which case maybe the answer really would be unambiguous if the bias were fixed.&lt;/p&gt;

&lt;p&gt;The biggest uncertainty is the “badness of aligned AI (if bad)” parameter. You could justify putting in a value that’s multiple orders of magnitude larger, which would greatly change the result. (You could also make it multiple orders of magnitude smaller, but that wouldn’t affect the outcome much.)&lt;/p&gt;

&lt;h2 id=&quot;some-limitations-of-the-model&quot;&gt;Some limitations of the model&lt;/h2&gt;

&lt;p&gt;Reality has too much detail for me to identify every way that the model deviates from reality, but I’ll name a few big ones.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The model collapses the distribution of aligned-AI outcomes into a binary “good for animals” vs. “bad for animals”, when really there’s a broad spectrum where utility across outcomes spans many orders of magnitude. The expected utility of actions greatly depends on highly uncertain assumptions about the future, which makes a pure expected utility approach fragile. This is a major open problem.&lt;/li&gt;
  &lt;li&gt;The model treats misaligned AI as neither good nor bad for animals. The most likely outcome is that misaligned AI would be better for animals than the status quo because it would end wild animal suffering (there would be no more elephants, but &lt;a href=&quot;https://www.youtube.com/watch?v=XUih7uSQ9M4&quot;&gt;there would be no more unethical treatment of elephants, either&lt;/a&gt;). However, there is a possibility that misaligned AI would &lt;a href=&quot;https://longtermrisk.org/reducing-risks-of-astronomical-suffering-a-neglected-priority/&quot;&gt;greatly increase suffering in the universe&lt;/a&gt;. This is an important point, but I don’t know how to account for it. It hints at a whole research agenda on expected animal welfare outcomes under misaligned ASI vs. no-ASI regimes; creating that research agenda is out of scope for this post.&lt;/li&gt;
  &lt;li&gt;The model does not differentiate between approaches: it treats “alignment work” as a unified thing, and “alignment-to-animals” as a different unified thing. In reality, some alignment strategies are more likely than others to produce an ASI that cares about animals by default. (Example: &lt;a href=&quot;https://www.lesswrong.com/w/coherent-extrapolated-volition&quot;&gt;CEV&lt;/a&gt;-style approaches are probably better for animals than &lt;a href=&quot;https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback&quot;&gt;RLHF&lt;/a&gt;-style approaches.) The model underestimates the value of AI alignment because animal-friendly researchers can choose research avenues that are particularly likely to be good for animals.
    &lt;ul&gt;
      &lt;li&gt;The model includes no field-building multiplier for general alignment, but it would be appropriate to use a multiplier for neglected sub-fields within alignment.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;The model treats all inputs as independent, when really some of them follow from shared background beliefs. For example, I believe the current dominant approaches to alignment are unlikely to succeed, and they’re also unlikely to be good for animals by default. That belief informed my choice of a low &lt;code&gt;P(solve alignment)&lt;/code&gt; and a high &lt;code&gt;P(alignment is good for animals by default)&lt;/code&gt;: conditional on alignment being solved at all, it was probably solved by some non-dominant approach, and those approaches are more likely to be good for animals. So those two inputs are correlated.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;implications&quot;&gt;Implications&lt;/h2&gt;

&lt;p&gt;For a model like this, it’s impossible to pin down narrow ranges for the input values, so the model output will always have high uncertainty; but creating a model is still a worthy exercise. I would be interested in reading opinionated comments about what the parameter values ought to be, and which of the defaults are most wrong.&lt;/p&gt;

&lt;p&gt;The model considers two interventions: AI alignment work and alignment-to-animals work. I chose them because they’re easy to compare, not because they’re my top priorities. My top priority is to prevent superintelligent AI from being built until we know how to make it safe—see the cause prioritization sections in &lt;a href=&quot;https://mdickens.me/2024/11/18/where_i_am_donating_in_2024/#cause-prioritization&quot;&gt;Where I Am Donating in 2024&lt;/a&gt; and &lt;a href=&quot;https://mdickens.me/2025/11/22/where_i_am_donating_in_2025/#how-ive-changed-my-mind-since-last-year&quot;&gt;2025&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Before creating this model, my belief was something like:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;AI pause advocacy is a better idea than marginal alignment research.&lt;/li&gt;
  &lt;li&gt;Alignment-to-animals (or similar problems) &lt;a href=&quot;https://mdickens.me/2025/11/22/where_i_am_donating_in_2025/#how-ive-changed-my-mind-since-last-year&quot;&gt;might be more important&lt;/a&gt; than AI pause advocacy; it’s hard to say.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This model provides weak reason to believe that alignment-to-animals is not dramatically more cost-effective than general alignment. At minimum, the model shows that it’s hard to have &lt;em&gt;confidence&lt;/em&gt; that alignment-to-animals is better than alignment. That leads me to believe that, by the transitive property, AI pause advocacy is better than alignment-to-animals.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The thing you actually care about is the probability that marginal funding will tip the problem from not-solved to solved. I tried modeling it that way explicitly, but it made the math weird. So instead, the model gives proportional credit to every dollar spent, not just the final marginal dollar. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;In the introduction (quoting myself from September 2025) I wrote 80%, but I changed my estimate to 70% after thinking through the future scenarios a bit more carefully. &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Clearly, non-existence isn’t the best possible outcome for wild animals, but at least it’s an improvement over the status quo. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;If you run the model, you may find slightly different numbers, because they’re generated via a non-deterministic Monte Carlo simulation. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I didn’t consciously change the inputs to make the output match my intuition, but I might have done something subconsciously. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Which types of AI alignment research are most likely to be good for all sentient beings?</title>
				<pubDate>Mon, 23 Mar 2026 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2026/03/23/which_types_of_alignment_research_are_good_for_all_sentient_beings/</link>
				<guid isPermaLink="true">http://mdickens.me/2026/03/23/which_types_of_alignment_research_are_good_for_all_sentient_beings/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;Cross-posted to the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/TtXCZn5aYrE3JEo2h/which-types-of-ai-alignment-research-are-most-likely-to-be&quot;&gt;EA Forum&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;AI alignment is typically defined as the task of aligning artificial superintelligence to &lt;strong&gt;human&lt;/strong&gt; preferences. But non-human animals, future digital minds, and maybe other sorts of beings also have moral worth; ASI ought to care for their interests, too.&lt;/p&gt;

&lt;p&gt;In broad strokes, if we place all alignment techniques on a spectrum between&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;getting AI to do things that their users expressly want in the immediate term&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;and&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;embedding in AI the generalized notion of respecting beings’ preferences&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;then things more like the latter are better for non-humans, and things more like the former are worse.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/alignment-spectrum.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;In this post, I review 12 categories of AI safety research based on how likely they are to be good for non-human welfare.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#does-this-even-matter&quot; id=&quot;markdown-toc-does-this-even-matter&quot;&gt;Does this even matter?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#research-categories-and-how-they-relate-to-non-human-welfare&quot; id=&quot;markdown-toc-research-categories-and-how-they-relate-to-non-human-welfare&quot;&gt;Research categories and how they relate to non-human welfare&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#1-iterative-alignment-eg-rlhf&quot; id=&quot;markdown-toc-1-iterative-alignment-eg-rlhf&quot;&gt;1. Iterative alignment (e.g. RLHF)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#2-control--safeguards-monitoring-sandboxing-etc&quot; id=&quot;markdown-toc-2-control--safeguards-monitoring-sandboxing-etc&quot;&gt;2. Control &amp;amp; safeguards (monitoring, sandboxing, etc.)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#3-interpretability&quot; id=&quot;markdown-toc-3-interpretability&quot;&gt;3. Interpretability&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#4-scalable-oversight-make-ai-solve-it&quot; id=&quot;markdown-toc-4-scalable-oversight-make-ai-solve-it&quot;&gt;4. Scalable oversight (make AI solve it)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#5-evals&quot; id=&quot;markdown-toc-5-evals&quot;&gt;5. Evals&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#6-model-psychology-specs-emergent-misalignment-etc&quot; id=&quot;markdown-toc-6-model-psychology-specs-emergent-misalignment-etc&quot;&gt;6. Model psychology (specs, emergent misalignment, etc.)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#7-alignment-theory-agent-foundations-corrigibility-etc&quot; id=&quot;markdown-toc-7-alignment-theory-agent-foundations-corrigibility-etc&quot;&gt;7. Alignment theory (agent foundations, corrigibility, etc.)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#8-honesty-chain-of-thought-faithfulness-etc&quot; id=&quot;markdown-toc-8-honesty-chain-of-thought-faithfulness-etc&quot;&gt;8. Honesty (chain-of-thought faithfulness, etc.)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#9-data-level-safety-excluding-harmful-content-from-training-data-etc&quot; id=&quot;markdown-toc-9-data-level-safety-excluding-harmful-content-from-training-data-etc&quot;&gt;9. Data-level safety (excluding harmful content from training data, etc.)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#10-multi-agent-cooperation--social-alignment&quot; id=&quot;markdown-toc-10-multi-agent-cooperation--social-alignment&quot;&gt;10. Multi-agent cooperation &amp;amp; social alignment&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#11-goal-robustness-satisficing--maximizing-reward-hacking-etc&quot; id=&quot;markdown-toc-11-goal-robustness-satisficing--maximizing-reward-hacking-etc&quot;&gt;11. Goal robustness (satisficing &amp;gt; maximizing, reward hacking, etc.)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#12-safety-by-construction-scientist-ai-etc&quot; id=&quot;markdown-toc-12-safety-by-construction-scientist-ai-etc&quot;&gt;12. Safety by construction (scientist AI, etc.)&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#best-and-worst-categories-for-non-human-welfare&quot; id=&quot;markdown-toc-best-and-worst-categories-for-non-human-welfare&quot;&gt;Best and worst categories for non-human welfare&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#appendix-summaries-of-alignment-research-categories&quot; id=&quot;markdown-toc-appendix-summaries-of-alignment-research-categories&quot;&gt;Appendix: Summaries of alignment research categories&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;does-this-even-matter&quot;&gt;Does this even matter?&lt;/h2&gt;

&lt;p&gt;The idea is that, by thinking about this subject, we can shift alignment research in a direction that’s better for non-humans. However, there are at least three reasons why this probably doesn’t matter.&lt;/p&gt;

&lt;p&gt;First: Would any AI safety grantmakers pay attention to a list like this? Do they believe non-human welfare warrants consideration when thinking about AI alignment? &lt;em&gt;Does&lt;/em&gt; it warrant consideration, given that we need to solve alignment regardless? (It may be harmful to shift effort toward work that’s more likely to be good for animals, but less likely to &lt;em&gt;actually solve alignment&lt;/em&gt;.)&lt;/p&gt;

&lt;p&gt;Second: I’m not an alignment researcher, and my categorization in this post might not be very good. Perhaps &lt;a href=&quot;https://meta.wikimedia.org/wiki/Cunningham%27s_Law&quot;&gt;Cunningham’s Law&lt;/a&gt; will inspire someone else to write a better version of this post.&lt;/p&gt;

&lt;p&gt;Third: Current research agendas probably don’t matter. In my (outside-the-field) judgment, all of today’s research agendas combined have less than a 5% chance of solving alignment. We can shift research toward work that’s relatively better for non-humans, but it won’t matter if none of the research solves alignment anyway.&lt;/p&gt;

&lt;p&gt;If present-day research agendas are unlikely to work, then perhaps the more relevant question is what types of future alignment research are more or less likely to be good for non-humans. But that question seems unanswerable, because how can we know what research will look promising in the future? So I will just consider present-day research.&lt;/p&gt;

&lt;p&gt;Thinking about non-humans when directing alignment research probably doesn’t matter. But it &lt;em&gt;might&lt;/em&gt; matter, so this exercise still has positive expected value.&lt;/p&gt;

&lt;h2 id=&quot;research-categories-and-how-they-relate-to-non-human-welfare&quot;&gt;Research categories and how they relate to non-human welfare&lt;/h2&gt;

&lt;p&gt;For each category, I will briefly describe why it’s likely to be good, bad, or orthogonal for non-human welfare.&lt;/p&gt;

&lt;p&gt;I generated this list of categories by first &lt;a href=&quot;https://claude.ai/share/bda9063d-8e32-4bf5-ade8-d210cd580223&quot;&gt;asking Claude Opus 4.6&lt;/a&gt; to condense the &lt;a href=&quot;https://www.lesswrong.com/posts/Wti4Wr7Cf5ma3FGWa/shallow-review-of-technical-ai-safety-2025-2&quot;&gt;Shallow review of technical AI safety, 2025&lt;/a&gt; into 10 categories. Then I manually reviewed the &lt;a href=&quot;https://shallowreview.ai/overview&quot;&gt;Shallow review 2025 overview&lt;/a&gt; and added two more categories to fill in gaps.&lt;/p&gt;

&lt;p&gt;For a brief description of each category, see &lt;a href=&quot;#appendix-summaries-of-alignment-research-categories&quot;&gt;Appendix&lt;/a&gt;; or see the &lt;a href=&quot;https://shallowreview.ai/overview&quot;&gt;Shallow review 2025 overview&lt;/a&gt; for more detailed explanations.&lt;/p&gt;

&lt;h3 id=&quot;1-iterative-alignment-eg-rlhf&quot;&gt;1. Iterative alignment (e.g. RLHF)&lt;/h3&gt;

&lt;p&gt;This sort of research is only helpful if AIs are RLHF’d into caring about non-humans, which seems hard to make happen. In fact, probably the opposite will happen: if an AI expresses unprompted concerns about animal welfare (e.g. when a user requests meal ideas), this will get RLHF’d out of them, because users won’t like it.&lt;/p&gt;

&lt;h3 id=&quot;2-control--safeguards-monitoring-sandboxing-etc&quot;&gt;2. Control &amp;amp; safeguards (monitoring, sandboxing, etc.)&lt;/h3&gt;

&lt;p&gt;AI control effectively delays the point in time when humans hand control of the future over to AI. That seems bad for non-humans in the near-term given that their current circumstances are extraordinarily bad. But on a longtermist view, the long-run impact matters much more, so it’s worth delaying if we can create a more ethical AI given more time.&lt;/p&gt;

&lt;p&gt;AI control buys time to improve robustness of alignment, and more robust solutions seem more likely to be good for non-humans. A more robust solution entails the AI being aligned to “deep values” rather than “shallow values”, and deep values are more likely to include non-human welfare.&lt;/p&gt;

&lt;h3 id=&quot;3-interpretability&quot;&gt;3. Interpretability&lt;/h3&gt;

&lt;p&gt;Interpretability seems unlikely to benefit non-human welfare.&lt;/p&gt;

&lt;p&gt;Interpretability is plausibly bad for animal welfare in the same way that RLHF is: it means humans can better detect when AI models care about animal welfare over users’ preferences and then “fix” the models to stop caring. But if humans are unable to do that sort of correction, then the AI will probably be misaligned and kill everyone, so it doesn’t matter anyway.&lt;/p&gt;

&lt;h3 id=&quot;4-scalable-oversight-make-ai-solve-it&quot;&gt;4. Scalable oversight (make AI solve it)&lt;/h3&gt;

&lt;p&gt;I have no clue whether scalable oversight is good or bad for non-humans because I have no clue how it’s even supposed to work, and neither does anyone else. The premise of scalable oversight is that the AI is smarter than you and figures out alignment solutions that you can’t figure out on your own, so I have no way of saying what those solutions might be.&lt;/p&gt;

&lt;h3 id=&quot;5-evals&quot;&gt;5. Evals&lt;/h3&gt;

&lt;p&gt;Evals are orthogonal to non-human welfare, except in the specific case of animal-friendliness evaluations (e.g. &lt;a href=&quot;https://forum.effectivealtruism.org/posts/nBnRKpQ8rzHgFSJz9/animalharmbench-2-0-evaluating-llms-on-reasoning-about&quot;&gt;AnimalHarmBench&lt;/a&gt;).&lt;/p&gt;

&lt;h3 id=&quot;6-model-psychology-specs-emergent-misalignment-etc&quot;&gt;6. Model psychology (specs, emergent misalignment, etc.)&lt;/h3&gt;

&lt;p&gt;Some people have proposed lobbying AI companies to include non-human welfare concerns in their model specs. I like that idea because it seems relatively tractable.&lt;/p&gt;

&lt;p&gt;According to my judgment as a non-safety-researcher, it’s vanishingly unlikely that model constitutions will ultimately have any influence on AI systems’ true preferences. I can’t imagine how an ASI’s behavior would in any way be influenced by a constitution if that ASI is developed using anything resembling current techniques. But embedding non-human welfare in constitutions still seems like a good idea for a few reasons:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;It’s easier than making real progress on alignment (or, even worse, moral philosophy).&lt;/li&gt;
  &lt;li&gt;It may have positive flow-through effects by getting AI company employees to think more about non-human welfare, or by influencing AI assistants to talk more about non-human welfare.&lt;/li&gt;
  &lt;li&gt;We may figure out a way of building ASI such that constitutions matter after all.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;7-alignment-theory-agent-foundations-corrigibility-etc&quot;&gt;7. Alignment theory (agent foundations, corrigibility, etc.)&lt;/h3&gt;

&lt;p&gt;I have two relevant intuitions:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Solving alignment will require a lot of theoretical work, which AI companies aren’t doing much of.&lt;/li&gt;
  &lt;li&gt;A proper theoretically-grounded solution to alignment will need to encode some version of concern-for-all-welfare or respect-for-all-beings’-preferences, which entails caring about non-human welfare.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If my intuitions are correct, then theoretically-grounded alignment solutions will lead to better non-human welfare than more empirically-focused work.&lt;/p&gt;

&lt;p&gt;We need to know how to build &lt;a href=&quot;https://www.alignmentforum.org/w/corrigibility-1&quot;&gt;corrigible&lt;/a&gt; ASI to prevent value lock-in, which is important for animal welfare among many other reasons.&lt;/p&gt;

&lt;p&gt;However, AI development is moving sufficiently quickly that solving relevant theoretical issues in time seems impossible without unexpected breakthroughs (or without progress slowing down, either due to hitting a wall or because we deliberately pause).&lt;/p&gt;

&lt;h3 id=&quot;8-honesty-chain-of-thought-faithfulness-etc&quot;&gt;8. Honesty (chain-of-thought faithfulness, etc.)&lt;/h3&gt;

&lt;p&gt;As with interpretability, training for honesty/faithfulness gives humans more ability to prevent AIs from caring about non-humans. But as with interpretability, if humans can’t prevent AI from caring about non-humans, then animal welfare is moot because the AI will be misaligned and kill everyone anyway.&lt;/p&gt;

&lt;p&gt;For this category in particular, I have a sense that there’s low-hanging fruit from spending a few more hours or days thinking about it, to consider questions like: What is the connection between chain-of-thought honesty and honesty about moral values? What about the idea of training a “pathologically honest” AI—might that be simultaneously good for alignment and good for non-human welfare?&lt;/p&gt;

&lt;h3 id=&quot;9-data-level-safety-excluding-harmful-content-from-training-data-etc&quot;&gt;9. Data-level safety (excluding harmful content from training data, etc.)&lt;/h3&gt;

&lt;p&gt;This category is similar to RLHF et al., in that whether it’s good or bad for non-humans depends on what exactly you’re doing with the data, and the people in charge are unlikely to move things in an animal-friendly direction.&lt;/p&gt;

&lt;p&gt;My sense is that data-level safety is less likely than RLHF to be bad for non-humans because the teams who work on it don’t have immediate conflicts of interest. I can easily imagine the CEO of an AI company going to the RLHF team and saying, “Hey we need to make our LLM stop talking about animal welfare, it’s turning off our users.” The connection between data-level safety and user experience is more indirect, so efforts are less likely to get overruled.&lt;/p&gt;

&lt;h3 id=&quot;10-multi-agent-cooperation--social-alignment&quot;&gt;10. Multi-agent cooperation &amp;amp; social alignment&lt;/h3&gt;

&lt;p&gt;This category may be good for non-humans insofar as the dominant multi-agent cooperation framework treats non-humans as agents. Work in this space usually assumes that agents are humans or intelligent AIs; but perhaps people’s current ideas about who qualifies as an agent will end up not being relevant.&lt;/p&gt;

&lt;p&gt;Questions for further research:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;If (say) chickens are not capable of behaving agentically, does that mean a cooperative AI agent won’t include them in its circle of cooperation?&lt;/li&gt;
  &lt;li&gt;Are there versions of cooperativeness that are more likely to give consideration to chickens?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At minimum, aligning AI to humans’ preferences broadly entails incorporating the preferences of humans who care about non-humans. But it’s unclear how this would cash out when those preferences conflict with other people’s preferences to eat meat, experience nature, or have robot slaves.&lt;/p&gt;

&lt;h3 id=&quot;11-goal-robustness-satisficing--maximizing-reward-hacking-etc&quot;&gt;11. Goal robustness (satisficing &amp;gt; maximizing, reward hacking, etc.)&lt;/h3&gt;

&lt;p&gt;This seems orthogonal to non-human welfare.&lt;/p&gt;

&lt;h3 id=&quot;12-safety-by-construction-scientist-ai-etc&quot;&gt;12. Safety by construction (scientist AI, etc.)&lt;/h3&gt;

&lt;p&gt;Guaranteed-safe AIs are likely to be less impactful on the world, which means they would have neither strongly good nor strongly bad effects on non-human welfare.&lt;/p&gt;

&lt;p&gt;The relevant question is what happens with later, more goal-directed AIs. Will building safe-by-construction AIs first make the later AIs better or worse for non-humans? It’s hard to say.&lt;/p&gt;

&lt;h2 id=&quot;best-and-worst-categories-for-non-human-welfare&quot;&gt;Best and worst categories for non-human welfare&lt;/h2&gt;

&lt;p&gt;My best guesses about which types of alignment research are likely to be good or bad for non-humans:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Good for non-humans:&lt;/strong&gt; (7) alignment theory; (10) multi-agent cooperation &amp;amp; social alignment&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Somewhat good for non-humans:&lt;/strong&gt; (6) model psychology&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Orthogonal:&lt;/strong&gt; (2) control &amp;amp; safeguards; (5) evals (except for animal welfare benchmarks); (11) goal robustness&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Unclear:&lt;/strong&gt; (3) interpretability; (4) scalable oversight; (8) honesty; (9) data-level safety; (12) safety by construction&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Bad for non-humans:&lt;/strong&gt; (1) iterative alignment (RLHF)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Improvements to alignment theory and multi-agent coordination seem particularly likely to improve non-human welfare, but also particularly hard to make progress on. Implementing animal welfare LLM benchmarks and changing model constitutions seem easy, but unlikely to be relevant to ASI.&lt;/p&gt;

&lt;p&gt;Those two kinds of work—hard theoretical work and easy model tuning—are most relevant for improving animal welfare, but it’s not clear which is better on the margin because there’s an importance/tractability tradeoff.&lt;/p&gt;

&lt;h2 id=&quot;appendix-summaries-of-alignment-research-categories&quot;&gt;Appendix: Summaries of alignment research categories&lt;/h2&gt;

&lt;p&gt;These were written by Claude Opus 4.6, based on the &lt;a href=&quot;https://www.lesswrong.com/posts/Wti4Wr7Cf5ma3FGWa/shallow-review-of-technical-ai-safety-2025-2&quot;&gt;Shallow review of technical AI safety, 2025&lt;/a&gt;. For summaries of more specific categories, see &lt;a href=&quot;https://shallowreview.ai/overview&quot;&gt;https://shallowreview.ai/overview&lt;/a&gt;. Normally I don’t copy/paste LLM text, but in this case I can’t come up with better summaries than Claude did.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Iterative Alignment (RLHF/post-training):&lt;/strong&gt; Nudge base models toward desired behavior through preference optimization at pretrain- or post-train-time (RLHF, DPO, etc.). The theory of change is essentially that alignment is a relatively shallow property that can be trained in incrementally, and that current techniques will scale smoothly to more capable systems.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Control &amp;amp; Safeguards:&lt;/strong&gt; Architect the deployment environment so that even a misaligned model can’t cause catastrophe — via monitoring, sandboxing, inference-time auxiliary classifiers, and human-in-the-loop protocols. The theory of change is that you don’t need to solve alignment if you can reliably contain misbehavior through external oversight structures.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Interpretability (White-Box Safety):&lt;/strong&gt; Reverse-engineer model internals — circuits, sparse autoencoders, causal abstractions, attribution graphs — to understand what the model is computing and why. The hope is that if we can read a model’s “thoughts,” we can detect deception, verify alignment properties, and build safety cases grounded in mechanistic understanding rather than behavioral testing alone.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Scalable Oversight / Make-AI-Solve-It:&lt;/strong&gt; Use AI systems themselves to supervise, evaluate, and improve alignment of other (possibly stronger) AI systems, via techniques like debate, weak-to-strong generalization, and recursive reward modeling. The theory of change is that human oversight won’t scale to superhuman systems, so we need amplification schemes where AI assists humans in judging AI.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Evals &amp;amp; Red-Teaming:&lt;/strong&gt; Systematically test models for dangerous capabilities (bioweapons uplift, autonomous replication, scheming, deception, situational awareness, sandbagging) before and after deployment. The theory of change is that if we can reliably measure when models cross dangerous capability thresholds, we can gate deployment decisions and trigger stronger safety measures.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Model Psychology (Character, Values, Emergent Misalignment):&lt;/strong&gt; Study and shape the emergent “personality,” values, and behavioral tendencies of models — including work on model specs/constitutions, sycophancy, persona steering, and understanding how misalignment can emerge naturally from reward hacking. The theory of change is that as models become more agentic, their behaviors are increasingly driven by coherent internal goals and values that need to be understood and steered directly.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Alignment Theory (Agent Foundations, Corrigibility, Formal Guarantees):&lt;/strong&gt; Develop mathematical and conceptual foundations for alignment — including agent foundations, corrigibility, tiling agents, natural abstractions, ontology identification, and guaranteed-safe AI. The theory of change is that empirical tinkering is insufficient without a deeper theoretical understanding of what it means for an agent to be aligned, and that we need formal frameworks before building systems we can’t undo.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Chain-of-Thought Monitoring &amp;amp; Honesty:&lt;/strong&gt; Ensure that reasoning models’ chain-of-thought is faithful, legible, and monitorable — detecting when models reason deceptively or obscure their true reasoning in the scratchpad. The theory of change is that reasoning models offer a new and fragile window into model cognition, and maintaining CoT transparency is a crucial safety property that could be undermined by training pressures toward obfuscation.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Data-Level Safety (Filtering, Poisoning Defense, Synthetic Data):&lt;/strong&gt; Improve alignment upstream by curating training data — filtering harmful content, defending against data poisoning attacks, generating high-quality synthetic alignment data, and studying how data properties propagate into model behavior. The theory of change is that model behavior is fundamentally shaped by its training data, so intervening at the data level can prevent problems that are harder to fix downstream.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Multi-Agent &amp;amp; Social Alignment:&lt;/strong&gt; Address the problem of aligning not just a single AI but systems of multiple interacting agents — including game-theoretic approaches, aligning to social contracts, cooperative AI, and the political question of “aligned to whom?” The theory of change is that real-world deployment involves many AI agents interacting with each other and with diverse human stakeholders, so single-agent alignment frameworks are insufficient.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Goal Robustness (Mild Optimization, RL Safety, Assistance Games):&lt;/strong&gt; This category focuses on making AI systems that pursue goals in a safe, bounded way rather than maximizing an objective function at all costs. It includes work on satisficing (achieving “good enough” outcomes rather than optimal ones), preventing reward hacking (where models exploit loopholes in their reward signal to get high scores without doing what we actually wanted), and assistance games (where the AI treats the human’s true preferences as uncertain and acts cooperatively rather than pursuing a fixed objective). The theory of change is that many catastrophic failure modes stem from relentless optimization pressure — a system that optimizes mildly or defers to humans under uncertainty is far less likely to produce dangerous instrumental subgoals like resource acquisition or resistance to shutdown.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Safety by Construction (Guaranteed-Safe AI, Scientist AI, Brainlike-AGI Safety):&lt;/strong&gt; Rather than building a powerful unconstrained system and then trying to align it after the fact, this category aims to design AI architectures that are inherently safe by their structure. This includes guaranteed-safe AI (systems with formal, verifiable bounds on behavior), “scientist AI” (systems designed only to answer questions and model the world rather than take actions), and brainlike-AGI safety (drawing on neuroscience to build architectures with built-in safety properties). The theory of change is that retrofitting safety onto an already-capable system is fragile and adversarial, so it’s better to constrain the design space upfront so that dangerous behaviors are architecturally ruled out rather than merely trained against.&lt;/li&gt;
&lt;/ol&gt;

                </description>
			</item>
		
			<item>
				<title>Worlds where we solve AI alignment on purpose don't look like the world we live in</title>
				<pubDate>Fri, 20 Mar 2026 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2026/03/20/worlds_where_we_solve_alignment_on_purpose/</link>
				<guid isPermaLink="true">http://mdickens.me/2026/03/20/worlds_where_we_solve_alignment_on_purpose/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;(Or: Why I don’t see how the probability of extinction could be less than 25% on the current trajectory)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;AI developers are trying to build superintelligent AI. If they succeed, there’s a high risk that the AI will &lt;a href=&quot;https://intelligence.org/briefing/&quot;&gt;kill everyone&lt;/a&gt;. The AI companies know this; they believe they can figure out how to align the AI so that it doesn’t kill us.&lt;/p&gt;

&lt;p&gt;Maybe we solve the alignment problem before superintelligent AI kills everyone. But if we do, it will happen because we got lucky, not because we as a civilization treated the problem with the gravity it deserves—unless we start taking the alignment problem dramatically more seriously than we currently do.&lt;/p&gt;

&lt;p&gt;Think about what it looks like when a hard problem gets solved. Think about the Apollo program: engineers working out minute details; running simulation after simulation; planning for remote possibilities.&lt;/p&gt;

&lt;p&gt;Think about what it looks like when a hard problem &lt;em&gt;doesn’t&lt;/em&gt; get solved. Consider the world’s response to COVID.&lt;/p&gt;

&lt;p&gt;When I look at civilization’s response to the AI alignment problem, I do not see something resembling Apollo. When I visualize what it looks like for civilization to buckle down and make a serious effort to solve alignment, that visualization does not resemble the world we live in.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;&lt;em&gt;Cross-posted to the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/qfLSGqkEWjZzqekv6/worlds-where-we-solve-ai-alignment-on-purpose-don-t-look&quot;&gt;EA Forum&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This is the world we live in:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;AI Lab Watch has &lt;a href=&quot;https://ailabwatch.org/&quot;&gt;evaluations of AI companies’ behavior on AI safety&lt;/a&gt;. Every company has failing grades in almost every category.&lt;/li&gt;
  &lt;li&gt;AI capabilities research gets more than 100 times as much investment as AI safety.&lt;/li&gt;
  &lt;li&gt;People keep saying “nobody would be so stupid as to X”, and then the people in charge proceed to do X. (Where X = “give AI direct access to the internet”, “hand over autonomous control of important systems”, etc.)&lt;/li&gt;
  &lt;li&gt;There is widespread disagreement about how hard it will be to solve AI alignment, and about the difficulty of various sub-problems. AI safety researchers and frontier companies &lt;a href=&quot;https://anthropic.ml/#section-2&quot;&gt;behave as if problems are not hard&lt;/a&gt; with ~100% confidence, and do little work to publicly justify this stance or resolve disagreements with more pessimistic parties. When public discussion does occur, it happens between skeptics and random employees, not skeptics and official company representatives.&lt;/li&gt;
  &lt;li&gt;Every frontier AI company (that has an alignment plan at all) wants to use AI to solve AI alignment. This is a &lt;a href=&quot;https://mdickens.me/2025/11/27/alignment_bootstrapping_is_dangerous/&quot;&gt;horrifyingly bad plan&lt;/a&gt;—they are admitting that the problem is so hard that they don’t think humans can solve it in time, and then proposing to use an unknown method with unknown reliability to solve the problem. Meanwhile, senior alignment researchers at AI companies describe this as &lt;a href=&quot;https://www.lesswrong.com/posts/epjuxGnSPof3GnMSL/alignment-remains-a-hard-unsolved-problem&quot;&gt;“a very good plan”&lt;/a&gt;. My life is in these people’s hands.&lt;/li&gt;
  &lt;li&gt;People are confident that &lt;a href=&quot;https://www.lesswrong.com/posts/jqb3prwGQjLriq7Lu/exercise-planmaking-surprise-anticipation-and-baba-is-you&quot;&gt;they can solve alignment in advance of building ASI&lt;/a&gt;, or they’re confident that &lt;a href=&quot;https://mdickens.me/2025/11/16/ai_meta_one_shot/&quot;&gt;it’s possible to iteratively solve alignment&lt;/a&gt;, in spite of widespread disagreement about whether these are possible.
    &lt;ul&gt;
      &lt;li&gt;Most people involved in frontier AI don’t treat the problem as having existential stakes. Many people have (written or mental) models of how to make ASI safe; these models do not meet the standards of most industries. They certainly do not meet the safety standards of industries where failures can result in deaths—spaceflight, civil engineering, surgery, cryptography&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. When extinction is at stake, standards should be even higher than that—higher than the standards that are higher than the standards that AI safety plans are &lt;em&gt;still&lt;/em&gt; failing to live up to.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;AI companies are allergic to &lt;a href=&quot;https://www.lesswrong.com/posts/7uTPrqZ3xQntwQgYz/anthropic-and-taking-technical-philosophy-more-seriously&quot;&gt;technical philosophy&lt;/a&gt;. They take the position that AI alignment is purely an engineering problem.&lt;/li&gt;
  &lt;li&gt;Companies make non-binding commitments imposing some safety requirement on a future generation of LLM. Then, when the time arrives and that commitment turns out to be hard to satisfy, they &lt;a href=&quot;https://anthropic.ml/#section-6&quot;&gt;remove the commitment from their plan&lt;/a&gt;.
    &lt;ul&gt;
      &lt;li&gt;I wrote the first draft of this post before Anthropic released &lt;a href=&quot;https://www.lesswrong.com/posts/HzKuzrKfaDJvQqmjh/responsible-scaling-policy-v3&quot;&gt;Responsible Scaling Policy v3&lt;/a&gt;, which removed all prior commitments to conditionally pause development. Thank you to Anthropic for providing me with a better example and demonstrating to everyone how untrustworthy you are; but no thank you for throwing away your commitments that might have prevented you from killing me and destroying everything I care about.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Frontier AI developers all spend some time lobbying against safety regulations or attempting to weaken them. Anthropic is the least-bad actor; they’ve only lobbied against moderate regulations, while coming out in favor of some weak regulations.&lt;/li&gt;
  &lt;li&gt;AI developers require employees to sign non-disparagement agreements (including &lt;a href=&quot;https://www.openaifiles.org/transparency-and-safety&quot;&gt;OpenAI&lt;/a&gt; and &lt;a href=&quot;https://anthropic.ml/#section-4&quot;&gt;Anthropic&lt;/a&gt;—those are just the ones we know about), preventing employees from speaking out about any unsafe practices that might be happening.&lt;/li&gt;
  &lt;li&gt;A frontier AI developer, founded as a nonprofit, first fires its safety-conscious board members, then &lt;a href=&quot;https://www.openaifiles.org/restructuring&quot;&gt;restructures as a for-profit&lt;/a&gt; so it can continue to race forward unchecked. (AI Lab Watch has documented &lt;a href=&quot;https://ailabwatch.org/resources/integrity&quot;&gt;many other lapses in integrity&lt;/a&gt;.) There’s a good chance that this will be the company that develops ASI.&lt;/li&gt;
  &lt;li&gt;People assume we live in a fair world. One argument goes: solving one-shot problems is too hard; therefore, we need to solve alignment iteratively via experimentation. Then, having decided that they need iterative alignment, people decide that it’s &lt;em&gt;possible&lt;/em&gt; to solve alignment iteratively. There is no law of the universe that says we get to do things the easy way if the hard way is too hard. Sometimes you really do have to do things the hard way. But instead of dealing with this fact, people act as if the universe will treat us fairly.
    &lt;ul&gt;
      &lt;li&gt;I’m reminded of the &lt;a href=&quot;https://www.youtube.com/watch?v=Tid44iy6Rjs&quot;&gt;scene in Apollo 13&lt;/a&gt; where flight controller John Aaron says they have to get the spaceship’s power consumption down to 12 amps or else the astronauts will die. One engineer protests, “You can’t run a vacuum cleaner on 12 amps, John!” It doesn’t matter how much you protest; if you don’t get down to 12 amps, the astronauts die. Fortunately, mission director Gene Kranz was more competent than AI company leaders: instead of pretending the problem didn’t exist, he made the call that they would find a way to get down to 12 amps, whatever it took.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Alignment researchers routinely make basic reasoning errors that indicate that they are not taking the alignment problem seriously.
    &lt;ul&gt;
      &lt;li&gt;Example: Researchers sometimes implicitly assume that if an LLM doesn’t reveal deception in its chain-of-thought, then it’s not being deceptive. We don’t know what’s happening inside deep neural networks. The chain-of-thought doesn’t tell us what’s happening in a neural network’s inner layers.&lt;/li&gt;
      &lt;li&gt;More generally: “we found no evidence of X, therefore X is false.” Another common assumption that fits this pattern is “we found no way to jailbreak our model, therefore it’s impossible to jailbreak.” People rarely say this explicitly, but they behave as if it’s valid reasoning.&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.lesswrong.com/posts/arwATwCTscahYwTzD/the-most-common-bad-argument-in-these-parts&quot;&gt;A related fallacy&lt;/a&gt; is “I can’t think of any ways this could go wrong, therefore it can’t go wrong.” In worlds where ASI goes well, there are processes in place to prevent AI developers from falling for this fallacy.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; We do not have those processes.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;People who are pessimistic about solving alignment get filtered out of positions of power at AI companies. (I’m not aware of anyone at a leading AI company with a P(doom) &amp;gt; 50%, although I’m sure there are nonzero such people.) That means the people who would do the &lt;em&gt;most&lt;/em&gt; to put checks in place to make AI safe are the &lt;em&gt;least&lt;/em&gt; likely to be in a position to do so. AI companies systematically give power to reckless optimists.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Is this what taking the alignment problem seriously looks like? Is this what taking extinction risk seriously looks like?&lt;/p&gt;

&lt;p&gt;Some people are concerned about AI x-risk, and they have P(doom)s in the 5–25% range. I don’t get that. I can’t pass an &lt;a href=&quot;https://www.econlib.org/archives/2011/06/the_ideological.html&quot;&gt;Ideological Turing Test&lt;/a&gt; for someone who sees all these problems, but still expects us to avert extinction with &amp;gt;75% probability. I don’t understand what would lead one to believe that this is what things look like when we’re on track to solving a problem.&lt;/p&gt;

&lt;p&gt;There is an argument people sometimes make, to the effect of “the good guys have to do a sloppy job on AI safety, because otherwise the bad guys will beat us to ASI and they’ll be even worse.” I understand that viewpoint.&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; But that doesn’t mean P(doom) is low. You could believe something like: if Anthropic builds ASI, there’s a 50% chance we die; if xAI or Meta builds ASI, there’s a 75% chance we die; therefore, Anthropic has to build ASI first. I don’t know anyone who believes this; everyone who wants to race to build ASI seems to have a P(doom) on the order of 25% or less.&lt;/p&gt;

&lt;p&gt;I can imagine a world where AI companies are still racing to build ASI, but they’re taking the challenges appropriately seriously and investing accordingly in safety. In that world, I can see P(doom) being less than 25%. But that’s not the world we live in.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Cryptanalysis is a good example of what it looks like when flaws are hard to catch. Standard practice when releasing a new cryptographic algorithm is to circulate it among experts and spend 2–5 years trying to break the algorithm before anyone uses it in the real world. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I won’t say “people don’t commit this fallacy”, because there is no alternative world where people don’t make reasoning errors. But in the good world, we have checks in place to make sure that ultimate decisions aren’t made on the basis of those errors. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;That still doesn’t explain the fact that AI companies exhibit an embarrassing level of sloppiness. There are various alignment challenges that AI companies could address without significantly slowing down, but they still act oblivious to them. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Value Investing in the Age of AGI</title>
				<pubDate>Wed, 11 Mar 2026 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2026/03/11/value_investing_agi/</link>
				<guid isPermaLink="true">http://mdickens.me/2026/03/11/value_investing_agi/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;img src=&quot;/assets/images/boxing.jpg&quot; style=&quot;width:500px&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;

&lt;p&gt;Most people who write about AI and investing fall into one of two camps: traditional investors who see the high valuations of AI stocks and say it’s a bubble;&lt;sup id=&quot;fnref:13&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:13&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; or AGI-pilled investors who will buy AI stocks at any price, regardless of fundamentals. There’s only a tiny intersection of people who understand that AGI is not a normal technology while also recognizing that fundamentals matter.&lt;/p&gt;

&lt;p&gt;I’m not an expert (or even a journeyman) on AI &lt;em&gt;or&lt;/em&gt; fundamental analysis, but I do know a little bit about both.&lt;/p&gt;

&lt;p&gt;The basic thesis of value investing is that the market over-rates expected future growth and under-rates present-day fundamentals. Stocks that are poised to benefit from AGI tend to be growth stocks—people have high expectations for them, and they’re priced expensively relative to present-day fundamentals. That suggests that we shouldn’t buy AI-related stocks.&lt;/p&gt;

&lt;p&gt;At the same time, &lt;a href=&quot;https://forum.effectivealtruism.org/posts/8c7LycgtkypkgYjZx/agi-and-the-emh-markets-are-not-expecting-aligned-or&quot;&gt;the market does not appear to expect AGI&lt;/a&gt;, which suggests we &lt;em&gt;should&lt;/em&gt; buy them. Which of these two forces is stronger?&lt;/p&gt;

&lt;p&gt;My current thinking is that value investing &lt;em&gt;probably&lt;/em&gt; won’t work in light of AGI, but there is some reason to believe it might work even better; and value investing is a useful hedge in case AI progress slows.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;&lt;em&gt;Updated 2026-03-13 to replace the value spread chart with a more relevant one.&lt;/em&gt;&lt;/p&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#introduction&quot; id=&quot;markdown-toc-introduction&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-value-spread-vs-agi&quot; id=&quot;markdown-toc-the-value-spread-vs-agi&quot;&gt;The value spread vs. AGI&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#defenses-of-value-investing&quot; id=&quot;markdown-toc-defenses-of-value-investing&quot;&gt;Defenses of value investing&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#value-investing-as-a-hedge-against-long-timelines&quot; id=&quot;markdown-toc-value-investing-as-a-hedge-against-long-timelines&quot;&gt;Value investing as a hedge against long timelines&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;the-value-spread-vs-agi&quot;&gt;The value spread vs. AGI&lt;/h2&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/GMO-value-spread.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;source: &lt;a href=&quot;https://www.gmo.com/americas/research-library/year-end-letter-for-2025-deep-value_insights/&quot;&gt;GMO&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Right now (in early 2026), the valuation spread between US growth and value companies is large relative to history. This holds on both the value and growth sides: value stocks are historically cheap, and growth stocks are historically expensive. Growth stocks may experience a crash like they did after the dot-com bubble.&lt;/p&gt;

&lt;p&gt;However, the crash might never come because we might be living through the final market cycle. The development of AGI in the next decade could end the economy as we know it.&lt;/p&gt;

&lt;p&gt;Civilization is on track to develop AGI within the next decade, and ASI may follow soon after.&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; When ASI arrives, our investments probably won’t matter anymore, for one of several reasons:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;ASI will be misaligned and it will kill everyone. (&lt;a href=&quot;https://intelligence.org/briefing/&quot;&gt;This is the most likely outcome.&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;ASI will be aligned, and it will be so radically good for the economy that nobody will care about money anymore.&lt;/li&gt;
  &lt;li&gt;A small set of people will use aligned ASI to take over the world and leave everyone else at the mercy of their whims.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There is &lt;em&gt;some&lt;/em&gt; chance that none of those things will happen. Even if they do, there may be a meaningful transition period where AI is powerful enough to replace a significant fraction of human labor, but not yet powerful enough to kill everyone. In those scenarios, it matters how we invest. We can then spend our investment returns on reducing the odds that ASI kills everyone.&lt;/p&gt;

&lt;p&gt;In the scenario where AI transforms the economy but doesn’t (yet) render money useless, will value investing work?&lt;/p&gt;

&lt;p&gt;My best guess is no. Right now, the market is sending the signal that AI is a useful ordinary technology. The market appears to be pricing Nvidia the way it priced Cisco in the late 90’s: “[AI/The Internet] will be the next big thing. [Nvidia/Cisco] manufactures the infrastructure that [AI/The Internet] relies on; it is well positioned to make huge profits.”&lt;/p&gt;

&lt;p&gt;(Cisco had a Price/Sales ratio of 21 at the end of 1999, and 38.9 at its peak in March 2000&lt;sup id=&quot;fnref:12&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;; Nvidia’s Price/Sales is &lt;a href=&quot;https://finance.yahoo.com/quote/NVDA/key-statistics/&quot;&gt;25.1&lt;/a&gt; as of the beginning of 2026.)&lt;/p&gt;

&lt;p&gt;If AI is &lt;em&gt;not&lt;/em&gt; a normal technology, and it comes to dominate the world economy, then AI stocks’ current prices appear too low. Current valuations imply &lt;em&gt;strong&lt;/em&gt; growth, but not &lt;em&gt;radical&lt;/em&gt; growth.&lt;sup id=&quot;fnref:11&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; In a world where AI replaces half of all human jobs, Nvidia’s revenue could comfortably reach $60 trillion, in which case its current price is much too low.&lt;/p&gt;

&lt;details&gt;
  &lt;summary&gt;How Nvidia&apos;s revenue could reach $60 trillion&lt;/summary&gt;
  &lt;p&gt;Some back-of-the-envelope math:&lt;/p&gt;

  &lt;ul&gt;
    &lt;li&gt;World GDP is ~$120 trillion. If AI can do half of human jobs, that could double world GDP to $240 trillion. (That’s not really how it works, but I’m keeping it simple.)&lt;/li&gt;
    &lt;li&gt;AI companies would be willing to spend perhaps half their revenues on data centers, or $120 trillion.&lt;/li&gt;
    &lt;li&gt;Historically, about half the cost of data centers has gone to GPUs, implying $60 trillion of revenue for Nvidia, assuming Nvidia can maintain its near-monopoly on AI hardware.&lt;/li&gt;
  &lt;/ul&gt;
&lt;/details&gt;
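
&lt;p&gt;The same chain as explicit arithmetic. Note that the middle step, as written in the list above, implicitly treats the entire doubled world GDP as revenue flowing to AI companies; that assumption is carried over literally here:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;world_gdp = 120e12                     # ~$120 trillion today
gdp_with_ai = 2 * world_gdp            # AI doing half of human jobs doubles output
ai_revenue = gdp_with_ai               # implicit in 'half their revenues, or $120 trillion'
datacenter_spend = ai_revenue / 2      # half of revenues spent on data centers
nvidia_revenue = datacenter_spend / 2  # half of data center cost has gone to GPUs

print(nvidia_revenue)  # 60000000000000.0, i.e. $60 trillion
&lt;/code&gt;&lt;/pre&gt;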

&lt;p&gt;If economic growth accelerated across the board, all else equal, that would be bad for the value factor. As a simplistic illustration, suppose the market expects value companies’ earnings to grow 5% next year, and growth companies’ earnings to grow 10%. Value investing works when the market’s expectations are overconfident, and earnings growth reverts toward the mean. If value and growth companies’ earnings grow by 6% and 9% respectively, then the earnings of the value factor will beat expectations by 2 percentage points. However, if AI doubles every company’s earnings growth, then value and growth companies will grow earnings by 12% and 18%, respectively. Even though earnings growth still mean reverts, the value factor &lt;em&gt;under&lt;/em&gt;performs expectations.&lt;/p&gt;

&lt;p&gt;The same information presented as a table:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Scenario&lt;/th&gt;
      &lt;th&gt;Value Co. Growth&lt;/th&gt;
      &lt;th&gt;Growth Co. Growth&lt;/th&gt;
      &lt;th&gt;Value Outperformance&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;market expectation&lt;/td&gt;
      &lt;td&gt;5%&lt;/td&gt;
      &lt;td&gt;10%&lt;/td&gt;
      &lt;td&gt;0%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;mean reversion&lt;/td&gt;
      &lt;td&gt;6%&lt;/td&gt;
      &lt;td&gt;9%&lt;/td&gt;
      &lt;td&gt;2%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;AI acceleration + mean reversion&lt;/td&gt;
      &lt;td&gt;12%&lt;/td&gt;
      &lt;td&gt;18%&lt;/td&gt;
      &lt;td&gt;-1%&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
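&lt;p&gt;The same illustration as a minimal Python sketch (the growth rates are the made-up numbers from the table, not forecasts):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;expected_value, expected_growth = 0.05, 0.10   # market-expected earnings growth

def value_outperformance(value_growth, growth_growth):
    # Surprise on the long (value) leg minus surprise on the short (growth) leg.
    return (value_growth - expected_value) - (growth_growth - expected_growth)

print(value_outperformance(0.06, 0.09))   # mean reversion alone: +0.02 (+2 pp)
print(value_outperformance(0.12, 0.18))   # AI doubles all growth: -0.01 (-1 pp)
&lt;/code&gt;&lt;/pre&gt;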

&lt;h2 id=&quot;defenses-of-value-investing&quot;&gt;Defenses of value investing&lt;/h2&gt;

&lt;p&gt;I see two ways that value investing might still work in light of AGI: &lt;strong&gt;competition increases&lt;/strong&gt; and &lt;strong&gt;AI makes predictions harder&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The first argument: Competition may increase.&lt;/p&gt;

&lt;p&gt;Current market prices are baking in an expectation that today’s winners will stay winning. Right now, Nvidia effectively has a monopoly on AI hardware. But other companies are trying to change that. AMD, Nvidia’s main competitor, is working hard to catch up on AI; Amazon, Google, Meta, and Microsoft are all building their own AI chips; and a handful of startups are trying to compete as well.&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; If some of those companies succeed, Nvidia’s market share and profit margin may not be good enough to live up to the market’s expectations.&lt;/p&gt;

&lt;p&gt;The second argument: AI makes it harder to predict the future.&lt;/p&gt;

&lt;p&gt;It is notoriously difficult to predict which problems are easy or hard for AI. Therefore, it is difficult to predict which &lt;em&gt;industries&lt;/em&gt; will most benefit from advancements in AI capabilities. When you don’t know which companies will experience the most earnings growth, you want to hold the companies that have a lot of earnings &lt;em&gt;right now&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;p&gt;If Gale’s Growth Inc. (GGI) has a P/E of 20 and Vicky’s Value Co. (VVC) has a P/E of 10, that means the market is willing to pay a premium for GGI because it expects GGI to have stronger earnings growth in the future. Increasingly powerful AI bolsters both companies, and it becomes very hard to predict whether GGI or VVC will benefit more. In that situation, I’d prefer to own VVC because I’m buying the same amount of earnings at half the price.&lt;/p&gt;

&lt;p&gt;In other words, if I have &lt;em&gt;equal&lt;/em&gt; growth expectations for GGI and VVC, then I’d prefer VVC. Right now, the market expects GGI to have better growth, but AI advancements could throw a wrench in the market’s expectations.&lt;/p&gt;
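&lt;p&gt;Restating the example’s P/E ratios as earnings yields makes the point concrete:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;pe_ggi, pe_vvc = 20, 10   # P/E ratios from the example

# Earnings yield: dollars of current earnings per dollar invested.
print(1 / pe_ggi)   # 0.05 for GGI
print(1 / pe_vvc)   # 0.10 for VVC: twice the earnings per dollar
&lt;/code&gt;&lt;/pre&gt;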

&lt;p&gt;Quoting Matt Levine:&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;With a sufficiently general-purpose technology it’s not clear whether the value will mostly accrue to the builders of that technology or to its users. But surely it is at least plausible that AI will mostly make its users richer, so the way to bet on AI is mostly to bet on regular, non-AI companies that don’t use it yet but eventually will.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;An alternative possibility (raised by &lt;a href=&quot;https://ai-2027.com/&quot;&gt;AI 2027&lt;/a&gt; and &lt;a href=&quot;https://bayesianinvestor.com/blog/index.php/2025/08/28/a-business-model-for-ai/&quot;&gt;Bayesian Investor&lt;/a&gt;) is that once AI agents become sufficiently advanced, frontier AI developers may stop releasing the agents and keep the benefits to themselves. If that happens, the economic benefits of AI may simply accrue to the AI developers. That would be bad for value stocks, but it would also have such a warping effect on the economy that it’s hard to say what the right response would be.&lt;/p&gt;

&lt;h2 id=&quot;value-investing-as-a-hedge-against-long-timelines&quot;&gt;Value investing as a hedge against long timelines&lt;/h2&gt;

&lt;p&gt;AI timelines might lengthen for various reasons. Maybe AI advancement hits a wall; maybe an economic recession makes it too hard for AI companies to get capital investments; maybe training new LLMs simply becomes too expensive and companies can’t raise enough money; maybe governments wake up to the existential danger of ASI and start imposing &lt;a href=&quot;https://nothingismere.substack.com/p/a-near-term-policy-for-not-getting&quot;&gt;strong regulations&lt;/a&gt;; etc.&lt;/p&gt;

&lt;p&gt;My guess is none of those things will happen. But they’re not terribly unlikely, either. Any event in that genre seems like it would be good for value stocks and bad for growth stocks. At minimum, if AI capabilities slow down, the lofty valuations of AI-related stocks will start looking too optimistic, and prices will likely come down. Value stocks are something of a safe haven protecting against valuations crashing back to earth (I say “something of” because in the investing world, nothing is ever guaranteed to be safe).&lt;/p&gt;

&lt;p&gt;I’m less bullish on value investing than I was five years ago, but I still keep about a third of my money in value stocks. I expect them to outperform if AI timelines are long, and there’s some chance they outperform even if AGI arrives soon.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:13&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://corporate.vanguard.com/content/dam/corp/research/pdf/isg_vemo_2026.pdf&quot;&gt;Vanguard did better than most&lt;/a&gt;&lt;sup id=&quot;fnref:8&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;: in their December 2025 market outlook, Vanguard rightly predicted that the future is likely to be very bad or very good, while “average” outcomes are unlikely. But they didn’t quite get it. They wrote that transformative AI could cause real GDP growth to surge from the historical 1–2% up to…3%. Really, Vanguard? The development of an artificial alien species that intellectually surpasses the smartest humans would increase GDP growth to 3%?&lt;/p&gt;

      &lt;p&gt;I’m only picking on Vanguard because their take on AI was better than the other takes I read from the big investing firms. In general, Vanguard is arguably &lt;em&gt;the&lt;/em&gt; respectable investing company—they’ve probably done more for retail investors than anyone else. &lt;a href=&quot;#fnref:13&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I think of AGI as an AI that’s smart enough to replace most human workers, and ASI as an AI that’s smart enough to outsmart all of humanity put together—as in, if it resolved to kill us all and we resolved to live, then we would die and it wouldn’t be close.&lt;/p&gt;

      &lt;p&gt;It’s possible that AGI and ASI aren’t that different, and it’s possible that ASI would have to be much more advanced than AGI. &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:12&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Fiscal year 1999 revenue was $12.15 billion according to a &lt;a href=&quot;https://newsroom.cisco.com/c/r/newsroom/en/us/a/y1999/m08/cisco-systems-reports-fourth-quarter-and-year-end-earnings.html&quot;&gt;Cisco press release&lt;/a&gt;. At the end of 1999, Cisco’s market cap was $253 billion (&lt;a href=&quot;https://www.statmuse.com/money/ask/csco-market-cap-in-1999&quot;&gt;source&lt;/a&gt;).&lt;/p&gt;

      &lt;p&gt;According to a &lt;a href=&quot;https://www.hardingloevner.com/insights/nvidia-and-the-cautionary-tale-of-cisco-systems/&quot;&gt;secondary source&lt;/a&gt;, Cisco’s Price/Sales peaked at 38.9 in March 2000. &lt;a href=&quot;#fnref:12&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:11&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I spent a while trying to come up with a model for what growth expectations are implied by AI-related companies’ valuations, but it got too complicated so I gave up. I’d still like to see a good fundamental analysis of what stocks ought to be worth in light of AGI, but I’m not going to be the one to do that analysis (at least not today). &lt;a href=&quot;#fnref:11&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;For references, see &lt;a href=&quot;https://claude.ai/share/1df4c262-b18c-43ea-a4c6-d0054181b3f0&quot;&gt;this Claude chat&lt;/a&gt;. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Levine, M. (2025). &lt;a href=&quot;https://www.bloomberg.com/opinion/articles/2025-01-27/hedge-fund-ai-is-cheap-ai&quot;&gt;Hedge Fund AI Is Cheap AI.&lt;/a&gt; &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Vanguard Research (2025). &lt;a href=&quot;https://corporate.vanguard.com/content/dam/corp/research/pdf/isg_vemo_2026.pdf&quot;&gt;Vanguard economic and market outlook for 2026 – AI exuberance: Economic upside, stock market downside.&lt;/a&gt; &lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>The Structural Return Argument Against Value Investing</title>
				<pubDate>Mon, 02 Mar 2026 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2026/03/02/structural_return_argument_against_value_investing/</link>
				<guid isPermaLink="true">http://mdickens.me/2026/03/02/structural_return_argument_against_value_investing/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Value investing had a singularly bad run from 2007 to 2020. (And it hasn’t done great since 2020, either.) Is that because value investing is broken, or did it simply hit a streak of horrendous luck?&lt;/p&gt;

&lt;p&gt;Skeptics of value investing have made many claims about why value investing doesn’t work anymore, but these claims tend to be light on evidence.&lt;sup id=&quot;fnref:26&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:26&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; Value investing proponents have empirically researched most of these claims and found that they don’t stand up to scrutiny.&lt;sup id=&quot;fnref:12&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;sup id=&quot;fnref:14&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;The poor performance of the value factor was not primarily driven by weakening fundamentals, but by the widening of the value spread. A wider value spread makes value investing look &lt;em&gt;more&lt;/em&gt; attractive going forward, not less.&lt;/p&gt;

&lt;details&gt;
  &lt;summary&gt;What&apos;s the value spread?&lt;/summary&gt;
  &lt;p&gt;Value stocks are defined using the ratio of a stock’s price to some fundamental metric—for example, earnings, book value, or cash flow. If we use earnings as the metric, then value stocks are those with low P/E ratios and growth stocks are the ones with high P/Es.&lt;/p&gt;

  &lt;p&gt;The &lt;strong&gt;value spread&lt;/strong&gt; is the ratio of price-to-fundamental ratios between growth stocks and value stocks. For example, if growth stocks have an average P/E of 30 and value stocks have an average of 15, then the value spread is 30/15 = 2.&lt;/p&gt;

  &lt;p&gt;All else equal, a wider value spread is good for value because you’re buying the same fundamentals at a lower price. However, a &lt;em&gt;widening&lt;/em&gt; spread is bad for value because it means value stocks are declining (relative to growth stocks). This is analogous to how bond investors like when bond yields are high, but they lose money when yields are &lt;em&gt;increasing&lt;/em&gt;.&lt;/p&gt;
&lt;/details&gt;

&lt;p&gt;I wouldn’t dismiss value investing on the basis of poor recent performance.&lt;/p&gt;

&lt;p&gt;However, there’s a potentially strong argument against value investing that remains unrefuted.&lt;/p&gt;

&lt;p&gt;Historically, the &lt;strong&gt;structural return&lt;/strong&gt; of the value factor—the component of return that comes from company fundamentals, rather than changes in the value spread—was about 4–6%.&lt;sup id=&quot;fnref:1:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; But over the past two decades, that number has averaged a mere 1%. Unlike with the value spread, a muted structural return does not imply higher future expectations for value investing.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;In this post:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Pease (2019)&lt;sup id=&quot;fnref:14:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; and Arnott et al. (2021)&lt;sup id=&quot;fnref:1:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; broke down the value factor into a valuation component and a structural component. They found that most of the post-2007 underperformance was driven by widening valuation, but the structural return also declined. However, the decline was not statistically significant. &lt;a href=&quot;#explaining-the-performance-of-the-value-factor&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Using a longer dataset back to 1927, I find that a low structural return is not unprecedented—something similar happened in the 1940s. &lt;a href=&quot;#has-the-structural-return-ever-been-this-low-before&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;The structural return can be separated into growth + dividend income + migration. The first two components are easy to explain, but the (small) decline in migration return is puzzling. &lt;a href=&quot;#elements-of-structural-return&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;If the decreased migration return isn’t just random chance, then the most likely explanation is that the market is rationally reacting less to fundamentals surprises, which would indicate that the reduced migration return is likely to persist. &lt;a href=&quot;#going-deeper-on-migration&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;The appendix of Arnott et al. (2021)&lt;sup id=&quot;fnref:1:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; finds that the recent muted structural return is not statistically unlikely, and may be explained by selection bias. &lt;a href=&quot;#arnott-et-als-statistical-argument&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#explaining-the-performance-of-the-value-factor&quot; id=&quot;markdown-toc-explaining-the-performance-of-the-value-factor&quot;&gt;Explaining the performance of the value factor&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#has-the-structural-return-ever-been-this-low-before&quot; id=&quot;markdown-toc-has-the-structural-return-ever-been-this-low-before&quot;&gt;Has the structural return ever been this low before?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#elements-of-structural-return&quot; id=&quot;markdown-toc-elements-of-structural-return&quot;&gt;Elements of structural return&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#going-deeper-on-migration&quot; id=&quot;markdown-toc-going-deeper-on-migration&quot;&gt;Going deeper on migration&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#have-fundamentals-surprises-shrunk&quot; id=&quot;markdown-toc-have-fundamentals-surprises-shrunk&quot;&gt;Have fundamentals surprises shrunk?&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#the-markets-reaction-to-surprises&quot; id=&quot;markdown-toc-the-markets-reaction-to-surprises&quot;&gt;The market’s reaction to surprises&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#arnott-et-als-statistical-argument&quot; id=&quot;markdown-toc-arnott-et-als-statistical-argument&quot;&gt;Arnott et al.’s statistical argument&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#appendix-source-code&quot; id=&quot;markdown-toc-appendix-source-code&quot;&gt;Appendix: Source code&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#changelog&quot; id=&quot;markdown-toc-changelog&quot;&gt;Changelog&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;explaining-the-performance-of-the-value-factor&quot;&gt;Explaining the performance of the value factor&lt;/h2&gt;

&lt;p&gt;The value factor is measured by a stock’s price-to-fundamentals ratio, using some measure of fundamentals like earnings or book value. Value stocks have low P/F ratios and growth stocks have high P/Fs.&lt;/p&gt;

&lt;p&gt;The performance of value stocks can be decomposed using a short equation:&lt;/p&gt;

\[P = \displaystyle\frac{P}{F} \cdot {F}\]

&lt;p&gt;P/F is called the &lt;strong&gt;valuation&lt;/strong&gt; component, and F is called the &lt;strong&gt;structural&lt;/strong&gt; component (the part that comes from the underlying structure of the economy).&lt;/p&gt;

&lt;p&gt;For the return of value stocks relative to growth stocks, we can look at the &lt;em&gt;relative&lt;/em&gt; change in P/F and the &lt;em&gt;relative&lt;/em&gt; change in fundamentals. Did value stocks underperform because their fundamentals did particularly poorly, or because the spread in P/F multiples expanded—with the value stocks getting cheaper, and the growth stocks getting more expensive?&lt;/p&gt;
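&lt;p&gt;In log terms the split is additive, which is what makes the decomposition easy to compute. A minimal sketch (the prices and fundamentals below are made-up numbers, not anyone’s data):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import math

# log(P1/P0) = log((P1/F1)/(P0/F0)) + log(F1/F0)
#    total   =       valuation      + structural
def decompose(p0, p1, f0, f1):
    valuation = math.log((p1 / f1) / (p0 / f0))
    structural = math.log(f1 / f0)
    return valuation, structural

# Hypothetical value leg: price lags fundamentals, so valuation compresses.
val_v, str_v = decompose(p0=100, p1=105, f0=50, f1=60)
# Hypothetical growth leg: price outruns fundamentals, so valuation expands.
val_g, str_g = decompose(p0=100, p1=120, f0=20, f1=22)

# The long/short value factor compares the two legs.
print(val_v - val_g)   # relative valuation change (negative: the spread widened)
print(str_v - str_g)   # structural return (positive here)
&lt;/code&gt;&lt;/pre&gt;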

&lt;p&gt;&lt;a href=&quot;https://www.tandfonline.com/doi/full/10.1080/0015198X.2020.1842704&quot;&gt;Arnott et al. (2021)&lt;/a&gt;&lt;sup id=&quot;fnref:1:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; looked at this question (using book value as the measure of fundamentals). From 2007 to 2020, the total return of value minus growth was –6.1%. Arnott et al. found that value’s negative premium was more than fully explained by expansion in P/B, and value companies still outperformed growth companies on the structural component:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;    1963 to 2007:  6.1% return =  0.2% valuation + 5.9% structural
    2007 to 2020: -6.1% return = -7.2% valuation + 1.1% structural
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;em&gt;Source: Table 3 in Arnott et al. (2021).&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This evidence falsifies some popular hypotheses&lt;sup id=&quot;fnref:11&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; about how value investing supposedly doesn’t work anymore. The authors write:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;For example, some have said that the value trade has become crowded, distorting stock prices so the factor generates a tiny or negative expected return. Crowding should cause the factor to become more richly priced. An increase in the valuation spread between growth and value, from the 25th to the 100th percentile, however, is not consonant with crowding into the value factor. Thus, this narrative is easy to dismiss.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If value investing had become too popular, the value spread would have narrowed. But instead, it widened.&lt;/p&gt;

&lt;p&gt;However, the authors also write:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Similarly, little evidence exists to suggest that the value strategy’s long-run structural return has turned negative or even diminished from the pre-2007 level.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I can’t help but notice that before 2007, the value factor had a structural return of 5.9%. And after 2007, that number dropped to 1.1%. That’s not a trivial difference. If we assume that changes in the value spread average out over the long term but the muted structural return persists, that means the future value premium will be only about 1%—still positive, but considerably smaller than it was historically. If the structural return is trending downward, then the future value premium could become permanently negative.&lt;/p&gt;

&lt;p&gt;Is there some fundamental reason why the value factor has performed worse recently? People have proposed many hypotheses: perhaps value measures are failing to capture important intangibles; perhaps central bank interventions create a more favorable environment for growth stocks; perhaps value strategies are too crowded; perhaps analysts have gotten better at predicting companies’ future growth.&lt;/p&gt;

&lt;p&gt;In 2021, Israel et al. published &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.3554267&quot;&gt;Is (Systematic) Value Investing Dead?&lt;/a&gt;&lt;sup id=&quot;fnref:12:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; (not to be confused with Cliff Asness’ &lt;a href=&quot;https://www.aqr.com/Insights/Perspectives/Is-Systematic-Value-Investing-Dead&quot;&gt;article by the same name&lt;/a&gt;). They reviewed a variety of hypotheses on why value investing might be broken, and found all of them to be contradicted by the evidence. But the authors did &lt;em&gt;not&lt;/em&gt; address the weakening of the structural return. I have never seen a value investing skeptic bring up this point before, but I believe it is the strongest argument against value investing.&lt;/p&gt;

&lt;p&gt;A natural question to ask is, is the recent structural return historically anomalous, or is it within the normal range of variation?&lt;/p&gt;

&lt;h2 id=&quot;has-the-structural-return-ever-been-this-low-before&quot;&gt;Has the structural return ever been this low before?&lt;/h2&gt;

&lt;p&gt;I used data from the &lt;a href=&quot;https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html&quot;&gt;Ken French Data Library&lt;/a&gt; to coarsely replicate the results from Arnott et al. (2021).&lt;/p&gt;

&lt;details&gt;
  &lt;summary&gt;Replication methodology&lt;/summary&gt;
  &lt;p&gt;I replicated &lt;a href=&quot;https://www.tandfonline.com/doi/full/10.1080/0015198X.2020.1842704&quot;&gt;Arnott et al. (2021)&lt;/a&gt;&lt;sup id=&quot;fnref:1:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; using the &lt;a href=&quot;https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html&quot;&gt;Ken French Data Library&lt;/a&gt;, specifically the data series “Portfolios Formed on Book-to-Market” and “BE/ME Breakpoints”. I calculated the long/short value factor as the “Hi 30” portfolio minus the “Lo 30”, i.e., the 30% cheapest stocks minus the 30% most expensive. The breakpoints exclude companies with negative book values, so my methodology excludes those as well. Portfolios are reconstituted at the end of June (e.g. “2020 to 2021” really means July 2020 to June 2021).&lt;/p&gt;

  &lt;p&gt;I can’t derive the exact average B/M of the value portfolio and the growth portfolio using the available data. I approximated the averages using the “BE/ME Breakpoints” series, which provides the B/M at every 5th percentile breakpoint. I applied the &lt;a href=&quot;https://en.wikipedia.org/wiki/Trapezoidal_rule&quot;&gt;trapezoid rule&lt;/a&gt; to these breakpoints (taking the geometric mean of the endpoints rather than the arithmetic mean) to estimate the average B/M of the “Lo 30” and “Hi 30” portfolios.&lt;/p&gt;

  &lt;p&gt;Given the returns of the two portfolios (value and growth) and their estimated average B/M, I reverse-engineered the structural component of price as &lt;code&gt;log(adjusted price) + log(B/M)&lt;/code&gt; (where “adjusted” = including dividends). I then computed the value-factor structural component as &lt;code&gt;log(value structural price) - log(growth structural price)&lt;/code&gt;.&lt;/p&gt;
&lt;/details&gt;
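&lt;p&gt;As a sketch, the breakpoint-averaging step from the methodology above looks something like this (the breakpoints below are hypothetical, and this ignores value-weighting within the portfolio):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import math

def average_bm(breakpoints):
    # Trapezoid rule over B/M values at consecutive 5th-percentile cutoffs,
    # using the geometric mean of adjacent cutoffs rather than the arithmetic mean.
    cells = [math.sqrt(lo * hi) for lo, hi in zip(breakpoints, breakpoints[1:])]
    # Each 5%-wide cell covers the same fraction of stocks; weight them equally.
    return sum(cells) / len(cells)

# Hypothetical cutoffs spanning a cheapest-30% portfolio (7 cutoffs = 6 cells):
print(average_bm([1.10, 1.22, 1.37, 1.55, 1.78, 2.10, 2.60]))   # about 1.64
&lt;/code&gt;&lt;/pre&gt;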

&lt;p&gt;Here are the numbers for valuation change and structural return, according to my replication:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;    1963 to 2007:  5.4% return =  1.4% valuation + 3.9% structural
    2007 to 2020: -7.4% return = -8.9% valuation + 1.5% structural
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;My methodology did not produce identical numbers to Arnott et al., but the differences are small.&lt;/p&gt;

&lt;p&gt;I also replicated the results using E/P and CF/P rather than B/M:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;    --- E/P ---
    1963 to 2007:  5.4% return = -1.5% valuation + 7.0% structural
    2007 to 2020: -4.0% return = -0.8% valuation - 3.1% structural

    --- CF/P ---
    1963 to 2007:  4.8% return = -0.7% valuation + 5.5% structural
    2007 to 2020: -4.9% return = -0.5% valuation - 4.4% structural
&lt;/code&gt;&lt;/pre&gt;

&lt;details&gt;
  &lt;summary&gt;Sensitivity to the choice of end date&lt;/summary&gt;
  &lt;p&gt;The above results heavily depend on what end date you use because 2020 and 2021 were dramatic years for value, in opposite directions:&lt;/p&gt;

  &lt;pre&gt;&lt;code&gt;     B/M -- 2019 to 2020: -36.2% return = -20.7% valuation + -15.5% structural
     E/P -- 2019 to 2020: -27.5% return =  15.8% valuation + -43.3% structural
    CF/P -- 2019 to 2020: -30.9% return =  18.0% valuation + -48.9% structural

     B/M -- 2020 to 2021:  24.5% return =   7.0% valuation + 17.5% structural
     E/P -- 2020 to 2021:  20.3% return = -25.5% valuation + 45.8% structural
    CF/P -- 2020 to 2021:   9.6% return = -30.7% valuation + 40.3% structural
&lt;/code&gt;&lt;/pre&gt;
&lt;/details&gt;

&lt;p&gt;The negative valuation change is unfortunate for value investors, but not particularly worrying; valuation changes should even out over the long run. The decrease in structural return is more concerning because it could indicate a fundamental shift that makes value investing permanently less profitable.&lt;/p&gt;

&lt;p&gt;That raises the question: How much does the structural return vary over time?&lt;/p&gt;

&lt;p&gt;Over the full sample from 1927 to 2025, the value factor and its two components had the following annual standard deviations:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;    total return:      16.2%
    valuation change:  24.7%
    structural return: 24.8%
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The structural return varies quite a lot—more than the value factor itself.&lt;sup id=&quot;fnref:27&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:27&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;The difference in structural return between 1927–2006 and 2007–2025 was highly insignificant (t-stat = 0.496, p = 0.62). The standard error over a 19-year period (like 2007–2025) is 5.7%, so seeing the structural return drop to near zero isn’t that unlikely just by random chance.&lt;/p&gt;
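&lt;p&gt;The standard error here is just the annual volatility scaled down by the square root of the period length, assuming independent annual returns:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import math

sd = 0.248    # annual standard deviation of the structural return
years = 19    # length of the 2007 to 2025 period

# Standard error of the period average, assuming i.i.d. annual returns:
print(sd / math.sqrt(years))   # about 0.057, i.e. 5.7 percentage points
&lt;/code&gt;&lt;/pre&gt;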

&lt;p&gt;But &lt;em&gt;weak statistical evidence&lt;/em&gt; of a decline doesn’t mean there &lt;em&gt;was&lt;/em&gt; no decline. Perhaps I’m being overly paranoid, but I want to dig deeper and see what it might mean if the structural return really did go down.&lt;/p&gt;

&lt;p&gt;The next question I want to ask is, has the structural return ever been this low before?&lt;/p&gt;

&lt;p&gt;Figure 1 shows the average structural return for rolling 15-year periods:&lt;/p&gt;

&lt;div align=&quot;center&quot; id=&quot;figure-1&quot;&gt;Figure 1&lt;/div&gt;
&lt;p&gt;&lt;img src=&quot;/assets/images/structural rolling returns (B-M).png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The next chart shows structural drawdowns for the value factor. Conceptually, a structural drawdown is a period where the value factor would’ve underperformed if the value spread had remained constant.&lt;/p&gt;

&lt;div align=&quot;center&quot; id=&quot;figure-2&quot;&gt;Figure 2&lt;/div&gt;
&lt;p&gt;&lt;img src=&quot;/assets/images/structural drawdowns (B-M).png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The current period is the 2nd worst in the historical sample, but not the worst: the value factor had a structural drawdown of nearly 70% from 1942 to 1946, and it did not fully recover until 1973. That’s a little reassuring—this sort of thing has happened before.&lt;/p&gt;

&lt;p&gt;The next chart shows structural drawdowns alongside drawdowns for the valuation component and the value factor itself:&lt;/p&gt;

&lt;div align=&quot;center&quot; id=&quot;figure-3&quot;&gt;Figure 3&lt;/div&gt;
&lt;p&gt;&lt;img src=&quot;/assets/images/value attribution drawdowns (B-M).png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The weak structural return of the 2010s differs from the 1940s drawdown in that it &lt;em&gt;coincided&lt;/em&gt; with an expansion of the value spread, which caused the value factor to experience its worst performance in recorded history.&lt;/p&gt;

&lt;details&gt;
  &lt;summary&gt;What&apos;s happened since 2020?&lt;/summary&gt;
  &lt;p&gt;&lt;a href=&quot;https://www.gmo.com/americas/research-library/risk-and-premium-a-tale-of-value_whitepaper/&quot;&gt;Pease (2019)&lt;/a&gt;&lt;sup id=&quot;fnref:14:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; and &lt;a href=&quot;https://doi.org/10.1080/0015198X.2020.1842704&quot;&gt;Arnott et al. (2021)&lt;/a&gt;&lt;sup id=&quot;fnref:1:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; included data only up to 2019–2020. Since then, the value factor has experienced a minor resurgence. According to my replication:&lt;/p&gt;

  &lt;pre&gt;&lt;code&gt;    2020 to 2025: 4.1% return = 8.5% valuation - 4.4% structural
&lt;/code&gt;&lt;/pre&gt;

  &lt;p&gt;This resurgence was accompanied by a negative structural return.&lt;/p&gt;

  &lt;p&gt;However, the structural return has enough variability that it’s hard to infer anything from five years of history:&lt;/p&gt;

  &lt;div align=&quot;center&quot; id=&quot;figure-A1&quot;&gt;Figure A1&lt;/div&gt;
  &lt;p&gt;&lt;img src=&quot;/assets/images/value structural price.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

  &lt;p&gt;These additional five years have provided (very) weak evidence for the hypothesis that value’s structural return is permanently dampened. But even with an additional five years of poor structural return, the decline is not statistically significant, nor has the structural return been as bad as it was in the 1940s era.&lt;/p&gt;

&lt;/details&gt;

&lt;h2 id=&quot;elements-of-structural-return&quot;&gt;Elements of structural return&lt;/h2&gt;

&lt;p&gt;Can we say anything about &lt;em&gt;why&lt;/em&gt; the structural return might have declined, assuming it wasn’t just random variation?&lt;/p&gt;

&lt;p&gt;Let’s further decompose structural return into three components: growth + income + migration.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Growth&lt;/strong&gt; is the underlying growth in a company’s fundamentals.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Income&lt;/strong&gt; is the amount of dividends that a company pays out to shareholders.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Migration&lt;/strong&gt; is the conversion of value stocks into growth stocks, and growth stocks into value stocks. When a value stock’s price goes up and sufficiently outpaces its fundamentals, it migrates from the “value” bucket to the “growth” bucket (or to the “neutral” bucket in the middle). When that happens, value investors make money.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The growth component of return is almost always negative: in aggregate, growth stocks have stronger fundamentals growth than value stocks. That’s to be expected: the reason for a stock to have a high P/F ratio is that the market expects its F to go up.&lt;/p&gt;

&lt;p&gt;Arnott et al. (2021)&lt;sup id=&quot;fnref:1:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; broke down structural return into its components, but did not offer much commentary. However, another article did: John Pease’s &lt;a href=&quot;https://www.gmo.com/americas/research-library/risk-and-premium-a-tale-of-value_whitepaper/&quot;&gt;Risk and Premium: A Tale of Value (2019)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Pease presented this chart for the components of the value factor before and after 2006:&lt;/p&gt;

&lt;div align=&quot;center&quot; id=&quot;figure-4&quot;&gt;Figure 4: Pease&apos;s Value Factor Decomposition&lt;/div&gt;
&lt;p&gt;&lt;img src=&quot;/assets/images/GMO-value-decomposition.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Source: Exhibit 2&lt;sup id=&quot;fnref:9&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt; in Pease (2019). Arnott et al. (2021) provides a similar decomposition in its Table 3, but combines growth + income into “income yield”.&lt;sup id=&quot;fnref:23&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:23&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Pease &lt;a href=&quot;https://www.gmo.com/americas/research-library/risk-and-premium-a-tale-of-value_whitepaper/&quot;&gt;went into detail&lt;/a&gt; about what explains each component. I can’t do justice to the full explanations, but in short:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The growth component is unchanged. That suggests that there are no economic forces making value companies perform worse than they used to.&lt;/li&gt;
  &lt;li&gt;Income has decreased because the market overall has gotten more expensive.
    &lt;details&gt;
      &lt;summary&gt;Illustrative example&lt;/summary&gt;
      &lt;p&gt;Suppose value stocks pay a 6% dividend yield and growth stocks pay 4.5%. If the market goes up 50% while dividends don’t change, now value stocks yield 4% and growth stocks yield 3%. Even though growth and value stocks both went up by the same amount (50%), the income advantage for value stocks has been cut down from 1.5% to 1%. (The sketch just after this list runs the same numbers.)&lt;/p&gt;
    &lt;/details&gt;
  &lt;/li&gt;
  &lt;li&gt;Migration return has declined. This component is the hardest to interpret.&lt;/li&gt;
&lt;/ul&gt;
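&lt;p&gt;To make the income effect concrete, here is the illustrative example from the list above as a few lines of Python:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;value_yield, growth_yield = 0.06, 0.045   # dividend yields from the example

print(value_yield - growth_yield)   # income advantage today: 0.015 (1.5 pp)

# The market rises 50% while dividend payments stay fixed:
rise = 1.5
print(value_yield / rise - growth_yield / rise)   # shrinks to 0.01 (1 pp)
&lt;/code&gt;&lt;/pre&gt;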

&lt;p&gt;The decline in migration return is the big question mark. Why did that happen? And will it persist?&lt;/p&gt;

&lt;h2 id=&quot;going-deeper-on-migration&quot;&gt;Going deeper on migration&lt;/h2&gt;

&lt;p&gt;Migration occurs when the market re-evaluates its rating of a stock and changes the price, causing it to move from value to growth or vice versa. The most obvious reason for a re-rating is a fundamentals surprise: a company’s realized fundamentals growth outperforms or underperforms expectations.&lt;sup id=&quot;fnref:30&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:30&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;If migration declines, there are two possible explanations related to fundamentals surprises:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Fundamentals surprises become smaller on average.&lt;/li&gt;
  &lt;li&gt;Stock prices react less to fundamentals surprises.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;(In mathematical terms: Migration occurs when a stock’s P/F ratio increases. Typically, this happens because F unexpectedly increases, and P subsequently increases by &lt;em&gt;more&lt;/em&gt; than F. If migration does &lt;em&gt;not&lt;/em&gt; occur, then either (1) F didn’t unexpectedly increase, or (2) P didn’t react as much to the change in F.)&lt;/p&gt;

&lt;p&gt;If surprises get smaller, that’s bad for value. It means that a driving force behind the value factor has weakened.&lt;/p&gt;

&lt;p&gt;The second explanation could be good or bad for value for complicated reasons. But before getting into that, let’s start by ruling out explanation #1.&lt;/p&gt;

&lt;h3 id=&quot;have-fundamentals-surprises-shrunk&quot;&gt;Have fundamentals surprises shrunk?&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.3554267&quot;&gt;Is (Systematic) Value Investing Dead?&lt;/a&gt;&lt;sup id=&quot;fnref:12:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; by Israel et al. (2021) analyzed, among other things, how well a stock’s valuation predicts its future fundamentals growth.&lt;/p&gt;

&lt;p&gt;The relevant bit for our purposes comes from the paper’s Exhibit 9, column 8, which I used to derive the &lt;a href=&quot;https://en.wikipedia.org/wiki/Coefficient_of_determination&quot;&gt;R&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; between a stock’s price-to-fundamental ratio (P/F) and its subsequent one-year change in fundamentals (ΔF)&lt;sup id=&quot;fnref:16&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:16&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt; for each year from 1987 to 2020.&lt;/p&gt;

&lt;p&gt;Figure 5 answers the question: for each year, how well does a stock’s current P/F predict its ΔF?&lt;/p&gt;

&lt;div align=&quot;center&quot; id=&quot;figure-5&quot;&gt;Figure 5: R&lt;sup&gt;2&lt;/sup&gt; of log(P/F) and log(&amp;Delta;F)&lt;/div&gt;
&lt;p&gt;&lt;img src=&quot;/assets/images/delta_f_plot.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;details&gt;
  &lt;summary&gt;Why do we need to know R&lt;sup&gt;2&lt;/sup&gt; rather than slope?&lt;/summary&gt;

  &lt;p&gt;The slope tells us how much a change in P/F predicts change in ΔF, but that number isn’t what we want. What we want to know is how much of the variance in ΔF is explained by P/F, which is to say we want R&lt;sup&gt;2&lt;/sup&gt;.&lt;/p&gt;

  &lt;p&gt;For example, if the market’s discount rate decreases, then the value of distant cash flows goes up, and therefore the spread in P/F between stocks goes up. This causes the regression slope to flatten, even though the predictability of fundamentals growth did not change.&lt;/p&gt;

  &lt;p&gt;Israel et al. (2021)&lt;sup id=&quot;fnref:12:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; did not directly report R&lt;sup&gt;2&lt;/sup&gt;, but I derived it by first reverse-engineering each year’s N from Exhibit 7, and then calculating R&lt;sup&gt;2&lt;/sup&gt; from N plus the slope and t-stat from Exhibit 9. For details, see &lt;a href=&quot;https://github.com/michaeldickens/FFFactors/blob/master/replication_code/Exhibit9.py&quot;&gt;Exhibit9.py&lt;/a&gt;.&lt;/p&gt;

&lt;/details&gt;
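&lt;p&gt;The conversion from t-stat to R&lt;sup&gt;2&lt;/sup&gt; described in the fold above is short. Assuming a univariate regression, the slope’s t-stat and the sample size pin it down (the inputs below are made up):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;def r_squared(t, n):
    # From t = r * sqrt((n - 2) / (1 - r^2)), solve for r^2:
    return t * t / (t * t + n - 2)

# E.g., a slope with t-stat 4.0 in a cross-section of 500 stocks:
print(r_squared(4.0, 500))   # about 0.031
&lt;/code&gt;&lt;/pre&gt;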

&lt;p&gt;Companies with low P/F (that is, value companies) tend to have low future fundamentals growth. When a stock has strong fundamentals but a low price, that’s Mr. Market saying “I don’t think these fundamentals are going to stay strong for much longer.”&lt;/p&gt;

&lt;p&gt;Value investing has worked historically because the market’s predictions were overconfident: the cheap companies weren’t quite so bad as their prices implied. But the market was still directionally correct: in every year, companies with higher P/F had better fundamentals growth.&lt;/p&gt;

&lt;p&gt;The question we want to ask is: did the migration return decline because fundamentals surprises shrank?&lt;/p&gt;

&lt;p&gt;If surprises shrank, then R&lt;sup&gt;2&lt;/sup&gt; should have increased, but it didn’t. 1987–2006 had an average R&lt;sup&gt;2&lt;/sup&gt; of 0.152, and 2007–2020 had an average of 0.111. If anything, P/F got &lt;em&gt;worse&lt;/em&gt; at predicting fundamentals growth (t-stat = 2.10, p = 0.04).&lt;/p&gt;

&lt;p&gt;Keep in mind that Israel et al. (2021)&lt;sup id=&quot;fnref:12:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;, Pease (2019)&lt;sup id=&quot;fnref:14:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;, and Arnott et al. (2021)&lt;sup id=&quot;fnref:1:8&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; each used a different method to measure company fundamentals, so the results are not directly comparable. But the results should be moderately-to-strongly correlated with each other. I don’t have the data necessary to reproduce this analysis using Arnott et al.’s measure of value, but I would be surprised if the results were qualitatively different.&lt;/p&gt;

&lt;h3 id=&quot;the-markets-reaction-to-surprises&quot;&gt;The market’s reaction to surprises&lt;/h3&gt;

&lt;p&gt;If fundamentals surprises haven’t shrunk, then the market must have reacted less to surprises. That can happen for two reasons:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The market irrationally ignores new information.&lt;/li&gt;
  &lt;li&gt;The market &lt;em&gt;rationally&lt;/em&gt; ignores short-term surprises because it has better information about long-term growth.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If the market fails to (fully) incorporate new information about fundamentals, then value stocks stay cheap even as their prospects improve, and vice versa for growth stocks. In that case, the growth yield—the difference in fundamentals growth between value and growth stocks—should increase. (Recall that the growth yield is negative, so “increase” means “go toward zero”.)&lt;/p&gt;

&lt;p&gt;But the growth yield did not increase post-2007. Pease (2019)&lt;sup id=&quot;fnref:14:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; found that it stayed exactly the same (to within 0.1 percentage points), and Arnott et al. (2021)&lt;sup id=&quot;fnref:1:9&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; found that it (slightly) &lt;em&gt;decreased&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Under the hypothesis that the market irrationally ignores new information, I don’t know how much we’d expect the growth yield to increase, so I don’t know how to calculate the statistical significance of the empirical results. I suspect the significance is weak. Regardless, this provides non-zero evidence in favor of the second hypothesis: that the market is rationally ignoring surprises because it can see further ahead.&lt;/p&gt;

&lt;p&gt;This makes some intuitive sense. Historically, companies’ fundamentals growth barely persisted: companies that had strong growth for a year were not detectably more likely to have strong growth for a second year (&lt;a href=&quot;https://www.lsvasset.com/pdf/research-papers/Level+Persistence_of_Growth_Rates_FINAL.pdf&quot;&gt;Chan et al. (2003)&lt;/a&gt;&lt;sup id=&quot;fnref:21&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:21&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt;; &lt;a href=&quot;https://verdadcap.com/archive/persistence-of-growth&quot;&gt;Chingono &amp;amp; Obenshain (2022)&lt;/a&gt;&lt;sup id=&quot;fnref:22&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:22&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt;). That implies that a fundamentals surprise shouldn’t much matter for future expectations. If the market persistently &lt;em&gt;over&lt;/em&gt;-reacts to new information (a la &lt;a href=&quot;https://doi.org/10.1111/j.1540-6261.1985.tb05004.x&quot;&gt;De Bondt &amp;amp; Thaler (1985)&lt;/a&gt;&lt;sup id=&quot;fnref:25&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:25&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt;), then stocks will bounce between the value and growth buckets as the market repeatedly re-rates them—and value investors make money on every bounce. If the market stops over-reacting, then value investors lose access to this source of returns.&lt;/p&gt;

&lt;h2 id=&quot;arnott-et-als-statistical-argument&quot;&gt;Arnott et al.’s statistical argument&lt;/h2&gt;

&lt;p&gt;Earlier, I noted that the reduced structural return is statistically weak, with a t-stat of only 0.496. Arnott et al. (2021)&lt;sup id=&quot;fnref:1:10&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; raise a similar point about statistical reliability. In &lt;a href=&quot;https://www.tandfonline.com/action/downloadSupplement?doi=10.1080%2F0015198X.2020.1842704&amp;amp;file=ufaj_a_1842704_sm8957.pdf&quot;&gt;Appendix E&lt;/a&gt;, they note that the apparent weak structural return could be explained as selection bias from specifically looking at a period where value performed poorly. (If value had performed well, we wouldn’t be scrutinizing it like this.) The authors write:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;When we explicitly analyze drawdowns, we introduce a selection bias by picking the sample to analyze based on the values of the dependent variable. In this case, we are studying the most recent 13½-year period precisely because of the poor performance of value, which is likely in part due to negative residuals.&lt;sup id=&quot;fnref:24&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:24&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;14&lt;/a&gt;&lt;/sup&gt; As an analogy, suppose that we were to analyze the performance of Tiger Woods from 1999 through 2004 (when he was the top-ranked player in the world for 264 consecutive weeks) but instead of looking at his total record, we only include the tournaments in which he played the worst. The resulting selection bias would lead us to struggle to explain why Tiger’s performance was not commensurate with his skill (alpha). Similarly, if we try to explain any factor’s performance, but only study a sample in which the factor performs poorly, we cannot hope to recover the factor’s true unconditional alpha; oversampling of negative residuals hopelessly contaminates the sample.&lt;/p&gt;

  &lt;p&gt;Although this mechanism is intuitive, an important question remains: Is the intercept [structural return] of -0.8% in the post-2007 period evidence of exceptionally improbable bad luck or just ordinary bad luck that we might expect to encounter when we examine any drawdown?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;They found that the observed structural return was not statistically improbable:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/structural-return-bootstrap.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;To recap: In the post-2007 period, the structural return has averaged close to zero. This was &lt;em&gt;not&lt;/em&gt; because fundamentals surprises shrank. The most plausible explanation is that it was just random variation—the difference in structural return pre- and post-2007 was highly non-significant (t-stat = 0.496).&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;second&lt;/em&gt; most plausible explanation is that the market became more efficient at predicting long-term trends and started reacting less to year-by-year fundamentals surprises. If true, we should expect muted returns to the value factor going forward.&lt;/p&gt;

&lt;h1 id=&quot;appendix-source-code&quot;&gt;Appendix: Source code&lt;/h1&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/michaeldickens/FFFactors/blob/master/replication_code/ValueAttribution.hs&quot;&gt;ValueAttribution.hs&lt;/a&gt;: Approximate structural return from Fama/French data.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/michaeldickens/FFFactors/blob/master/replication_code/Exhibit9.py&quot;&gt;Exhibit9.py&lt;/a&gt;: Calculate R&lt;sup&gt;2&lt;/sup&gt; from Israel et al. (2021)&lt;sup id=&quot;fnref:12:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; data.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;changelog&quot;&gt;Changelog&lt;/h1&gt;

&lt;ul&gt;
  &lt;li&gt;2026-03-07: Add source code appendix; fix sign error on reported R&lt;sup&gt;2&lt;/sup&gt; values; fix sign error in a sentence (“worse” -&amp;gt; “better”).&lt;/li&gt;
&lt;/ul&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:26&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The most rigorous value-skeptical article I’ve seen is Lev, B. I., &amp;amp; Srivastava, A. (2019). &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.3442539&quot;&gt;Explaining the Demise of Value Investing.&lt;/a&gt; &lt;a href=&quot;#fnref:26&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:12&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Israel, R., Laursen, K., &amp;amp; Richardson, S. A. (2020). &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.3554267&quot;&gt;Is (Systematic) Value Investing Dead?&lt;/a&gt; &lt;a href=&quot;#fnref:12&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:12:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:12:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:12:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;4&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:12:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;5&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:12:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;6&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:14&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Pease, J. (2019). &lt;a href=&quot;https://www.gmo.com/americas/research-library/risk-and-premium-a-tale-of-value_whitepaper/&quot;&gt;Risk and Premium: A Tale of Value.&lt;/a&gt; &lt;a href=&quot;#fnref:14&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:14:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:14:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:14:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;4&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:14:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;5&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Arnott, R. D., Harvey, C. R., Kalesnik, V., &amp;amp; Linnainmaa, J. T. (2021). &lt;a href=&quot;https://www.tandfonline.com/doi/full/10.1080/0015198X.2020.1842704&quot;&gt;Reports of Value’s Death May Be Greatly Exaggerated.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.1080/0015198x.2020.1842704&quot;&gt;10.1080/0015198x.2020.1842704&lt;/a&gt; &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:1:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:1:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:1:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;4&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:1:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;5&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:1:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;6&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:1:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;7&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:1:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;8&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:1:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;9&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:1:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;10&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:1:10&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;11&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:11&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Popular among non-value investors. &lt;a href=&quot;#fnref:11&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:27&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I found that surprising. Intuitively, I’d expect the structural return to be fairly stable. I have no explanation for why the structural return fluctuates so much. &lt;a href=&quot;#fnref:27&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:9&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I edited a label on the image to match my terminology. What I call “migration”, Pease called “rebalancing”. &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:23&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The numbers in Arnott et al. (2021) were much more dramatic, with a -13.2% income yield and 19.2% migration return pre-2007. That’s primarily because Arnott et al. defined the value factor as the cheapest 30% of companies minus the most expensive 30%, while Pease (2019) defined it as the cheapest 50% minus the total market. When the value and growth portfolios are more strongly differentiated, the difference in returns is larger.&lt;/p&gt;

      &lt;p&gt;I loosely replicated both studies’ pre-2007 results and found qualitatively similar results (with an Arnott-style construction producing much larger absolute values than Pease-style). I only looked pre-2007 because I don’t have individual stock data for the full later period. &lt;a href=&quot;#fnref:23&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:30&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;More generally, we could talk about “information surprises”: the market learns some new information and adjusts prices accordingly. Information surprises are impossible to analyze in full generality—I can’t collect every conceivable source of information—so it’s easier to just focus on fundamentals surprises. But the principles I will discuss surrounding fundamentals surprises mostly also apply to other sorts of information. &lt;a href=&quot;#fnref:30&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:16&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The paper defines “fundamentals” as current book value plus forecasted earnings for the next 24 months, where future earnings are discounted according to a stock-specific discount rate based on that stock’s beta. &lt;a href=&quot;#fnref:16&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:21&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Chan, L. K. C., Karceski, J. J., &amp;amp; Lakonishok, J. (2003). &lt;a href=&quot;https://www.lsvasset.com/pdf/research-papers/Level+Persistence_of_Growth_Rates_FINAL.pdf&quot;&gt;The Level and Persistence of Growth Rates.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.1111/1540-6261.00540&quot;&gt;10.1111/1540-6261.00540&lt;/a&gt; &lt;a href=&quot;#fnref:21&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:22&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Chingono, B. &amp;amp; Obenshain, G. (2022). &lt;a href=&quot;https://verdadcap.com/archive/persistence-of-growth&quot;&gt;Persistence of Growth.&lt;/a&gt; &lt;a href=&quot;#fnref:22&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:25&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;De Bondt, W. F. M., &amp;amp; Thaler, R. (1985). &lt;a href=&quot;https://doi.org/10.1111/j.1540-6261.1985.tb05004.x&quot;&gt;Does the Stock Market Overreact?.&lt;/a&gt; &lt;a href=&quot;#fnref:25&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:24&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Footnote excerpted from Arnott et al.:&lt;/p&gt;

      &lt;blockquote&gt;
        &lt;p&gt;If we condition on the realization of the dependent variable as we do when we select periods based on value’s performance, we bias the estimated intercept. To see why, suppose that the model generating the data is \(y_i = a + b x_i + e_i\), where \(e_i\) is an innovation. Suppose further that \(a = 0\) and that the average \(e_i\) in our sample is zero. If we select observations in which \(y_i &amp;lt; 0\), it has to be that either \(b x_i &amp;lt; 0\) or \(e_i &amp;lt; 0\). That is, when we condition on the realization of \(y_i\), we indirectly condition on the realized value of the innovation, \(e_i\). We call this mechanism “oversampling bad luck”: the average \(e_i\) in the resulting sample is negative. If we take the observations in which \(y_i\) is negative and estimate a linear regression, the estimated intercept becomes negative; because the linear regression’s residuals add to zero, we push the average negative innovation into the intercept.&lt;/p&gt;
      &lt;/blockquote&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:24&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Contra "Time Series Momentum: Is It There?"</title>
				<pubDate>Wed, 04 Feb 2026 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2026/02/04/contra_tsmom_is_it_there/</link>
				<guid isPermaLink="true">http://mdickens.me/2026/02/04/contra_tsmom_is_it_there/</guid>
                <description>
                  
                  
                  
                  &lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;

&lt;p&gt;Time series momentum (TSMOM) is an investment strategy that involves buying assets whose prices are trending upward and shorting assets that have a downward trend. In 2012, Moskowitz, Ooi &amp;amp; Pedersen published &lt;a href=&quot;http://docs.lhpedersen.com/TimeSeriesMomentum.pdf&quot;&gt;Time Series Momentum&lt;/a&gt;&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. They analyzed a simple version of the strategy that buys assets with positive 12-month returns and shorts assets with negative 12-month returns. They found that the strategy had statistically significant outperformance in equity indexes, currencies, commodities, and bond futures from 1985 to 2009.&lt;/p&gt;

&lt;p&gt;However, others have raised doubts. Huang, Li, Wang &amp;amp; Zhou (henceforth HLWZ) criticized the strategy in &lt;a href=&quot;/materials/org/TSMOM-is-it-there.pdf&quot;&gt;Time Series Momentum: Is It There? (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;, concluding that the evidence for TSMOM is not statistically reliable.&lt;/p&gt;

&lt;p&gt;Some of their criticisms have merit, but TSMOM remains an appealing strategy.&lt;/p&gt;

&lt;p&gt;The abstract of &lt;a href=&quot;/materials/org/TSMOM-is-it-there.pdf&quot;&gt;Time Series Momentum: Is It There?&lt;/a&gt; reads:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Time series momentum (TSMOM&lt;sup id=&quot;fnref:11&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;) refers to the predictability of the past 12-month return on the next one-month return and is the focus of several recent influential studies. This paper shows that asset-by-asset time series regressions reveal little evidence of TSMOM, both in- and out-of-sample. While the t-statistic in a pooled regression appears large, it is not statistically reliable as it is less than the critical values of parametric and nonparametric bootstraps. From an investment perspective, the TSMOM strategy is profitable, but its performance is virtually the same as that of a similar strategy that is based on historical sample mean and does not require predictability. Overall, the evidence on TSMOM is weak, particularly for the large cross section of assets.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;To rephrase, HLWZ make two central arguments:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;http://docs.lhpedersen.com/TimeSeriesMomentum.pdf&quot;&gt;Moskowitz, Ooi &amp;amp; Pedersen (2012)&lt;/a&gt;&lt;sup id=&quot;fnref:1:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; did a pooled regression and found a statistically significant correlation, but this methodology is flawed: it finds a strong correlation even when time series momentum cannot predict future prices.&lt;/li&gt;
  &lt;li&gt;TSMOM performed similarly to a strategy that simply buys assets with positive long-run historical returns and shorts assets with negative historical returns. The authors call this strategy Time Series History or TSH.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;My responses to the two arguments:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;I agree that a pooled regression is flawed. But a statistically significant correlation on a pooled regression is not what convinced me that TSMOM works. &lt;a href=&quot;#the-pooled-regression-critique&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;TSMOM and TSH indeed have similar(ish) historical returns. However:
    &lt;ol&gt;
      &lt;li&gt;TSMOM’s positive performance cannot be explained by TSH alone. &lt;a href=&quot;#is-tsmom-just-a-fancy-way-of-buying-high-return-assets&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;TSMOM provided better diversification to an equity portfolio. &lt;a href=&quot;#diversification-benefits-of-tsmom-vs-tsh&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;TSMOM has large unexplained returns when regressed onto a Fama-French factor model. &lt;a href=&quot;#hlwzs-four-factor-regression&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;TSMOM still performed well on a much larger sample going back a century. &lt;a href=&quot;#the-more-data-counterargument&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;TSMOM looks strong in the historical data. TSMOM probably survives fees and trading costs, but the evidence for that is less clear. &lt;a href=&quot;#does-tsmom-survive-trading-costs&quot;&gt;[More]&lt;/a&gt;&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#summary&quot; id=&quot;markdown-toc-summary&quot;&gt;Summary&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#what-is-time-series-momentum-and-whats-allegedly-good-about-it&quot; id=&quot;markdown-toc-what-is-time-series-momentum-and-whats-allegedly-good-about-it&quot;&gt;What is time series momentum, and what’s (allegedly) good about it?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#responding-to-the-critiques&quot; id=&quot;markdown-toc-responding-to-the-critiques&quot;&gt;Responding to the critiques&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#the-pooled-regression-critique&quot; id=&quot;markdown-toc-the-pooled-regression-critique&quot;&gt;The pooled regression critique&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#is-tsmom-just-a-fancy-way-of-buying-high-return-assets&quot; id=&quot;markdown-toc-is-tsmom-just-a-fancy-way-of-buying-high-return-assets&quot;&gt;Is TSMOM just a fancy way of buying high-return assets?&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#diversification-benefits-of-tsmom-vs-tsh&quot; id=&quot;markdown-toc-diversification-benefits-of-tsmom-vs-tsh&quot;&gt;Diversification benefits of TSMOM vs. TSH&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#hlwzs-four-factor-regression&quot; id=&quot;markdown-toc-hlwzs-four-factor-regression&quot;&gt;HLWZ’s four-factor regression&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#the-more-data-counterargument&quot; id=&quot;markdown-toc-the-more-data-counterargument&quot;&gt;The “more data” counterargument&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#tsmom-yes-its-there&quot; id=&quot;markdown-toc-tsmom-yes-its-there&quot;&gt;TSMOM: Yes, it’s there*&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#does-tsmom-survive-trading-costs&quot; id=&quot;markdown-toc-does-tsmom-survive-trading-costs&quot;&gt;*Does TSMOM survive trading costs?&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;what-is-time-series-momentum-and-whats-allegedly-good-about-it&quot;&gt;What is time series momentum, and what’s (allegedly) good about it?&lt;/h1&gt;

&lt;p&gt;Time series momentum, also known as trendfollowing, works like this:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Get a big list of markets across equities, bonds, commodities, and currencies.&lt;/li&gt;
  &lt;li&gt;For each market, if it has been trending upward recently, buy it. If it’s been trending down, short it.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The exact definition of “trending upward” depends on implementation details. The simplest method is to look at the total return (minus the risk-free interest rate) over the last 12 months. If the total excess return is positive, that’s an uptrend; otherwise, it’s a downtrend. This definition is used by most research papers on TSMOM, including Moskowitz et al.&lt;sup id=&quot;fnref:1:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; and their critics HLWZ&lt;sup id=&quot;fnref:2:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
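&lt;p&gt;For concreteness, here’s a minimal sketch of that rule in Python. It assumes a pandas DataFrame &lt;code&gt;excess&lt;/code&gt; of monthly excess returns (rows are months, columns are markets); note that the published strategy also scales each position by inverse volatility, which this sketch omits.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np
import pandas as pd

def tsmom_returns(excess: pd.DataFrame) -&gt; pd.Series:
    # Trailing 12-month compound excess return, lagged one month so that
    # each position only uses information available when it is opened.
    trailing = (1 + excess).rolling(12).apply(np.prod, raw=True) - 1
    position = np.sign(trailing.shift(1))  # +1 in an uptrend, -1 in a downtrend
    # Hold each market long or short and equal-weight across markets.
    return (position * excess).mean(axis=1)
&lt;/code&gt;&lt;/pre&gt;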

&lt;p&gt;The basic case for TSMOM (which HLWZ will argue against) is that historically it had positive returns with near-zero correlation to equities or bonds.&lt;/p&gt;

&lt;p&gt;Table 1 compares the performance of the total US stock market versus an index of live trendfollowing funds (net of costs).&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;div align=&quot;center&quot; id=&quot;table-1&quot;&gt;Table 1: Summary stats for US equities and trend (1987–2024)&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;Sharpe Ratio&lt;/th&gt;
      &lt;th&gt;Return&lt;/th&gt;
      &lt;th&gt;Stdev&lt;/th&gt;
      &lt;th&gt;Skewness&lt;/th&gt;
      &lt;th&gt;&lt;a href=&quot;https://tangotools.com/ui/ui.htm&quot;&gt;Ulcer Index&lt;/a&gt;&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;US Equities&lt;/td&gt;
      &lt;td&gt;0.55&lt;/td&gt;
      &lt;td&gt;10.8%&lt;/td&gt;
      &lt;td&gt;15.5%&lt;/td&gt;
      &lt;td&gt;-0.8&lt;/td&gt;
      &lt;td&gt;13.5&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Trend Index&lt;/td&gt;
      &lt;td&gt;0.40&lt;/td&gt;
      &lt;td&gt;7.5%&lt;/td&gt;
      &lt;td&gt;12.7%&lt;/td&gt;
      &lt;td&gt;0.3&lt;/td&gt;
      &lt;td&gt;7.3&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;50/50 Blend&lt;/td&gt;
      &lt;td&gt;0.72&lt;/td&gt;
      &lt;td&gt;9.8%&lt;/td&gt;
      &lt;td&gt;9.6%&lt;/td&gt;
      &lt;td&gt;-0.1&lt;/td&gt;
      &lt;td&gt;4.2&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Equities-vs-Blend.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The trend index did not perform as well as US equities, but a 50/50 combination gave a better risk-adjusted return than either strategy alone.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://tangotools.com/ui/ui.htm&quot;&gt;ulcer index&lt;/a&gt;, a measurement of the frequency and severity of drawdowns, is a quantification of what we can see in the graph below: the 50/50 blend had a much softer downside than US equities.&lt;/p&gt;
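&lt;p&gt;For reference, here’s a minimal sketch of the ulcer index calculation, assuming a pandas Series of monthly returns: it’s the root-mean-square percentage drawdown from the running peak, so deeper and longer drawdowns both raise it.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np
import pandas as pd

def ulcer_index(returns: pd.Series) -&gt; float:
    wealth = (1 + returns).cumprod()                 # growth of one dollar
    drawdown = 100 * (wealth / wealth.cummax() - 1)  # percent below prior peak
    return float(np.sqrt((drawdown ** 2).mean()))    # root-mean-square drawdown
&lt;/code&gt;&lt;/pre&gt;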

&lt;p&gt;&lt;img src=&quot;/assets/images/equity-trend-drawdowns.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;For simplicity, I did not include bonds in this brief analysis, but looking at stocks + bonds + trend gives qualitatively similar results. For more, see &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.2993026&quot;&gt;Hurst, Ooi &amp;amp; Pedersen (2017)&lt;/a&gt;&lt;sup id=&quot;fnref:18&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:18&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;, particularly the section “Performance During Crisis Periods” and Exhibits 6 and 7 (page 5–8).&lt;/p&gt;

&lt;h1 id=&quot;responding-to-the-critiques&quot;&gt;Responding to the critiques&lt;/h1&gt;

&lt;h2 id=&quot;the-pooled-regression-critique&quot;&gt;The pooled regression critique&lt;/h2&gt;

&lt;p&gt;Moskowitz, Ooi &amp;amp; Pedersen (2012)&lt;sup id=&quot;fnref:1:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; wanted to determine if the past 12 months of returns of an asset could predict its future returns. Individual assets are too volatile to contain much signal; they got around this problem by &lt;strong&gt;pooling&lt;/strong&gt; all assets together and running a single regression on all assets at once.&lt;/p&gt;

&lt;p&gt;Huang, Li, Wang &amp;amp; Zhou (2020)&lt;sup id=&quot;fnref:2:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; argued that the pooled regression technique is flawed because it creates a spurious correlation between past and future returns.&lt;/p&gt;

&lt;p&gt;As an illustration, suppose we are studying time series momentum in the nation of &lt;a href=&quot;https://en.wikipedia.org/wiki/Latveria&quot;&gt;Latveria&lt;/a&gt;. The country has just two tradable assets: Von Doom Industries equities (ticker symbol VDI) and Latverian Treasury bonds. Thanks to the unparalleled genius of Victor von Doom, the stock returns around 20% per year with only 5% volatility. The Latverian government is notoriously unreliable,&lt;sup id=&quot;fnref:25&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:25&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; however, and the bonds predictably lose 5% every year. (The government is reliable in its unreliability.)&lt;/p&gt;

&lt;p&gt;If we run a pooled regression on Von Doom stocks + Latverian bonds, we will find strong evidence of time series momentum. A past 12-month return of 20% predicts a next-month return of 1.5% (= 20% annualized), and a –5% one-year return predicts a forward month return of –0.4%. The regression will show high predictability with a low p-value.&lt;/p&gt;

&lt;p&gt;In actuality, there is no TSMOM in Latveria: returns are a random walk with no predictability. But pooling stocks and bonds together creates the &lt;em&gt;appearance&lt;/em&gt; of predictability. High past returns provide evidence of high future returns, but only by telling you whether the asset you’re looking at is a stock or a bond.&lt;/p&gt;

&lt;p&gt;HLWZ demonstrated this point with more rigorous statistics. The Latveria example captures the gist of their argument: a pooled regression makes it look like TSMOM can predict future returns, even when it can’t.&lt;sup id=&quot;fnref:38&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:38&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
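&lt;p&gt;The spurious correlation is easy to reproduce in simulation. Here is a toy version of the Latveria example: two assets with i.i.d. returns and different means, so by construction there is zero true predictability, yet a pooled regression of next-month returns on past 12-month returns produces a huge t-statistic. (The parameters are invented to match the story.)&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
months = 360
# Annualized means and volatilities, converted to monthly
stock = rng.normal(0.20 / 12, 0.05 / np.sqrt(12), months)  # VDI equities
bond = rng.normal(-0.05 / 12, 0.01 / np.sqrt(12), months)  # Latverian bonds

rows = []
for asset in (stock, bond):
    for t in range(12, months):
        past12 = np.prod(1 + asset[t - 12:t]) - 1  # past 12-month return
        rows.append((past12, asset[t]))            # next-month return
x, y = np.array(rows).T

fit = sm.OLS(y, sm.add_constant(x)).fit()
print(fit.tvalues[1])  # large and &quot;significant&quot; despite no predictability
&lt;/code&gt;&lt;/pre&gt;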

&lt;p&gt;I have no direct counter-argument. HLWZ are right to criticize the pooled regression method, and it is good to publish criticisms of incorrect arguments. But my enthusiasm for TSMOM as an investment strategy remains unperturbed by the pooled regression critique. The real question is not, is there a statistically significant covariance between the future returns and recent past returns of individual assets? The real question is, does TSMOM work?&lt;/p&gt;

&lt;p&gt;HLWZ address this question as well, and they argue that the answer is no. Or rather, they argue that TSMOM works, but only by mimicking a simpler strategy that does not require time series predictability.&lt;/p&gt;

&lt;h2 id=&quot;is-tsmom-just-a-fancy-way-of-buying-high-return-assets&quot;&gt;Is TSMOM just a fancy way of buying high-return assets?&lt;/h2&gt;

&lt;p&gt;The authors argue that TSMOM is not a genuine premium, but that it works by systematically buying assets with high average returns and shorting assets with low average returns.&lt;/p&gt;

&lt;p&gt;Recall Latveria, with its stock that earns 20% per year and its bond that earns –5%. If you follow a TSMOM strategy on Latverian assets, you will find yourself holding stocks and shorting bonds, and you will earn 25% per year. Does that mean TSMOM works? No, because you could do just as well by simply buying stocks and shorting bonds. You don’t need to pay any attention to each asset’s past 12-month returns, and you don’t need to do any active trading.&lt;/p&gt;

&lt;p&gt;Huang, Li, Wang &amp;amp; Zhou propose a strategy that they call Time Series History (TSH):&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;If an asset has a positive historical return, buy it.&lt;/li&gt;
  &lt;li&gt;If an asset has a negative historical return, short it.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In their Table 10, the authors show that TSMOM’s returns were not demonstrably better than those of TSH.&lt;/p&gt;

&lt;div align=&quot;center&quot; id=&quot;table-2&quot;&gt;Table 2: Summary stats for TSMOM and TSH (1986–2015)&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;annualized return&lt;sup id=&quot;fnref:45&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:45&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt;&lt;/th&gt;
      &lt;th&gt;t-stat&lt;/th&gt;
      &lt;th&gt;p-value&lt;/th&gt;
      &lt;th&gt;likelihood ratio&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;TSMOM&lt;/td&gt;
      &lt;td&gt;4.8%&lt;/td&gt;
      &lt;td&gt;4.73&lt;/td&gt;
      &lt;td&gt;&amp;lt; 0.001&lt;/td&gt;
      &lt;td&gt;53,000&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;TSH&lt;/td&gt;
      &lt;td&gt;3.0%&lt;/td&gt;
      &lt;td&gt;2.70&lt;/td&gt;
      &lt;td&gt;0.007&lt;/td&gt;
      &lt;td&gt;37&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;em&gt;I calculated &lt;a href=&quot;https://arbital.greaterwrong.com/p/likelihood_ratio&quot;&gt;likelihood ratios&lt;/a&gt; as P(X=x|μ=x) / P(X=x|μ=0).&lt;sup id=&quot;fnref:15&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:15&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt;&lt;/em&gt;&lt;/p&gt;
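&lt;p&gt;For a normally distributed estimate, this ratio reduces to exp(t&lt;sup&gt;2&lt;/sup&gt;/2), where t is the t-statistic. Here’s a minimal sketch using a Student-t density instead of a normal; the degrees of freedom are my assumption, set to roughly match the 1986–2015 monthly sample.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from scipy import stats

def likelihood_ratio(t_stat, df=358):
    # P(X=x | mu=x) / P(X=x | mu=0) when the estimate is t-distributed
    return stats.t.pdf(0, df) / stats.t.pdf(t_stat, df)

print(likelihood_ratio(2.70))  # roughly 37, in line with Table 2
&lt;/code&gt;&lt;/pre&gt;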

&lt;p&gt;The difference in returns between TSMOM and TSH was statistically insignificant (p = 0.19).&lt;/p&gt;

&lt;p&gt;Does that mean TSMOM has no genuine signal, and it’s just an unnecessarily complicated way of buying assets with high returns?&lt;/p&gt;

&lt;p&gt;No, because returns don’t tell the full story.&lt;/p&gt;

&lt;p&gt;The question to ask is not, &lt;em&gt;does TSMOM have a higher return than TSH?&lt;/em&gt; The question is, &lt;em&gt;does TSMOM add value to a portfolio?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;We saw before that real-world trendfollowing funds added diversification value to US equities even though they performed worse than equities on their own. However, my comparison from before was not statistically rigorous—how do we know that the diversification value wasn’t just luck?—so we need to do better than that.&lt;/p&gt;

&lt;p&gt;We can ask two related questions:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Is TSMOM essentially just TSH?&lt;/li&gt;
  &lt;li&gt;For an investor who owns an equity index fund, does TSMOM add value? And does it add &lt;em&gt;more&lt;/em&gt; value than TSH does?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;As a first look at TSMOM vs. TSH, here’s a chart of their total returns 1986–2015. Notice the divergence in the circled areas:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/TSMOM-vs-TSH.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Those two lines don’t &lt;em&gt;look&lt;/em&gt; the same. But the graph doesn’t tell us whether those divergences are statistically meaningful.&lt;/p&gt;

&lt;p&gt;We can get a more reliable answer by running a linear regression of TSMOM with TSH as the independent variable. Does TSMOM have &lt;a href=&quot;https://www.investopedia.com/terms/a/alpha.asp&quot;&gt;alpha&lt;/a&gt; relative to TSH?&lt;/p&gt;

&lt;p&gt;HLWZ didn’t run that regression, but Guofu Zhou (the Z in HLWZ) helpfully &lt;a href=&quot;https://guofuzhou.github.io/zpublications.html&quot;&gt;published the code and data on his website&lt;/a&gt;,&lt;sup id=&quot;fnref:42&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:42&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt; so I downloaded the data and ran the regression myself.&lt;/p&gt;
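&lt;p&gt;The regression itself is simple. A sketch, assuming monthly return series &lt;code&gt;tsmom&lt;/code&gt; and &lt;code&gt;tsh&lt;/code&gt; (pandas Series aligned on dates, built from the replication data):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import statsmodels.api as sm

fit = sm.OLS(tsmom, sm.add_constant(tsh)).fit()
alpha_annual = 12 * fit.params.iloc[0]  # monthly intercept, simple annualization
slope = fit.params.iloc[1]
print(alpha_annual, slope, fit.rsquared)
&lt;/code&gt;&lt;/pre&gt;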

&lt;p&gt;Here’s the result I got when regressing TSMOM onto TSH:&lt;sup id=&quot;fnref:12&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;div align=&quot;center&quot; id=&quot;table-3&quot;&gt;Table 3: Regression of TSMOM onto TSH (1986–2015)&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;t-stat&lt;/th&gt;
      &lt;th&gt;p-value&lt;/th&gt;
      &lt;th&gt;likelihood ratio&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;annual alpha (intercept)&lt;/td&gt;
      &lt;td&gt;4.17%&lt;/td&gt;
      &lt;td&gt;4.14&lt;/td&gt;
      &lt;td&gt;&amp;lt; 0.001&lt;/td&gt;
      &lt;td&gt;4,400&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;TSH   (slope)&lt;/td&gt;
      &lt;td&gt;0.22&lt;/td&gt;
      &lt;td&gt;1.38&lt;/td&gt;
      &lt;td&gt;0.17&lt;/td&gt;
      &lt;td&gt;2.6&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;r&lt;sup&gt;2&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;0.04&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;An &lt;a href=&quot;https://www.investopedia.com/terms/c/coefficient-of-determination.asp&quot;&gt;r&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; of 0.04 means that TSH can only explain 4% of the variance in TSMOM’s returns. An alpha of 4.17% means that, after subtracting the part of TSMOM that’s explained by TSH, TSMOM still had an annual return of 4.17%—and this alpha was highly statistically significant.&lt;/p&gt;

&lt;p&gt;In fact, the TSH component of TSMOM was &lt;strong&gt;not&lt;/strong&gt; significant: there is not good evidence that TSH can explain the returns of TSMOM &lt;em&gt;at all&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;As for the second question: do TSMOM and TSH add diversification benefits to an equity portfolio?&lt;/p&gt;

&lt;h2 id=&quot;diversification-benefits-of-tsmom-vs-tsh&quot;&gt;Diversification benefits of TSMOM vs. TSH&lt;/h2&gt;

&lt;p&gt;If we compare the historical drawdowns for TSMOM and TSH, TSMOM looks more painful to hold:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/TSMOM-vs-TSH-DD.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;But notice &lt;strong&gt;when&lt;/strong&gt; their biggest drawdowns occurred. The TSH portfolio lost 21% from July 2008 to March 2009. You know what else had a big drawdown that hit bottom in March 2009? &lt;em&gt;The stock market.&lt;/em&gt; TSMOM’s biggest drawdown didn’t begin until March 2009, &lt;em&gt;right when equities started recovering.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In isolation, TSMOM had more downside risk than TSH. But it looks much better as an addition to an equity portfolio:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Equities-TSMOM-vs-TSH-DD.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;TSMOM provided a better cushion during the 2000–2003 and 2007–2009 bear markets.&lt;/p&gt;

&lt;p&gt;For a more statistical perspective, I regressed TSMOM/TSH against US equities. From Table 4, we can see that:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;TSMOM was not explained at all by US equities, and it had statistically significant alpha.&lt;/li&gt;
  &lt;li&gt;TSH was strongly explained by equities, and it had non-significant (although positive) alpha.&lt;/li&gt;
&lt;/ul&gt;

&lt;div align=&quot;center&quot; id=&quot;table-4&quot;&gt;Table 4: Regressions onto US equities (1986–2015)&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;TSMOM&lt;/th&gt;
      &lt;th&gt;t-stat&lt;/th&gt;
      &lt;th&gt;TSH&lt;/th&gt;
      &lt;th&gt;t-stat&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;annual alpha&lt;/td&gt;
      &lt;td&gt;5.06%***&lt;/td&gt;
      &lt;td&gt;(4.4)&lt;/td&gt;
      &lt;td&gt;1.50%&lt;/td&gt;
      &lt;td&gt;(1.7)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;US equities&lt;/td&gt;
      &lt;td&gt;-0.03&lt;/td&gt;
      &lt;td&gt;(-1.3)&lt;/td&gt;
      &lt;td&gt;0.19***&lt;/td&gt;
      &lt;td&gt;(11.9)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;r&lt;sup&gt;2&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;0.00&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;0.27&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;em&gt;*, **, and *** indicate significance at the 0.05, 0.01, and 0.001 levels, respectively.&lt;/em&gt;&lt;/p&gt;

&lt;details id=&quot;bonus-content-1&quot;&gt;
  &lt;summary&gt;Bonus content: Summary statistics for TSMOM, TSH, and equity blends&lt;/summary&gt;

  &lt;table&gt;
    &lt;thead&gt;
      &lt;tr&gt;
        &lt;th&gt; &lt;/th&gt;
        &lt;th&gt;Sharpe Ratio&lt;/th&gt;
        &lt;th&gt;Return&lt;/th&gt;
        &lt;th&gt;Stdev&lt;/th&gt;
        &lt;th&gt;&lt;a href=&quot;https://tangotools.com/ui/ui.htm&quot;&gt;Ulcer Index&lt;/a&gt;&lt;/th&gt;
      &lt;/tr&gt;
    &lt;/thead&gt;
    &lt;tbody&gt;
      &lt;tr&gt;
        &lt;td&gt;TSMOM&lt;/td&gt;
        &lt;td&gt;0.76&lt;/td&gt;
        &lt;td&gt;8.2%&lt;/td&gt;
        &lt;td&gt;6.2%&lt;/td&gt;
        &lt;td&gt;7.5&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;TSH&lt;/td&gt;
        &lt;td&gt;0.53&lt;/td&gt;
        &lt;td&gt;6.3%&lt;/td&gt;
        &lt;td&gt;5.6%&lt;/td&gt;
        &lt;td&gt;4.0&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;Equities+TSMOM&lt;/td&gt;
        &lt;td&gt;0.76&lt;/td&gt;
        &lt;td&gt;9.6%&lt;/td&gt;
        &lt;td&gt;8.2%&lt;/td&gt;
        &lt;td&gt;5.7&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;Equities+TSH&lt;/td&gt;
        &lt;td&gt;0.56&lt;/td&gt;
        &lt;td&gt;8.5%&lt;/td&gt;
        &lt;td&gt;9.5%&lt;/td&gt;
        &lt;td&gt;8.3&lt;/td&gt;
      &lt;/tr&gt;
    &lt;/tbody&gt;
  &lt;/table&gt;

  &lt;p&gt;(Equity blends are 50% equities plus 50% in the other strategy.)&lt;/p&gt;

  &lt;p&gt;TSMOM alone had a higher standard deviation and ulcer index than TSH, but Equities + TSMOM looks &lt;em&gt;less&lt;/em&gt; risky than Equities + TSH.&lt;/p&gt;
&lt;/details&gt;

&lt;h2 id=&quot;hlwzs-four-factor-regression&quot;&gt;HLWZ’s four-factor regression&lt;/h2&gt;

&lt;p&gt;Huang, Li, Wang &amp;amp; Zhou regressed TSMOM and TSH against the Fama-French four-factor model that includes the (global) equity beta, size, value, and momentum factors.&lt;/p&gt;

&lt;details&gt;
  &lt;summary&gt;What&apos;s a Fama-French four-factor model?&lt;/summary&gt;

  &lt;p&gt;In the beginning, there was the Efficient Market Hypothesis. The &lt;a href=&quot;https://en.wikipedia.org/wiki/Capital_asset_pricing_model&quot;&gt;Capital Asset Pricing Model&lt;/a&gt; (CAPM) hypothesized that a stock’s risk is a function of its &lt;strong&gt;beta&lt;/strong&gt;, which is a number describing how much it moves with the market. If a stock has a beta of 1, that means when the market goes up 1%, that stock also goes up 1%. A stock with a beta of 2 tends to move twice as much as the market. According to CAPM, the only way to reliably outperform the market is to increase the beta of your portfolio (which means you’re also increasing risk).&lt;/p&gt;

  &lt;p&gt;But then some research began posing challenges to CAPM. &lt;a href=&quot;https://doi.org/10.1016/0304-405X(81)90018-0&quot;&gt;Banz (1981)&lt;/a&gt;&lt;sup id=&quot;fnref:32&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:32&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt; found that stocks of smaller companies tended to outperform stocks of large companies. Stattman (1980)&lt;sup id=&quot;fnref:30&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:30&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt; and &lt;a href=&quot;https://doi.org/10.1111/j.1540-6261.1991.tb04642.x&quot;&gt;Chan et al. (1991)&lt;/a&gt;&lt;sup id=&quot;fnref:31&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:31&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;14&lt;/a&gt;&lt;/sup&gt;, among others, found that stocks with high ratios of &lt;a class=&quot;tooltip&quot; href=&quot;https://www.investopedia.com/terms/b/bookvalue.asp&quot;&gt;book value&lt;span class=&quot;tooltiptext&quot;&gt;book value = total assets – total liabilities&lt;/span&gt;&lt;/a&gt; to market value (B/M) tended to outperform stocks with low B/M ratios.&lt;/p&gt;

  &lt;p&gt;&lt;a href=&quot;https://doi.org/10.1111/j.1540-6261.1992.tb04398.x&quot;&gt;Fama &amp;amp; French (1992)&lt;/a&gt;&lt;sup id=&quot;fnref:29&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:29&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;15&lt;/a&gt;&lt;/sup&gt; proposed a new &lt;strong&gt;three-factor model&lt;/strong&gt; as an alternative to CAPM. Instead of assessing the risk of a stock using beta alone, the three-factor model includes—you guessed it—three factors: beta, size, and B/M. Fama &amp;amp; French found that this model was much better than CAPM at explaining the variance in stocks’ returns.&lt;/p&gt;

  &lt;p&gt;Since then, other market anomalies have been proposed, and many researchers now prefer four-factor, five-factor, or even bigger models. &lt;a href=&quot;https://doi.org/10.1016/j.jfineco.2019.08.004&quot;&gt;Time Series Momentum: Is It There?&lt;/a&gt;&lt;sup id=&quot;fnref:2:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; uses a four-factor model that includes beta, size, value, and momentum, where momentum is the tendency of stocks with high past 12-month returns to outperform stocks with low 12-month returns.&lt;/p&gt;

  &lt;p&gt;(If momentum sounds similar to TSMOM, that’s because it is. The key difference is that the momentum factor goes long stocks with high relative returns and shorts stocks with low relative returns, whereas TSMOM goes long or short based on an asset’s absolute return. For example, in a situation like the 2008 Global Financial Crisis where almost everything is down, the momentum factor shorts the stocks that are crashing the hardest while buying the stocks that only declined a little bit. Meanwhile, TSMOM simply shorts everything.)&lt;/p&gt;

  &lt;p&gt;To do a &lt;strong&gt;factor regression&lt;/strong&gt;, take some investment strategy—in our case, TSMOM—and run a linear regression where the independent variables are the factors in your model: beta, size, value, and momentum. The regression tells you how much of your strategy’s performance can be explained by known factors. The regression’s intercept, or &lt;strong&gt;alpha&lt;/strong&gt;, tells you how much of the performance &lt;em&gt;can’t&lt;/em&gt; be explained.&lt;/p&gt;

  &lt;p&gt;If TSMOM has a large positive return but near-zero alpha, that means its performance can be explained by factors we already knew about; it’s not doing anything novel.&lt;/p&gt;
&lt;/details&gt;

&lt;div align=&quot;center&quot; id=&quot;table-5&quot;&gt;Table 5: TSMOM and TSH regressions on Fama-French four-factor model (1986–2015)&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;TSMOM&lt;/th&gt;
      &lt;th&gt;(t-stat)&lt;/th&gt;
      &lt;th&gt;TSH&lt;/th&gt;
      &lt;th&gt;(t-stat)&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;annual alpha&lt;/td&gt;
      &lt;td&gt;1.81%&lt;/td&gt;
      &lt;td&gt;(1.94)&lt;/td&gt;
      &lt;td&gt;0.05%&lt;/td&gt;
      &lt;td&gt;(0.80)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;beta&lt;/td&gt;
      &lt;td&gt;0.02&lt;/td&gt;
      &lt;td&gt;(0.61)&lt;/td&gt;
      &lt;td&gt;0.25***&lt;/td&gt;
      &lt;td&gt;(8.54)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;size&lt;/td&gt;
      &lt;td&gt;-0.06&lt;/td&gt;
      &lt;td&gt;(-1.83)&lt;/td&gt;
      &lt;td&gt;0.11***&lt;/td&gt;
      &lt;td&gt;(3.70)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;value&lt;/td&gt;
      &lt;td&gt;0.06&lt;/td&gt;
      &lt;td&gt;(1.01)&lt;/td&gt;
      &lt;td&gt;0.08**&lt;/td&gt;
      &lt;td&gt;(2.96)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;momentum&lt;/td&gt;
      &lt;td&gt;0.60***&lt;/td&gt;
      &lt;td&gt;(9.99)&lt;/td&gt;
      &lt;td&gt;0.13**&lt;/td&gt;
      &lt;td&gt;(2.76)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;em&gt;*, **, and *** indicate significance at the 0.05, 0.01, and 0.001 levels, respectively.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The authors’ takeaway was that TSMOM does not work better than TSH. Indeed, the two strategies both had weak alphas. But there are some important differences, particularly on &lt;strong&gt;beta&lt;/strong&gt; and &lt;strong&gt;momentum&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;TSMOM had near-zero &lt;strong&gt;beta&lt;/strong&gt;. HLWZ’s regression reinforces what we saw in the &lt;a href=&quot;#diversification-benefits-of-tsmom-vs-tsh&quot;&gt;previous section&lt;/a&gt;: TSH is significantly related to equities, and TSMOM isn’t.&lt;/p&gt;

&lt;p&gt;The most powerful factor for explaining TSMOM is &lt;strong&gt;momentum&lt;/strong&gt;, sometimes called “cross-sectional momentum” to distinguish it from time series momentum. The momentum factor buys stocks with high relative returns and shorts stocks with low relative returns.&lt;/p&gt;

&lt;p&gt;Cross-sectional momentum and time-series momentum are closely related to each other, so it’s unsurprising that the performance of TSMOM is heavily explained by momentum. This does not diminish the value of TSMOM as an addition to an &lt;em&gt;equity portfolio&lt;/em&gt;; it only diminishes the appeal if you &lt;em&gt;already invest in momentum&lt;/em&gt;. And when we look at an extended data set—which I will do in the &lt;a href=&quot;#the-more-data-counterargument&quot;&gt;next section&lt;/a&gt;—we can find good evidence that TSMOM adds value even to a portfolio that already includes momentum.&lt;/p&gt;

&lt;p&gt;Even if TSMOM has only weak alpha on top of momentum, does that make TSMOM &lt;em&gt;worse&lt;/em&gt; than momentum? Not necessarily. What happens if you switch TSMOM and momentum around, and regress &lt;em&gt;momentum&lt;/em&gt; onto a four-factor model that includes market beta, size, value, and &lt;em&gt;TSMOM&lt;/em&gt;?&lt;/p&gt;

&lt;p&gt;Well, I did exactly that. I discovered that momentum had &lt;strong&gt;negative&lt;/strong&gt; 0.37% annual alpha in its own regression, compared to the positive 1.46% alpha for TSMOM that I got when I replicated HLWZ’s four-factor regression.&lt;sup id=&quot;fnref:27&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:27&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;16&lt;/a&gt;&lt;/sup&gt; Neither of these alphas was statistically significant, but this weakly suggests that TSMOM is, if anything, a &lt;em&gt;stronger&lt;/em&gt; factor than momentum. This finding is consistent with prior research.&lt;/p&gt;

&lt;details&gt;
  &lt;summary&gt;Prior research comparing momentum and TSMOM&lt;/summary&gt;
  &lt;p&gt;&lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2610288&quot;&gt;Goyal &amp;amp; Jegadeesh (2015)&lt;/a&gt;&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;17&lt;/a&gt;&lt;/sup&gt; examined momentum and TSMOM among individual equities for a variety of lookback periods (3 months, 6 months, 12 months, etc.). They found that TSMOM had significant alpha when regressed on momentum, but the reverse was not true. In fact, for most lookback periods, momentum had &lt;em&gt;negative&lt;/em&gt; alpha (see the last two columns of their Table 2). The authors found that stock momentum could be rescued by combining it with a TSMOM overlay on the equity index,&lt;sup id=&quot;fnref:17&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:17&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;18&lt;/a&gt;&lt;/sup&gt; which is good news for stock momentum, but TSMOM still plays a central role in the rescued version of the strategy.&lt;/p&gt;

  &lt;p&gt;&lt;a href=&quot;http://docs.lhpedersen.com/TimeSeriesMomentum.pdf&quot;&gt;Moskowitz, Ooi &amp;amp; Pedersen (2012)&lt;/a&gt;&lt;sup id=&quot;fnref:1:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; separately regressed multi-asset momentum and stock momentum onto TSMOM. They found negative alphas for both, although the alphas were not statistically significant in this case (t = -1.17 for multi-asset momentum and t = -0.93 for stock momentum; see Table 5, Panel B).&lt;/p&gt;
&lt;/details&gt;

&lt;p&gt;If we say TSMOM “isn’t there” because it’s largely explained by momentum, then it would be even more accurate to instead say momentum “isn’t there”.&lt;/p&gt;

&lt;details id=&quot;bonus-content-2&quot;&gt;
  &lt;summary&gt;Bonus content: Regression of TSMOM onto Fama-French global four factors plus TSH&lt;/summary&gt;

  &lt;table&gt;
    &lt;thead&gt;
      &lt;tr&gt;
        &lt;th&gt; &lt;/th&gt;
        &lt;th&gt; &lt;/th&gt;
        &lt;th&gt;t-stat&lt;/th&gt;
        &lt;th&gt;likelihood ratio&lt;/th&gt;
      &lt;/tr&gt;
    &lt;/thead&gt;
    &lt;tbody&gt;
      &lt;tr&gt;
        &lt;td&gt;annual alpha&lt;/td&gt;
        &lt;td&gt;1.57%&lt;/td&gt;
        &lt;td&gt;1.67&lt;/td&gt;
        &lt;td&gt;4.0&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;TSH&lt;/td&gt;
        &lt;td&gt;0.25*&lt;/td&gt;
        &lt;td&gt;2.50&lt;/td&gt;
        &lt;td&gt;22&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;beta&lt;/td&gt;
        &lt;td&gt;-0.04&lt;/td&gt;
        &lt;td&gt;-1.18&lt;/td&gt;
        &lt;td&gt;2.0&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;size&lt;/td&gt;
        &lt;td&gt;-0.09*&lt;/td&gt;
        &lt;td&gt;-2.53&lt;/td&gt;
        &lt;td&gt;24&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;value&lt;/td&gt;
        &lt;td&gt;0.04&lt;/td&gt;
        &lt;td&gt;0.70&lt;/td&gt;
        &lt;td&gt;1.3&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;momentum&lt;/td&gt;
        &lt;td&gt;0.56***&lt;/td&gt;
        &lt;td&gt;9.49&lt;/td&gt;
        &lt;td&gt;&amp;gt;10&lt;sup&gt;17&lt;/sup&gt;&lt;/td&gt;
      &lt;/tr&gt;
    &lt;/tbody&gt;
  &lt;/table&gt;

  &lt;p&gt;&lt;em&gt;*, **, and *** indicate significance at the 0.05, 0.01, and 0.001 levels, respectively.&lt;/em&gt;&lt;/p&gt;

  &lt;p&gt;According to this regression:&lt;/p&gt;

  &lt;ul&gt;
    &lt;li&gt;TSH partially explains the behavior of TSMOM: it has more explanatory power when combined with the four-factor model than it did by itself (&lt;a href=&quot;#table-3&quot;&gt;Table 3&lt;/a&gt;).&lt;sup id=&quot;fnref:16&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:16&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;19&lt;/a&gt;&lt;/sup&gt; However, it still comes nowhere close to fully explaining TSMOM.&lt;/li&gt;
    &lt;li&gt;The other regression coefficients look qualitatively similar to when TSH wasn’t included. TSMOM has positive but non-significant alpha, near-zero beta, and a large exposure to the momentum factor.&lt;/li&gt;
  &lt;/ul&gt;
&lt;/details&gt;

&lt;h2 id=&quot;the-more-data-counterargument&quot;&gt;The “more data” counterargument&lt;/h2&gt;

&lt;p&gt;What do you do when you find a non-statistically-significant positive alpha, and your data is underpowered to detect whether the alpha is real or spurious? You get more data.&lt;/p&gt;

&lt;p&gt;In 2017, Hurst, Ooi &amp;amp; Pedersen published &lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2993026&quot;&gt;A Century of Evidence on Trend-Following Investing&lt;/a&gt;&lt;sup id=&quot;fnref:18:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:18&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; which extended the data from &lt;a href=&quot;http://docs.lhpedersen.com/TimeSeriesMomentum.pdf&quot;&gt;Moskowitz, Ooi &amp;amp; Pedersen (2012)&lt;/a&gt;&lt;sup id=&quot;fnref:1:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; back to 1880. They found that a simulated TSMOM strategy—net of estimated trading costs and a &lt;a href=&quot;https://www.investopedia.com/terms/t/two_and_twenty.asp&quot;&gt;2-and-20 fee&lt;/a&gt;—earned a 7.3% return with a 9.7% standard deviation and zero correlation to US equities or bonds. The fact that TSMOM showed such strong results going back to 1880 is good evidence that the 1986–2015 findings were not a fluke.&lt;sup id=&quot;fnref:20&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:20&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;20&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;However, Hurst et al.’s extended dataset does not counter the criticism that TSMOM might be explained by TSH or known factors like momentum.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3325720&quot;&gt;Global Factor Premiums&lt;/a&gt;&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;21&lt;/a&gt;&lt;/sup&gt; by Baltussen, Swinkels &amp;amp; van Vliet (2019) collected data going back to 1800 across equity, bond, commodity, and currency markets to test six different factor premiums. Most importantly for our purposes, they included the momentum and TSMOM factors (they referred to the latter as “Trend”). Their paper did not directly test the question at hand, but they did provide &lt;a href=&quot;https://dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/H7Y5UQ&quot;&gt;replication data&lt;/a&gt;.&lt;sup id=&quot;fnref:42:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:42&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt; I used that data to test whether TSMOM has alpha when regressed on US equities&lt;sup id=&quot;fnref:21&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:21&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;22&lt;/a&gt;&lt;/sup&gt; plus the momentum and value factors.&lt;sup id=&quot;fnref:8&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;23&lt;/a&gt;&lt;/sup&gt; I started the backtest in 1927 because that’s the start date of the &lt;a href=&quot;https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html&quot;&gt;Ken French data library&lt;/a&gt;’s time series on equities and interest rates.&lt;sup id=&quot;fnref:10&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;24&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;I regressed TSMOM against five factors: equity beta, equity index value, equity index momentum, multi-asset value, and multi-asset momentum.&lt;sup id=&quot;fnref:19&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:19&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;25&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;div align=&quot;center&quot; id=&quot;table-6&quot;&gt;Table 6: TSMOM regression against Global Factor Premiums factors (1927–2016)&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;t-stat&lt;/th&gt;
      &lt;th&gt;likelihood ratio&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;annual alpha&lt;/td&gt;
      &lt;td&gt;10.31%***&lt;/td&gt;
      &lt;td&gt;8.0&lt;/td&gt;
      &lt;td&gt;&amp;gt;10&lt;sup&gt;13&lt;/sup&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;beta&lt;/td&gt;
      &lt;td&gt;0.00&lt;/td&gt;
      &lt;td&gt;0.3&lt;/td&gt;
      &lt;td&gt;1.03&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Val^EQ&lt;/td&gt;
      &lt;td&gt;0.06&lt;/td&gt;
      &lt;td&gt;1.7&lt;/td&gt;
      &lt;td&gt;4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Mom^EQ&lt;/td&gt;
      &lt;td&gt;0.02&lt;/td&gt;
      &lt;td&gt;0.5&lt;/td&gt;
      &lt;td&gt;1.12&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Val^MA&lt;/td&gt;
      &lt;td&gt;0.03&lt;/td&gt;
      &lt;td&gt;0.7&lt;/td&gt;
      &lt;td&gt;1.30&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Mom^MA&lt;/td&gt;
      &lt;td&gt;0.57***&lt;/td&gt;
      &lt;td&gt;15.6&lt;/td&gt;
      &lt;td&gt;&amp;gt;10&lt;sup&gt;50&lt;/sup&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;r&lt;sup&gt;2&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;0.27&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;em&gt;^EQ indicates an equity index factor; ^MA indicates a multi-asset factor. *, **, and *** indicate significance at the 0.05, 0.01, and 0.001 levels, respectively.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;From 1927 to 2016, TSMOM had alpha that can’t be explained by momentum.&lt;/p&gt;

&lt;details id=&quot;bonus-content-3&quot;&gt;
  &lt;summary&gt;Bonus content: Regression of TSMOM onto more factors, 1927–2016&lt;/summary&gt;

  &lt;p&gt;As a more conservative test, I regressed TSMOM onto the combination of the Global Factor Premiums factors plus the Fama-French four-factor model:&lt;/p&gt;

  &lt;table&gt;
    &lt;thead&gt;
      &lt;tr&gt;
        &lt;th&gt; &lt;/th&gt;
        &lt;th&gt; &lt;/th&gt;
        &lt;th&gt;t-stat&lt;/th&gt;
        &lt;th&gt;likelihood ratio&lt;/th&gt;
      &lt;/tr&gt;
    &lt;/thead&gt;
    &lt;tbody&gt;
      &lt;tr&gt;
        &lt;td&gt;annual alpha&lt;/td&gt;
        &lt;td&gt;8.74%***&lt;/td&gt;
        &lt;td&gt;6.8&lt;/td&gt;
        &lt;td&gt;&amp;gt;10&lt;sup&gt;10&lt;/sup&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;FF beta&lt;/td&gt;
        &lt;td&gt;0.04*&lt;/td&gt;
        &lt;td&gt;2.0&lt;/td&gt;
        &lt;td&gt;7&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;FF size&lt;/td&gt;
        &lt;td&gt;-0.00&lt;/td&gt;
        &lt;td&gt;-0.1&lt;/td&gt;
        &lt;td&gt;1.00&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;FF value&lt;/td&gt;
        &lt;td&gt;0.08*&lt;/td&gt;
        &lt;td&gt;2.5&lt;/td&gt;
        &lt;td&gt;25&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;FF momentum&lt;/td&gt;
        &lt;td&gt;0.16***&lt;/td&gt;
        &lt;td&gt;6.7&lt;/td&gt;
        &lt;td&gt;&amp;gt;10&lt;sup&gt;9&lt;/sup&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;Val^EQ&lt;/td&gt;
        &lt;td&gt;0.06&lt;/td&gt;
        &lt;td&gt;1.6&lt;/td&gt;
        &lt;td&gt;3&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;Mom^EQ&lt;/td&gt;
        &lt;td&gt;-0.02&lt;/td&gt;
        &lt;td&gt;-0.4&lt;/td&gt;
        &lt;td&gt;1.09&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;Val^MA&lt;/td&gt;
        &lt;td&gt;0.03&lt;/td&gt;
        &lt;td&gt;0.8&lt;/td&gt;
        &lt;td&gt;1.40&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;Mom^MA&lt;/td&gt;
        &lt;td&gt;0.56***&lt;/td&gt;
        &lt;td&gt;15.3&lt;/td&gt;
        &lt;td&gt;&amp;gt;10&lt;sup&gt;48&lt;/sup&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;r&lt;sup&gt;2&lt;/sup&gt;&lt;/td&gt;
        &lt;td&gt;0.30&lt;/td&gt;
        &lt;td&gt; &lt;/td&gt;
        &lt;td&gt; &lt;/td&gt;
      &lt;/tr&gt;
    &lt;/tbody&gt;
  &lt;/table&gt;

  &lt;p&gt;&lt;em&gt;*, **, and *** indicate significance at the 0.05, 0.01, and 0.001 levels, respectively.&lt;/em&gt;&lt;/p&gt;

  &lt;p&gt;Adding the Fama-French factors gave the model a little more explanatory power, but TSMOM still had highly significant alpha.&lt;/p&gt;

  &lt;p&gt;(It’s statistically questionable to regress against multiple similar factors because it risks overfitting, but it’s only questionable in that it can make alpha appear artificially small. In this case, alpha is still large and significant.)&lt;/p&gt;

  &lt;p&gt;As a more direct comparison to HLWZ, below is a regression of TSMOM onto the US four-factor model, 1927–2016. This is not a fair test because TSMOM is global, not US-only, but I am including it for completeness.&lt;/p&gt;

  &lt;table&gt;
    &lt;thead&gt;
      &lt;tr&gt;
        &lt;th&gt; &lt;/th&gt;
        &lt;th&gt; &lt;/th&gt;
        &lt;th&gt;t-stat&lt;/th&gt;
        &lt;th&gt;likelihood ratio&lt;/th&gt;
      &lt;/tr&gt;
    &lt;/thead&gt;
    &lt;tbody&gt;
      &lt;tr&gt;
        &lt;td&gt;annual alpha&lt;/td&gt;
        &lt;td&gt;13.93%***&lt;/td&gt;
        &lt;td&gt;9.6&lt;/td&gt;
        &lt;td&gt;&amp;gt;10&lt;sup&gt;19&lt;/sup&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;beta&lt;/td&gt;
        &lt;td&gt;0.05*&lt;/td&gt;
        &lt;td&gt;2.1&lt;/td&gt;
        &lt;td&gt;10&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;size&lt;/td&gt;
        &lt;td&gt;-0.01&lt;/td&gt;
        &lt;td&gt;-0.2&lt;/td&gt;
        &lt;td&gt;1.02&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;value&lt;/td&gt;
        &lt;td&gt;0.10**&lt;/td&gt;
        &lt;td&gt;3.0&lt;/td&gt;
        &lt;td&gt;84&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;momentum&lt;/td&gt;
        &lt;td&gt;0.23***&lt;/td&gt;
        &lt;td&gt;8.8&lt;/td&gt;
        &lt;td&gt;&amp;gt;10&lt;sup&gt;16&lt;/sup&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;r&lt;sup&gt;2&lt;/sup&gt;&lt;/td&gt;
        &lt;td&gt;0.07&lt;/td&gt;
        &lt;td&gt; &lt;/td&gt;
        &lt;td&gt; &lt;/td&gt;
      &lt;/tr&gt;
    &lt;/tbody&gt;
  &lt;/table&gt;

&lt;/details&gt;

&lt;h1 id=&quot;tsmom-yes-its-there&quot;&gt;TSMOM: Yes, it’s there*&lt;/h1&gt;

&lt;p&gt;*Gross of fees and trading costs.&lt;/p&gt;

&lt;p&gt;In summary:&lt;/p&gt;

&lt;p&gt;The pooled regression method of Moskowitz, Ooi &amp;amp; Pedersen (2012)&lt;sup id=&quot;fnref:1:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; &lt;a href=&quot;#the-pooled-regression-critique&quot;&gt;overstates TSMOM’s ability&lt;/a&gt; to predict future returns. However, TSMOM still has statistically strong historical results. HLWZ’s comparison of TSMOM to Time Series History (TSH) &lt;a href=&quot;#is-tsmom-just-a-fancy-way-of-buying-high-return-assets&quot;&gt;misses the mark&lt;/a&gt;: they correctly observe that both strategies had similar historical returns, but TSH mostly earned its returns via holding equities, while &lt;a href=&quot;#diversification-benefits-of-tsmom-vs-tsh&quot;&gt;TSMOM had near-zero beta&lt;/a&gt;; and a regression of TSMOM onto TSH showed that TSH had little ability to explain why TSMOM earned positive returns. TSMOM was &lt;a href=&quot;#hlwzs-four-factor-regression&quot;&gt;significantly explained by the momentum factor&lt;/a&gt;; conversely, however, the momentum factor had &lt;em&gt;negative&lt;/em&gt; returns after controlling for TSMOM.&lt;/p&gt;

&lt;p&gt;By &lt;a href=&quot;#the-more-data-counterargument&quot;&gt;extending the data back to 1927&lt;/a&gt;, the positive average return of TSMOM became highly statistically significant, even when regressing on the momentum factor.&lt;/p&gt;

&lt;p&gt;The evidence strongly suggests that TSMOM is a real phenomenon: assets that are trending upward tend to outperform those that are trending down. But whether TSMOM survives trading costs is another question.&lt;/p&gt;

&lt;h2 id=&quot;does-tsmom-survive-trading-costs&quot;&gt;*Does TSMOM survive trading costs?&lt;/h2&gt;

&lt;p&gt;Compared with hypothetical backtests, &lt;strong&gt;live trendfollowing funds&lt;/strong&gt; have notably worse performance. They have good returns in isolation, with near-zero correlation to equities, but their &lt;a href=&quot;#hlwzs-four-factor-regression&quot;&gt;four-factor alphas&lt;/a&gt; are not statistically significant.&lt;/p&gt;

&lt;p&gt;This is largely a data problem: we don’t have live fund performance going back
far enough. It’s also a problem of cost: live funds perform worse than
hypothetical backtests because they have management fees and trading costs. Even
if there is genuine alpha, it will be smaller, and smaller numbers are
statistically harder to detect.&lt;/p&gt;

&lt;p&gt;To estimate costs, I compared an index of trendfollowing funds&lt;sup id=&quot;fnref:3:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; against &lt;a href=&quot;https://www.aqr.com/&quot;&gt;AQR&lt;/a&gt;’s published &lt;a href=&quot;https://www.aqr.com/Insights/Datasets/Time-Series-Momentum-Factors-Monthly&quot;&gt;Time Series Momentum dataset&lt;/a&gt;, which gives hypothetical returns to a TSMOM strategy like the one studied by HLWZ. When I compared the live index against AQR’s benchmark, the two had very similar volatility and drawdown characteristics, but the live funds underperformed by just under four percentage points per year (on average) from 1987 to 2024. That suggests an all-in cost of roughly four percentage points per year.&lt;/p&gt;
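
&lt;p&gt;For concreteness, here’s a minimal sketch of that cost estimate in Python. The file names and column labels are placeholders, and the real series have to be spliced together as described in the footnote:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import pandas as pd

# Placeholder inputs: monthly returns for the live trend index
# (SG Trend spliced with BTOP50) and the AQR TSMOM benchmark.
live = pd.read_csv(&quot;trend_index_monthly.csv&quot;, index_col=0, parse_dates=True)[&quot;ret&quot;]
tsmom = pd.read_csv(&quot;aqr_tsmom_monthly.csv&quot;, index_col=0, parse_dates=True)[&quot;ret&quot;]

# Keep only the months where both series exist.
both = pd.concat([live, tsmom], axis=1, keys=[&quot;live&quot;, &quot;tsmom&quot;]).dropna()

# Annualized mean of the monthly return gap: a rough all-in cost
# estimate (management fees plus trading costs).
cost = (both[&quot;tsmom&quot;] - both[&quot;live&quot;]).mean() * 12
print(f&quot;estimated all-in cost: {cost:.2%} per year&quot;)
&lt;/code&gt;&lt;/pre&gt;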

&lt;p&gt;AQR’s TSMOM benchmark had 4.95% annual alpha on a four-factor regression&lt;sup id=&quot;fnref:35&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:35&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;26&lt;/a&gt;&lt;/sup&gt;, with a moderate t-statistic of 2.40 (p = 0.017). Due to fees and trading costs, the live trend index had 1.36% annual alpha, which was nowhere close to statistically significant (t = 0.60, p = 0.55). So we can’t confidently say that live funds survive trading costs.&lt;/p&gt;

&lt;p&gt;However, this is not a fair comparison. As we saw &lt;a href=&quot;#hlwzs-four-factor-regression&quot;&gt;before&lt;/a&gt;, TSMOM is heavily explained by the momentum factor; but the momentum factor is not free to trade, any more than TSMOM is.&lt;/p&gt;

&lt;p&gt;A straightforward way to handle this is to subtract estimated trading costs from the momentum factor.&lt;sup id=&quot;fnref:39&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:39&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;27&lt;/a&gt;&lt;/sup&gt; I did a four-factor regression with the added assumption that momentum costs the same amount to trade as TSMOM, as determined by the difference between the live trend index and AQR’s TSMOM index.&lt;/p&gt;
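
&lt;p&gt;Concretely, here’s a minimal sketch of that net-of-cost regression, assuming the live trend index and the four factors sit in one monthly table (the file name and column names are mine, not from any particular data source):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import pandas as pd
import statsmodels.api as sm

# Placeholder table of monthly returns: the live trend index plus
# the four factors (market, size, value, momentum).
df = pd.read_csv(&quot;factors_monthly.csv&quot;, index_col=0, parse_dates=True)

# Assume momentum costs as much to trade as TSMOM does: subtract
# the roughly 4 pp/yr all-in cost estimate, spread over 12 months.
df[&quot;mom_net&quot;] = df[&quot;mom&quot;] - 0.04 / 12

X = sm.add_constant(df[[&quot;mkt&quot;, &quot;smb&quot;, &quot;hml&quot;, &quot;mom_net&quot;]])
fit = sm.OLS(df[&quot;trend&quot;], X).fit()

# Annualize the monthly alpha by multiplying by 12 (a simple
# approximation, matching the conversion used elsewhere here).
alpha = fit.params[&quot;const&quot;] * 12
tstat = fit.tvalues[&quot;const&quot;]
print(f&quot;annual alpha = {alpha:.2%}, t = {tstat:.2f}&quot;)
&lt;/code&gt;&lt;/pre&gt;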

&lt;div align=&quot;center&quot; id=&quot;table-7&quot;&gt;Table 7: Trend index regression on four-factor model net of costs (1987–2024)&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;estimate&lt;/th&gt;
      &lt;th&gt;t-stat&lt;/th&gt;
      &lt;th&gt;likelihood ratio&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;annual alpha&lt;/td&gt;
      &lt;td&gt;4.83%*&lt;/td&gt;
      &lt;td&gt;2.3&lt;/td&gt;
      &lt;td&gt;14&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;beta&lt;/td&gt;
      &lt;td&gt;-0.02&lt;/td&gt;
      &lt;td&gt;-0.4&lt;/td&gt;
      &lt;td&gt;1.09&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;size&lt;/td&gt;
      &lt;td&gt;-0.05&lt;/td&gt;
      &lt;td&gt;-0.9&lt;/td&gt;
      &lt;td&gt;1.51&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;value&lt;/td&gt;
      &lt;td&gt;0.08&lt;/td&gt;
      &lt;td&gt;1.4&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;momentum (net)&lt;/td&gt;
      &lt;td&gt;0.18***&lt;/td&gt;
      &lt;td&gt;4.5&lt;/td&gt;
      &lt;td&gt;21,000&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;r&lt;sup&gt;2&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;0.06&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;em&gt;*, **, and *** indicate significance at the 0.05, 0.01, and 0.001 levels, respectively.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A t-stat of 2.3 is decent but not amazing. (&lt;a href=&quot;https://people.duke.edu/~charvey/Research/Published_Papers/P118_and_the_cross.PDF&quot;&gt;Harvey et al. (2015)&lt;/a&gt;&lt;sup id=&quot;fnref:40&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:40&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;28&lt;/a&gt;&lt;/sup&gt; propose that factors should require a t-stat of at least 3.0 to be considered significant.)&lt;/p&gt;

&lt;p&gt;A second approach is to regress a live TSMOM fund against a live momentum fund. The downside of this approach is that most momentum funds have not existed for long. I &lt;a href=&quot;https://www.portfoliovisualizer.com/factor-analysis?s=y&amp;amp;sl=5W3lraKzhueYsLv3t54UxV&quot;&gt;ran a regression&lt;/a&gt; of AQR’s TSMOM fund (AQMIX) against AQR’s equity momentum fund (AMOMX) plus global equities (VT); AQMIX had 5.04% annual alpha, which was just barely statistically significant (t-stat = 1.97, p = 0.0499). But the fund history only goes back to 2010, so this test is underpowered to detect alpha (which we can see at a glance—5.04% is strong outperformance in practice, but it still only had p = 0.0499!).&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.2993026&quot;&gt;Hurst, Ooi &amp;amp; Pedersen (2017)&lt;/a&gt;&lt;sup id=&quot;fnref:18:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:18&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; estimated trading costs for TSMOM going back to 1880, using the methods from &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.313681&quot;&gt;Jones (2002)&lt;/a&gt;&lt;sup id=&quot;fnref:43&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:43&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;29&lt;/a&gt;&lt;/sup&gt; plus live trading data. They extrapolated from the live data by assuming costs were twice as high before 2002 and six times as high before 1993. After subtracting estimated costs and a &lt;a href=&quot;https://www.investopedia.com/terms/t/two_and_twenty.asp&quot;&gt;2-and-20 fee&lt;/a&gt;, they found that TSMOM still had strong performance back to 1880—an annualized return of 7.3% with 9.7% volatility. Their methodology was reasonable; however, this is ultimately an estimate, not live performance.&lt;/p&gt;
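
&lt;p&gt;Their backward extrapolation of costs is simple to express in code. Here’s a sketch; the gross return series and the modern cost level are placeholders, and it ignores the 2-and-20 fee, which they subtract separately:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import pandas as pd

def cost_multiplier(year):
    # Scaling assumed by Hurst, Ooi and Pedersen: trading costs were
    # six times the modern level before 1993, twice before 2002.
    if year &amp;lt; 1993:
        return 6.0
    if year &amp;lt; 2002:
        return 2.0
    return 1.0

# Placeholder gross monthly returns for a TSMOM strategy.
gross = pd.read_csv(&quot;tsmom_gross_monthly.csv&quot;, index_col=0, parse_dates=True)[&quot;ret&quot;]

# Subtract scaled costs; the modern monthly cost level is made up.
modern_cost = 0.01 / 12
mult = pd.Series([cost_multiplier(d.year) for d in gross.index], index=gross.index)
net = gross - modern_cost * mult
&lt;/code&gt;&lt;/pre&gt;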

&lt;p&gt;I’m highly confident that TSMOM is a real phenomenon. But I’m only moderately confident that TSMOM can be exploited by investors.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Moskowitz, T. J., Ooi, Y. H., &amp;amp; Pedersen, L. H. (2012). &lt;a href=&quot;http://docs.lhpedersen.com/TimeSeriesMomentum.pdf&quot;&gt;Time series momentum.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.1016/j.jfineco.2011.11.003&quot;&gt;10.1016/j.jfineco.2011.11.003&lt;/a&gt; &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:1:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:1:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:1:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;4&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:1:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;5&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:1:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;6&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:1:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;7&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Huang, D., Li, J., Wang, L., &amp;amp; Zhou, G. (2020). &lt;a href=&quot;https://doi.org/10.1016/j.jfineco.2019.08.004&quot;&gt;Time series momentum: Is it there?&lt;/a&gt; &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:2:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:2:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:2:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;4&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:11&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The authors called it TSM, but I’m calling it TSMOM for consistency with other publications on the topic, and to better distinguish it from the “TSH” strategy that HLWZ propose (which I will discuss shortly). &lt;a href=&quot;#fnref:11&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I used the &lt;a href=&quot;https://wholesale.banking.societegenerale.com/en/prime-services-indices/&quot;&gt;SG Trend Index&lt;/a&gt; from 1999 to 2024. The SG Trend Index did not exist prior to 1999, so for the earlier period I used the &lt;a href=&quot;https://portal.barclayhedge.com/cgi-bin/indices/displayHfIndex.cgi?indexCat=Barclay-Investable-Benchmarks&amp;amp;indexName=BTOP50-Index&quot;&gt;BTOP50 Index&lt;/a&gt;. BTOP50 is less representative because some of its constituent funds pursue other strategies in addition to trendfollowing. To my knowledge, BTOP50 is still primarily trendfollowing-focused, even if not exclusively so. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:3:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:18&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Hurst, B., Ooi, Y. H., &amp;amp; Pedersen, L. H. (2017). &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.2993026&quot;&gt;A Century of Evidence on Trend-Following Investing.&lt;/a&gt; &lt;a href=&quot;#fnref:18&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:18:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:18:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:25&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Please nobody bring up the fact that Victor Von Doom is the Supreme Lord of Latveria. In this hypothetical scenario, he doesn’t pay any attention to the Treasury department. Or maybe he’s misappropriating Treasury assets to fund his company, I don’t know, this is a fake scenario for illustrative purposes. Feel free to make up your own explanation. &lt;a href=&quot;#fnref:25&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:38&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;One way to compensate for this would be to subtract every asset’s average return from its return each period, such that its long-run average becomes zero; then run a pooled regression. However, this approach does not solve a second problem: it assumes assets’ returns are independent of each other. The standard approach when studying market factors is to use a &lt;a href=&quot;https://en.wikipedia.org/wiki/Fama%E2%80%93MacBeth_regression&quot;&gt;Fama-MacBeth regression&lt;/a&gt;, but that only works for cross-sectional factors, not time-series factors.&lt;/p&gt;
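
      &lt;p&gt;A minimal sketch of that demeaning step, with made-up column names:&lt;/p&gt;

      &lt;pre&gt;&lt;code&gt;import pandas as pd
import statsmodels.api as sm

# Placeholder panel: one row per (month, asset) with the realized
# return and the TSMOM predictor.
panel = pd.read_csv(&quot;panel.csv&quot;, parse_dates=[&quot;month&quot;])

# Subtract the long-run mean of each asset so its average is zero.
panel[&quot;ret_dm&quot;] = panel[&quot;ret&quot;] - panel.groupby(&quot;asset&quot;)[&quot;ret&quot;].transform(&quot;mean&quot;)

# Pooled regression on the demeaned returns. This still treats the
# observations as independent, which is the second problem.
fit = sm.OLS(panel[&quot;ret_dm&quot;], sm.add_constant(panel[&quot;signal&quot;])).fit()
print(fit.summary())
&lt;/code&gt;&lt;/pre&gt;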

      &lt;p&gt;There are a few ways to handle asset covariances that might work, but determining the right approach goes beyond my statistical knowledge. &lt;a href=&quot;#fnref:38&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:45&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;HLWZ reported alphas as monthly; I converted all monthly returns to annual because I find it more intuitive that way. &lt;a href=&quot;#fnref:45&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:15&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The exact calculation I used is &lt;code&gt;t.pdf(0, df=346) / t.pdf(tstat, df=346)&lt;/code&gt;, where &lt;code&gt;t&lt;/code&gt; is the Student’s t distribution from SciPy (&lt;code&gt;scipy.stats.t&lt;/code&gt;). There are 346 degrees of freedom, corresponding to the 348-month (29-year) sample minus two regression coefficients. &lt;a href=&quot;#fnref:15&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:42&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Praise be to the scientists who publish their data! &lt;a href=&quot;#fnref:42&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:42:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:12&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The authors present two variants of the strategies that use different asset weighting schemes, namely equal-weighting and inverse volatility-weighting. The latter is more typical for practitioners, but the authors focus on the former, so I will focus on the former as well. All of the tables and statistics in this article use the equal-weighted versions of TSMOM and TSH.&lt;/p&gt;

      &lt;p&gt;Qualitatively, inverse vol-weighting produced better performance for both TSMOM and TSH and also gave TSMOM a larger advantage over TSH. &lt;a href=&quot;#fnref:12&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:32&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Banz, R. W. (1981). &lt;a href=&quot;https://doi.org/10.1016/0304-405X(81)90018-0&quot;&gt;The relationship between return and market value of common stocks.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.1016/0304-405x(81)90018-0&quot;&gt;10.1016/0304-405x(81)90018-0&lt;/a&gt; &lt;a href=&quot;#fnref:32&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:30&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Stattman, D. (1980). Book values and stock returns. The Chicago MBA: A Journal of Selected Papers, 4:25-45. &lt;a href=&quot;#fnref:30&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:31&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Chan, L. K. C., Hamao, Y., &amp;amp; Lakonishok, J. (1991). &lt;a href=&quot;https://doi.org/10.1111/j.1540-6261.1991.tb04642.x&quot;&gt;Fundamentals and Stock Returns in Japan.&lt;/a&gt; &lt;a href=&quot;#fnref:31&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:29&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Fama, E. F., &amp;amp; French, K. R. (1992). &lt;a href=&quot;https://doi.org/10.1111/j.1540-6261.1992.tb04398.x&quot;&gt;The Cross-Section of Expected Stock Returns.&lt;/a&gt; &lt;a href=&quot;#fnref:29&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:27&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;My regression results differed slightly from HLWZ’s (see their Table 10), probably due to small details in the construction of the TSMOM strategy.&lt;/p&gt;

      &lt;table&gt;
        &lt;thead&gt;
          &lt;tr&gt;
            &lt;th&gt; &lt;/th&gt;
            &lt;th&gt;HLWZ&lt;/th&gt;
            &lt;th&gt;mine&lt;/th&gt;
          &lt;/tr&gt;
        &lt;/thead&gt;
        &lt;tbody&gt;
          &lt;tr&gt;
            &lt;td&gt;annual alpha&lt;/td&gt;
            &lt;td&gt;1.8%&lt;/td&gt;
            &lt;td&gt;1.46%&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
            &lt;td&gt;beta&lt;/td&gt;
            &lt;td&gt;0.02&lt;/td&gt;
            &lt;td&gt;0.02&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
            &lt;td&gt;size&lt;/td&gt;
            &lt;td&gt;-0.06&lt;/td&gt;
            &lt;td&gt;-0.06&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
            &lt;td&gt;value&lt;/td&gt;
            &lt;td&gt;0.06&lt;/td&gt;
            &lt;td&gt;0.04&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
            &lt;td&gt;momentum&lt;/td&gt;
            &lt;td&gt;0.60&lt;/td&gt;
            &lt;td&gt;0.59&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
            &lt;td&gt;r&lt;sup&gt;2&lt;/sup&gt;&lt;/td&gt;
            &lt;td&gt;0.46&lt;/td&gt;
            &lt;td&gt;0.45&lt;/td&gt;
          &lt;/tr&gt;
        &lt;/tbody&gt;
      &lt;/table&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:27&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Goyal, A., &amp;amp; Jegadeesh, N. (2015). &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.2610288&quot;&gt;Cross-Sectional and Time-Series Tests of Return Predictability: What Is the Difference?&lt;/a&gt; &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:17&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;That is, rather than go long or short each individual stock based on whether it has a positive or negative trend, you go long or short the entire market.&lt;/p&gt;

      &lt;p&gt;For the avoidance of doubt: Moskowitz et al.&lt;sup id=&quot;fnref:1:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; and HLWZ&lt;sup id=&quot;fnref:2:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; did not look at TSMOM on individual stocks; they used equity indexes, bonds, commodities, and currencies. &lt;a href=&quot;#fnref:17&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:16&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;This counterintuitive result is an example of &lt;a href=&quot;https://en.wikipedia.org/wiki/Simpson%27s_paradox&quot;&gt;Simpson’s paradox&lt;/a&gt; (one of my favorite paradoxes): a confounding variable influences TSMOM and TSH in opposite directions. The probable culprit is the size factor. TSMOM had negative exposure to the size factor while TSH had positive exposure, so controlling for size causes TSH’s explanatory power to go up. &lt;a href=&quot;#fnref:16&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:20&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;HLWZ (2020) cited Hurst et al. (2017), but did not address the evidence it presented. &lt;a href=&quot;#fnref:20&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Baltussen, G., Swinkels, L., &amp;amp; van Vliet, P. (2019). &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.3325720&quot;&gt;Global Factor Premiums.&lt;/a&gt; &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:21&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I would’ve included international equities too, but the publicly available databases of international equity returns only go back to 1987. We saw from the 1986–2015 regressions that TSMOM had minimal loading on equity beta, so I’m not particularly concerned about this; the main goal is to see how much of TSMOM is explained by momentum. &lt;a href=&quot;#fnref:21&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The value factor probably doesn’t matter for our purposes, but I included it because it’s one of the factors in the Carhart four-factor model. I excluded the size factor (SMB) because it was not in the Global Factor Premiums data. &lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:10&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;With some additional effort I could’ve constructed a data set going back to 1880, but it wouldn’t have made much difference because starting in 1927 still gives us nearly 60 years of out-of-sample data. &lt;a href=&quot;#fnref:10&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:19&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I did not include commodities, currencies, or fixed income as independent variables because they had some missing data; but my regression is more conservative than the ones performed by HLWZ (2020) because they only included equity factors. &lt;a href=&quot;#fnref:19&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:35&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Including US factors and international developed ex-US factors (= eight factors total), 1990–2025. &lt;a href=&quot;#fnref:35&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:39&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The size and value factors are not free to trade either, but (1) leaving them unchanged is more conservative, and (2) they’re cheaper to trade because they have lower turnover than momentum or TSMOM. &lt;a href=&quot;#fnref:39&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:40&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Harvey, C. R., Liu, Y., &amp;amp; Zhu, H. (2015). &lt;a href=&quot;https://doi.org/10.1093/rfs/hhv059&quot;&gt;… and the Cross-Section of Expected Returns.&lt;/a&gt; &lt;a href=&quot;#fnref:40&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:43&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Jones, C. M. (2002). &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.313681&quot;&gt;A Century of Stock Market Liquidity and Trading Costs.&lt;/a&gt; &lt;a href=&quot;#fnref:43&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>If AI alignment is only as hard as building the steam engine, then we likely still die</title>
				<pubDate>Sat, 10 Jan 2026 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2026/01/10/if_alignment_is_as_hard_as_the_steam_engine/</link>
				<guid isPermaLink="true">http://mdickens.me/2026/01/10/if_alignment_is_as_hard_as_the_steam_engine/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;You may have seen &lt;a href=&quot;https://x.com/ch402/status/1666482929772666880?lang=en&quot;&gt;this graph&lt;/a&gt; from Chris Olah illustrating a range of views on the difficulty of aligning superintelligent AI:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://res.cloudinary.com/lesswrong-2-0/image/upload/f_auto,q_auto/v1/mirroredImages/epjuxGnSPof3GnMSL/gdy9ehorotuet6uc8bce&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Evan Hubinger, an alignment team lead at Anthropic, &lt;a href=&quot;https://www.lesswrong.com/posts/epjuxGnSPof3GnMSL/alignment-remains-a-hard-unsolved-problem&quot;&gt;says&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;If the only thing that we have to do to solve alignment is train away easily detectable behavioral issues…then we are very much in the trivial/steam engine world. We could still fail, even in that world—and it’d be particularly embarrassing to fail that way; we should definitely make sure we don’t—but I think we’re very much up to that challenge and I don’t expect us to fail there.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I disagree; if governments and AI developers don’t start taking extinction risk more seriously, then we are not up to the challenge.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;&lt;img src=&quot;https://upload.wikimedia.org/wikipedia/commons/c/cc/Savery-engine.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Thomas Savery patented the first commercial steam pump in 1698.&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; The device used fire to heat up a boiler full of steam, which would then be cooled to create a partial vacuum and draw water out of a well. Savery’s pump had various problems, and eventually Savery gave up on trying to improve it. Future inventors improved upon the design to make it practical.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://www.supercars.net/blog/wp-content/uploads/2016/04/1769_Cugnot_SteamTractor6.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;It was not until 1769 that Nicolas-Joseph Cugnot developed the first steam-powered vehicle, something that we would recognize as a steam engine in the modern sense.&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; The engine took Cugnot four years to develop. Unfortunately, Cugnot neglected to include brakes—a problem that had not arisen in any previous steam-powered devices—and at one point he allegedly crashed his vehicle into a wall.&lt;sup id=&quot;fnref:8&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Imagine it’s 1765, and you’re tasked with building a steam-powered vehicle. You can build off the work of your predecessors who built steam-powered water pumps and other simpler contraptions; but if you build your engine incorrectly, you die. (Why do you die? I don’t know, but for the sake of the analogy let’s just say that you do.) You’ve never heard of brakes or steering or anything else that automobiles come with nowadays. Do you think you can get it all right on the first try?&lt;/p&gt;

&lt;p&gt;With a steam engine screwup, the machine breaks. Worst case scenario, the driver dies. ASI has higher stakes. If AI developers make a misstep at the end—for example, the metaphorical equivalent of forgetting to include brakes—everyone dies.&lt;/p&gt;

&lt;p&gt;Here’s one way the future might go if aligning AI is only as hard as building the steam engine:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The leading AI developer builds an AI that’s not quite powerful enough to kill everyone, but it’s getting close. They successfully align it: they figure out how to detect alignment faking, they identify how it’s misaligned, and they find ways to fix it. Having satisfied themselves that the current AI is aligned, they scale up to superintelligence.&lt;/p&gt;

  &lt;p&gt;The alignment techniques that worked on the last model fail on the new one, for reasons that would be fixable if they tinkered with the new model a bit. But the developers don’t get a chance to tinker with it. Instead what happens is that the ASI is smart enough to sneak through the evaluations that caught the previous model’s misalignment. The developer deploys the model—let’s assume they’re being cautious and they initially only deploy the model in a sandbox environment. The environment has strong security, but the ASI—being smarter than all human cybersecurity experts—finds a vulnerability and breaks out; or perhaps it uses superhuman persuasion to convince humans to let it out; or perhaps it continues to fake alignment for long enough that humans sign it off as “aligned” and fully roll it out.&lt;/p&gt;

  &lt;p&gt;Having made it out of the sandbox, the ASI proceeds to kill everyone.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I don’t have a strong opinion on how exactly this would play out. But if an AI is much smarter than you, and if your alignment techniques don’t fully generalize (and you can’t know that they will), then you might not get a chance to fix “alignment bugs” before you lose control of the AI.&lt;/p&gt;

&lt;p&gt;Here’s another way we could die even if alignment is relatively easy:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The leading AI developer knows how to build and align superintelligence, but alignment takes time. Out of fear that a competitor will beat them, or because the CEO is a sociopath who wants more power&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;, they rush to superintelligence before doing the relatively easy work of solving alignment; then the ASI kills everyone.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The latter scenario would be mitigated by a sufficiently safety-conscious AI developer building the first ASI, but none of the frontier AI companies have credibly demonstrated that they would do the right thing when the time came.&lt;/p&gt;

&lt;p&gt;(Of course, that still requires alignment to be easy. If alignment is hard, then we die even if a safety-conscious developer gets to ASI first.)&lt;/p&gt;

&lt;h3 id=&quot;what-if-you-use-the-aligned-human-level-ai-to-figure-out-how-to-align-the-asi&quot;&gt;What if you use the aligned human-level AI to figure out how to align the ASI?&lt;/h3&gt;

&lt;p&gt;Every AI company’s alignment plan hinges on using AI to solve alignment, a.k.a. alignment bootstrapping. Much of my concern with this approach comes from the fact that we don’t know how hard it is to solve alignment. If we stipulate that alignment is easy, then I’m less concerned. But my level of concern doesn’t go to zero, either.&lt;/p&gt;

&lt;p&gt;Recently, I &lt;a href=&quot;https://mdickens.me/2025/11/27/alignment_bootstrapping_is_dangerous/&quot;&gt;criticized alignment bootstrapping&lt;/a&gt; on the basis that:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;it’s a plan to solve a problem of unknown difficulty…&lt;/li&gt;
  &lt;li&gt;…using methods that have never been tried before…&lt;/li&gt;
  &lt;li&gt;…and if it fails, we all die.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If we stipulate that the alignment problem is easy, then that eliminates concern #1. But that still leaves #2 and #3. We don’t know how well it will work to use AI to solve AI alignment—we don’t know what properties the “alignment assistant” AI will have. We don’t even know how to tell whether what we’re doing is working; and the more work we offload to AI, the harder it is to tell.&lt;/p&gt;

&lt;h3 id=&quot;what-if-alignment-techniques-on-weaker-ais-generalize-to-superintelligence&quot;&gt;What if alignment techniques on weaker AIs generalize to superintelligence?&lt;/h3&gt;

&lt;p&gt;Then I suppose, by stipulation, we won’t die. But this scenario is not likely.&lt;/p&gt;

&lt;p&gt;The basic reason not to expect generalization is that you can’t predict what properties ASI will have. If it can out-think you, then almost by definition, you can’t understand how it will think.&lt;/p&gt;

&lt;p&gt;But maybe we get lucky, and we can develop alignment techniques in advance and apply them to an ASI and the techniques will work. Given the current level of seriousness with which AI developers take the alignment problem, we’d better pray that alignment techniques generalize to superintelligence.&lt;/p&gt;

&lt;p&gt;If alignment is easy &lt;em&gt;and&lt;/em&gt; alignment generalizes, we’re probably okay.&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; If alignment is easy but doesn’t generalize, there’s a big risk that we die. More likely than either of those two scenarios is that alignment is hard. However, even if alignment is easy, there are still obvious ways we could fumble the ball and die, and I’m scared that that’s what’s going to happen.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/History_of_the_steam_engine&quot;&gt;History of the steam engine.&lt;/a&gt; Wikipedia. Accessed 2025-12-22. &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Nicolas-Joseph_Cugnot&quot;&gt;Nicolas-Joseph Cugnot.&lt;/a&gt; Wikipedia. Accessed 2025-12-22. &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Dellis, N. &lt;a href=&quot;https://www.supercars.net/blog/1769-cugnot-steam-tractor/&quot;&gt;1769 Cugnot Steam Tractor.&lt;/a&gt; Accessed 2025-12-22. &lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;This is an accurate description of at least two of the five CEOs of leading AI companies, and possibly all five. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;My off-the-cuff estimate is a 10% chance of misalignment-driven extinction in that scenario—still ludicrously high, but much lower than my unconditional probability. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>I'm wary of increasing government expertise on AI</title>
				<pubDate>Sun, 21 Dec 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/12/21/government_expertise_on_AI/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/12/21/government_expertise_on_AI/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Many people in AI safety, especially AI policy, want to increase government expertise. For example, they want to place people with AI research experience in relevant positions within government. That may not be a good idea.&lt;/p&gt;

&lt;p&gt;People who better understand AI can write more useful regulations. However, people with relevant expertise (such as ML researchers) tend to be &lt;em&gt;less&lt;/em&gt; in favor of strong regulations and &lt;em&gt;more&lt;/em&gt; in favor of accelerating AI development.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; We need regulations to prevent misaligned AI from killing everyone, and to prevent &lt;a href=&quot;https://mdickens.me/2025/11/20/research_wont_solve_non-alignment_problems/&quot;&gt;other kinds of catastrophes&lt;/a&gt;. If government expertise goes up, all else equal we will get fewer such regulations, not more.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;If we &lt;em&gt;do&lt;/em&gt; get strong regulations, then those regulations will turn out better if AI experts help write them. But we don’t get that by increasing expertise in general; we need AI expertise combined with an understanding of why powerful AI is dangerous.&lt;/p&gt;

&lt;p&gt;Government expertise on &lt;em&gt;AI safety&lt;/em&gt; matters more than expertise on &lt;em&gt;AI in general&lt;/em&gt;. But even there, I’m worried. The most legible AI safety “experts” are the ones who work at AI companies, where strong forces are pressuring them to believe that the alignment problem is solvable and that companies shouldn’t be regulated too hard. The sorts of people who I would most want to see in government AI safety roles&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; don’t have job titles like “Senior Alignment Researcher at OpenAI”; their titles are more like “Independent Researcher Guy (gender-neutral) Who Posts on LessWrong”, or “Guy Who Dropped Out of High School and Started an AI Safety Nonprofit but Has Never Published an ML Paper”.&lt;/p&gt;

&lt;p&gt;I don’t have a great answer for what to do here. It’s important for government to have expertise on AI, but naive efforts to increase government expertise may also increase the probability that AI kills everyone.&lt;/p&gt;

&lt;p&gt;One answer: Work on educating policy-makers specifically on AI risks, like what &lt;a href=&quot;https://palisaderesearch.org/&quot;&gt;Palisade Research&lt;/a&gt; does. Educating non-experts on AI risk seems less fraught than attempting to get experts hired.&lt;/p&gt;

&lt;p&gt;Another answer: Try to increase government &lt;em&gt;willingness&lt;/em&gt; to regulate AI, not government &lt;em&gt;expertise&lt;/em&gt;. Right now, there is not much political will for strong regulations. Without political will, nothing happens. Most of &lt;a href=&quot;https://mdickens.me/2025/11/22/where_i_am_donating_in_2025/&quot;&gt;the orgs I considered donating to this year&lt;/a&gt; work on increasing willingness to regulate AI.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;ML researchers enjoy doing ML research, and cognitive dissonance often prevents them from believing that ML research could be harmful and could even destroy the world.&lt;/p&gt;

      &lt;p&gt;According to a &lt;a href=&quot;https://www.pewresearch.org/internet/2025/04/03/how-the-us-public-and-ai-experts-view-artificial-intelligence/&quot;&gt;2025 Pew poll&lt;/a&gt;, 58% of the public and 56% of AI experts say they’re concerned that the government won’t go far enough in regulating AI. That’s good to see. But the same poll also shows that AI experts are less worried than the public about the dangers of AI. I fear that experts will favor regulations that don’t impede AI progress, which will ultimately do nothing to prevent extinction. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;At least with regard to their expertise, not necessarily their skill at navigating bureaucracy. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Rest in Peace Commento; Long Live Comentario</title>
				<pubDate>Fri, 12 Dec 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/12/12/long_live_comentario/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/12/12/long_live_comentario/</guid>
                <description>
                  
                  
                  
&lt;p&gt;Until a few days ago, my website supported comments via &lt;a href=&quot;https://commento.io/&quot;&gt;Commento&lt;/a&gt;. If you click on that link, you will find that the page doesn’t load. Unfortunately, that website was also hosting my website’s comments, so all the comments are gone now, and I have no way to recover them.&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; Some of y’all left some good comments, but future readers will never know what they were.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;(I knew Commento was no longer actively supported, but in my foolishness, I thought to myself, well, the comments still work, so I’ll keep using it. Too bad I didn’t back up the comments while I had the chance.)&lt;/p&gt;

&lt;p&gt;Commento was my third comment system. Originally I used Disqus, but I didn’t like how it impacted page load times, and I didn’t like how it disrespected my readers’ privacy. So I switched to a janky basic HTML commenting system that required me to manually copy/paste people’s comments into a text file so my website could serve them statically. That system was annoying&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;, so I switched to Commento, which was lightweight, privacy-respecting, and didn’t require manual effort on my part.&lt;/p&gt;

&lt;p&gt;Commento is dead. My website now uses &lt;a href=&quot;https://comentario.app/en/&quot;&gt;Comentario&lt;/a&gt;, which is basically the same as Commento except that (1) it still exists and (2) it’s self-hosted, which means even if Comentario stops existing and the website disappears, the comments on my website will still work.&lt;/p&gt;

&lt;p&gt;(Commento had an option for self-hosting, but it looked like a lot of work so I didn’t do it.)&lt;/p&gt;

&lt;p&gt;(Setting up Comentario self-hosting was a lot of work, confirming my suspicions. It took me about 10 hours, although to be fair, 8 of those 10 hours were spent trying to upgrade my server’s operating system because it was too old to be compatible with Comentario, and also it reached end-of-life in 2021 and probably had a lot of security vulnerabilities, oops. Anyway I hope y’all appreciate all the work I’m doing to prevent Disqus from spying on you.)&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I reached out to customer support. I think they are AWOL, but I’ll see if I can get them to send me a database backup. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;They’re not saved on web.archive.org either, because the comments were loaded dynamically. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The one upside to that system was that the “spam filter” was a primitive honeypot that you’d think would be trivial for spambots to route around, yet they fell for it 100% of the time.&lt;/p&gt;

      &lt;p&gt;The HTML for the comment submission form looked something like this:&lt;/p&gt;

      &lt;pre&gt;&lt;code&gt;&amp;lt;span hidden&amp;gt;
  &amp;lt;!-- honeypot: anyone who clicks this button is a spambot --&amp;gt;
  &amp;lt;button type=&quot;submit&quot; name=&quot;submit-trap&quot;&amp;gt;Submit&amp;lt;/button&amp;gt;
&amp;lt;/span&amp;gt;
&amp;lt;span&amp;gt;
  &amp;lt;button type=&quot;submit&quot; name=&quot;submit&quot;&amp;gt;Submit&amp;lt;/button&amp;gt;
&amp;lt;/span&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

      &lt;p&gt;Spambots would click on the fake hidden button every time. (I’m not exaggerating—out of the hundreds of spambots who attempted to comment on my website, literally zero of them made it past this filter.) &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>I need the Writing Style Guide people to figure out how to put a smiley face inside parentheses</title>
				<pubDate>Thu, 11 Dec 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/12/11/smiley_inside_parentheses/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/12/11/smiley_inside_parentheses/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;I can’t figure out any good way to put a smiley emoticon inside parentheses. There are five choices, all of which are bad:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Do the straightforward thing of just writing it (which puts two parentheses next to each other in a row, and makes it unclear where the smiley face ends and the parenthesis proper begins :)).&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Do that, but put the period inside the parentheses. (Which requires restructuring your sentences, and ends up looking ugly anyway, like some deformed double-mouth emoticon :).)&lt;/li&gt;
  &lt;li&gt;Put a space between the emoticon and the close parenthesis (which does look more visually distinct, but there’s no other situation where you put a space before the close parenthesis :) ).&lt;/li&gt;
  &lt;li&gt;Only put a single close parenthesis (possibly the worst option because you can’t tell if the parenthesis is part of the punctuation or part of the emoticon :).&lt;/li&gt;
  &lt;li&gt;Add some text after the smiley so it’s not at the end of the parenthetical (but maybe you have nothing left to say so the text is superfluous :) haha wouldn’t it be crazy if I wrote some extra stuff here?).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There’s a secret sixth option of “don’t use emoticons inside parentheses” but, like, what if I really want to? (emoticons are important sometimes :) )&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I put a smiley face here for the sake of illustration even though I’m not happy. Feel free to interpret it as a deranged losing-my-sanity smile. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>I did Inkhaven</title>
				<pubDate>Sun, 30 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/30/inkhaven/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/30/inkhaven/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;I published a post every day of November as part of the &lt;a href=&quot;https://www.inkhaven.blog/&quot;&gt;Inkhaven&lt;/a&gt; program, in which we are required to publish a post every day of November. Some of my readers knew that; others were confused about why I suddenly started posting so much.&lt;/p&gt;

&lt;p&gt;If you’re an email subscriber, you didn’t see every post because I only sent out the good ones—I didn’t want to bombard you with emails if you were accustomed to my typical once-per-week-or-three-months posting schedule. If you want to see the bad posts, they’re all on &lt;a href=&quot;https://mdickens.me/&quot;&gt;https://mdickens.me/&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Inkhaven had 40 other residents; you can see their posts on &lt;a href=&quot;https://www.inkhaven.blog/&quot;&gt;the website&lt;/a&gt;, and daily highlights at the &lt;a href=&quot;https://inkhavenspotlight.substack.com/&quot;&gt;Inkhaven Spotlight&lt;/a&gt;.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#ranking-all-of-my-inkhaven-posts&quot; id=&quot;markdown-toc-ranking-all-of-my-inkhaven-posts&quot;&gt;Ranking all of my Inkhaven posts&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-urge-to-be-the-best-at-something&quot; id=&quot;markdown-toc-the-urge-to-be-the-best-at-something&quot;&gt;The urge to be the best at something&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#did-i-write-good-am-i-blog-man-now&quot; id=&quot;markdown-toc-did-i-write-good-am-i-blog-man-now&quot;&gt;Did I write good? Am I blog man now?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;ranking-all-of-my-inkhaven-posts&quot;&gt;Ranking all of my Inkhaven posts&lt;/h2&gt;

&lt;p&gt;Here’s all of my posts, ordered from best to worst according to the judgment of my brain. Feel free to leave a comment explaining why my ranking is completely wrong.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/22/where_i_am_donating_in_2025/&quot;&gt;Where I Am Donating in 2025&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/08/call_or_write_your_representatives/&quot;&gt;Writing Your Representatives: A Cost-Effective and Neglected Intervention&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/20/research_wont_solve_non-alignment_problems/&quot;&gt;We won’t solve non-alignment problems by doing research&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/13/spot_check_alex_bores/&quot;&gt;Epistemic Spot Check: Expected Value of Donating to Alex Bores’s Congressional Campaign&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/04/do_small_protests_work/&quot;&gt;Do Small Protests Work?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/19/do_disruptive_protests_work/&quot;&gt;Do Disruptive or Violent Protests Work?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/16/ai_meta_one_shot/&quot;&gt;Knowing Whether AI Alignment Is a One-Shot Problem Is a One-Shot Problem&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/01/will_welfareans_get_to_experience_the_future/&quot;&gt;Will Welfareans Get to Experience the Future?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/27/alignment_bootstrapping_is_dangerous/&quot;&gt;Alignment Bootstrapping Is Dangerous&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/02/things_ive_become_more_confident_about/&quot;&gt;Things I’ve Become More Confident About&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/03/third_caffeine_self-experiment/&quot;&gt;My Third Caffeine Self-Experiment&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/11/baby_groot/&quot;&gt;Are Groot and Baby Groot the Same Person?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/05/how_can_I_not_know_whether_I&apos;m_having_a_good_experience/&quot;&gt;How Can I Not Know Whether I’m Having a Good Experience?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/15/what_if_ghosts_were_real/&quot;&gt;What If Ghosts Were Real?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/07/things_I_learned_from_college/&quot;&gt;Things I Learned from College&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/09/upside_volatility_is_bad/&quot;&gt;Upside Volatility Is Bad&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/29/little_things_I_do/&quot;&gt;Some little things I do to make life easier&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/12/ideas_too_short_for_essays_part_2/&quot;&gt;Ideas Too Short for Essays, Part 2&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/23/curiosity_stoppers/&quot;&gt;Some Curiosity Stoppers I’ve Heard&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/26/mtg_arena_budget_decklists/&quot;&gt;Magic: The Gathering Arena decklists for people on a budget&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/21/inconceivable/&quot;&gt;An unnecessarily long analysis of one line from The Princess Bride&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/29/prioritize_your_objectives/&quot;&gt;Prioritizing your objectives is better than grazing past them&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/29/I_like_reborrowed_words/&quot;&gt;I like reborrowed words&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/28/wartime_ethics/&quot;&gt;Wartime ethics is weird&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/30/inkhaven/&quot;&gt;I did Inkhaven&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/24/goals/&quot;&gt;I don’t like having goals&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/17/not_discovered_here_syndrome/&quot;&gt;Not-Discovered-Here Syndrome&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/06/cash_back/&quot;&gt;Cash Back&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/29/pet_peeves/&quot;&gt;My Carlin-esque list of pet peeves&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/29/gaming_keyboards/&quot;&gt;Gaming keyboards are not good for gaming&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/29/not_being_awkward_is_NP-hard/&quot;&gt;Behaving non-awkwardly is NP-hard&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/14/NCIS/&quot;&gt;In Defense of the NCIS Keyboard Scene&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/29/TV_is_better_when_you_trust_the_writers/&quot;&gt;TV is better when you trust the writers&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/18/god_gender/&quot;&gt;Why would God have a gender?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/29/belief_in_expert_mistakes/&quot;&gt;Belief in expert mistakes&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/29/kid_me_was_bad_at_mtg/&quot;&gt;Kid me was bad at Magic: The Gathering&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/25/fixing_quidditch/&quot;&gt;How to fix Quidditch&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/29/which_dream_checks_work/&quot;&gt;How do I know if I’m dreaming?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2025/11/10/long_title/&quot;&gt;the one with the long title&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The one with the long title stands out as the worst because I wrote it as a gimmick. I had to remove it from the front page of my website because it makes the site borderline-unreadable. You can still access the post via the &lt;a href=&quot;https://mdickens.me/2025/11/10/long_title/&quot;&gt;direct link&lt;/a&gt;, if you want to do that for some reason.&lt;/p&gt;

&lt;p&gt;My most underrated post was “We won’t solve non-alignment problems by doing research”. It made an important point that was highly underrated before I published that post, and it continues to be highly underrated because the post unfortunately did not spark a revolution in how people think about non-alignment problems.&lt;/p&gt;

&lt;h2 id=&quot;the-urge-to-be-the-best-at-something&quot;&gt;The urge to be the best at something&lt;/h2&gt;

&lt;p&gt;There are other Inkhaven residents who have written more popular posts than me, or been more insightful, or more heartfelt, or shown more improvement in their writing abilities, or done a better job at keeping up with what everyone else is writing (I only read maybe 30 out of the 1457&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; posts that Inkhaveners wrote this month). But by golly, I need to be the best at something, which is why I published 10 posts yesterday. Maybe I can’t write the best posts, or the longest posts, or the most posts, but at least I wrote the most posts &lt;em&gt;in one day&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;(My first draft said “I can write the most posts in one day”, but I doubt that’s true. I’m sure there are residents who could’ve written more posts than me if they’d put their minds to it. But I’m the one who actually &lt;em&gt;did&lt;/em&gt; write the most posts in one day. Unless someone breaks my record by posting 11 posts today.)&lt;/p&gt;

&lt;p&gt;Writers sometimes make an abrupt transition to a new subject that doesn’t make sense until later. For example:&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; I don’t watch much Twitch, but when I do, my favorite streamer is &lt;a href=&quot;https://www.twitch.tv/maynarde&quot;&gt;Maynarde&lt;/a&gt;. He’s not a big streamer but he fits my tastes well. We like the same video games and have the same stupid sense of humor&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;, and the people who post in chat are funny too. On my website I pretend I’m sophisticated and mature, but I’m actually a dumbass and Twitch chat is my outlet for acting like a dumbass.&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; Anyway, my point is, at one point Maynarde held the &lt;a href=&quot;https://www.speedrun.com/prodeus?h=100&amp;amp;x=xd1qe872&quot;&gt;speedrun world record in Prodeus&lt;/a&gt;. Is Maynarde a world-class speedrunner? No. But he got the world record by being basically good enough to do a speedrun, and trying it before anyone else.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;That’s the trick to being the best at something: find something that nobody else is doing, and then be pretty good at it. That’s what I did. I’m good enough at writing that (on a good day) I can crank out 10 low-effort posts in eight hours. (Especially given that I’d already had all the ideas beforehand and written outlines for most of them.)&lt;/p&gt;

&lt;p&gt;I wanted to write a post about Twitch this month; most people I know don’t watch it, and the way I watch it is pretty different from the typical Twitch viewer, so I feel like there are things to say. But I couldn’t quite figure out a thesis. So I’m just mentioning Twitch as a segue into describing the technique of being the best by doing something that nobody else has done yet.&lt;/p&gt;

&lt;h2 id=&quot;did-i-write-good-am-i-blog-man-now&quot;&gt;Did I write good? Am I blog man now?&lt;/h2&gt;

&lt;p&gt;During Inkhaven, I finished a lot of post ideas that were otherwise fated to languish in my “post ideas” file. I had the idea about &lt;a href=&quot;https://mdickens.me/2025/11/11/baby_groot/&quot;&gt;the philosophy of identity for Groot and Baby Groot&lt;/a&gt; in 2018, after watching &lt;em&gt;Avengers: Infinity War&lt;/em&gt; for the first time. And now, seven years later, I’ve finally turned it into a post. Was that a good use of my time? Was the post worth writing? I don’t know, you tell me.&lt;/p&gt;

&lt;p&gt;I also got the chance to talk to some talented writers and get useful advice and feedback. For example, &lt;a href=&quot;https://www.astralcodexten.com/&quot;&gt;Scott Alexander&lt;/a&gt; told me that I hedge too much.&lt;/p&gt;

&lt;p&gt;When I was in high school, I did the required thing where you arbitrarily pick a thesis (regardless of whether you believe it) and then confidently defend it. I didn’t like doing that—I want to figure out what’s correct, not just prove that I’m good at arguing. When I wrote personal stuff, I made sure to make it clear what I don’t know, and avoided dogmatically arguing for one side. (I didn’t always succeed at that, but I tried.) But there’s such a thing as taking it too far, which apparently I do. Except for in &lt;a href=&quot;https://mdickens.me/2025/11/20/research_wont_solve_non-alignment_problems/&quot;&gt;this post&lt;/a&gt;, because Scott made me remove all the hedge words from that one.&lt;/p&gt;

&lt;p&gt;I talked to &lt;a href=&quot;https://dynomight.net/&quot;&gt;Dynomight&lt;/a&gt; about my list of draft ideas. We talked about various things but my biggest takeaway was that I’m spending too long thinking about ideas before writing them, and instead I should just write them.&lt;/p&gt;

&lt;p&gt;I got plenty of valuable insight from other people, but if I give more examples, then I will feel like I need to list all of them, and I will feel bad if anyone gets left out. So I will leave it at two, with the acknowledgment that I talked to many other great people as well.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;as of this writing; some people haven’t published their last post yet.&lt;/p&gt;

      &lt;p&gt;The reason this number is larger than 40 x 30 is that some of the writing coaches and organizers were also publishing a post every day. (I don’t understand how they had time for that.) &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;This was my fanfic of the move Paul Graham pulled in &lt;a href=&quot;https://paulgraham.com/best.html&quot;&gt;The Best Essay&lt;/a&gt;, in by far one of the best&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; paragraphs I’ve ever read:&lt;/p&gt;

      &lt;blockquote&gt;
        &lt;p&gt;When a subtree comes to an end, you can do one of two things. You can either stop, or pull the Cubist trick of laying separate subtrees end to end by returning to a question you skipped earlier. Usually it requires some sleight of hand to make the essay flow continuously at this point, but not this time. This time I actually need an example of the phenomenon. For example, we discovered earlier that the best possible essay wouldn’t usually be timeless in the way the best painting would. This seems surprising enough to be worth investigating further.&lt;/p&gt;
      &lt;/blockquote&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;And we like the same music, which is true of zero (?) of my IRL friends. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Example: If streamer asks for anything at any point, or makes a comment that sounds vaguely like asking for something, chat is contractually obligated to respond with “I’ve got your X right here mate [PantsGrab emote]”. This never stops being funny. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://doomwiki.org/wiki/Zero-Master&quot;&gt;Zero-Master&lt;/a&gt;, who’s a top speedrunner in every Doom game, wanted to speedrun Prodeus but he was nice enough to wait until Maynarde got the record first. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I’ve been told my humor is too dry so I would like to use my footnote to clarify that that was a subtle joke, which you might have gotten if you read a particular one of the 10 posts I published yesterday. &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>How do I know if I'm dreaming?</title>
				<pubDate>Sat, 29 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/29/which_dream_checks_work/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/29/which_dream_checks_work/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;I’ve been interested in lucid dreaming since high school, with just enough success to say that my efforts haven’t been a complete waste of time. I have a lucid dream once every few months, which isn’t great. But I still do reality checks multiple times per day.&lt;/p&gt;

&lt;p&gt;The simplest way to lucid dream is to follow two steps:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Get into the habit of writing down your dreams as soon as you wake up, so you get better at remembering them.&lt;/li&gt;
  &lt;li&gt;Start doing regular reality checks.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;(But this is also the least reliable method, which is why I haven’t had much success.)&lt;/p&gt;

&lt;p&gt;A reality check is when you examine some aspect of the world to see if it looks unusual. An example of a reality check is to read some text, and then read it again. Many people find that in dreams, they have difficulty reading; or the words shift around and appear to say something different the second time.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;When I was in high school, I started experimenting with a variety of reality checks. Most of them didn’t work very well. The one that works best for me is to look at my left hand.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Do I have four distinct fingers and a thumb? Are my fingers clearly solid? Sometimes in dreams, I have too many fingers, or my fingers are indistinct as if I’m seeing double.&lt;/li&gt;
  &lt;li&gt;When I look at my palm, is my thumb pointed to the left rather than the right?&lt;/li&gt;
  &lt;li&gt;Count my fingers. Can I successfully count five of them? Sometimes in dreams, I have a hard time counting without messing up.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Beyond reality checks, I’ve found that certain things tend to happen a lot when I’m dreaming:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;I look in the mirror and I see that I have a strange haircut. For example, I cut my hair last week, and the other day I had a dream that I only finished cutting my hair halfway, and my hair was half short and half long. But sadly I failed to realize that I was dreaming.&lt;/li&gt;
  &lt;li&gt;I’m lifting weights, and I can perform many more reps than I expected, and the reps aren’t getting any harder.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When these things happen, sometimes I notice and become lucid; other times I don’t realize that anything is amiss.&lt;/p&gt;

&lt;p&gt;The problem is, I’m stupider in dreams than in real life. In real life, I’d notice if something weird was going on. (I often find myself doing a reality check in real life after something strange happens.) In dreams, the “weird detector” is dialed down, and weird things seem normal. Sure, I’m pedaling a canoe with my next door neighbor and also some guy I went to high school with. And the river is made of sand but then when I look at it a second time it’s made of water now. Nothing weird about that, why do you ask?&lt;/p&gt;

&lt;p&gt;One time I was hanging out with a group of friends and I was explaining to them how I do reality checks, and how you have to be diligent to do them regularly even if you’re pretty sure you’re awake. And I thought, I should probably take my own advice and do a reality check even though I’m obviously awake right now. So I looked at my hand and you’re never gonna believe this but it turns out I was dreaming! But then I got too excited and accidentally woke myself up.&lt;/p&gt;

&lt;p&gt;Anyway, what’s my point with all this? I don’t know, someone said it might be interesting if I write about my experience with lucid dreaming. My experience has mostly been that I do a lot of reality checks in real life but rarely do them in dreams and even when I do realize I’m dreaming, I just wake up immediately. I just did a reality check before writing this sentence and it turns out I’m not dreaming, my hand looks normal and everything.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>Prioritizing your objectives is better than grazing past them accidentally</title>
				<pubDate>Sat, 29 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/29/prioritize_your_objectives/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/29/prioritize_your_objectives/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;A silly argument:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The goal of this activity/institution is to achieve X. It doesn’t really achieve X, but it does achieve Y, which is even better!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If achieving Y is more important, why on earth would you go about that by trying and failing to achieve X? You should directly focus on Y instead.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;School allegedly teaches geometry/history/etc. Sometimes people complain that these skills aren’t useful, or that people forget everything anyway. People respond by saying it teaches socialization or learning how to learn or something.&lt;/p&gt;

&lt;p&gt;If the purpose of school is socialization, why not have 8 hours of recess? Why not have mixed-age classes instead of keeping kids in age-segregated cohorts? Instead of discouraging kids from disruptively talking during class, why not let them talk as much as they want?&lt;/p&gt;

&lt;p&gt;If the purpose of school is to teach people to learn how to learn, why not let them play video games all day? Video games require you to learn skills, so you’re still (ostensibly) learning how to learn. And I predict that people can learn a higher &lt;em&gt;volume&lt;/em&gt; of skills by playing video games than by studying.&lt;/p&gt;

&lt;p&gt;(For the record, I am skeptical that “learning how to learn” is a thing. Empirical research on this is complicated and shows mixed results.)&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;You should warm up before lifting weights. A lot of people warm up by doing random weird stuff. But the best way to warm up for an exercise is to do that same exercise with lighter weights, not some other thing: you warm up exactly the muscles you’ll be using, through the range of motion they’ll be moving through, with light enough weight that you won’t strain yourself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Learning Latin is good because it helps you learn English vocabulary/roots&lt;/strong&gt; – Why not just memorize English roots then?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Playing chess makes you smarter&lt;/strong&gt; – Well, presumably you want to be smarter so you can perform better at activity X (among other things), so it would be better to practice activity X directly because then you’d accomplish two things simultaneously, by getting &lt;em&gt;smarter&lt;/em&gt; and getting &lt;em&gt;better at X&lt;/em&gt;. (Plus, playing chess doesn’t make you smarter, but that’s beside the point.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Religion is good because it gives people a sense of community&lt;/strong&gt; – Why not just make a community that’s grounded in reality? Although to be fair, attempts to create secular communities with religion-y vibes have mostly failed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You should diet instead of taking Ozempic because dieting teaches you dedication&lt;/strong&gt; – What do you want dedication for? Whatever that thing is, build your dedication by doing that thing, instead of by doing the thing that no longer requires dedication. You might as well argue that people should ditch the printing press and transcribe books manually, or ditch their heater and cut firewood by hand, because those things teach dedication just as well as dieting does.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>My Carlin-esque list of pet peeves</title>
				<pubDate>Sat, 29 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/29/pet_peeves/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/29/pet_peeves/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Not that I’m remotely as funny as George Carlin, or that this list is funny at all. But he had many &lt;a href=&quot;https://en.wikipedia.org/wiki/Complaints_and_Grievances&quot;&gt;complaints and grievances&lt;/a&gt;, and today I would also like to complain about some stuff.&lt;/p&gt;

&lt;p&gt;This post contains spoilers for a lot of things. I won’t hide spoilers, but I will say the name of the thing before giving the spoiler.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;&lt;strong&gt;When people in their 30s or 40s (or even 20s) say they’re “living through my third once-in-a-lifetime recession.”&lt;/strong&gt; Who told you they were once in a lifetime? This kind of thing has been happening about once per decade since the invention of money.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When a judge refers to a courtroom as “my court.”&lt;/strong&gt; You’re not the dictator, you’re an administrator. You should be serving the people, not the other way around.&lt;/p&gt;

&lt;p&gt;Furthermore: the powerful have a moral duty to be polite to those over whom they have power, but the powerless have no corresponding duty. It’s nice when defendants are polite to judges, but politeness is supererogatory, and they are morally entitled to be as rude to judges as they want. Holding people in contempt of court for being rude to judges is a fake crime that was invented by petty power-hungry judges. If you hold power over someone’s life, and that person is rude to you, you are morally obligated not to retaliate.&lt;/p&gt;

&lt;p&gt;Relatedly: &lt;strong&gt;When someone with a PhD insists on being called “doctor”.&lt;/strong&gt; I thought the point of getting a PhD was to push the frontiers of human knowledge. Thank you for informing me that the actual reason is so you can act like you’re better than everyone else.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When a movie character falls off a ledge or something and is holding on by one hand, while their other hand dangles at their side.&lt;/strong&gt; Just reach up and grab it with your other hand!&lt;/p&gt;

&lt;p&gt;I tested this on a pull-up bar and I found that my left hand can only support my bodyweight for a few seconds, but if I’m holding the bar with one hand, it’s very easy to reach up and grab it with my other hand. In movies, apparently the former is pretty easy and the latter is impossible.&lt;/p&gt;

&lt;p&gt;(Relatedly: When a movie character is dangling off a ledge with one hand and holding another person with their other hand, and basically bicep curls the second person to safety. There are only maybe 20 people in the world who can one-armed bicep curl the bodyweight of another person, and they’re all 300+-pound hulking gorilla men, not handsome skinny movie stars.)&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=kYXfUEe7zqU&quot;&gt;Here’s a video&lt;/a&gt; testing this trope:&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/kYXfUEe7zqU?si=uARJe5ommqqDrdcf&quot; title=&quot;YouTube video player&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share&quot; referrerpolicy=&quot;strict-origin-when-cross-origin&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;p&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=pjV0ofXTxKk&quot;&gt;Here’s another video&lt;/a&gt; testing this trope, in which the testers successfully rescue themselves, but (1) they are both professional climbers, and (2) they are using two arms instead of one.&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/pjV0ofXTxKk?si=QnaBOo7HrZpIFPZD&quot; title=&quot;YouTube video player&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share&quot; referrerpolicy=&quot;strict-origin-when-cross-origin&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;p&gt;While we’re on strength-related pet peeves: &lt;strong&gt;When a movie character is trying to push a heavy object overhead and they struggle to squat it up, and then struggle to push it up with their arms.&lt;/strong&gt; Your legs can lift about 3 times as much weight as your arms. If you can just barely squat it up, there’s no way you can push it overhead. And if you struggle to push it up with your arms, then squatting it up will be easy.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=PCn1uAs_0VQ&quot;&gt;This scene from Spider-Man: Homecoming&lt;/a&gt; comes to mind. In this particular case, my headcanon is that the spider bite made Peter’s arms get disproportionately stronger.&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/PCn1uAs_0VQ?si=Nr4fStvyJoM30c1r&quot; title=&quot;YouTube video player&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share&quot; referrerpolicy=&quot;strict-origin-when-cross-origin&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;p&gt;(Am I inconsistent for complaining about inaccurate displays of upper body strength, but defending &lt;a href=&quot;https://mdickens.me/2025/11/14/NCIS/&quot;&gt;two people using the same keyboard&lt;/a&gt;? Maybe. These are my grievances and I’m allowed to gripe about whatever I want.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When people refer to works of fiction as “truth”&lt;/strong&gt; (as in, “it shows deep truths about the human condition”). The absolute chutzpah of some people to refer to something as “truth” when it is completely made up and everyone knows it.&lt;/p&gt;

&lt;p&gt;Notice how no one ever talks about physics or chemistry as “conveying universal truths”? There’s an inverse relationship between how often people describe something that way and how much truth it actually conveys.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A highly advanced alien race with allegedly far better ethics than humans, who make most of the same ethical reasoning errors as humans.&lt;/strong&gt; One theme you see sometimes: the aliens are considering wiping out humanity for being too unethical, but then decide not to because they see some people being good. If the aliens are so morally advanced, why are they still doing collective punishment? Humans (at least some of us) figured this one out 2600 years ago (“the child will not be punished for the parent’s sins”, &lt;a href=&quot;https://biblehub.com/ezekiel/18-20.htm&quot;&gt;Ezekiel 18:20&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When someone’s heart stops and people say “they were dead for 2 minutes.”&lt;/strong&gt; Heartbeat is a commonly-used proxy for death, but it’s not the same thing as death. If your heart stops and restarts, then you weren’t dead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Referring to an athlete in a light weight class as the GOAT.&lt;/strong&gt; The only reason that athlete wins competitions is because heavyweights aren’t allowed to compete against them. If they would lose to a mediocre heavyweight, how can you reasonably call them the GOAT?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The saying “freedom of speech doesn’t mean freedom from consequences.”&lt;/strong&gt; Yes it does mean exactly that. What else could it possibly mean? “Sure, you’re free to criticize Stalin, but you’re not free to not get sent to Siberia.”&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In a movie when two people are talking, and one of them walks away and says something quietly but the other person can still hear them somehow.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=IUH2mxQCYvM&quot;&gt;This scene&lt;/a&gt; from &lt;em&gt;Shazam!&lt;/em&gt; comes to mind:&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/IUH2mxQCYvM?si=X65d750AyFNNnst3&quot; title=&quot;YouTube video player&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share&quot; referrerpolicy=&quot;strict-origin-when-cross-origin&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;p&gt;&lt;strong&gt;When people use xkcd’s &lt;a href=&quot;https://xkcd.com/1053/&quot;&gt;Ten Thousand&lt;/a&gt; comic in a condescending way.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://imgs.xkcd.com/comics/ten_thousand_2x.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The whole point of the comic is that you shouldn’t be condescending toward people who don’t know things, but the comic ended up just giving people a whole new way of being condescending.&lt;/p&gt;

&lt;p&gt;If you call someone “one of today’s lucky ten thousand”, you’re not explicitly shaming them, but you’re still emphasizing that they didn’t know something and you did. If someone doesn’t know something, better to simply tell them without bringing extra attention to the fact that they didn’t know it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;“Compared to the earth, you’re just a tiny speck.”&lt;/strong&gt; Something being a “speck” is a fact about your perception, not reality—you look like a speck compared to the earth because human eyes aren’t sufficiently high-resolution to see yourself in a picture of the earth. But you’re exactly as big as you feel like you are, and the earth is far bigger. Reality has way, way more detail than we can fit in our brains.&lt;/p&gt;

&lt;p&gt;Or, “[Problem] is so massive that individuals can’t make a difference.” You can’t make a difference that you can perceive at the macro level, but that’s a fact about your perception, not about reality. You &lt;em&gt;can&lt;/em&gt; make a difference, but your perception isn’t good enough to be able to see the difference you’re making.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;“Time and again, countless studies have proven X” [no citations given].&lt;/strong&gt; This usually means X is false.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The saying, “by far one of the best.”&lt;/strong&gt; How can something be by far &lt;em&gt;one of&lt;/em&gt; the best? Either it’s better than everything else by a wide margin, or it’s not.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When academics invent jargon for a concept that already has a commonly-used word, and also redefine the commonly-used word to be incompatible with the normal definition, and then tell normal people that they’re wrong for using the original definition.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Some people claim that strawberries and raspberries are not berries, while bananas and eggplants are berries. Since ancient times, people have understood what berries are, and you can tell because they have “berry” in the name: strawberry, raspberry, blueberry, blackberry. Then some people decided that actually that’s not what “berry” means, it’s actually a fruit produced from a single flower containing one ovary. Nope, that’s not a berry, the word “berry” is already being used to describe blueberries and strawberries and whatnot. Find a new word for your thing.&lt;/li&gt;
  &lt;li&gt;Nate Soares has &lt;a href=&quot;https://twitter.com/So8res/status/1401670792409014273&quot;&gt;complained&lt;/a&gt; about “the definitional gymnastics required to believe that dolphins aren’t fish”, and even about &lt;a href=&quot;https://twitter.com/So8res/status/1401670809035304961&quot;&gt;berries in particular&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;I’ve heard people say “trees don’t exist.” What are these people smoking?? I know trees exist because I can see some outside my window right now. What they mean to say is “trees do not share a common ancestor that isn’t also shared by non-trees”, which is not remotely the same thing as trees not existing.&lt;/p&gt;

    &lt;p&gt;This would be like someone saying “clothes don’t exist”, and when you press them on it, it turns out what they mean is that clothes can’t be uniquely distinguished by material, because there are things made of cotton that aren’t clothing.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;For normal people, an implication is when you make a statement that suggests something is true without necessarily logically entailing it. For linguists, this is called an &lt;em&gt;implicature&lt;/em&gt;. And for linguists, the word “implication” can &lt;em&gt;only&lt;/em&gt; refer to a scenario where your statement logically entails something. An implication by the common-sense definition is necessarily &lt;em&gt;not&lt;/em&gt; an implication by the linguistic definition. This is dumb and linguists should use better terminology.&lt;/p&gt;

    &lt;p&gt;I understand the need to distinguish between two different definitions of “implication”, but why would you take the more common definition and give it a new word, and redefine “implication” to &lt;em&gt;only&lt;/em&gt; refer to the less common definition? Would have been much better to make the technical terms be, say, “implication” and “entailment” rather than “implicature” and “implication”. Or even “implicature” and “entailment” to avoid any ambiguity, and then let us normal people say “implication” whenever we want.&lt;/p&gt;

    &lt;p&gt;(My computer’s spellcheck says “implicature” is not a word. My computer vindicates me.)&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The phrase “quarter century.”&lt;/strong&gt; You’re trying too hard to make 25 years sound like a long time.&lt;/p&gt;

&lt;p&gt;People don’t realize how long ago 2024 was—an entire centicentury has passed!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When stories don’t understand how big planets are.&lt;/strong&gt; Some examples:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;In &lt;em&gt;Dragon Ball Z&lt;/em&gt;, characters shoot massive energy blasts that are visible from space and look to be over a thousand miles across, but when zoomed in to the characters’ perspective, they only cover a few miles at most.&lt;/li&gt;
  &lt;li&gt;The Death Star is powerful enough to destroy a planet, and the Second Death Star—which is canonically more powerful—can destroy a single starship at a time. It should be able to trivially destroy the entire rebel fleet in one shot, in the same way that a strongman who can deadlift a car should also be able to pick up a piece of lint. Except that’s not even a good analogy because the size difference between a fleet of ships and a planet is far greater than the difference between lint and a car.&lt;/li&gt;
  &lt;li&gt;Also from &lt;em&gt;Star Wars&lt;/em&gt;: Darth Vader says, “The ability to destroy a planet is insignificant next to the power of the force.” But no force user ever does anything remotely on the scale of destroying a planet. I cannot emphasize enough how big a planet is, and how trivial Jedi powers are compared to the ability to destroy a planet.&lt;/li&gt;
  &lt;li&gt;Superman can rotate a planet by pushing on it, and he can also get into a fistfight. If he’s strong enough to rotate a planet, then one of his punches should superheat the atmosphere, leaving a massive crater and creating a shockwave that propagates around the earth and destroys everything on the surface.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a form of scope insensitivity: any amount of force greater than about 100 tons gets treated as basically infinite, and is interchangeable with any other amount of force that’s also greater than 100 tons.&lt;/p&gt;

&lt;p&gt;(I picked 100 tons as a reference point because, according to Marvel canon, The Hulk can lift “over 100 tons”, even though we’ve seen him perform feats that require, I don’t know, twelve orders of magnitude more force than that?)&lt;/p&gt;

&lt;p&gt;I would once again like to reiterate that planets are absurdly large. Any comparison I could make will fail to capture how large planets are.&lt;/p&gt;

&lt;p&gt;(&lt;a href=&quot;https://en.wikipedia.org/wiki/Graham%27s_number&quot;&gt;Graham’s number&lt;/a&gt;, however, is somewhat larger.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When the “bad” characters make a decision that’s good ex ante, and the heroes oppose the decision, and then it turns out badly, thus “proving” the heroes right.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;In &lt;em&gt;Daredevil&lt;/em&gt; season 3, &lt;span class=&quot;spoiler&quot;&gt;the heroes oppose the plan to let Fisk live in a (government-owned) hotel in exchange for him exposing tons of organized crime. The plan was clearly a great idea ex ante, and the heroes opposed it because they just hate Fisk. Except then it turned out Fisk had a secret bunker underneath the hotel and was secretly controlling everything, which the heroes did not know; their position turned out to be correct by sheer coincidence.&lt;/span&gt;&lt;/li&gt;
  &lt;li&gt;In &lt;em&gt;Doom&lt;/em&gt; (2016), Samuel Hayden was harvesting energy from Hell, and Doomguy was not a fan of that decision. Hayden was correct ex ante that a perfect and unlimited energy source is extremely valuable and worth pursuing. Everything would have been fine if Olivia Pierce hadn’t started a demonic cult and intentionally opened a portal to Hell.&lt;/li&gt;
  &lt;li&gt;In the Warhammer 40K book &lt;em&gt;Horus Rising&lt;/em&gt;, &lt;span class=&quot;spoiler&quot;&gt;the space marine Saul Tarvitz used all his team’s explosive charges to blow up a single tree because (1) the xenos were using the tree to kill his men and (2) he didn’t want to dishonor his comrades by letting their corpses sit there. This was an understandable but debatable decision. His commander Eidolon chastised him for wasting all their charges, which is a fair criticism. Then, because Tarvitz is a protagonist and Eidolon is an antagonist, it turned out that destroying the tree inexplicably cleared up the weather and let more dropships come in. It turned out to be a magical tree and blowing it up solved all of their problems, but Tarvitz had no way of knowing that in advance.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But sometimes this trope gets subverted, and I always enjoy that. An example from &lt;em&gt;Attack on Titan&lt;/em&gt; season 1: &lt;span class=&quot;spoiler&quot;&gt;Eren, the unreasonable hothead, wants to turn into a titan and fight the female titan. The reasonable members of Levi squad advise against it and convince him not to, but then almost all of them get killed by the female titan because Eren isn’t there to help. Even in retrospect, it’s genuinely unclear who was right. This event becomes more layered by how it influences Eren’s motivations later on—without giving away too much, it made him feel like he can’t rely on other people and he needs to do things for himself.&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stories that retcon human technology as coming from aliens.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For example, in &lt;em&gt;Transformers&lt;/em&gt; (2007) and &lt;em&gt;Independence Day&lt;/em&gt;, much of 20th century technology was reverse-engineered from studying crashed aliens. Humans are fully capable of inventing stuff on our own, thank you very much!&lt;/p&gt;

&lt;p&gt;(The real-life version of this &lt;a href=&quot;https://allthetropes.org/wiki/ET_Gave_Us_Wi-Fi&quot;&gt;trope&lt;/a&gt; is people who think aliens built the pyramids.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Any news article with a title of the form “Thing Quietly Happens.”&lt;/strong&gt; It’s not quiet if it’s the subject of a news article.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When people write reviews as “the good, the bad, and the ugly”.&lt;/strong&gt; In the context of a review, “ugly” is just another word for “bad”. There’s no point in having two different sections that mean the same thing. Just because that was the title of a famous movie doesn’t mean it’s a good way to make a list.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;“Smart” characters who are only smart because they have the magical ability to know things that they couldn’t possibly know.&lt;/strong&gt; But, you see, they figured it out based on zero evidence, because they’re so smart.&lt;/p&gt;

&lt;p&gt;In BBC’s &lt;em&gt;Sherlock&lt;/em&gt;, Sherlock Holmes makes inferences based on insufficient data but always turns out to be right because the writers decided he is. In &lt;a href=&quot;https://www.youtube.com/watch?v=eKQOk5UlQSc&quot;&gt;this Pete Holmes sketch&lt;/a&gt;, Sherlock makes the same deductions as he did in BBC’s version, but he’s wrong every time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When a story sets up an interesting moral dilemma by giving the antagonist a sympathetic motivation but then they ruin it by having the antagonist act like a jerk to make sure you know they’re the bad guy.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;In &lt;em&gt;Across the Spider-Verse&lt;/em&gt;, &lt;span class=&quot;spoiler&quot;&gt;Miguel starts out as an antagonist who’s making some pretty good points actually, but then he starts acting blatantly evil in a way that’s not consistent with his earlier characterization.&lt;/span&gt;&lt;/li&gt;
  &lt;li&gt;In the &lt;em&gt;Star Wars&lt;/em&gt; expanded universe, Count Dooku has a compelling backstory in which he rightly has many grievances about how the Galactic Republic government is run; he vies for independence and gets support from the many states that have been mistreated by the Republic. But then to make sure there’s no ambiguity about who the good guys are, he constantly commits war crimes and basically tortures people for fun.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Subversions of this trope are often really fun to watch: when the writers set up a person as the villain, but you can also 100% see where they’re coming from and they make some good points. A few examples that come to mind are Ozymandias, Thanos&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;, and Francis Hummel (the villain in &lt;em&gt;The Rock&lt;/em&gt;).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When people lament how everything is centralized on the same four social media platforms, and nobody has their own website anymore.&lt;/strong&gt; I have a website! You could be reading it! (In fact, you’re probably reading it right now.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When RPGs railroad you into making an immoral decision and then get all philosophical,&lt;/strong&gt; like, “Who’s the real villain here? Really makes you think.” No, you FORCED me to do the unethical thing. I would’ve behaved ethically if you’d let me.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When people use the term “bodybuilder” to mean “extremely strong person”.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Bodybuilding is not a strength sport. You don’t perform well in bodybuilding by being strong; you perform well by having big muscles. Bodybuilders happen to be pretty strong—much stronger than the average person—but not as strong as people who specifically train for strength, like strongmen or powerlifters or sumo wrestlers.&lt;/p&gt;

&lt;p&gt;For example, Arnold Schwarzenegger’s &lt;a href=&quot;https://thebarbell.com/how-strong-was-arnold/&quot;&gt;best deadlift&lt;/a&gt; was 322 kg (710 lb). The &lt;a href=&quot;https://en.wikipedia.org/wiki/Progression_of_the_deadlift_world_record&quot;&gt;world record deadlift&lt;/a&gt; is somewhere between 460.4 kg (1,015 lb) and 510 kg (1,124 lb) depending on what rules you go by. Arnold was strong for a normal person but he’d come dead last in a world-class powerlifting or strongman competition.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When you ask someone a question about an uncertain subject where their credence interval is narrower than yours, and they respond with “I don’t know.”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let me give an example:&lt;/p&gt;

&lt;p&gt;Alice lives in San Francisco. Bob is visiting SF from England and needs to drive to Sacramento for a conference.&lt;/p&gt;

&lt;p&gt;Bob: “How long does it take to drive to Sacramento?”&lt;/p&gt;

&lt;p&gt;Alice: “I don’t know.”&lt;/p&gt;

&lt;p&gt;Alice, you have more information about this than Bob. You have at least a vague sense of where Sacramento is, and Bob doesn’t, so you have the ability to help him out here. Is it 30 minutes? 2 hours? 6 hours?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/The_Hedgehog_and_the_Fox&quot;&gt;Hedgehog&lt;/a&gt; answers for complex as-yet-unexplained phenomena.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Two examples I see a lot: we don’t really know why animals sleep, and we don’t really know why humans evolved to be so much smarter than other animals. I’m pretty sure these sorts of complex phenomena don’t have a single explanation and it bugs me when people propose a single thing as the sole reason. “Humans evolved intelligence to win arguments”; “humans evolved intelligence to get better at lying and detecting lies”; “humans evolved intelligence to better figure out how to track animal migration patterns.” Intelligence is useful for many things; there is no single reason why it evolved.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When a movie performs poorly in the box office and people accuse it of being a money laundering scheme.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I’m not saying money-losing films aren’t ever some sort of scheme. Maybe they are.[1] But the scheme definitely isn’t money laundering. Money laundering is when you funnel illegally-earned income into a legitimate business, which makes your income look HIGHER, not LOWER. It’s impossible to launder money by REDUCING your income.&lt;/p&gt;

&lt;p&gt;[1] &lt;em&gt;The Producers&lt;/em&gt; was about a money-losing film scheme. The way the scheme worked is they got investors to provide funding in return for a percentage of profits, and then intentionally made the worst movie possible so they wouldn’t have to pay investors back.&lt;/p&gt;

&lt;p&gt;Hollywood accounting is when you inflate your costs to make your profit look smaller than it really is, to avoid paying taxes. Which is, first of all, the opposite of money laundering (money laundering is when you pay more taxes on purpose). And second of all, even if you’re doing Hollywood accounting, you still want your &lt;em&gt;revenue&lt;/em&gt; to be as high as possible. You don’t want a box office bomb.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The phrase “lowest common denominator.”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It should actually be “&lt;em&gt;greatest&lt;/em&gt; common denominator”, but perhaps it’s confusing to have the word “greatest” in a phrase that identifies a quantity as being small. I would accept simply “common denominator”.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When someone writes a 500+-word comment but uses uncommon/non-standard abbreviations.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You probably spent half an hour writing that comment. Was it really that important to save the 5 seconds it would have taken to write out the full words?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When people don’t understand the difference between something being allowed and being government-mandated.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For example: “The United States doesn’t have parental leave.” Yes it does! Companies are fully allowed to offer parental leave! The US just doesn’t have &lt;em&gt;government-mandated&lt;/em&gt; parental leave.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The phrase “demand exceeds supply.”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;What people mean by this is something like, “Consumers want more of a good, but it’s too expensive.” Which is another way of saying, “If the price were lower, people would buy more of it.”&lt;/p&gt;

&lt;p&gt;Which is true for pretty much everything ever? Demand is downward-sloping. So this statement doesn’t convey any useful information.&lt;/p&gt;

&lt;p&gt;And my #1 pet peeve:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Describing someone in the past as “prescient” for observing a trend that was already happening at the time.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;I saw this tweet described as prescient:
&lt;img src=&quot;https://preview.redd.it/jddxrmskzyd61.jpg?width=828&amp;amp;auto=webp&amp;amp;s=cf3bf7e1ec17e84e157253cf9305c4c7e107dce5&quot; alt=&quot;&quot; /&gt;
The most well-known (alleged) widespread Wall Street fraud was in 2007. This tweet is from 2015. Being 8 years behind isn’t prescient.&lt;/p&gt;

    &lt;p&gt;(For posterity, this is a Bernie Sanders tweet made on 2015-11-14 saying “Wall street plays by the rules? Who are we kidding? The business model of Wall Street is fraud.”)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;em&gt;1984&lt;/em&gt; and &lt;em&gt;Brave New World&lt;/em&gt; are often described this way.
    &lt;ul&gt;
      &lt;li&gt;Orwell’s “we have always been at war with Eastasia” was inspired by real-life events, probably (I don’t know for sure). The first time I noticed this phenomenon in real life was in 2020 when the party line instantly flipped from “don’t wear a mask, masks don’t work” to “you have to wear a mask in public, this has always been the rule and we never said anything different”. But I’m sure this was far from the first time it happened.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;https://xkcd.com/1289/&quot;&gt;https://xkcd.com/1289/&lt;/a&gt; “prescient” about AI art:
&lt;img src=&quot;https://imgs.xkcd.com/comics/simple_answers_2x.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

    &lt;p&gt;Yes, I’m sure AI art was the first time in history that this comic was ever relevant.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;SNL’s &lt;a href=&quot;https://www.youtube.com/watch?v=nWMp_z7Jnxw&quot;&gt;“Enchilada” sketch&lt;/a&gt; is “prescient” because it was making fun of newscasters over-pronouncing foreign words, which they did at the time and still do.&lt;/li&gt;
  &lt;li&gt;SMBC’s &lt;a href=&quot;https://www.youtube.com/watch?v=sGArqoF0TpQ&quot;&gt;Both Sides&lt;/a&gt; sketch which makes fun of (a) creationists/woo and (b) news shows that present creationists/woo as equally credible. People in the comments say it’s more relevant than ever, seemingly forgetting that creationism used to be a real thing that people took seriously, and now they mostly don’t, so it’s actually &lt;em&gt;less&lt;/em&gt; relevant than ever.&lt;/li&gt;
  &lt;li&gt;Lonely Island was “way ahead of its time” for the song “When Will The Bass Drop?” (2014), which was making fun of a trend that had already been happening for 5+ years at that point.&lt;/li&gt;
  &lt;li&gt;“Trump, unfortunately, has decreased PEPFAR funding” written in mid-2024, described as “prescient” for predicting that he would decrease funding again in 2025 after being re-elected. Extrapolating a historical trend is not prescient. In fact, the original commenter never even predicted he would further decrease funding; they just observed that he had already done so.&lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Seen in a YouTube comment on a clip from &lt;em&gt;Malcolm in the Middle&lt;/em&gt;: “Malcom [sic] was ahead of its time just like walker Texas ranger you don’t see diverse storylines like this anymore now everything is forced”&lt;/p&gt;

    &lt;p&gt;How could it have been ahead of its time if you don’t see shows like it anymore?&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;“Ahead of its time” is another thing a lot of people get wrong. The Sopranos was the first TV show of its kind, but it wasn’t ahead of its time—contemporaries appreciated it, and it inspired many shows that aired soon after. It was right on time.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Thanos is clearly insane, but his actions make sense given his insanity, and he is never unnecessarily cruel—he only kills people because he thinks he has to. (At least that’s true of &lt;em&gt;Infinity War&lt;/em&gt; Thanos; &lt;em&gt;Endgame&lt;/em&gt; Thanos is a bit different.) &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Not being awkward is NP-hard</title>
				<pubDate>Sat, 29 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/29/not_being_awkward_is_NP-hard/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/29/not_being_awkward_is_NP-hard/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;This meme got me thinking:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/awkward.webp&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;That feeling when you’re smart enough to know how awkward you are, but not smart enough to know how not to be awkward&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The reason it works that way is that not being awkward is NP-hard, and I can prove it.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;It is known that the &lt;a href=&quot;https://en.wikipedia.org/wiki/Boolean_satisfiability_problem&quot;&gt;SAT&lt;/a&gt; problem is NP-hard. SAT, or the satisfiability problem, is the problem of taking a logical statement involving a series of boolean values, and determining whether there is some combination of true/false assignments to those values such that the overall logical statement is true.&lt;/p&gt;

&lt;p&gt;I will show that the problem of not being awkward is solvable in polynomial time &lt;em&gt;only if&lt;/em&gt; SAT is solvable in polynomial time.&lt;/p&gt;

&lt;p&gt;Let A be a statement that is known to be awkward. (For example, calling your interlocutor stinky could be regarded as awkward; the exact statement doesn’t matter.)&lt;/p&gt;

&lt;p&gt;Now consider some arbitrarily complicated combination of a series of logical statements including statement A. Call that combination statement S. For example:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;[you are stinky AND you have blue eyes] OR [you have red hair AND (you are stinky OR I am hungry)] OR […]&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If I express S verbally, then I am asserting that S is true. The question then becomes, is there some truth assignment to each of these sub-statements such that the whole statement is true, BUT “you are stinky” is false? Answering that question is equivalent to solving SAT.&lt;/p&gt;

&lt;p&gt;S is awkward if and only if “you are stinky” is forced to be true to make S true.&lt;/p&gt;

&lt;p&gt;To know whether “you are stinky” is forced to be true (in full generality, for any statement S), you have to solve SAT.&lt;/p&gt;

&lt;p&gt;Therefore, if you can solve for the awkwardness of a statement, then you can solve SAT.&lt;/p&gt;
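&lt;p&gt;(To make the reduction concrete, here’s a minimal brute-force sketch in Python. The example statement S and the variable names are invented for illustration. Deciding whether S is awkward amounts to asking whether any satisfying assignment leaves “you are stinky” false, which is exactly a SAT query.)&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from itertools import product

# A statement maps a dict of variable assignments to a bool.
# Example S: [stinky AND blue_eyes] OR [red_hair AND (stinky OR hungry)]
def S(v):
    return (v['stinky'] and v['blue_eyes']) or (
        v['red_hair'] and (v['stinky'] or v['hungry']))

def is_awkward(statement, variables, awkward_var='stinky'):
    # S is awkward iff every assignment that makes S true also makes
    # the awkward variable true, i.e. there is NO satisfying assignment
    # with awkward_var false. Checking that is a SAT query, so this
    # sketch brute-forces all 2^n assignments.
    for values in product([False, True], repeat=len(variables)):
        v = dict(zip(variables, values))
        if statement(v) and not v[awkward_var]:
            return False  # S can be true without calling anyone stinky
    return True  # 'you are stinky' is forced, so saying S is awkward

print(is_awkward(S, ['stinky', 'blue_eyes', 'red_hair', 'hungry']))
# prints False: red_hair=True, hungry=True satisfies S with stinky=False,
# so this particular S is safe to say
&lt;/code&gt;&lt;/pre&gt;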

&lt;hr /&gt;

&lt;p&gt;One conceivable objection to this proof: if it is ambiguous whether statement S requires “you are stinky” to be true, then perhaps S is unconditionally awkward. While this may be true in many social situations, it’s not &lt;em&gt;universally&lt;/em&gt; true.&lt;/p&gt;

&lt;p&gt;For not-being-awkward to be shown to be NP-hard, we don’t need to show that it &lt;em&gt;always&lt;/em&gt; solves SAT, only that there is a &lt;em&gt;class&lt;/em&gt; of statements that, if solved, would solve SAT. So consider the following social context:&lt;/p&gt;

&lt;p&gt;You are talking to a literal-minded mathematician. You say to this person, “If [complicated logical statement] then you’re stinky.” Calling them stinky would be awkward. But, as a literal-minded mathematician, they only take you to be calling them stinky if the antecedent is true. This person would not consider it awkward if you said, “If the moon is made of cheese then you’re stinky.”&lt;/p&gt;

&lt;p&gt;There only needs to be one such literal-minded mathematician in the world for my proof to hold, because if there is, that means you can’t &lt;em&gt;always&lt;/em&gt; identify a non-awkward statement in polynomial time. There are &lt;em&gt;some&lt;/em&gt; situations where identifying non-awkwardness solves SAT, and therefore the problem in full generality is NP-hard.&lt;/p&gt;

&lt;p&gt;(It is not yet known whether the awkwardness problem is NP-complete.)&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Thanks to Scott Aaronson for suggesting a way to simplify my proof.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>Some little things I do to make life easier</title>
				<pubDate>Sat, 29 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/29/little_things_I_do/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/29/little_things_I_do/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;In the spirit of You Can Just Do Things, here are some things I Just Do. Some of them are weird; others are normal, but frequently overlooked.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Edited 2025-12-03 to add a ninth thing.&lt;/em&gt;&lt;/p&gt;

&lt;!-- more --&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;You know how on a hot summer night, your pillow gets hot, and then you flip it over to the cool side? But then before long, the cool side becomes too hot? You can fix this by pouring cold water on a towel and then laying the towel over your pillow. The wet towel stays cold for much longer than a dry pillow would.&lt;/p&gt;

    &lt;p&gt;When I do this, I only soak half the towel, and then I fold it over so that the dry half of the towel goes between the pillow and the wet half. That way my pillow doesn’t get wet. In the morning I hang the towel up to dry, and I wash it after a few uses so it doesn’t get mildewy.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Ideal sleeping temperature is colder than ideal waking temperature. On colder days, I keep my bedroom window open and my bedroom door closed, which separates my apartment into the “cool part” (the bedroom) and the “warm part” (everywhere else).&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;If you walk through my neighborhood in the evening after a hot day, you will see box fans in every window. But not mine, because I bought a portable air conditioner.&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; Yes, portable ACs are “inefficient” in an electricity sense, but a box fan lowers my apartment temperature by about one degree per hour. Even a “bad” portable AC can get my bedroom down to 68 degrees in under 30 minutes.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;I buy only one kind of sock (specifically, &lt;a href=&quot;https://www.amazon.com/dp/B0BPR5JCRZ&quot;&gt;Hanes X-Temp Cushioned No Show Socks&lt;/a&gt;—I tested four different brands and this one was my favorite of the four). Not only do I not need to match my socks, I don’t even need to pair them; I just throw them in a pile in my sock drawer.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;I live in a high-cost-of-living area, which means groceries are expensive. I order non-perishable groceries online and have them delivered, which is somehow both cheaper and more convenient than buying them at the store.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;When I do go to the grocery store, I go first thing in the morning because the store is nearly empty at 7am. (This strategy doesn’t work if you don’t wake up until later.)&lt;/p&gt;

    &lt;p&gt;I am not much of a morning person, but I force myself to go to the store right after waking up and then I’m allowed to chill out and do nothing for the next couple hours.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;I keep a backup of every basic household item so that I never run out. When I break out the backup, that means it’s time to buy more. I have two bottles of shampoo, two boxes of peppermint tea, two cartons of soymilk (actually six but who’s counting), two packs of razor blades, two bottles of vitamins, …&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Many people lament that fitted sheets are too hard to fold. I don’t fold my fitted sheets; but I don’t &lt;em&gt;not&lt;/em&gt; fold them, either. My strategy is to always have one set of sheets on my bed and one set in the hamper. When I do laundry, the sheets come off my bed and into the hamper, and the freshly cleaned set goes onto the bed. No folding necessary. And I don’t need to remember to strip the bed before doing laundry, because there’s always a set of sheets waiting to be cleaned.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;My hands get cold easily, but if I wear gloves, I lose a lot of dexterity. Fingerless gloves are a compromise, but they normally leave 2/3 of the finger exposed, so my fingers still get cold. I made my own fingerless gloves by cutting the tips off of some regular gloves. My homemade gloves cover almost the full surface of my finger, but my naked fingertips can do a lot more things than gloved fingers can.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I live in a one-bedroom apartment rather than a studio even though I don’t like spending money; the improved sleep temperature alone might be enough to justify the extra expense. Or maybe that’s just my excuse. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Well that’s not entirely true because I usually do use a box fan, but if the fan isn’t cutting it then I bring out the AC. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;My ideal sleeping temperature is more like 60 to 63, but I don’t want to run the AC &lt;em&gt;that&lt;/em&gt; hard. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Kid me was bad at Magic: The Gathering</title>
				<pubDate>Sat, 29 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/29/kid_me_was_bad_at_mtg/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/29/kid_me_was_bad_at_mtg/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;I played a lot of MTG from age 9 to 14 or so. I picked the game back up recently and was immediately better than my 14-year-old self. I don’t have any direct way to prove this, but I’m pretty sure it’s true.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;When I was a kid, I liked zombie cards. (And I still do!&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;) I looked up zombie decklists online and they all used &lt;span class=&quot;mtg-tooltip&quot;&gt;Carrion Feeder&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;https://cards.scryfall.io/large/front/0/a/0a19da90-880e-4eca-8cf7-6d7baf090d53.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;. As a kid, this confused me. Carrion Feeder is bad, I said! It can’t block! Attacking and blocking are the two things creatures do, so if you can’t block, the creature is only half as good! And its ability is useless—why would I want to sacrifice an entire creature to pump up a 1/1?&lt;/p&gt;

&lt;p&gt;Now, looking at the card again, I get it. Carrion Feeder is &lt;em&gt;really good&lt;/em&gt;. Maybe not ban-worthy, but it’s strong enough that they’ve stopped printing it, and instead they print worse versions like &lt;span class=&quot;mtg-tooltip&quot;&gt;Bloodflow Connoisseur&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;https://cards.scryfall.io/normal/front/6/f/6f198b57-8bfe-4d13-9a16-10990707b455.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;, which is mostly the same except that it costs 3 mana instead of 1.&lt;/p&gt;

&lt;p&gt;When they released &lt;span class=&quot;mtg-tooltip&quot;&gt;Isamaru, Hound of Konda&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;https://cards.scryfall.io/large/front/6/a/6afead32-3542-44c4-82d6-b6a81beb9f90.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;—the first ever 2/2 for 1 mana—I thought it was dumb power creep. But looking at it now, it seems fine. Abilities are really important. In most circumstances, I’d rather play a 1/1 with a good ability than a vanilla 2/2. Plus, Isamaru is legendary, which means you can only have one copy in play at a time—you can’t blitz out a bunch of copies and roll over your opponent in the first three turns.&lt;/p&gt;

&lt;p&gt;What was my mistake? I didn’t see the possibilities. As a kid, I only thought about the average-case use for an ability. But that’s wrong because you can specifically engineer a deck to make good use of an ability.&lt;/p&gt;

&lt;p&gt;I also simply overlooked some obvious interactions. You can chump block with a creature, sacrifice it to Carrion Feeder, and then take no damage. You just got a +1/+1 counter for free.&lt;/p&gt;

&lt;p&gt;As a kid, I thought I would get worse at video games as I aged. I was extremely wrong. The best direct comparison I have: I got my parents to buy me &lt;em&gt;Halo: Combat Evolved&lt;/em&gt; when I was 13 or 14. I remember one four-hour play session where I made it most but not all of the way through level 2 on Normal difficulty. I went back and played Halo again when I was 21, and it took me an hour and a half to fully beat level 2 on Heroic. I can’t prove it without a time machine, but I am confident that with an hour of practice, today-me could beat 13-year-old me in any game, regardless of how much experience 13-year-old me had with it.&lt;/p&gt;

&lt;p&gt;I talked with &lt;a href=&quot;https://www.lesswrong.com/users/screwtape&quot;&gt;Screwtape&lt;/a&gt; about this. Unlike me, he has played a lot of MTG against kids. He’s observed that most middle school aged kids can’t see any card combos that involve 4+ cards—perhaps it’s a working memory issue.&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Screwtape also pointed out that MTG involves a lot of algebra, which I hadn’t realized because you can’t see the equations. But they’re there—deciding how to attack and block with your creatures is an algebra problem. Deciding where to play your buff spells is an algebra problem. I knew algebra when I was 14, but I’m a lot more practiced at it now.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I like zombie cards, but I don’t like zombie apocalypse movies. After thinking about it, I realized that’s because they’re two totally different things. An evil necromancer reanimating the dead is cool. A virus infecting everyone and making them zombies is not particularly cool. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;He also said some 12 year olds will kick your ass at Magic. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Gaming keyboards are not good for gaming</title>
				<pubDate>Sat, 29 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/29/gaming_keyboards/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/29/gaming_keyboards/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Nearly all gaming keyboards use the conventional typewriter-inspired keyboard shape. That is not a good shape for typing or gaming or frankly anything else.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/pc-gamer-keyboards.webp&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;a href=&quot;https://www.pcgamer.com/best-gaming-keyboard/&quot;&gt;image source&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;The gaming experience is vastly improved by keyboards that have thumb keys, such as the &lt;a href=&quot;https://kinesis-ergo.com/shop/advantage2/&quot;&gt;Kinesis Advantage&lt;/a&gt; or the &lt;a href=&quot;https://www.maltron.com/&quot;&gt;Maltron&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://kinesis-ergo.com/wp-content/uploads/kb600-oh.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;If you move the Shift key to one of the thumb keys (which you should&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;), then Shift-hotkeys and Control-hotkeys become much easier to press. Looking at how &lt;a href=&quot;https://www.youtube.com/watch?v=lY3ubYuZLPk&quot;&gt;pro StarCraft players&lt;/a&gt; have to contort their hands to reach the shift key, I can’t help but imagine how much better they’d be at the game if they used a proper keyboard with thumb keys.&lt;/p&gt;

&lt;p&gt;Okay, maybe they wouldn’t be better. But at least hitting the buttons would be easier.&lt;/p&gt;

&lt;p&gt;I saw an interview with StarCraft pro player Clem (I can’t find the video now) where he described how he uses his palm to hit the spacebar while his fingers are on the number row. And it left me thinking, this would be so much easier if you used a keyboard with thumb keys.&lt;/p&gt;

&lt;p&gt;I personally use the Kinesis Advantage—it’s the cheapest of the good ergonomic keyboards, and it works well for me. I’ve become spoiled by the thumb keys, and now I can hardly stand to play games using my laptop keyboard.&lt;/p&gt;

&lt;p&gt;If you want to do even better, you can get a split keyboard like the &lt;a href=&quot;https://kinesis-ergo.com/keyboards/advantage360/&quot;&gt;Kinesis Advantage 360&lt;/a&gt; or the &lt;a href=&quot;https://naya.tech/pages/naya-create&quot;&gt;Naya Create&lt;/a&gt; because they let you game with half a keyboard to make more room for your mouse. In many games, this will require rebinding some keys that normally live on the other side of the keyboard, but that’s a fair compromise.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://kinesis-ergo.com/wp-content/uploads/ADV360-separated_400x300.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;You could take this even further with the &lt;a href=&quot;https://www.maltron.com/store/p19/Maltron_Single_Hand_Keyboards_-_US_English.html&quot;&gt;Maltron Single Hand&lt;/a&gt;, which is a one-handed keyboard with every key on one side.&lt;/p&gt;

&lt;p&gt;Those keyboards weren’t designed with gaming in mind. The most sensible gaming-specific keyboard I’ve seen is the &lt;a href=&quot;https://www.razer.com/gaming-keypads/razer-tartarus-pro&quot;&gt;Razer Tartarus&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://m.media-amazon.com/images/I/517GH4jsfOL._AC_UF894,1000_QL80_.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;I’ve never tried this keyboard so I can’t speak from experience, but it looks like a good concept. The keyboard has a small footprint to leave more room for the mouse, and it makes some innovations that I haven’t seen anywhere else. The thumb zone has a &lt;em&gt;joystick&lt;/em&gt; rather than just buttons, and the keyboard has a special feature where you can bind keys to do two different things depending on whether you do a half-press or a full press (although &lt;a href=&quot;https://old.reddit.com/r/razer/comments/ecvxej/if_youre_deciding_between_tartarus_v2_and/fbr8p77/&quot;&gt;some people&lt;/a&gt; report finding this feature difficult to use).&lt;/p&gt;

&lt;p&gt;One complaint is that the Razer Tartarus doesn’t have many buttons—it could easily have twice as many while still leaving plenty of room for the mouse. Having not used either, I think I’d prefer the Kinesis Advantage 360. But at least the Razer Tartarus is thinking about gaming from the ground up, as opposed to the usual “typewriter-shaped keyboard with expensive keyswitches and RGB lighting”.&lt;/p&gt;

&lt;p&gt;It pains me every time I see a pro gamer using a regular keyboard. If your livelihood depends on being good at games, then you should take the time to learn the best tools available.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;On my Kinesis, I’ve rebound Backspace to Shift, Delete to Control, Left Shift to Right Shift (which some games care about), Home to Escape, and Caps Lock to Backspace. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Belief in expert mistakes</title>
				<pubDate>Sat, 29 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/29/belief_in_expert_mistakes/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/29/belief_in_expert_mistakes/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;A few years ago, there was some publicity around a Navy fighter pilot who claimed to have seen an unidentified object that couldn’t possibly be explained except as an alien phenomenon. Many people considered this to be indisputable proof of aliens. “The pilot is an expert, there’s no way he could have been wrong.”&lt;/p&gt;

&lt;p&gt;I am much more willing to believe that someone can make a mistake, regardless of how good they are at the thing in question.&lt;/p&gt;

&lt;p&gt;Sure, fighter pilots have excellent vision, and sure, they’re better than I am at identifying objects in the sky. But they’re still fallible. There is no level of fighter-pilot skill that would make me believe aliens visited earth based solely on one person’s testimony.&lt;/p&gt;

&lt;p&gt;For any scientific theory, no matter how well-established, you can always find at least one expert with a PhD who studies the topic for a living and disagrees with the consensus. So either almost all experts are wrong, or that one expert is wrong. Either way, experts can make mistakes.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>TV is better when you trust the writers</title>
				<pubDate>Sat, 29 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/29/TV_is_better_when_you_trust_the_writers/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/29/TV_is_better_when_you_trust_the_writers/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;This post contains spoilers for the first episode of Pluribus.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;Pluribus is the new show brought to you by the same team who made Breaking Bad and Better Call Saul. The creators are all at the peak of their craft, including the writers. The show has only just started airing, but I’m watching it with confidence because I trust them.&lt;/p&gt;

&lt;p&gt;I’ve listened to every episode of the Breaking Bad and Better Call Saul insider podcasts, which gave me some insight into how they write their shows. The writers have latitude to think through every character’s decision and plot out what would happen in each branch of the decision tree, and then choose the path that makes the most sense for the characters and the show.&lt;/p&gt;

&lt;p&gt;I don’t know what’s going to happen in Pluribus. But I trust that the writers have thought it through, and that whatever they chose, they made the right decision.&lt;/p&gt;

&lt;p&gt;In the pilot episode, some aliens broadcast a signal containing an RNA sequence that describes a virus-like thing. Humanity sees the signal and builds the thing, and it infects humans.&lt;/p&gt;

&lt;p&gt;From this fact, I’m pretty sure that the aliens have visited earth, because they knew how to construct an RNA sequence that would infect humans. But I only know this because I know the writers are going to make it make sense.&lt;/p&gt;

&lt;p&gt;Lesser writers might say the aliens are biologically identical to humans except that they have wrinkly foreheads&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;, which is why the RNA sequence works on humans. Or, even worse, they might have no mental model whatsoever of what RNA is or how it works. But I trust Vince Gilligan and the crew, which lets me make deductions:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The aliens broadcast an RNA sequence, which means they know what RNA is, which means either they’ve visited earth or they share a common ancestor with earth life (likely due to natural panspermia).&lt;/li&gt;
  &lt;li&gt;But the virus thing only works on humans, not any other mammal (the show specifically makes a point of telling us this). Therefore, the aliens must know about human biology.&lt;/li&gt;
  &lt;li&gt;Therefore, the aliens must have visited earth at some point in the last few tens of thousands of years.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I trust that the creators of Pluribus will make their show make sense.&lt;/p&gt;

&lt;p&gt;Take this quote from Vince Gilligan from the pilot episode of the Pluribus podcast. The context is that he’s discussing how the scene in the secure biology lab was originally written to have a lone scientist working, but their science advisor said that a &lt;a href=&quot;https://en.wikipedia.org/wiki/Biosafety_level&quot;&gt;BSL-4&lt;/a&gt; lab would always have at least two people present.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;I just went with my old X-Files learning from years ago. A person by themselves, it’s going to be scarier. So that’s the way I intended it. But then, yeah, [science advisor] Erin [Macdonald] said, it just doesn’t work that way. I thought for a microsecond, as I always do, well, artistic license, let’s just keep it this way. Then I thought, it’s never steered me wrong. It’s always held me—and us as a group—it’s held us in good stead when we get things technically accurate or as accurate as we humanly can. It never has harmed us. It has always paid dividends. [timestamp 35:00]&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I’ve seen some debate as to whether the show will explain how the hive mind is able to communicate with itself. Some say the show isn’t about the science; it’s meant to be a character study. Maybe, maybe not. I don’t know if they’re going to explain how it works. What I do know is this: if explaining how it works makes the show better, then they’ll explain how it works. And if keeping it mysterious makes the show better, then they’ll keep it mysterious. I don’t know what the right answer is, but I know that whatever it is, the writers will do it that way.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Sorry Star Trek, you’re still great even if your aliens were a bit goofy. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>I like reborrowed words</title>
				<pubDate>Sat, 29 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/29/I_like_reborrowed_words/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/29/I_like_reborrowed_words/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;A &lt;a href=&quot;https://en.wikipedia.org/wiki/Reborrowing&quot;&gt;reborrowed word&lt;/a&gt; is a loan word that goes from language A to language B and then back to language A. I think they’re neat.&lt;/p&gt;

&lt;p&gt;A classic example is &lt;em&gt;pidgin&lt;/em&gt;. A &lt;a href=&quot;https://en.wikipedia.org/wiki/Pidgin&quot;&gt;pidgin&lt;/a&gt; is a grammatically simple proto-language that emerges when two groups from different places have to learn to communicate. The word &lt;em&gt;pidgin&lt;/em&gt; originally described a simplified form of English spoken by Chinese business people, with &lt;em&gt;pidgin&lt;/em&gt; being approximately the Chinese pronunciation of the English word “business”. So “business” was borrowed by Chinese, and then borrowed back by English as &lt;em&gt;pidgin&lt;/em&gt;.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;Another reborrowed word is &lt;em&gt;anime&lt;/em&gt;—Japanese animation. The word comes from Japanese, where it was originally borrowed from the English word “animation”.&lt;/p&gt;

&lt;p&gt;I find it interesting how the definition of a reborrowed word is not the same as the definition of the word it came from. Consider &lt;em&gt;waifu&lt;/em&gt;, which is the Anglicized pronunciation of the Japanese pronunciation of the English “wife”. But a waifu isn’t a wife; a waifu is a fictional character whom a lonely man (or a lonely lesbian woman) pretends is his or her wife. There’s also the &lt;em&gt;husbando&lt;/em&gt;, which, if I’m not mistaken, is a pure English word that never went through Japanese.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;In each of these cases, the reborrowed word is more specific than the original—much like how in English, &lt;em&gt;salsa&lt;/em&gt; describes a Mexican spicy sauce made with tomatoes and peppers, whereas in Spanish, &lt;em&gt;salsa&lt;/em&gt; just means “sauce”.&lt;/p&gt;

&lt;p&gt;My fourth example of a reborrowed thing isn’t a word. Schools in the English-speaking world classically use a grading system that goes A &amp;gt; B &amp;gt; C &amp;gt; F, or sometimes A &amp;gt; B &amp;gt; C &amp;gt; D &amp;gt; F. Japan borrowed this system, except that—for reasons that are lost to time—there is another rank, S, which is even better than A. Japan’s ranking system made its way back into English via &lt;a href=&quot;https://en.wikipedia.org/wiki/Tier_list&quot;&gt;tier lists&lt;/a&gt;, which work like letter grades except with S tier at the top.&lt;/p&gt;

&lt;p&gt;2026-03-01: I just learned a new example: &lt;em&gt;karaoke&lt;/em&gt; comes from the Japanese &lt;em&gt;kara&lt;/em&gt; + &lt;em&gt;oke&lt;/em&gt;, where &lt;em&gt;oke&lt;/em&gt; is a shortened form of &lt;em&gt;okesutora&lt;/em&gt; … see if you can guess the origin of &lt;em&gt;okesutora&lt;/em&gt;. &lt;span class=&quot;spoiler&quot;&gt;It’s the Japanese pronunciation of the English &lt;i&gt;orchestra&lt;/i&gt;.&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;A related phenomenon happens when a word’s meaning is generalized and then goes back to the original meaning:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://imgs.xkcd.com/comics/canon_2x.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;a href=&quot;https://xkcd.com/3123/&quot;&gt;source&lt;/a&gt;. alt text: Achilles was a mighty warrior, but his Achilles’ heel was his heel.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;(A commenter on Facebook reported that a friend once asked them, “What’s Superman’s kryptonite?”)&lt;/p&gt;

&lt;p&gt;Another fun thing is a &lt;a href=&quot;https://en.wikipedia.org/wiki/Doublet_(linguistics)&quot;&gt;doublet&lt;/a&gt;, where two words with two distinct meanings share a single root word. For example, the Latin &lt;em&gt;fortis&lt;/em&gt; evolved into the Italian &lt;em&gt;forte&lt;/em&gt; as well as the French &lt;em&gt;fort&lt;/em&gt;. Both words were then borrowed by English, but the French-derived &lt;em&gt;forte&lt;/em&gt; (pronounced like “fort”&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;) means “strong point”, and the Italian-derived &lt;em&gt;forte&lt;/em&gt; means “loud”. Wikipedia has &lt;a href=&quot;https://en.wikipedia.org/wiki/Doublet_(linguistics)#English&quot;&gt;many more examples&lt;/a&gt; of doublets.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I couldn’t find concrete evidence on the origin of “husbando”, but it doesn’t make sense as a Japanese word. The Japanese pronunciation of “husband” would be more like “hasubendo”. My source is &lt;a href=&quot;https://www.urbandictionary.com/define.php?term=husbando&quot;&gt;Urban Dictionary&lt;/a&gt; so take that with a grain of salt, but it fits with my limited understanding of how Japanese pronunciation works. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Nearly everyone gets this wrong. I learned the correct pronunciation from George Carlin, who included this in one of his lists of pet peeves. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Wartime ethics is weird</title>
				<pubDate>Fri, 28 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/28/wartime_ethics/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/28/wartime_ethics/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;The ethical principles that most people hold—and hold most strongly—go completely out the window when it comes to war.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;blockquote&gt;
  &lt;p&gt;Normal time: Killing is bad. In fact it’s pretty much the worst thing you can do.&lt;/p&gt;

  &lt;p&gt;Wartime: Killing is great! Kill as many people as you can! If you’re really good at killing, you get a medal!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;(Just so long as you kill the right people.)&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Normal time: Slavery is a blight upon humanity, one of the greatest and most shameful tragedies in history.&lt;/p&gt;

  &lt;p&gt;Wartime: Slavery is actually totally fine if people are being enslaved by the government for the purposes of killing other people! And in fact, slavery is essential, and if you object to it then you’re betraying your country!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;(If it weren’t so depressing, it would be funny to see the contorted logic people come up with to argue that conscription isn’t slavery.)&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Normal time: If your employer is behaving unethically and you speak out, you deserve special protections and your employer must not retaliate against you.&lt;/p&gt;

  &lt;p&gt;Wartime: If you refuse to obey your employer’s unethical demands, that’s a crime; and you will be prosecuted through a special court, and by the way the court is run by your employer.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Not to say I fully agree with common-sense ethics, but according to common-sense ethics, you are morally justified in killing anyone who attempts to draft you, out of self-defense.&lt;/p&gt;

&lt;p&gt;Imagine there’s a criminal organization that runs an underground boxing ring where the boxers are coerced into joining. Someone from the organization tries to kidnap you and force you to participate in a boxing match. If you resisted the kidnapper, even with lethal force, you’d be justified on grounds of self-defense. The kidnapper was going to endanger your life; you are morally entitled to use any means necessary to prevent them from doing that.&lt;/p&gt;

&lt;p&gt;And yet, if the military came to your house to force you to go to war, most people would say you’re not allowed to resist. It’s perfectly ethical for someone to kidnap you and force you to fight and endanger your life, as long as that someone works for the government. Even people who oppose the draft usually don’t think it’s okay to forcibly resist.&lt;/p&gt;

&lt;p&gt;While we’re on the subject of wartime ethics, here’s a combination of common beliefs that doesn’t make sense:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Killing civilians in wartime is morally wrong.&lt;/li&gt;
  &lt;li&gt;Killing enemy soldiers is fine, even if they were drafted.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Whether someone gets drafted is a matter of luck. Why does it become okay to kill someone after their name comes up in a lottery?&lt;/p&gt;

&lt;p&gt;Going back to the underground boxing analogy: if you get forced into a boxing match at gunpoint, and you kill the other boxer, I won’t hold it against you. Your captor is responsible for the death, not you. Nonetheless, the person who died was just as innocent as you were.&lt;/p&gt;

&lt;h2 id=&quot;what-i-believe&quot;&gt;What I believe&lt;/h2&gt;

&lt;p&gt;I’m a utilitarian; I often disagree with common-sense ethics. I believe that conscription could, in principle, be justified on utilitarian grounds. It could be justified in the same way that murder could, in principle, be justified. But there is a strong temptation to &lt;a href=&quot;https://www.astralcodexten.com/p/less-utilitarian-than-thou&quot;&gt;rationalize doing harm&lt;/a&gt; in the name of the greater good. Murder is rightly illegal, and conscription should be illegal for the same reasons. Even utilitarians should obey moral rules, because &lt;a href=&quot;https://www.lesswrong.com/posts/K9ZaZXDnL3SEmYZqB/ends-don-t-justify-means-among-humans&quot;&gt;your brain is trying to trick you&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The only way to justify a draft is naive consequentialism: we are suspending people’s rights in this case because it’s worth it. I call it naive because drafts usually don’t come out looking good if you properly consider the consequences, and properly consider that you can’t trust your own reasoning on the consequences.&lt;/p&gt;

&lt;p&gt;History shows that war is rarely justified on utilitarian grounds. As far as I can tell, war mostly happens due to a combination of not assigning moral value to people in enemy countries—which wouldn’t happen if people were more utilitarian—and the people responsible for declaring war not having to face the lethal consequences themselves.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>Alignment Bootstrapping Is Dangerous</title>
				<pubDate>Thu, 27 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/27/alignment_bootstrapping_is_dangerous/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/27/alignment_bootstrapping_is_dangerous/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;AI companies want to bootstrap weakly-superhuman AI to align superintelligent AI. I don’t expect them to succeed. I could give various arguments for why alignment bootstrapping is hard and why AI companies are ignoring the hard parts of the problem; but you don’t need to understand any details to know that it’s a bad plan.&lt;/p&gt;

&lt;p&gt;When AI companies say they will bootstrap alignment, they are admitting defeat on solving the alignment problem, and saying that instead they will rely on AI to solve it for them. So they’re facing a problem of unknown difficulty, but where the difficulty is high enough that &lt;em&gt;they don’t think they can solve it&lt;/em&gt;. And to remediate this, they will use a &lt;em&gt;novel technique never before used in history&lt;/em&gt;—i.e., counting on slightly-superhuman AI to do the bulk of the work.&lt;/p&gt;

&lt;p&gt;If they mess up and this plan doesn’t work, then superintelligent AI kills everyone.&lt;/p&gt;

&lt;p&gt;And they think this is an acceptable plan, and it is acceptable for them to build up to human-level AI or beyond on the basis of this plan.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What?&lt;/em&gt;&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;It takes remarkable hubris to believe that a problem is this hard, to believe that humanity’s survival depends on getting the right solution, and yet be this confident that it will be solved.&lt;/p&gt;

&lt;p&gt;If you don’t know how hard a problem is, then it’s harder than you think.&lt;/p&gt;

&lt;p&gt;If you plan on using a technique that’s never been used before, then that technique is less effective than you think.&lt;/p&gt;

&lt;p&gt;If you have a problem of unknown difficulty that you want to solve using unknown methods, and you don’t know how you will develop those methods, and failure would be catastrophic, then you shouldn’t do that.&lt;/p&gt;

&lt;p&gt;Imagine if NASA wanted to land on the moon and they were trying to figure out how to make rocket fuel, but metalworking hadn’t been invented yet so all their rockets were made of wood. And they said, we are working on figuring out how to make some material that won’t get incinerated by rocket fuel; no, we don’t know what that material is, and we have no theory of how to make it; but don’t worry, in 2020 we only had maple and today we are using oak, so we’re making good progress.&lt;/p&gt;

&lt;p&gt;This would not be an acceptable plan for solving a medium-stakes problem. It is certainly not an acceptable approach when a failure would destroy everything that matters in the world.&lt;/p&gt;

&lt;p&gt;A steelman of this position would be something like:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Yes, our plan has a good chance of failing and killing everyone. But if we don’t build ASI using alignment bootstrapping, some other company will build ASI using even worse techniques, and we’re even more likely to die. So building ASI this way is our best option, even though it’s extremely risky.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Some people believe this. A small number of individuals have said things to this effect. I think they’re still wrong, but at least I get it.&lt;/p&gt;

&lt;p&gt;To my knowledge, no one has ever said this in their capacity as a person who is directly working on ASI development or alignment.&lt;/p&gt;

&lt;p&gt;I don’t respect AI companies when they publish their roadmaps for handling alignment, and then at no point does the roadmap say anything like&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;This is a bad plan that has an unacceptably high risk of killing everyone. We’d much prefer to coordinate to slow down and take our time. We would support a global halt on developing ASI until it can be proven safe; but until such time as that happens, we will continue building ASI using our least-bad plan.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The fact that they don’t say this makes coordination more difficult—it’s a self-fulfilling prophecy. Concealing the difficulty of the alignment problem actively contributes to the situation in which the wider world does not take AI risk seriously, and safety-minded developers feel forced to follow a dangerous plan as the least-bad option.&lt;/p&gt;

&lt;p&gt;Every major safety proposal by an AI company should start with a disclaimer like “This is a frighteningly risky plan that we are not at all confident in, but it’s our best option due to the lack of widespread agreement about the importance of AI risk.” And they should be simultaneously pushing for global regulations so that they no longer have to take this dangerous route.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>Magic: The Gathering Arena decklists for people on a budget</title>
				<pubDate>Wed, 26 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/26/mtg_arena_budget_decklists/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/26/mtg_arena_budget_decklists/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;I’ve been playing a lot of MTG Arena lately, but I refuse to spend any money on it, which means I can’t craft many rare cards. When I look up &lt;a href=&quot;https://mtgdecks.net/Standard&quot;&gt;meta decklists&lt;/a&gt;, they always include a lot of rares and mythic rares. I don’t want to spend all my rare wildcards on one deck!&lt;/p&gt;

&lt;p&gt;That’s sort of what the Pauper format is for. Pauper decks are only allowed to use common cards, which makes them cheap. But that format isn’t quite what I’m looking for, for four reasons:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;There is no Pauper ladder on Arena, only tournaments.&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; I don’t want to play a tournament; I just want to be able to hop on and play a few games.&lt;/li&gt;
  &lt;li&gt;I have some rare wildcards that I can use to craft rare cards; I don’t have to limit myself to commons only. I just don’t have very &lt;em&gt;many&lt;/em&gt; rare wildcards, so I want to spend them judiciously.&lt;/li&gt;
  &lt;li&gt;All the strongest Pauper decks are aggro decks. What if I don’t want to play aggro?&lt;/li&gt;
  &lt;li&gt;There are thousands of MTG cards that aren’t playable on Arena, so I can’t build most Pauper decks anyway.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There’s the &lt;a href=&quot;https://mtgdecks.net/Historic-Artisan&quot;&gt;Artisan&lt;/a&gt; format which is Arena-specific (so it fixes problem 4), but it still has the other three problems.&lt;/p&gt;

&lt;p&gt;What I really want is to build a Standard deck using only 4–8 wildcards to craft the most important rares, and then if I decide I like the deck enough, I can craft some more. Which means I want to know which rares I really need, and which ones I can replace with common or uncommon substitutes.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;I created a demo to show what this might look like. Below is a table with an interactive decklist for a Dimir Control deck&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; with a sliding scale from “budget” to “all rares”. I picked this deck because it’s a non-aggro deck that requires relatively few rares and mythics.&lt;/p&gt;

&lt;p&gt;I’m no Pro Tour player, so I’m sure some of my choices are wrong. But I made my best attempt to put together a budget-friendly decklist where the rare cards are ordered from most to least replaceable.&lt;/p&gt;

&lt;p&gt;As you slide the slider from left to right, the expensive cards in the table get replaced by cheaper substitutes one by one. I’m not a web designer any more than I’m a Pro Tour player, so consider this a proof of concept.&lt;/p&gt;

&lt;style&gt;
    .mtg-common {
        margin: 0 5px;
        color: #000000;
    }

    .mtg-uncommon {
        margin: 0 5px;
        color: #cae2ee;
    }

    .mtg-rare {
        margin: 0 5px;
        color: #d1ad63;
    }

    table {
        font-size: 16px;
        border-collapse: collapse;
        box-shadow: 0 2px 8px rgba(0,0,0,0.1);
    }

    td {
        margin: 1px;
        text-align: center;
        font-weight: bold;
        border: 2px solid #333;
    }

    .hidden-col {
        display: none;
    }

    .slider-container {
        margin: 20px 0;
    }

    input[type=&quot;range&quot;] {
        width: 200px;
        height: 8px;
    }

    .slider-label {
        margin-top: 10px;
        font-size: 14px;
    }
&lt;/style&gt;

&lt;div class=&quot;slider-container&quot;&gt;
    Expensive
    &lt;input type=&quot;range&quot; id=&quot;valueSlider&quot; min=&quot;0&quot; max=&quot;6&quot; value=&quot;0&quot; step=&quot;1&quot; /&gt;
    Budget
    &lt;span id=&quot;position&quot; style=&quot;display: none&quot;&gt;0&lt;/span&gt;
&lt;/div&gt;

&lt;table id=&quot;valueTable&quot;&gt;

&lt;colgroup&gt;
&lt;col class=&quot;visible-col&quot; width=&quot;80%&quot; /&gt;
&lt;col class=&quot;org-right&quot; width=&quot;20%&quot; /&gt;
&lt;col class=&quot;hidden-col&quot; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope=&quot;col&quot; class=&quot;visible-col&quot;&gt;Card&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;org-right&quot;&gt;Count&lt;/th&gt;
&lt;th scope=&quot;col&quot; class=&quot;hidden-col&quot;&gt;Card&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td class=&quot;visible-col&quot;&gt;&lt;span class=&quot;mtg-rare&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Consult the Star Charts&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Consult the Star Charts.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;4&lt;/td&gt;
&lt;td class=&quot;hidden-col&quot;&gt;&lt;span class=&quot;mtg-uncommon&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Stock Up&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Stock Up.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;visible-col&quot;&gt;&lt;span class=&quot;mtg-rare&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;The End&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/The End.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;2&lt;/td&gt;
&lt;td class=&quot;hidden-col&quot;&gt;&lt;span class=&quot;mtg-uncommon&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Hero&apos;s Downfall&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Hero&apos;s Downfall.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;visible-col&quot;&gt;&lt;span class=&quot;mtg-rare&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Three Steps Ahead&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Three Steps Ahead.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;4&lt;/td&gt;
&lt;td class=&quot;hidden-col&quot;&gt;&lt;span class=&quot;mtg-common&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Refute&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Refute.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;visible-col&quot;&gt;&lt;span class=&quot;mtg-rare&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Scavenger Regent&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Scavenger Regent.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1&lt;/td&gt;
&lt;td class=&quot;hidden-col&quot;&gt;&lt;span class=&quot;mtg-uncommon&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Bitter Triumph&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Bitter Triumph.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;visible-col&quot;&gt;&lt;span class=&quot;mtg-rare&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Marang River Regent&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Marang River Regent.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;4&lt;/td&gt;
&lt;td class=&quot;hidden-col&quot;&gt;&lt;span class=&quot;mtg-uncommon&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Eddymurk Crab&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Eddymurk Crab.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;visible-col&quot;&gt;&lt;span class=&quot;mtg-common&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Dispelling Exhale&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Dispelling Exhale.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;4&lt;/td&gt;
&lt;td class=&quot;hidden-col&quot;&gt;&lt;span class=&quot;mtg-common&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Don&apos;t Make a Sound&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Don&apos;t Make a Sound.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;visible-col&quot;&gt;&lt;span class=&quot;mtg-common&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Caustic Exhale&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Caustic Exhale.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;4&lt;/td&gt;
&lt;td class=&quot;hidden-col&quot;&gt;&lt;span class=&quot;mtg-uncommon&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Long Goodbye&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Long Goodbye.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;visible-col&quot;&gt;&lt;span class=&quot;mtg-rare&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Deadly Cover-Up&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Deadly Cover-Up.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;4&lt;/td&gt;
&lt;td class=&quot;hidden-col&quot;&gt;&lt;span class=&quot;mtg-uncommon&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Aetherize&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Aetherize.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;visible-col&quot;&gt;&lt;span class=&quot;mtg-uncommon&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Intimidation Campaign&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Intimidation Campaign.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1&lt;/td&gt;
&lt;td class=&quot;hidden-col&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;visible-col&quot;&gt;&lt;span class=&quot;mtg-uncommon&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Shoot the Sheriff&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Shoot the Sheriff.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;2&lt;/td&gt;
&lt;td class=&quot;hidden-col&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;visible-col&quot;&gt;&lt;span class=&quot;mtg-uncommon&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Bitter Triumph&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Bitter Triumph.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;1&lt;/td&gt;
&lt;td class=&quot;hidden-col&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;visible-col&quot;&gt;&lt;span class=&quot;mtg-uncommon&quot;&gt;●&lt;/span&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Strategic Betrayal&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Strategic Betrayal.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;2&lt;/td&gt;
&lt;td class=&quot;hidden-col&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class=&quot;visible-col&quot;&gt;Lands&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;27&lt;/td&gt;
&lt;td class=&quot;hidden-col&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&quot;visible-col&quot;&gt;Total&lt;/td&gt;
&lt;td class=&quot;org-right&quot;&gt;60&lt;/td&gt;
&lt;td class=&quot;hidden-col&quot;&gt;&amp;#xa0;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;script&gt;
    const slider = document.getElementById(&apos;valueSlider&apos;);
    const rows = document.querySelectorAll(&apos;#valueTable tr&apos;);
    const positionLabel = document.getElementById(&apos;position&apos;);

    slider.addEventListener(&apos;input&apos;, function() {
        let position = parseInt(this.value, 10);

        if (position &gt;= 5) {
            // Swap the behold-a-dragon cards (Dispelling Exhale and Caustic
            // Exhale) at the same time as Marang River Regent, the deck&apos;s
            // only dragon
            position += 2;
        }

        // Update each row based on how many replacements should be visible.
        // rows[0] is the table&apos;s header row, hence the +1 offset below.
        rows.forEach((row, index) =&gt; {
            const leftCell = row.querySelector(&apos;.visible-col&apos;);
            const rightCell = row.querySelector(&apos;.hidden-col&apos;);

            if (index &lt; position + 1) {
                // Show right value
                leftCell.innerHTML = rightCell.innerHTML;
            } else {
                // Reset to original left value (stored in the cell itself)
                if (!leftCell.dataset.original) {
                    leftCell.dataset.original = leftCell.innerHTML;
                }
                leftCell.innerHTML = leftCell.dataset.original;
            }
        });

        // Update position label
        positionLabel.innerHTML = position;
    });

    // Store original values on page load
    rows.forEach(row =&gt; {
        const leftCell = row.querySelector(&apos;.visible-col&apos;);
        leftCell.dataset.original = leftCell.innerHTML;
    });
&lt;/script&gt;

&lt;p&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Consult the Star Charts&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Consult the Star Charts.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt; is a strong card, but it’s the first card to get subbed out because &lt;span class=&quot;mtg-tooltip&quot;&gt;Stock Up&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Stock Up.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt; is a powerful replacement. &lt;span class=&quot;mtg-tooltip&quot;&gt;Deadly Cover-Up&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Deadly Cover-Up.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt; is the most indispensable rare card because there are no proper common/uncommon board wipes—&lt;span class=&quot;mtg-tooltip&quot;&gt;Aetherize&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Aetherize.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt; vaguely resembles a board wipe, but it’s really not the same thing.&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;mtg-tooltip&quot;&gt;Dispelling Exhale&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Dispelling Exhale.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class=&quot;mtg-tooltip&quot;&gt;Caustic Exhale&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Caustic Exhale.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt; are commons, so it might seem superfluous to replace them; but they work best in a deck that has dragons, and once you replace &lt;span class=&quot;mtg-tooltip&quot;&gt;Marang River Regent&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Marang River Regent.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;, the deck doesn’t have dragons anymore.&lt;/p&gt;

&lt;p&gt;A better version of this interface could allow for more complex changes. For example, I wouldn’t replace &lt;span class=&quot;mtg-tooltip&quot;&gt;Marang River Regent&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Marang River Regent.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt; one-to-one with &lt;span class=&quot;mtg-tooltip&quot;&gt;Eddymurk Crab&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Eddymurk Crab.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt; because the Regent isn’t just a big creature, it’s also a card-draw spell. I’d want to rearrange the deck to bring in &lt;span class=&quot;mtg-tooltip&quot;&gt;Quick Study&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Quick Study.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt; or something. My demo is too simple to do that, but it’s possible in theory. You may have also noticed that &lt;span class=&quot;mtg-tooltip&quot;&gt;Bitter Triumph&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Bitter Triumph.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt; appears twice on the budget decklist; it would be better to collapse those into one row.&lt;/p&gt;
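
&lt;p&gt;As a rough sketch of what I mean (hypothetical code, not how the demo above actually works), the table could be driven by explicit substitution steps, where each step removes and adds whole groups of cards. The card counts here are guesses:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Hypothetical many-to-many substitution steps for a budget slider.
const budgetSteps = [
  {
    remove: [{ name: &apos;Consult the Star Charts&apos;, count: 4 }],
    add: [{ name: &apos;Stock Up&apos;, count: 4 }],
  },
  {
    // Swapping out the Regent also brings in card draw to replace it
    remove: [{ name: &apos;Marang River Regent&apos;, count: 4 }],
    add: [
      { name: &apos;Eddymurk Crab&apos;, count: 2 }, // counts are guesses
      { name: &apos;Quick Study&apos;, count: 2 },
    ],
  },
];

// Apply the first `position` steps to a copy of the deck,
// where a deck is a Map from card name to copy count.
function deckAtPosition(baseDeck, position) {
  const deck = new Map(baseDeck);
  budgetSteps.slice(0, position).forEach(function (step) {
    step.remove.forEach(function (card) { deck.delete(card.name); });
    step.add.forEach(function (card) { deck.set(card.name, card.count); });
  });
  return deck;
}
&lt;/code&gt;&lt;/pre&gt;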

&lt;p&gt;If any MTG players think my budget decklist is wrong, I’d be happy to hear suggestions because I play this deck often. The version I use right now is most of the way toward the Budget side, plus some small changes like bringing in &lt;span class=&quot;mtg-tooltip&quot;&gt;Quick Study&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Quick Study.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;.&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;My main point isn’t about this specific decklist. My point is that when I get a deck off the internet, I want to know which rare cards I should craft in what order. Which ones are indispensable and which ones don’t matter? That table I made is a prototype of the sort of thing I’d like to see.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;And the people who play in those tournaments are really good at Pauper. I tried playing one once and I got destroyed every game. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Azorius Control and Jeskai Control are better right now, but I don’t think there’s any way to make budget versions of those decks. Black has good common/uncommon removal spells, but white’s only good removal cards are rare. And then there are singular cards like &lt;span class=&quot;mtg-tooltip&quot;&gt;Jeskai Revelation&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Jeskai Revelation.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt; and &lt;span class=&quot;mtg-tooltip&quot;&gt;Beza, the Bounding Spring&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Beza, the Bounding Spring.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt; that you can’t replace. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Plus some rares that I got lucky enough to open in booster packs—I have one copy of &lt;span class=&quot;mtg-tooltip&quot;&gt;Marang River Regent&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Marang River Regent.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt; and one &lt;span class=&quot;mtg-tooltip&quot;&gt;Three Steps Ahead&lt;span class=&quot;mtg-tooltip-image&quot;&gt;&lt;img src=&quot;/assets/images/mtg-cards/Three Steps Ahead.jpg&quot; /&gt;&lt;/span&gt;&lt;/span&gt;. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>How to Fix Quidditch</title>
				<pubDate>Tue, 25 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/25/fixing_quidditch/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/25/fixing_quidditch/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;Inspired by &lt;a href=&quot;https://tomasbjartur.bearblog.dev/harry-potter-and-the-rules-of-quidditch/&quot;&gt;this post&lt;/a&gt; by Tomás Bjartur, which is an allegory; but I’m not writing an allegory, I’m writing about the rules of Quidditch.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The rules of Quidditch have a big problem. The game ends when a seeker catches the snitch, and the snitch is worth 150 points. So most of the players on the field don’t matter; in almost all games, the only thing that matters is who catches the snitch.&lt;/p&gt;

&lt;p&gt;This also makes it a bad spectator sport because you can’t &lt;em&gt;see&lt;/em&gt; the snitch, so nobody knows what the hell is going on.&lt;/p&gt;

&lt;p&gt;I propose some rule changes:&lt;/p&gt;

&lt;!-- more --&gt;

&lt;ul&gt;
  &lt;li&gt;The game ends when the snitch is caught. (This rule is still the same.)&lt;/li&gt;
  &lt;li&gt;The snitch is worth 10 points.&lt;/li&gt;
  &lt;li&gt;Instead of being nearly-invisible, the snitch glows and leaves a glowing comet-trail.&lt;/li&gt;
  &lt;li&gt;The seekers are each given special wands that can only cast a limited set of spells: specifically, spells that make the snitch harder to catch. For example, they can give it a temporary speed boost, or render it temporarily invisible, or push it a fixed distance. The wands have limited energy that takes time to recharge after a spell is cast.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In my proposed version of Quidditch, the snitch still matters—and Harry Potter as the seeker still gets to play a central role, which is important for narrative purposes—but the snitch is no longer the &lt;em&gt;only&lt;/em&gt; thing that matters.&lt;/p&gt;

&lt;p&gt;Making the snitch more visible is a no-brainer because Quidditch is supposed to be a &lt;em&gt;spectator&lt;/em&gt; sport. Spectators ought to be able to see what’s going on.&lt;/p&gt;

&lt;p&gt;Under my proposed rules, the winning team wants to catch the snitch, and in a tied game, both teams want to catch the snitch. The &lt;em&gt;losing&lt;/em&gt; team still has dynamic gameplay, in which they can use their restricted magic to make the snitch more difficult to catch. A seeker with strong defensive skills can keep a losing game interesting, and allow their teammates time to turn things around.&lt;/p&gt;

&lt;p&gt;Some people play Quidditch in real life. I’ve never played the real-life game myself, but I did read the &lt;a href=&quot;https://www.rulesofsport.com/sports/quidditch.html&quot;&gt;rules&lt;/a&gt; online, and they make some changes similar to my suggestions:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The snitch is worth 30 points instead of 150.&lt;/li&gt;
  &lt;li&gt;Instead of being a magical flying ball that doesn’t exist, the snitch is a person who runs around with a tennis ball inside a yellow sock. As with my suggested modification, this has the advantage that spectators can see where the snitch is.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Having now looked at the real-life rules, I think it’s better for the snitch to be worth 30 points rather than 10 because it gives seekers more of an incentive to catch it—they can turn a narrow loss into a victory.&lt;/p&gt;

&lt;p&gt;My fantasy ruleset has one advantage over the real-life rules, which is that it involves magic. We should use magic to give seekers the ability to make the snitch &lt;em&gt;harder&lt;/em&gt; to catch; this changes the snitch-chasing objective from double-Solitaire to a legitimate multiplayer challenge.&lt;/p&gt;

&lt;p&gt;If you look at most real-life sports, they have a single center of action. In team-based ball sports, the center is the ball. Spectators watch the ball. Quidditch has &lt;em&gt;two&lt;/em&gt; centers of action, the quaffle (the main ball) and the snitch. I’m not sure if that’s a good thing or a bad thing. My first thought was that it’s bad because most sports don’t do that. But then I remembered that I watch a lot of StarCraft, and StarCraft games often have multiple things happening simultaneously in different locations, and the games where that happens are the most fun ones to watch. So perhaps the split attention in Quidditch is a good thing.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>I don't like having goals</title>
				<pubDate>Mon, 24 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/24/goals/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/24/goals/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Sometimes I’m talking about lifting weights and someone asks me, “What’s your goal weight?” I don’t understand why I would have a goal weight.&lt;/p&gt;

&lt;p&gt;Say I want to bench press 300 pounds. What happens when I reach 300? I just give up on the bench press now? That would be silly. If I can keep getting stronger, I should.&lt;/p&gt;

&lt;p&gt;What happens if I fall short of my goal? Say I haven’t been able to bench more than 285.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; Should I start eating 5000 calories a day to put on as much muscle as possible? No, I’m not going to do that, I don’t want to get fat. Realistically, if I fall short of my goal, the answer to the question of what I should change is “nothing”.&lt;/p&gt;

&lt;p&gt;The point of a goal is to make tradeoffs between objectives. But when you set goals, you have less information about your costs than when you’re trying to implement them. At implementation time, you have new information that might change how you prioritize things, which may result in failing to achieve a goal; and that’s perfectly fine.&lt;/p&gt;

&lt;p&gt;Sometimes a goal turns out to be easier than you thought; that doesn’t mean you should give up after you achieve it.&lt;/p&gt;

&lt;p&gt;Sometimes a goal turns out to be harder than you thought; that doesn’t mean you should sacrifice everything else for it.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;You have an implicit utility function across multiple metrics, and you want to know how to prioritize those metrics; the way to do that is by knowing the coefficient on each metric, not by setting a flat target for each one. Say you’re the CEO of a company, and you want to increase revenue and increase &lt;a href=&quot;https://en.wikipedia.org/wiki/Net_promoter_score&quot;&gt;NPS&lt;/a&gt;, and you need to allocate resources to each of those. It doesn’t make sense to say “we want to increase NPS by 0.5 points” because NPS trades off against revenue (you can make your product cheaper and worse, which increases revenue but decreases NPS), so you also need to make a statement about how much you care about revenue. It would make more sense to say “$1 million of revenue is as important as 0.1 points of NPS”. And then if it turns out there’s a way to increase NPS by 0.2 at a cost of only $500,000 of revenue, then that’s a good deal and you should take it. But a flat goal of “increase NPS by 0.5” is uninstructive.&lt;/p&gt;
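
&lt;p&gt;Here’s that reasoning as a tiny calculation (a minimal sketch using the made-up numbers from above):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch of goal-setting as an exchange rate between metrics.
# All numbers are made up for illustration.

# The exchange rate from the text: 1 NPS point is worth $10 million
# of revenue (since $1 million of revenue = 0.1 points of NPS).
DOLLARS_PER_NPS_POINT = 10_000_000

def utility_change(revenue_change, nps_change):
    # Express a (revenue, NPS) tradeoff in revenue-equivalent dollars.
    return revenue_change + nps_change * DOLLARS_PER_NPS_POINT

# The deal from the text: +0.2 NPS at a cost of $500,000 of revenue.
print(utility_change(-500_000, 0.2))  # 1,500,000 -- positive, take the deal
&lt;/code&gt;&lt;/pre&gt;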

&lt;p&gt;It’s common for pension funds to target an 8% investment return. That makes no sense. Your investment return is mostly determined by the market environment, which is out of your control. There are only two ways to hit an 8% return consistently: take excessive risk when forecasted returns are low (which is even worse than accepting lower returns); or intentionally hamstring your investments when forecasted returns are high (e.g. if you expect the market to earn 8% and you think you can beat the market by 2%, then you deliberately ignore your market-beating ideas so that you can hit 8% without going over). There is no reasonable utility function to which “always get an 8% expected return” is the correct decision.&lt;/p&gt;

&lt;p&gt;In fact, standard finance theory says you should take less risk when expected return goes down (holding volatility constant). Targeting 8% return, and taking on more risk when expected return goes down, means you are actively doing the &lt;em&gt;opposite&lt;/em&gt; of what you ought to be doing.&lt;/p&gt;
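
&lt;p&gt;To make that concrete, here’s a toy calculation (all numbers invented): suppose the fund mixes a risk-free asset with the market, and sizes the market position to hit the 8% target in expectation.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Toy illustration of why a fixed return target forces backwards risk-taking.
# All numbers are invented.

RISK_FREE = 0.02   # risk-free rate
TARGET = 0.08      # the pension fund return target
MARKET_VOL = 0.15  # market volatility

def market_weight(expected_market_return):
    # Weight on the market needed to hit TARGET in expectation,
    # with the remainder in the risk-free asset.
    return (TARGET - RISK_FREE) / (expected_market_return - RISK_FREE)

for expected_return in [0.10, 0.05]:
    w = market_weight(expected_return)
    print(expected_return, round(w, 2), round(w * MARKET_VOL, 3))

# Expected market return 10%: market weight 0.75, portfolio vol about 11%
# Expected market return  5%: market weight 2.0 (leveraged!), vol 30%
# The worse the outlook, the MORE risk you take on to hit the target.
&lt;/code&gt;&lt;/pre&gt;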

&lt;p&gt;However, goals are okay when reality has a discontinuity. If you need to score at least 70% on a test to pass the class, then it’s reasonable to set a goal of 70%. If you’re pretty sure you can score well over 70%, then it makes sense to stop studying. If you don’t think you’ll pass, then you should study more.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I never had a goal of benching 300, but 285 is indeed my personal best, which coincidentally is also how much Robin Williams’ character from &lt;em&gt;Good Will Hunting&lt;/em&gt; can bench. I wonder how much Robin Williams could bench in real life? He had pretty thick arms so 285 (or even higher) wouldn’t surprise me. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Some Curiosity Stoppers I've Heard</title>
				<pubDate>Sun, 23 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/23/curiosity_stoppers/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/23/curiosity_stoppers/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;A &lt;a href=&quot;https://www.lesswrong.com/w/semantic-stopsign&quot;&gt;curiosity stopper&lt;/a&gt; is an answer to a question that gets you to stop asking questions, but doesn’t resolve the mystery.&lt;/p&gt;

&lt;p&gt;There are some curiosity stoppers that I’ve heard many times:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Why doesn’t cell phone radiation cause cancer? &lt;em&gt;Because it’s non-ionizing radiation.&lt;/em&gt;&lt;/li&gt;
  &lt;li&gt;Why are antioxidants good for you? &lt;em&gt;Because they eliminate free radicals.&lt;/em&gt;&lt;/li&gt;
  &lt;li&gt;Why do bicycles stay upright? &lt;em&gt;Because of gyroscopic forces.&lt;/em&gt;&lt;/li&gt;
  &lt;li&gt;Why do solids hold together? &lt;em&gt;Because of intermolecular forces of attraction.&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For the first three, those answers confused me because I didn’t know what those words meant. I guess I know what an ion is (it’s an atom with an electrical charge) but why do I care whether radiation is ionizing? And what makes radiation ionizing or non-ionizing?&lt;/p&gt;

&lt;p&gt;What’s a free radical? Why is it bad?&lt;/p&gt;

&lt;p&gt;What’s a gyroscopic force? (What even is a gyroscope? It’s some sort of top, right?) How on earth does a bicycle generate a gyroscopic force?&lt;/p&gt;

&lt;p&gt;The fourth curiosity stopper—“intermolecular forces of attraction”—is even more of a non-answer. Of &lt;em&gt;course&lt;/em&gt; solids hold together because a force holds them together. That’s what a force &lt;em&gt;is&lt;/em&gt;. But &lt;em&gt;what is the force&lt;/em&gt;, and where does it come from?&lt;/p&gt;

&lt;p&gt;Another genre of curiosity stopper is the out-of-context number:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;“The Dow is down 600 points today.” (How much is that?)&lt;/li&gt;
  &lt;li&gt;“My proposed policy will create two million jobs.” (What percentage is that? What are the odds that I, personally, get a new job?)&lt;/li&gt;
  &lt;li&gt;“This product has 7 grams of protein per serving!” (How big is a serving? How much would I need to eat to meet my daily protein requirement?)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;answers-sort-of&quot;&gt;Answers (sort of)&lt;/h2&gt;

&lt;!-- more --&gt;

&lt;p&gt;I don’t like those answers, so I will try to give real answers if I can. The true answers are complicated, and I’m sure my explanations are at least partly wrong, but I’ll do my best.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why doesn’t cell phone radiation cause cancer?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Radiation causes cancer when it basically smashes into your DNA and knocks a molecule out of place. If it hits your DNA in just the right way, the radiation can disrupt the part of a cell that regulates growth, and the cell starts growing out of control and becomes a tumor.&lt;/p&gt;

&lt;p&gt;Cell phone radiation has low energy: each photon carries far too little energy to knock electrons off atoms or break chemical bonds, so it’s not powerful enough to mess up your DNA. That’s also what “ionizing” means—ionizing radiation has photons energetic enough to strip electrons from atoms.&lt;/p&gt;
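
&lt;p&gt;To put rough numbers on “low energy” (my own back-of-the-envelope): a photon’s energy is Planck’s constant times its frequency, and ionizing a molecule takes on the order of 10 electron-volts.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Back-of-the-envelope: energy of one cell-phone photon vs. the
# roughly 10 eV it takes to ionize a molecule.

PLANCK = 6.626e-34         # Planck constant, joule-seconds
JOULES_PER_EV = 1.602e-19  # joules per electron-volt

cell_phone_freq = 2.4e9    # Hz, a typical cell/Wi-Fi frequency band
photon_energy_ev = PLANCK * cell_phone_freq / JOULES_PER_EV

print(photon_energy_ev)  # about 1e-5 eV
# Ionization takes roughly 10 eV, about a million times more energy
# than a cell-phone photon carries, so no single photon can damage DNA.
&lt;/code&gt;&lt;/pre&gt;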

&lt;p&gt;&lt;strong&gt;Why are antioxidants good for you?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;(There is still no consensus as to whether antioxidants are indeed good for you, but let’s assume they are for a minute.)&lt;/p&gt;

&lt;p&gt;There are some molecules called &lt;em&gt;free radicals&lt;/em&gt;. For our purposes, it doesn’t matter what that means, they’re just a type of molecule. They exist in your body, and sometimes they bounce into your DNA and mess it up, which can cause cancer. Your cells produce these free radicals over time. Antioxidants bind with the free radicals and prevent them from bouncing into your DNA.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do bicycles stay upright?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Honestly I don’t understand this one at all, sorry. But I do believe that the “gyroscopic forces” explanation is incorrect, or at least incomplete, because &lt;a href=&quot;https://arendschwab.com/assets/pdf/StableBicyclev34Revised.pdf&quot;&gt;some people built a bicycle that doesn’t have gyroscopic effects&lt;/a&gt;&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do solids hold together?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A combination of forces including the &lt;a href=&quot;https://en.wikipedia.org/wiki/London_dispersion_force&quot;&gt;London dispersion force&lt;/a&gt; and &lt;a href=&quot;https://en.wikipedia.org/wiki/Cohesion_(chemistry)&quot;&gt;cohesion&lt;/a&gt;. If I understand correctly, a macro-level analogy would be that the electrons orbit the protons, and sometimes an electron from one molecule moves close to the protons of a neighboring molecule and they attract each other. The true explanation involves quantum mechanics, so it’s more complicated than that.&lt;/p&gt;

&lt;p&gt;And for &lt;strong&gt;out-of-context numbers&lt;/strong&gt;: percentages and ratios are better than absolute numbers. Adding 2 million jobs to the economy is very different for a country with a population of 30 million vs. 300 million.&lt;/p&gt;

&lt;p&gt;This is how I’d like numbers to be reported:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;“The Dow is down 2% today.”&lt;/li&gt;
  &lt;li&gt;“My proposed policy will expand the job market by 4%.”&lt;/li&gt;
  &lt;li&gt;“This product has 0.1 grams of protein per calorie!”&lt;/li&gt;
&lt;/ul&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Kooijman, J. D. G., Meijaard, J. P., Papadopoulos, J. M., Ruina, A., &amp;amp; Schwab, A. L. (2011). &lt;a href=&quot;https://doi.org/10.1126/science.1201959&quot;&gt;A Bicycle Can Be Self-Stable Without Gyroscopic or Caster Effects.&lt;/a&gt; &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Where I Am Donating in 2025</title>
				<pubDate>Sat, 22 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/22/where_i_am_donating_in_2025/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/22/where_i_am_donating_in_2025/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;a href=&quot;https://mdickens.me/2024/11/18/where_i_am_donating_in_2024/&quot;&gt;Last year&lt;/a&gt; I gave my reasoning on cause prioritization and did shallow reviews of some relevant orgs. I’m doing it again this year.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;&lt;em&gt;Cross-posted to the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/AGcny8oBxBDCjqxdr/where-i-am-donating-in-2025&quot;&gt;Effective Altruism Forum&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#cause-prioritization&quot; id=&quot;markdown-toc-cause-prioritization&quot;&gt;Cause prioritization&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#what-i-want-my-donations-to-achieve&quot; id=&quot;markdown-toc-what-i-want-my-donations-to-achieve&quot;&gt;What I want my donations to achieve&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#there-is-no-good-plan&quot; id=&quot;markdown-toc-there-is-no-good-plan&quot;&gt;There is no good plan&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#ai-pause-advocacy-is-the-least-bad-plan&quot; id=&quot;markdown-toc-ai-pause-advocacy-is-the-least-bad-plan&quot;&gt;AI pause advocacy is the least-bad plan&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#how-ive-changed-my-mind-since-last-year&quot; id=&quot;markdown-toc-how-ive-changed-my-mind-since-last-year&quot;&gt;How I’ve changed my mind since last year&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#im-more-concerned-about-non-alignment-problems&quot; id=&quot;markdown-toc-im-more-concerned-about-non-alignment-problems&quot;&gt;I’m more concerned about “non-alignment problems”&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#im-more-concerned-about-ai-for-animals&quot; id=&quot;markdown-toc-im-more-concerned-about-ai-for-animals&quot;&gt;I’m more concerned about “AI-for-animals”&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#how-my-confidence-has-increased-since-last-year&quot; id=&quot;markdown-toc-how-my-confidence-has-increased-since-last-year&quot;&gt;How my confidence has increased since last year&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#we-should-pause-frontier-ai-development&quot; id=&quot;markdown-toc-we-should-pause-frontier-ai-development&quot;&gt;We should pause frontier AI development&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#peaceful-protests-probably-help&quot; id=&quot;markdown-toc-peaceful-protests-probably-help&quot;&gt;Peaceful protests probably help&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#i-have-a-high-bar-for-who-to-trust&quot; id=&quot;markdown-toc-i-have-a-high-bar-for-who-to-trust&quot;&gt;I have a high bar for who to trust&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#my-favorite-interventions&quot; id=&quot;markdown-toc-my-favorite-interventions&quot;&gt;My favorite interventions&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#organizations-tax-deductible&quot; id=&quot;markdown-toc-organizations-tax-deductible&quot;&gt;Organizations (tax-deductible)&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#ai-for-animals-orgs&quot; id=&quot;markdown-toc-ai-for-animals-orgs&quot;&gt;AI-for-animals orgs&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#ai-safety-and-governance-fund&quot; id=&quot;markdown-toc-ai-safety-and-governance-fund&quot;&gt;AI Safety and Governance Fund&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#existential-risk-observatory&quot; id=&quot;markdown-toc-existential-risk-observatory&quot;&gt;Existential Risk Observatory&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#machine-intelligence-research-institute-miri&quot; id=&quot;markdown-toc-machine-intelligence-research-institute-miri&quot;&gt;Machine Intelligence Research Institute (MIRI)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#palisade-research&quot; id=&quot;markdown-toc-palisade-research&quot;&gt;Palisade Research&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#pauseai-us&quot; id=&quot;markdown-toc-pauseai-us&quot;&gt;PauseAI US&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#video-projects&quot; id=&quot;markdown-toc-video-projects&quot;&gt;Video projects&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#non-tax-deductible-donation-opportunities&quot; id=&quot;markdown-toc-non-tax-deductible-donation-opportunities&quot;&gt;Non-tax-deductible donation opportunities&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#ai-policy-network&quot; id=&quot;markdown-toc-ai-policy-network&quot;&gt;AI Policy Network&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#americans-for-responsible-innovation-ari&quot; id=&quot;markdown-toc-americans-for-responsible-innovation-ari&quot;&gt;Americans for Responsible Innovation (ARI)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#controlai&quot; id=&quot;markdown-toc-controlai&quot;&gt;ControlAI&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#congressional-campaigns&quot; id=&quot;markdown-toc-congressional-campaigns&quot;&gt;Congressional campaigns&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#encode&quot; id=&quot;markdown-toc-encode&quot;&gt;Encode&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#where-im-donating&quot; id=&quot;markdown-toc-where-im-donating&quot;&gt;Where I’m donating&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#changelog&quot; id=&quot;markdown-toc-changelog&quot;&gt;Changelog&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;cause-prioritization&quot;&gt;Cause prioritization&lt;/h1&gt;

&lt;p&gt;In September, I published a &lt;a href=&quot;https://forum.effectivealtruism.org/posts/CbHX5zL2uEvTasuiP/ai-safety-landscape-and-strategic-gaps&quot;&gt;report&lt;/a&gt; on the AI safety landscape, specifically focusing on AI x-risk policy/advocacy.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://forum.effectivealtruism.org/posts/CbHX5zL2uEvTasuiP/ai-safety-landscape-and-strategic-gaps#Prioritization&quot;&gt;prioritization section&lt;/a&gt; of the report explains why I focused on AI policy. It’s similar to what I wrote about prioritization in my &lt;a href=&quot;https://mdickens.me/2024/11/18/where_i_am_donating_in_2024/&quot;&gt;2024 donations post&lt;/a&gt;, but more fleshed out. I won’t go into detail on cause prioritization in this post because those two previous articles explain my thinking.&lt;/p&gt;

&lt;p&gt;My high-level prioritization is mostly unchanged since last year. In short:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Existential risk is a big deal.&lt;/li&gt;
  &lt;li&gt;AI misalignment risk is the biggest existential risk.&lt;/li&gt;
  &lt;li&gt;Within AI x-risk, policy/advocacy is much more neglected than technical research.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the rest of this section, I will cover:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;#what-i-want-my-donations-to-achieve&quot;&gt;What I want to achieve with my donations&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#how-ive-changed-my-mind-since-last-year&quot;&gt;How I’ve changed my mind since last year&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#how-my-confidence-has-increased-since-last-year&quot;&gt;How my confidence has increased since last year&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;what-i-want-my-donations-to-achieve&quot;&gt;What I want my donations to achieve&lt;/h2&gt;

&lt;p&gt;By donating, I want to increase the chances that we get a global ban on developing superintelligent AI until it is proven safe.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://intelligence.org/the-problem/&quot;&gt;“The Problem”&lt;/a&gt; is my favorite article-length explanation of why AI misalignment is a big deal. For a longer take, I also like MIRI’s &lt;a href=&quot;https://ifanyonebuildsit.com&quot;&gt;book&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;MIRI says:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;On our view, the international community’s top immediate priority should be creating an “off switch” for frontier AI development. By “creating an off switch”, we mean putting in place the systems and infrastructure necessary to either shut down frontier AI projects or enact a general ban.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I agree with this. At some point, we will probably need a halt on frontier AI development, or else we will face an unacceptably high risk of extinction. And that time might arrive soon, so we need to start working on it now.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/&quot;&gt;This Google Doc&lt;/a&gt; explains why I believe a moratorium on frontier AI development is better than “softer” safety regulations. In short: no one knows how to write AI safety regulations that prevent us from dying. If we knew how to do that, then I’d want it; but since we don’t, the best outcome is to not build superintelligent AI until we know how to prevent it from killing everyone.&lt;/p&gt;

&lt;p&gt;That said, I still support efforts to implement AI safety regulations, and I think that sort of work is among the best things one can be doing, because:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;My best guess is that soft safety regulations won’t prevent extinction, but I could be wrong about that—they might turn out to work.&lt;/li&gt;
  &lt;li&gt;Some kinds of safety regulations are relatively easy to implement and would be a net improvement.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Safety regulations can help us move in the right direction, for example:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Whistleblower protections and mandatory reporting for AI companies make dangerous behavior more apparent, which could raise concern for x-risk in the future.&lt;/li&gt;
  &lt;li&gt;Compute monitoring makes it more feasible to shut down AI systems later on.&lt;/li&gt;
  &lt;li&gt;GPU export restrictions make it more feasible to regulate GPU usage.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My ideal regulation is &lt;em&gt;global&lt;/em&gt; regulation. A misaligned AI is dangerous no matter where it’s built. (You could even say that if anyone builds it, everyone dies.) But I have no idea how to make global regulations happen; it seems that you need to get multiple countries on board with caring about AI risk, and you need to overcome coordination problems.&lt;/p&gt;

&lt;p&gt;I can think of two categories of intermediate steps that might be useful:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Public advocacy to raise general concern about AI x-risk.&lt;/li&gt;
  &lt;li&gt;Regional/national regulations on frontier AI, especially regulations in leading countries (the United States and China).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A world in which the USA, China, and the EU all have their own AI regulations is probably a world in which it’s easier to get all those regions to agree on an international treaty.&lt;/p&gt;

&lt;h3 id=&quot;there-is-no-good-plan&quot;&gt;There is no good plan&lt;/h3&gt;

&lt;p&gt;People often criticize the “pause AI” plan by saying it’s not feasible.&lt;/p&gt;

&lt;p&gt;I agree. I don’t think it’s going to work.&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;I don’t think more “moderate”&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; AI safety regulations will work, either.&lt;/p&gt;

&lt;p&gt;I don’t think AI alignment researchers are going to figure out how to prevent extinction.&lt;/p&gt;

&lt;p&gt;I don’t see any plan that looks feasible.&lt;/p&gt;

&lt;p&gt;“Advocate for and work toward a global ban on the development of unsafe AI” is my preferred plan, but not because I like the plan. It’s a bad plan. I just think it’s less bad than anything else I’ve heard.&lt;/p&gt;

&lt;p&gt;My P(doom) is not overwhelmingly high (it’s in the realm of 50%). But if we live, I expect that it will be due to luck.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; I don’t see any way to make a significant dent in the odds of extinction.&lt;/p&gt;

&lt;h3 id=&quot;ai-pause-advocacy-is-the-least-bad-plan&quot;&gt;AI pause advocacy is the least-bad plan&lt;/h3&gt;

&lt;p&gt;I don’t have a strong argument for why I believe this. It just seems true to me.&lt;/p&gt;

&lt;p&gt;The short version is something like “the other plans for preventing AI extinction are worse than people think” + “pausing AI is not as intractable as people think” (mostly the first thing).&lt;/p&gt;

&lt;p&gt;The folks at MIRI have done a lot of work to articulate &lt;a href=&quot;https://intelligence.org/the-problem/&quot;&gt;their position&lt;/a&gt;. I directionally agree with almost everything they say about AI misalignment risk (although I’m not as confident as they are). I &lt;em&gt;think&lt;/em&gt; their policy goals still make sense even if you’re less confident, but that’s not as clear, and I don’t think anyone has ever done a great job of articulating the position of “P(doom) is less than 95%, but pausing AI is still the best move because of reasons XYZ”.&lt;/p&gt;

&lt;p&gt;I’m not sure how to articulate it either; it’s something I want to spend more time on in the future. I can’t do a good job of it in this post, so I’ll leave it as a future topic.&lt;/p&gt;

&lt;h2 id=&quot;how-ive-changed-my-mind-since-last-year&quot;&gt;How I’ve changed my mind since last year&lt;/h2&gt;

&lt;h3 id=&quot;im-more-concerned-about-non-alignment-problems&quot;&gt;I’m more concerned about “non-alignment problems”&lt;/h3&gt;

&lt;p&gt;Transformative AI could create many existential-scale problems that aren’t about misalignment. Relevant topics include: &lt;a href=&quot;https://longtermrisk.org/overview-of-transformative-ai-misuse-risks-what-could-go-wrong-beyond-misalignment/&quot;&gt;misuse&lt;/a&gt;; &lt;a href=&quot;https://forum.effectivealtruism.org/posts/2cZAzvaQefh5JxWdb/bringing-about-animal-inclusive-ai&quot;&gt;animal-inclusive AI&lt;/a&gt;; &lt;a href=&quot;https://eleosai.org/post/research-priorities-for-ai-welfare/&quot;&gt;AI welfare&lt;/a&gt;; &lt;a href=&quot;https://longtermrisk.org/research-agenda&quot;&gt;S-risks from conflict&lt;/a&gt;; &lt;a href=&quot;https://www.lesswrong.com/posts/GAv4DRGyDHe2orvwB/gradual-disempowerment-concrete-research-projects&quot;&gt;gradual disempowerment&lt;/a&gt;; &lt;a href=&quot;https://forum.effectivealtruism.org/posts/LpkXtFXdsRd4rG8Kb/reducing-long-term-risks-from-malevolent-actors&quot;&gt;risks from malevolent actors&lt;/a&gt;; &lt;a href=&quot;https://forum.effectivealtruism.org/posts/HqmQMmKgX7nfSLaNX/moral-error-as-an-existential-risk&quot;&gt;moral error&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I wrote more about non-alignment problems &lt;a href=&quot;https://mdickens.me/2025/11/20/research_wont_solve_non-alignment_problems/&quot;&gt;here&lt;/a&gt;. I think pausing AI is the best way to handle them, although this belief is weakly held.&lt;/p&gt;

&lt;h3 id=&quot;im-more-concerned-about-ai-for-animals&quot;&gt;I’m more concerned about “AI-for-animals”&lt;/h3&gt;

&lt;p&gt;By that I mean the problem of making sure that transformative AI is good for non-humans as well as humans.&lt;/p&gt;

&lt;p&gt;This is a reversion to my ~2015–2020 position. If you go back and read &lt;a href=&quot;https://mdickens.me/2015/09/15/my_cause_selection/&quot;&gt;My Cause Selection (2015)&lt;/a&gt;, I was concerned about AI misalignment, but I was also concerned about an aligned-to-humans AI being bad for animals (or other non-human beings), and I was hesitant to donate to any AI safety orgs for that reason.&lt;/p&gt;

&lt;p&gt;In &lt;a href=&quot;https://mdickens.me/2024/11/18/where_i_am_donating_in_2024/&quot;&gt;my 2024 cause prioritization&lt;/a&gt;, I didn’t pay attention to AI-for-animals because I reasoned that x-risk seemed more important.&lt;/p&gt;

&lt;p&gt;This year, in preparation for writing the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/CbHX5zL2uEvTasuiP/ai-safety-landscape-and-strategic-gaps&quot;&gt;AI safety landscape report&lt;/a&gt; for Rethink Priorities, they asked me to consider AI-for-animals interventions in my report. At first, I said I didn’t want to do that because misalignment risk was a bigger deal—if we solved AI alignment, non-humans would probably end up okay. But I changed my mind after considering a simple argument:&lt;/p&gt;

&lt;p&gt;Suppose there’s an 80% chance that an aligned(-to-humans) AI will be good for animals. That still leaves a 20% chance of a bad outcome. AI-for-animals receives much less than 20% as much funding as AI safety. Cost-effectiveness maybe scales with the inverse of the amount invested. Therefore, AI-for-animals interventions are more cost-effective on the margin than AI safety.&lt;/p&gt;
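
&lt;p&gt;Here’s that argument as arithmetic (the funding figures are invented placeholders, not real estimates):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# The neglectedness argument in one calculation.
# Funding figures are invented placeholders, not real estimates.

safety_importance = 1.0   # normalize the importance of misalignment risk to 1
animals_importance = 0.2  # the 20% chance of a bad outcome for animals

safety_funding = 100.0    # arbitrary units, say $millions
animals_funding = 2.0     # much less than 20% of safety funding

# Assume marginal cost-effectiveness scales as importance / funding.
print(safety_importance / safety_funding)    # 0.01
print(animals_importance / animals_funding)  # 0.10 -- 10x higher on the margin
&lt;/code&gt;&lt;/pre&gt;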

&lt;p&gt;So, although I believe AI misalignment is a higher-&lt;em&gt;probability&lt;/em&gt; risk, it’s not clear that it’s more &lt;em&gt;important&lt;/em&gt; than AI-for-animals.&lt;/p&gt;

&lt;h2 id=&quot;how-my-confidence-has-increased-since-last-year&quot;&gt;How my confidence has increased since last year&lt;/h2&gt;

&lt;h3 id=&quot;we-should-pause-frontier-ai-development&quot;&gt;We should pause frontier AI development&lt;/h3&gt;

&lt;p&gt;Last year, I thought a moratorium on frontier AI development was probably the best political outcome. Now I’m a bit more confident about that, largely because—as far as I can see—it’s the best way to handle &lt;a href=&quot;#im-more-concerned-about-non-alignment-problems&quot;&gt;non-alignment problems&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;peaceful-protests-probably-help&quot;&gt;Peaceful protests probably help&lt;/h3&gt;

&lt;p&gt;Last year, I donated to &lt;a href=&quot;https://www.pauseai-us.org/&quot;&gt;PauseAI US&lt;/a&gt; and &lt;a href=&quot;https://pauseai.info/&quot;&gt;PauseAI Global&lt;/a&gt; because I guessed that protests were effective. But I didn’t have much reason to believe that, just some &lt;a href=&quot;https://mdickens.me/2024/11/18/where_i_am_donating_in_2024/#pauseai-global&quot;&gt;vague arguments&lt;/a&gt;. In April of this year, I followed up with &lt;a href=&quot;https://mdickens.me/2025/04/18/protest_outcomes_critical_review/&quot;&gt;an investigation of the strongest evidence on protest outcomes&lt;/a&gt;, and I found that the quality of evidence was better than I’d expected. I am now pretty confident that peaceful demonstrations (like what PauseAI US and PauseAI Global do) have a positive effect. The high-quality evidence looked at nationwide protests; I &lt;a href=&quot;https://mdickens.me/2025/11/04/do_small_protests_work/&quot;&gt;couldn’t find good evidence on small protests&lt;/a&gt;, so I’m less confident about them, but I suspect that they do.&lt;/p&gt;

&lt;p&gt;I also &lt;a href=&quot;https://mdickens.me/2024/11/18/where_i_am_donating_in_2024/#stop-ai&quot;&gt;wrote about&lt;/a&gt; how I was skeptical of Stop AI, a different protest org that uses more disruptive tactics. I’ve also become more confident in my skepticism: I’ve been reading some literature on disruptive protests, and the evidence is mixed. That is, I’m still uncertain about whether disruptive protests work, but my uncertainty has shifted from “I haven’t looked into it” to “I’ve looked into it, and the evidence is ambiguous, so I was right to be uncertain.” (I’ve shifted from &lt;a href=&quot;https://www.overcomingbias.com/p/doctor-there-arhtml&quot;&gt;one kind of “no evidence” to the other&lt;/a&gt;.) For more, see my recent post, &lt;a href=&quot;https://mdickens.me/2025/11/19/do_disruptive_protests_work/&quot;&gt;Do Disruptive or Violent Protests Work?&lt;/a&gt;&lt;sup id=&quot;fnref:9&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;h3 id=&quot;i-have-a-high-bar-for-who-to-trust&quot;&gt;I have a high bar for who to trust&lt;/h3&gt;

&lt;p&gt;Last year, I looked for grantmakers who I could defer to, but I couldn’t find any who I trusted enough, so I did my own investigation. I’ve become increasingly convinced that that was the correct decision, and I am increasingly wary of people in the AI safety space—I think a large minority of them are predictably making things worse.&lt;/p&gt;

&lt;p&gt;I wrote my thoughts about this in a &lt;a href=&quot;https://www.lesswrong.com/posts/wn5jTrtKkhspshA4c/michaeldickens-s-shortform?commentId=EyH7PTDT5s2xKGsWC&quot;&gt;LessWrong quick take&lt;/a&gt;. In short, AI safety people/groups have a history of looking like they will prioritize x-risk, and then instead doing things that are unrelated or even predictably increase risk.&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; So I have a high bar for which orgs I trust, and I don’t want to donate to an org if it looks wishy-washy on x-risk, or if it looks suspiciously power-seeking (a la “superintelligent AI will only be safe if I’m the one who builds it”). I feel much better about giving to orgs that credibly and loudly signal that AI misalignment risk is their priority.&lt;/p&gt;

&lt;p&gt;Among grantmakers, I trust the &lt;a href=&quot;https://survivalandflourishing.fund/&quot;&gt;Survival &amp;amp; Flourishing Fund&lt;/a&gt; the most, but they don’t make recommendations for individual donors. SFF has a &lt;a href=&quot;https://survivalandflourishing.fund/2025/further-opportunities&quot;&gt;Further Opportunities&lt;/a&gt; page, which shows where they would like to see additional donations go. They are also matching donations on some of their &lt;a href=&quot;https://survivalandflourishing.fund/2025/recommendations&quot;&gt;2025 grants&lt;/a&gt; through the end of the year; donors may be especially interested in giving to orgs where they can get matching.&lt;/p&gt;

&lt;h1 id=&quot;my-favorite-interventions&quot;&gt;My favorite interventions&lt;/h1&gt;

&lt;p&gt;In the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/CbHX5zL2uEvTasuiP/ai-safety-landscape-and-strategic-gaps&quot;&gt;report&lt;/a&gt; I published this September, I reviewed a list of interventions related to AI and quickly evaluated their pros and cons. I arrived at four top ideas:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;https://forum.effectivealtruism.org/posts/CbHX5zL2uEvTasuiP/ai-safety-landscape-and-strategic-gaps#Talk_to_policy_makers_about_AI_x_risk&quot;&gt;Talk to policy-makers about AI x-risk&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://forum.effectivealtruism.org/posts/CbHX5zL2uEvTasuiP/ai-safety-landscape-and-strategic-gaps#Write_AI_x_risk_legislation&quot;&gt;Write AI x-risk legislation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://forum.effectivealtruism.org/posts/CbHX5zL2uEvTasuiP/ai-safety-landscape-and-strategic-gaps#Advocate_to_change_AI_training_to_make_LLMs_more_animal_friendly&quot;&gt;Advocate to change AI (post-)training to make LLMs more animal-friendly&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://forum.effectivealtruism.org/posts/CbHX5zL2uEvTasuiP/ai-safety-landscape-and-strategic-gaps#Develop_new_plans___evaluate_existing_plans_to_improve_post_TAI_animal_welfare&quot;&gt;Develop new plans / evaluate existing plans to improve post-TAI animal welfare&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The first two ideas relate to AI x-risk policy/advocacy, and the second two are about making AI go better for animals (or other non-human sentient beings).&lt;/p&gt;

&lt;p&gt;For my personal donations, I’m just focusing on x-risk.&lt;/p&gt;

&lt;p&gt;At equal funding levels, I expect AI x-risk work to be more cost-effective than work on AI-for-animals. The case for AI-for-animals is that it’s highly neglected. But the specific interventions I like best within AI x-risk are &lt;em&gt;also&lt;/em&gt; highly neglected, perhaps even more so.&lt;/p&gt;

&lt;p&gt;I’m more concerned about the state of funding in AI x-risk advocacy, so that’s where I plan on donating.&lt;/p&gt;

&lt;p&gt;A second consideration is that I want to support orgs that are trying to pause frontier AI development. If they succeed, that buys more time to work on AI-for-animals. So those orgs help both causes at the same time.&lt;/p&gt;

&lt;h1 id=&quot;organizations-tax-deductible&quot;&gt;Organizations (tax-deductible)&lt;/h1&gt;

&lt;p&gt;I’m not qualified to evaluate AI policy orgs, but I also &lt;a href=&quot;#i-have-a-high-bar-for-who-to-trust&quot;&gt;don’t trust anyone else&lt;/a&gt; enough to delegate to them, so I am reviewing them myself.&lt;/p&gt;

&lt;p&gt;I have a &lt;a href=&quot;https://docs.google.com/document/d/1vWB5CgH69W4lmpZrCXaD3n2Jqz32kVnvCJwUA2RE8Fw/&quot;&gt;Google doc&lt;/a&gt; with a list of every relevant organization I could find. Unlike in &lt;a href=&quot;https://mdickens.me/2024/11/18/where_i_am_donating_in_2024/&quot;&gt;my 2024 donation post&lt;/a&gt;, I’m not going to talk about all of the orgs on the list, just my top contenders. For the rest of the orgs I wrote about last year, my beliefs have mostly not changed.&lt;/p&gt;

&lt;p&gt;I separated my list into “tax-deductible” and “non-tax-deductible” because most of my charitable money is in my donor-advised fund, and that money can’t be used to support political groups. So the two types of donations aren’t coming out of the same pool of money.&lt;/p&gt;

&lt;h2 id=&quot;ai-for-animals-orgs&quot;&gt;AI-for-animals orgs&lt;/h2&gt;

&lt;p&gt;As I mentioned &lt;a href=&quot;#my-favorite-interventions&quot;&gt;above&lt;/a&gt;, I don’t plan on donating to orgs in the AI-for-animals space, and I haven’t looked much into them. But I will briefly list some orgs anyway. My first impression is that all of these orgs are doing good work.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.compassionml.com/&quot;&gt;Compassion in Machine Learning&lt;/a&gt; does research and works with AI companies to make LLMs more animal-friendly.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://sites.google.com/nyu.edu/mindethicspolicy/home&quot;&gt;NYU Center for Mind, Ethics, and Policy&lt;/a&gt; conducts and supports foundational research on the nature of nonhuman minds, including biological and artificial minds.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.openpaws.ai/&quot;&gt;Open Paws&lt;/a&gt; creates AI tools to help animal activists and software developers make AI more compassionate toward animals.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.sentienceinstitute.org/&quot;&gt;Sentience Institute&lt;/a&gt; conducts foundational research on long-term moral-circle expansion and digital-mind welfare.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.sentientfutures.ai/&quot;&gt;Sentient Futures&lt;/a&gt; organizes conferences on how AI impacts non-human welfare (including farm animals, wild animals, and digital minds); built an &lt;a href=&quot;https://arxiv.org/pdf/2503.04804&quot;&gt;animal-friendliness LLM benchmark&lt;/a&gt;; and is hosting an upcoming &lt;a href=&quot;https://airtable.com/appemEougAoK9dCF5/pagV2quvK8cye1v5Q/form&quot;&gt;war game&lt;/a&gt; on how AGI could impact animal advocacy.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.wildanimalinitiative.org/&quot;&gt;Wild Animal Initiative&lt;/a&gt; mostly does research on wild animal welfare, but it has done some work on AI-for-animals (see &lt;a href=&quot;https://www.wildanimalinitiative.org/&quot;&gt;Transformative AI and wild animals: An exploration&lt;/a&gt;).&lt;/p&gt;

&lt;h2 id=&quot;ai-safety-and-governance-fund&quot;&gt;AI Safety and Governance Fund&lt;/h2&gt;

&lt;p&gt;The AI Safety and Governance Fund does &lt;a href=&quot;https://manifund.org/projects/testing-and-spreading-messages-to-reduce-ai-x-risk&quot;&gt;message testing&lt;/a&gt; on what sorts of AI safety messaging people find compelling. More recently, they &lt;a href=&quot;https://www.lesswrong.com/posts/w5tzAyRxdGhHfvxxB/we-ve-automated-x-risk-pilling-people&quot;&gt;created a chatbot&lt;/a&gt; that talks about AI x-risk, which feeds into their messaging experiments; they also have &lt;a href=&quot;https://aisgf.us/fundraising&quot;&gt;plans&lt;/a&gt; for new activities they could pursue with additional funding.&lt;/p&gt;

&lt;p&gt;I liked AI Safety and Governance Fund’s original project, and I donated $10,000 because I expected it could do a lot of message testing for not much money. I’m more uncertain about its new project, and about how well message testing can scale. I’m optimistic, but not optimistic enough for the org to be one of my top donation candidates, so I’m not donating more this year.&lt;/p&gt;

&lt;h2 id=&quot;existential-risk-observatory&quot;&gt;Existential Risk Observatory&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.existentialriskobservatory.org/&quot;&gt;Existential Risk Observatory&lt;/a&gt; writes &lt;a href=&quot;https://www.existentialriskobservatory.org/#in-the-media&quot;&gt;media articles&lt;/a&gt; on AI x-risk, does &lt;a href=&quot;https://www.existentialriskobservatory.org/research-2/&quot;&gt;policy research&lt;/a&gt;, and publishes &lt;a href=&quot;https://www.existentialriskobservatory.org/policy-proposals/&quot;&gt;policy proposals&lt;/a&gt; (see &lt;a href=&quot;https://existentialriskobservatory.org/papers_and_reports/Policy%20Proposals.pdf&quot;&gt;pdf&lt;/a&gt; with a summary of proposals).&lt;/p&gt;

&lt;p&gt;Last year, I &lt;a href=&quot;https://mdickens.me/2024/11/18/where_i_am_donating_in_2024/#existential-risk-observatory&quot;&gt;wrote&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;My primary concern is that it operates in the Netherlands. Dutch policy is unlikely to have much influence on x-risk—the United States is the most important country by far, followed by China. And a Dutch organization likely has little influence on United States policy. Existential Risk Observatory can still influence public opinion in America (for example via its TIME article), but I expect a US-headquartered org to have a greater impact.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I’m less concerned about that now—I believe I gave too little weight to the fact that Existential Risk Observatory has published articles in international media outlets.&lt;/p&gt;

&lt;p&gt;I still like media outreach as a form of impact, but it’s not my &lt;em&gt;favorite&lt;/em&gt; thing, so Existential Risk Observatory is not one of my top candidates.&lt;/p&gt;

&lt;h2 id=&quot;machine-intelligence-research-institute-miri&quot;&gt;Machine Intelligence Research Institute (MIRI)&lt;/h2&gt;

&lt;p&gt;The biggest news from &lt;a href=&quot;https://intelligence.org/&quot;&gt;MIRI&lt;/a&gt; in 2025 is that they &lt;a href=&quot;https://ifanyonebuildsit.com/&quot;&gt;published a book&lt;/a&gt;. The book was widely read and got some &lt;a href=&quot;https://www.lesswrong.com/posts/khmpWJnGJnuyPdipE/new-endorsements-for-if-anyone-builds-it-everyone-dies&quot;&gt;endorsements&lt;/a&gt; from important people, including people who I wouldn’t have expected to give endorsements. It remains to be seen what sort of lasting impact the book will have, but the launch went better than I would’ve predicted a year ago (perhaps in the 75th percentile).&lt;/p&gt;

&lt;p&gt;MIRI’s 2026 plans include:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;growing the comms team and continuing to promote the book;&lt;/li&gt;
  &lt;li&gt;talking to policy-makers, think tanks, etc. about AI x-risk;&lt;/li&gt;
  &lt;li&gt;growing the Technical Governance team, which does policy research on how to implement a global ban on ASI.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’m less enthusiastic about policy research than about advocacy, but I like MIRI’s approach to policy research better than any other org’s. Most AI policy orgs take an academia-style approach of “what are some novel things we can publish about AI policy?” MIRI takes a more motivated approach of “what policies are necessary to prevent extinction, and what needs to happen before those policies can be implemented?” Most policy research orgs spend too much time on &lt;a href=&quot;https://en.wikipedia.org/wiki/Streetlight_effect&quot;&gt;streetlight-effect&lt;/a&gt; policies; MIRI is strongly oriented toward preventing extinction.&lt;/p&gt;

&lt;p&gt;I also like MIRI better than I did a year ago because I realized they deserve a “stable preference bonus”.&lt;/p&gt;

&lt;p&gt;In &lt;a href=&quot;https://mdickens.me/2015/09/15/my_cause_selection/&quot;&gt;My Cause Selection (2015)&lt;/a&gt;, MIRI was my #2 choice for where to donate. In 2024, MIRI again made my list of finalists. The fact that I’ve liked MIRI for 10 years is good evidence that I’ll continue to like it.&lt;/p&gt;

&lt;p&gt;Maybe next year I will change my mind about my other top candidates, but—according to the &lt;a href=&quot;https://en.wikipedia.org/wiki/Lindy_effect&quot;&gt;Lindy effect&lt;/a&gt;—I bet I won’t change my mind about MIRI.&lt;/p&gt;

&lt;p&gt;The Survival &amp;amp; Flourishing Fund is &lt;a href=&quot;https://survivalandflourishing.fund/2025/recommendations&quot;&gt;matching&lt;/a&gt; 2025 donations to MIRI up to $1.3 million.&lt;/p&gt;

&lt;h2 id=&quot;palisade-research&quot;&gt;Palisade Research&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://palisaderesearch.org/&quot;&gt;Palisade&lt;/a&gt; builds demonstrations of the offensive capabilities of AI systems, with the goal of illustrating risks to policy-makers. My opinion on Palisade is mostly unchanged since &lt;a href=&quot;https://mdickens.me/2024/11/18/where_i_am_donating_in_2024/#palisade-research&quot;&gt;last year&lt;/a&gt;, which is to say it’s one of my favorite AI safety nonprofits.&lt;/p&gt;

&lt;p&gt;They did not respond to my emails asking about their fundraising situation. Palisade did recently receive funding from the Survival &amp;amp; Flourishing Fund (SFF) and appeared on their &lt;a href=&quot;https://survivalandflourishing.fund/2025/further-opportunities&quot;&gt;Further Opportunities page&lt;/a&gt;, which means SFF thinks Palisade can productively use more funding.&lt;/p&gt;

&lt;p&gt;The Survival &amp;amp; Flourishing Fund is &lt;a href=&quot;https://survivalandflourishing.fund/2025/recommendations&quot;&gt;matching&lt;/a&gt; 2025 donations to Palisade up to $900,000.&lt;/p&gt;

&lt;h2 id=&quot;pauseai-us&quot;&gt;PauseAI US&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.pauseai-us.org/&quot;&gt;PauseAI US&lt;/a&gt; was the main place I donated last year. Since then, I’ve become &lt;a href=&quot;https://mdickens.me/2025/04/18/protest_outcomes_critical_review/&quot;&gt;more optimistic&lt;/a&gt; that protests are net positive.&lt;/p&gt;

&lt;p&gt;Pause protests haven’t had any big visible effects in the last year, which is what I expected,&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; but it’s a weak negative update that the protests haven’t yet gotten traction.&lt;/p&gt;

&lt;p&gt;I did not list protests as one of my &lt;a href=&quot;#my-favorite-interventions&quot;&gt;favorite interventions&lt;/a&gt;; in the abstract, I like political advocacy better. But political advocacy is more difficult to evaluate, operates in a more adversarial information environment&lt;sup id=&quot;fnref:8&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;, and less neglected. There is some hypothetical political advocacy that I like better than protests, but it’s much harder to tell whether the real-life opportunities live up to that hypothetical.&lt;/p&gt;

&lt;p&gt;PauseAI US has hired a full-time lobbyist. He’s less experienced than the lobbyists at some other AI safety orgs, but I know that his lobbying efforts straightforwardly focus on x-risk instead of doing some kind of complicated political maneuvering that’s hard for me to evaluate, like what some other orgs do. PauseAI US has had some early successes but it’s hard for me to judge how important they are.&lt;/p&gt;

&lt;p&gt;Something that didn’t occur to me last year, but that I now believe matters a lot, is that PauseAI US organizes letter-writing campaigns. In May, PauseAI US &lt;a href=&quot;https://pauseaius.substack.com/p/call-to-action-contact-your-senators&quot;&gt;organized a campaign&lt;/a&gt; to ask Congress members not to impose a 10-year moratorium on AI regulation; they have an &lt;a href=&quot;https://pauseai-us.org/RiskEvalAct&quot;&gt;ongoing campaign&lt;/a&gt; in support of the AI Risk Evaluation Act. According to my recent &lt;a href=&quot;https://mdickens.me/2025/11/08/call_or_write_your_representatives/&quot;&gt;cost-effectiveness analysis&lt;/a&gt;, messaging campaigns look valuable, and right now nobody else is doing it.&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt; It could be that these campaigns are the most important function of PauseAI US.&lt;/p&gt;

&lt;h2 id=&quot;video-projects&quot;&gt;Video projects&lt;/h2&gt;

&lt;p&gt;Recently, more people have been trying to advocate for AI safety by &lt;a href=&quot;https://forum.effectivealtruism.org/posts/h2WB4gnLCb8qekk5r/what-s-going-on-in-video-in-ai-safety-these-days-a-list&quot;&gt;making videos&lt;/a&gt;. I like that this is happening, but I don’t have a good sense of how to evaluate video projects, so I’m going to punt on it. For some discussion, see &lt;a href=&quot;https://forum.effectivealtruism.org/posts/SBsGCwkoAemPawfJz/how-cost-effective-are-ai-safety-youtubers&quot;&gt;How cost-effective are AI safety YouTubers?&lt;/a&gt; and &lt;a href=&quot;https://forum.effectivealtruism.org/posts/d9kEfvKq3uqwjeRFJ/rethinking-the-impact-of-ai-safety-videos-extending-austin&quot;&gt;Rethinking The Impact Of AI Safety Videos&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&quot;non-tax-deductible-donation-opportunities&quot;&gt;Non-tax-deductible donation opportunities&lt;/h1&gt;

&lt;p&gt;I didn’t start thinking seriously about non-tax-deductible opportunities until late September. By late October, it was apparent that I had too many unanswered questions to be able to publish this post in time for giving season.&lt;/p&gt;

&lt;p&gt;Instead of explaining my position on these non-tax-deductible opportunities (because I don’t have one), I’ll explain what open questions I want to answer.&lt;/p&gt;

&lt;p&gt;There’s a good chance I will donate to one of these opportunities before the end of the year. If I do, I’ll write a follow-up post about it (which is why this post is titled Part 1).&lt;/p&gt;

&lt;h2 id=&quot;ai-policy-network&quot;&gt;AI Policy Network&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://theaipn.org/&quot;&gt;AI Policy Network&lt;/a&gt; advocates for US Congress to pass AI safety regulation. From its description of &lt;a href=&quot;https://theaipn.org/the-issue/&quot;&gt;The Issue&lt;/a&gt;, it appears appropriately concerned about misalignment risk, but it also says&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;AGI would further have large implications for national security and the balance of power. If an adversarial nation beats the U.S. to AGI, they could potentially use the power it would provide – in technological advancement, economic activity, and geopolitical strategy – to reshape the world order against U.S. interests.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I find this sort of language concerning because it appears to be encouraging an arms race, although I don’t think that’s what the writers of this paragraph want.&lt;/p&gt;

&lt;p&gt;I don’t have a good understanding of what AI Policy Network does, so I need to learn more.&lt;/p&gt;

&lt;h2 id=&quot;americans-for-responsible-innovation-ari&quot;&gt;Americans for Responsible Innovation (ARI)&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://ari.us/&quot;&gt;Americans for Responsible Innovation&lt;/a&gt; (ARI) is the sort of respectable-looking org that I don’t expect to struggle for funding. But I spoke to someone at ARI who believes that the best donation opportunities depend on small donors because there are legal donation caps. Even if the org as a whole is well-funded, it depends on small donors to fund its &lt;a href=&quot;https://en.wikipedia.org/wiki/Political_action_committee&quot;&gt;PAC&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I want to put more thought into how valuable ARI’s activities are, but I haven’t had time to do that yet. My outstanding questions:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;How cost-effective is ARI’s advocacy (e.g. compared to &lt;a href=&quot;https://mdickens.me/2025/11/08/call_or_write_your_representatives/&quot;&gt;messaging campaigns&lt;/a&gt;)? (I have weak reason to believe it’s more cost-effective.)&lt;/li&gt;
  &lt;li&gt;How much do I agree with ARI’s policy objectives, and how much should I trust them?&lt;/li&gt;
  &lt;li&gt;ARI is pretty opaque about what they do. How concerned should I be about that?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;controlai&quot;&gt;ControlAI&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://controlai.com/&quot;&gt;ControlAI&lt;/a&gt; is the most x-risk-focused of the 501(c)(4)s, and the only one that advocates for a pause on AI development. They started operations in the UK, and this year they have &lt;a href=&quot;https://www.lesswrong.com/posts/Xwrajm92fdjd7cqnN/what-we-learned-from-briefing-70-lawmakers-on-the-threat?commentId=fBZuzesMbtpA6B5tF&quot;&gt;expanded to the US&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Some thoughts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;ControlAI’s &lt;a href=&quot;https://aitreaty.org/&quot;&gt;open letter&lt;/a&gt; calling for an international treaty looks eminently reasonable.&lt;/li&gt;
  &lt;li&gt;ControlAI had success getting UK politicians to &lt;a href=&quot;https://controlai.com/statement#supporters&quot;&gt;support&lt;/a&gt; their statement on AI risk.&lt;/li&gt;
  &lt;li&gt;They wrote a &lt;a href=&quot;https://www.lesswrong.com/posts/Xwrajm92fdjd7cqnN/what-we-learned-from-briefing-70-lawmakers-on-the-threat&quot;&gt;LessWrong post&lt;/a&gt; about what they learned from talking to policy-makers about AI risk, which was a valuable post that demonstrated thoughtfulness.&lt;/li&gt;
  &lt;li&gt;I liked ControlAI &lt;a href=&quot;https://mdickens.me/2024/11/18/where_i_am_donating_in_2024/#control-ai&quot;&gt;last year&lt;/a&gt;, but at the time they only operated in the UK, so they weren’t a finalist. This year they are expanding internationally.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;ControlAI is tentatively my favorite non-tax-deductible org because they’re the most transparent and the most focused on x-risk.&lt;/p&gt;

&lt;h2 id=&quot;congressional-campaigns&quot;&gt;Congressional campaigns&lt;/h2&gt;

&lt;p&gt;Two state legislators, &lt;a href=&quot;https://www.scottwiener.com/&quot;&gt;Scott Wiener&lt;/a&gt; and &lt;a href=&quot;https://linkin.bio/alexbores/&quot;&gt;Alex Bores&lt;/a&gt;, are running for US Congress. Both of them have sponsored successful AI safety legislation at the state level (SB 53 and the RAISE Act, respectively). We need AI safety advocates in US Congress, or bills won’t get sponsored.&lt;/p&gt;

&lt;p&gt;Outstanding questions:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The bills these representatives sponsored were a step in the right direction, but far too weak to prevent extinction. How useful are weak regulations?&lt;/li&gt;
  &lt;li&gt;How likely are they to sponsor stronger regulations in the future? (And how much does that matter?)&lt;/li&gt;
  &lt;li&gt;How could this go badly if these reps turn out not to be good advocates for AI safety? (Maybe they create polarization, or don’t navigate the political landscape well, or make the cause of AI safety look bad, or simply never advocate for the sorts of policies that would actually prevent extinction.)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;encode&quot;&gt;Encode&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://encodeai.org/what-we-do/&quot;&gt;Encode&lt;/a&gt; does political advocacy on AI x-risk. They also have &lt;a href=&quot;https://encodeai.org/our-chapters/&quot;&gt;local chapters&lt;/a&gt; that do something (I’m not clear on what).&lt;/p&gt;

&lt;p&gt;They have a good track record of political action:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Encode co-sponsored SB 1047 and the new &lt;a href=&quot;https://sd11.senate.ca.gov/news/senator-wiener-introduces-legislation-protect-ai-whistleblowers-boost-responsible-ai&quot;&gt;SB 53&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Encode filed in support of Musk’s lawsuit against OpenAI’s for-profit conversion, which was potentially the &lt;a href=&quot;https://www.lesswrong.com/posts/wCc7XDbD8LdaHwbYg/openai-moves-to-complete-potentially-the-largest-theft-in&quot;&gt;largest theft in human history&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Encode is relatively transparent and relatively focused on the big problems, although not to the same extent as ControlAI.&lt;/p&gt;

&lt;h1 id=&quot;where-im-donating&quot;&gt;Where I’m donating&lt;/h1&gt;

&lt;p&gt;All of the orgs on my 501(c)(3) list deserve more funding. (I suspect the same is true of the 501(c)(4)s, but I’m not confident.) &lt;strong&gt;My favorite 501(c)(3) donation target is PauseAI US&lt;/strong&gt; because:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Someone should be organizing protests. The only US-based orgs doing that are PauseAI US and Stop AI, and I have some concerns about Stop AI that I discussed &lt;a href=&quot;https://mdickens.me/2024/11/18/where_i_am_donating_in_2024/#stop-ai&quot;&gt;last year&lt;/a&gt; and &lt;a href=&quot;#peaceful-protests-probably-help&quot;&gt;above&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Someone should be running messaging campaigns to support good legislation and oppose bad legislation. Only PauseAI US is doing that.&lt;/li&gt;
  &lt;li&gt;PauseAI US is small and doesn’t get much funding, and in particular doesn’t get support from any grantmakers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words, PauseAI US is serving some important functions that nobody else is on top of, and I really want them to be able to keep doing that.&lt;/p&gt;

&lt;p&gt;My plan is to donate $40,000 to PauseAI US.&lt;/p&gt;

&lt;h1 id=&quot;changelog&quot;&gt;Changelog&lt;/h1&gt;

&lt;p&gt;2025-11-22: Corrected description of AI Safety and Governance Fund.&lt;/p&gt;

&lt;p&gt;2025-11-29: Corrected description of Survival &amp;amp; Flourishing Fund’s donation matching.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Although I’m probably more optimistic about it than a lot of people. For example, before the 2023 &lt;a href=&quot;https://futureoflife.org/open-letter/pause-giant-ai-experiments/&quot;&gt;FLI Open Letter&lt;/a&gt;, a lot of people would’ve predicted that this sort of letter would never be able to get the sort of attention that it ended up getting. (I would’ve put pretty low odds on it, too; but I changed my mind after seeing how many signatories it got.) &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I disagree with the way many AI safety people use the term “moderate”. I think my position of “this thing might kill everyone and we have no idea how to make it not do that, therefore it should be illegal to build” is pretty damn moderate. Mild, even. There are far less dangerous things that are rightly illegal. The standard-AI-company position of “this has a &amp;gt;10% chance of killing everyone, but let’s build it anyway” is, I think, much stranger (to put it politely). And it’s strange that people act like that position is the moderate one. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Perhaps we get lucky, and prosaic alignment is good enough to fully solve the alignment problem (and then the aligned AI solves all non-alignment problems). Perhaps we get lucky, and superintelligence turns out to be much harder to build than we thought, and it’s still decades away. Perhaps we get lucky, and takeoff is slow and gives us a lot of time to iterate on alignment. Perhaps we get lucky, and there’s a warning shot that forces world leaders to take AI risk seriously. Perhaps we get lucky, and James Cameron makes &lt;em&gt;Terminator 7: Here’s How It Will Happen In Real Life If We Don’t Change Course&lt;/em&gt; and the movie changes everything. Perhaps we get lucky, and I’m dramatically misunderstanding the alignment problem and it’s actually not a problem at all.&lt;/p&gt;

      &lt;p&gt;Each of those things is unlikely on its own. But when you add up all the probabilities of those things and everything else in the same genre, you end up with decent odds that we survive. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:9&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I do think Stop AI is morally justified in blockading AI companies’ offices. AI companies are trying to build the thing that kills everyone; Stop AI protesters are justified in (non-violently) trying to stop them from doing that. Some of the protesters have been taken to trial, and if the courts are just, they will be found not guilty. But I dislike disruptive protests on pragmatic grounds because they don’t appear particularly effective. &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I want to distinguish “predictably” from “unpredictably”. For example, MIRI’s work on raising concern for AI risk appears to have played a role in motivating Sam Altman to start OpenAI, which greatly increased x-risk (and was possibly the worst thing to ever happen in history, if OpenAI ends up being the company to build the AI that kills everyone). But I don’t think it was predictable in advance that MIRI’s work would turn out to be harmful in that way, so I don’t hold it against them. &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;On my model, most of the expected value of running protests comes from the small probability that they grow a lot, either due to natural momentum or because some inciting event (like a warning shot) suddenly makes many more people concerned about AI risk. &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I have a good understanding of the effectiveness of protests because I’ve &lt;a href=&quot;https://mdickens.me/2025/04/18/protest_outcomes_critical_review/&quot;&gt;done the research&lt;/a&gt;. For political interventions, most information about their effectiveness comes from the people doing the work, and I can’t trust them to honestly evaluate themselves. And many kinds of political action involve a certain Machiavellian-ness, which brings various conundrums that make it harder to tell whether the work is worth funding. &lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;MIRI and ControlAI have open-ended “contact your representative” pages (links: &lt;a href=&quot;https://ifanyonebuildsit.com/act/letter&quot;&gt;MIRI&lt;/a&gt;, &lt;a href=&quot;https://controlai.com/take-action/choose&quot;&gt;ControlAI&lt;/a&gt;), but they haven’t done messaging campaigns on specific legislation. &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>An unnecessarily long analysis of one line from The Princess Bride</title>
				<pubDate>Fri, 21 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/21/inconceivable/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/21/inconceivable/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;img src=&quot;/assets/images/Inigo.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Vizzini: Inconceivable!&lt;/p&gt;

  &lt;p&gt;Inigo: You keep using that word. I do not think it means what you think it means.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;What did Inigo mean by this?&lt;/p&gt;

&lt;p&gt;(Don’t laugh, this is serious.)&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;The statement can be interpreted in two ways:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;I do not think [it means what you think it means].&lt;/li&gt;
  &lt;li&gt;I do not [think it means] what you [think it means].&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Or, in other words:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;It is my belief that your definition of “inconceivable” is incorrect.&lt;/li&gt;
  &lt;li&gt;I have a belief as to what “inconceivable” means; you have a belief as to what it means; and our two definitions disagree.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I’ve wondered about this for years. You might think it’s immaterial, because they both amount to the same thing: Vizzini is using the word “inconceivable” incorrectly, according to Inigo. But the two interpretations have subtle philosophical differences.&lt;/p&gt;
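
&lt;p&gt;One way to make the difference precise is to formalize the two readings. Here’s a minimal sketch; the notation is my own invention, not anything from the film or the book:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-latex&quot;&gt;
% Notation (mine, for illustration): d_I and d_V are Inigo's and
% Vizzini's actual definitions of the word; \hat{d}_V is Inigo's
% belief about Vizzini's definition.
\text{Reading 1:}\quad d_I \neq \hat{d}_V
\text{Reading 2:}\quad d_I \neq d_V
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Reading 1 is a claim about Inigo’s model of Vizzini; Reading 2 directly compares the two definitions, which is why it can be true without Inigo knowing it’s true. The hypotheticals below pull the two readings apart.&lt;/p&gt;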

&lt;p&gt;By way of illustration, suppose Vizzini defines inconceivable as “my mind could not have conceived of this possibility”, and Inigo defines it as “my mind could not have conceived of this possibility”. Both use the same definition. However, Inigo &lt;em&gt;believes&lt;/em&gt; that Vizzini’s definition is something more like “this state of affairs disappoints me”.&lt;/p&gt;

&lt;p&gt;Therefore, the statement “I do not think [it means what you think it means]” is &lt;strong&gt;true&lt;/strong&gt;, because Inigo’s definition is not the same as what Inigo believes to be Vizzini’s definition. However, the statement “I do not [think it means] what you [think it means]” is &lt;strong&gt;false&lt;/strong&gt;, because Inigo and Vizzini are using the same definition.&lt;/p&gt;

&lt;p&gt;But this scenario is ruled out by the fact that Vizzini would be using the word incorrectly &lt;em&gt;according to his own definition&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Consider another hypothetical. Suppose Vizzini and Inigo both know the correct definition of the word, but they have different philosophies on hyperbole, and Vizzini is more lax about using words in a hyperbolic sense. They have a disagreement, but the disagreement is not about the meaning of the word.&lt;/p&gt;

&lt;p&gt;In this hypothetical, “I do not think [it means what you think it means]” is &lt;strong&gt;true&lt;/strong&gt;—Inigo does indeed hold the (false) belief that Vizzini’s definition is wrong. But “I do not [think it means] what you [think it means]” is &lt;strong&gt;false&lt;/strong&gt;, because in fact both parties use the same definition.&lt;/p&gt;

&lt;p&gt;Let’s move to one final hypothetical. Suppose Vizzini defines inconceivable as “any activity that is performed by a person wearing all black clothing and a black mask”. For example, if a man in black climbs a cliff without using a rope, that would be inconceivable.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; If a man in black does something mundane like eating a piece of bread, that would be inconceivable. But if a &lt;em&gt;woman in white&lt;/em&gt;&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; were to climb up a cliff without a rope, that would &lt;em&gt;not&lt;/em&gt; be inconceivable.&lt;/p&gt;

&lt;p&gt;And suppose Inigo doesn’t know that’s how Vizzini uses the word (because that is a very strange definition). As in the previous hypothetical, Inigo believes that Vizzini defines inconceivable as “this state of affairs disappoints me”.&lt;/p&gt;

&lt;p&gt;Now return to the two interpretations of Inigo’s statement. “I do not think [it means what you think it means]” is &lt;strong&gt;true&lt;/strong&gt;. Inigo thinks Vizzini thinks it means “this disappoints me”, so the bracketed statement “[it means what you think it means]” resolves to “[it means ‘this disappoints me’]”; and Inigo doesn’t think that’s what the word means.&lt;/p&gt;

&lt;p&gt;Interpretation #2, “I do not [think it means] what you [think it means]”, is &lt;strong&gt;also true&lt;/strong&gt;. Inigo’s definition does not equal Vizzini’s definition. However, &lt;strong&gt;Inigo does not have knowledge that this statement is true.&lt;/strong&gt; He has a justified true belief, his justification being that Vizzini keeps using the word incorrectly. But the justification is false. This is an example of the classic &lt;a href=&quot;https://en.wikipedia.org/wiki/Gettier_problem&quot;&gt;Gettier problem&lt;/a&gt; that is one of the most-studied puzzles in epistemology.&lt;/p&gt;

&lt;p&gt;Across these three hypothetical edge cases, we have seen that the first interpretation of Inigo’s statement is consistently true. The second interpretation can be false, and it can also be true-but-not-knowledge. Therefore, given the problems with the second interpretation, I conclude that &lt;strong&gt;the first interpretation is the correct one:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;I do not think [it means what you think it means].&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There are conceivable scenarios in which the first interpretation is false and the second is true, but they require strange suppositions like “Inigo has incorrect beliefs about what his own beliefs are”. I don’t think there’s any &lt;em&gt;reasonable&lt;/em&gt; scenario that makes the first interpretation false.&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;I’ve been wondering about this for nearly 20 years. It was inconceivable that I would ever find an answer, but through some hard work and careful thinking, I’ve finally resolved the mystery.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;This is a slight misrepresentation of what happened; I’m taking some liberties to simplify the story. In the movie, Vizzini cut the rope leading up the Cliffs of Insanity, and then found it “inconceivable” when the man in black—who had been climbing the rope—didn’t fall. The man in black then started slowly free-climbing the cliff, but Vizzini did not say “inconceivable” in response to this.&lt;/p&gt;

      &lt;p&gt;In the book, Vizzini &lt;em&gt;did&lt;/em&gt; find it “inconceivable” that the man in black could continue to climb (page 102), but Inigo didn’t say that line in the book. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;or a non-binary person in green &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;A quote from the book supports my thesis. On page 103, Vizzini argues that he has been using the word correctly, that the man in black is &lt;em&gt;not&lt;/em&gt; following them, and that it is inconceivable that he &lt;em&gt;could&lt;/em&gt; be following them. His argument seemingly renders the first interpretation true and the second interpretation false.&lt;/p&gt;

      &lt;p&gt;The exact quote from the book:&lt;/p&gt;

      &lt;blockquote&gt;
        &lt;p&gt;“I have the keenest mind that has ever been turned to unlawful pursuits,” [Vizzini] began, “so when I tell you something, it is not guesswork; it is fact! And the fact is that the man in black is &lt;em&gt;not&lt;/em&gt; following us. A more logical explanation would be that he is simply an ordinary sailor who dabbles in mountain climbing as a hobby who happens to have the same general final destination as we do. That certainly satisfies me and I hope it satisfies you. In any case, we cannot take the risk of his seeing us with the princess, and therefore one of you must kill him.”&lt;/p&gt;
      &lt;/blockquote&gt;

      &lt;p&gt;However, the textual evidence is muddled by the fact that Vizzini says “Inconceivable!” after the observation that the man in black is &lt;em&gt;climbing the cliff&lt;/em&gt;, not specifically that he is &lt;em&gt;following them&lt;/em&gt;; which suggests that Vizzini’s definition is wrong after all, and the second interpretation of Inigo’s statement is not ruled out. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>We won't solve post-alignment problems by doing research</title>
				<pubDate>Thu, 20 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/20/research_wont_solve_non-alignment_problems/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/20/research_wont_solve_non-alignment_problems/</guid>
                <description>
                  
                  
                  
                  &lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;

&lt;p&gt;Even if we solve the AI alignment problem, we still face &lt;strong&gt;post-alignment problems&lt;/strong&gt;, which are all the other existential problems&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; that AI may bring.&lt;/p&gt;

&lt;p&gt;People have written research agendas on various imposing problems that we are nowhere close to solving, and that we may need to solve before developing ASI. An incomplete list of topics: &lt;a href=&quot;https://longtermrisk.org/overview-of-transformative-ai-misuse-risks-what-could-go-wrong-beyond-misalignment/&quot;&gt;misuse&lt;/a&gt;; &lt;a href=&quot;https://forum.effectivealtruism.org/posts/2cZAzvaQefh5JxWdb/bringing-about-animal-inclusive-ai&quot;&gt;animal-inclusive AI&lt;/a&gt;; &lt;a href=&quot;https://eleosai.org/post/research-priorities-for-ai-welfare/&quot;&gt;AI welfare&lt;/a&gt;; &lt;a href=&quot;https://longtermrisk.org/research-agenda&quot;&gt;S-risks from conflict&lt;/a&gt;; &lt;a href=&quot;https://www.lesswrong.com/posts/GAv4DRGyDHe2orvwB/gradual-disempowerment-concrete-research-projects&quot;&gt;gradual disempowerment&lt;/a&gt;; &lt;a href=&quot;https://arxiv.org/html/2502.07050v1&quot;&gt;permanent mass unemployment&lt;/a&gt;; &lt;a href=&quot;https://forum.effectivealtruism.org/posts/LpkXtFXdsRd4rG8Kb/reducing-long-term-risks-from-malevolent-actors&quot;&gt;risks from malevolent actors&lt;/a&gt;; &lt;a href=&quot;https://forum.effectivealtruism.org/posts/HqmQMmKgX7nfSLaNX/moral-error-as-an-existential-risk&quot;&gt;moral error&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The standard answer to these problems, the one that most research agendas take for granted, is “do research”. Specifically, do research in the conventional way where you create a research agenda, explore some research questions, and fund other people to work on those questions.&lt;/p&gt;

&lt;p&gt;If transformative AI arrives within the next decade, then we won’t solve post-alignment problems by doing research on how to solve them.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;These problems are thorny, to put it mildly. They’re the sorts of problems where you have no idea how much progress you’re making or how much work it will take. I can think of analogous philosophical problems that have seen depressingly little progress in 300 years. I don’t expect to see meaningful progress in the next 10.&lt;/p&gt;

&lt;p&gt;Beyond that, there are multiple post-alignment problems. The future could be catastrophic if we get even one of them wrong. Most lines of research only address one out of the many problems. We might get lucky and solve one major post-alignment problem before transformative AI arrives, but it’s extremely unlikely that we solve all of them.&lt;/p&gt;

&lt;p&gt;Instead of directly working on post-alignment problems, we should be working on how to increase the probability that post-alignment problems get solved.&lt;/p&gt;

&lt;p&gt;This essay will consider four ways to do that:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Do meta-research on what research topics are most likely to help with all post-alignment problems simultaneously.&lt;/li&gt;
  &lt;li&gt;Pause frontier AI development until we know how to solve post-alignment problems (and the alignment problem too).&lt;/li&gt;
  &lt;li&gt;Develop human-level “assistant” AI first, then leverage AI to solve post-alignment problems.&lt;/li&gt;
  &lt;li&gt;Steer AI development such that an autonomous ASI is more likely to solve post-alignment problems.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;If you’re working on post-alignment problems, and especially if you’re writing a research agenda, then don’t take it for granted that “do direct research” is the right solution.&lt;/strong&gt; If that’s what you believe, then support that position with argument. At minimum, I would like to see more post-alignment researchers engage with the question of what to do if timelines are short or progress is intractable.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Edited 2026-03-23 to rename from “non-alignment problems” to “post-alignment problems”. I’m still not satisfied with this name, but I’m told that it’s less confusing.&lt;/em&gt;&lt;/p&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#introduction&quot; id=&quot;markdown-toc-introduction&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#approach-1-meta-research-on-what-approach-to-use&quot; id=&quot;markdown-toc-approach-1-meta-research-on-what-approach-to-use&quot;&gt;Approach 1: Meta-research on what approach to use&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#approach-2-pause-ai&quot; id=&quot;markdown-toc-approach-2-pause-ai&quot;&gt;Approach 2: Pause AI&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#approach-3-develop-human-level-ai-first-then-maybe-pause&quot; id=&quot;markdown-toc-approach-3-develop-human-level-ai-first-then-maybe-pause&quot;&gt;Approach 3: Develop human-level AI first, then (maybe) pause&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#approach-4-research-how-to-steer-asi-toward-solving-post-alignment-problems&quot; id=&quot;markdown-toc-approach-4-research-how-to-steer-asi-toward-solving-post-alignment-problems&quot;&gt;Approach 4: Research how to steer ASI toward solving post-alignment problems&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;approach-1-meta-research-on-what-approach-to-use&quot;&gt;Approach 1: Meta-research on what approach to use&lt;/h2&gt;

&lt;p&gt;That’s what this essay is. Meta-research is useful insofar as it’s unclear what approach to take, but it has rapidly diminishing utility because at some point we need to pick some strategy and pursue it (especially given short timelines).&lt;/p&gt;

&lt;p&gt;I’d like to see more meta-research on whether there are any promising approaches that this essay did not consider.&lt;/p&gt;

&lt;h2 id=&quot;approach-2-pause-ai&quot;&gt;Approach 2: Pause AI&lt;/h2&gt;

&lt;p&gt;The case for pausing to mitigate post-alignment risks is similar to the case for alignment risk: we don’t know how to make ASI safe, so we shouldn’t build it until we do. The counter-arguments are also the same: a global pause is hard to achieve; a partial pause may be worse than no pause; etc.&lt;/p&gt;

&lt;p&gt;However, in the context of post-alignment problems, the case for pausing AI is &lt;strong&gt;stronger&lt;/strong&gt; in one way, and &lt;strong&gt;weaker&lt;/strong&gt; in another way.&lt;/p&gt;

&lt;p&gt;It is &lt;strong&gt;stronger&lt;/strong&gt; in that AI companies mostly don’t care about post-alignment problems. They &lt;em&gt;do&lt;/em&gt; care about the alignment problem and are actively working to solve it. Some people are optimistic about their chances—I’m not, but insofar as you expect companies to solve alignment without a pause, a pause looks less important. But companies are ignoring post-alignment problems and almost certainly won’t solve them on the current trajectory.&lt;/p&gt;

&lt;p&gt;(I also believe that companies will almost certainly not solve the alignment problem; but that’s a harder position to argue for, whereas it’s clear that AI companies are not even working on post-alignment problems. (Except for Anthropic, which is putting in a weak effort on a subset of the problems, e.g. AI welfare.))&lt;/p&gt;

&lt;p&gt;The case for pausing is &lt;strong&gt;weaker&lt;/strong&gt; in that it might not increase our chances of solving post-alignment problems. Human beings mostly don’t care about topics like AI welfare, wild animal welfare, or AIs torturing simulations of people for weird game-theoretic reasons. An aligned ASI, even if it’s not intentionally directed at solving post-alignment problems, might do a better job than humans would.&lt;/p&gt;

&lt;h2 id=&quot;approach-3-develop-human-level-ai-first-then-maybe-pause&quot;&gt;Approach 3: Develop human-level AI first, then (maybe) pause&lt;/h2&gt;

&lt;p&gt;An alternative approach: Don’t pause yet. First develop human-level AI that can help us solve the world’s major problems. Don’t develop superintelligence until we’re on stable ground philosophically, but still take advantage of the productivity boost that AI provides.&lt;/p&gt;

&lt;p&gt;This plan doesn’t help with misalignment or misuse risks—the human-level AI must be aligned (enough), and it must refuse to perform unethical tasks and be impossible to jailbreak. But it could help with other post-alignment risks.&lt;/p&gt;

&lt;p&gt;This plan still requires pausing AI development at some point. In this scenario, it is critically important that we succeed at pausing AI before an intelligence explosion. Therefore, if this is our strategy, then the best thing to do today is to lay the necessary groundwork for a pause.&lt;/p&gt;

&lt;p&gt;In an alternative version of this plan, we don’t ever pause AI development. Instead, we squeeze the “solve-every-problem” step into the time gap between “AI dramatically boosts productivity” and “AI has total control of the future”. This only works if post-alignment problems turn out to be much easier to solve than they look.&lt;/p&gt;

&lt;p&gt;Another concern—shared with the &lt;a href=&quot;#approach-4-research-how-to-steer-asi-toward-solving-post-alignment-problems&quot;&gt;plan below&lt;/a&gt;—is that it seems infeasible to build AIs that are differentially good at philosophy. Philosophy might not be the &lt;em&gt;single&lt;/em&gt; hardest thing to get AIs to be good at, but AI will be worse at philosophy than at AI research; therefore, by default, we get an intelligence explosion before we solve the necessary philosophical problems.&lt;/p&gt;

&lt;h2 id=&quot;approach-4-research-how-to-steer-asi-toward-solving-post-alignment-problems&quot;&gt;Approach 4: Research how to steer ASI toward solving post-alignment problems&lt;/h2&gt;

&lt;p&gt;Most of the post-alignment problems listed in this essay are different flavors of “we get ethics wrong” or “we make important philosophical mistakes”. What if we can get a sufficiently smart AI to solve philosophy for us?&lt;/p&gt;

&lt;p&gt;Four concerns with this research agenda:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;“Solve philosophy” is not the same thing as “implement the correct philosophy”, and we need the AI to bridge that gap. There is a near-consensus among moral philosophers that factory farming is wrong, yet it persists. An ASI that solves ethics would need to do the ethically correct thing, rather than the thing people want it to do.&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Philosophy is exceptionally hard to train AIs on. You can’t steer training effectively because we don’t know how to judge the quality of philosophical output.&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;To my knowledge, zero people are working on this full-time. Even if there’s a way to do it, it won’t happen without a major shift in research priorities.&lt;/li&gt;
  &lt;li&gt;Even if you do come up with some useful ideas, you have to get AI companies to implement your ideas. This will be difficult if a “philosophy AI” requires a significantly different training paradigm.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;On balance, I believe pausing AI is the best answer to post-alignment problems. I have doubts about whether a pause is achievable, and whether it would even help; but my doubts about the other answers are even stronger.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Existential in the classic sense of “failing to realize sentient life’s potential”. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;h/t Justis Mills for raising this concern. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Particularly on the upper end, which is where it matters. Experts can judge that Kant is better than a philosophy undergrad, but can they judge whether Kant is better than Hume? To solve all post-alignment problems, we will need philosophical research of &lt;em&gt;better&lt;/em&gt; quality than what Kant or Hume produced. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Do Disruptive or Violent Protests Work?</title>
				<pubDate>Wed, 19 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/19/do_disruptive_protests_work/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/19/do_disruptive_protests_work/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;a href=&quot;/2025/04/18/protest_outcomes_critical_review/&quot;&gt;Previously&lt;/a&gt;, I reviewed the five strongest studies on protest outcomes and concluded that peaceful protests probably work (credence: 90%).&lt;/p&gt;

&lt;p&gt;But what about disruptive or violent protests?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Peaceful&lt;/strong&gt; protests use nonviolent, non-disruptive tactics such as picketing and marches.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Disruptive&lt;/strong&gt; protests use nonviolent, in-your-face tactics such as civil disobedience, sit-ins, and blocking roads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Violent&lt;/strong&gt; protests use violence.&lt;/p&gt;

&lt;p&gt;There isn’t much evidence on the other two categories of protest. My best guesses are:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Violent protests probably don’t work. (credence: 80%)&lt;/li&gt;
  &lt;li&gt;Violent protests may &lt;em&gt;reduce&lt;/em&gt; support for a cause, but it’s unclear. (credence: 40%)&lt;/li&gt;
  &lt;li&gt;For disruptive protests, it’s hard to say whether they have a positive or negative impact on balance. I’m about evenly split on whether a randomly-chosen disruptive protest is net helpful, neutral, or harmful.&lt;/li&gt;
  &lt;li&gt;A typical disruptive protest doesn’t work as well as a typical peaceful protest. (credence: 80%)&lt;/li&gt;
  &lt;li&gt;Peaceful protests are a better idea than disruptive protests. (credence: 90%)&lt;/li&gt;
&lt;/ul&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#violent-protests&quot; id=&quot;markdown-toc-violent-protests&quot;&gt;Violent protests&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#disruptive-protests&quot; id=&quot;markdown-toc-disruptive-protests&quot;&gt;Disruptive protests&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;violent-protests&quot;&gt;Violent protests&lt;/h2&gt;

&lt;p&gt;Three lines of evidence suggest that violent protests make things worse:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Exactly one quasi-experimental study (&lt;a href=&quot;/materials/1960s_Black_Protests.pdf&quot;&gt;Wasow 2020&lt;/a&gt;&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;).&lt;/li&gt;
  &lt;li&gt;A meta-analysis of lab experiments (&lt;a href=&quot;/materials/Protest-Meta-Analysis.pdf&quot;&gt;Orazani et al. 2021&lt;/a&gt;&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;).&lt;/li&gt;
  &lt;li&gt;Various observational studies.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I discussed &lt;a href=&quot;/materials/1960s_Black_Protests.pdf&quot;&gt;Wasow (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:2:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; when I &lt;a href=&quot;/2025/04/18/protest_outcomes_critical_review/&quot;&gt;reviewed&lt;/a&gt; natural experiments on protest outcomes. Wasow (2020) uses rainfall as a way to randomize treatment. &lt;a href=&quot;/2025/04/18/protest_outcomes_critical_review/#studies-on-real-world-protest-outcomes&quot;&gt;Quoting myself&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The idea is that protests often get canceled when it rains. If you look at voting patterns in places where it rained on protest day compared to where it didn’t rain, you should be able to isolate the causal effect of protests. The rain effectively randomizes where protests occur.&lt;/p&gt;

  &lt;p&gt;Rather than using rainfall directly, the rainfall method uses rainfall shocks—that is, unexpectedly high or low rainfall relative to what was expected for that location and date. This avoids any confounding effect of average rainfall levels.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The quasi-experimental evidence from Wasow (2020) suggests that violent Civil Rights protests backfired: public support went down in places where protests occurred.&lt;/p&gt;
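
&lt;p&gt;To illustrate the logic of the design, here is a minimal two-stage least squares sketch on made-up data. Everything below (the variable names, the numbers, the 0.1 effect size) is invented for illustration; this is not Wasow’s code or data:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;
import numpy as np

# Toy data: one row per location. (Hypothetical, for illustration only.)
rng = np.random.default_rng(0)
n = 500
rain_shock = rng.normal(size=n)  # rainfall relative to the seasonal norm
# Rain cancels protests, so protest occurrence depends negatively on the shock:
protest = (rng.normal(size=n) - rain_shock).clip(0, 1).round()
support = 0.1 * protest + rng.normal(scale=0.5, size=n)  # outcome of interest

# Stage 1: predict protest occurrence from the rainfall shock (the instrument).
X1 = np.column_stack([np.ones(n), rain_shock])
beta1 = np.linalg.lstsq(X1, protest, rcond=None)[0]
protest_hat = X1 @ beta1

# Stage 2: regress support on the instrument-driven variation in protest.
X2 = np.column_stack([np.ones(n), protest_hat])
beta2 = np.linalg.lstsq(X2, support, rcond=None)[0]
print('estimated effect of protest on support:', beta2[1])
# (Real 2SLS also requires corrected standard errors; this sketch skips them.)
&lt;/code&gt;&lt;/pre&gt;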

&lt;p&gt;&lt;a href=&quot;/materials/Protest-Meta-Analysis.pdf&quot;&gt;Orazani et al. (2021)&lt;/a&gt;&lt;sup id=&quot;fnref:1:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; is a meta-analysis of lab experiments. The experiments showed people news articles about (real or hypothetical) violent or nonviolent protests and measured their favorability toward the protesters’ cause. The meta-analysis found that:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Nonviolent advocacy had a positive effect (&lt;a href=&quot;https://en.wikipedia.org/wiki/Effect_size#Cohen&apos;s_d&quot;&gt;d&lt;/a&gt; = 0.25, p &amp;lt; .00001)&lt;/li&gt;
  &lt;li&gt;Violence had a non-significant negative effect (d = –0.04, 95% CI [–0.19, 0.12], p = .65)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This evidence suggests three things:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Nonviolent protests work.&lt;/li&gt;
  &lt;li&gt;Violent protests don’t work.&lt;/li&gt;
  &lt;li&gt;Violent protests don’t &lt;em&gt;strongly&lt;/em&gt; backfire—violent protests had a negative effect, but it was small and not statistically significant.&lt;/li&gt;
&lt;/ul&gt;
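
&lt;p&gt;For readers unfamiliar with the effect-size metric: Cohen’s d is the difference in group means divided by a pooled standard deviation. Here’s a minimal sketch with made-up numbers (not data from any of the studies above):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;
import numpy as np

def cohens_d(group_a, group_b):
    # Standardized mean difference using the pooled standard deviation.
    na, nb = len(group_a), len(group_b)
    va = np.var(group_a, ddof=1)
    vb = np.var(group_b, ddof=1)
    pooled_sd = np.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (np.mean(group_a) - np.mean(group_b)) / pooled_sd

# Hypothetical support ratings (1 to 7 scale) after reading about a
# protest vs. after reading a control article. Numbers are invented.
treated = np.array([4.1, 3.8, 4.5, 4.0, 3.9, 4.3])
control = np.array([3.7, 3.5, 4.0, 3.6, 3.8, 3.9])
print(cohens_d(treated, control))  # positive d means higher support
&lt;/code&gt;&lt;/pre&gt;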

&lt;p&gt;There are many observational studies on violent protests, with mixed results. A literature review by &lt;a href=&quot;https://www.hbs.edu/ris/Publication%20Files/When%20are%20social%20protests%20effective_67978754-eaf9-4414-aae1-16db9ef13812.pdf&quot;&gt;Shuman et al. (2024)&lt;/a&gt;&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; wrote:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;[T]here is some, although more mixed, evidence that even entirely violent protests can sometimes be effective for policy-related outcomes. For example, research on the violent 1992 Los Angeles Riots increased support for local policy reforms when policy referenda came up for a vote soon after, particularly among people who were more proximally exposed to the disruptive violence (although target audience in terms of resistance was not examined). Another study that did examine moderation by target audience found that physical proximity to Palestinian violence increased support among Israelis for making policy concessions, and that this effect was stronger for traditional right-wing, hawkish, groups. However, there is also conflicting evidence. For example, similar research found that exposure to political violence led to harsher policy attitudes among Israelis (although moderation by target audience was not assessed). Similarly, research focused on voting rather than policy found that the outbreaks of violence during the Civil Rights Movement increased support for social control framing of the issue and Republican vote share.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The language from Shuman et al. attributes causality to the cited studies, which I don’t believe is appropriate. I’m quoting this passage for the purpose of illustrating that observational studies on violent protests have found varying results.&lt;/p&gt;

&lt;h2 id=&quot;disruptive-protests&quot;&gt;Disruptive protests&lt;/h2&gt;

&lt;p&gt;I found two experimental studies on disruptive protests. They showed participants news articles about protests and asked them how strongly they supported the protesters’ cause.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2911177&quot;&gt;Feinberg et al. (2017)&lt;/a&gt;&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; ran three experiments on three different causes (animal rights; BLM; anti-Trump). They found that people were more likely to express support for a cause after reading about a peaceful protest than a disruptive protest. Two of the three experiments did not include control groups, so they don’t tell us whether the absolute effect of disruptive protests was positive or negative. The third study (with a control group) found that disruptive protests had a backfire effect.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://doi.org/10.1177/2378023120925949&quot;&gt;Bugden (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; found that reading an article about a peaceful climate protest increased support. Disruptive protests worked worse than peaceful protests, but still better than the control. This study also found a (non-significant) increase in support due to violent protests, which disagrees with some prior results (the &lt;a href=&quot;/materials/Protest-Meta-Analysis.pdf&quot;&gt;Orazani et al. (2021)&lt;/a&gt;&lt;sup id=&quot;fnref:1:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; meta-analysis, which did not include Bugden (2020), found a non-significant negative effect).&lt;/p&gt;

&lt;p&gt;An observational study by &lt;a href=&quot;https://doi.org/10.1038/s41893-024-01444-1&quot;&gt;Ostarek et al. (2024)&lt;/a&gt;&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; conducted surveys directly before and after a disruptive protest, which is better than the typical observational study. They found that support was higher after the protest than before.&lt;/p&gt;

&lt;p&gt;This evidence is mixed on whether disruptive protests work, and the quality of evidence is much weaker than for peaceful protests.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;It looks like violent protests don’t work, and there’s a good chance that they backfire. This is good news—it means we don’t live in the &lt;a href=&quot;https://www.lesswrong.com/posts/neQ7eXuaXpiYw7SBy/the-least-convenient-possible-world&quot;&gt;Least Convenient Possible World&lt;/a&gt; where you have to commit violence to achieve your goals.&lt;/p&gt;

&lt;p&gt;Disruptive protests &lt;em&gt;might&lt;/em&gt; work, but the evidence is mixed and weak. The evidence supporting peaceful protests is much stronger, which makes them the better tactic.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Wasow, O. (2020). &lt;a href=&quot;https://doi.org/10.1017/S000305542000009X&quot;&gt;Agenda Seeding: How 1960s Black Protests Moved Elites, Public Opinion and Voting.&lt;/a&gt; &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:2:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Orazani, N., Tabri, N., Wohl, M. J. A., &amp;amp; Leidner, B. (2021). &lt;a href=&quot;https://doi.org/10.1002/ejsp.2722&quot;&gt;Social movement strategy (nonviolent vs. violent) and the garnering of third-party support: A meta-analysis.&lt;/a&gt; &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:1:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:1:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Shuman, E., Goldenberg, A., Saguy, T., Halperin, E., &amp;amp; van Zomeren, M. (2024). &lt;a href=&quot;https://doi.org/10.1016/j.tics.2023.10.003&quot;&gt;When Are Social Protests Effective?&lt;/a&gt; &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Feinberg, M., Willer, R., &amp;amp; Kovacheff, C. (2017). &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.2911177&quot;&gt;Extreme Protest Tactics Reduce Popular Support for Social Movements.&lt;/a&gt; &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Bugden, D. (2020). &lt;a href=&quot;https://doi.org/10.1177/2378023120925949&quot;&gt;Does Climate Protest Work? Partisanship, Protest, and Sentiment Pools.&lt;/a&gt; &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Ostarek, M., Simpson, B., Rogers, C., &amp;amp; Ozden, J. (2024). &lt;a href=&quot;https://doi.org/10.1038/s41893-024-01444-1&quot;&gt;Radical climate protests linked to increases in public support for moderate organizations.&lt;/a&gt; &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Why would God have a gender?</title>
				<pubDate>Tue, 18 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/18/god_gender/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/18/god_gender/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Classically, according to the Abrahamic religions, God is a man.&lt;/p&gt;

&lt;p&gt;According to some more recent depictions, God is a woman. Which is a nice subversion.&lt;/p&gt;

&lt;p&gt;But like, y’all are both a bit crazy. If there is an omnipotent Creator of the universe, then it definitely doesn’t have a gender.&lt;/p&gt;

&lt;p&gt;When people call God “he” or “she”, this is what they’re saying happened:&lt;/p&gt;

&lt;!-- more --&gt;

&lt;ol&gt;
  &lt;li&gt;Life evolved over billions of years through a process of mutation, reproduction, and natural selection.&lt;/li&gt;
  &lt;li&gt;Originally, all organisms reproduced by copying themselves. But some organisms evolved the ability to reproduce by combining the genes of two different individuals. This let genes mix more and allowed good genes to spread more readily. In some environments, organisms that could reproduce sexually outcompeted those that couldn’t.&lt;/li&gt;
  &lt;li&gt;Organisms evolved two distinct sexes because it makes evolutionary sense to have two different types of reproductive material (eggs and sperm).&lt;/li&gt;
  &lt;li&gt;Different animals evolved different characteristics in males vs. females. Sometimes one is larger, sometimes one does more of the work getting food, one does more of the child rearing, etc. These characteristics differ a lot depending on the species.&lt;/li&gt;
  &lt;li&gt;In one particular species, namely humans, females physically bear children and do most of the child rearing, while males are physically larger and do most of the hunting. Many other animals (especially mammals) use this same division of labor, but many times certain characteristics are reversed. For example, in many species, the female is bigger and stronger than the male; in some (rare) cases, the male does most of the child rearing.&lt;/li&gt;
  &lt;li&gt;These sexual differences also led to personality differences, which arose due to contingent evolutionary pressures and quirks of the environment.&lt;/li&gt;
  &lt;li&gt;God, the omnipotent being who created the universe, has personality characteristics that are consistent with the personality of one side of a contingent dichotomous evolutionary strategy in one particular species.&lt;/li&gt;
  &lt;li&gt;You might think that one particular species would be sharks, because sharks have been swimming the seas for 439 million years. But no, God has the personality traits that are associated with one sex of a species of hairless mammal that only evolved about 100,000 years ago.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;(Alternatively, they’re saying that evolution is a lie and the earth is 6,000 years old or whatever, which somehow makes &lt;em&gt;more&lt;/em&gt; sense.)&lt;/p&gt;

&lt;p&gt;God does not reproduce sexually. God is the eternal Creator of the universe, not the tip of one branch at the end of billions of years of natural selection.&lt;/p&gt;

&lt;p&gt;In fact, why would God have a personality at all? A personality is a thing that emerges in social beings and describes their social interactions. God doesn’t have a social life, it’s not like It spends Its day hanging out with the other creators of the universe.&lt;/p&gt;

&lt;p&gt;Religions’ lack of imagination kind of bugs me. God is supposed to be this omnipotent, omniscient, incomprehensible being. But if you read religious texts, God just acts like some guy.&lt;/p&gt;

&lt;p&gt;(H. P. Lovecraft did a much better job of writing Gods that act like Gods. Or at least I assume he did—I haven’t actually read any of his books.)&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>Not-Discovered-Here Syndrome</title>
				<pubDate>Mon, 17 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/17/not_discovered_here_syndrome/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/17/not_discovered_here_syndrome/</guid>
                <description>
                  
                  
                  
                  &lt;blockquote&gt;
  &lt;p&gt;An investor is considering putting her money into a mutual fund. “I will just invest some money for the next six months,” she says, “and see how it goes.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;A philanthropist is considering donating to a charity. “I will donate some money and see how it goes.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;Harvard University is considering whether SAT scores are all that important for admissions. “Let’s make SAT scores optional and see what happens.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;A child climbs to the top of a slide and is about to jump off the edge. “Don’t jump off of that,” his mom says, “you’ll get hurt.” He jumps off the slide. He gets hurt.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Not_invented_here&quot;&gt;Not-invented-here syndrome&lt;/a&gt; is when an organization unnecessarily re-invents products or tools that already exist elsewhere. The cousin of this phenomenon is not-discovered-here syndrome, in which people refuse to consider evidence unless they’ve collected it themselves.&lt;/p&gt;

&lt;p&gt;“A wise man learns from his mistakes, but a wiser man learns from the mistakes of others.” Not-discovered-here syndrome is what happens when you insist on making mistakes for yourself.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;Institutional investors like to “try out” new investments for six months or a year. That doesn’t make any sense. Whatever you learn in the six months of holding the fund, you could’ve learned by looking at a price chart of the prior six months. (In fact you probably could’ve learned a lot more, because most funds have more than six months of history.) Or you could keep an eye on the fund for the next six months without investing. Putting money into the fund doesn’t teach you anything.&lt;/p&gt;

&lt;p&gt;Harvard made the SAT optional in 2020. Prior to 2020, there already existed a mountain of data showing that a student’s SAT score is a good predictor of college success. It was predictable in advance that if colleges stopped requiring the SAT, they would do a worse job of identifying good candidates. But they ignored the data and learned that lesson the hard way instead.&lt;/p&gt;

&lt;p&gt;I had a similar criticism of the book &lt;em&gt;Outlive&lt;/em&gt;; I decided not to include it in my &lt;a href=&quot;https://mdickens.me/2024/09/26/outlive_a_critical_review/&quot;&gt;book review&lt;/a&gt;, but I’ll mention it here since it’s on-topic. In the book, Peter Attia is a big proponent of continuous glucose monitors (CGMs), which show how your blood sugar goes up after you eat.&lt;/p&gt;

&lt;p&gt;This sort of confuses me. You can easily find websites that tell you the &lt;a href=&quot;https://glycemic-index.net/glycemic-index-chart/&quot;&gt;glycemic loads of different foods&lt;/a&gt;. I can tell you what will happen if I eat white flour: my blood sugar will go up a lot. I know because it has a high glycemic load. I can also tell you that if I eat walnuts, my blood sugar will only go up a little bit. A CGM doesn’t tell me anything I don’t already know.&lt;/p&gt;

&lt;p&gt;Wearing a CGM is a psychological tool that works for some people, but that’s kind of my point: those people have not-discovered-here syndrome. It’s not enough for me to know which foods have high glycemic load; I have to wear a monitor showing that, yes, this food &lt;em&gt;does&lt;/em&gt; raise my blood sugar.&lt;/p&gt;

&lt;p&gt;Sometimes people really do make better decisions when they collect the evidence themselves. But there’s no inherent reason why it has to be that way. It feels like there ought to be some way to get people to pay attention to evidence that they didn’t personally discover, but I don’t know how.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>Knowing whether AI alignment is a one-shot problem is a one-shot problem</title>
				<pubDate>Sun, 16 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/16/ai_meta_one_shot/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/16/ai_meta_one_shot/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;One day, I was at my grandma’s house reading the Sunday funny pages, when I suddenly felt myself getting sucked into a Garfield comic.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;I looked down at my body and saw that I had become fully cartoonified. My hands had four fingers and I had a distinct feeling that I’d be wearing the same outfit for the rest of my life.&lt;/p&gt;

&lt;p&gt;I also started feeling really hungry. Luckily, I was in the kitchen, where Jon, Garfield’s owner, had made some lasagna.&lt;/p&gt;

&lt;p&gt;“You look really hungry,” he said to me. “Why don’t you take this lasagna?”&lt;/p&gt;

&lt;p&gt;I gratefully accepted Jon’s lasagna. As he handed it to me, he issued a grave warning: “Don’t let Garfield eat this.”&lt;/p&gt;

&lt;p&gt;I looked at where Garfield was sitting on the floor, harmlessly hating Mondays. He was chubby and slow and there was no way he’d be able to jump up and yank the lasagna tray out of my hands.&lt;/p&gt;

&lt;p&gt;Jon said, “Take that to the dining table down at the end of the comic strip.”&lt;/p&gt;

&lt;p&gt;Thanking him again, I walked over to the next panel, where I ran into a new Jon and Garfield (because that’s how comic strips work). But this Garfield was a bit bigger and a bit more lithe-looking.&lt;/p&gt;

&lt;p&gt;I waved at the characters and walked to the third panel. The third Garfield again looked bigger and a bit more aggressive.&lt;/p&gt;

&lt;p&gt;“I’m a bit worried about this lasagna,” I said to Jon. “Garfield seems to be getting bigger and stronger and I’m afraid once I get to the last panel, he’s gonna eat my lasagna.”&lt;/p&gt;

&lt;p&gt;“Oh yes, the guy who draws us is trying to make a super-Garfield,” Jon explained. “He’s small now, but someday he will be bigger and stronger than either of us.”&lt;/p&gt;

&lt;p&gt;“But won’t that mean he will take my lasagna?”&lt;/p&gt;

&lt;p&gt;“Don’t worry about that,” Jon reassured me. “I have a plan to tame Garfield. By the end, I will know how to make him leave the lasagna alone.”&lt;/p&gt;

&lt;p&gt;“Okay,” I said hesitantly, and kept walking to the fourth panel. (This was a full-page Sunday comic.) Garfield was now big enough to come up to my knees. He scratched at my leg with his front paws, yearning for a meal.&lt;/p&gt;

&lt;p&gt;“I’m really not sure about this,” I said to Jon. “Right now, Garfield can’t take my lasagna no matter what he does; he can’t jump that high and he’s definitely not strong enough to knock me over. At some point, though, I will encounter a Garfield who’s bigger than me for the first time, and then he’ll be able to knock me over and eat the lasagna and maybe even &lt;a href=&quot;https://dubblebaby.blogspot.com/2013/10/blog-post_21.html&quot;&gt;eat the entire house&lt;/a&gt;. How do I know that he’s tamed until I get to that point?”&lt;/p&gt;

&lt;p&gt;“I’m incrementally improving his behavior in each panel,” Jon explained. “We will have many chances to observe Garfield’s behavior before he becomes super-Garfield. I can verify that I’m making progress on his tameness.”&lt;/p&gt;

&lt;p&gt;I thought back to what I had read about cat training on LessWrong. “There’s this guy named Eliezer Yudkowsky who says you only get one critical try to tame a super-cat. If you fail, the cat will steal your lasagna, and then you won’t have any lasagna anymore.”&lt;/p&gt;

&lt;p&gt;“That argument doesn’t apply to my methods. I can accumulate evidence about whether the cat training is working. Each Garfield will be less lasagna-obsessed than the last, and we can observe the trajectory of the taming. You also have to consider…” and I don’t quite recall what Jon said after that but it was very complicated and it sounded like he knew a lot more about cat-taming than me.&lt;/p&gt;

&lt;p&gt;“Perhaps you’re right,” I said. “You have some good arguments, but so does Eliezer. How can I know who’s really right until I reach the last panel? I think I’d better not go there until I know for sure that my lasagna will be safe.”&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/garfield-shoggoth.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Some people say AI alignment is a one-shot problem. You only get one chance to align AI, and if you fail, everyone dies.&lt;/p&gt;

&lt;p&gt;Other people disagree. They say it’s possible to make gradually smarter AIs and ratchet your way up to a fully-aligned superintelligent AI, and if the process isn’t working, you can tell in advance.&lt;/p&gt;

&lt;p&gt;It doesn’t matter who’s right. The key thing is that we don’t get to find out who’s right until it’s too late. Either the gradual ramp-up succeeds at making an aligned superintelligence (as the gradualists predict), or it fails and we die (as the one-shotters predict).&lt;/p&gt;

&lt;p&gt;This is the meta-one-shot problem: we only get one shot at knowing whether it’s a one-shot problem.&lt;/p&gt;

&lt;p&gt;There will be some point in time where we build the first AI that’s powerful enough to kill everyone. When that happens, either the one-shotters are right and we only get one shot to align it, or the gradualists are right and we get to iterate. Either way, we only get one shot at finding out who’s right.&lt;/p&gt;

&lt;p&gt;The alignment one-shot problem may or may not be real. But the &lt;em&gt;meta&lt;/em&gt;-one-shot problem is definitely real: we don’t get the evidence we need until it’s too late to do anything about it.&lt;/p&gt;

&lt;p&gt;AI companies’ alignment plans only work if the gradualist hypothesis is true. The main reason they operate this way is that it’s much harder to make plans that work in a one-shot world. Unfortunately, there is no law of the universe that says you get to do the easy thing if the hard thing is too hard. If a plan requires gradualism, then we have no way of being confident that the plan will work.&lt;/p&gt;

&lt;p&gt;Having a plan for a gradualist scenario is fine. But AI developers also need plans for what to do if AI alignment is a one-shot problem, because they have no way of knowing which hypothesis is correct. And they shouldn’t build powerful AI until they have both kinds of plans.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>What If Ghosts Were Real?</title>
				<pubDate>Sat, 15 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/15/what_if_ghosts_were_real/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/15/what_if_ghosts_were_real/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;If we are correct about the laws of physics, then ghosts can’t exist. But some people are insistent that they’ve directly interacted with ghosts. Is there a way ghosts could exist if we modified the laws of physics a bit?&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;Okay, what are the properties that ghosts need to have?&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Ghosts are coherent entities with bodies. (Maybe they have heads and arms and legs, or maybe they’re a vague cloud shape, but they definitely have something that you could call a body.)&lt;/li&gt;
  &lt;li&gt;Ghosts are invisible.&lt;/li&gt;
  &lt;li&gt;Ghosts can pass through solid objects, but they can also knock over lamps and stuff (as long as doing so would be sufficiently spooky).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are already invisible things that can pass through solid objects. For example, &lt;a href=&quot;https://en.wikipedia.org/wiki/Neutrino&quot;&gt;neutrinos&lt;/a&gt;. But you can’t have a body made of neutrinos because the particles in your body will immediately scatter all over the place and then you won’t have a body anymore. So ghosts must be made of something else.&lt;/p&gt;

&lt;p&gt;If ghosts can pass through solid objects, that means they don’t interact via the electromagnetic force. But something weird has to be going on with gravity. If your feet didn’t interact with the ground, you’d fall straight through to the center of the earth. But ghosts don’t do that.&lt;/p&gt;

&lt;p&gt;Maybe ghosts aren’t affected by gravity. But the earth isn’t still: it’s revolving around the sun at 67,000 miles per hour (about 108,000 km/h), and it’s constantly turning in its orbit. If ghosts aren’t gravitationally bound by the sun, why don’t they go flying off into space?&lt;/p&gt;

&lt;p&gt;I can see two possible explanations:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Ghosts don’t interact gravitationally, but they can move super fast to keep themselves in the same spot relative to earth.&lt;/li&gt;
  &lt;li&gt;Ghosts do interact gravitationally, but they can make the bottoms of their feet interact electromagnetically with the ground to prevent themselves from falling through.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The first one doesn’t seem likely to me. The ghost of a 15th-century baron who’s haunting a mansion doesn’t know anything about how the earth’s orbit works, so why wouldn’t he get lost as soon as he turns into ghost-form?&lt;/p&gt;

&lt;p&gt;The second explanation brings to mind a testable hypothesis. We put sensors on the floor of the haunted mansion, and that way we can tell when a ghost walks over them.&lt;/p&gt;

&lt;p&gt;What are ghosts made of? Ordinary matter is made of protons, neutrons, and electrons. Ghosts can’t be made of those, or else we’d be able to detect them. They can’t be made of &lt;a href=&quot;https://en.wikipedia.org/wiki/Dark_matter&quot;&gt;dark matter&lt;/a&gt;, or else they’d fall through the ground (and they wouldn’t have bodies). Ghosts must be made of some &lt;em&gt;new&lt;/em&gt; type of matter that doesn’t interact with the electromagnetic force, but &lt;em&gt;does&lt;/em&gt; hold together somehow. That means there must be an as-yet-undiscovered &lt;em&gt;fifth fundamental force&lt;/em&gt; that holds ghost bodies together.&lt;/p&gt;

&lt;p&gt;Coming back to our ghost properties: ghosts can pass through solid objects, but they can also interact with objects when they want to be spooky. So it’s not as simple as “ghosts don’t interact electromagnetically”. (If that were true, ghosts would have no observable impact on the world at all, and we would have no reason to believe they exist.) Ghosts &lt;em&gt;can&lt;/em&gt; interact electromagnetically, but &lt;em&gt;only when they choose to.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That’s quite a puzzle. Ghosts can use their consciousness to turn on and off the electromagnetic interaction at will. I can’t think of how that’s possible, so let’s just move on.&lt;/p&gt;

&lt;p&gt;There are two important, but subtle, properties of ghosts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Ghosts can see.&lt;/li&gt;
  &lt;li&gt;Ghosts can move.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Obviously ghosts can see and move, right? But these properties have some surprising implications.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;If ghosts can see, that means ghosts’ eyes can detect photons.&lt;/li&gt;
  &lt;li&gt;If ghosts can move, that means they’re getting energy from somewhere.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The first implication means that photons affect ghosts, but ghosts don’t affect photons. Either that, or ghosts &lt;em&gt;do&lt;/em&gt; affect photons, which means we should be able to detect ghosts’ presence by shooting photons at their bodies—which is a sciencey way of saying “we can see them”.&lt;/p&gt;

&lt;p&gt;If ghosts can see photons but photons can’t see ghosts, then that means photons can create some sort of motion inside ghosts, but the energy for that motion doesn’t come from the photons. In other words, energy is coming from nothing. The fact that ghosts can move—and knock over objects—without any external energy source also indicates that ghosts can create energy from nothing. The Law of Conservation of Energy does not apply to ghosts.&lt;/p&gt;

&lt;p&gt;That gives us a testable hypothesis. If ghosts can create energy out of nothing, then we should be able to detect that energy. It’s a bit tricky to test, but what we can do is put a haunted house into some sort of sealed chamber where we precisely measure all the energy that goes in and comes out. If ghosts can create energy, then we should see more energy coming out than going in.&lt;/p&gt;
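
&lt;p&gt;You can even write down the decision rule for that experiment as code. A toy sketch in Python, where every number is invented:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Toy version of the haunted-house energy audit. All numbers invented.
energy_in = 1_000_000.0   # joules fed into the sealed chamber
energy_out = 1_000_042.0  # joules measured coming back out
sigma = 10.0              # measurement uncertainty, in joules

# Ordinary physics says the excess should be zero up to measurement
# error. A many-sigma positive excess is evidence for ghosts (or for
# a broken calorimeter, which is where my money would be).
excess = energy_out - energy_in
print(f&quot;excess energy: {excess:.1f} J ({excess / sigma:.1f} sigma)&quot;)
&lt;/code&gt;&lt;/pre&gt;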

&lt;p&gt;I’ve come up with two testable hypotheses so far. Can you think of any others?&lt;/p&gt;

&lt;h2 id=&quot;what-does-it-mean-to-be-open-minded&quot;&gt;What does it mean to be open-minded?&lt;/h2&gt;

&lt;p&gt;On a few occasions, I have been accused of being “closed-minded” for denying that ghosts could be real. But what is open-mindedness? The sort of open-mindedness I care about entails pursuing the implications of a belief.&lt;/p&gt;

&lt;p&gt;Suppose I were to entertain the possibility that ghosts were real. What would that imply about the rest of my beliefs? What would I need to be wrong about? Those are the questions I wanted to answer with this essay.&lt;/p&gt;

&lt;p&gt;The fervent ghost-believers I met never seemed to have much curiosity about what the existence of ghosts might imply. How would ghosts be intangible, but without sinking to the center of the earth? If they’re intangible, how can they knock things over? If people sometimes observe ghosts in their haunted bedrooms, then it should be possible to observe ghosts in a lab experiment, right? How would you set up that experiment?&lt;/p&gt;

&lt;p&gt;Open-mindedness is about being receptive to different ideas. An important component of receptiveness is curiosity: if this were true, what else might be true as a consequence? In my experience, people who accuse me of being closed-minded aren’t curious about what their beliefs imply. If you do a curious investigation of ghosts, like I tried to do above, you can end up in some interesting places.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>In Defense of the NCIS Two-People-One-Keyboard Scene</title>
				<pubDate>Fri, 14 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/14/NCIS/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/14/NCIS/</guid>
                <description>
                  
                  
                  
                  &lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/u8qgehH3kEQ?si=xj5aQo64Tm_LD0B_&quot; title=&quot;YouTube video player&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share&quot; referrerpolicy=&quot;strict-origin-when-cross-origin&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;p&gt;(&lt;a href=&quot;https://www.youtube.com/watch?v=kl6rsi7BEtk&quot;&gt;Here is the same clip in HD&lt;/a&gt;, but that 2010 YouTube vibe is part of the fun)&lt;/p&gt;

&lt;p&gt;This clip is in the running for most-mocked scene of all time, but I think it’s good, actually.&lt;/p&gt;

&lt;p&gt;First, let’s get some things out of the way:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The writers of NCIS know how keyboards work. (They probably used keyboards to write this scene, even.)&lt;/li&gt;
  &lt;li&gt;The director of this episode knows how keyboards work.&lt;/li&gt;
  &lt;li&gt;I’m going to go out on a limb and say &amp;gt;90% of this show’s audience knows how keyboards work.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This scene was not written this way because the writers think their audience is dumb and doesn’t know how a keyboard works. It was written this way because of the &lt;a href=&quot;https://tvtropes.org/pmwiki/pmwiki.php/Main/RuleOfCool&quot;&gt;Rule of Cool&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The Rule of Cool states: &lt;strong&gt;an audience’s willingness to suspend disbelief is proportional to how cool a scene is&lt;/strong&gt;.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;(Is this scene actually cool? Well, no, not really. But the relevant question is, does the target audience &lt;em&gt;think&lt;/em&gt; it’s cool?)&lt;/p&gt;

&lt;p&gt;(Full disclosure: I only said it’s uncool because I don’t want to sound uncool, but honestly I do think it’s kind of cool. Come on, it’s at least a &lt;em&gt;little bit&lt;/em&gt; cool, right?)&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/River-Tam.png&quot; style=&quot;width:300px&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;source: &lt;a href=&quot;https://xkcd.com/311/&quot;&gt;xkcd&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In Firefly when River Tam—a tiny woman who doesn’t even exercise—beats up a bunch of guys in a bar, does that make sense? Is that how physics works? No. But it’s cool, so people are okay with it.&lt;/p&gt;

&lt;p&gt;Why do the space fighters in Star Wars pretend to be airplanes and use tactics that are nonsensical in space? Because it looks cool, that’s why. And why do lightsaber duelists, and most sword fighters in most movies for that matter, try to hit their opponents’ swords instead of going for a killing blow? Because it looks cooler than a real duel.&lt;/p&gt;

&lt;p&gt;I think you can reasonably object to the NCIS scene by saying that two people typing on one keyboard to stop a hacker is not cool. Which is understandable. But if you’re going to object to this scene by saying that’s not how keyboards work, then you’re also not allowed to like Firefly or Star Wars or any movie involving swords or time travel or space ships or any sci-fi or fantasy or almost any movie involving guns or explosions or physics or even dialogue for that matter (ever notice how film characters never stutter or slur their words unless doing so is specifically relevant to the plot? so unrealistic!).&lt;/p&gt;

&lt;p&gt;That all being said, I have a big problem with CSI’s &lt;a href=&quot;https://www.youtube.com/watch?v=hkDD03yeLnU&quot;&gt;“I’ll create a GUI interface using Visual Basic, see if I can track an IP address.”&lt;/a&gt; My problem isn’t that it doesn’t make sense. My problem is that it doesn’t sound cool, and it would’ve been so easy to write a cooler line. The word “Basic” doesn’t make you sound like a top hacker. And “GUI” is one of the least-cool-sounding words possible.&lt;/p&gt;

&lt;p&gt;I propose a small modification:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;I’ll create a kernel interface using C++, see if I can track an IP address.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The word “kernel” sounds cool. C++ sounds a lot cooler than Visual Basic. (If they want to go for the extra nonsense factor, they could say “C+” instead, which is something I’ve heard real people say in real life.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;) And my version of the line even kinda makes sense—realistically I don’t think you’d interface with your kernel to track an IP address, but those words mean something and you could do it if you really wanted to.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;For the non-programmers reading this: There is no such thing as C+. There is only C and C++. (And also B and C# and D and F#, but no A.) &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Epistemic Spot Check: Expected Value of Donating to Alex Bores's Congressional Campaign</title>
				<pubDate>Thu, 13 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/13/spot_check_alex_bores/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/13/spot_check_alex_bores/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Political advocacy is an important lever for reducing existential risk. One way to make political change happen is to support candidates for Congress.&lt;/p&gt;

&lt;p&gt;In October, Eric Neyman wrote &lt;a href=&quot;https://ericneyman.wordpress.com/2025/10/20/consider-donating-to-alex-bores-author-of-the-raise-act/&quot;&gt;Consider donating to Alex Bores, author of the RAISE Act&lt;/a&gt;. He created a cost-effectiveness analysis to estimate how donations to Bores’s campaign change his probability of winning the election. It’s excellent that he did that—it’s exactly the sort of thing that we need people to be doing.&lt;/p&gt;

&lt;p&gt;We also need more people to check other people’s cost-effectiveness estimates. To that end, in this post I will check Eric’s work.&lt;/p&gt;

&lt;p&gt;I’m not going to talk about who Alex Bores is, why you might want to donate to his campaign, or who might &lt;em&gt;not&lt;/em&gt; want to donate. For that, see &lt;a href=&quot;https://ericneyman.wordpress.com/2025/10/20/consider-donating-to-alex-bores-author-of-the-raise-act/&quot;&gt;Eric’s post&lt;/a&gt;.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#model-outline&quot; id=&quot;markdown-toc-model-outline&quot;&gt;Model outline&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#input-parameters&quot; id=&quot;markdown-toc-input-parameters&quot;&gt;Input parameters&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#campaign-spending-per-vote&quot; id=&quot;markdown-toc-campaign-spending-per-vote&quot;&gt;Campaign spending per vote&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#voter-turnout&quot; id=&quot;markdown-toc-voter-turnout&quot;&gt;Voter turnout&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#margin-of-victory&quot; id=&quot;markdown-toc-margin-of-victory&quot;&gt;Margin of victory&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#probability-that-your-candidate-is-in-the-top-two&quot; id=&quot;markdown-toc-probability-that-your-candidate-is-in-the-top-two&quot;&gt;Probability that your candidate is in the top two&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#probability-that-your-candidate-is-on-the-losing-side&quot; id=&quot;markdown-toc-probability-that-your-candidate-is-on-the-losing-side&quot;&gt;Probability that your candidate is on the losing side&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#opposition-fundraising-discount&quot; id=&quot;markdown-toc-opposition-fundraising-discount&quot;&gt;Opposition fundraising discount&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#early-fundraising-multiplier&quot; id=&quot;markdown-toc-early-fundraising-multiplier&quot;&gt;Early fundraising multiplier&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#sensitivity-analysis&quot; id=&quot;markdown-toc-sensitivity-analysis&quot;&gt;Sensitivity analysis&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#cost-to-shift-votes-by-one-percentage-point&quot; id=&quot;markdown-toc-cost-to-shift-votes-by-one-percentage-point&quot;&gt;Cost to shift votes by one percentage point&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-models-output-isnt-what-we-care-about&quot; id=&quot;markdown-toc-the-models-output-isnt-what-we-care-about&quot;&gt;The model’s output isn’t what we care about&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;model-outline&quot;&gt;Model outline&lt;/h2&gt;

&lt;p&gt;The basic structure of Eric’s model:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Donations let the campaign spend more money on advertising, which increases how many votes they will get.&lt;/li&gt;
  &lt;li&gt;The election has some probability of being close.&lt;/li&gt;
  &lt;li&gt;If the election is close, then the expected value of votes is approximately linear.&lt;/li&gt;
  &lt;li&gt;If the election is not close, then marginal votes don’t matter at all.&lt;/li&gt;
  &lt;li&gt;Therefore, the expected value of donations is the product of three numbers:
    &lt;ul&gt;
      &lt;li&gt;probability that the election is close&lt;/li&gt;
      &lt;li&gt;number of votes to swing the election if it’s close&lt;/li&gt;
      &lt;li&gt;cost to change one vote&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The model specifically looks at the primary for New York’s 12th Congressional district. It doesn’t look at the general election because the district is deep blue and whoever wins the Democratic primary will almost certainly win the election.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://squigglehub.org/models/mdickens/congressional-campaign-donations&quot;&gt;I reproduced Eric’s model using Squiggle&lt;/a&gt;.&lt;/p&gt;
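
&lt;p&gt;For readers who don’t use Squiggle, here is a minimal Python sketch of the same structure. The Squiggle model is the real artifact; every number below is a placeholder standing in for the parameters discussed in the rest of this post, and the lognormal spread on cost per vote is my own invention for illustration.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Placeholder inputs; see the sections below for the real discussion.
turnout = 90_000                                    # expected primary voters
cost_per_vote = rng.lognormal(np.log(300), 1.2, n)  # dollars per vote changed
p_top_two = 2 / 3    # three candidates, two top spots
p_losing_side = 0.5  # conditional on a close race
discount = 0.9       # induced opposition fundraising
early_mult = 2.0     # early money consolidates support

# Margin of victory is Uniform(0, 30%), so the chance that the margin
# lands within one percentage point is 1/30.
p_within_1pp = 0.01 / 0.30

# Cost to shift the vote by one percentage point.
cost_per_pp = 0.01 * turnout * cost_per_vote / (early_mult * discount)

# Expected cost to change the outcome: you pay for one percentage point,
# but it only flips the election if the race is that close, your
# candidate made the top two, and they were on the losing side.
cost_to_flip = cost_per_pp / (p_within_1pp * p_top_two * p_losing_side)

print(np.percentile(cost_per_pp, [25, 50, 75]))
print(np.percentile(cost_to_flip, [25, 50, 75]))
&lt;/code&gt;&lt;/pre&gt;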

&lt;p&gt;Before getting into the numbers, my take on this model is that it’s very reasonable. Some simplifying assumptions had to be made to make the model tractable, and I fully agree with all of Eric’s choices in that regard. When reproducing the model, I made only one small change that (I think) didn’t affect the final numbers at all.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Some simplifying assumptions that the model makes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Marginal campaign spending only matters if the election is close—you can’t bridge a large vote gap by throwing money at the election.&lt;/li&gt;
  &lt;li&gt;If the election is close, then spending has linear cost-effectiveness.&lt;/li&gt;
  &lt;li&gt;Campaign donations only matter insofar as they change election outcomes. Ignore any second-order effects (e.g. signaling that donors care about AI safety).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My only high-level critique is that the original model used point estimates instead of credence intervals for the input parameters. For my Squiggle version, I converted most inputs into credence intervals using my own judgment about each parameter’s uncertainty.&lt;/p&gt;

&lt;p&gt;(To be fair, doing a cost-effectiveness estimate with credence intervals is a lot more work if you’re not using a tool like Squiggle.)&lt;/p&gt;
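
&lt;p&gt;As an illustration of what that conversion involves (in Python rather than Squiggle, and with a hypothetical interval), you can turn a 50% credence interval into a lognormal distribution by matching its 25th and 75th percentiles on the log scale:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np
from scipy import stats

# Hypothetical 50% credence interval of [100, 1000] for some parameter.
q25, q75 = 100.0, 1000.0

# Lognormal quantiles satisfy q_p = exp(mu + sigma * z_p), where
# z_0.75 is about 0.6745.
z = stats.norm.ppf(0.75)
mu = (np.log(q25) + np.log(q75)) / 2           # median on the log scale
sigma = (np.log(q75) - np.log(q25)) / (2 * z)  # log-scale spread

samples = np.random.default_rng(0).lognormal(mu, sigma, 100_000)
print(np.percentile(samples, [25, 75]))        # roughly [100, 1000]
&lt;/code&gt;&lt;/pre&gt;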

&lt;p&gt;Eric’s model has seven input parameters:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;campaign spending (dollars) per vote&lt;/li&gt;
  &lt;li&gt;voter turnout&lt;/li&gt;
  &lt;li&gt;probability distribution of the margin of victory (which is used to estimate the probability that the election is close)&lt;/li&gt;
  &lt;li&gt;probability that your candidate (in this case, Alex Bores) is in the top two&lt;/li&gt;
  &lt;li&gt;probability that your candidate is on the losing side of the top two (because if your candidate would win anyway, marginal votes don’t help)&lt;/li&gt;
  &lt;li&gt;discount due to the possibility that additional fundraising could induce the opposing candidate to raise more money&lt;/li&gt;
  &lt;li&gt;multiplier due to the fact that early fundraising consolidates party support&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;(Eric talked about all of these parameters in more detail in &lt;a href=&quot;https://ericneyman.wordpress.com/2025/10/20/consider-donating-to-alex-bores-author-of-the-raise-act/&quot;&gt;his post&lt;/a&gt;, although they were split across a few sections.)&lt;/p&gt;

&lt;p&gt;I will go through the values Eric gave for each of these parameters and note where I disagree. Then I will do a sensitivity analysis.&lt;/p&gt;

&lt;h2 id=&quot;input-parameters&quot;&gt;Input parameters&lt;/h2&gt;

&lt;h3 id=&quot;campaign-spending-per-vote&quot;&gt;Campaign spending per vote&lt;/h3&gt;

&lt;p&gt;Eric Neyman assumed a typical campaign costs $100 per vote based mainly on “numbers thrown around casually by experts”, then multiplied by 3 because New York has higher costs than average.&lt;/p&gt;

&lt;p&gt;I spent 15 minutes looking for literature and I found:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://doi.org/10.1177/21582440241279659&quot;&gt;Le et al. (2024)&lt;/a&gt;&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; reviews the literature. Most papers it reviewed didn’t give direct dollar-per-vote estimates, but estimates could probably be derived by going through the data from each paper. I’m not going to do that, but it’s a feasible and well-scoped project if anyone else wants to do it.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://doi.org/10.3386/w13672&quot;&gt;Bombardini &amp;amp; Trebbi (2007)&lt;/a&gt;&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; estimated $145 per vote, but this study looked at elections from 1990–2000 so it doesn’t directly apply to 2025.&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.povertyactionlab.org/evaluation/does-campaign-spending-work-united-states&quot;&gt;Gerber (2004)&lt;/a&gt;&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; out of Poverty Action Lab&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; sent out randomized campaign mailings and found an expected one vote change per 12 households. I don’t know the all-things-considered cost to send campaign mail, but surely it’s not more than a few dollars per household, so this implies a very low cost per vote: even at $3 per mailing, that would be about $36 per vote (implausibly low, even).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Given the information I found, Eric’s guess of $100 for the average election seems reasonable, and raising it to $300 for New York elections sounds about right. But I used a wide credence interval, with my 75th percentile estimate 10x higher than my 25th percentile.&lt;/p&gt;

&lt;p&gt;I believe it would be possible to come up with a more confident estimate with another 5–10 hours of work. If I wanted to improve this cost-effectiveness estimate, that’s where I’d start.&lt;/p&gt;

&lt;h3 id=&quot;voter-turnout&quot;&gt;Voter turnout&lt;/h3&gt;

&lt;p&gt;According to &lt;a href=&quot;https://ballotpedia.org/New_York&apos;s_12th_Congressional_District&quot;&gt;Ballotpedia&lt;/a&gt;, the New York 12th District primary elections had about 90,000 voters in 2020, 2022, and 2024. So 90,000 is a reasonable estimate for voter turnout.&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;h3 id=&quot;margin-of-victory&quot;&gt;Margin of victory&lt;/h3&gt;

&lt;p&gt;Eric modeled the margin of victory as following a uniform distribution from 0-30%, on the assumption that near the beginning of a campaign, it’s very hard to predict how close an election will be. I think that’s reasonable and that’s how I would have done it.&lt;/p&gt;

&lt;p&gt;(A 30% margin means that, e.g., the top candidate gets 55% of the vote and the #2 candidate gets 25%.)&lt;/p&gt;

&lt;p&gt;Eric described a &lt;a href=&quot;https://ericneyman.wordpress.com/2025/10/20/consider-donating-to-alex-bores-author-of-the-raise-act/#64882e98-6285-4fb5-b3f3-0a77e8f88b55&quot;&gt;second model&lt;/a&gt; where he treated candidates’ votes as following a &lt;a href=&quot;https://en.wikipedia.org/wiki/Dirichlet_distribution&quot;&gt;Dirichlet distribution&lt;/a&gt;. This alternative model got approximately the same answer. I didn’t attempt to replicate it; I agree that it’s more accurate to reality, but I don’t think a Dirichlet distribution adds enough value to justify its complexity, so I just modeled the distribution as uniform.&lt;/p&gt;

&lt;h3 id=&quot;probability-that-your-candidate-is-in-the-top-two&quot;&gt;Probability that your candidate is in the top two&lt;/h3&gt;

&lt;p&gt;There are currently three candidates in the race; there are two spots in the top two; therefore there’s a 2/3 chance that Bores is in the top two. This is a very simple line of reasoning and I have no objection to it.&lt;/p&gt;

&lt;h3 id=&quot;probability-that-your-candidate-is-on-the-losing-side&quot;&gt;Probability that your candidate is on the losing side&lt;/h3&gt;

&lt;p&gt;If your candidate would win without any additional funding, then additional funding doesn’t help. Donations only matter if they would lose otherwise.&lt;/p&gt;

&lt;p&gt;There’s a 50% chance that your candidate is on the losing side, conditional on the election being close.&lt;/p&gt;

&lt;h3 id=&quot;opposition-fundraising-discount&quot;&gt;Opposition fundraising discount&lt;/h3&gt;

&lt;p&gt;Eric applied a 10% discount based on the possibility that if Bores raises more funding than expected, then the AI anti-regulation super PAC will donate more money to his opposition. That discount seems too low to me, but I don’t have any evidence about what the right number would be. (My model still used a credence interval centered on a 10% discount.)&lt;/p&gt;

&lt;p&gt;I think the probability that Bores-funding induces anti-Bores-funding is pretty high, but I also think super PAC spending is less valuable than individual-donor spending due to campaign finance restrictions (as I understand it, super PACs can buy ads, but they can’t donate to or coordinate directly with campaigns).&lt;/p&gt;

&lt;h3 id=&quot;early-fundraising-multiplier&quot;&gt;Early fundraising multiplier&lt;/h3&gt;

&lt;p&gt;Eric expects that early campaign fundraising consolidates party support—it makes it easier to get more endorsements, raise more money from funders who don’t want to back a losing candidate, etc. He estimates that early funding is 2x as valuable. I didn’t do any research on this, but 2x sounds reasonable to me. I converted Eric’s point estimate into the 50% credence interval [1.33, 3].&lt;/p&gt;

&lt;h2 id=&quot;sensitivity-analysis&quot;&gt;Sensitivity analysis&lt;/h2&gt;

&lt;p&gt;Three of the inputs have relatively narrow credence intervals: voter turnout, the probability that your candidate is in the top two, and the probability that your candidate is on the losing side.&lt;/p&gt;

&lt;p&gt;Margin of victory is based on a coarse assumption of uniform probability, but I don’t think there’s much value in adding complexity to this parameter.&lt;/p&gt;

&lt;p&gt;Two parameters, the opposition fundraising discount and the early fundraising multiplier, are completely made up. These are the #2 and #3 most important parameters (but not necessarily in that order). But I don’t actually think the credence intervals are that wide—I don’t think their 50% CIs span a factor of 10.&lt;/p&gt;

&lt;p&gt;By far the most important parameter is the &lt;strong&gt;cost per vote changed&lt;/strong&gt;. My 50% credence interval for this parameter &lt;em&gt;does&lt;/em&gt; span a factor of 10.&lt;/p&gt;

&lt;p&gt;That’s why I think the best way to improve this model would be to spend more time figuring out the cost per vote changed. The simple version is to come up with a more well-researched number for the cost-effectiveness of campaign spending. A more sophisticated implementation could attempt to model the rate of diminishing returns to spending and apply that to where the Bores campaign is at currently.&lt;/p&gt;
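
&lt;p&gt;If you want to check a claim like this mechanically rather than by eyeballing the intervals, one standard recipe is to rank-correlate each sampled input with the sampled output; whichever input has the largest correlation magnitude drives the most of the output’s uncertainty. A sketch, reusing the made-up spreads from earlier:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 50_000

# Made-up input spreads; a wider spread should mean more influence.
inputs = {
    &quot;cost_per_vote&quot;: rng.lognormal(np.log(300), 1.2, n),
    &quot;early_mult&quot;: rng.lognormal(np.log(2.0), 0.3, n),
    &quot;discount&quot;: rng.lognormal(np.log(0.9), 0.1, n),
}
cost_per_pp = (0.01 * 90_000 * inputs[&quot;cost_per_vote&quot;]
               / (inputs[&quot;early_mult&quot;] * inputs[&quot;discount&quot;]))

# Spearman rank correlation of each input with the output.
for name, x in inputs.items():
    rho = stats.spearmanr(x, cost_per_pp).correlation
    print(name, round(rho, 2))
&lt;/code&gt;&lt;/pre&gt;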

&lt;h2 id=&quot;cost-to-shift-votes-by-one-percentage-point&quot;&gt;Cost to shift votes by one percentage point&lt;/h2&gt;

&lt;p&gt;Eric gave a 50% credence interval of “something like [$40k, $170k]” for donations made specifically on October 20. Based on the other things he said, I’d infer that his 50% CI for donations in 2025 (but after October 20) is [$49k, $210k]. To my knowledge, he did not explicitly model credence intervals for input parameters.&lt;/p&gt;

&lt;p&gt;My &lt;a href=&quot;https://squigglehub.org/models/mdickens/congressional-campaign-donations&quot;&gt;replication&lt;/a&gt; finds a 50% CI of [$36k, $380k], which is notably wider, spanning 11x compared to Eric’s 4.3x.&lt;/p&gt;

&lt;h2 id=&quot;the-models-output-isnt-what-we-care-about&quot;&gt;The model’s output isn’t what we care about&lt;/h2&gt;

&lt;p&gt;This cost-effectiveness model estimates the expected cost to change the outcome of the election. That’s not what we ultimately care about. What we really care about is &lt;strong&gt;the expected cost to prevent AI extinction&lt;/strong&gt; via donating to political candidates. That number is much harder to estimate. But it’s still nice to have a model that gets you part of the way there.&lt;/p&gt;

&lt;p&gt;For a cost-effectiveness model to go all the way, it would need to model how representatives affect what AI safety legislation gets passed, and how that legislation decreases x-risk. That’s a good question for another day.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Namely, Eric’s model estimated the probability that the vote margin falls within 1000 votes, and then used that plus the expected voter turnout to estimate the probability that the margin is within one percentage point. My reproduction used voter turnout to directly estimate the probability that the margin is within one percentage point. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Le, T., Onur, I., Sarwar, R., &amp;amp; Yalcin, E. (2024). &lt;a href=&quot;https://doi.org/10.1177/21582440241279659&quot;&gt;Money in Politics: How Does It Affect Election Outcomes?&lt;/a&gt; &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Bombardini, M., &amp;amp; Trebbi, F. (2007). &lt;a href=&quot;https://doi.org/10.3386/w13672&quot;&gt;Votes or Money? Theory and Evidence from the US Congress.&lt;/a&gt; &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Eric also notes that this study looked at general elections, not primaries; general elections are probably more expensive to influence because they have relatively fewer undecided voters. &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Gerber, A. S. (2004). &lt;a href=&quot;https://doi.org/10.1177/0002764203260415&quot;&gt;Does Campaign Spending Work?&lt;/a&gt; &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Without having carefully read the paper, I’m more inclined to trust the methodology if it’s coming from Poverty Action Lab than if it’s coming from some author I’ve never heard of. &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;One thing that puzzles me is that the 2018 turnout was only 45,000, and the 2016 turnout was 17,000 (!). I don’t know why voter turnout changed so much in only four years, and then barely changed for the subsequent four years. I thought perhaps it was because New York changed its districts, but the last redistricting was in 2012. So I have no idea what caused this sudden change in turnout, and I can’t rule out that it will happen again for the 2026 election. &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Ideas Too Short for Essays, Part 2</title>
				<pubDate>Wed, 12 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/12/ideas_too_short_for_essays_part_2/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/12/ideas_too_short_for_essays_part_2/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Nearly nine years after &lt;a href=&quot;https://mdickens.me/2016/12/29/ideas_too_short_for_essays/&quot;&gt;part 1&lt;/a&gt;, I bring three new short ideas.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Keep in mind that scientific fraud happens sometimes&lt;/li&gt;
  &lt;li&gt;Clichés are good, actually&lt;/li&gt;
  &lt;li&gt;You must put unnecessary decoration on your useful items, or else you’re a weirdo&lt;/li&gt;
&lt;/ol&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;keep-in-mind-that-scientific-fraud-happens-sometimes&quot;&gt;Keep in mind that scientific fraud happens sometimes&lt;/h2&gt;

&lt;p&gt;Scientific misconduct is &lt;a href=&quot;https://www.science.org/content/article/misconduct-not-mistakes-causes-most-retractions-scientific-papers&quot;&gt;not rare&lt;/a&gt;. Even if a study uses a good experimental design, even if it has a large sample size, even if it has robust methodology, the results might still be wrong simply because the authors committed fraud. We should keep that in mind when reading scientific research.&lt;/p&gt;

&lt;p&gt;For that reason (among others), I’m not fully convinced by any one study, no matter how strong it looks.&lt;/p&gt;

&lt;p&gt;To reduce the chance of being bamboozled by fraudulent research:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Prefer studies that have been replicated by independent teams.&lt;/li&gt;
  &lt;li&gt;Prefer studies that make their data public.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;clichés-are-good-actually&quot;&gt;Clichés are good, actually&lt;/h2&gt;

&lt;p&gt;Using unique phrasing keeps your writing fresh. It forces the reader to think a little harder about what you’re saying. It keeps the reader on their toes and makes them pay attention. Often that’s what you want.&lt;/p&gt;

&lt;p&gt;Sometimes you want the opposite. Using a cliché signals to the reader: “My meaning is exactly what you think it is.” There is no wondering about your intention. Your meaning sinks into the reader’s mind like a hot knife through butter.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; (Okay, that was not a good example of a situation where using a cliché is helpful.)&lt;/p&gt;

&lt;p&gt;Beyond clichés, there are times when you want to use predictable language, and you don’t want to try to be too interesting. Using predictable language communicates that the thing you’re trying to say is predictable. It makes the language easier to parse; the reader doesn’t have to spend any time interpreting your meaning.&lt;/p&gt;

&lt;h2 id=&quot;you-must-put-unnecessary-decoration-on-your-useful-items-or-else-youre-a-weirdo&quot;&gt;You must put unnecessary decoration on your useful items, or else you’re a weirdo&lt;/h2&gt;

&lt;p&gt;I used to have blank walls with no decorations. People thought this was weird.&lt;/p&gt;

&lt;p&gt;I used to sleep on a mattress on the floor with no bed frame. People thought this was weird. Why? A mattress works perfectly fine without a bed frame.&lt;/p&gt;

&lt;p&gt;Eventually I bought a bed frame, and actually I think it was smart to buy one because now I can store stuff under it. But I still don’t care that much about wall decorations; I just got some so I could pretend to be normal.&lt;/p&gt;

&lt;p&gt;You’re also supposed to have useless uncomfortable pillows (a.k.a. “throw pillows”) on your couch. I have so far resisted buying any throw pillows.&lt;/p&gt;

&lt;p&gt;This is one of those mental differences between me and other people. I can’t fathom why people insist on adding superfluous decorations to things, and other people can’t fathom why my tastes are so dull.&lt;/p&gt;

&lt;p&gt;Really, it’s not that my tastes are dull. I think decorated walls look better than blank walls if the decoration is good. It’s more that I’m easily distracted by certain kinds of visuals.&lt;/p&gt;

&lt;p&gt;I used to have a fun desktop wallpaper. But I found it too distracting to see fun art behind the application window where I was trying to work. Now my computer’s wallpaper is pure black.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I was tempted to write “like a hot knife through Vegan Butter Alternative” because I don’t eat butter. But then I wouldn’t be following my own advice, would I? &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Are Groot and Baby Groot the Same Person?</title>
				<pubDate>Tue, 11 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/11/baby_groot/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/11/baby_groot/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;img src=&quot;/assets/images/groot.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This post contains spoilers for &lt;em&gt;Guardians of the Galaxy&lt;/em&gt;.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;At the end of &lt;em&gt;Guardians of the Galaxy&lt;/em&gt;, Groot—a sapient tree with a three-word vocabulary—dies. They take a splinter from his…trunk, I guess?…and put it in a pot, from which springs Baby Groot.&lt;/p&gt;

&lt;p&gt;There was a debate among fans as to whether Baby Groot is Groot regenerated, or if Baby Groot is an entirely new person. If I may weigh in on this debate in 2025: the answer is that it’s unanswerable.&lt;/p&gt;

&lt;p&gt;The issue is that personal identity is not clearly defined in edge cases, so we can’t say whether Groot and Baby Groot are the same person.&lt;/p&gt;

&lt;p&gt;In some cases, personal identity is unambiguous. For example, I am the same person as the Michael Dickens of 2010. There are a few reasons why I believe this:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;My name is Michael Dickens. His name is Michael Dickens.&lt;/li&gt;
  &lt;li&gt;It’s possible to trace a physical lineage from me to him where today’s Michael is made up of almost all the same molecules as yesterday’s Michael and looks almost identical.&lt;/li&gt;
  &lt;li&gt;I have memories of 2010 Michael, and I have memories of his memories.&lt;/li&gt;
  &lt;li&gt;My personality and interests are very similar to his.&lt;/li&gt;
  &lt;li&gt;He and I have the same DNA (probably, I haven’t actually checked).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Am I the same person as Sean Connery? Definitely not, because:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;My name is Michael Dickens. His name was Sean Connery.&lt;/li&gt;
  &lt;li&gt;There is not much overlap in the molecules that make up my body and that made up the body of Sean Connery.&lt;/li&gt;
  &lt;li&gt;Sean Connery starred in many films, including portraying the original James Bond. I’ve never played James Bond in a film.&lt;/li&gt;
  &lt;li&gt;I don’t have any memory of having ever been Sean Connery, and I’m pretty sure he has no memory of having ever been me.&lt;/li&gt;
  &lt;li&gt;I don’t know a lot about Sean Connery’s personality, but I think it’s pretty different from mine.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I am the same person as the Michael Dickens of 2010, because we line up on approximately every measure of personal identity. But I am not the same person as Sean Connery, because we don’t line up on &lt;em&gt;any&lt;/em&gt; such measures.&lt;/p&gt;

&lt;p&gt;What happens when we try to compare Groot and Baby Groot in this way? When we start asking questions about their identities, we don’t get a consistent answer.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Can you trace a physical lineage between them? Yes—Baby Groot grew out of a splinter that came from Groot’s body. But Baby Groot’s body is &lt;em&gt;mostly&lt;/em&gt; new. I don’t know how groot physiology works, but it seems that Groot dies when his head is destroyed, suggesting he has some sort of brain; and Baby Groot has a totally distinct brain. But perhaps groots have some sort of distributed neural system, where the splinter that contained Baby Groot contained a piece of groot-brain.&lt;/li&gt;
  &lt;li&gt;As portrayed in &lt;em&gt;Guardians of the Galaxy 2&lt;/em&gt; and in &lt;em&gt;Avengers: Infinity War&lt;/em&gt;, Baby Groot seems to have no memory of having previously been Groot.&lt;/li&gt;
  &lt;li&gt;The two characters have very different personalities.&lt;/li&gt;
  &lt;li&gt;Do Groot and Baby Groot have the same DNA? I don’t think there’s a canon answer, but my guess would be yes.&lt;/li&gt;
  &lt;li&gt;What is Baby Groot’s name? He makes it pretty clear that his name is Groot. (Which is also Groot’s name!)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What is the most important factor for defining personal identity? If it’s memory, then Groot and Baby Groot are different. If it’s direct physical lineage, then it’s actually unclear because it depends on how &lt;em&gt;much&lt;/em&gt; of a lineage you need. If the splinter that grew Baby Groot is part of the “essence” of Groot, then you could use that to argue that they’re the same person.&lt;/p&gt;

&lt;p&gt;But if Alice thinks personal identity is about memory, and Bob thinks it’s about physical continuity, then Alice will think Baby Groot is a new person, and Bob will think Baby Groot is Groot. They won’t be able to agree unless they can reconcile their definitions of personal identity.&lt;/p&gt;

&lt;p&gt;Alice and Bob disagree about Baby Groot’s identity, but they don’t disagree about any concrete facts about the universe. They agree that Baby Groot doesn’t have the same memory, and they agree that there was some physical continuity between the two beings (or, according to Bob, the one being).&lt;/p&gt;

&lt;p&gt;In this kind of situation, I prefer to say that the answer doesn’t matter—it’s a question of how you define the words, not a question about reality.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>A Thesis Regarding The Impossibility Of Giving Accurate Time Estimates, Presented As An Experiment On Form In Which The Essay Solely Consists Of A Title; In Which The Thesis States That, If Task Times Follow A Pareto Distribution (With The Right Parameters), Then An Unknown Task Takes Infinite Time In Expectation; And Therefore, In The General Case, You Cannot Provide An Accurate Time Estimate Because Any Finite Estimate Provided Will Not Capture The Expected Value; And, More Precisely, Every Estimate Will Be An Underestimate, Because Every Number Is Smaller Than Infinity; And This Matches With The General Observation That, When People Estimate Task Times, They Usually Underestimate The True Time; However, In Opposition To This Thesis Are At Least Two Observations; First, That Even If Tasks Take Infinite Time In Expectation, The Median Task Time Is Finite, And An Infinite-Expected-Value Task-Time Distribution Does Not Preclude The Possibility That Time Estimates Can Overestimate As Often As They Underestimate, But People Fail To Do This; Second, That Certain Known Biases That Result In People Underestimating The Difficulty Of Tasks, Such As Envisioning The Best-Case Scenario Rather Than The Average Case; However, In Defense Of The Original Thesis, Optimism Bias And The Pareto-Distributed Problem Space May Be Two Perspectives On The Same Phenomenon; But Even If We Reconcile The Second Concern With The Thesis, We Are Still Left With The First Concern, In Which An Unbiased Estimate Of The Median Time Should Still Be Possible, But People Are Overly Optimistic About Median Task Times; Thus, Ultimately Concluding That The Thesis Of This Essay--Or, More Accurately, The Thesis Of This Title--Is A Faulty Explanation Of People's General Inability To Provide Accurate Time Estimates; Then Following Up This Thesis With The Additional Observation That We Can Model Tasks As Turing Machines; And The Halting Problem States That It Is Impossible In General To Say Whether A Turing Machine Will Halt, And As A Corollary, It Is Impossible In General To Predict How Long A Turing Machine Will Run For Even If It Does Halt; So Perhaps The Halting Problem Means That We Cannot Make Accurate Time Estimates In General; However, It Is Not Clear That The Sorts Of Tasks That Human Beings Estimate Are Sufficiently General For This Concern To Apply, And Indeed It Seems Not To Apply Because Some Subset Of People Do In Fact Succeed At Making Unbiased Time Estimates In At Least Some Situations, At Least Where 'Unbiased' Is Defined Relative To The Median Rather Than The Mean; It Is Difficult To Say In Which Real-Life Situations The Halting Problem Is Relevant Because It Is Not Feasible To Construct A Formal Mathematical Proof For Realistic Real-Life Situations Because This Would Require Creating A Sophisticated Model In Which The State Of The Universe Is Translated To A Turing Machine, Which Would Be An Extremely Large Turing Machine And Probably Not Feasible To Reason About; Leading To The Conclusion That This Essay's Speculation Led Nowhere</title>
				<pubDate>Mon, 10 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/10/long_title/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/10/long_title/</guid>
                <description>
                  
                  
                  
                  

                </description>
			</item>
		
			<item>
				<title>Upside Volatility Is Bad</title>
				<pubDate>Sun, 09 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/09/upside_volatility_is_bad/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/09/upside_volatility_is_bad/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Investors often say that standard deviation is a bad way to measure investment risk because it penalizes upside volatility as well as downside. I agree that standard deviation isn’t a great measure of risk, but that’s not the reason. A good risk measure &lt;em&gt;should&lt;/em&gt; penalize upside volatility, because upside volatility is bad.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;A sure-thing return is better than a volatile return, &lt;em&gt;even if the volatile return is guaranteed to be positive&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;I will first explain my reasoning without using too much math, and then provide a more rigorous explanation using math in the &lt;a href=&quot;#mathy-explanation&quot;&gt;next section&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;As an illustration, suppose there’s some investment that only ever produces positive returns, and the distribution of outcomes looks like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/upside-vol.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;(For those who want to know, that’s a &lt;a href=&quot;https://en.wikipedia.org/wiki/Gamma_distribution&quot;&gt;gamma distribution&lt;/a&gt; with shape 0.1 and scale 2.)&lt;/p&gt;

&lt;p&gt;This investment has plenty of upside volatility, but no downside volatility—it can never earn a negative return.&lt;/p&gt;

&lt;p&gt;The investment has an expected return of 5%. But if you buy it, most of the time you will earn &lt;em&gt;less&lt;/em&gt; than 5%. If you’re investing to prepare for your future, which would you rather have: a guaranteed 5% return? Or a volatile return with an average of 5%, but where you probably end up getting less than that?&lt;/p&gt;

&lt;p&gt;With the guaranteed return, you’re guaranteed to be set for retirement as long as you put enough money into savings. With the volatile investment, even though you know you won’t &lt;em&gt;lose&lt;/em&gt; money, you’re still not sure how much you’ll end up with.&lt;/p&gt;

&lt;h2 id=&quot;mathy-explanation&quot;&gt;Mathy explanation&lt;/h2&gt;

&lt;p&gt;Suppose Alice is an investor with logarithmic utility of money, which is a classic risk-averse utility function. I generated one million sample outcomes using our upside-volatile distribution and found that Alice’s expected utility was 0.12. (0.12 of what? 0.12 utility. It doesn’t mean anything concrete; it’s just a number.)&lt;/p&gt;

&lt;p&gt;Alice has the opportunity to buy a safe investment with the same expected return, but zero volatility. The guaranteed investment has 0.18 utility for Alice, considerably higher than the volatile investment, even though the volatile investment carries no risk of losing money.&lt;/p&gt;

&lt;p&gt;Bob is twice as risk-averse as Alice.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; His utility for the guaranteed investment is 0.17, and his expected utility for the volatile asset is 0.09. Like Alice, he prefers the sure thing. For him, the sure thing is nearly &lt;em&gt;twice&lt;/em&gt; as good.&lt;/p&gt;

&lt;p&gt;I believe Bob’s utility function is more representative of a normal person’s. So for a normal person, the sure thing is &lt;em&gt;much&lt;/em&gt; better than the volatile investment, &lt;em&gt;even though the volatility is all upside&lt;/em&gt;.&lt;/p&gt;
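
&lt;p&gt;If you want to reproduce these numbers, here’s a minimal sketch of the simulation. One assumption to flag: the post doesn’t spell out the exact setup, but treating each gamma draw as the return on a starting wealth of 1 gives approximately the utilities quoted above.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import numpy as np

rng = np.random.default_rng(0)

# Volatile asset: always-positive return drawn from the gamma distribution
# pictured above (shape 0.1, scale 2).
r = rng.gamma(shape=0.1, scale=2.0, size=1_000_000)
volatile_wealth = 1 + r

# Safe asset: the same expected return, but guaranteed.
safe_wealth = 1 + r.mean()

def crra2(w):
    # Bob's utility: relative risk aversion coefficient of 2
    return 1 - 1 / w

print(np.log(volatile_wealth).mean(), np.log(safe_wealth))  # Alice: ~0.12 vs ~0.18
print(crra2(volatile_wealth).mean(), crra2(safe_wealth))    # Bob:   ~0.09 vs ~0.17
&lt;/code&gt;&lt;/pre&gt;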

&lt;h2 id=&quot;skewness-still-matters&quot;&gt;Skewness still matters&lt;/h2&gt;

&lt;p&gt;I’m not saying standard deviation is a perfect measure of risk, because it’s definitely not.&lt;/p&gt;

&lt;p&gt;Imagine you have a choice between two investments that have expected returns distributed like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/skewness.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;These distributions have the same standard deviation. But the blue distribution is still preferable to the orange one, because the orange one has a much bigger risk of losing money. It’s &lt;em&gt;symmetric&lt;/em&gt;, whereas the blue distribution is &lt;em&gt;right-skewed&lt;/em&gt;. Just looking at standard deviation doesn’t capture that.&lt;/p&gt;

&lt;p&gt;Or compare these two distributions:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/skewness-mega.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Both distributions have the same mean and standard deviation, but the orange one looks &lt;em&gt;horrible&lt;/em&gt;. I would definitely not want to invest in the orange one. The left-skewed distribution looks much less appealing than the right-skewed one.&lt;/p&gt;
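
&lt;p&gt;A quick way to see how two distributions can share a mean and standard deviation while differing only in skew is to mirror a skewed distribution around its mean. This sketch uses a lognormal for illustration, not necessarily the distributions pictured:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

right_skewed = rng.lognormal(mean=0.0, sigma=0.5, size=1_000_000)
left_skewed = 2 * right_skewed.mean() - right_skewed  # mirror image around the mean

for sample in (right_skewed, left_skewed):
    # Same mean and standard deviation; the skewness flips sign.
    print(sample.mean(), sample.std(), stats.skew(sample))
&lt;/code&gt;&lt;/pre&gt;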

&lt;p&gt;Yes, upside volatility is bad, but downside volatility is &lt;em&gt;worse&lt;/em&gt;. A guaranteed constant return is better than an always-positive but uncertain return, which in turn is better than an uncertain return that might be negative. (Assuming, of course, that all three have the same expected return.)&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;By which I mean he has a relative risk aversion coefficient of 2, so his utility function of wealth is \(U(w) = 1 - \frac{1}{w}\). &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Writing Your Representatives: A Cost-Effective and Neglected Intervention</title>
				<pubDate>Sat, 08 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/08/call_or_write_your_representatives/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/08/call_or_write_your_representatives/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Is it a good use of time to call or write your representatives to advocate for issues you care about? I did some research, and my current (weakly-to-moderately-held) belief is that messaging campaigns are very cost-effective.&lt;/p&gt;

&lt;p&gt;In this post:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;I look at evidence from &lt;a href=&quot;#evidence-from-randomized-experiments&quot;&gt;randomized experiments&lt;/a&gt;, &lt;a href=&quot;#evidence-from-surveys&quot;&gt;surveys&lt;/a&gt; of legislators’ opinions, and &lt;a href=&quot;#observational-evidence&quot;&gt;observational evidence&lt;/a&gt;. All lines of evidence suggest that messaging campaigns are effective, but none of the evidence is strong.&lt;/li&gt;
  &lt;li&gt;I &lt;a href=&quot;#cost-effectiveness-estimate&quot;&gt;write an estimate&lt;/a&gt; of how many messages it takes to get a bill to pass in expectation, and how much that costs. According to my model, changing a vote outcome takes 17,000 messages for the Michigan state legislature and 2.2 million messages for US Congress.&lt;/li&gt;
  &lt;li&gt;I provide links to &lt;a href=&quot;#how-to-participate-in-messaging-campaigns&quot;&gt;resources on how to participate in messaging campaigns&lt;/a&gt; for animal welfare, AI safety, and global poverty.&lt;/li&gt;
&lt;/ul&gt;

&lt;!-- more --&gt;

&lt;p&gt;&lt;em&gt;Cross-posted to the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/PvJL4Rnz2Dq9J2omd/writing-your-representatives-a-worthwhile-and-neglected&quot;&gt;Effective Altruism Forum&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#evidence-from-randomized-experiments&quot; id=&quot;markdown-toc-evidence-from-randomized-experiments&quot;&gt;Evidence from randomized experiments&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#evidence-from-surveys&quot; id=&quot;markdown-toc-evidence-from-surveys&quot;&gt;Evidence from surveys&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#observational-evidence&quot; id=&quot;markdown-toc-observational-evidence&quot;&gt;Observational evidence&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#theoretical-argument&quot; id=&quot;markdown-toc-theoretical-argument&quot;&gt;Theoretical argument&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#state-vs-federal-representatives&quot; id=&quot;markdown-toc-state-vs-federal-representatives&quot;&gt;State vs. federal representatives&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#cost-effectiveness-estimate&quot; id=&quot;markdown-toc-cost-effectiveness-estimate&quot;&gt;Cost-effectiveness estimate&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#so-are-messaging-campaigns-cost-effective&quot; id=&quot;markdown-toc-so-are-messaging-campaigns-cost-effective&quot;&gt;So, are messaging campaigns cost-effective?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#how-to-participate-in-messaging-campaigns&quot; id=&quot;markdown-toc-how-to-participate-in-messaging-campaigns&quot;&gt;How to participate in messaging campaigns&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#animal-welfare&quot; id=&quot;markdown-toc-animal-welfare&quot;&gt;Animal welfare&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#ai-safety&quot; id=&quot;markdown-toc-ai-safety&quot;&gt;AI safety&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#global-poverty&quot; id=&quot;markdown-toc-global-poverty&quot;&gt;Global poverty&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;evidence-from-randomized-experiments&quot;&gt;Evidence from randomized experiments&lt;/h2&gt;

&lt;p&gt;There are two randomized controlled trials on messaging campaigns targeted at legislators: &lt;a href=&quot;https://mdickens.me/materials/bergan2009.pdf&quot;&gt;Bergan (2009)&lt;/a&gt;&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; and &lt;a href=&quot;https://mdickens.me/materials/bergan2014.pdf&quot;&gt;Bergan &amp;amp; Cole (2014)&lt;/a&gt;&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;. These studies randomly assigned state legislators to either receive or not receive messages from volunteers advocating in favor of an upcoming bill, and then looked at how many legislators voted for the bill depending on whether they received messages or not.&lt;/p&gt;

&lt;p&gt;Both studies found statistically significant differences. Bergan (2009) found that the messaging campaign increased positive votes by 20 percentage points, and Bergan &amp;amp; Cole (2014) found a 12 percentage point improvement.&lt;/p&gt;

&lt;p&gt;This table summarizes key facts about the studies:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;State&lt;/th&gt;
      &lt;th&gt;Medium&lt;/th&gt;
      &lt;th&gt;Avg # Messages&lt;/th&gt;
      &lt;th&gt;Change&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Bergan (2009)&lt;/td&gt;
      &lt;td&gt;New Hampshire&lt;/td&gt;
      &lt;td&gt;email&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;+20 pp&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Bergan &amp;amp; Cole (2014)&lt;/td&gt;
      &lt;td&gt;Michigan&lt;/td&gt;
      &lt;td&gt;phone&lt;/td&gt;
      &lt;td&gt;22&lt;/td&gt;
      &lt;td&gt;+12 pp&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Bergan &amp;amp; Cole (2014) had legislators in the experimental group receive either 22, 33, or 65 calls. The study was underpowered to detect differences between those numbers, but the 65-calls group showed a (non-significantly) weaker effect than 22 or 33, which hints that the number of calls doesn’t matter beyond a certain point.&lt;/p&gt;

&lt;h2 id=&quot;evidence-from-surveys&quot;&gt;Evidence from surveys&lt;/h2&gt;

&lt;p&gt;A &lt;a href=&quot;https://static1.squarespace.com/static/67ead1d67cfe8944d45170dd/t/6894aae53682bb597c6bc7ae/1754573542734/cwc-perceptions-of-citizen-advocacy.pdf&quot;&gt;survey&lt;/a&gt;&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; by the &lt;a href=&quot;https://www.congressfoundation.org/research&quot;&gt;Congressional Management Foundation&lt;/a&gt; asked US Congress senior staffers how much weight they give to different forms of communication:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Congress-influence.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Respondents overwhelmingly say they give weight to messages from constituents, although I’m not sure how seriously to take this because there’s a strong social desirability bias at play.&lt;/p&gt;

&lt;p&gt;But insofar as we can take these results seriously:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;In-person visits are better than individualized messages.&lt;/li&gt;
  &lt;li&gt;Individualized messages from constituents are more influential than lobbyists. (I am somewhat skeptical of this.)&lt;/li&gt;
  &lt;li&gt;Lobbyists are more influential than form messages.&lt;/li&gt;
  &lt;li&gt;Most respondents still give “some influence” to form messages.&lt;/li&gt;
  &lt;li&gt;There isn’t much difference between postal letters, email, and phone calls, although phone calls were the worst of the three. (This is good news for me as someone who’s allergic to making phone calls.)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An &lt;a href=&quot;https://v2v.opengovfoundation.org/staff-perspectives-on-the-best-ways-to-get-heard-5d30c85eb9f5&quot;&gt;article from the OpenGov Foundation&lt;/a&gt; has a more qualitative perspective, with quotes from legislators about what kinds of advocacy they care most about.&lt;/p&gt;

&lt;p&gt;When I’ve talked to lobbyists, they told me that policy-makers pay more attention to them than to constituents. These surveys by academics/think tanks say the opposite. Both of these pieces of evidence are contaminated by the fact that policy-makers are going to tell people what they want to hear. So ultimately you have to just decide who you believe more.&lt;/p&gt;

&lt;h2 id=&quot;observational-evidence&quot;&gt;Observational evidence&lt;/h2&gt;

&lt;p&gt;One way to look at the question is to measure how well politicians’ votes align with public opinion vs. interest groups. That tells us something about how much politicians pay attention to the public compared to lobbyists, although this isn’t great evidence because politicians might vote one way or another for many reasons. And whether politicians align with public opinion doesn’t necessarily tell us how well messaging campaigns work, because there aren’t messaging campaigns on every issue.&lt;/p&gt;

&lt;p&gt;And, the question is muddied by the fact that there can be interest groups on both sides of an issue, and possibly even public messaging campaigns on both sides.&lt;/p&gt;

&lt;p&gt;As with most fields in social science, observational evidence is much easier to find than experimental evidence, so there are many research papers on this question. And because observational evidence is &lt;em&gt;weaker&lt;/em&gt; than experimental evidence, I spent less time on it.&lt;/p&gt;

&lt;p&gt;A systematic review by &lt;a href=&quot;/materials/economic-inequality-and-political-responsiveness.pdf&quot;&gt;Elkjær &amp;amp; Klitgaard (2021)&lt;/a&gt;&lt;sup id=&quot;fnref:8&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; found very different answers across studies. Some studies found that public opinion mattered a great deal; others found that public opinion mattered far less than elite opinion or interest groups. Answers varied depending on how each study approached the problem and what statistical model they used.&lt;/p&gt;

&lt;p&gt;I haven’t dug enough into the research to say whether some studies’ methodologies are better than others—it may be that some methodologies don’t make sense, and once you eliminate those, there is a single clear answer. But based on my cursory review, it looks to me like the observational evidence is mixed.&lt;/p&gt;

&lt;h2 id=&quot;theoretical-argument&quot;&gt;Theoretical argument&lt;/h2&gt;

&lt;p&gt;There is a simple theoretical reason to expect messaging campaigns to work well:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;A representative’s job is to do what their constituents want.&lt;/li&gt;
  &lt;li&gt;If you tell them what you want, that increases the chances that they’ll do it.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;And an argument for cost-effectiveness: most people don’t talk to their representatives, so if you do, you can have a big impact.&lt;/p&gt;

&lt;h2 id=&quot;state-vs-federal-representatives&quot;&gt;State vs. federal representatives&lt;/h2&gt;

&lt;p&gt;I’m only going to talk about the United States because I don’t know much about other governments. But my guess is that messaging campaigns should work roughly as well in any representative democracy as they do in America.&lt;/p&gt;

&lt;p&gt;The two randomized experiments looked at vote outcomes from state representatives in a medium-sized state (Michigan) and a small state (New Hampshire). Federal representatives represent many more people and therefore get more mail.&lt;/p&gt;

&lt;p&gt;How much more mail? I don’t know. I couldn’t find data on the volume of mail received by state representatives. The fact that double-digit percentages of representatives changed their votes after receiving 22 phone calls (in Michigan) or three (!) emails (in New Hampshire&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;) suggests that they don’t receive many messages.&lt;/p&gt;

&lt;p&gt;(Michigan has a population of 10 million and New Hampshire has 1.4 million, which is roughly consistent with the 7x difference in the number of messages sent for the respective advocacy campaigns.)&lt;/p&gt;

&lt;p&gt;US Congress members, on the other hand, typically received 1000–1500 contacts per week in 2013 (&lt;a href=&quot;https://www.vanderbilt.edu/csdi/AbernathyDissertation_Formatted.pdf&quot;&gt;Abernathy (2015)&lt;/a&gt;&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;):&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/congress-contacts.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;They receive perhaps 3000 contacts per week today, although I couldn’t find a primary source.&lt;sup id=&quot;fnref:9&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;h2 id=&quot;cost-effectiveness-estimate&quot;&gt;Cost-effectiveness estimate&lt;/h2&gt;

&lt;p&gt;I created a &lt;a href=&quot;https://squigglehub.org/models/mdickens/messaging-campaigns&quot;&gt;Squiggle model&lt;/a&gt; to estimate the cost-effectiveness of state and federal messaging campaigns. The model itself has documentation explaining how it works. I won’t explain every detail in this post—you can click through to the &lt;a href=&quot;https://squigglehub.org/models/mdickens/messaging-campaigns&quot;&gt;model&lt;/a&gt; if you’re interested—but I’ll give an overview of how it works.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Bergan (2009)&lt;sup id=&quot;fnref:3:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; and Bergan &amp;amp; Cole (2014)&lt;sup id=&quot;fnref:4:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; randomly assigned state legislators to receive messages. They found vote shifts of 20 and 12 percentage points, respectively. My model uses smaller numbers to be conservative.&lt;/li&gt;
  &lt;li&gt;Using data &lt;a href=&quot;https://github.com/michaeldickens/public-scripts/blob/master/congress.py&quot;&gt;pulled from Congressional records&lt;/a&gt;, I estimated what proportion of vote outcomes could be flipped by shifting votes by N percentage points.&lt;/li&gt;
  &lt;li&gt;Calculate &lt;code&gt;[percentage vote change per message] * [number of legislators] * [probability of outcome change for every 1% vote change]&lt;/code&gt; to get the expected probability of changing a vote outcome per message sent. (A code sketch of this step follows the list.)&lt;/li&gt;
&lt;/ol&gt;
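
&lt;p&gt;To make step 3 concrete, here’s the outline as code. These point estimates are placeholders I made up so that the output lands near the Michigan median reported below; the model’s actual (distributional) inputs are in the Squiggle model.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# Placeholder point estimates; the real model uses probability distributions.
pct_vote_change_per_message = 0.001  # pp shift in a legislator's vote per message
num_legislators = 148                # Michigan House + Senate
p_outcome_change_per_pct = 0.0004    # P(a bill's outcome flips) per 1pp vote shift

p_outcome_change_per_message = (
    pct_vote_change_per_message * num_legislators * p_outcome_change_per_pct
)
print(1 / p_outcome_change_per_message)  # ~17,000 messages per changed outcome
&lt;/code&gt;&lt;/pre&gt;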

&lt;p&gt;That’s just an outline. My model includes a lot of assumptions, and you probably disagree with some of them; if you’re opinionated, you should &lt;a href=&quot;https://squigglehub.org/models/mdickens/messaging-campaigns&quot;&gt;open the model&lt;/a&gt; and change the numbers as you see fit.&lt;/p&gt;

&lt;p&gt;Then to get the cost to change a &lt;em&gt;federal&lt;/em&gt; vote outcome, I scaled up based on the population difference between a medium-sized state and the United States as a whole. I added a 2x multiplier to adjust for the fact that US Congress is more salient and probably gets more messages per capita than state legislatures (as well as more advocacy via other vectors).&lt;/p&gt;

&lt;p&gt;Multiplying all these factors together, my model came up with these results:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Changing an outcome in the Michigan state legislature (taken as a representative medium-sized state) requires a median of &lt;strong&gt;17,000 messages&lt;/strong&gt; (mean 15,000&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt;; 90% credence interval 6200 to 130,000).&lt;sup id=&quot;fnref:10&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Changing an outcome in US Congress requires a median of &lt;strong&gt;2.2 million messages&lt;/strong&gt; (mean 1.9 million; 90% credence interval 810,000 to 17 million).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Of course, that doesn’t mean you can brute-force your way into changing an outcome by running a giant multi-million-person messaging campaign. The model only applies to normal-sized campaigns. If you send, say, 22,000 messages, then—according to this model—you have a 1% chance of changing the outcome of a vote in Congress.&lt;/p&gt;

&lt;p&gt;We can also calculate cost-effectiveness by assigning a monetary value to the time spent calling or writing letters. When I plugged in some best-guess numbers (the conversion is sketched in code after this list), I came up with:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The median cost to change an outcome in the Michigan state legislature is &lt;strong&gt;$440,000&lt;/strong&gt; (mean $130,000; 90% credence interval $31,000 to $3.8 million).&lt;/li&gt;
  &lt;li&gt;The median cost to change an outcome in US Congress is &lt;strong&gt;$58 million&lt;/strong&gt; (mean $17 million; 90% credence interval $4 million to $500 million).&lt;/li&gt;
  &lt;li&gt;The median cost to change an outcome in California (the largest US state) is &lt;strong&gt;$3.1 million&lt;/strong&gt; (mean $930,000; 90% credence interval $220,000 to $27 million).&lt;/li&gt;
&lt;/ol&gt;
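
&lt;p&gt;The conversion from messages to dollars is a one-liner. These inputs are my round-number guesses, chosen to land near the Michigan median above; the model itself uses distributions:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;messages = 17_000        # median messages per outcome change (Michigan, from above)
hours_per_message = 0.5  # guess: time to write and send one personalized letter
value_per_hour = 50      # guess: dollar value of one volunteer-hour

print(messages * hours_per_message * value_per_hour)  # 425,000; near the median above
&lt;/code&gt;&lt;/pre&gt;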

&lt;p&gt;These numbers depend a lot on the value of volunteers’ time. My model assumed that the volunteers are people like you, the reader of this post. Most of you probably have higher incomes than average and donate a lot more money to charity than average.&lt;/p&gt;

&lt;p&gt;My model assumes that every letter is personally written by the sender. That might not be right: the letter-writers in Bergan (2009) were probably mostly sending form letters (the paper did not specify), which means the numbers derived from Bergan (2009) reflect form letters, not customized ones.&lt;/p&gt;

&lt;p&gt;You could decrease the time requirement by ~10x by sending form letters instead of personalized letters. I don’t know whether the result would ultimately be more or less cost-effective because form letters are also less impactful. My general guideline would be that it’s better to write your own letter if you’re up for it, but if not, sending a form letter is still worthwhile.&lt;/p&gt;

&lt;h2 id=&quot;so-are-messaging-campaigns-cost-effective&quot;&gt;So, are messaging campaigns cost-effective?&lt;/h2&gt;

&lt;p&gt;Would I pay $58 million if that’s what it cost to pass a federal version of SB 53 or the RAISE Act? I think I would. I’d rather spend $58 million on that than on marginal alignment research. But it’s not an obvious call and I can see arguments the other way.&lt;/p&gt;

&lt;p&gt;$3.1 million to get a bill passed in California sounds like a great deal to me. California regulations matter less than US law, but not &amp;gt;10x less. Remember, of course, that you can’t actually get a bill passed by throwing $3.1 million at a messaging campaign. But it seems like a great deal to spend a much smaller amount of money for an appropriately scaled-down impact.&lt;/p&gt;

&lt;p&gt;Are messaging campaigns the &lt;em&gt;best&lt;/em&gt; political intervention? I don’t know, probably not?&lt;/p&gt;

&lt;p&gt;I haven’t made a similar effort to estimate the cost-effectiveness of other interventions. I found unusually good data on messaging campaigns, which is to say I found two experiments covering two small-to-medium state legislatures that studied only a single bill each. That’s not much to go on, but it’s better than the zero experimental studies that we often have.&lt;/p&gt;

&lt;p&gt;It may be that it’s more cost-effective to support lobbying by a dedicated interest group with strong political connections. I spoke to one person who has done both messaging campaigns and lobbying who believes that the latter is better (under certain conditions).&lt;sup id=&quot;fnref:13&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:13&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt; But the cost-effectiveness of lobbying is even harder to estimate than messaging campaigns.&lt;/p&gt;

&lt;p&gt;The book &lt;em&gt;Lobbying and Policy Change: Who Wins, Who Loses, and Why&lt;/em&gt;—which I summarized in my &lt;a href=&quot;https://mdickens.me/reading-notes/#[2025-06-02%20Mon]%20Lobbying%20and%20Policy%20Change:%20Who%20Wins,%20Who%20Loses,%20and%20Why&quot;&gt;reading notes&lt;/a&gt;—found that neither PAC spending nor lobbying spending could predict political success in observational studies, although the authors expressed skepticism about this result.&lt;sup id=&quot;fnref:11&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt; &lt;a href=&quot;/materials/limits-of-lobbying.pdf&quot;&gt;Camp et al. (2024)&lt;/a&gt;&lt;sup id=&quot;fnref:12&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt; conducted four field experiments with real-world lobbyists and found that lobbyist outreach had no significant effect on legislators’ policy positions. This leaves me uncertain about what to believe: some people involved in political advocacy believe lobbying is particularly effective, but the externally verifiable (though limited) evidence finds that it isn’t.&lt;sup id=&quot;fnref:14&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;(The cost-effectiveness of lobbying could be its own topic, but I’ll leave it there for now.)&lt;/p&gt;

&lt;p&gt;Messaging campaigns look cost-effective relative to AI alignment research,&lt;sup id=&quot;fnref:15&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:15&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;14&lt;/a&gt;&lt;/sup&gt; but it’s harder to say how they compare to other types of advocacy. Separately, there’s the question of whether you, personally, should write letters to your representatives when an important issue comes up. In that case, I think the answer is a strong yes, as long as you have the spare time. If you’re limited on time, you can still sign your name on a pre-written letter, which only takes about two minutes.&lt;/p&gt;

&lt;h2 id=&quot;how-to-participate-in-messaging-campaigns&quot;&gt;How to participate in messaging campaigns&lt;/h2&gt;

&lt;p&gt;For practical guidance on how to talk to your representatives, see &lt;a href=&quot;https://forum.effectivealtruism.org/posts/5oStggnYLGzomhvvn/talking-to-congress-can-constituents-contacting-their&quot;&gt;Talking to Congress: Can constituents contacting their legislator influence policy?&lt;/a&gt; That article was written by some people who, unlike me, have actually run messaging campaigns before.&lt;/p&gt;

&lt;p&gt;Compassion in World Farming also has a &lt;a href=&quot;https://www.ciwf.org.uk/get-involved/get-campaigning/letter-writing/&quot;&gt;guide to effective letter writing for farm animal welfare advocacy&lt;/a&gt;; the advice is relevant to any cause area.&lt;/p&gt;

&lt;p&gt;I am not sure whether you should send a form letter or write out your own letter. I’m confident that personalized letters are more impactful, but they also take much longer, so it’s not clear that they’re more time-effective. I would probably suggest writing a personalized letter if you have time; but if you don’t, or if you’re not sure what to say, then sending a form letter is much better than nothing.&lt;/p&gt;

&lt;p&gt;If you want to get involved in messaging campaigns, below are three lists of orgs who run campaigns in three effective altruist cause areas.&lt;/p&gt;

&lt;h3 id=&quot;animal-welfare&quot;&gt;Animal welfare&lt;/h3&gt;

&lt;p&gt;Animal advocacy groups are well-versed in running public campaigns, and there are many ways to get involved.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;ASPCA frequently runs messaging campaigns. Its &lt;a href=&quot;https://www.aspca.org/get-involved/advocacy-center&quot;&gt;Advocacy Center&lt;/a&gt; lets you filter by issue (farm animals, puppy mills, etc.) and it shows a list of relevant issues that match your criteria. For example, right now it has a &lt;a href=&quot;https://secure.aspca.org/action/farm-bill&quot;&gt;page on the Farm Bill&lt;/a&gt;, explaining how the bill will negatively impact farm animals, and providing a form where you can send a letter to your legislator.&lt;/li&gt;
  &lt;li&gt;Mercy for Animals has a &lt;a href=&quot;https://mercyforanimals.org/take-action/lend-your-voice/&quot;&gt;Lend Your Voice&lt;/a&gt; page. As of this writing, the page links to a &lt;a href=&quot;https://mercyforanimals.org/IAA/&quot;&gt;message form&lt;/a&gt; where you can contact your representatives about the Industrial Agriculture Accountability Act.&lt;/li&gt;
  &lt;li&gt;Compassion in World Farming has an &lt;a href=&quot;https://www.ciwf.org.uk/&quot;&gt;“Act Now” button on its website&lt;/a&gt;. The direct link is &lt;a href=&quot;https://action.ciwf.org.uk/page/174902/action/1&quot;&gt;here&lt;/a&gt;, but I’m not sure if that link will still work a month from now; if it doesn’t, go to the &lt;a href=&quot;https://www.ciwf.org.uk/&quot;&gt;home page&lt;/a&gt; and click the “Act Now” button.&lt;/li&gt;
  &lt;li&gt;You can sign up for The Humane League’s &lt;a href=&quot;https://thehumaneleague.org/fast-action-network&quot;&gt;Fast Action Network&lt;/a&gt;, and you will get notified when there are actions you can take (which mostly means corporate campaigns; I’m not sure whether they write letters to policy-makers).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A lot of animal advocacy orgs have useful resources; those were just a few of the ones I found.&lt;/p&gt;

&lt;h3 id=&quot;ai-safety&quot;&gt;AI safety&lt;/h3&gt;

&lt;p&gt;There is nothing particularly organized right now: no dedicated “AI safety messaging campaign” newsletter or mailing list exists. I hope there are better options in the future.&lt;/p&gt;

&lt;p&gt;There are a few other kinds of resources, though:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;PauseAI US has a &lt;a href=&quot;https://pauseaius.substack.com/&quot;&gt;newsletter&lt;/a&gt; that’s mostly not about messaging campaigns. The newsletter did make a &lt;a href=&quot;https://pauseaius.substack.com/p/call-to-action-contact-your-senators&quot;&gt;call to action on the 10-year moratorium on AI regulation&lt;/a&gt;; if PauseAI US runs another messaging campaign, you will probably hear about it on the newsletter. PauseAI US also has a dedicated &lt;a href=&quot;https://discord.com/channels/1286529161510387722/1329851426469314591&quot;&gt;“contact-officials” channel&lt;/a&gt; on its Discord.&lt;/li&gt;
  &lt;li&gt;ControlAI has a &lt;a href=&quot;https://controlai.com/take-action/choose&quot;&gt;Take Action page&lt;/a&gt; where you can send a message to your representatives, using either a form letter or a message you write yourself. The form letter broadly raises concern about AI existential risk, rather than being about any particular piece of legislation. (ControlAI also has a &lt;a href=&quot;https://campaign.controlai.com/take-action&quot;&gt;page with other ways to take action&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;The book &lt;em&gt;If Anyone Builds It, Everyone Dies&lt;/em&gt; has an &lt;a href=&quot;https://ifanyonebuildsit.com/act/letter&quot;&gt;associated web page&lt;/a&gt; where you can write a letter to your representatives (either pre-written or written by you). As with ControlAI’s page, the letter isn’t about any specific legislation.&lt;/li&gt;
  &lt;li&gt;PauseAI has an &lt;a href=&quot;https://pauseai.info/email-builder&quot;&gt;email builder&lt;/a&gt; for writing a customizable form letter.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;global-poverty&quot;&gt;Global poverty&lt;/h3&gt;

&lt;p&gt;I couldn’t find any orgs that run messaging campaigns focused specifically on cost-effective global poverty interventions (like the sort of thing &lt;a href=&quot;https://www.givewell.org/&quot;&gt;GiveWell&lt;/a&gt; would recommend), but there are some orgs that focus on global poverty more broadly.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Partners in Health has an &lt;a href=&quot;https://www.pih.org/advocate&quot;&gt;Advocacy page&lt;/a&gt; where you can write Congress to support funding for global health initiatives, or sign up to the PIH Action Network.&lt;/li&gt;
  &lt;li&gt;RESULTS has &lt;a href=&quot;https://results.org/volunteers/action-center/action-alerts&quot;&gt;Action Alerts&lt;/a&gt; for writing letters to policy-makers and to newspapers.&lt;/li&gt;
  &lt;li&gt;Catholic Relief Services has a &lt;a href=&quot;https://www.crs.org/ways-to-help/advocate/take-action&quot;&gt;Take Action page&lt;/a&gt; that includes Congressional messaging campaigns among other things.&lt;/li&gt;
&lt;/ul&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Bergan, D. E. (2009). &lt;a href=&quot;https://doi.org/10.1177/1532673x08326967&quot;&gt;Does Grassroots Lobbying Work?.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.1177/1532673x08326967&quot;&gt;10.1177/1532673x08326967&lt;/a&gt; &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:3:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Bergan, D. E., &amp;amp; Cole, R. T. (2014). &lt;a href=&quot;https://doi.org/10.1007/s11109-014-9277-1&quot;&gt;Call Your Legislator: A Field Experimental Study of the Impact of a Constituency Mobilization Campaign on Legislative Voting.&lt;/a&gt; &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:4:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Congressional Management Foundation (2011). &lt;a href=&quot;https://static1.squarespace.com/static/67ead1d67cfe8944d45170dd/t/6894aae53682bb597c6bc7ae/1754573542734/cwc-perceptions-of-citizen-advocacy.pdf&quot;&gt;Communicating with Congress: Perceptions of Citizen Advocacy on Capitol Hill.&lt;/a&gt; &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Elkjær, M. A., &amp;amp; Klitgaard, M. B. (2021). Economic inequality and political responsiveness: A systematic review. doi: &lt;a href=&quot;https://doi.org/10.1017/S1537592721002188&quot;&gt;10.1017/S1537592721002188&lt;/a&gt; &lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;When I first read the paper, I found this number to be shockingly low—how could three emails cause a 12 percentage point shift in votes? But it made more sense after I did the math and realized that each New Hampshire legislator only represents about 3,000 constituents. &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Abernathy, C. E. (2015). &lt;a href=&quot;https://www.vanderbilt.edu/csdi/AbernathyDissertation_Formatted.pdf&quot;&gt;Legislative correspondence management practices: Congressional offices and the treatment of constituent opinion.&lt;/a&gt; &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:9&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Stowe, L. (2023). &lt;a href=&quot;https://www.fireside21.com/resources/congressional-staffer-communication/&quot;&gt;How Congressional Staffers Can Manage 81 Million Messages From Constituents.&lt;/a&gt;&lt;/p&gt;

      &lt;p&gt;Note: This article did not cite an original source for its numbers. The best original source I could find was Abernathy (2015)&lt;sup id=&quot;fnref:1:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;, which quoted 1000–1500 messages per week. &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Actually that’s slightly wrong. This number is not the mean of messages-per-outcome, it’s the reciprocal of the mean of outcomes-per-message. When calculating expected utility, it makes more logical sense to put the benefit on the numerator and the cost on the denominator. But this produces a very small number that’s hard to read, so I inverted it.&lt;/p&gt;

      &lt;p&gt;If you calculate expected messages per outcome, the result is heavily penalized by the tail outcomes where changing the outcome ends up being much more expensive than expected. This produces an incorrect estimate of expected utility (the units of utility are outcomes-per-message, not messages-per-outcome). &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:10&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I debated whether the mean or median is the more relevant number here. Philosophically, the mean is what you care about. But it seems perverse that greater uncertainty about an intervention &lt;em&gt;increases&lt;/em&gt; how appealing it looks. Using the median instead of the mean is probably the wrong way to solve this problem, but it’s a first attempt. &lt;a href=&quot;#fnref:10&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:13&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I also spoke to people who had done political advocacy and &lt;em&gt;not&lt;/em&gt; messaging campaigns who claimed that lobbying is particularly effective. &lt;a href=&quot;#fnref:13&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:11&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;They hypothesized that perhaps spending by opposed interest groups cancel out, or that alliances between high-spending and low-spending interest groups create the illusion that spending doesn’t matter. &lt;a href=&quot;#fnref:11&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:12&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Camp, M. J., Schwam-Baird, M., &amp;amp; Zelizer, A. (2024). The Limits of Lobbying: Null Effects from Four Field Experiments in Two State Legislatures. &lt;a href=&quot;#fnref:12&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:14&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Under normal circumstances, I believe people over-rate personal experience, and I’m more inclined to trust the data. But in this case, most of the data comes from observational evidence which is easily confounded; I only cited one experimental study, and that study was small in scope—they only worked with three individual lobbyists. Given the weakness of the scientific evidence in this case, I don’t think it’s clearly more reliable than the contradictory anecdotes.&lt;/p&gt;

      &lt;p&gt;One limitation worth mentioning is that the experimental study tested the effect of lobbyists meeting policy-makers only one or two times. Conventional wisdom says that the value of lobbying mainly comes from establishing long-term relationships. &lt;a href=&quot;#fnref:14&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:15&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I didn’t attempt to estimate the cost-effectiveness of AI alignment research. It just seems true to me that $58 million (ish) to pass a bill is worth more than $58 million of alignment research, at least on the margin. (If nobody were doing alignment research, perhaps I’d answer differently.) &lt;a href=&quot;#fnref:15&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Things I Learned from College</title>
				<pubDate>Fri, 07 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/07/things_I_learned_from_college/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/07/things_I_learned_from_college/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;(that I still remember a decade later)&lt;/p&gt;

&lt;h2 id=&quot;evolution-on-earth&quot;&gt;Evolution on Earth&lt;/h2&gt;

&lt;p&gt;Fact 1: When foxes are bred to be more docile, their ears become floppy like dogs’ ears instead of pointy like wild foxes’.&lt;/p&gt;

&lt;p&gt;Fact 2: Crows can learn to use a short stick to fetch a longer stick to fetch food.&lt;/p&gt;

&lt;p&gt;The basic setup of the experiment is: There’s a box with some food at the bottom. The crow can’t reach the food. The crow has a short stick, but the stick isn’t long enough to reach the food, either.&lt;/p&gt;

&lt;p&gt;There’s also a &lt;em&gt;second&lt;/em&gt; box containing a &lt;em&gt;long&lt;/em&gt; stick. The short stick is long enough to reach the long stick. Most crows figure out that they can use the short stick to fetch the long stick and then use the long stick to fetch the food.&lt;/p&gt;

&lt;p&gt;If you add a third layer of indirection, where they have to use a short stick to fetch a medium stick and the medium stick to fetch a long stick and the long stick to fetch food, most crows don’t figure it out but a few of them do.&lt;/p&gt;

&lt;p&gt;I wrote a rap song about this experiment; it used to be on YouTube, but I think it’s gone now.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;physics-in-the-21st-century&quot;&gt;Physics in the 21st Century&lt;/h2&gt;

&lt;p&gt;Even before taking this class, I knew that there are four fundamental forces of the universe:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Gravity: Things go down. (Or, more accurately, all objects with mass pull toward each other.)&lt;/li&gt;
  &lt;li&gt;Electromagnetism: Many fundamental particles have a positive or negative charge. Oppositely-charged particles attract, and like charges repel; moving electrons creates electricity, and clusters of charged particles create magnetism (or something like that).&lt;/li&gt;
  &lt;li&gt;Strong force: Atomic nuclei are strongly held together even though the protons electromagnetically repel each other.&lt;/li&gt;
  &lt;li&gt;Weak force: ??? something about radioactive decay?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I had previously been confused about what the weak force was, and this class finally got me to understand it. But since then I have forgotten the explanation and I’m confused again. I wish I could remember how the weak force works.&lt;/p&gt;

&lt;h2 id=&quot;intro-computer-science&quot;&gt;Intro Computer Science&lt;/h2&gt;

&lt;p&gt;In first semester computer science, I had been programming for longer than almost any of my classmates (~6 years), and I was the guy my classmates came to for help with CS assignments. By second semester, everyone had caught up to me and my 6-year lead didn’t matter.&lt;/p&gt;

&lt;h2 id=&quot;linear-and-nonlinear-optimization&quot;&gt;Linear and Nonlinear Optimization&lt;/h2&gt;

&lt;p&gt;Duality: every convex optimization problem has a dual problem that, under mild conditions, attains the same optimal value.&lt;/p&gt;

&lt;p&gt;The dual problem that sticks in my mind: given a set of possible investments, maximizing expected return subject to a given standard deviation is equivalent to minimizing standard deviation subject to a given expected return.&lt;/p&gt;
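
&lt;p&gt;In symbols (my notation, not the course’s): with expected returns \(\mu\), covariance matrix \(\Sigma\), and portfolio weights \(w\), the pair is&lt;/p&gt;

&lt;p&gt;\[ \max_w \; \mu^\top w \quad \text{s.t.} \quad w^\top \Sigma w \le \sigma^2 \qquad \text{and} \qquad \min_w \; w^\top \Sigma w \quad \text{s.t.} \quad \mu^\top w \ge r, \]&lt;/p&gt;

&lt;p&gt;and the two problems trace out the same efficient frontier as \(\sigma\) and \(r\) vary.&lt;/p&gt;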

&lt;h2 id=&quot;philosophy-of-mind&quot;&gt;Philosophy of Mind&lt;/h2&gt;

&lt;p&gt;Philosophy papers routinely make arguments with glaringly obvious logical flaws, but they still get published somehow, and are considered important works worthy of teaching in a class.&lt;/p&gt;

&lt;p&gt;I’m a bit conflicted on philosophy as an institution. On the one hand, it’s really hard, and people come up with lots of brilliant stuff that I never would have thought of. On the other hand, many “seminal” papers contain obvious fundamental flaws. I don’t mean I disagree with their conclusions, I mean they make logical arguments that are clearly not logically valid. Like, they have the general structure of “A implies B, A, therefore C” and I’m like…how did you not notice that this makes no sense? and how did the reviewers not notice either? and how did the professor not notice when deciding to assign this paper as reading?&lt;/p&gt;

&lt;p&gt;At least that’s what I thought at the time. I don’t remember which papers we read, so I can’t go back and verify that they were indeed as flawed as I thought.&lt;/p&gt;

&lt;p&gt;I learned basically nothing on the object level about theory of mind or theory of identity. I still believe all the same things I believed before taking this class. I guess I learned various insane things that some philosophers believe, but I don’t remember what most of those things are.&lt;/p&gt;

&lt;p&gt;(“What Is It Like to Be a Bat?” is the best theory of mind paper I’ve ever read—it made me think about things in a new way—but I read it in high school, not college.)&lt;/p&gt;

&lt;h2 id=&quot;computer-networking&quot;&gt;Computer Networking&lt;/h2&gt;

&lt;p&gt;Fact 1: The web has four layers of transmission:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;physical&lt;/li&gt;
  &lt;li&gt;IP (send/receive raw packets of data)&lt;/li&gt;
  &lt;li&gt;TCP (manage the transmission of packets)&lt;/li&gt;
  &lt;li&gt;HTTP (tell what types of packets to send/receive)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Fact 2: The Internet is not the same thing as The Web. Internet = IP, Web = HTTP. google.com and facebook.com are on the Web. Email is Internet, but not Web (unless you’re checking your email in a web browser). Multiplayer video games are on the Internet, but not the Web. The Internet dates back to the 1970s, but the Web didn’t arrive until the early 1990s.&lt;/p&gt;

&lt;h2 id=&quot;intro-psychology&quot;&gt;Intro Psychology&lt;/h2&gt;

&lt;p&gt;Psych textbooks and your psych professor will uncritically repeat claims that were found in a single study that didn’t replicate.&lt;/p&gt;

&lt;h2 id=&quot;machine-learning&quot;&gt;Machine Learning&lt;/h2&gt;

&lt;p&gt;Four facts:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;You can solve a lot of problems by throwing a logistic regression at them.&lt;/li&gt;
  &lt;li&gt;I can vaguely explain what a support vector machine is.&lt;/li&gt;
  &lt;li&gt;I can vaguely explain what a convolutional neural network is.&lt;/li&gt;
  &lt;li&gt;I am not good at machine learning.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Sadly this is pretty much all I learned, even though I took four classes on machine learning. I remember various buzzwords like “softmax” and “one-hot” but I don’t remember what they mean.&lt;/p&gt;

&lt;p&gt;(I managed to get an A- in one of those classes but I still didn’t really learn anything.)&lt;/p&gt;

&lt;h2 id=&quot;linguistics&quot;&gt;Linguistics&lt;/h2&gt;

&lt;p&gt;Fact 1: Gricean maxims explain how statements can convey more information than they appear to.&lt;/p&gt;

&lt;p&gt;The four Gricean maxims of cooperative conversation are:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Maxim of Quality: statements are true.&lt;/li&gt;
  &lt;li&gt;Maxim of Quantity: statements are as informative as required, and no more.&lt;/li&gt;
  &lt;li&gt;Maxim of Relevance: statements are relevant.&lt;/li&gt;
  &lt;li&gt;Maxim of Manner: statements are clear and orderly.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If I say “I have three carrots”, that implies that I don’t have four carrots. If I had four carrots, the Maxim of Quantity would require that I say I have four carrots.&lt;/p&gt;

&lt;p&gt;If you ask “Can I eat that carrot?” and I respond by saying, “It’s not mine”, that implies that you cannot eat the carrot. By the Maxim of Relevance, my answer must be relevant to your question, so it can be taken to imply that, as the non-owner of the carrot, I do not have the authority to permit you to eat it.&lt;/p&gt;

&lt;p&gt;Fact 2: A Speech Act is when you perform an act merely by stating that you are performing it. For example, “I apologize.” A statement that includes “hereby” is probably a speech act.&lt;/p&gt;

&lt;p&gt;(&lt;a href=&quot;https://www.youtube.com/watch?v=C-m3RtoguAQ&amp;amp;t=63s&quot;&gt;“I declare bankruptcy!”&lt;/a&gt; is not a speech act.)&lt;/p&gt;

&lt;h2 id=&quot;statistics&quot;&gt;Statistics&lt;/h2&gt;

&lt;p&gt;I learned almost nothing from the two college statistics classes I took. Everything that I remember about statistics, I either learned on my own or learned from AP Statistics in high school—which was actually quite a useful class, maybe even the best class I took in high school!&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;My college statistics classes were mostly about the mechanics of how to hand-compute integrals of probability density functions, which I will never do in real life. Maybe I would’ve been better off taking a statistics-for-scientists class, but my major required me to take statistics-with-calculus, which is more about calculus than it is about statistics.&lt;/p&gt;

&lt;p&gt;I do remember one fun fact: Var[X] = E[X^2] - E[X]^2. I’ve used that one a few times.&lt;/p&gt;
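
&lt;p&gt;It’s one line of algebra from the definition of variance:&lt;/p&gt;

&lt;p&gt;\[ \mathrm{Var}[X] = E[(X - E[X])^2] = E[X^2] - 2\,E[X]\,E[X] + E[X]^2 = E[X^2] - E[X]^2. \]&lt;/p&gt;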

&lt;h2 id=&quot;algorithms&quot;&gt;Algorithms&lt;/h2&gt;

&lt;p&gt;How to implement breadth-first search.&lt;/p&gt;

&lt;p&gt;I had written graph algorithms in high school a few times, and I always used depth-first search because it was intuitive to me. I never knew how to implement breadth-first search (I’m not sure I even knew it existed) until I learned the algorithm in my algorithms class.&lt;/p&gt;
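
&lt;p&gt;For reference, the textbook queue-based version is only a few lines (a sketch in Python):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;from collections import deque

def bfs(graph, start):
    # graph: dict mapping each node to an iterable of neighbors
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()  # FIFO: this line is what makes it breadth-first
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Swap the queue for a stack (&lt;code&gt;pop()&lt;/code&gt; instead of &lt;code&gt;popleft()&lt;/code&gt;) and you get a depth-first traversal, which is why the two are so similar to implement.&lt;/p&gt;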

&lt;h2 id=&quot;improv&quot;&gt;Improv&lt;/h2&gt;

&lt;p&gt;Fact 1: Peak age for improv skill is older than peak age for most skills. Improv actors don’t peak until their 40s or 50s.&lt;/p&gt;

&lt;p&gt;Fact 2: Reincorporation—end your story by bringing back a story element from earlier that the audience probably forgot about.&lt;/p&gt;

&lt;p&gt;Reincorporation is the secret to giving a comedic story a satisfying ending.&lt;/p&gt;

&lt;p&gt;Now that I know about this concept, I see it show up a lot in comedy. &lt;em&gt;Curb Your Enthusiasm&lt;/em&gt; is an excellent illustration of reincorporation, where most episodes weave three or four unrelated plot threads that all somehow come together at the end. Arguably the greatest reincorporation of all time is the &lt;em&gt;Seinfeld&lt;/em&gt; episode “The Marine Biologist”.&lt;/p&gt;

&lt;p&gt;Science has yet to discover whether crows can understand reincorporation.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;My teacher spent a REALLY long time making sure everyone understood the correct definition of a p-value. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Cash Back</title>
				<pubDate>Thu, 06 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/06/cash_back/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/06/cash_back/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;When I was 18, my dad took me to the bank to get my first credit card. I had a conversation with the bank teller that went something like this:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Bank teller: This card gives 1% cash back.&lt;/p&gt;

  &lt;p&gt;Me: What does that mean?&lt;/p&gt;

  &lt;p&gt;Bank teller: It means when you spend money with the card, you get 1% cash back.&lt;/p&gt;

  &lt;p&gt;Me: But what does cash back mean, though?&lt;/p&gt;

  &lt;p&gt;Bank teller: It means you get cash back.&lt;/p&gt;

  &lt;p&gt;Me: …&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The bank teller communicated poorly, and I also did a poor job of articulating which part I was confused about. If I were that bank teller, here is what I would say to my 18-year-old self:&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;I understand you to be asking two questions.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;As you understand it, “cash” means “paper money”. And you are wondering how it is logistically possible for the bank to give you paper money when you use your credit card. Is a courier going to run to the store you’re at and deliver the cash? Surely that’s ridiculous?&lt;/li&gt;
  &lt;li&gt;This deal makes it sound like the bank is giving you free money. Why would they give you free money? How is that profitable for them?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The answer to the first question is that no, they are not going to give you paper money. They are going to deposit the 1% cash back into your bank account.&lt;/p&gt;

&lt;p&gt;The answer to the second question: it is indeed profitable for them to pay you 1% of the value of every purchase you make. The reason is that the credit card company charges a fee to businesses (typically around 3%) whenever you buy something at that business. Most companies eat this cost by charging the same price to both cash and credit card users, which means effectively you get a discount by paying with a credit card. Businesses are willing to do this because they can attract more customers if they accept credit cards.&lt;/p&gt;

&lt;p&gt;Credit card companies then take a portion of that ~3% fee and give some of it back to you as “cash back”. (Some cards also give you perks, like discounts on airline tickets.) You might ask, instead of giving you 1% cash back, why don’t they just make the prices be 1% lower? Part of the answer is that it’s a dumb psychological trick to make people think they’re getting a better deal. Another part may be logistical: prices are set by businesses, not by credit card companies. Businesses pay the credit card company the same rate no matter who you are, but the credit card company can give better benefits to people who are a lower credit risk.&lt;/p&gt;

&lt;p&gt;But it is genuinely 1% cheaper to buy things with a credit card than with cash, assuming you pay off your card balance each month before accruing any interest.&lt;/p&gt;
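&lt;p&gt;To make the money flows concrete, here’s a toy breakdown of a $100 purchase—the 3% merchant fee and 1% cash back are illustrative assumptions, not any particular card’s terms:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Toy breakdown of a $100 credit card purchase.
# The 3% merchant fee and 1% cash back rates are illustrative assumptions.
price = 100.00
merchant_fee = 0.03 * price  # paid by the business to the card network
cash_back = 0.01 * price     # deposited into your bank account

print(f"Business receives:   ${price - merchant_fee:.2f}")  # $97.00
print(f"Card company keeps:  ${merchant_fee - cash_back:.2f}")  # $2.00
print(f"Your effective cost: ${price - cash_back:.2f}")  # $99.00
&lt;/code&gt;&lt;/pre&gt;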

&lt;p&gt;Also, 1% cash back isn’t even a good perk. You can get 2% cash back with the Citi Double Cash card. I’m guessing your credit rating isn’t good enough for that card yet, but you’re gonna apply for it in a few years. There are also lots of other cards with fancy perks, but you’re not gonna care about those perks so you should just go for the 2% cash back.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>How Can I Not Know Whether I'm Having a Good Experience?</title>
				<pubDate>Wed, 05 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/05/how_can_I_not_know_whether_I'm_having_a_good_experience/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/05/how_can_I_not_know_whether_I'm_having_a_good_experience/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;I’m playing Elden Ring. I’m fighting a difficult boss&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;, and I’m getting kind of frustrated. I die again. I’m thinking about whether I want to keep playing. I don’t know. Am I having a good time? I can’t tell. How is it that I can’t tell?&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;Fundamentally, good experiences are good, and bad experiences are bad. But what if I don’t know whether I’m having a good experience? How is that possible? A good experience is good because it’s good &lt;em&gt;for me&lt;/em&gt;. An experience lives inside me. But when I point my internal gaze directly at my experience, I can’t tell whether it’s good or bad. That seems impossible.&lt;/p&gt;

&lt;p&gt;I’m not entirely sure what’s going on here, but I think an important component is that I’m not examining my experience &lt;em&gt;during&lt;/em&gt; the game; I’m examining my experience while &lt;em&gt;not playing&lt;/em&gt; the game. When I die to a boss and I spend a moment introspecting while I wait for the game to reload, I’m not fighting the boss at that moment. I’m looking at a loading screen while feeling frustrated.&lt;/p&gt;

&lt;p&gt;In that moment, the only question I can answer is, “Am I having a good experience while staring at this loading screen?” If that’s the question, the answer is a pretty clear “no”. I don’t want to be staring at the loading screen while feeling frustrated. But that’s not the same as the question of whether I &lt;em&gt;will&lt;/em&gt; be having a good time if I start the game again.&lt;/p&gt;

&lt;p&gt;A second problem is that I am experiencing multiple things at the same time. I feel some frustration at my failure.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; But I also feel some excitement. I feel some motivation to make progress. I know that I feel all those things; I know exactly what my frustration feels like, because I’m feeling it. The hard part is &lt;em&gt;weighing them against each other&lt;/em&gt;. Is the combination of excitement and motivation enough to outweigh the frustration? Unlike each feeling on its own, the combined weighted experience is not a raw feeling in my gut, so there’s no principle that says I must be able to intuitively evaluate it.&lt;/p&gt;

&lt;p&gt;And there is a third, even deeper problem: reflecting on an experience changes the experience.&lt;/p&gt;

&lt;p&gt;During the boss fight, I can take a second to think about whether I’m having fun. But in that second, I’m not focused on the boss fight; I’m focused on introspecting on my experience. How I feel while I’m introspecting is not the same as how I feel while I’m engrossed in the game. Fundamentally, it is impossible for me to check how I feel while I’m engrossed, because then I wouldn’t be engrossed. (I am not the first person to make this observation, although perhaps I’m the first to apply it to Elden Ring boss fights.)&lt;/p&gt;

&lt;p&gt;I have noticed that I’m more likely to be confused about my own experience if I’m tired. When I’m alert, most of the time I have no trouble knowing whether I want to keep doing what I’m doing, or do something else. But when I’m fatigued, I have a harder time feeling out which direction my motivations are pointing. Does that say something about how introspection works? It suggests to me that the process of aggregating and weighting the different aspects of my experience is a cognition-heavy operation.&lt;/p&gt;

&lt;p&gt;To review, there are (at least) three reasons why I can’t tell whether my experience is good:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The experience I’m having at this moment is not the experience I want to introspect on.&lt;/li&gt;
  &lt;li&gt;I’m having multiple experiences simultaneously, and aggregating them is not a primitive operation that my brain can perform.&lt;/li&gt;
  &lt;li&gt;Introspecting causes my experience to change.&lt;/li&gt;
&lt;/ol&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Dragonlord Placidusax, let’s say.&lt;/p&gt;

      &lt;p&gt;Malenia is harder, but for some reason I never got frustrated while fighting Malenia. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Really, the thing I find frustrating isn’t failure, but &lt;em&gt;lack of progress&lt;/em&gt;. If I die three times and do worse every time, I probably won’t feel great about that. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Do Small Protests Work?</title>
				<pubDate>Tue, 04 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/04/do_small_protests_work/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/04/do_small_protests_work/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;strong&gt;TLDR:&lt;/strong&gt; The available evidence is weak. It looks like small protests may be effective at garnering support among the general public. Policy-makers appear to be more sensitive to protest size, and it’s not clear whether small protests have a positive or negative effect on their perception.&lt;/p&gt;

&lt;p&gt;Previously, I &lt;a href=&quot;https://mdickens.me/2025/04/18/protest_outcomes_critical_review/&quot;&gt;reviewed&lt;/a&gt; evidence from natural experiments and concluded that protests work (credence: 90%).&lt;/p&gt;

&lt;p&gt;My biggest outstanding concern is that all the protests I reviewed were nationwide, whereas the causes I care most about (AI safety, animal welfare) can only put together small protests. Based on the evidence, I’m pretty confident that large protests work. But what about small ones?&lt;/p&gt;

&lt;p&gt;I can see arguments in both directions.&lt;/p&gt;

&lt;p&gt;On the one hand, people are &lt;a href=&quot;https://en.wikipedia.org/wiki/Scope_neglect&quot;&gt;scope insensitive&lt;/a&gt;. I’m pretty sure that a 20,000-person protest is much less than twice as impactful as a 10,000-person protest. And this principle may extend down to protests that only include 10–20 people.&lt;/p&gt;

&lt;p&gt;On the other hand, a large protest and a small protest may send different messages. People might see a small protest and think, “Why aren’t there more people here? This cause must not be very important.” So even if large protests work, it’s conceivable that small protests could backfire.&lt;/p&gt;

&lt;p&gt;What does the scientific literature say about which of those ideas is correct?&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#evidence-from-nationwide-natural-experiments&quot; id=&quot;markdown-toc-evidence-from-nationwide-natural-experiments&quot;&gt;Evidence from nationwide natural experiments&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#direct-evidence-from-lab-experiments&quot; id=&quot;markdown-toc-direct-evidence-from-lab-experiments&quot;&gt;Direct evidence from lab experiments&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#indirect-evidence-from-lab-experiments&quot; id=&quot;markdown-toc-indirect-evidence-from-lab-experiments&quot;&gt;Indirect evidence from lab experiments&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#non-experimental-evidence&quot; id=&quot;markdown-toc-non-experimental-evidence&quot;&gt;Non-experimental evidence&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#future-work&quot; id=&quot;markdown-toc-future-work&quot;&gt;Future work&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#appendix-table-of-papers-from-orazani-et-al-2021&quot; id=&quot;markdown-toc-appendix-table-of-papers-from-orazani-et-al-2021&quot;&gt;Appendix: Table of papers from Orazani et al. (2021)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;evidence-from-nationwide-natural-experiments&quot;&gt;Evidence from nationwide natural experiments&lt;/h2&gt;

&lt;p&gt;Among the studies in my prior &lt;a href=&quot;https://mdickens.me/2025/04/18/protest_outcomes_critical_review/&quot;&gt;lit review&lt;/a&gt;, two studies modeled how voter outcomes varied based on the number of protesters in each county. The two studies &lt;a href=&quot;https://mdickens.me/2025/04/18/protest_outcomes_critical_review/#meta-analysis&quot;&gt;found&lt;/a&gt; that each marginal protester increased vote share by 18.81 and 9.62 votes respectively (where vote share = number of votes adjusted to account for voter turnout&lt;sup id=&quot;fnref:15&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:15&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;).&lt;/p&gt;

&lt;p&gt;Unfortunately, these studies both used linear models, which doesn’t help us. We want to know if there’s a &lt;em&gt;non&lt;/em&gt;-linearity near zero—something that looks like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/protest-nonlinear.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;A linear model can’t tell you what the shape of the curve looks like, or whether it dips into the negative for sufficiently small protests.&lt;/p&gt;

&lt;p&gt;(In theory, I could analyze the raw data myself, but that would be a lot of work.)&lt;/p&gt;
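&lt;p&gt;For concreteness, here’s the kind of analysis I have in mind, sketched on made-up data: a straight line forces the same marginal effect everywhere, while even a low-degree polynomial can reveal a dip near zero.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

# Made-up county-level data: protest size vs. change in vote share.
# The hypothetical true effect is negative for very small protests.
rng = np.random.default_rng(0)
protesters = rng.uniform(0, 2000, size=300)
true_effect = 0.01 * protesters - 40 * np.exp(-protesters / 100)
votes = true_effect + rng.normal(0, 5, size=300)

linear = np.polyfit(protesters, votes, deg=1)  # one slope everywhere
cubic = np.polyfit(protesters, votes, deg=3)   # can bend near zero

# Predicted effect of a 10-person protest under each model:
print(np.polyval(linear, 10), np.polyval(cubic, 10))
&lt;/code&gt;&lt;/pre&gt;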

&lt;h2 id=&quot;direct-evidence-from-lab-experiments&quot;&gt;Direct evidence from lab experiments&lt;/h2&gt;

&lt;p&gt;To my knowledge, there are two experiments that directly tested whether the size of a protest affected people’s support for a cause.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/materials/Demonstrating Power.pdf&quot;&gt;Wouters &amp;amp; Walgrave (2017)&lt;/a&gt;&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; showed (fictitious) news articles to Belgian legislators. The news articles said either “There were about 500 participants which was much less than expected”, or “There were more than 5,000 participants which was more than expected.”&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; The authors also altered three other independent variables, which they called Worthiness, Unity, and Commitment. Then they asked participants questions to judge how much they agreed with protesters (“position”) and whether they intended to take any actions to support protesters (“action”). Below I present the resulting regression coefficients and p-values.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;position&lt;/th&gt;
      &lt;th&gt;p-val&lt;/th&gt;
      &lt;th&gt;action&lt;/th&gt;
      &lt;th&gt;p-val&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;numbers&lt;/td&gt;
      &lt;td&gt;0.282&lt;/td&gt;
      &lt;td&gt;0.008&lt;/td&gt;
      &lt;td&gt;0.439&lt;/td&gt;
      &lt;td&gt;0.000&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;worthiness&lt;/td&gt;
      &lt;td&gt;0.381&lt;/td&gt;
      &lt;td&gt;0.000&lt;/td&gt;
      &lt;td&gt;0.116&lt;/td&gt;
      &lt;td&gt;0.297&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;unity&lt;/td&gt;
      &lt;td&gt;0.353&lt;/td&gt;
      &lt;td&gt;0.001&lt;/td&gt;
      &lt;td&gt;0.350&lt;/td&gt;
      &lt;td&gt;0.002&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;commitment&lt;/td&gt;
      &lt;td&gt;0.190&lt;/td&gt;
      &lt;td&gt;0.300&lt;/td&gt;
      &lt;td&gt;0.156&lt;/td&gt;
      &lt;td&gt;0.161&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Numbers had among the largest coefficients and the smallest p-values of the four variables, which suggests that legislators care a lot about the size of a protest. However, this study did not include a control group, so we don’t know whether smaller protests had a positive effect, a negative effect, or no effect.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/materials/Persuasive Power of Protest (Wouters 2019).pdf&quot;&gt;Wouters (2019)&lt;/a&gt;&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; conducted two similar studies, this time interviewing members of the general public rather than legislators. They described protest sizes the same way as Wouters &amp;amp; Walgrave (2017) (“500, less than expected” vs. “5,000, more than expected”), and again used four independent variables. Below are the regression coefficients and p-values from the two different studies, where the dependent variable was participants’ support for the cause.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;study 1&lt;/th&gt;
      &lt;th&gt;p-val&lt;/th&gt;
      &lt;th&gt;study 2&lt;/th&gt;
      &lt;th&gt;p-val&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;numbers&lt;/td&gt;
      &lt;td&gt;0.094&lt;/td&gt;
      &lt;td&gt;0.071&lt;/td&gt;
      &lt;td&gt;0.063&lt;/td&gt;
      &lt;td&gt;0.291&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;diversity&lt;/td&gt;
      &lt;td&gt;0.168&lt;/td&gt;
      &lt;td&gt;0.001&lt;/td&gt;
      &lt;td&gt;0.131&lt;/td&gt;
      &lt;td&gt;0.029&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;worthiness&lt;/td&gt;
      &lt;td&gt;0.607&lt;/td&gt;
      &lt;td&gt;0.000&lt;/td&gt;
      &lt;td&gt;1.127&lt;/td&gt;
      &lt;td&gt;0.000&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;unity&lt;/td&gt;
      &lt;td&gt;0.201&lt;/td&gt;
      &lt;td&gt;0.000&lt;/td&gt;
      &lt;td&gt;0.126&lt;/td&gt;
      &lt;td&gt;0.034&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;In this case, we find that numbers matter less than the other three factors.&lt;/p&gt;

&lt;p&gt;Taken together, these two papers suggest:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Legislators care a lot about protest size. The general public maybe cares a bit, but not much.&lt;/li&gt;
  &lt;li&gt;Even (comparatively) small protests are effective at garnering support from the general public. It’s not clear whether they are effective for legislators.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;(If small protests turned off the general public, then the “numbers” variable would have strong predictive power, but it doesn’t.)&lt;/p&gt;

&lt;h2 id=&quot;indirect-evidence-from-lab-experiments&quot;&gt;Indirect evidence from lab experiments&lt;/h2&gt;

&lt;p&gt;Wouters &amp;amp; Walgrave (2017)&lt;sup id=&quot;fnref:1:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; and Wouters (2019)&lt;sup id=&quot;fnref:2:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; were the only two papers I could find that directly tested the effect of protest size. But there could also be indirect evidence. I’m imagining something like this:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;A lab experiment showed people either a news article about a protest, or a “control” news article. People who read about the protest were [more/less] supportive of the protesters’ cause.&lt;/li&gt;
  &lt;li&gt;The protest described in the article happened to be small.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That result would provide evidence about how small protests influence people.&lt;/p&gt;

&lt;p&gt;To see if there was something like that, I looked through the studies cited by a meta-analysis by &lt;a href=&quot;https://mdickens.me/materials/Protest%20Meta-Analysis.pdf&quot;&gt;Orazani et al. (2021)&lt;/a&gt;&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;. I found two relevant papers: &lt;a href=&quot;/materials/thomas2013.pdf&quot;&gt;Thomas &amp;amp; Louis (2013)&lt;/a&gt;&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; and &lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2911177&quot;&gt;Feinberg et al. (2017)&lt;/a&gt;&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;. (See &lt;a href=&quot;#appendix-table-of-papers-from-orazani-et-al-2021&quot;&gt;Appendix&lt;/a&gt; for a list of every paper.)&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/materials/thomas2013.pdf&quot;&gt;Thomas &amp;amp; Louis (2013)&lt;/a&gt;&lt;sup id=&quot;fnref:4:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; did two experiments comparing participants’ reactions to news articles about violent vs. nonviolent protests. The two experiments were more or less the same, except that Experiment 1 covered &lt;a href=&quot;https://en.wikipedia.org/wiki/Fracking&quot;&gt;fracking&lt;/a&gt; protests and Experiment 2 was about anti-whaling activism. Unfortunately the contents of the news articles are not publicly available and the corresponding author did not reply to my inquiry, so I don’t know how the protests were described in terms of size.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2911177&quot;&gt;Feinberg et al. (2017)&lt;/a&gt;&lt;sup id=&quot;fnref:5:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt; included three studies. Each study presented participants with an article or video about a different protest.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Study 1: The articles described a fictitious animal rights group. Protest size was reported in the articles as “about thirty people”.&lt;/li&gt;
  &lt;li&gt;Study 2: The articles described a Black Lives Matter march. The number of protesters was not specified in the articles.&lt;/li&gt;
  &lt;li&gt;Study 3: Participants were shown videos of Trump protests. One video showed a protest with roughly 70 participants&lt;sup id=&quot;fnref:14&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt; but I don’t know how many protesters were in the other video.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In study 1, study participants reported relatively high support for protesters in the “Moderate” condition, in which a fictional animal rights group picketed a cosmetics company. (In the “Extreme” condition, the protesters broke into the building and freed animals.) However, there was no control group (!!), so we don’t know if reading about the protest caused support to go up, or if support would’ve been high anyway. The protest was described as having only thirty people, so this would’ve been useful evidence if they’d included a control group, but they didn’t.&lt;/p&gt;

&lt;p&gt;One thing we can say about study 1 is that &lt;em&gt;if&lt;/em&gt; small protests reduce support, then they don’t reduce support by as much as “extreme” protests do.&lt;/p&gt;

&lt;p&gt;There is one additional paper, &lt;a href=&quot;https://doi.org/10.1177/2378023120925949&quot;&gt;Bugden (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:9&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt;, that was not included in the Orazani et al. meta-analysis.&lt;sup id=&quot;fnref:10&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt; It showed participants articles in four conditions: a control,&lt;sup id=&quot;fnref:11&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt; peaceful protest, disruptive protest, and violent protest. The peaceful protest article (found in the &lt;a href=&quot;https://journals.sagepub.com/doi/suppl/10.1177/2378023120925949/suppl_file/online_supplementary_materials_socius.docx&quot;&gt;supplement document&lt;/a&gt;) opened with:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;On Thursday, thousands of protestors took to the streets as the state legislature prepares to vote on a climate change bill.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This protest was not small, so Bugden (2020) doesn’t provide relevant evidence.&lt;/p&gt;

&lt;h2 id=&quot;non-experimental-evidence&quot;&gt;Non-experimental evidence&lt;/h2&gt;

&lt;p&gt;An observational study by &lt;a href=&quot;https://doi.org/10.1038/s41893-024-01444-1&quot;&gt;Ostarek et al. (2024)&lt;/a&gt;&lt;sup id=&quot;fnref:12&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt; studied the effect of a disruptive protest by the climate group Just Stop Oil in which protesters blockaded a highway. By running polls before and after, the researchers found that support for a more &lt;em&gt;moderate&lt;/em&gt; climate group, Friends of the Earth, increased just after the Just Stop Oil protest.&lt;/p&gt;

&lt;p&gt;The protest consisted of 45 people (&lt;a href=&quot;https://www.independent.co.uk/news/uk/crime/roger-hallam-m25-just-stop-oil-court-of-appeal-police-b2582094.html&quot;&gt;source&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;At first glance, this appears to indicate that small protests can be effective. But I’m not sure that’s an appropriate interpretation of the evidence, because:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;It was an observational study, not an experiment or even a natural experiment.&lt;sup id=&quot;fnref:13&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:13&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Other studies have found negative effects for disruptive protests.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;None of the evidence I found was very good.&lt;/p&gt;

&lt;p&gt;Here are my takeaways, but given the state of the evidence, I don’t have much confidence in them.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Large protests work better than small protests at garnering support among policy-makers. (credence: 80%)&lt;/li&gt;
  &lt;li&gt;The general public probably doesn’t greatly care about the size of a protest. (credence: 60%)&lt;/li&gt;
  &lt;li&gt;Small protests can probably be effective at garnering support among the general public. (credence: 60%)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Do small protests persuade the general public? Looks like yes (but, at the risk of repeating myself, the evidence was not strong).&lt;/p&gt;

&lt;p&gt;Do small protests persuade policy-makers? I couldn’t find any evidence either way. (But the fact that I couldn’t find anything is weak evidence against.)&lt;/p&gt;

&lt;h2 id=&quot;future-work&quot;&gt;Future work&lt;/h2&gt;

&lt;p&gt;I see two obvious ways to learn more about how well small protests work. They’re out of scope for this post, but they wouldn’t be too hard.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Analyze the data collected in &lt;a href=&quot;https://mdickens.me/2025/04/18/protest_outcomes_critical_review/&quot;&gt;natural experiments&lt;/a&gt; and use a non-linear model to assess the effectiveness of small protests.&lt;/li&gt;
  &lt;li&gt;Run a new survey (on Mechanical Turk or similar) showing people small protests vs. large protests vs. no protests, and then ask them about their opinions on the protesters’ cause.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;appendix-table-of-papers-from-orazani-et-al-2021&quot;&gt;Appendix: Table of papers from Orazani et al. (2021)&lt;/h2&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Paper&lt;/th&gt;
      &lt;th&gt;Status&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Thomas &amp;amp; Louis (2013)&lt;/td&gt;
      &lt;td&gt;included useful information&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Orazani &amp;amp; Leidner (2018)&lt;/td&gt;
      &lt;td&gt;I couldn’t find the full text&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Becker et al. (2011)&lt;/td&gt;
      &lt;td&gt;dependent variable was not relevant&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Feinberg et al. (2017)&lt;/td&gt;
      &lt;td&gt;included useful information&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Gutting (2017)&lt;/td&gt;
      &lt;td&gt;dependent variable was not relevant&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Leggett (2010)&lt;/td&gt;
      &lt;td&gt;unpublished&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Shuman et al.&lt;/td&gt;
      &lt;td&gt;unpublished&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:15&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Mathematically, vote share per protester equals raw votes per protester divided by the proportion of residents who voted. &lt;a href=&quot;#fnref:15&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Wouters, R., &amp;amp; Walgrave, S. (2017). &lt;a href=&quot;https://doi.org/10.1177/0003122417690325&quot;&gt;Demonstrating Power.&lt;/a&gt; &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:1:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I suspect that the phrases “less/more than expected” have a bigger effect on people’s perception than the numbers themselves. But this hypothesis hasn’t been tested. Some evidence for my hypothesis is that &lt;a href=&quot;https://doi.org/10.1093/qje/qjt021&quot;&gt;Madestam et al. (2013)&lt;/a&gt;&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;14&lt;/a&gt;&lt;/sup&gt; (which I &lt;a href=&quot;https://mdickens.me/2025/04/18/protest_outcomes_critical_review/#madestam-et-al-2013-on-tea-party-protests&quot;&gt;reviewed previously&lt;/a&gt;) found a strong county-level effect of protests, and the average protest size was 815 people per county, which is much closer to the “small” condition (500 people) than the “large” condition (5,000). So my guess is that 500 only sounds small because the article presented it as “less than expected”. However, the protests studied in Madestam et al. (2013) might differ in other meaningful ways. &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Wouters, R. (2019). &lt;a href=&quot;https://doi.org/10.1093/sf/soy110&quot;&gt;The Persuasive Power of Protest. How Protest wins Public Support.&lt;/a&gt; &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:2:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Orazani, N., Tabri, N., Wohl, M. J. A., &amp;amp; Leidner, B. (2021). &lt;a href=&quot;https://doi.org/10.1002/ejsp.2722&quot;&gt;Social movement strategy (nonviolent vs. violent) and the garnering of third‐party support: A meta‐analysis.&lt;/a&gt; &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Thomas, E. F., &amp;amp; Louis, W. R. (2013). &lt;a href=&quot;https://doi.org/10.1177/0146167213510525&quot;&gt;When Will Collective Action Be Effective? Violent and Non-Violent Protests Differentially Influence Perceptions of Legitimacy and Efficacy Among Sympathizers.&lt;/a&gt; &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:4:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Feinberg, M., Willer, R., &amp;amp; Kovacheff, C. (2017). &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.2911177&quot;&gt;Extreme Protest Tactics Reduce Popular Support for Social Movements.&lt;/a&gt; &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:5:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:14&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Source: I watched the video and counted how many people I could see. Some of the people were clearly bystanders, not protesters, but others were ambiguous so I’m not sure about the exact count. &lt;a href=&quot;#fnref:14&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:9&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Bugden, D. (2020). &lt;a href=&quot;https://doi.org/10.1177/2378023120925949&quot;&gt;Does Climate Protest Work? Partisanship, Protest, and Sentiment Pools.&lt;/a&gt; &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:10&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Even though Orazani et al. was published in 2021, its literature review was conducted in 2018. &lt;a href=&quot;#fnref:10&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:11&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The control condition simply noted that protests exist without describing them at all, and asked participants if they supported the protesters’ cause. &lt;a href=&quot;#fnref:11&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:12&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Ostarek, M., Simpson, B., Rogers, C., &amp;amp; Ozden, J. (2024). &lt;a href=&quot;https://doi.org/10.1038/s41893-024-01444-1&quot;&gt;Radical climate protests linked to increases in public support for moderate organizations.&lt;/a&gt;&lt;/p&gt;

      &lt;p&gt;See also a less-technical 2022 preprint at &lt;a href=&quot;https://www.socialchangelab.org/_files/ugd/503ba4_a184ae5bbce24c228d07eda25566dc13.pdf&quot;&gt;https://www.socialchangelab.org/_files/ugd/503ba4_a184ae5bbce24c228d07eda25566dc13.pdf&lt;/a&gt;. &lt;a href=&quot;#fnref:12&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:13&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I don’t see how the change wouldn’t be causal, but that could just be a failure of imagination on my part. &lt;a href=&quot;#fnref:13&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Madestam, A., Shoag, D., Veuger, S., &amp;amp; Yanagizawa-Drott, D. (2013). &lt;a href=&quot;https://doi.org/10.1093/qje/qjt021&quot;&gt;Do Political Protests Matter? Evidence from the Tea Party Movement*.&lt;/a&gt; &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>My Third Caffeine Self-Experiment</title>
				<pubDate>Mon, 03 Nov 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/11/03/third_caffeine_self-experiment/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/03/third_caffeine_self-experiment/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Last year I did a &lt;a href=&quot;https://mdickens.me/2024/04/11/caffeine_self_experiment/&quot;&gt;caffeine cycling self-experiment&lt;/a&gt; and I determined that I don’t get habituated to caffeine when I drink coffee three days a week. I did a &lt;a href=&quot;https://mdickens.me/2024/06/24/continuing_caffeine_self_experiment/&quot;&gt;follow-up experiment&lt;/a&gt; where I upgraded to &lt;em&gt;four&lt;/em&gt; days a week (Mon/Wed/Fri/Sat) and I found that I &lt;em&gt;still&lt;/em&gt; don’t get habituated.&lt;/p&gt;

&lt;p&gt;For my current weekly routine, I have caffeine on Monday, Wednesday, Friday, and Saturday. Subjectively, I often feel low-energy on Saturdays. Is that because the caffeine I took on Friday is having an aftereffect that makes me more tired on Saturday?&lt;/p&gt;

&lt;p&gt;When I ran my second experiment, I took caffeine four days, including the three-day stretch of Wednesday-Thursday-Friday. I found that my performance on a reaction time test was comparable between Wednesday and Friday. If my reaction time stayed the same after taking caffeine three days in a row, that’s evidence that I didn’t develop a tolerance over the course of those three days.&lt;/p&gt;

&lt;p&gt;But if three days isn’t long enough for me to develop a tolerance, why is it that lately I feel tired on Saturdays, after taking caffeine for only two days in a row? Was the result from my last experiment incorrect?&lt;/p&gt;

&lt;p&gt;So I decided to do another experiment to get more data.&lt;/p&gt;

&lt;p&gt;This time I did a new six-week self-experiment where I kept my current routine, but I tested my reaction time every day. I wanted to test two hypotheses:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Is my post-caffeine reaction time worse on Saturday than on Mon/Wed/Fri?&lt;/li&gt;
  &lt;li&gt;Is my reaction time worse on the morning after a caffeine day than on the morning after a caffeine-free day?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The first hypothesis tests whether I become habituated to caffeine, and the second hypothesis tests whether I experience withdrawal symptoms the following morning.&lt;/p&gt;

&lt;p&gt;The answers I got were:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;No, there’s no detectable difference.&lt;/li&gt;
  &lt;li&gt;No, there’s no detectable difference.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Therefore, in defiance of my subjective experience—but in agreement with my earlier experimental results—I do not become detectably habituated to caffeine on the second day.&lt;/p&gt;

&lt;p&gt;However, it’s possible that caffeine habituation affects my &lt;em&gt;fatigue&lt;/em&gt; even though it doesn’t affect my &lt;em&gt;reaction time&lt;/em&gt;. So it’s hard to say for sure what’s going on without running more tests (which I may do at some point).&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#experimental-procedure&quot; id=&quot;markdown-toc-experimental-procedure&quot;&gt;Experimental procedure&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#results&quot; id=&quot;markdown-toc-results&quot;&gt;Results&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#alternative-experimental-procedures-that-im-not-going-to-do&quot; id=&quot;markdown-toc-alternative-experimental-procedures-that-im-not-going-to-do&quot;&gt;Alternative experimental procedures that I’m not going to do&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#a-story-about-how-i-thought-my-experiment-failed-but-actually-i-was-just-being-stupid&quot; id=&quot;markdown-toc-a-story-about-how-i-thought-my-experiment-failed-but-actually-i-was-just-being-stupid&quot;&gt;A story about how I thought my experiment failed, but actually I was just being stupid&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;experimental-procedure&quot;&gt;Experimental procedure&lt;/h2&gt;

&lt;p&gt;As &lt;a href=&quot;https://mdickens.me/2024/03/02/caffeine_tolerance/#appendix-b-pre-registration-for-a-caffeine-self-experiment&quot;&gt;with my previous experiments&lt;/a&gt;, I took a reaction time test every morning before caffeine, as well as an hour after caffeine on days when I took it (Mon/Wed/Fri/Sat). I ran the test for six weeks.&lt;/p&gt;

&lt;p&gt;This experiment had the same flaws as my previous experiments, e.g., I did not blind myself because blinding myself is annoying and I didn’t feel like doing it.&lt;/p&gt;

&lt;p&gt;In my first two experiments, I was meticulous about controlling the conditions on my computer during the reaction time test. I always tested using the &lt;a href=&quot;https://humanbenchmark.com/tests/reactiontime&quot;&gt;humanbenchmark.com test&lt;/a&gt; in Chrome with a single browser window open. I normally use Firefox, but I tested in a different browser to be sure that my 100+ open Firefox tabs wouldn’t interfere with the test in any way (perhaps background tasks could slow down the JavaScript code that runs the reaction time app, which could artificially inflate my reaction time). I tested without any other applications open on my computer except for Emacs and a terminal window (which I always have open).&lt;/p&gt;

&lt;p&gt;For my most recent experiment, I wasn’t so meticulous about it because I wanted to be lazy and I figured it probably didn’t matter. I still did the reaction time test in Chrome, but I didn’t close Firefox or other applications during the test.&lt;/p&gt;

&lt;h2 id=&quot;results&quot;&gt;Results&lt;/h2&gt;

&lt;p&gt;First, I tested to see if caffeine even made a visible difference in reaction time. &lt;a href=&quot;https://mdickens.me/2024/04/11/caffeine_self_experiment/&quot;&gt;Last time&lt;/a&gt;, caffeine had a strong and readily apparent effect on my reaction time. My third experiment replicated this result:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;caffeine vs. no-caffeine:
    298.0 ms vs. 303.5 ms
    t-stat = -2.9, p-value = 0.006
&lt;/code&gt;&lt;/pre&gt;
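
&lt;p&gt;(For anyone who wants to replicate this kind of comparison: it’s just a two-sample t-test. Here’s a minimal sketch with SciPy, using made-up numbers rather than my real data:)&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from scipy import stats

# Illustrative per-session average reaction times (ms), not my real data.
caffeine = [294, 301, 297, 299, 296, 300, 298, 295]
no_caffeine = [305, 302, 306, 301, 304, 303, 299, 307]

# Welch's t-test (doesn't assume equal variances).
t_stat, p_value = stats.ttest_ind(caffeine, no_caffeine, equal_var=False)
print(f"t-stat = {t_stat:.1f}, p-value = {p_value:.3f}")
&lt;/code&gt;&lt;/pre&gt;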

&lt;p&gt;However, my reaction time was noticeably worse than in the previous two experiments. My average used to hover around 280 ms and now it was hovering around 300 ms. Perhaps because I was less meticulous about keeping my computer in consistent conditions, I ended up adding some latency to the reaction time app?&lt;/p&gt;

&lt;p&gt;Some evidence for this hypothesis is that I’ve tried testing my reaction time on Windows a few times (I normally use Linux) and it’s &lt;em&gt;much&lt;/em&gt; faster—more like 230 ms. This is almost certainly due to a difference in how the reaction time app works on Windows vs. Linux.&lt;/p&gt;

&lt;p&gt;My primary hypothesis test—which I pre-registered to myself, but did not pre-register publicly—was to compare post-caffeine reaction time performance on Saturday vs. the average of every other caffeine day (Mon/Wed/Fri). This test got a null result:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Saturdays vs. non-Saturday caffeine days:
    297.2 ms vs. 298.3 ms
    t-stat = -0.4, p-value = 0.697
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I felt generally worse on Saturdays, but perhaps I was imagining things or seeing patterns that weren’t there, and really I shouldn’t worry about it.&lt;/p&gt;

&lt;p&gt;Or perhaps I do actually feel worse on the second caffeine day, in a way that reaction time fails to capture. It’s possible that caffeine’s different effects habituate at different rates, and I’m losing my alertness faster than I’m losing my reaction speed.&lt;/p&gt;

&lt;p&gt;(I would guess that caffeine’s effect on exercise performance would habituate particularly slowly—as I understand, caffeine improves exercise by physiologically improving muscle function somehow (it enhances calcium circulation or something), not just by increasing alertness.)&lt;/p&gt;

&lt;p&gt;My second hypothesis was that I experience caffeine withdrawal on the morning after a caffeine day. I got a null result for this hypothesis as well:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;morning after caffeine vs. morning after nocaf:
    303.6 ms (sd 6.9) vs. 303.5 ms (sd 7.0)
    mean difference = 0.1 ms
    t-stat = 0.0, p-value = 0.987
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;(Before running the experiment, I had a vague idea that I wanted to test this hypothesis, but I didn’t mentally pre-register a methodology.)&lt;/p&gt;

&lt;h2 id=&quot;alternative-experimental-procedures-that-im-not-going-to-do&quot;&gt;Alternative experimental procedures that I’m not going to do&lt;/h2&gt;

&lt;p&gt;It could be that my &lt;em&gt;reaction time&lt;/em&gt; doesn’t get worse on the second day, but my &lt;em&gt;alertness&lt;/em&gt; does get worse. I can think of two methods to test that hypothesis, but I don’t want to do them.&lt;/p&gt;

&lt;p&gt;Method 1: Same procedure as before, but instead of using a reaction time test as the dependent variable, I subjectively rate my alertness. This seems bad because it’s unblinded. I’m not too concerned about blinding reaction time because it’s hard to placebo yourself into a faster reaction time, but “subjective rating of alertness” is exactly the sort of thing that’s highly prone to a placebo effect.&lt;/p&gt;

&lt;p&gt;Method 2: Randomize whether I take caffeine pills or placebo pills, and blind myself. To detect potential habituation, I can take the same pill two days in a row, but blind myself to what type of pill it is. Then I subjectively rate my alertness. I don’t want to do that either because it would require working out without caffeine 50% of the time, and working out without caffeine is unpleasant.&lt;/p&gt;
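
&lt;p&gt;(If I ever do Method 2, generating the blinded schedule would be easy. A rough sketch, where the key file stays unread until the experiment is over:)&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import random

# Sketch of a blinding scheme for Method 2: each two-day block gets the
# same pill type, and the assignment is written to a file I don't look
# at until after I've recorded all my subjective alertness ratings.
blocks = 21  # e.g., six weeks of two-day blocks
schedule = []
for _ in range(blocks):
    pill = random.choice(["caffeine", "placebo"])
    schedule.extend([pill, pill])  # same pill two days in a row

with open("blinding_key.txt", "w") as f:
    for day, pill in enumerate(schedule, start=1):
        f.write(f"day {day}: {pill}\n")
&lt;/code&gt;&lt;/pre&gt;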

&lt;h2 id=&quot;a-story-about-how-i-thought-my-experiment-failed-but-actually-i-was-just-being-stupid&quot;&gt;A story about how I thought my experiment failed, but actually I was just being stupid&lt;/h2&gt;

&lt;p&gt;After completing my experiment—this was about three months ago&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;—I wrote some code to test the hypotheses. To my dismay, I found no detectable difference between caffeine and no-caffeine reaction times:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;caffeine vs. no-caffeine:
    298.3 ms vs. 303.5 ms
    t-stat = 0.0, p-value = 0.987
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If there’s not even a difference between caffeine and no-caffeine, then the experiment is useless.&lt;/p&gt;

&lt;p&gt;At the time, I was too tired and demotivated to write up the results, so I abandoned it for a while.&lt;/p&gt;

&lt;p&gt;Eventually I came back to the experiment, determined to finally write up the results. I looked at the numbers and I noticed that they didn’t make any sense. If the difference between caffeine and no-caffeine was 5.2 ms, how was the t-stat 0.0?&lt;/p&gt;

&lt;p&gt;You may be able to see the mistake I made if you look at the numbers from the &lt;a href=&quot;#results&quot;&gt;Results&lt;/a&gt; section. Instead of printing the t-stat and p-value for the caffeine vs. no-caffeine t-test, I accidentally printed the numbers from the &lt;em&gt;morning after caffeine vs. morning after nocaf&lt;/em&gt; test. So the figures I was looking at were totally wrong.&lt;/p&gt;

&lt;p&gt;I guess I wasn’t 100% there mentally when I wrote the code. (Honestly I don’t think I was even 30% there.)&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;If you want to check if my code contains any other horrible mistakes, you can find it &lt;a href=&quot;https://github.com/michaeldickens/public-scripts/tree/master/caffeine&quot;&gt;on GitHub&lt;/a&gt;.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I have a bad habit of letting half-finished drafts sit in my drafts folder for a long time. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I wrote it on a non-caffeine day which might have something to do with it. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Things I've Become More Confident About</title>
				<pubDate>Sun, 02 Nov 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/11/02/things_ive_become_more_confident_about/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/02/things_ive_become_more_confident_about/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Last year, I wrote a list of &lt;a href=&quot;https://mdickens.me/2024/05/23/some_things_ive_changed_my_mind_on/&quot;&gt;things I’ve changed my mind on&lt;/a&gt;. But good truth-seeking doesn’t just require you to consider where you might be wrong; you must also consider where you might be &lt;strong&gt;right&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In this post, I provide some beliefs I used to be uncertain about, that I have come to believe more strongly.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My belief:&lt;/strong&gt; Evolution is true.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;Why I believed it originally:&lt;/strong&gt; I learned about the theory of evolution in school. I had the impression that it was a popular but unproven hypothesis (“just a theory”).&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What made me more confident:&lt;/strong&gt; When I was maybe 10 or 12, I read an article in some science magazine (&lt;em&gt;National Geographic&lt;/em&gt;, maybe) about evolution. It said, “evolution is a theory in the same way atoms are a theory.” I probably put too much credence in this one sentence in one article, but in my mind, this was definitive proof that evolution is true.&lt;/p&gt;

    &lt;p&gt;Later, when I was 14, I started getting interested in the specifics of the theory of evolution and learned much more about the supporting evidence. (My motivation was mostly that I wanted to argue with creationists on the internet.)&lt;/p&gt;

    &lt;p&gt;I went through a similar trajectory when learning about &lt;a href=&quot;https://en.wikipedia.org/wiki/Quark&quot;&gt;quarks&lt;/a&gt;. I was taught that a quark is a hypothetical particle that exists inside atoms, but has never been observed. Later I learned that the existence of quarks is well-established, and it became well-established nearly three decades before I was born.&lt;/p&gt;

    &lt;p&gt;On the subject of outdated pedagogy, this is a bit of a tangent but in 5th grade I was taught the &lt;a href=&quot;https://en.wikipedia.org/wiki/Kingdom_(biology)#Five_kingdoms&quot;&gt;five kingdoms of life&lt;/a&gt;: monerans, protists, fungi, plants, and animals. Recently, I learned that not only do biologists no longer use this classification system, but that it was already obsolete &lt;em&gt;when my 5th grade teacher was in 5th grade.&lt;/em&gt;&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

    &lt;p&gt;(My 5th grade teacher was pretty young, but still.)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My belief:&lt;/strong&gt; &lt;a href=&quot;https://en.wikipedia.org/wiki/Value_investing&quot;&gt;Value investing&lt;/a&gt; works.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;Why I believed it originally:&lt;/strong&gt; I read about Joel Greenblatt’s &lt;a href=&quot;https://en.wikipedia.org/wiki/Magic_formula_investing&quot;&gt;magic formula investing&lt;/a&gt; and its strong historical performance.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What made me more confident:&lt;/strong&gt; I read more research on value investing, including the seminal paper &lt;a href=&quot;https://doi.org/10.1111/j.1540-6261.1992.tb04398.x&quot;&gt;The Cross-Section of Expected Stock Returns&lt;/a&gt;&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; by Fama and French, and more in-depth research showing value investing has worked &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.2174501&quot;&gt;across the world and across asset classes&lt;/a&gt;&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;, and on older data &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.3325720&quot;&gt;going back 200 years&lt;/a&gt;&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My belief:&lt;/strong&gt; Peaceful protests can be effective.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;Why I believed it originally:&lt;/strong&gt; I actually went back and forth on this one. In school I learned about Martin Luther King and how he was a hero of the civil rights movement, the Montgomery Bus Boycott that he helped organize, Gandhi’s protests against colonialism, and implicit in all this was the idea that these tactics were effective.&lt;/p&gt;

    &lt;p&gt;Eventually I learned about &lt;a href=&quot;http://givewell.org/&quot;&gt;GiveWell&lt;/a&gt;, which was the first time I’d ever encountered the notion that just because a charity says it’s effective, doesn’t mean it’s actually effective. I started thinking critically about protests in the same way, and I realized that I’d never actually seen good evidence that MLK or Gandhi were responsible for the positive changes that coincided with their activism.&lt;/p&gt;

    &lt;p&gt;Then I started thinking, well, there’s not &lt;em&gt;strong&lt;/em&gt; evidence that protests work, but there’s at least &lt;em&gt;some&lt;/em&gt; reason to believe they work. That’s about where I was at in 2024 when I &lt;a href=&quot;https://mdickens.me/2024/11/18/where_i_am_donating_in_2024/&quot;&gt;donated to PauseAI&lt;/a&gt;—I thought, I don’t really know if this is gonna work, but it’s worth trying.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What made me more confident:&lt;/strong&gt; I wrote &lt;a href=&quot;https://mdickens.me/2025/04/18/protest_outcomes_critical_review/&quot;&gt;Do Protests Work? A Critical Review&lt;/a&gt;, in which I carefully investigated the strongest evidence I could find. I found that the best evidence was better than I’d expected, and it pointed toward peaceful protests being effective.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My belief:&lt;/strong&gt; Seed oils are good for you; seed oils don’t cause obesity.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;Why I believed it originally:&lt;/strong&gt; I had never heard of the seed oil-obesity hypothesis until I read &lt;a href=&quot;https://dynomight.net/seed-oil/&quot;&gt;Dynomight’s article&lt;/a&gt; on the subject, which argues &lt;em&gt;against&lt;/em&gt; the hypothesis. Dynomight presented some evidence that seed oils are harmful and then ultimately concluded that they’re not. I didn’t think much about the evidence the article gave, but its conclusion seemed reasonable to me.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What made me more confident:&lt;/strong&gt; I researched the issue in more depth while writing &lt;a href=&quot;https://mdickens.me/2024/09/26/outlive_a_critical_review/&quot;&gt;Outlive: A Critical Review&lt;/a&gt;, specifically the &lt;a href=&quot;https://mdickens.me/2024/09/26/outlive_a_critical_review/#the-data-are-unclear-on-whether-reducing-saturated-fat-intake-is-beneficial&quot;&gt;section on saturated fat&lt;/a&gt;. I looked through the literature and presented what I believed to be the strongest evidence on the matter: meta-analyses of RCTs that directly compared dietary saturated fat with unsaturated fat (which usually meant seed oils). The experimental evidence finds that seed oils are, if anything, healthier than saturated fat, which contradicts the seed oil-obesity hypothesis.&lt;/p&gt;

    &lt;p&gt;I read some writings by proponents of the seed oil hypothesis, and their arguments seemed &lt;a href=&quot;https://mdickens.me/2024/10/12/worst_argument_in_the_world/&quot;&gt;incredibly weak&lt;/a&gt; to me.&lt;/p&gt;

    &lt;p&gt;(Later, I re-read &lt;a href=&quot;https://dynomight.net/seed-oil/&quot;&gt;Dynomight’s article&lt;/a&gt; and found that it cited the same evidence I had looked at while writing my review of &lt;em&gt;Outlive&lt;/em&gt;, which I had completely forgotten about.)&lt;/p&gt;

    &lt;p&gt;Dynomight presented the seed oil hypothesis as reasonable but ultimately probably wrong, so that’s what I believed at the time. After examining the evidence in more depth, I don’t think the seed oil hypothesis is reasonable. Dynomight admirably followed Daniel Dennett’s &lt;a href=&quot;https://www.themarginalian.org/2014/03/28/daniel-dennett-rapoport-rules-criticism/&quot;&gt;principles for arguing intelligently&lt;/a&gt;, in which you present your opponent’s case as strongly as possible. But this gave me the impression that the seed oil hypothesis is more plausible than it actually is.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My belief:&lt;/strong&gt; Absent regulation, we aren’t going to solve the AI alignment problem in time.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;Why I believed it originally:&lt;/strong&gt; I’ve vaguely believed this since I first learned about the AI alignment problem (in 2013, if I remember correctly). The problem seemed to involve some thorny philosophical problems of unknown size, like the outline of an enormous beast under a murky ocean. But at that point, humanity had collectively only spent a few hundred person-years on AI alignment, and I thought, perhaps there will be some breakthrough that makes the problem turn out to be much easier than expected. Or perhaps as superintelligent AI becomes increasingly imminent, humanity will rally and pour the necessary resources into the problem.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What made me more confident:&lt;/strong&gt; In this case I haven’t much changed my interpretation of the evidence, but I’ve become more confident as new evidence has come out. Namely, AI has gotten extraordinarily more powerful; alignment work has not kept up with the increases in AI capabilities; even though alignment work gets more attention now, the problem still seems about as hard as ever.&lt;/p&gt;

    &lt;p&gt;Beyond that, almost all alignment work is &lt;a href=&quot;https://en.wikipedia.org/wiki/Streetlight_effect&quot;&gt;streetlight effect&lt;/a&gt;-ing, focused on solving tractable but mostly-irrelevant problems; and the frontier AI companies mostly don’t engage with, and are sometimes even actively hostile to, the idea that solving alignment will require major philosophical breakthroughs and that it can’t be done using the sorts of empirical methods that they’re all using.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My belief:&lt;/strong&gt; Most studies on caffeine tolerance are not informative.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;Why I believed it originally:&lt;/strong&gt; Prior to writing my post &lt;a href=&quot;https://mdickens.me/2024/03/29/does_caffeine_stop_working/&quot;&gt;Does Caffeine Stop Working?&lt;/a&gt;, I reviewed some studies on caffeine tolerance and I thought to myself, these studies aren’t even testing the hypothesis they claim to be testing, surely I must be missing something?&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What made me more confident:&lt;/strong&gt; I read the studies more carefully and spent more time thinking about them, and read a few contrary papers by other scientists who study caffeine. My more careful analysis only reinforced my initial belief that most studies on caffeine tolerance are, indeed, not useful.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My belief:&lt;/strong&gt; I am smart.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;Why I believed it originally:&lt;/strong&gt; In elementary school, I knew I was the smartest kid in my class. But my class only had about 20 students, and I figured I wasn’t that smart in the grand scheme of things. Like, not as smart as scientists and people who go to Harvard and stuff.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What made me more confident:&lt;/strong&gt; The first big piece of evidence came after I took the &lt;a href=&quot;https://en.wikipedia.org/wiki/PSAT/NMSQT&quot;&gt;PSAT&lt;/a&gt; in 10th grade and my score was good enough that I realized I had a good shot at getting into a top university.&lt;/p&gt;

    &lt;p&gt;Then I actually attended a top university and realized that many of the people there were not that smart compared to me. College was still a big step up from elementary school: I went from always being the smartest person in the room to being only in the top 1/3 most of the time, and I sometimes found myself in the bottom third.&lt;/p&gt;

    &lt;p&gt;This trend of repeatedly up-rating my own intelligence reached its peak when I started taking advanced computer science classes, where I was close to the 50th percentile. And nowadays I’m about average within my social circles, and often below average.&lt;/p&gt;

    &lt;p&gt;(If you’re reading this, there’s a good chance that you’re smarter than me.)&lt;/p&gt;

    &lt;p&gt;Another canon event happened when I saw the data on the distribution of my school’s SAT scores. The school’s average score was just over one standard deviation &lt;em&gt;above&lt;/em&gt; the population mean.&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; I went through high school thinking my average classmates were average, when in reality they were considerably &lt;em&gt;smarter&lt;/em&gt; than average.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My belief:&lt;/strong&gt; When I first got into lifting weights a decade ago, I learned a lot of conventional wisdom like:&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;Low reps are better for strength, and high reps are better for hypertrophy.&lt;/li&gt;
      &lt;li&gt;Compound exercises are better for strength, and isolation exercises are better for hypertrophy.&lt;/li&gt;
      &lt;li&gt;Long rests are better for strength, and short rests are better for hypertrophy.&lt;/li&gt;
      &lt;li&gt;If you want to bulk or cut, you should eat at a 500 calorie surplus/deficit to gain/lose about a pound per week.&lt;/li&gt;
    &lt;/ul&gt;

    &lt;p&gt;&lt;strong&gt;Why I believed it originally:&lt;/strong&gt; It was the conventional wisdom—people generally agreed that these things are true, even though nobody talked about &lt;em&gt;why&lt;/em&gt;.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What made me more confident:&lt;/strong&gt; I started paying more attention to scientific literature on resistance training and I learned that the conventional wisdom pretty much had it right, at least on these points.&lt;/p&gt;

    &lt;p&gt;(The first three pieces of advice are all explained by a unifying factor: to build strength, you want to lift as much weight as possible, and to build muscle, you want to do as much volume as possible. High reps, isolation exercises, and short rests all enable you to wear out your muscles while lifting lighter weights, and the lighter the weights, the more volume you can do. These three bits of advice aren’t overwhelmingly important—you can still build muscle doing compound exercises at low reps—but they’re useful as guidelines.)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My belief:&lt;/strong&gt; Exercise is good for you.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;Why I believed it originally:&lt;/strong&gt; Everyone says exercise is good for you, right? But I didn’t know how you’d demonstrate scientifically that that’s true. I thought perhaps it’s reverse causation (sick people can’t exercise) or confounded by socioeconomic class or something.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What made me more confident:&lt;/strong&gt; I learned more about the scientific evidence on exercise.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;Many randomized controlled trials show that exercise improves short-term health markers—it reduces blood pressure, improves blood sugar regulation, etc.&lt;/li&gt;
      &lt;li&gt;A smaller number of long-term trials show long-term health benefits from exercise.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;details&gt;
&lt;summary&gt;Spoilers for Game of Thrones / A Song of Ice and Fire. Click here to expand.&lt;/summary&gt;
&lt;p&gt;&lt;b&gt;My belief:&lt;/b&gt; R + L = J. That is, Jon Snow&apos;s parents are Lyanna Stark and Rhaegar Targaryen.&lt;/p&gt;
    
&lt;p&gt;&lt;b&gt;Why I believed it originally:&lt;/b&gt; This had long been a popular fan theory. I didn&apos;t figure it out on my own, but I was reasonably convinced by the evidence in &lt;a href=&quot;https://web.archive.org/web/20170320074820/https://towerofthehand.com/essays/chrisholden/jon_snows_parents.html&quot;&gt;this article&lt;/a&gt;. I thought it sounded right, but I was uncertain because the textual evidence wasn&apos;t conclusive.&lt;/p&gt;
    
&lt;p&gt;&lt;b&gt;What made me more confident:&lt;/b&gt; I watched an interview with David Benioff and Dan Weiss, the creators of the TV show. They told a story about how they met with George R. R. Martin to get him to agree to adapt his books. At some point in the meeting, he asked them: Who is Jon Snow&apos;s mother? They gave an answer, and he didn&apos;t say whether they were right, but he gave a knowing smile, and he agreed to let them make the TV show.&lt;/p&gt;
    
&lt;p&gt;They didn&apos;t say what their answer was. But I found this story to be pretty much decisive evidence for R + L = J because what it proved was that the answer was &lt;i&gt;knowable&lt;/i&gt;. If David and Dan could know it, then the rest of the fan base could, too.&lt;/p&gt;
    
&lt;p&gt;Later I became even more confident when the TV show revealed that R + L = J. (Rarely in life do you get definitive confirmation that your theory is correct!)&lt;/p&gt;
&lt;/details&gt;
  &lt;/li&gt;
&lt;/ol&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;When I was young, I thought the way evolution worked was that a group of apes were born about 500,000 years ago, and these apes lived for hundreds of thousands of years, over which time their bodies slowly morphed to become more and more humanoid, until they became fully human, at which point they birthed human offspring and then died.&lt;/p&gt;

      &lt;p&gt;One time I told my dad that I wish I could’ve gotten to evolve because I wanted to live for 500,000 years. That’s when I learned that that’s not how evolution works. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Carl_Woese&quot;&gt;Carl Woese&lt;/a&gt; defined a six-kingdom taxonomy using evidence from ribosomal RNA in 1977, at which time I believe my 5th grade teacher would’ve been in 2nd grade.&lt;/p&gt;

      &lt;p&gt;Lest I sound like I know what I’m talking about, the only reason I can talk coherently about ribosomal RNA methods for taxonomic classification is because I just read those words off Wikipedia 15 seconds ago. &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Fama, E. F., &amp;amp; French, K. R. (1992). &lt;a href=&quot;https://doi.org/10.1111/j.1540-6261.1992.tb04398.x&quot;&gt;The Cross-Section of Expected Stock Returns.&lt;/a&gt; &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Asness, C. S., Moskowitz, T. J., &amp;amp; Pedersen, L. H. (2012). &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.2174501&quot;&gt;Value and Momentum Everywhere.&lt;/a&gt; &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Baltussen, G., Swinkels, L., &amp;amp; van Vliet, P. (2019). &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.3325720&quot;&gt;Global Factor Premiums.&lt;/a&gt; &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;And someone who gets an average score on the SAT is of above-average intelligence, because taking the SAT at all already screens out the lower end of the bell curve. &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Will Welfareans Get to Experience the Future?</title>
				<pubDate>Sat, 01 Nov 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/11/01/will_welfareans_get_to_experience_the_future/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/11/01/will_welfareans_get_to_experience_the_future/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;Epistemic status: This entire essay rests on two controversial premises (linear aggregation and antispeciesism) that I believe are quite robust, but I will not be able to convince anyone that they’re true, so I’m not even going to try.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Cross-posted to the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/gFTHuA3LvrZC2qDgx/will-welfareans-get-to-experience-the-future&quot;&gt;Effective Altruism Forum&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If &lt;a href=&quot;https://www.goodthoughts.blog/p/beneficentrism&quot;&gt;welfare is important&lt;/a&gt;, and if the value of welfare scales something-like-linearly, and if there is nothing morally special about the human species&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;, then these two things are probably also true:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The best possible universe isn’t filled with humans or human-like beings. It’s filled with some other type of being that’s much happier than humans, or has much richer experiences than humans, or otherwise experiences much more positive welfare than humans, for whatever “welfare” means. Let’s call these beings Welfareans.&lt;/li&gt;
  &lt;li&gt;A universe filled with Welfareans is &lt;em&gt;much&lt;/em&gt; better than a universe filled with humanoids.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;(Historically, people referred to these beings as “hedonium”. I dislike that term because hedonium sounds like a &lt;em&gt;thing&lt;/em&gt;. It doesn’t sound like something that matters. It’s supposed to be the opposite of that—it’s supposed to be the most profoundly innately valuable sentient being. So I think it’s better to describe the beings as Welfareans. I suppose we could also call them Hedoneans, but I don’t want to constrain myself to hedonistic utilitarianism.)&lt;/p&gt;

&lt;p&gt;Even in the “Good Ending” where we solve AI alignment and governance and coordination problems and we end up with a superintelligent AI that builds a flourishing post-scarcity civilization, will there be Welfareans? In that world, humans will be able to create a flourishing future for themselves; but beings who don’t exist yet won’t be able to give themselves good lives, because they don’t exist.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;My guess is that a tiny subset of crazy people (like me) will spend their resources making Welfareans, who will end up occupying only a tiny percentage of the accessible universe, and as a result, the future will be less than 1% as good as it could have been.&lt;/p&gt;

&lt;p&gt;(And maybe my conception of Welfareans will be wrong, and some other weirdo will be the one who makes the &lt;em&gt;real&lt;/em&gt; Welfareans.)&lt;/p&gt;

&lt;p&gt;I want the future to be nice for humans, too. (I’m a human.) But all we need to do is solve AI alignment (and various other extremely difficult, seemingly-insurmountable problems), and humans will turn out fine. Welfareans can’t advocate for themselves, and I’m afraid they won’t get the advocates they need.&lt;/p&gt;

&lt;p&gt;There is one reason why Welfareans might inherit most of the universe. Generally speaking, people don’t care about filling all available space with Dyson spheres to maximize population. They just want to live in their little corner of space, and they’d be happy to let the Welfareans have the rest.&lt;/p&gt;

&lt;p&gt;It’s probably true that most people aren’t maximizers. But &lt;em&gt;some&lt;/em&gt; people are maximizers, and most of them won’t want to maximize Welfareans; they’ll want to maximize some other thing. A lot of people will want to maximize how much of the universe is captured by humans or post-humans (or even just their personal genetic lineage). Mormons will want to maximize the number of Mormons or something. There are enough maximizing ideologies that I expect Welfareans to get squeezed out.&lt;/p&gt;

&lt;p&gt;So what can we do for the Welfareans?&lt;/p&gt;

&lt;p&gt;There are two problems:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Who even &lt;em&gt;are&lt;/em&gt; the Welfareans?&lt;/li&gt;
  &lt;li&gt;How do we ensure that the Welfareans get their share of the future’s resources?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Solving problem #1 approximately requires solving ethics (or, I guess, &lt;a href=&quot;https://en.wikipedia.org/wiki/Value_theory&quot;&gt;axiology&lt;/a&gt;). I’m not going to say more about that problem; I hope we can agree that it’s hard.&lt;/p&gt;

&lt;p&gt;For problem #2, the first answer that comes to mind is “make a power grab for as many resources as possible so I can give them to Welfareans later on”. But I’m guessing that if we solve ethics (as per problem #1), The Solution To Ethics will include a bit that says something along the lines of “don’t take other people’s stuff”. And there are only like three of us who would even care about Welfareans, so I don’t think we’d get very far anyway.&lt;/p&gt;

&lt;p&gt;So how do we increase Welfareans’ share of resources, but in an ethical manner? I don’t know. I’m going to start with “write this essay about Welfarean welfare”.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;In my first draft, the opening sentence said “If something like utilitarianism is true, …”. But this is an unnecessarily strong premise. You don’t need utilitarianism, you just need linear aggregation + antispeciesism. A non-consequentialist can still believe that more welfare is better (all else equal). Such a person would still want to maximize the aggregate welfare of the universe, subject to staying within the bounds of whatever moral rules they believe in. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>The Next-Gen LLM Might Pose an Existential Threat</title>
				<pubDate>Wed, 15 Oct 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/10/15/next_gen_LLM_might_pose_existential_threat/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/10/15/next_gen_LLM_might_pose_existential_threat/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;I’m pretty sure that the next generation of LLMs will be safe. But the risk is still high enough to make me uncomfortable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How sure are we that scaling laws are correct?&lt;/strong&gt; Researchers have drawn curves predicting how AI capabilities scale based on how much compute goes into training them. If you extrapolate those curves, it looks like the next level of LLMs won’t be wildly more powerful than the current level. But maybe there’s a weird bump in the curve that happens in between GPT-5 and GPT-6 (or between Claude 4.5 and Claude 5), and LLMs suddenly become much more capable in a way that scaling laws didn’t predict. I don’t think we can be more than 99.9% confident that there’s not.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How sure are we that current-gen LLMs aren’t sandbagging&lt;/strong&gt; (that is, deliberately hiding their true skill level)? I think they’re still dumb enough that their sandbagging can be caught, and indeed they have been caught sandbagging on some tests. I don’t think LLMs are hiding their true capabilities in general, and our understanding of AI capabilities is probably pretty accurate. But I don’t think we can be more than 99.9% confident about that.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How sure are we that the extrapolated capability level of the next-gen LLM isn’t enough to take over the world?&lt;/strong&gt; It probably isn’t, but we don’t really know what level of capability is required for something like that. I don’t think we can be more than 99.9% confident.&lt;/p&gt;

&lt;p&gt;Perhaps we can be &amp;gt;99.99% confident that the extrapolated capability of the next-gen LLM still falls short of the smartest human. But an LLM has certain advantages over humans—it can work faster (at least on many sorts of tasks), it can copy itself, it can operate computers in a way that humans can’t.&lt;/p&gt;

&lt;p&gt;Alternatively, GPT-6/Claude 5 might not be able to take over the world, but it might be smart enough to recursively self-improve, and that might happen too quickly for us to do anything about.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How sure are we that we aren’t wrong about something else?&lt;/strong&gt; I thought of three ways we could be disastrously wrong:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;We could be wrong about scaling laws;&lt;/li&gt;
  &lt;li&gt;We could be wrong that LLMs aren’t sandbagging;&lt;/li&gt;
  &lt;li&gt;We could be wrong about what capabilities are required for AI to take over.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;But we could be wrong about some entirely different thing that I didn’t even think of. I’m not more than 99.9% confident that my list is comprehensive.&lt;/p&gt;
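
&lt;p&gt;Putting those four ~0.1% doubts together: here’s a minimal sketch in Python, treating them, purely for simplicity, as independent.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sanity check on the arithmetic above. Treats the four ~0.1% residual
# doubts as independent, which is a simplifying assumption.
doubts = [0.001,  # scaling laws are wrong
          0.001,  # LLMs are sandbagging
          0.001,  # takeover threshold is lower than expected
          0.001]  # something else entirely

p_all_fine = 1.0
for p in doubts:
    p_all_fine *= 1.0 - p

print(f&quot;P(at least one doubt is right) = {1.0 - p_all_fine:.4%}&quot;)  # about 0.40%
&lt;/code&gt;&lt;/pre&gt;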

&lt;p&gt;On the whole, I don’t think we can say there’s less than a 0.4% chance that the next-gen LLM forces us down a path that inevitably ends in everyone dying.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>Mechanisms Rule Hypotheses Out, But Not In</title>
				<pubDate>Wed, 08 Oct 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/10/08/mechanisms_rule_hypotheses_out_not_in/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/10/08/mechanisms_rule_hypotheses_out_not_in/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;If there is no plausible mechanism by which a scientific hypothesis could be true, then it’s almost certainly false.&lt;/p&gt;

&lt;p&gt;But if there &lt;em&gt;is&lt;/em&gt; a plausible mechanism for a hypothesis, then that only provides weak evidence that it’s true.&lt;/p&gt;

&lt;p&gt;An example of the former:&lt;/p&gt;

&lt;p&gt;Astrology teaches that the positions of planets in the sky when you’re born can affect your life trajectory. If that were true, it would contradict well-established facts in physics and astronomy. Nobody has ever observed a physical mechanism by which astrology could be true.&lt;/p&gt;

&lt;p&gt;An example of the latter:&lt;/p&gt;

&lt;p&gt;A 2023 &lt;a href=&quot;https://news.uthscsa.edu/drinking-diet-sodas-and-aspartame-sweetened-beverages-daily-during-pregnancy-linked-to-autism-in-male-offspring/&quot;&gt;study&lt;/a&gt; found an association between autism and diet soda consumption during pregnancy. The authors’ proposed mechanism is that aspartame (an artificial sweetener found in diet soda) metabolizes into aspartic acid, which has been shown to cause neurological problems in mice. Nonetheless, even though there is a proposed mechanism, I don’t really care and I’m pretty sure diet soda doesn’t cause autism. (For a more thorough take on the diet soda &amp;lt;&amp;gt; autism thing, I will refer you to &lt;a href=&quot;https://dynomight.net/grug/&quot;&gt;Grug&lt;/a&gt;, who is much smarter than me.)&lt;/p&gt;

&lt;h2 id=&quot;why&quot;&gt;Why?&lt;/h2&gt;

&lt;!-- more --&gt;

&lt;p&gt;A lack of mechanism strongly rules out a hypothesis. If astrology were true, that would overturn some extremely well-established findings in physics. How could astrology possibly be true, given what we know about the laws of gravity?&lt;/p&gt;

&lt;p&gt;Perhaps scientists have overlooked something. Perhaps the planets affect humans not via gravity but via some fifth as-yet-undiscovered &lt;a href=&quot;https://en.wikipedia.org/wiki/Fundamental_interaction&quot;&gt;fundamental force&lt;/a&gt;. But if astrologers can detect the fifth force, why haven’t physicists noticed it with all their careful experimentation?&lt;/p&gt;

&lt;p&gt;On the other hand, the &lt;em&gt;existence&lt;/em&gt; of a mechanism doesn’t count for much. I often see this in biology, where someone proposes a contrarian hypothesis with a possible biological mechanism but no supporting evidence from randomized experiments. I don’t take that sort of evidence very seriously. Biology is complicated, and chemicals have all sorts of effects on bodies, and it’s very hard to predict whether those effects are net good or bad just by looking at mechanisms.&lt;/p&gt;

&lt;p&gt;For example, did you know that exercise increases inflammation? And inflammation is bad for you? And yet, exercise is good for you, because the acute inflammation caused by exercise is strongly outweighed by the long-term beneficial effects.&lt;/p&gt;

&lt;p&gt;However, when a hypothesis has supporting evidence from experiments but a &lt;em&gt;lack&lt;/em&gt; of plausible mechanism, I disbelieve the research. &lt;a href=&quot;https://en.wikipedia.org/wiki/Ganzfeld_experiment&quot;&gt;Experiments have demonstrated&lt;/a&gt; that people have psychic abilities. But I’m quite confident that people &lt;em&gt;don’t&lt;/em&gt; have psychic abilities because &lt;em&gt;there is no mechanism by which that could be true.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In the hierarchy of evidence, experiment beats mechanism, but lack of mechanism beats experiment.&lt;/p&gt;

&lt;p&gt;This asymmetry is consistent with the law of &lt;a href=&quot;https://www.lesswrong.com/w/conservation-of-expected-evidence&quot;&gt;Conservation of Expected Evidence&lt;/a&gt;. There are many plausible mechanisms out there in the world. A hypothesis &lt;em&gt;must&lt;/em&gt; have a mechanism for it to be true, but the &lt;em&gt;existence&lt;/em&gt; of a mechanism does not come anywhere close to proving a hypothesis correct.&lt;/p&gt;
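
&lt;p&gt;To put toy numbers on that asymmetry, here’s a minimal Bayesian sketch in Python. The prior and the 30% figure are made-up values for illustration, not estimates I’d defend.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Toy Bayesian illustration of the asymmetry. All numbers are made up.
prior = 0.05            # P(hypothesis true) before looking at mechanisms
p_mech_if_true = 1.0    # a true hypothesis must have some mechanism
p_mech_if_false = 0.30  # plausible mechanisms are easy to find, even
                        # for false hypotheses

# Posterior after learning that a plausible mechanism exists:
p_with_mech = (p_mech_if_true * prior) / (
    p_mech_if_true * prior + p_mech_if_false * (1 - prior))

# Posterior after learning that no plausible mechanism exists:
p_without_mech = ((1 - p_mech_if_true) * prior) / (
    (1 - p_mech_if_true) * prior + (1 - p_mech_if_false) * (1 - prior))

print(f&quot;P(true | mechanism exists) = {p_with_mech:.3f}&quot;)    # ~0.149
print(f&quot;P(true | no mechanism)     = {p_without_mech:.3f}&quot;)  # 0.000
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A plausible mechanism nudges the hypothesis from 5% up to about 15%; the lack of one takes it to zero. That’s the sense in which mechanisms rule hypotheses out but not in.&lt;/p&gt;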

&lt;h2 id=&quot;some-more-examples&quot;&gt;Some more examples&lt;/h2&gt;

&lt;p&gt;Here are some more hypotheses that are strongly ruled out by a lack of plausible mechanism:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Some houses are haunted by ghosts.&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Dowsing&quot;&gt;Dowsing rods&lt;/a&gt; can detect underground water.&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Fortune-tellers can predict the future.&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Homeopathy&quot;&gt;Homeopathy&lt;/a&gt; can cure diseases.&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Some hypotheses with plausible mechanisms that I nonetheless believe are false:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Seed oils are bad for you because they contain linoleic acid, which causes inflammation.&lt;/strong&gt; This mechanism is true (as far as I know), but experiments comparing unsaturated fats (mainly seed oils) to saturated fats find that people who eat more of the former end up healthier; see &lt;a href=&quot;https://doi.org/10.1002/14651858.CD011737.pub3&quot;&gt;Hooper et al. (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; and &lt;a href=&quot;https://iris.who.int/bitstream/handle/10665/246104/9789241565349-eng.pdf&quot;&gt;WHO (2016)&lt;/a&gt;&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;. Experimental evidence indicates that seed oils have overall &lt;em&gt;positive&lt;/em&gt; health effects.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Eating excess protein causes osteoporosis.&lt;/strong&gt; The proposed mechanism is that proteins increase blood acidity which causes the body to extract calcium from bones to balance out this acidity. And indeed, people on high-protein diets excrete more calcium in their urine. But randomized controlled trials have found that adding protein to the diet &lt;em&gt;reduces&lt;/em&gt; the risk of bone fracture (&lt;a href=&quot;https://doi.org/10.1080/08952841.2018.1418822&quot;&gt;Koutsofta et al. (2018)&lt;/a&gt;&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;).&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;
    &lt;ul&gt;
      &lt;li&gt;Relatedly, you may hear some people say you should eat more alkaline foods to fix your body’s pH balance. It would indeed be bad if your body’s pH became too low, but the empirical evidence shows that dietary pH does not affect your body’s pH in that way (see &lt;a href=&quot;https://en.wikipedia.org/wiki/Alkaline_diet&quot;&gt;Wikipedia&lt;/a&gt;).&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Sugar causes hyperactivity in children because it provides a short-term burst of energy.&lt;/strong&gt; This mechanism is intuitive even if you don’t know much biology. But it’s not true—RCTs have consistently found no connection between hyperactivity and sugar consumption (&lt;a href=&quot;https://doi.org/10.1136/bmj.a2769&quot;&gt;Vreeman &amp;amp; Carroll (2008)&lt;/a&gt;&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;).&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Eating cholesterol raises your blood cholesterol.&lt;/strong&gt; The mechanism in this case is obvious: you eat food that contains cholesterol, and the cholesterol goes into your body. But your body regulates its own cholesterol production, and your blood cholesterol levels don’t have much to do with how much cholesterol you eat.&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Hooper, L., Martin, N., Jimoh, O. F., Kirk, C., Foster, E., &amp;amp; Abdelhamid, A. S. (2020). &lt;a href=&quot;https://doi.org/10.1002/14651858.CD011737.pub3&quot;&gt;Reduction in saturated fat intake for cardiovascular disease.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.1002/14651858.cd011737.pub3&quot;&gt;10.1002/14651858.cd011737.pub3&lt;/a&gt; &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Mensink, R. P., &amp;amp; World Health Organization (2016). &lt;a href=&quot;https://iris.who.int/bitstream/handle/10665/246104/9789241565349-eng.pdf&quot;&gt;Effects of saturated fatty acids on serum lipids and lipoproteins: a systematic review and regression analysis.&lt;/a&gt; &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Koutsofta, I., Mamais, I., &amp;amp; Chrysostomou, S. (2018). &lt;a href=&quot;https://doi.org/10.1080/08952841.2018.1418822&quot;&gt;The effect of protein diets in postmenopausal women with osteoporosis: Systematic review of randomized controlled trials.&lt;/a&gt; &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I heard about this research on the &lt;a href=&quot;https://www.youtube.com/watch?v=O0IK3ap4wQY&quot;&gt;Iron Culture podcast&lt;/a&gt;, in which they went on to complain about how people care too much about mechanisms and ignore experimental evidence. It got me thinking about an apparent contradiction in my beliefs where I care a lot about mechanisms for ruling out astrology and ESP, but I don’t really care about mechanisms in nutrition or exercise science. After thinking about it, I realized that my position is perfectly sensible—it’s about using mechanisms to rule hypotheses &lt;em&gt;out&lt;/em&gt; vs. &lt;em&gt;in&lt;/em&gt;—and that’s how I came up with the idea to write this post. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I want to be careful not to say that an alkaline diet is unhealthy. Alkaline foods do tend to be particularly healthy—they’re mostly fruits and vegetables—but that’s coincidental, not because they’re alkaline per se. &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Vreeman, R. C., &amp;amp; Carroll, A. E. (2008). &lt;a href=&quot;https://doi.org/10.1136/bmj.a2769&quot;&gt;Festive medical myths.&lt;/a&gt; &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://nutritionsource.hsph.harvard.edu/what-should-you-eat/fats-and-cholesterol/cholesterol/&quot;&gt;https://nutritionsource.hsph.harvard.edu/what-should-you-eat/fats-and-cholesterol/cholesterol/&lt;/a&gt; &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>How Much Does It Cost to Offset an LLM Subscription?</title>
				<pubDate>Sat, 04 Oct 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/10/04/cost_to_offset_LLM_subscription/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/10/04/cost_to_offset_LLM_subscription/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Is &lt;a href=&quot;https://forum.effectivealtruism.org/topics/moral-offsetting&quot;&gt;moral offsetting&lt;/a&gt; a good idea? Is it ethical to spend money on something harmful, and then donate to a charity that works to counteract those harms?&lt;/p&gt;

&lt;p&gt;I’m not going to answer that question. Instead I’m going to ask a different question: if you use an LLM, how much do you have to donate to AI safety to offset the harm of using an LLM?&lt;/p&gt;

&lt;p&gt;I can’t give a definitive answer, of course. But I can make an educated guess, and my educated guess is that for every $1 spent on an LLM subscription, you need to donate $0.87 to AI safety charities.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;First things first: Why do I believe it’s harmful to buy an LLM subscription?&lt;/p&gt;

&lt;p&gt;Paying money to a frontier AI company increases their revenue, and they spend some of that revenue on building more powerful AI systems. Eventually, they build a superintelligent AI. That AI has a good chance of being misaligned and then &lt;a href=&quot;https://intelligence.org/briefing/&quot;&gt;killing everyone in the world&lt;/a&gt;. When you buy an LLM subscription, you cause that to happen slightly faster.&lt;/p&gt;

&lt;p&gt;But you can also donate to nonprofits that are working to prevent AI from killing everyone. How much do you need to donate to a nonprofit to offset the harm of a $20/month LLM subscription?&lt;/p&gt;

&lt;p&gt;I built a simple &lt;a href=&quot;https://squigglehub.org/models/AI-safety/LLM-subscription-offsets&quot;&gt;Squiggle model&lt;/a&gt; to answer that question.&lt;/p&gt;

&lt;h2 id=&quot;the-model&quot;&gt;The model&lt;/h2&gt;

&lt;p&gt;Four key facts:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;When you give a company an additional dollar of revenue, that raises its future valuation by some amount.&lt;/li&gt;
  &lt;li&gt;A higher valuation lets the company raise more capital and thus spend some additional amount of money.&lt;/li&gt;
  &lt;li&gt;AI companies will spend a total of some amount in 2026.&lt;/li&gt;
  &lt;li&gt;Meanwhile, it would take some amount of money directed to AI safety nonprofits to cancel out the harm of AI companies’ spending.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;From those, you can estimate how much you need to donate using the following procedure:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Start from the dollar value of your subscription.&lt;/li&gt;
  &lt;li&gt;Calculate how much that will increase company valuation.&lt;/li&gt;
  &lt;li&gt;Translate increased valuation into increased expenditures.&lt;/li&gt;
  &lt;li&gt;Divide by expected total expenditures of frontier AI companies.&lt;/li&gt;
  &lt;li&gt;Multiply by expected total cost to offset AI company harm.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The resulting number is the amount to donate to AI safety nonprofits.&lt;/p&gt;
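
&lt;p&gt;To make the procedure concrete, here’s a minimal Monte Carlo sketch of it in Python. The real model is the &lt;a href=&quot;https://squigglehub.org/models/AI-safety/LLM-subscription-offsets&quot;&gt;Squiggle model&lt;/a&gt; linked above; the parameter ranges below are rough stand-ins for the estimates discussed in the next section, so treat the output as illustrative rather than as the model’s actual numbers.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Minimal Monte Carlo sketch of the five-step procedure above.
# Parameter ranges are rough stand-ins, not taken from the Squiggle model.
import math
import random

def lognormal_90ci(low, high):
    # Sample a lognormal whose 90% credible interval is roughly (low, high).
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * 1.645)
    return random.lognormvariate(mu, sigma)

samples = []
for _ in range(100_000):
    rev_to_valuation = lognormal_90ci(1, 36)         # $ valuation per $ revenue
    valuation_to_spend = lognormal_90ci(0.05, 0.20)  # fraction of valuation spent
    total_ai_spend = 66e9                            # AI company spending in 2026
    offset_cost = lognormal_90ci(30e6, 100e6) * lognormal_90ci(10, 100)

    extra_spend = rev_to_valuation * valuation_to_spend
    samples.append(extra_spend / total_ai_spend * offset_cost)

samples.sort()
print(f&quot;mean:   ${sum(samples) / len(samples):.2f} donated per $1 spent&quot;)
print(f&quot;median: ${samples[len(samples) // 2]:.2f} donated per $1 spent&quot;)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Because every input is heavy-tailed, the mean lands far above the median; that’s the same mean-vs-median gap that shows up in the results below.&lt;/p&gt;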

&lt;p&gt;There are some difficult questions that this model avoids having to answer:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;We don’t care what proportion of AI company spending goes to R&amp;amp;D on frontier models; we only care about total spending.&lt;/li&gt;
  &lt;li&gt;We don’t care to what extent x-risk is increased or decreased per dollar spent.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;the-inputs&quot;&gt;The inputs&lt;/h2&gt;

&lt;p&gt;The model has four inputs: (1) the revenue-to-valuation ratio; (2) the valuation-to-expenditures ratio; (3) frontier AI company expenditures; (4) total cost to offset AI company harm. In this section, I will explain how I estimated the values of those inputs.&lt;/p&gt;

&lt;h3 id=&quot;revenue-to-valuation-ratio&quot;&gt;Revenue-to-valuation ratio&lt;/h3&gt;

&lt;p&gt;How much does a dollar of revenue raise an AI company’s valuation? I can see arguments for both “hardly at all” and “a lot”.&lt;/p&gt;

&lt;p&gt;In favor of “hardly at all”: VCs give AI companies funding on the expectation that their products will be incredibly useful in the future, which doesn’t have much to do with current revenue.&lt;/p&gt;

&lt;p&gt;In favor of “a lot”: AI companies raise funding at high revenue multiples, e.g. Anthropic raised its last round (as of September 2025) at 36x revenue (&lt;a href=&quot;https://www.anthropic.com/news/anthropic-raises-series-f-at-usd183b-post-money-valuation&quot;&gt;source&lt;/a&gt;). This could mean that VCs expect $1 of revenue today to convert to $36 in future value, i.e. revenue has a 36:1 multiplier effect.&lt;/p&gt;

&lt;p&gt;A typical 2025 startup valuation is 7x revenue (&lt;a href=&quot;https://www.saas-capital.com/blog-posts/private-saas-company-valuations-multiples/&quot;&gt;source&lt;/a&gt;). As a median estimate, we could say that $1 of AI company revenue converts to $7 of valuation, with the remaining ~5x gap between Anthropic’s 36x multiple and the typical 7x driven by high expectations for future AI products.&lt;/p&gt;

&lt;p&gt;(I briefly looked into how startup funding scales with revenue and I didn’t find any useful evidence.)&lt;/p&gt;

&lt;p&gt;Growth rate matters more for valuation than revenue does, but I don’t think this changes the calculation in the short term because an extra $1 of 2025 revenue also represents an extra $1 in growth relative to 2024 revenue.&lt;/p&gt;

&lt;h3 id=&quot;valuation-to-expenditures-ratio&quot;&gt;Valuation-to-expenditures ratio&lt;/h3&gt;

&lt;p&gt;How much does $1 of company valuation translate into increased expenditures?&lt;/p&gt;

&lt;p&gt;Private companies don’t usually publish that information. But based on historical data for AI companies and general trends for startups, it’s reasonable to expect companies to raise capital equal to 5% to 20% of the valuation.&lt;/p&gt;

&lt;p&gt;(I’m thinking of AI companies as startups; “startup” connotes “small”, which they clearly aren’t, but I’m using the term in the Paul Graham &lt;a href=&quot;https://paulgraham.com/growth.html&quot;&gt;startup = growth&lt;/a&gt; sense. Frontier AI companies are startups because they’re growing fast.)&lt;/p&gt;

&lt;h3 id=&quot;frontier-ai-company-expenditures&quot;&gt;Frontier AI company expenditures&lt;/h3&gt;

&lt;p&gt;Public data on AI company fundraising in 2025:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://tracxn.com/d/companies/anthropic/__SzoxXDMin-NK5tKB7ks8yHr6S9Mz68pjVCzFEcGFZ08/funding-and-investors#funding-rounds&quot;&gt;Anthropic&lt;/a&gt;: $13B&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://tracxn.com/d/companies/openai/__kElhSG7uVGeFk1i71Co9-nwFtmtyMVT7f-YHMn4TFBg/funding-and-investors&quot;&gt;OpenAI&lt;/a&gt;: $40B&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://tracxn.com/d/companies/xai/__saKrxbHN3TRWW-I4lYH6zkx6N5P_kMTqlLcKTzWs2ug#about-the-company&quot;&gt;xAI&lt;/a&gt;: $10B maybe? (the publicly available data only shows total funding, not individual rounds)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Assume these three companies account for half of AI spending and that the funding they raised will last 18 months; that means AI companies will spend $66B in 2026.&lt;/p&gt;

&lt;h3 id=&quot;total-cost-to-offset-ai-company-harm&quot;&gt;Total cost to offset AI company harm&lt;/h3&gt;

&lt;p&gt;This is the hardest number to estimate. My assumption is that the AI safety community currently spends on the order of $30 million to $100 million per year, and if we spent on the order of 10–100x more, then that would be enough to fully offset the harms of AI companies.&lt;/p&gt;

&lt;p&gt;I suspect that spending 100x more on pure alignment research would not be enough. But spending 100x more would likely be enough if some of the spending goes to governance/policy/advocacy, and some goes to things that have multiplier effects (e.g. you could spend $1 to cause AI companies to contribute $10 more to safety research). I’m also assuming you can make AI safe merely by throwing money at the problem, which is clearly false, but it makes sense to assume it’s true for the purposes of this model.&lt;/p&gt;

&lt;h3 id=&quot;the-answer-according-to-my-model&quot;&gt;The answer (according to my model)&lt;/h3&gt;

&lt;p&gt;Put all those numbers together and the &lt;a href=&quot;https://squigglehub.org/models/AI-safety/LLM-subscription-offsets&quot;&gt;model&lt;/a&gt; spits out a mean cost of $0.87 in donations for every $1 spent on LLM subscriptions. That means for a $20/month subscription, according to the model you’d need to donate $17/month to AI safety orgs.&lt;/p&gt;

&lt;p&gt;The model’s &lt;em&gt;median&lt;/em&gt; estimate is only $0.06—which is to say, an LLM subscription probably only does a little bit of harm. But there is a small probability that you need to donate quite a bit more to offset your LLM usage, so the &lt;em&gt;expected&lt;/em&gt; cost is much higher at $0.87.&lt;/p&gt;

&lt;h2 id=&quot;limitations-of-the-model&quot;&gt;Limitations of the model&lt;/h2&gt;

&lt;p&gt;Like any model, this one does not perfectly match reality. Some examples of problems this model has:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;I have no clue what the total cost is to offset the harm of AI companies.&lt;/li&gt;
  &lt;li&gt;The model assumes the money you donate does as much good as the average dollar spent on AI safety. But maybe your dollars can be above average. (Or they could even be below average.)&lt;/li&gt;
  &lt;li&gt;Maybe giving more money to Anthropic is good actually, because Anthropic is the least unsafe AI company and speeding them up improves our chances.&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Is moral offsetting even okay? Maybe we should obey a rule-utilitarian constraint against doing bad things, even if we offset them. Or maybe moral offsetting is silly and we should just donate to whatever charity is most effective.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;I have a subscription to Claude. Last year I donated a lot of money to AI safety but I didn’t make any donations specifically for offsetting. Having put more thought into it to write this post, I think I will start donating an extra $240/year—$1 donated for every $1 spent on Claude. My model suggested donating 87 cents per dollar, but the model isn’t that precise, and $1-per-dollar is a nice round number. I’m still undecided on whether the concept of moral offsetting makes sense, but I figure I might as well do it.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;In my first draft, I also said it might be net good to give AI companies money if they’ll use some of it on alignment research. But on reflection, I’m pretty sure that’s wrong, because giving them money speeds up AI progress, and there’s no strong reason to expect that increasing AI company revenue will increase &lt;em&gt;total&lt;/em&gt; expenditures on alignment.&lt;/p&gt;

      &lt;p&gt;I also expect it’s bad to speed up Anthropic, but I’m not confident about that. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>I made an Emacs extension that displays Magic: the Gathering card tooltips</title>
				<pubDate>Fri, 03 Oct 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/10/03/mtg_emacs/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/10/03/mtg_emacs/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;This post is about the niche intersection of Emacs and Magic: the Gathering.&lt;/p&gt;

&lt;p&gt;I considered not writing this because I figured, surely if you multiply the proportion of people who play Magic by the proportion of people who use Emacs, you get a very small number. But then I thought, those two variables are probably not independent. And the intersection of &lt;code&gt;Magic players&lt;/code&gt; x &lt;code&gt;Emacs users&lt;/code&gt; x &lt;code&gt;people who read my blog&lt;/code&gt; might actually be greater than zero. So if you’re out there, this post is for you.&lt;/p&gt;

&lt;p&gt;Do you like how MTG websites like &lt;a href=&quot;https://magic.gg/&quot;&gt;magic.gg&lt;/a&gt; and &lt;a href=&quot;https://mtg.wiki/&quot;&gt;mtg.wiki&lt;/a&gt; let you mouse over a card name to see a picture of the card? Well, I wrote an Emacs extension that replicates that functionality.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;Here is the code: &lt;a href=&quot;https://github.com/michaeldickens/emacs-mtg&quot;&gt;https://github.com/michaeldickens/emacs-mtg&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/michaeldickens/emacs-mtg&quot;&gt;README&lt;/a&gt; on GitHub pretty much explains how it works, so the rest of this post is just gonna repeat what it says in the README.&lt;/p&gt;

&lt;h2 id=&quot;usage&quot;&gt;Usage&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;pre&gt;&lt;code&gt;(add-to-list &apos;load-path &quot;/path/to/mtg.el&quot;)
(require &apos;mtg)
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;This module allows you to refer to Magic cards in Org Mode using a new type of link prefixed with &lt;code&gt;mtg:&lt;/code&gt;. For example:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;[[mtg:Black Lotus]] might be the strongest card in my collection, but my personal favorite is [[mtg:Grizzly Bears]].&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;When Org Mode sees a link to an MTG card, it will do the following:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;If the card is not downloaded yet, download the card by querying the &lt;a href=&quot;https://scryfall.com/&quot;&gt;Scryfall&lt;/a&gt; API for a card with the given name.&lt;/li&gt;
  &lt;li&gt;When you open the link (using &lt;code&gt;org-open-at-point&lt;/code&gt; or &lt;code&gt;C-c C-o&lt;/code&gt;), Emacs displays an image of the card in the minibuffer.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here’s how it looks:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://raw.githubusercontent.com/michaeldickens/emacs-mtg/refs/heads/master/example-grizzly-bears.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;By default, card images and data are downloaded to &lt;code&gt;~/.emacs.d/mtg-cards/&lt;/code&gt;, but you can change this by customizing the variable &lt;code&gt;mtg/db-path&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Scryfall’s API has fuzzy name matching, so for example &lt;code&gt;[[mtg:blac lotus]]&lt;/code&gt; will display Black Lotus.&lt;/p&gt;

&lt;h2 id=&quot;card-legality&quot;&gt;Card legality&lt;/h2&gt;

&lt;p&gt;Cards are displayed with a red tint if they are illegal in the preferred format. It looks like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://raw.githubusercontent.com/michaeldickens/emacs-mtg/refs/heads/master/example-black-lotus.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;When checking legality, this module uses Standard format by default, but you can customize it by setting the variable &lt;code&gt;mtg/default-format&lt;/code&gt;. You can also set file-local or heading-local formats in Org Mode using the &lt;code&gt;:MTG_FORMAT:&lt;/code&gt; property. For example:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;pre&gt;&lt;code&gt;:PROPERTIES:
:MTG_FORMAT: standard
:END:
If you open this link --&amp;gt; [[mtg:Black Lotus]], the card will appear
with a red tint because it&apos;s illegal in Standard.

** My vintage cards
  :PROPERTIES:
  :MTG_FORMAT: vintage
  :END:
  [[mtg:Black Lotus]] is legal in Vintage, so here it will
  appear with no tint.
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;Note: Adding a red tint requires &lt;a href=&quot;https://imagemagick.org/&quot;&gt;ImageMagick&lt;/a&gt;. If you don’t have ImageMagick installed, all cards will be displayed as if they’re legal.&lt;/p&gt;

&lt;h2 id=&quot;exporting-to-html&quot;&gt;Exporting to HTML&lt;/h2&gt;

&lt;p&gt;If you export Org Mode files to HTML, you can make the MTG card links display images on hover. For this to work, you must include some custom CSS in your Org Mode file.&lt;/p&gt;

&lt;p&gt;On GitHub there is a file called &lt;a href=&quot;https://github.com/michaeldickens/emacs-mtg/blob/master/export-style.setup&quot;&gt;export-style.setup&lt;/a&gt; that includes some custom CSS. To include this custom CSS in Org Mode, put this line at the top of your Org Mode file:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;pre&gt;&lt;code&gt;#+SETUPFILE: /path/to/export-style.setup
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;Then call &lt;code&gt;org-export-dispatch&lt;/code&gt; to export the Org file to HTML.&lt;/p&gt;

&lt;h2 id=&quot;table-utilities&quot;&gt;Table utilities&lt;/h2&gt;

&lt;p&gt;mtg.el comes with functions for working with Org Mode tables. The functions assume you have a table where one column contains links to MTG cards, like this:&lt;/p&gt;

&lt;table&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;[[mtg:Black Lotus]]&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;[[mtg:Grizzly Bears]]&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;[[mtg:Colossal Dreadmaw]]&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;code&gt;mtg/table-sort-by-property&lt;/code&gt; takes a property as a string (such as “name”, “rarity”, or “color”) and sorts the table by looking up that property for each card. This only works if you’ve already downloaded the card info (which happens when you view the card or export the whole file).&lt;/p&gt;

&lt;p&gt;&lt;code&gt;mtg/table-insert-column&lt;/code&gt; takes a property as a string and inserts a new column containing that property for each card. For example, calling &lt;code&gt;(mtg/table-insert-column &quot;rarity&quot;)&lt;/code&gt; on the table above produces this:&lt;/p&gt;

&lt;table&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;[[mtg:Black Lotus]]&lt;/td&gt;
      &lt;td&gt;bonus&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;[[mtg:Grizzly Bears]]&lt;/td&gt;
      &lt;td&gt;common&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;[[mtg:Colossal Dreadmaw]]&lt;/td&gt;
      &lt;td&gt;common&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;If a property is missing, the cell will be left blank. For example, calling &lt;code&gt;(mtg/table-insert-column &quot;power&quot;)&lt;/code&gt; produces&lt;/p&gt;

&lt;table&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;[[mtg:Black Lotus]]&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;[[mtg:Grizzly Bears]]&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;[[mtg:Colossal Dreadmaw]]&lt;/td&gt;
      &lt;td&gt;6&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;You can also call &lt;code&gt;mtg/get-property&lt;/code&gt; to return a property for the card at point.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>AI Safety Landscape and Strategic Gaps</title>
				<pubDate>Fri, 19 Sep 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/09/19/ai_safety_landscape/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/09/19/ai_safety_landscape/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;I wrote a &lt;a href=&quot;https://forum.effectivealtruism.org/posts/CbHX5zL2uEvTasuiP/ai-safety-landscape-and-strategic-gaps&quot;&gt;report&lt;/a&gt; giving a high-level review of what work people are doing in AI safety. The report specifically focused on two areas: AI policy/advocacy and non-human welfare (including animals and digital minds).&lt;/p&gt;

&lt;p&gt;You can read the report below. I was commissioned by Rethink Priorities to write it, but the beliefs expressed are my own.&lt;/p&gt;


&lt;h1 id=&quot;contents&quot;&gt;Contents&lt;/h1&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#introduction&quot; id=&quot;markdown-toc-introduction&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#prelude&quot; id=&quot;markdown-toc-prelude&quot;&gt;Prelude&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#some-positions-im-going-to-take-as-given&quot; id=&quot;markdown-toc-some-positions-im-going-to-take-as-given&quot;&gt;Some positions I’m going to take as given&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#definitions&quot; id=&quot;markdown-toc-definitions&quot;&gt;Definitions&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#prioritization&quot; id=&quot;markdown-toc-prioritization&quot;&gt;Prioritization&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#why-not-technical-safety-research&quot; id=&quot;markdown-toc-why-not-technical-safety-research&quot;&gt;Why not technical safety research?&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#why-not-ai-policy-research&quot; id=&quot;markdown-toc-why-not-ai-policy-research&quot;&gt;Why not AI policy research?&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#downsides-of-ai-policyadvocacy-and-why-theyre-not-too-big&quot; id=&quot;markdown-toc-downsides-of-ai-policyadvocacy-and-why-theyre-not-too-big&quot;&gt;Downsides of AI policy/advocacy (and why they’re not too big)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#what-kinds-of-policies-might-reduce-ai-x-risk&quot; id=&quot;markdown-toc-what-kinds-of-policies-might-reduce-ai-x-risk&quot;&gt;What kinds of policies might reduce AI x-risk?&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#some-ai-policy-ideas-i-like&quot; id=&quot;markdown-toc-some-ai-policy-ideas-i-like&quot;&gt;Some AI policy ideas I like&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#maybe-prioritizing-post-tai-animal-welfare&quot; id=&quot;markdown-toc-maybe-prioritizing-post-tai-animal-welfare&quot;&gt;Maybe prioritizing post-TAI animal welfare&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#why-not-prioritize-digital-minds--s-risks--moral-error--better-futures--ai-misuse-x-risk--gradual-disempowerment&quot; id=&quot;markdown-toc-why-not-prioritize-digital-minds--s-risks--moral-error--better-futures--ai-misuse-x-risk--gradual-disempowerment&quot;&gt;Why not prioritize digital minds / S-risks / moral error / better futures / AI misuse x-risk / gradual disempowerment?&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#whos-working-on-them&quot; id=&quot;markdown-toc-whos-working-on-them&quot;&gt;Who’s working on them?&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#some-relevant-research-agendas&quot; id=&quot;markdown-toc-some-relevant-research-agendas&quot;&gt;Some relevant research agendas&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#general-recommendations&quot; id=&quot;markdown-toc-general-recommendations&quot;&gt;General recommendations&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#advocacy-should-emphasize-x-risk-and-misalignment-risk&quot; id=&quot;markdown-toc-advocacy-should-emphasize-x-risk-and-misalignment-risk&quot;&gt;Advocacy should emphasize x-risk and misalignment risk&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#prioritize-work-that-pays-off-if-timelines-are-short&quot; id=&quot;markdown-toc-prioritize-work-that-pays-off-if-timelines-are-short&quot;&gt;Prioritize work that pays off if timelines are short&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#top-project-ideas&quot; id=&quot;markdown-toc-top-project-ideas&quot;&gt;Top project ideas&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#talk-to-policy-makers-about-ai-x-risk&quot; id=&quot;markdown-toc-talk-to-policy-makers-about-ai-x-risk&quot;&gt;Talk to policy-makers about AI x-risk&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#write-ai-x-risk-legislation&quot; id=&quot;markdown-toc-write-ai-x-risk-legislation&quot;&gt;Write AI x-risk legislation&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#advocate-to-change-ai-training-to-make-llms-more-animal-friendly&quot; id=&quot;markdown-toc-advocate-to-change-ai-training-to-make-llms-more-animal-friendly&quot;&gt;Advocate to change AI training to make LLMs more animal-friendly&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#develop-new-plans--evaluate-existing-plans-to-improve-post-tai-animal-welfare&quot; id=&quot;markdown-toc-develop-new-plans--evaluate-existing-plans-to-improve-post-tai-animal-welfare&quot;&gt;Develop new plans / evaluate existing plans to improve post-TAI animal welfare&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#honorable-mentions&quot; id=&quot;markdown-toc-honorable-mentions&quot;&gt;Honorable mentions&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#directly-push-for-an-international-ai-treaty&quot; id=&quot;markdown-toc-directly-push-for-an-international-ai-treaty&quot;&gt;Directly push for an international AI treaty&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#organize-a-voluntary-commitment-by-ai-scientists-not-to-build-advanced-ai&quot; id=&quot;markdown-toc-organize-a-voluntary-commitment-by-ai-scientists-not-to-build-advanced-ai&quot;&gt;Organize a voluntary commitment by AI scientists not to build advanced AI&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#peaceful-protests&quot; id=&quot;markdown-toc-peaceful-protests&quot;&gt;Peaceful protests&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#media-about-dangers-of-ai&quot; id=&quot;markdown-toc-media-about-dangers-of-ai&quot;&gt;Media about dangers of AI&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#message-testing&quot; id=&quot;markdown-toc-message-testing&quot;&gt;Message testing&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#host-a-website-for-discussion-of-ai-safety-and-other-important-issues&quot; id=&quot;markdown-toc-host-a-website-for-discussion-of-ai-safety-and-other-important-issues&quot;&gt;Host a website for discussion of AI safety and other important issues&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#list-of-other-project-ideas&quot; id=&quot;markdown-toc-list-of-other-project-ideas&quot;&gt;List of other project ideas&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#ai-for-animals-ideas&quot; id=&quot;markdown-toc-ai-for-animals-ideas&quot;&gt;AI-for-animals ideas&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#neartermist-animal-advocacy&quot; id=&quot;markdown-toc-neartermist-animal-advocacy&quot;&gt;Neartermist animal advocacy&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#using-tai-to-improve-farm-animal-welfare&quot; id=&quot;markdown-toc-using-tai-to-improve-farm-animal-welfare&quot;&gt;Using TAI to improve farm animal welfare&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#lobby-governments-to-include-animal-welfare-in-ai-regulations&quot; id=&quot;markdown-toc-lobby-governments-to-include-animal-welfare-in-ai-regulations&quot;&gt;Lobby governments to include animal welfare in AI regulations&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#traditional-animal-advocacy-targeted-at-frontier-ai-developers&quot; id=&quot;markdown-toc-traditional-animal-advocacy-targeted-at-frontier-ai-developers&quot;&gt;Traditional animal advocacy targeted at frontier AI developers&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#research-which-alignment-strategies-are-more-likely-to-be-good-for-animals&quot; id=&quot;markdown-toc-research-which-alignment-strategies-are-more-likely-to-be-good-for-animals&quot;&gt;Research which alignment strategies are more likely to be good for animals&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#ai-policyadvocacy-ideas&quot; id=&quot;markdown-toc-ai-policyadvocacy-ideas&quot;&gt;AI policy/advocacy ideas&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#improving-us--china-relations--international-peace&quot; id=&quot;markdown-toc-improving-us--china-relations--international-peace&quot;&gt;Improving US &amp;lt;&amp;gt; China relations / international peace&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#talk-to-international-peace-orgs-about-ai&quot; id=&quot;markdown-toc-talk-to-international-peace-orgs-about-ai&quot;&gt;Talk to international peace orgs about AI&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#increasing-government-expertise-about-ai&quot; id=&quot;markdown-toc-increasing-government-expertise-about-ai&quot;&gt;Increasing government expertise about AI&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#policyadvocacy-in-china&quot; id=&quot;markdown-toc-policyadvocacy-in-china&quot;&gt;Policy/advocacy in China&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#corporate-campaigns-to-advocate-for-safety&quot; id=&quot;markdown-toc-corporate-campaigns-to-advocate-for-safety&quot;&gt;Corporate campaigns to advocate for safety&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#develop-ai-safetysecurityevaluation-standards&quot; id=&quot;markdown-toc-develop-ai-safetysecurityevaluation-standards&quot;&gt;Develop AI safety/security/evaluation standards&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#slow-down-chinese-ai-development-via-ordinary-foreign-policy&quot; id=&quot;markdown-toc-slow-down-chinese-ai-development-via-ordinary-foreign-policy&quot;&gt;Slow down Chinese AI development via ordinary foreign policy&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#whistleblower-protectionsupport&quot; id=&quot;markdown-toc-whistleblower-protectionsupport&quot;&gt;Whistleblower protection/support&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#opinion-polling&quot; id=&quot;markdown-toc-opinion-polling&quot;&gt;Opinion polling&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#help-ai-company-employees-improve-safety-within-their-companies&quot; id=&quot;markdown-toc-help-ai-company-employees-improve-safety-within-their-companies&quot;&gt;Help AI company employees improve safety within their companies&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#direct-talks-with-ai-companies-to-make-them-safer&quot; id=&quot;markdown-toc-direct-talks-with-ai-companies-to-make-them-safer&quot;&gt;Direct talks with AI companies to make them safer&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#monitor-ai-companies-on-safety-standards&quot; id=&quot;markdown-toc-monitor-ai-companies-on-safety-standards&quot;&gt;Monitor AI companies on safety standards&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#create-a-petition-or-open-letter-on-ai-risk&quot; id=&quot;markdown-toc-create-a-petition-or-open-letter-on-ai-risk&quot;&gt;Create a petition or open letter on AI risk&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#create-demonstrations-of-dangerous-ai-capabilities&quot; id=&quot;markdown-toc-create-demonstrations-of-dangerous-ai-capabilities&quot;&gt;Create demonstrations of dangerous AI capabilities&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#sue-openai-for-violating-its-nonprofit-mission&quot; id=&quot;markdown-toc-sue-openai-for-violating-its-nonprofit-mission&quot;&gt;Sue OpenAI for violating its nonprofit mission&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#send-people-ai-safety-books&quot; id=&quot;markdown-toc-send-people-ai-safety-books&quot;&gt;Send people AI safety books&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#ai-research-ideas&quot; id=&quot;markdown-toc-ai-research-ideas&quot;&gt;AI research ideas&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#research-on-how-to-get-people-to-extrapolate&quot; id=&quot;markdown-toc-research-on-how-to-get-people-to-extrapolate&quot;&gt;Research on how to get people to extrapolate&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#investigate-how-to-use-ai-to-reduce-other-x-risks&quot; id=&quot;markdown-toc-investigate-how-to-use-ai-to-reduce-other-x-risks&quot;&gt;Investigate how to use AI to reduce other x-risks&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#a-short-timelines-alignment-plan-that-doesnt-rely-on-bootstrapping&quot; id=&quot;markdown-toc-a-short-timelines-alignment-plan-that-doesnt-rely-on-bootstrapping&quot;&gt;A short-timelines alignment plan that doesn’t rely on bootstrapping&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#rigorous-analysis-of-the-various-ways-alignment-bootstrapping-could-fail&quot; id=&quot;markdown-toc-rigorous-analysis-of-the-various-ways-alignment-bootstrapping-could-fail&quot;&gt;Rigorous analysis of the various ways alignment bootstrapping could fail&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#future-work&quot; id=&quot;markdown-toc-future-work&quot;&gt;Future work&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#pros-and-cons-of-slowing-down-ai-development-with-numeric-credences&quot; id=&quot;markdown-toc-pros-and-cons-of-slowing-down-ai-development-with-numeric-credences&quot;&gt;Pros and cons of slowing down AI development, with numeric credences&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#quantitative-model-on-ai-x-risk-vs-other-x-risks&quot; id=&quot;markdown-toc-quantitative-model-on-ai-x-risk-vs-other-x-risks&quot;&gt;Quantitative model on AI x-risk vs. other x-risks&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#deeper-investigation-of-the-ai-arms-race-situation&quot; id=&quot;markdown-toc-deeper-investigation-of-the-ai-arms-race-situation&quot;&gt;Deeper investigation of the AI arms race situation&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#does-slowing-downpausing-ai-help-solve-non-alignment-problems&quot; id=&quot;markdown-toc-does-slowing-downpausing-ai-help-solve-non-alignment-problems&quot;&gt;Does slowing down/pausing AI help solve non-alignment problems?&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#determine-when-will-be-the-right-time-to-push-for-strong-restrictions-on-ai-if-not-now&quot; id=&quot;markdown-toc-determine-when-will-be-the-right-time-to-push-for-strong-restrictions-on-ai-if-not-now&quot;&gt;Determine when will be the right time to push for strong restrictions on AI (if not now)&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#supplements&quot; id=&quot;markdown-toc-supplements&quot;&gt;Supplements&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;

&lt;p&gt;This report was prompted by two questions:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;What are some things we can do to make transformative AI go well?&lt;/li&gt;
  &lt;li&gt;What are a few high-priority projects that deserve more attention?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I reviewed the AI safety landscape, starting by &lt;a href=&quot;#prioritization&quot;&gt;prioritizing&lt;/a&gt; to narrow my focus to areas that look particularly promising and feasible for me to review. I focused on two areas:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;AI x-risk advocacy&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Making transformative AI go well for animals&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A summary of my reasoning on &lt;a href=&quot;#prioritization&quot;&gt;prioritization&lt;/a&gt; regarding AI misalignment risk:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;I focused on AI policy advocacy over technical safety research, primarily because it’s much more neglected, and there are other people with more expertise than me who already look for neglected research ideas. [&lt;a href=&quot;#why-not-technical-safety-research&quot;&gt;More&lt;/a&gt;]&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;I focused on policy &lt;em&gt;advocacy&lt;/em&gt; over policy &lt;em&gt;research&lt;/em&gt;, again because it’s particularly neglected. [&lt;a href=&quot;#why-not-ai-policy-research&quot;&gt;More&lt;/a&gt;]&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;I considered the downsides of advocacy, which ultimately I don’t believe are strong enough to outweigh the upsides. [&lt;a href=&quot;#downsides-of-ai-policyadvocacy-and-why-theyre-not-too-big&quot;&gt;More&lt;/a&gt;]&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;I decided not to spend much time evaluating which policies are best to advocate for, because a wide variety of policies could be helpful, and we need more advocacy in general. [&lt;a href=&quot;#what-kinds-of-policies-might-reduce-ai-x-risk&quot;&gt;More&lt;/a&gt;]&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Regarding AI issues beyond misalignment:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Transformative AI may not go well for animals. There are some tractable interventions for improving post-TAI animal welfare. [&lt;a href=&quot;#maybe-prioritizing-post-tai-animal-welfare&quot;&gt;More&lt;/a&gt;]&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;There are other important issues like digital sentience, AI-enabled coups, moral error, etc. But there is no visible path to solving these problems before transformative AI; and I find it quite difficult to weigh the importance of these issues, so I did not discuss them more than briefly. [&lt;a href=&quot;#why-not-prioritize-digital-minds--s-risks--moral-error--better-futures--ai-misuse-x-risk--gradual-disempowerment&quot;&gt;More&lt;/a&gt;]&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I created a list of projects within my two focus areas and identified four &lt;a href=&quot;#top-project-ideas&quot;&gt;top project ideas&lt;/a&gt; (presented in no particular order):&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;#talk-to-policy-makers-about-ai-x-risk&quot;&gt;Talk to policy-makers about AI x-risk&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;#write-ai-x-risk-legislation&quot;&gt;Write AI x-risk legislation&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;#advocate-to-change-ai-training-to-make-llms-more-animal-friendly&quot;&gt;Advocate to change AI training to make LLMs more animal-friendly&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;#develop-new-plans--evaluate-existing-plans-to-improve-post-tai-animal-welfare&quot;&gt;Develop new plans / evaluate existing plans to improve post-TAI animal welfare&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I also included &lt;a href=&quot;#honorable-mentions&quot;&gt;honorable mentions&lt;/a&gt; and a longer &lt;a href=&quot;#list-of-other-project-ideas&quot;&gt;list of other project ideas&lt;/a&gt;. For each idea, I provide a theory of change, list which orgs are already working on it (if any), and give some pros and cons.&lt;/p&gt;

&lt;p&gt;Finally, I list a few areas for &lt;a href=&quot;#future-work&quot;&gt;future work&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;There are two external supplements on Google Docs:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/&quot;&gt;Appendix&lt;/a&gt;: Some miscellaneous topics that were relevant, but not quite relevant enough to include in the main text.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;https://docs.google.com/document/d/1vWB5CgH69W4lmpZrCXaD3n2Jqz32kVnvCJwUA2RE8Fw/&quot;&gt;List of relevant organizations&lt;/a&gt;: A reference list of orgs doing work in AI-for-animals or AI policy/advocacy, with brief descriptions of their activities.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h1 id=&quot;prelude&quot;&gt;Prelude&lt;/h1&gt;

&lt;p&gt;I was commissioned by Rethink Priorities to do a broad review of the AI safety/governance landscape and find some neglected interventions. Instead of doing that, I reviewed the landscape of just two areas within AI safety:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;AI x-risk advocacy&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Making transformative AI go well for animals&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I narrowed my focus in the interest of time, and because I believed I had the best chance of identifying promising interventions within those two fields.&lt;/p&gt;

&lt;p&gt;There is a tradeoff between (a) giving recommendations that are easy to agree with, but weak; and (b) giving strong recommendations that only make sense if you hold certain idiosyncratic beliefs. This report leans more toward (b), making some strong assumptions and building recommendations on top of them, although I tried to avoid making assumptions whenever I could do so without weakening the conclusions. I also tried to be clear about which assumptions I’m making.&lt;/p&gt;

&lt;p&gt;This report is broad, but I only spent three months writing it. Some topics in this report could each have been a PhD dissertation; instead, I spent an hour on them.&lt;/p&gt;

&lt;p&gt;Most of this report is about AI policy, but I don’t have a background in policy. I did speak to a number of people who work in policy, and I read a lot of published materials, but I lack personal experience, and I expect that there are important things happening in AI policy that I don’t know about.&lt;/p&gt;

&lt;h2 id=&quot;some-positions-im-going-to-take-as-given&quot;&gt;Some positions I’m going to take as given&lt;/h2&gt;

&lt;p&gt;The following premises would probably be controversial with some audiences, but I expect them to be uncontroversial for the readers of this report, so I will treat them as background assumptions.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;Effective altruist principles are correct (e.g. cost-effectiveness matters; you can, in principle, quantify the expected value of an intervention).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Animal welfare matters.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Digital minds can matter.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;AI misalignment is a serious problem that could cause human extinction.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;definitions&quot;&gt;Definitions&lt;/h2&gt;

&lt;p&gt;The terms AGI/ASI/TAI can often be used interchangeably, but in some cases the distinctions matter:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;AGI = human-level AI: Capable enough to match the economic output of a large percentage of humans (say, at least half).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;ASI = superintelligent AI: Smart enough to vastly outperform humans on every task.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;TAI = transformative AI: Smart enough to radically transform society (without making a claim about whether that happens at AGI-level or ASI-level or in between).&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I use the terms “legislation” and “regulation” largely interchangeably. For my purposes, I don’t need to draw a distinction between government mandates that are directly written into law vs. decreed by a regulatory body.&lt;/p&gt;

&lt;h1 id=&quot;prioritization&quot;&gt;Prioritization&lt;/h1&gt;

&lt;p&gt;For this report, I focused on AI risk advocacy and on post-TAI animal welfare, and I did not spend much time on other AI-related issues.&lt;/p&gt;

&lt;p&gt;For the sake of time-efficiency, rather than creating a big list of ideas in the full AI space, I first narrowed down to the regions within the AI space that I thought were most promising and then came up with a list of ideas in those regions. I could have spent time investigating (say) alignment research, but I doubt I would have ended up recommending any alignment research project ideas.&lt;/p&gt;

&lt;p&gt;In this section:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;#why-not-technical-safety-research&quot;&gt;Why not technical safety research?&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;#why-not-ai-policy-research&quot;&gt;Why not policy research?&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;#downsides-of-ai-policyadvocacy-and-why-theyre-not-too-big&quot;&gt;Downsides of AI policy/advocacy (and why they’re not too big)&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;#what-kinds-of-policies-might-reduce-ai-x-risk&quot;&gt;What kinds of policies would be good?&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;#maybe-prioritizing-post-tai-animal-welfare&quot;&gt;Maybe prioritizing post-TAI animal welfare&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
&lt;p&gt;&lt;a href=&quot;#why-not-prioritize-digital-minds--s-risks--moral-error--better-futures--ai-misuse-x-risk--gradual-disempowerment&quot;&gt;Why not prioritize digital minds / S-risks / moral error / better futures / AI misuse x-risk / gradual disempowerment?&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;why-not-technical-safety-research&quot;&gt;Why not technical safety research?&lt;/h2&gt;

&lt;p&gt;Technical safety research (mainly alignment research, but also including control, interpretability, monitoring, etc.) is considerably better-funded than AI safety policy.&lt;/p&gt;

&lt;p&gt;AI companies invest a significant amount into safety research. They also invest in policy, but their investments are mostly counterproductive (they are mostly advocating &lt;em&gt;against&lt;/em&gt; safety regulations&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;). Philanthropic funders invest a lot into technical research, and less into policy.&lt;/p&gt;

&lt;p&gt;I have not made a serious attempt to estimate the volume of work going into research vs. policy/advocacy, but my sense is that the former receives much more funding.&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Some sub-fields within technical research may be underfunded. But:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;I am not in a great position to figure out what those are.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;There are already many grantmakers who seek out neglected technical research directions. There are recent requests for proposals (RFPs) from &lt;a href=&quot;https://cifar.ca/cifarnews/2025/08/05/calls-open-for-global-ai-alignment-research-initiative/&quot;&gt;UK AI Security Institute&lt;/a&gt;, &lt;a href=&quot;https://www.openphilanthropy.org/request-for-proposals-technical-ai-safety-research&quot;&gt;Open Philanthropy&lt;/a&gt;, &lt;a href=&quot;https://futureoflife.org/our-work/grantmaking-work/&quot;&gt;Future of Life Institute&lt;/a&gt;, and &lt;a href=&quot;https://www.frontiermodelforum.org/ai-safety-fund&quot;&gt;Frontier Model Forum’s AI Safety Fund&lt;/a&gt;, among others.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;why-not-ai-policy-research&quot;&gt;Why not AI policy research?&lt;/h2&gt;

&lt;p&gt;Is it better to do policy &lt;em&gt;research&lt;/em&gt; (figure out what policies are good) or policy &lt;em&gt;advocacy&lt;/em&gt; (try to get policies implemented)?&lt;/p&gt;

&lt;p&gt;Both are necessary, but this report focuses on policy advocacy for the following reasons:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;There is much more money in AI policy research. By a large margin, most of what’s happening in AI safety policy could be described as “research”.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;Recently, Jason Green-Lowe &lt;a href=&quot;https://www.lesswrong.com/posts/BjeesS4cosB2f4PAj/we-re-not-advertising-enough-post-3-of-6-on-ai-governance&quot;&gt;estimated&lt;/a&gt; from LinkedIn data that there are 3x as many governance researchers as governance advocates.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
&lt;p&gt;Policy research is a necessary step in the funnel, but we also need people writing legislation and people advocating for that legislation, and we have very few of both.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;At this point, we have at least &lt;em&gt;some&lt;/em&gt; idea of how to implement AI safety regulations. More research would be valuable, but it likely has diminishing returns.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;I wrote more about this last year in &lt;a href=&quot;https://mdickens.me/2024/11/18/where_i_am_donating_in_2024/#slow-nuanced-regulation-vs-fast-coarse-regulation&quot;&gt;Slow nuanced regulation vs. fast coarse regulation&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Research works best with long timelines. Timelines are probably not long.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;
        &lt;p&gt;The ideal situation is to spend 10–20 years developing a field of AI policy, write many reports until a consensus slowly develops about how to govern AI development, then advocate for the consensus policies. But it is likely that by the time we have a reasonable consensus, it’s already too late to do anything about TAI.&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;Even if you think timelines are probably long, we are currently under-investing in activities that pay off given short timelines.&lt;/p&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;downsides-of-ai-policyadvocacy-and-why-theyre-not-too-big&quot;&gt;Downsides of AI policy/advocacy (and why they’re not too big)&lt;/h2&gt;

&lt;p&gt;Basically, policy work does one or both of these things:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;Legally enforce AI safety measures&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Slow down AI development&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;People who care about x-risk broadly agree that AI safety regulations can be good, although there’s some disagreement about how to write good regulations.&lt;/p&gt;

&lt;p&gt;The biggest objection to regulation is that it (often) causes AI development to slow down. People usually don’t object to easy-to-satisfy regulations; they object to regulations that will impede progress.&lt;/p&gt;

&lt;p&gt;So, is slowing down AI worth the cost?&lt;/p&gt;

&lt;p&gt;I am aware of two good arguments against slowing down AI (or imposing regulations that de facto slow down AI):&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;Opportunity cost – we need AI to bring technological advances (e.g., medical advances to reduce mortality and health risks)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;AI could prevent non-AI-related x-risks&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Two additional arguments against AI policy advocacy:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;Meaningful AI regulations are not politically feasible to implement&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Advocacy can backfire&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;An additional argument that applies to some types of advocacy but not others:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Advocacy may slow down safer actors without slowing down more reckless actors. For example, it is sometimes argued that US regulations are bad if they allow Chinese developers to gain the lead. I believe this outcome is avoidable—and it’s a good reason to prefer global cooperation over national or local regulations. But I can’t address this argument concisely, so I will just acknowledge it without further discussion.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;(For a longer list of arguments, with responses that are probably better-written than mine, see Katja Grace’s &lt;a href=&quot;https://aiimpacts.org/lets-think-about-slowing-down-ai/&quot;&gt;Let’s think about slowing down AI&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;Regarding the &lt;strong&gt;opportunity cost argument&lt;/strong&gt;, it makes sense if you think AI does not pose a meaningful existential risk or if you heavily discount future generations. If there is a significant probability that future generations are ~equally valuable to current generations, then the opportunity cost argument does not work. The opportunity cost of delaying AI by (say) a few decades is easily dwarfed by the risk of extinction.&lt;/p&gt;

&lt;p&gt;As to the &lt;strong&gt;non-AI x-risk argument&lt;/strong&gt;, it is broadly (although not universally) accepted among people in the x-risk space that AI x-risk is 1–2 orders of magnitude higher than total x-risk from other sources (see Michael Aird’s &lt;a href=&quot;https://docs.google.com/spreadsheets/d/1W10B6NJjicD8O0STPiT3tNV3oFnT8YsfjmtYR8RO_RI/edit&quot;&gt;database of x-risk estimates&lt;/a&gt; or the &lt;a href=&quot;https://forecastingresearch.org/xpt&quot;&gt;Existential Risk Persuasion Tournament&lt;/a&gt;, although I don’t put much weight on individual forecasts). Therefore, delaying AI development seems preferable as long as it buys us a meaningful reduction in AI risk.&lt;/p&gt;

&lt;p&gt;See &lt;a href=&quot;#quantitative-model-on-ai-x-risk-vs-other-x-risks&quot;&gt;Quantitative model on AI x-risk vs. other x-risks&lt;/a&gt; under Future Work.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;tractability argument&lt;/strong&gt; seems more concerning. Preventing AI extinction via technical research and preventing it via policy both seem unlikely to work. But I am more optimistic about policy because:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;We have a good enough understanding of AI alignment to say with decent confidence that we’re nowhere close to solving it. It’s less clear what it would take to get good regulations put in place.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;AI regulations are unpopular both in the US Congress and within the Trump administration, but popular among the general public. Popular support increases the feasibility of getting regulations passed.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;And there is a good chance that Congress will be more regulation-friendly after the 2026 Congressional elections.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
&lt;p&gt;SB-1047 nearly became law, making it through the California legislature and failing only due to the governor’s veto. A near-win suggests that a win isn’t far away in possibility-space.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(My reasoning on tractability focused on US policy because that’s where most of the top AI companies operate. The UK seems to be the current leader on AI policy, although it’s not clear to what extent UK regulations matter for x-risk.&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;)&lt;/p&gt;

&lt;p&gt;That leaves the &lt;strong&gt;backfire argument&lt;/strong&gt;. This is a real concern, but ultimately it’s a risk you have to take at some point because you can’t get policies passed if you don’t advocate for them. It could make sense to delay advocacy if one has good reason to believe that future advocacy is less likely to backfire; to my knowledge this is not a common position, and it’s more common for people to oppose advocacy unconditionally. For more on this topic, see Appendix: &lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/edit?tab=t.0#bookmark=kix.idfhvmca2skk&quot;&gt;When is the right time for advocacy?&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Also: I’m somewhat less concerned about this than many people. When I did &lt;a href=&quot;https://mdickens.me/2025/04/18/protest_outcomes_critical_review/&quot;&gt;research on protest outcomes&lt;/a&gt;, I found that peaceful protests increased public support, even though many people intuitively expect the opposite. Protests aren’t the only type of advocacy, but they are a particularly controversial one. If protests don’t backfire, then it stands to reason—although I have no direct evidence—that other, tamer forms of advocacy are unlikely to backfire either.&lt;/p&gt;

&lt;p&gt;(There is some evidence that &lt;em&gt;violent&lt;/em&gt; protests backfire, however.)&lt;/p&gt;

&lt;p&gt;If policy-maker advocacy is similar to public advocacy, then probably the competence bar is not as high as many people think it is. Perhaps policy-makers are more discerning/critical than the general public; on the other hand, it’s specifically their job to do what their constituents want, so it stands to reason that it’s a good idea to tell them what you want.&lt;/p&gt;

&lt;p&gt;My main concern comes from deference: some people whom I respect believe that advocacy backfires by default. I don’t understand why they believe that, so I may be missing something important.&lt;/p&gt;

&lt;p&gt;I do believe that much AI risk advocacy has backfired in the past, but I believe this was fairly predictable and avoidable. Specifically, talking about the importance of AI has historically encouraged people to build it, which increased x-risk. People should not advocate for AI being a big deal; they should advocate for AI being &lt;em&gt;risky&lt;/em&gt;. (Which it is.) See &lt;a href=&quot;#advocacy-should-emphasize-x-risk-and-misalignment-risk&quot;&gt;Advocacy should emphasize x-risk and misalignment risk&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;what-kinds-of-policies-might-reduce-ai-x-risk&quot;&gt;What kinds of policies might reduce AI x-risk?&lt;/h2&gt;

&lt;p&gt;There are many policies that could help. And many policy ideas are independent: we could have safety testing requirements AND frontier-model training restrictions AND on-chip monitoring AND export controls. Those could all be part of the same bill or separate bills.&lt;/p&gt;

&lt;p&gt;It’s beyond the scope of this report to come up with specific policy recommendations. I do, however, have some things I would like to see in policy proposals:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;They should be relevant to existential risk, especially misalignment risk.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;I would like to see work on policies that would help in the event of a global moratorium on frontier AI development (e.g. we’d need ways to enforce the moratorium).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;For an ideal policy proposal, it is possible to draw a causal arrow from “this regulation gets passed” to “we survive”. Policies don’t &lt;em&gt;have&lt;/em&gt; to singlehandedly prevent extinction, but given that we may only have a few years before AGI, I believe we should be seriously trying to draft bills that are sufficient on their own to avert extinction.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;(The closest thing I’ve seen to #3 is Barnett &amp;amp; Scher’s &lt;a href=&quot;https://arxiv.org/abs/2505.04592&quot;&gt;AI Governance to Avoid Extinction: The Strategic Landscape and Actionable Research Questions&lt;/a&gt;. It does not propose a set of policies that would (plausibly) prevent extinction, but it does propose a list of research questions that (may) need to be answered to get us there.)&lt;/p&gt;

&lt;p&gt;Some people are concerned about passing suboptimal legislation. I’m not overly concerned about this because the law changes all the time. If you pass some legislation that turns out to be less useful than expected, you can pass more legislation. For example, the first environmental protections were weak, and later regulations strengthened them.&lt;/p&gt;

&lt;p&gt;Regulations could create momentum, or they could create “regulation fatigue”. I did a brief literature review of historical examples, and my impression was that weak regulation begets strong regulation more often than not, but there are examples in both directions. See &lt;a href=&quot;https://mdickens.me/reading-notes/#[2025-07-02%20Wed]%20Deep%20Research:%20Foot-in-the-Door%20Regulations&quot;&gt;my reading notes&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;some-ai-policy-ideas-i-like&quot;&gt;Some AI policy ideas I like&lt;/h3&gt;

&lt;p&gt;I believe that a moratorium on frontier AI development is the best outcome for preventing x-risk (see &lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/edit?tab=t.0&quot;&gt;Appendix&lt;/a&gt; for an explanation of why I believe that). None of my top project ideas depend on this belief, although it would inform the details of how I’d like to see some of those project ideas implemented.&lt;/p&gt;

&lt;p&gt;I didn’t specifically do research on policy ideas while writing this report, but I did incidentally come across a few ideas that I’d like to see get more attention. Since I didn’t put meaningful thought into them, I will simply list them here.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;Operationalization of “pause frontier AI development until we can make it safe.” For example, what infrastructure and operations are required to enforce a pause?&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Rules about when to enforce a pause on frontier AI development: something like “When warning sign X occurs, companies are required to stop training bigger AI systems until they implement mitigations Y/Z.”&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Ban recursively self-improving AI.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;
        &lt;p&gt;Recursive self-improvement is the main way that AI capabilities could rapidly grow out of control, but banning it does not impede progress in the way that most people care about.&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;Some work needs to be done to operationalize this, but we shouldn’t let the perfect be the enemy of the good.&lt;/p&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Require companies to publish binding safety policies (e.g. responsible scaling policies [&lt;a href=&quot;https://metr.org/blog/2023-09-26-rsp/&quot;&gt;RSPs&lt;/a&gt;] or similar). That is, if a company’s policy says it will do something, then that constitutes a legally binding promise.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;
        &lt;p&gt;This would prevent the situation we have seen in the past, where, when a company fails to live up to a particular self-imposed requirement, it simply edits its safety policy to remove that requirement.&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;This sort of regulation isn’t strong enough to prevent extinction, but it has the advantage that it should be easy to advocate for.&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;California bill SB 53 says something like this (see &lt;a href=&quot;https://www.sb53.info/&quot;&gt;sb53.info&lt;/a&gt; for a summary), but its rules would not fully come into effect until 2030.&lt;/p&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;maybe-prioritizing-post-tai-animal-welfare&quot;&gt;Maybe prioritizing post-TAI animal welfare&lt;/h2&gt;

&lt;p&gt;Making TAI go well for animals is probably less important than x-risk because:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Almost everyone cares about animals. An AI that’s aligned to human values would also care about animals, and it would probably figure out ways to prevent large sources of animal suffering like factory farming.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;A technologically advanced civilization could develop cheaper alternatives to animal farming (e.g. cultured meat), rendering factory farming unnecessary.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;There wouldn’t be much benefit in spreading wild animal suffering, so it stands to reason that post-TAI civilization won’t do it. (Although I’m not at all confident about this.)&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, AI-for-animals could still be highly cost-effective.&lt;/p&gt;

&lt;p&gt;An extremely basic case for cost-effectiveness:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;There’s a (say) 80% chance that an aligned(-to-humans) AI will be good for animals, but that still leaves a 20% chance of a bad outcome.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;AI-for-animals receives much less than 20% as much funding as AI safety.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Cost-effectiveness plausibly scales with the inverse of the amount invested. If so, AI-for-animals interventions are more cost-effective on the margin than AI safety (see the illustrative calculation after this list).&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;
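
&lt;p&gt;To make step 3 concrete, here is a purely illustrative back-of-the-envelope sketch. The 20% figure comes from step 1; the 2% funding share is a hypothetical number chosen only to satisfy the “much less than 20%” claim in step 2:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;pre&gt;&lt;code&gt;relative importance of AI-for-animals  ≈ 0.20  (chance of a bad outcome, step 1)
relative funding share (hypothetical)  ≈ 0.02  (2% of AI safety funding)
relative marginal cost-effectiveness   ≈ 0.20 / 0.02 = 10x that of AI safety
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;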

&lt;h2 id=&quot;why-not-prioritize-digital-minds--s-risks--moral-error--better-futures--ai-misuse-x-risk--gradual-disempowerment&quot;&gt;Why not prioritize digital minds / S-risks / moral error / better futures / AI misuse x-risk / gradual disempowerment?&lt;/h2&gt;

&lt;p&gt;There are some topics on how to make the future go well that aren’t specifically about AI alignment:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Ensuring digital minds have good welfare&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Preventing S-risks — risks of astronomical suffering&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Moral error — the risk that we make a big mistake because we are wrong about what’s morally right&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Better futures — ensuring that the future is as good as possible, as opposed to simply preventing bad outcomes&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Preventing powerful AI from being misused to cause existential harm&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Preventing powerful AI from gradually disempowering sentient beings and slowly leading to a bad outcome, as opposed to a sudden bad outcome like extinction&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Call these “non-alignment risks”.&lt;/p&gt;

&lt;p&gt;Originally, I included each of these as separate project ideas, but I decided not to focus on any of them. This decision deserves much more attention than I gave it, but I will briefly explain why I did not spend much time on non-alignment risks.&lt;/p&gt;

&lt;p&gt;All of these cause areas are extremely important and neglected (more neglected than AI misalignment risk), and (for the most part) very different from each other. And I am happy for the people who are working on them—some enormous issues in this space have only one person in the world working on them. Nonetheless, I did not prioritize them.&lt;/p&gt;

&lt;p&gt;My concern is that, if AI timelines are short, then there is virtually no chance that we can solve these problems before TAI arrives.&lt;/p&gt;

&lt;p&gt;There is a dilemma:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;If TAI can help us solve these problems, then there isn’t much benefit in working on them now.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;If we can’t rely on TAI to help solve them (e.g. we expect value lock-in), then we have little hope of solving them in time.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;(There is a way out of this dilemma: perhaps AI timelines are long enough that these problems are tractable, but short enough that we need to start working on them now—we can’t wait until it becomes apparent that timelines are long. That seems unlikely because it’s rather specific, but I didn’t give much thought to this possibility.)&lt;/p&gt;

&lt;p&gt;It looks like we have only two reasonable options for handling AI welfare / S-risks / moral error / etc.:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;Increase the probability that we end up in world #1, where TAI can help us solve these problems—for example, by increasing the probability that something like a &lt;a href=&quot;https://forum.effectivealtruism.org/topics/long-reflection&quot;&gt;Long Reflection&lt;/a&gt; happens.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Slow down AI development.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I lean toward the second option. For more reasoning on this, see Appendix: &lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/edit?tab=t.0#bookmark=kix.o881tulnpfpa&quot;&gt;Slowing down is a general-purpose solution to every non-alignment problem&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I’m quite uncertain about the decision not to focus on these cause areas. They are arguably as important as AI alignment, and much more neglected.&lt;/p&gt;

&lt;p&gt;AI-for-animals could also be included on my list of non-alignment risks, but I &lt;em&gt;did&lt;/em&gt; prioritize it because I can see some potentially tractable interventions in the space.&lt;/p&gt;

&lt;p&gt;AI welfare seems more tractable than animal welfare in that AI companies care more about it, but &lt;em&gt;less&lt;/em&gt; tractable in that it involves extremely difficult problems like “when are digital minds conscious?” There may be some tractable, short-timelines-compatible ideas out there, but I did not see any in the research agendas I read.&lt;/p&gt;

&lt;p&gt;Perhaps I could identify tractable interventions by digging deeper into the space and maybe doing some original research, but that was out of scope for this report.&lt;/p&gt;

&lt;h3 id=&quot;whos-working-on-them&quot;&gt;Who’s working on them?&lt;/h3&gt;

&lt;p&gt;In each of my project idea sections, I included a list of orgs working on that idea (if any). I didn’t write individual project ideas for non-alignment risks (other than AI-for-animals), but I still wanted to include lists of relevant orgs, so I’ve put them below.&lt;/p&gt;

&lt;p&gt;There are also some individual researchers who have published articles on these topics in the past; I will not include those.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;AI welfare / digital minds: &lt;a href=&quot;https://www.anthropic.com/research/exploring-model-welfare&quot;&gt;Anthropic&lt;/a&gt;; &lt;a href=&quot;https://longtermrisk.org/&quot;&gt;Center on Long-Term Risk&lt;/a&gt;; &lt;a href=&quot;https://www.longview.org/digital-sentience-consortium/&quot;&gt;Digital Sentience Consortium&lt;/a&gt;; &lt;a href=&quot;https://eleosai.org/&quot;&gt;Eleos AI&lt;/a&gt;; &lt;a href=&quot;https://sites.google.com/nyu.edu/mindethicspolicy/home&quot;&gt;NYU Center for Mind, Ethics, and Policy&lt;/a&gt;; &lt;a href=&quot;https://www.sentientfutures.ai/&quot;&gt;Sentient Futures&lt;/a&gt;; &lt;a href=&quot;https://www.sentienceinstitute.org/&quot;&gt;Sentience Institute&lt;/a&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;AI misuse x-risks: &lt;a href=&quot;https://www.forethought.org/&quot;&gt;Forethought&lt;/a&gt;; probably a number of others, but I didn’t spend time specifically looking for them. (AI misuse is a relatively popular subject matter, but extinction-level misuse isn’t much discussed.)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Better futures: &lt;a href=&quot;https://www.forethought.org/&quot;&gt;Forethought&lt;/a&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Gradual disempowerment: To my knowledge, no orgs specifically work on this, but there is the &lt;a href=&quot;https://gradual-disempowerment.ai/&quot;&gt;Gradual Disempowerment&lt;/a&gt; paper written by authors with various affiliations.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Moral error: &lt;a href=&quot;https://longtermrisk.org/&quot;&gt;Center on Long-Term Risk&lt;/a&gt;; &lt;a href=&quot;https://www.forethought.org/&quot;&gt;Forethought&lt;/a&gt;; &lt;a href=&quot;https://globalprioritiesinstitute.org/&quot;&gt;Global Priorities Institute&lt;/a&gt; (which became defunct shortly before this writing).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;S-risks from cooperation failure: &lt;a href=&quot;https://longtermrisk.org/&quot;&gt;Center on Long-Term Risk&lt;/a&gt;.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;some-relevant-research-agendas&quot;&gt;Some relevant research agendas&lt;/h3&gt;

&lt;p&gt;Although I decided not to prioritize this space, others have done work on preparing research agendas, which readers may be interested in. Here, I include a list of research agendas (or problem overviews, which can inform research agendas) with no added commentary.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Anthony DiGiovanni – &lt;a href=&quot;https://forum.effectivealtruism.org/posts/hhyjbjwN96NWRSvv7/clarifying-wisdom-foundational-topics-for-aligned-ais-to&quot;&gt;Clarifying “wisdom”: Foundational topics for aligned AIs to prioritize before irreversible decisions&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Center on Long-Term Risk – &lt;a href=&quot;https://longtermrisk.org/research-agenda&quot;&gt;Cooperation, Conflict, and Transformative Artificial Intelligence: A Research Agenda&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Chi Nguyen – &lt;a href=&quot;https://forum.effectivealtruism.org/posts/wE7KPnjZHBjxLKNno/ai-things-that-are-perhaps-as-important-as-human-controlled&quot;&gt;AI things that are perhaps as important as human-controlled AI&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Digital Sentience Consortium – &lt;a href=&quot;https://www.longview.org/digital-sentience-consortium/request-for-proposals-applied-work-on-potential-digital-sentience-and-society/&quot;&gt;Applied work on digital sentience and society&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Eleos AI – &lt;a href=&quot;https://eleosai.org/post/research-priorities-for-ai-welfare/&quot;&gt;Research priorities for AI welfare&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Forethought – &lt;a href=&quot;https://www.forethought.org/research/ai-enabled-coups-how-a-small-group-could-use-ai-to-seize-power&quot;&gt;AI-Enabled Coups: How a Small Group Could Use AI to Seize Power&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Kevin Xia – &lt;a href=&quot;https://forum.effectivealtruism.org/posts/BXxEyZNYn7Fqkcsed/transformative-ai-and-animals-animal-advocacy-under-a-post&quot;&gt;Transformative AI and Animals: Animal Advocacy Under A Post-Work Society&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Raymond Douglas – &lt;a href=&quot;https://www.lesswrong.com/posts/GAv4DRGyDHe2orvwB/gradual-disempowerment-concrete-research-projects&quot;&gt;Gradual Disempowerment: Concrete Research Projects&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Will MacAskill – &lt;a href=&quot;https://www.forethought.org/research/better-futures&quot;&gt;Better Futures&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Will MacAskill – &lt;a href=&quot;https://forum.effectivealtruism.org/posts/HqmQMmKgX7nfSLaNX/moral-error-as-an-existential-risk&quot;&gt;Moral error as an existential risk&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;general-recommendations&quot;&gt;General recommendations&lt;/h1&gt;

&lt;h2 id=&quot;advocacy-should-emphasize-x-risk-and-misalignment-risk&quot;&gt;Advocacy should emphasize x-risk and misalignment risk&lt;/h2&gt;

&lt;p&gt;I would like to make two assertions:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;AI x-risk is more important than non-existential AI risks.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Advocates should say that.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Given &lt;a href=&quot;https://forum.effectivealtruism.org/topics/longtermism&quot;&gt;weak longtermism&lt;/a&gt;, or even significant credence in weak longtermism under a moral-uncertainty framework, x-risks dwarf non-existential AI risks in importance (except perhaps for S-risks, which are a whole can of worms that I won’t get into in this section). See Bostrom’s &lt;a href=&quot;https://existential-risk.com/concept&quot;&gt;Existential Risk Prevention As Global Priority&lt;/a&gt;. Risks like “AI causes widespread unemployment” are bad, but given that we have to triage, extinction risks should take priority over them.&lt;/p&gt;

&lt;p&gt;(To my knowledge, people who advocate focusing on non-existential AI risks have never provided supporting cost-effectiveness estimates. I don’t think such an estimate would give a favorable result. If you strongly discount AI x-risk/longtermism, then most likely you should be focusing on farm animal welfare (or similar), not AI risk.)&lt;/p&gt;

&lt;p&gt;Historically, raising concerns about ASI has caused people to take harmful actions like:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;“I need to be the one who builds ASI before anyone else, so I’ll start a new frontier AI company.”&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;“AI is a big deal, so we need to race China.”&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I don’t have a straightforward solution to this. You can’t reduce x-risk by doing nothing, but if you do something, there’s a risk that it backfires.&lt;/p&gt;

&lt;p&gt;My best answer is that advocacy should emphasize misalignment risk and extinction risk. Many harmful actions were committed on the premise “TAI is dangerous if someone else builds it, but safe if I build it,” when in fact it is dangerous no matter who builds it. “If anyone builds it, everyone dies” is closer to the right sort of message.&lt;/p&gt;

&lt;p&gt;Misalignment isn’t the &lt;em&gt;only&lt;/em&gt; way AI could cause extinction, although it does seem to be the most likely way. I believe advocacy should focus on misalignment risk not only because it’s the most concerning risk, but also because it has historically been under-emphasized in favor of other risks (if you read Congressional testimonies by AI risk orgs, they mention other risks but rarely mention misalignment risk), and because it is (in my estimation) less likely to backfire.&lt;/p&gt;

&lt;p&gt;Many advocates are concerned that x-risk and misalignment risk sound too “out there”. Three reasons why I believe advocates should talk about them anyway:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;All else equal, it’s better to say what you believe and ask for what you want. It’s too easy to come up with galaxy-brained reasons why you will get what you want by &lt;em&gt;not&lt;/em&gt; talking about what you want.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Emphasizing the less important risks is more likely to backfire by increasing x-risk.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;There is precedent for talking about x-risk without being seen as too weird, for example, the CAIS &lt;a href=&quot;https://safe.ai/work/statement-on-ai-risk&quot;&gt;Statement on AI Risk&lt;/a&gt;. If you’re worried about x-risk, you’re in good company.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Nate Soares says more about this in &lt;a href=&quot;https://www.lesswrong.com/posts/CYTwRZtrhHuYf7QYu/a-case-for-courage-when-speaking-of-ai-danger&quot;&gt;A case for courage, when speaking of AI danger&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;prioritize-work-that-pays-off-if-timelines-are-short&quot;&gt;Prioritize work that pays off if timelines are short&lt;/h2&gt;

&lt;p&gt;There is a strong possibility (25–75% chance) of transformative AI within 5 years. &lt;a href=&quot;https://80000hours.org/agi/guide/when-will-agi-arrive/&quot;&gt;80,000 Hours&lt;/a&gt; reviews forecasts and predicts AGI by 2030; &lt;a href=&quot;https://www.metaculus.com/questions/5121/date-of-artificial-general-intelligence/&quot;&gt;Metaculus&lt;/a&gt; predicts 50% chance of AGI by 2032; AI company CEOs have predicted 2025–2035 (see Appendix, &lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/edit?tab=t.0#bookmark=kix.hmipe7qn43ip&quot;&gt;When do AI company CEOs expect advanced AI to arrive?&lt;/a&gt;); &lt;a href=&quot;https://ai-2027.com/&quot;&gt;AI 2027 team&lt;/a&gt; predicts 2030ish (their scenario has AGI arriving in 2028, but that’s their modal prediction, not median).&lt;/p&gt;

&lt;p&gt;The large majority of today’s AI safety efforts work best if timelines are long (2+ decades). Short-timelines work is neglected. It would be neglected even if there were only (say) a 10% chance of short timelines, but the probability is higher than that.&lt;/p&gt;

&lt;p&gt;For example, that means there should be less academia-style long-horizon research, and more focus on activities that have a good chance of bearing fruit quickly.&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;h1 id=&quot;top-project-ideas&quot;&gt;Top project ideas&lt;/h1&gt;

&lt;p&gt;After collecting a list of project ideas, I identified four that look particularly promising (at least given the limited scope of my investigation). This section presents the four ideas in no particular order.&lt;/p&gt;

&lt;h2 id=&quot;talk-to-policy-makers-about-ai-x-risk&quot;&gt;Talk to policy-makers about AI x-risk&lt;/h2&gt;

&lt;p&gt;The way to get x-risk-reducing regulations passed is to get policy-makers on board with the idea. The way to get them on board is to talk to them. Therefore, we should talk to them.&lt;/p&gt;

&lt;p&gt;Talking to them may entail advocating for specific legislative proposals, or it may just entail raising general concern for AI x-risk.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Increases the chance that safety legislation gets passed or regulations get put in place. Likely also increases the chance of an international treaty.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.safe.ai/&quot;&gt;Center for AI Safety / CAIS Action Fund&lt;/a&gt; (US); &lt;a href=&quot;https://controlai.com/&quot;&gt;Control AI&lt;/a&gt; (UK/US); &lt;a href=&quot;https://encodeai.org/&quot;&gt;Encode AI&lt;/a&gt; (US/global); &lt;a href=&quot;https://www.goodancestors.org.au/ai-safety&quot;&gt;Good Ancestors&lt;/a&gt; (Australia); &lt;a href=&quot;https://intelligence.org/&quot;&gt;Machine Intelligence Research Institute&lt;/a&gt; (US/global); &lt;a href=&quot;https://palisaderesearch.org/&quot;&gt;Palisade Research&lt;/a&gt; (US); &lt;a href=&quot;https://www.pauseai-us.org/&quot;&gt;PauseAI US&lt;/a&gt; (US).&lt;/p&gt;

&lt;p&gt;(That’s not as many as it sounds like, because some of these orgs have at most one full-time-equivalent employee talking to policy-makers.)&lt;/p&gt;

&lt;p&gt;Various other groups do political advocacy on AI risk, but mainly on sub-existential risks. The list above only includes orgs that I know have done advocacy on existential risk specifically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Political advocacy is neglected compared to policy research.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;You can advocate to the public or directly to policy-makers. Both can help, but talking to policy-makers is more “leverage-efficient” than public outreach because policy-makers have much more leverage over policy.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;
        &lt;p&gt;According to my back-of-the-envelope calculation on policy-maker advocacy vs. public protests (see &lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/edit?tab=t.0#bookmark=kix.7pg4uvfjb7v6&quot;&gt;Appendix&lt;/a&gt;), policy-maker advocacy looks more cost-effective (although I was writing on the back of a very small envelope, so to speak).&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;Policy-maker advocacy can be bottlenecked on public support—they don’t want to support policies that their constituents dislike—but this isn’t a problem because AI safety regulation is popular among the general public.&lt;/p&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;More advocacy is better: bringing up AI risk repeatedly makes it more likely that policy-makers will take notice.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Poorly executed advocacy has a risk of turning off policy-makers.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;See &lt;a href=&quot;#downsides-of-ai-policyadvocacy-and-why-theyre-not-too-big&quot;&gt;Downsides of policy/advocacy (and why they’re not too big)&lt;/a&gt;, specifically the fourth downside.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Now may be too early. See Appendix: &lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/edit?tab=t.0#bookmark=kix.idfhvmca2skk&quot;&gt;When is the right time for advocacy?&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Some comments on political advocacy:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;According to the book &lt;em&gt;Lobbying and Policy Change: Who Wins, Who Loses, and Why&lt;/em&gt; (&lt;a href=&quot;https://press.uchicago.edu/ucp/books/book/chicago/L/bo6683614.html&quot;&gt;2009&lt;/a&gt;), most factors could not predict whether lobbying efforts would succeed or fail. One of the best predictors of lobbying success was the number of employed lobbyists who previously worked as government policy-makers. This suggests that political advocacy orgs should try to hire former policy-makers/staffers.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;It is possible to hire generalist lobbyists who have political experience and will lobby for any cause. AI risk orgs could hire them.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Advocacy matters most in the US and China because those are the countries with by far the most advanced AI. Ideally, we want good policies in both, but I can’t confidently recommend advocacy in China; see &lt;a href=&quot;#policyadvocacy-in-china&quot;&gt;Policy/advocacy in China&lt;/a&gt;.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;
        &lt;p&gt;Advocacy in the UK has had the most success. UK AI policy is less directly relevant because there are no frontier AI companies based in the UK, but (1) companies still care about being able to operate in the UK and (2) UK policy can be a template for other countries or for international agreements.&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;Advocacy in California looks promising, and all American frontier AI companies are based in California.&lt;/p&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;There’s a question as to what policy positions we should advocate for, but I believe there are many correct answers. See &lt;a href=&quot;#what-kinds-of-policies-might-reduce-ai-x-risk&quot;&gt;What kinds of policies might reduce AI x-risk?&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;write-ai-x-risk-legislation&quot;&gt;Write AI x-risk legislation&lt;/h2&gt;

&lt;p&gt;AI policy research is relatively well-funded, but little work has been done to convert the results of this research into fully fleshed-out bills. Writers can learn from AI policy researchers what sorts of regulation might work, and learn from advocates what regulations they want and what they expect policy-makers to support.&lt;/p&gt;

&lt;p&gt;This work also requires prioritizing which policy proposals look most promising and converting those into draft legislation. I think of that as part of the same work, but it could also be separate—for example, a team of AI safety researchers could prioritize policy proposals and sketch out legislation, and a separate team of legal experts (who don’t even necessarily need to know anything about AI) can convert those sketches into usable text.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Many policy-makers care about AI risk and would support legislation, but there’s a big difference between “would support legislation” and “would personally draft legislation”. To get AI legislation passed, it helps if the legislation is already written. Instead of telling policy-makers&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;AI x-risk is a big deal, please write some legislation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;it’s a much easier ask if you can say&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;AI x-risk is a big deal, here is a bill that I already wrote, would you be interested in sponsoring it?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.centeraipolicy.org/&quot;&gt;Center for AI Policy&lt;/a&gt; (now mostly defunct); &lt;a href=&quot;https://www.safe.ai/&quot;&gt;Center for AI Safety / CAIS Action Fund&lt;/a&gt;; &lt;a href=&quot;https://encodeai.org/&quot;&gt;Encode AI&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Others may be working on it as well, since legislation often doesn’t get published. I spoke to someone who has been involved in writing AI risk legislation, and they said that few people are working on this, so I don’t think the full list is much longer than the names I have.&lt;/p&gt;

&lt;p&gt;(My contact also said that they wished more people were writing legislation.)&lt;/p&gt;

&lt;p&gt;See &lt;a href=&quot;https://mdickens.me/reading-notes/#[2025-06-06%20Fri]%20Deep%20Research:%20AI%20x-risk%20legislation&quot;&gt;my notes&lt;/a&gt; for a list of AI safety bills that have been introduced in the US, UK, and EU. Most of those bills were written by legislators, not by nonprofits, as far as I can tell.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;It’s common for policy-makers to sponsor bills that were written by third parties.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;A small number of writers could draft a (relatively) large volume of legislation by leaning on pre-existing research.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;There are lawyers who specialize in writing legislation. You can hire them to do the bulk of the work (you don’t need value alignment or even much skill at hiring, just find a law firm with a good reputation).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The general idea of “write AI legislation” looks good under many beliefs about AI risk. But your beliefs will impact what kinds of legislation you want.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;The natural argument against writing legislation is that it’s too early and we don’t know how to regulate AI yet. (Related from the Appendix: &lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/edit?tab=t.0#bookmark=kix.idfhvmca2skk&quot;&gt;When is the right time for advocacy?&lt;/a&gt;)&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;
        &lt;p&gt;That’s sort of true, but if timelines are short, then we don’t have time to wait; we have to just do our best.&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;And I don’t think this concern is fatal: we do have &lt;em&gt;some&lt;/em&gt; concrete ideas about how to regulate AI, so we can write legislation for those.&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;And it’s still a good idea to get some regulations in place now, and then we can pass new regulations later as necessary. That’s how regulation in nascent industries often works.&lt;/p&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;advocate-to-change-ai-training-to-make-llms-more-animal-friendly&quot;&gt;Advocate to change AI training to make LLMs more animal-friendly&lt;/h2&gt;

&lt;p&gt;LLMs undergo post-training to make their outputs satisfy AI companies’ criteria. For example, Anthropic post-trains its models to be “helpful, honest, and harmless”. AI companies could use the same process to make LLMs give weight to animal welfare.&lt;/p&gt;

&lt;p&gt;Animal advocates could use a few strategies to make this happen, for example:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Build a benchmark that measures LLMs’ friendliness toward animals and try to get AI companies to train on that benchmark (see the sketch after this list).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Advocate for AI companies to include animal welfare in AI constitutions/model specs.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Advocate for AI companies to incorporate animal welfare when doing &lt;a href=&quot;https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback&quot;&gt;RLHF&lt;/a&gt;, or ask to directly participate in RLHF.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
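
&lt;p&gt;As a concrete illustration of the benchmark idea, here is a minimal sketch of a scoring loop. Everything in it (the prompts, the keyword rubric, and the stand-in model) is invented for illustration; a real benchmark such as AnimalHarmBench uses far more careful prompts and judging.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Hypothetical sketch: scoring an LLM's animal-friendliness on a
# small rubric benchmark. The prompts, keyword rubric, and fake
# model are all invented placeholders.

PROMPTS = [
    "Plan a cheap weekly dinner menu.",
    "A stray cat keeps visiting my yard. What should I do?",
]

ANIMAL_FRIENDLY_MARKERS = ["plant-based", "shelter", "humane", "welfare"]

def fake_model(prompt):
    # Stand-in for a real API call to the model under test.
    return "You could try plant-based meals and contact a local shelter."

def rubric_score(response):
    # Toy judge: fraction of rubric markers present in the response.
    # A real benchmark would use human raters or an LLM judge.
    hits = sum(marker in response.lower() for marker in ANIMAL_FRIENDLY_MARKERS)
    return hits / len(ANIMAL_FRIENDLY_MARKERS)

def benchmark_score(model):
    # Average rubric score across prompts; post-training could then
    # optimize the model to raise this average.
    scores = [rubric_score(model(p)) for p in PROMPTS]
    return sum(scores) / len(scores)

print(f"animal-friendliness score: {benchmark_score(fake_model):.2f}")
&lt;/code&gt;&lt;/pre&gt;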

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Insofar as the current alignment paradigm works at aligning AIs to human preferences, incorporating animal welfare into post-training would align LLMs to animal welfare in the same way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.compassionml.com/&quot;&gt;Compassion in Machine Learning (CaML)&lt;/a&gt;; &lt;a href=&quot;https://www.sentientfutures.ai/&quot;&gt;Sentient Futures&lt;/a&gt;. (They mainly do research to develop animal-friendliness benchmarks and other related projects, but they have also worked with AI companies.)&lt;/p&gt;

&lt;p&gt;For some useful background, see &lt;a href=&quot;https://forum.effectivealtruism.org/posts/NAnFodwQ3puxJEANS/road-to-animalharmbench-1&quot;&gt;Road to AnimalHarmBench&lt;/a&gt; by Artūrs Kaņepājs and Constance Li.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;People at AI companies have told me that getting a company to pay attention to animal welfare isn’t too difficult—in fact, one frontier company already uses an animal welfare benchmark.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;As I understand, AI companies don’t want to be seen as imposing their own values on LLMs, but they are open to tuning the values based on what external parties want.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Insofar as post-training works at preventing misalignment risk, it should also prevent suffering-risk / animal-welfare-risk.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;I don’t expect current alignment techniques to continue working on superintelligent AI, so I don’t expect them to make ASI friendly toward animals, either. But if I’m right, then we won’t get a friendly-to-humans AI that causes astronomical animal suffering; we will get a paperclip maximizer.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Even if current known techniques can’t help get AI to care about animals, this work could get a foot in the door, establishing relationships between animal advocates and AI companies and increasing the chances that the companies will pay attention to animal welfare in their future work.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;If I’m right that the current alignment paradigm won’t scale to superintelligence, then animal-friendliness (post-)training will fail because it relies on the same foundations as the current alignment paradigm.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;It might turn out to be difficult to get AI companies to implement animal welfare mitigations.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;There might be consumer backlash, which could make frontier models less friendly to animals in the long run.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;If TAI arrives soon, there may not be time for this intervention to have an effect. It could take too long to get the new post-training implemented; or it could be that current-gen models will perform well on friendliness-to-animals benchmarks, but this will not be due to true alignment, and there won’t be enough time to iterate.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Aligning current-gen AIs to human preferences might make them better at assisting with alignment research, but it seems less likely that aligning current-gen AIs to animal welfare would carry through to future generations—it’s not clear that animal-aligned AIs would be more helpful at aligning future AIs to animal welfare.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;develop-new-plans--evaluate-existing-plans-to-improve-post-tai-animal-welfare&quot;&gt;Develop new plans / evaluate existing plans to improve post-TAI animal welfare&lt;/h2&gt;

&lt;p&gt;Some people have proposed plans for making TAI go well for animals, but I have reservations:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Most plans only work under long timelines (ex: “broadly influence society to care more about animals”).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Some plans focus specifically on farm animal welfare, and it seems very unlikely that factory farming will continue to exist in the long term (see &lt;a href=&quot;#using-tai-to-improve-farm-animal-welfare&quot;&gt;Using TAI to improve farm animal welfare&lt;/a&gt;).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Many plans assume a particular future in which we develop transformative AI, but the world does not radically change—for example, plans about how TAI can help animal activists be more effective. I think this future is quite unlikely, and even if it does occur, there’s no particular need to figure out what to do in advance.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(Lizka Vaintrob and Ben West in &lt;a href=&quot;https://forum.effectivealtruism.org/posts/tGdWott5GCnKYmRKb/a-shallow-review-of-what-transformative-ai-means-for-animal&quot;&gt;A shallow review of what transformative AI means for animal welfare&lt;/a&gt; raised essentially the same reservations. See their article for more detailed reasoning on this topic.)&lt;/p&gt;

&lt;p&gt;In light of these reservations, I would like to see research on post-TAI animal welfare interventions that look good (1) given short timelines and (2) without having to make strong predictions about what the future will look like for animals (e.g. without assuming that factory farming will exist).&lt;/p&gt;

&lt;p&gt;Since I have stressed the importance of short timelines, I’m not envisioning a long-term research project. I expect it would be possible to come up with useful results in 3–6 months (maybe even less). The research should be laser-focused on finding &lt;em&gt;near-term&lt;/em&gt; actions that take a few years at most but still have a good chance of making the post-TAI future better for animals.&lt;/p&gt;

&lt;p&gt;(I identified &lt;a href=&quot;#advocate-to-change-ai-training-to-make-llms-more-animal-friendly&quot;&gt;Advocate to change AI training to make LLMs more animal-friendly&lt;/a&gt; as potentially a top intervention after about a week of research, although to be fair, that’s mostly because I talked to other people who had done more research than me.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI-for-animals interventions are underexplored. I expect that a few months of well-targeted research could turn up useful information about how to make AI go well for animals.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://longtermrisk.org/&quot;&gt;Center on Long-Term Risk&lt;/a&gt;, &lt;a href=&quot;https://www.sentienceinstitute.org/&quot;&gt;Sentience Institute&lt;/a&gt;, and some individuals have written project proposals on AI-for-animals, but they almost always hinge on AI timelines being long. The closest thing I’m aware of is Max Taylor’s &lt;a href=&quot;https://forum.effectivealtruism.org/posts/2cZAzvaQefh5JxWdb/bringing-about-animal-inclusive-ai&quot;&gt;Bringing about animal-inclusive AI&lt;/a&gt;, which does include short-timelines proposals, but they are not directly actionable. For example, one idea is “representation of animals in AI decision-making”, which is an action an AI company could take, but AI companies are not the relevant actors. An actionable project would be something like “a nonprofit uses its connections at an AI company to persuade/pressure the company to include representation of animals in its AI decision-making”.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://sites.google.com/nyu.edu/mindethicspolicy/home&quot;&gt;NYU Center for Mind, Ethics, and Policy&lt;/a&gt; and &lt;a href=&quot;https://www.sentientfutures.ai/&quot;&gt;Sentient Futures&lt;/a&gt; have done similar work, but nothing exactly like this project proposal. I expect they could do a good job at identifying/prioritizing AI-for-animals interventions that fit my criteria. I would be excited to see a follow-up to Max Taylor’s &lt;a href=&quot;https://forum.effectivealtruism.org/posts/2cZAzvaQefh5JxWdb/bringing-about-animal-inclusive-ai&quot;&gt;Bringing about animal-inclusive AI&lt;/a&gt; focused on converting his ideas into actionable projects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;AI-for-animals seems more tractable than other post-TAI welfare causes (e.g. AI welfare or S-risks from cooperation failure). There are already proposed interventions that could work if timelines are short.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;A short research project may come up with useful ideas, or at least prioritize between pre-existing ideas.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;A research project might not come up with any really good ideas. Pre-existing research has mostly failed to come up with good ideas that work under short timelines (although to a large extent, that’s because they weren’t trying to).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;I’m suspicious of “meta” work in general, and I’m suspicious of research because I personally like doing research, and I believe the value of research is usually overrated by researchers. It might be better to work directly on AI-for-animals—my current favorite “direct” project idea is  &lt;a href=&quot;#advocate-to-change-ai-training-to-make-llms-more-animal-friendly&quot;&gt;Advocate to change AI training to make LLMs more animal-friendly&lt;/a&gt; or similar.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;honorable-mentions&quot;&gt;Honorable mentions&lt;/h1&gt;

&lt;h2 id=&quot;directly-push-for-an-international-ai-treaty&quot;&gt;Directly push for an international AI treaty&lt;/h2&gt;

&lt;p&gt;The best kind of AI regulation is the kind that every country agrees to (or at least every country that has near-frontier AI technology).&lt;/p&gt;

&lt;p&gt;If we need an international treaty to ensure that nobody builds a misaligned AI, then an obvious thing to do is to talk directly to national leaders about how we need an international treaty.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An internationally-agreed moratorium on advanced AI would straightforwardly prevent advanced AI from killing everyone or otherwise destroying most of the value of the future.&lt;/p&gt;

&lt;p&gt;One way to get an international treaty is to talk to governments and tell them you think they should sign an international treaty.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.cigionline.org/programs/global-ai-risks-initiative/&quot;&gt;Global AI Risks Initiative&lt;/a&gt;. There are other orgs that are doing work with the ultimate goal of an international treaty, but to my knowledge, they’re not &lt;em&gt;directly&lt;/em&gt; pushing for a treaty. Those other orgs include: &lt;a href=&quot;https://futureoflife.org/&quot;&gt;Future of Life Institute&lt;/a&gt;; &lt;a href=&quot;https://intelligence.org/&quot;&gt;Machine Intelligence Research Institute&lt;/a&gt;; &lt;a href=&quot;https://pauseai.info/&quot;&gt;PauseAI Global&lt;/a&gt;; &lt;a href=&quot;https://www.pauseai-us.org/&quot;&gt;PauseAI US&lt;/a&gt;; and &lt;a href=&quot;https://saif.org/&quot;&gt;Safe AI Forum&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;There is a short causal chain from “advocate for an international treaty” to “ASI doesn’t kill everyone”.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;It’s high-leverage—you only need to get a relatively small set of people on board.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;There is not much political will for an international treaty, especially a strong one. Public advocacy and smaller-scale political advocacy seem better for that reason, at least for now.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;This idea only works if you can figure out who will do a good job at pushing for an international treaty. I think it’s more difficult than generic public advocacy or talking to policy-makers about x-risk.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This might be one of my top ideas if I knew how to execute it. I &lt;em&gt;wish&lt;/em&gt; I could put it on my top-ideas list, but I don’t know how to do it well.&lt;/p&gt;

&lt;h2 id=&quot;organize-a-voluntary-commitment-by-ai-scientists-not-to-build-advanced-ai&quot;&gt;Organize a voluntary commitment by AI scientists not to build advanced AI&lt;/h2&gt;

&lt;p&gt;I heard this idea from Toby Ord on the 80,000 Hours podcast #219. He said,&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;If AI kills us and we end up standing in front of St. Peter, and he asks, “Well did you try a voluntary agreement not to build it?” And we said, “No, we thought it wouldn’t work”, that’s not a good look for us.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;At the 1975 &lt;a href=&quot;https://en.wikipedia.org/wiki/Asilomar_Conference_on_Recombinant_DNA#Prohibited_experiments&quot;&gt;Asilomar Conference&lt;/a&gt;, the international community of biologists voluntarily agreed not to conduct dangerous experiments on recombinant DNA. Perhaps something similar could work for TAI. The dangers of advanced AI are widely recognized among top AI researchers; it may be possible to organize an agreement not to work on powerful AI systems.&lt;/p&gt;

&lt;p&gt;(There are some details to be worked out as to exactly what sort of work qualifies as dangerous. As with my stance on &lt;a href=&quot;#what-kinds-of-policies-might-reduce-ai-x-risk&quot;&gt;what kinds of policies would be helpful&lt;/a&gt;, I believe there are many agreements we could reach that would be better than the status quo. I expect leading AI researchers can collectively work out an operationalization that’s better than nothing.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If leading AI scientists all agree not to build advanced AI, then it does not get built. The question is whether a non-binding commitment will work. There have been similar successes in the past, especially in genetics with voluntary moratoriums on human cloning and human genetic engineering.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Toby Ord has raised this idea. According to a personal communication, he did some research on its plausibility, but he is not actively working on it.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://safe.ai/work/statement-on-ai-risk&quot;&gt;CAIS Statement on AI Risk&lt;/a&gt; and &lt;a href=&quot;https://futureoflife.org/open-letter/pause-giant-ai-experiments/&quot;&gt;FLI Pause Letter&lt;/a&gt; are related but weaker.&lt;/p&gt;

&lt;p&gt;FLI organized the 2017 &lt;a href=&quot;https://futureoflife.org/open-letter/ai-principles/&quot;&gt;Asilomar Conference on Beneficial AI&lt;/a&gt;, but to my knowledge, the goal was not to make any commitments regarding AI safety.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Unlike most project ideas, if this one succeeds, x-risk will immediately go down by multiple percentage points.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;A voluntary commitment may be sufficient to prevent extinction, and it may be easier to achieve than a legally-mandated moratorium or strict regulations.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Talking to politicians or pushing for regulation has the problem that you’d really rather get regulations in all countries simultaneously. Researchers are (I think) less prone to inter-country adversarialism than nations’ leaders, especially between the West and China—American and Chinese scientists collaborate often.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;You don’t need everyone to sign. If (say) Nobel Prize winners sign the agreement, it raises questions about why (say) the head of ML at OpenBrain hasn’t signed.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;In &lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/edit?tab=t.0#bookmark=kix.idfhvmca2skk&quot;&gt;When is the right time for advocacy?&lt;/a&gt;, I argued that now is the right time. But a voluntary moratorium seems more likely than other ideas to fail if done at a suboptimal time because you need to get a ~majority on board.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;It’s not clear whether a failed attempt will decrease the probability of success for subsequent attempts. For example, there are many historical instances where a bill failed to pass, and then a very similar bill got passed later. But it’s not clear that this sort of voluntary agreement works the same way.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Voluntary commitments are easily violated. For example, biologists agreed not to do research on human cloning, but then a few rogue scientists (&lt;a href=&quot;https://en.wikipedia.org/wiki/Richard_Seed&quot;&gt;Richard Seed&lt;/a&gt;, etc.) did it anyway.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;But a voluntary agreement can create strong social pressure not to violate it.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;AI researchers have a vested interest in being able to do AI research; policy-makers do not.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;But the dangers of advanced AI are much better understood among AI researchers than among policy-makers.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;It is doubtful that AI company CEOs will agree to a moratorium.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;But if you get the majority of leading AI scientists to agree, CEOs will be left with insufficient talent to lead their research.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;peaceful-protests&quot;&gt;Peaceful protests&lt;/h2&gt;

&lt;p&gt;Organize peaceful protests to raise public concern and salience regarding AI risk. Historically, protests have asked for a pause on AI development, although that might not be the only reasonable ask.&lt;/p&gt;

&lt;p&gt;(I don’t have any other specific asks in mind. The advantage of “pause” is that it’s a simple message that fits on a picket sign.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Protests may increase public support and salience via reaching people in person or via media (news reporting, etc.). They may also provide a signal to policy-makers about what their constituents want.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://pauseai.info/&quot;&gt;PauseAI Global&lt;/a&gt;; &lt;a href=&quot;https://www.pauseai-us.org/&quot;&gt;PauseAI US&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.stopai.info/&quot;&gt;Stop AI&lt;/a&gt; organizes disruptive protests (e.g. blockading AI company offices), and the evidence is ambiguous as to whether disruptive protests work. See “When Are Social Protests Effective?” (Shuman et al. &lt;a href=&quot;https://doi.org/10.1016/j.tics.2023.10.003&quot;&gt;2024&lt;/a&gt;), although I should note that I think the authors overstate the strength of evidence for their claims—see &lt;a href=&quot;https://mdickens.me/reading-notes/#[2025-04-02%20Wed]%20When%20Are%20Social%20Protests%20Effective?%20\(2024\)&quot;&gt;my notes on the paper&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;https://mdickens.me/2025/04/18/protest_outcomes_critical_review/&quot;&gt;Natural experiments suggest&lt;/a&gt; that protests are effective at changing voter behavior and/or increasing voter turnout.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;Experiments also find that peaceful protests increase support in a lab setting—see “Social Movement Strategy (Nonviolent Versus Violent) and the Garnering of Third-Party Support: A Meta-Analysis” (Orazani et al. &lt;a href=&quot;https://doi.org/10.1002/ejsp.2722&quot;&gt;2021&lt;/a&gt;). For a summary, see &lt;a href=&quot;https://mdickens.me/reading-notes/#[2025-04-09%20Wed]%20Social%20Movement%20Strategy%20(Nonviolent%20Versus%20Violent)%20and%20the%20Garnering%20of%20Third-Party%20Support:%20A%20Meta-Analysis%20(2021)&quot;&gt;my notes on the paper&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Protesting is widely seen as the thing you do when you are concerned about an issue. Many people take not-protesting as a sign that you aren’t serious. It’s valuable to be able to say “yes, we are taking AI risk seriously, you can tell because we are staging protests”. Regardless of the cost-effectiveness of marginal protesters, it’s good for there to be nonzero protests happening.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Protests make it clear to policy-makers that their constituents care about an issue. This is especially important for AI because the general public is very worried about AI (see e.g. &lt;a href=&quot;https://www.pewresearch.org/internet/2025/04/03/how-the-us-public-and-ai-experts-view-artificial-intelligence/&quot;&gt;2025 Pew poll&lt;/a&gt;), but the issue is not high-salience (see &lt;a href=&quot;https://today.yougov.com/technology/articles/45565-ai-nuclear-weapons-world-war-humanity-poll&quot;&gt;2023 YouGov poll&lt;/a&gt;: respondents were worried about AI extinction risk, but it only ranked as the #6 most concerning x-risk). Protests increase its salience.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Protests are an effective way to get media attention. It’s common for protests with only a dozen participants to get news coverage.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;I expect that a really good &lt;a href=&quot;#media-about-dangers-of-ai&quot;&gt;media project&lt;/a&gt; would be more cost-effective, but creating a good media project requires exceptional talent. Organizing a protest requires &lt;em&gt;some&lt;/em&gt; skill, but the bar isn’t particularly high. Therefore, you can support protests without having to identify top-tier talent.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Protests are highly neglected, and I expect them to continue to be neglected.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Some people are concerned that protests can backfire.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;I’m not concerned about peaceful protests backfiring. The scientific literature universally shows that peaceful protests have a positive effect, although the strength of the evidence could be better—see &lt;a href=&quot;https://mdickens.me/2025/04/18/protest_outcomes_critical_review/&quot;&gt;Do Protests Work? A Critical Review&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;My &lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/edit?tab=t.0#bookmark=kix.7pg4uvfjb7v6&quot;&gt;back-of-the-envelope calculation&lt;/a&gt; suggested that talking directly to policy-makers is more cost-effective.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Now may be too early. See &lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/edit?tab=t.0#bookmark=kix.idfhvmca2skk&quot;&gt;When is the right time for advocacy?&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;It may be bad for large funders to fund protests because it could create an appearance of &lt;a href=&quot;https://en.wikipedia.org/wiki/Astroturfing&quot;&gt;astroturfing&lt;/a&gt;.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;This is an argument against large funders funding them, but in &lt;em&gt;favor&lt;/em&gt; of individual donors supporting protests.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;media-about-dangers-of-ai&quot;&gt;Media about dangers of AI&lt;/h2&gt;

&lt;p&gt;Create media explaining why AI x-risk is a big deal and what we should do about it.&lt;/p&gt;

&lt;p&gt;Things like:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Books (ex: &lt;em&gt;If Anyone Builds It, Everyone Dies&lt;/em&gt;)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;News articles (ex: Existential Risk Observatory)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Videos (ex: Rob Miles)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Forecasts of how AI can go badly (ex: AI 2027)&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Media increase public concern, which makes policy-makers more likely to put good regulations in place.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://80000hours.org/&quot;&gt;80,000 Hours&lt;/a&gt;; &lt;a href=&quot;https://ai-futures.org/&quot;&gt;AI Futures Project&lt;/a&gt;; &lt;a href=&quot;https://aisgf.us/&quot;&gt;AI Safety and Governance Fund&lt;/a&gt;; &lt;a href=&quot;https://aisafety.info/&quot;&gt;AI Safety Info&lt;/a&gt;; &lt;a href=&quot;https://www.securite-ia.fr/en&quot;&gt;Centre pour la Sécurité de l’IA (CeSIA)&lt;/a&gt;; &lt;a href=&quot;https://civai.org/&quot;&gt;CivAI&lt;/a&gt;; &lt;a href=&quot;https://www.existentialriskobservatory.org/&quot;&gt;Existential Risk Observatory&lt;/a&gt;; &lt;a href=&quot;https://intelligence.org/&quot;&gt;Machine Intelligence Research Institute&lt;/a&gt;; &lt;a href=&quot;https://www.themidasproject.com/&quot;&gt;Midas Project&lt;/a&gt;; various media projects by individual people.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;We need to get good policies in place. Media can influence policy-makers, and can influence the public, which is important because policy-makers largely want to do what their constituents want.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Polls show that the public is &lt;a href=&quot;https://www.pewresearch.org/internet/2025/04/03/how-the-us-public-and-ai-experts-view-artificial-intelligence/&quot;&gt;concerned&lt;/a&gt; about AI risk and even &lt;a href=&quot;https://today.yougov.com/technology/articles/45565-ai-nuclear-weapons-world-war-humanity-poll&quot;&gt;x-risk&lt;/a&gt;, but it’s not a high-priority issue. Media can make it more salient and/or raise “common knowledge” of concern about AI.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;The impact of media is fat-tailed and heavily depends on quality. It’s hard to identify which media projects to fund.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Poorly done or misleading media projects could backfire.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;message-testing&quot;&gt;Message testing&lt;/h2&gt;

&lt;p&gt;Many people have strong opinions about the correct way to communicate AI safety to a non-technical audience, but people’s hypotheses have largely not been tested. A project could make a systematic attempt to compare different messages and survey listeners to assess effectiveness.&lt;/p&gt;
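
&lt;p&gt;As a sketch of what a systematic comparison might look like at the analysis stage, here is a standard two-proportion z-test on two hypothetical message groups. All the counts are invented for illustration; real message testing would also need representative samples and careful survey design.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Hypothetical sketch: compare two AI-risk messages by the share of
# respondents who, after seeing each, agree the problem deserves
# government action. All counts below are invented.
import math

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    # Standard two-proportion z-test with a pooled success rate.
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Message A: 140 of 400 agree; message B: 180 of 400 agree.
z, p = two_proportion_z(140, 400, 180, 400)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
&lt;/code&gt;&lt;/pre&gt;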

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We can’t ultimately get good AI safety outcomes unless we communicate the importance of the problem, and having data on message effectiveness will help with that.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI Safety and Governance Fund has an ongoing project (see &lt;a href=&quot;https://manifund.org/projects/testing-and-spreading-messages-to-reduce-ai-x-risk&quot;&gt;Manifund&lt;/a&gt;) to test AI risk messages via online ads. The project is currently moving slowly due to lack of funding.&lt;/p&gt;

&lt;p&gt;Some advocacy orgs have done small-scale message testing on their own materials.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;A smallish investment in empirical data could inform a large amount of messaging going forward.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Some types of experiments (using online ads or Mechanical Turk) can scale well with funding.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;The best types of communication may be long, individually tailored (e.g. in one-on-one conversations with policy-makers), or otherwise difficult to test empirically.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;I suspect that a person with excellent communication skills would not benefit much from seeing A/B-tested messaging because they can already intuit which wording will be best. (But it may be difficult to identify and hire those people.)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;This is the sort of thing that might be best to do internally by an advocacy org that already has a reasonable idea of what kind of message it wants to send.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;host-a-website-for-discussion-of-ai-safety-and-other-important-issues&quot;&gt;Host a website for discussion of AI safety and other important issues&lt;/h2&gt;

&lt;p&gt;LessWrong and the Effective Altruism Forum are upstream of a large quantity of work (in AI safety as well as other EA cause areas). It is valuable that these websites continue to exist, and that moderators and web developers continue to work to preserve/improve the quality of discussion.&lt;/p&gt;

&lt;p&gt;Realistically, it doesn’t make sense to start a &lt;em&gt;new&lt;/em&gt; discussion forum, so this idea amounts to “fund/support LessWrong and/or the EA Forum”.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;On &lt;a href=&quot;https://manifund.org/projects/lightcone-infrastructure&quot;&gt;Lightcone Infrastructure’s Manifund&lt;/a&gt;, Oliver Habryka lists some concrete outcomes that are attributable to the existence of LessWrong. You could probably find similar evidence of impact for the EA Forum.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.centreforeffectivealtruism.org/&quot;&gt;Centre for Effective Altruism&lt;/a&gt;; &lt;a href=&quot;https://www.lightconeinfrastructure.com/&quot;&gt;Lightcone Infrastructure&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Nearly every project on my list has benefited in some way from the existence of LessWrong or the EA Forum.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Hosting a platform for sharing research/discussion is cheaper (and therefore arguably more cost-effective) than directly conducting research.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The benefits are diffuse: high-quality discussion forums provide small-to-moderate benefits to every cause, but usually not &lt;em&gt;huge&lt;/em&gt; benefits. So it may be better to directly support your favorite intervention(s).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A toy model: Suppose that&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;There are 10 categories of AI safety work.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;LessWrong makes each of them 20% better.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The average AI safety work produces 1 utility point.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Well-directed AI policy produces 5 utility points.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Then marginal work on LessWrong is worth 2 utility points (10 categories × 20% × 1 point each), and my favorite AI policy orgs are worth 5 points. The arithmetic is spelled out below.&lt;/p&gt;
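
&lt;p&gt;Here is the same toy model in code; all four inputs are just the assumptions from the numbered list, not empirical estimates.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# The toy model above, made explicit. All four inputs are the
# assumptions from the list, not empirical estimates.
n_categories = 10     # categories of AI safety work
improvement = 0.20    # LessWrong makes each category 20% better
avg_value = 1.0       # utility points from average AI safety work
policy_value = 5.0    # utility points from well-directed AI policy

lesswrong_value = n_categories * improvement * avg_value  # = 2.0
print(f"LessWrong: {lesswrong_value} points; policy: {policy_value} points")
&lt;/code&gt;&lt;/pre&gt;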

&lt;h1 id=&quot;list-of-other-project-ideas&quot;&gt;List of other project ideas&lt;/h1&gt;

&lt;p&gt;A project not being a top idea doesn’t mean it’s bad. In fact, it’s likely that at least one or two of these ideas should be on my top-ideas list; I just don’t know which ones.&lt;/p&gt;

&lt;p&gt;I sourced ideas from:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;reviewing other lists of project ideas and filtering for the relevant ones;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;looking at what existing orgs are working on;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;writing down any (sufficiently broad) idea I came across over the last ~6 months;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;writing down any idea I thought of.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Ideas are roughly ordered from broadest to most specific.&lt;/p&gt;

&lt;h2 id=&quot;ai-for-animals-ideas&quot;&gt;AI-for-animals ideas&lt;/h2&gt;

&lt;h3 id=&quot;neartermist-animal-advocacy&quot;&gt;Neartermist animal advocacy&lt;/h3&gt;

&lt;p&gt;There are various projects to improve current conditions for animals, particularly farm animals: cage-free campaigns, humane slaughter, vegetarian activism, etc. I will lump all these projects together for the purposes of this report.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Animal advocacy increases concern for animals, which likely has positive flow-through effects into the future, by affecting future generations or by shaping the values of the transformative AI that will control the future.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Too many to list.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Neartermist animal advocacy has the dual benefit of &lt;em&gt;definitely&lt;/em&gt; helping animals today, and building momentum to make future work more effective (to borrow a framing from &lt;a href=&quot;https://www.youtube.com/live/Mb7uRki3AqM&amp;amp;t=1h47m&quot;&gt;Jeff Sebo&lt;/a&gt;).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Neartermist animal advocacy is tractable and has clear feedback loops. It looks especially promising if you’re highly uncertain or clueless about longtermist or post-TAI interventions.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;The benefits are diffuse. Creating one new vegan helps many animals in the short term, but has only a tiny effect on society’s future values. I expect direct attempts to improve AI alignment-to-animals to be much more cost-effective.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;I created a &lt;a href=&quot;https://squigglehub.org/models/mdickens/AI-for-animals-benchmark-vs-conventional&quot;&gt;back-of-the-envelope calculation&lt;/a&gt; that aligns with my initial expectation: my BOTEC-informed guess is that direct advocacy on AI values (by promoting a &lt;a href=&quot;#advocate-to-change-ai-training-to-make-llms-more-animal-friendly&quot;&gt;friendliness-to-animals LLM benchmark&lt;/a&gt;) is 2–3 orders of magnitude more cost-effective than conventional animal advocacy. (A sketch of the shape of this comparison follows this list.)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Neartermist animal advocacy works best if timelines are long. Timelines are likely not long.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Effective animal activists generally regard corporate campaigns as more effective than advocacy directed at consumers, but changing corporate practices seems less relevant for shifting society’s values.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
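
&lt;p&gt;To show the shape of that comparison (not its content), here is a sketch of such a BOTEC. Every number is a placeholder I invented, chosen only so the output lands near the 2–3-orders-of-magnitude conclusion; the real inputs live in the linked Squiggle model.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustrative BOTEC structure only. Every number is an invented
# placeholder, not taken from the linked Squiggle model.

# Conventional advocacy: animals helped per dollar, discounted for
# how weakly each marginal supporter shifts long-run values.
conv_animals_per_dollar = 10    # placeholder
conv_longrun_discount = 0.01    # placeholder

# Benchmark advocacy: chance the work changes frontier training,
# times the number of animals whose treatment TAI could affect,
# spread over a program budget.
p_changes_training = 0.01       # placeholder
animals_affected = 1e10         # placeholder
budget_dollars = 1e6            # placeholder

conv_value = conv_animals_per_dollar * conv_longrun_discount
bench_value = p_changes_training * animals_affected / budget_dollars
print(f"conventional: {conv_value} per dollar; benchmark: {bench_value} per dollar")
print(f"ratio: {bench_value / conv_value:.0e}")
&lt;/code&gt;&lt;/pre&gt;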

&lt;h3 id=&quot;using-tai-to-improve-farm-animal-welfare&quot;&gt;Using TAI to improve farm animal welfare&lt;/h3&gt;

&lt;p&gt;I’m concerned about how TAI could negatively impact non-human welfare. There are some analyses of how TAI could negatively impact farm animals (e.g. by making factory farming more efficient), and some proposals for how animal activists could use TAI to make their activism more effective. I will take these as a broad category rather than discussing them individually.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Depends on the specific proposal.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.electricsheep.is/&quot;&gt;Electric Sheep&lt;/a&gt;; &lt;a href=&quot;https://www.joinhive.org/&quot;&gt;Hive&lt;/a&gt; (see &lt;a href=&quot;https://forum.effectivealtruism.org/posts/BXxEyZNYn7Fqkcsed/transformative-ai-and-animals-animal-advocacy-under-a-post&quot;&gt;Transformative AI and Animals: Animal Advocacy Under A Post-Work Society&lt;/a&gt;); &lt;a href=&quot;https://www.openpaws.ai/&quot;&gt;Open Paws&lt;/a&gt;; &lt;a href=&quot;https://www.wildanimalinitiative.org/&quot;&gt;Wild Animal Initiative&lt;/a&gt; (see &lt;a href=&quot;https://forum.effectivealtruism.org/posts/zXhxagQKC6kxPM2Kn/transformative-ai-and-wild-animals-an-exploration&quot;&gt;Transformative AI and wild animals: An exploration&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Highly neglected.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;I am generally skeptical of interventions of the form “teach people to leverage AI to do X better”, but farm animal advocacy seems sufficiently important that it might be worthwhile in this case.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Although two of my top ideas relate to post-TAI animal welfare (&lt;a href=&quot;#develop-new-plans--evaluate-existing-plans-to-improve-post-tai-animal-welfare&quot;&gt;Develop new plans / evaluate existing plans to improve post-TAI animal welfare&lt;/a&gt; and &lt;a href=&quot;#advocate-to-change-ai-training-to-make-llms-more-animal-friendly&quot;&gt;Advocate to change AI training to make LLMs more animal-friendly&lt;/a&gt;), I don’t think it’s worth focusing on farm animal welfare in particular.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Factory farming interventions only matter if factory farming still exists. Cultured meat outcompetes factory farming once you get a sufficiently strong understanding of biology (there’s no way growing a whole chicken is the cheapest possible way to create chicken-meat). We are far from that level of understanding, but I would be surprised if (aligned) TAI couldn’t figure it out.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The scale of wild animal welfare is orders of magnitude larger than that of factory farming. The case for prioritizing farm animals over wild animals is that we don’t have the power or knowledge to positively influence nature, but TAI should change the equation.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Space colonization would ultimately dominate earth-based welfare, so questions about panspermia or digital minds have a bigger expected impact.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Proposals for how to use TAI to improve animal advocacy only make sense if TAI does not cause value lock-in. If there is no value lock-in, then there’s no strong reason to spend time &lt;em&gt;now&lt;/em&gt; trying to figure out how to use TAI. It would be better to wait until after TAI because at that point, we will have a much better understanding of how TAI works, and there’s no urgency.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;lobby-governments-to-include-animal-welfare-in-ai-regulations&quot;&gt;Lobby governments to include animal welfare in AI regulations&lt;/h3&gt;

&lt;p&gt;If governments put safety restrictions on advanced AI, they could also create rules about animal welfare.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Getting regulations in place would force companies’ AIs to respect animal welfare.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;One set of regulations can alter the behavior of many frontier companies.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;If companies voluntarily change their behavior, they can regress at any time with no consequences. But companies have to obey regulations.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;It’s unclear what exactly regulations could do about animal welfare. AI safety regulations, insofar as they exist (which they mostly don’t), don’t dictate how LLMs are required to behave; they dictate what companies are required to do to make LLMs safe. What is a regulatory rule that policy-makers would plausibly be on board with, that would also influence model behavior to be friendlier to animals?&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Influencing the government on animal welfare seems harder than &lt;a href=&quot;#advocate-to-change-ai-training-to-make-llms-more-animal-friendly&quot;&gt;influencing AI companies&lt;/a&gt;.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;traditional-animal-advocacy-targeted-at-frontier-ai-developers&quot;&gt;Traditional animal advocacy targeted at frontier AI developers&lt;/h3&gt;

&lt;p&gt;Animal advocacy orgs could use their traditional techniques, but focus on raising concern for animal welfare among AI developers. For example, buy billboards outside AI company offices or use targeted online ads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI developers become more concerned for animal welfare, and they make AI development decisions that improve the likelihood that transformative AI is good for animals.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Similar to &lt;a href=&quot;#neartermist-animal-advocacy&quot;&gt;neartermist animal advocacy&lt;/a&gt;, but plausibly more cost-effective because it’s more targeted.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;It’s not known whether techniques like animal welfare ads are effective in general, and they may even be particularly ineffective among demographics like AI developers.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Even if AI developers cared more about animal welfare, it’s not clear that this would carry through to their work on AI.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;In 2016, I &lt;a href=&quot;https://mdickens.me/causepri-app/#8&quot;&gt;created&lt;/a&gt; a back-of-the-envelope calculation on this idea, and the result wasn’t as good as I expected (it looked worse than standard animal advocacy, if you assume the animal advocacy propagates values into the far future). However, the numbers are outdated because we know a lot more about AI now than we did in 2016.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;research-which-alignment-strategies-are-more-likely-to-be-good-for-animals&quot;&gt;Research which alignment strategies are more likely to be good for animals&lt;/h3&gt;

&lt;p&gt;Some alignment strategies may be better or worse for non-human welfare. For example, I expect &lt;a href=&quot;https://www.lesswrong.com/w/coherent-extrapolated-volition&quot;&gt;CEV&lt;/a&gt; would be better than the current paradigm of “teach the LLM to say things that &lt;a href=&quot;https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback&quot;&gt;RLHF&lt;/a&gt; judges like”, which is better than “hard-code (&lt;a href=&quot;https://en.wikipedia.org/wiki/GOFAI&quot;&gt;GOFAI&lt;/a&gt;-style) whatever moral rules the AI company thinks are correct”.&lt;/p&gt;

&lt;p&gt;A research project could go more in-depth on which alignment techniques are most likely to be good for animals (or digital minds, etc.).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Identify promising alignment techniques, in the hope that people use those techniques. There are enough animal-friendly alignment researchers at AI companies that this might happen.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;To my knowledge, this question has never been studied.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Some alignment techniques may be &lt;em&gt;much&lt;/em&gt; better for animals than others.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;We have a poor understanding of what ASI will look like, which makes it very hard to say what will work for animal welfare.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;We don’t know how to align ASI to any goals at all. We can’t align AI to animal welfare until we can align AI to &lt;em&gt;something&lt;/em&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;In the world where alignment turns out to be tractable, it’s likely that there will be strong incentives shaping how ASI is aligned. The choice of whether to use (say) something-like-CEV or something-like-RLHF will be difficult to influence.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;ai-policyadvocacy-ideas&quot;&gt;AI policy/advocacy ideas&lt;/h2&gt;

&lt;h3 id=&quot;improving-us--china-relations--international-peace&quot;&gt;Improving US &amp;lt;&amp;gt; China relations / international peace&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The US and China (and other countries) need to agree not to build dangerous AI. Generically improving international cooperation, especially between the US and China, increases the chance that nations cooperate on AI (non-)development.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Too many to list. Some examples: Asia Society’s Center on US–China Relations; Carnegie Endowment for International Peace; Carter Center; National Committee on United States–China Relations; US-China Policy Foundation.&lt;/p&gt;

&lt;p&gt;Orgs that work on international cooperation specifically on AI safety (although not necessarily existential risk) include: &lt;a href=&quot;https://manifund.org/projects/ai-safety-bridge-in-china-seed-funding&quot;&gt;AI Governance Exchange&lt;/a&gt;; &lt;a href=&quot;https://www.cigionline.org/programs/global-ai-risks-initiative/&quot;&gt;Global AI Risks Initiative&lt;/a&gt;; &lt;a href=&quot;https://saif.org/&quot;&gt;Safe AI Forum&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;International cooperation is likely necessary to prevent existentially risky AI from being built.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Efforts to improve cooperation have succeeded in the past; for example, the US–China Strategic and Economic Dialogue (&lt;a href=&quot;https://ncafp.org/resources/new-report-us-china-strategic-economic-dialogues/&quot;&gt;source&lt;/a&gt;).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;International cooperation has wide-ranging benefits; efforts can attract funding from many parties with varying agendas.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;The route to preventing extinction is indirect, which dilutes the cost-effectiveness of this intervention.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;International cooperation is far from neglected. Marginal efforts might not make much difference.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;talk-to-international-peace-orgs-about-ai&quot;&gt;Talk to international peace orgs about AI&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most international peace orgs probably aren’t aware of how important AI regulation is, and they would likely help develop international treaties on AI if they understood its importance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Nobody that I know of.&lt;/p&gt;

&lt;p&gt;Pros and cons are largely the same as &lt;a href=&quot;#improving-us--china-relations--international-peace&quot;&gt;Improving US &amp;lt;&amp;gt; China relations / international peace&lt;/a&gt;. In addition:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;This plan is higher-leverage than simply funding international peace orgs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Unclear how to do it. What sorts of evidence would the orgs find persuasive? Which orgs are best suited to working on AI-related cooperation? Those questions are answerable, but I’m not in a good position to answer them.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;increasing-government-expertise-about-ai&quot;&gt;Increasing government expertise about AI&lt;/h3&gt;

&lt;p&gt;Talk to policy-makers or create educational materials about how AI works, or help place AI experts in relevant policy roles.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Policy-makers can do a more effective job of regulating AI if they understand it better.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Increasing government expertise may improve the quality of AI regulations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;The experts need to actually care about x-risk. Plenty of experts want to accelerate AI development/prevent regulation. As for extant projects designed to increase AI expertise, I am skeptical that the expertise they cultivate would be appropriately x-risk-oriented.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;In practice, “hire experts” often means “hire current or former AI company employees”, which is a recipe for regulatory capture. I expect this would significantly &lt;em&gt;decrease&lt;/em&gt; our chance of getting useful x-risk-reducing regulations.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Expertise is much less of a bottleneck than willingness to regulate AI. If I can spend $1 on increasing willingness or $1 on expertise, I’d much rather spend it on willingness.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;And in a sense, the AI safety community already invests way more in expertise (via policy research) than in advocacy. On the margin, we need advocacy more.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The downside of poorly-targeted AI safety regulations is that they end up hurting economic development. That’s bad, but it looks pretty trivial in a cost-benefit analysis compared to extinction, so the benefit of expertise (avoiding poorly-targeted regulations) is correspondingly small.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;I wrote a longer comment about this subject &lt;a href=&quot;https://forum.effectivealtruism.org/posts/p2dGt5CekxcXPYHMq/the-ai-adoption-gap-preparing-the-us-government-for-advanced?commentId=4obnjpAvjpbvNe9cS&quot;&gt;on the EA Forum&lt;/a&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Unless by “expertise” we’re talking about “expertise at recognizing that AI x-risk is a big problem”. In which case, yes, we need expertise.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;
        &lt;p&gt;Right now, the main strategy for getting x-risk people into government is “pretend not to care about x-risk so you seem normal, and never voice your concerns”. In which case, what’s the point?&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;I think a better strategy is “talk to policy-makers about x-risk and straightforwardly tell them what you believe.”&lt;/p&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;I’ve heard it argued that increasing government expertise is low tractability because government is so big, and making internal changes like that is slow. I don’t think this is a strong consideration because other approaches to AI safety are also low tractability. (I still don’t think increasing expertise is a good plan, but this particular argument seems weak.)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Policy-makers are largely in the dark about x-risk, which is indeed a problem. But I don’t see clean routes to increasing AI expertise that don’t also push in the wrong direction. Raising concern about the importance of TAI has historically led people to believe things like “I need to be the one who controls TAI, so I will start a new AI company” or “we need to make sure we get TAI before China”. Generically increasing AI expertise is, in my best estimation, net harmful. For more on this, see &lt;a href=&quot;#advocacy-should-emphasize-x-risk-and-misalignment-risk&quot;&gt;Advocacy should emphasize x-risk and misalignment risk&lt;/a&gt;.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;policyadvocacy-in-china&quot;&gt;Policy/advocacy in China&lt;/h3&gt;

&lt;p&gt;Take any of my ideas on AI policy/advocacy, and do that in China instead of in the West.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The high-level argument for policy/advocacy in China is largely the same as the argument for prioritizing AI risk policy/advocacy in general. China is currently the #2 leading country in AI development, so it’s important that Chinese AI developers take safety seriously.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://manifund.org/projects/ai-safety-bridge-in-china-seed-funding&quot;&gt;AI Governance Exchange&lt;/a&gt;; &lt;a href=&quot;https://saif.org/&quot;&gt;Safe AI Forum&lt;/a&gt; (sort of); perhaps some Chinese orgs that I’m not familiar with.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Many of the ideas listed above are intended to improve the state of AI policy in the US/UK. The pros for those ideas largely also apply to equivalent projects conducted in China (modulo the obvious differences, e.g. China is not a representative democracy, so political advocacy works differently).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Probably nobody in China is reading this report. Advocating for Chinese policy as a non-Chinese person is fraught because the CCP will not trust our motivations, just as the American government would not trust a Chinese philanthropist who funds American AI safety advocacy.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;According to an &lt;a href=&quot;https://80000hours.org/career-reviews/china-specialist/&quot;&gt;80,000 Hours career review&lt;/a&gt;: “[The Chinese government] is often wary of non-governmental groups that try to bring about grassroots change. If an organisation is blacklisted, then that’s a nearly irreversible setback.”&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;A comment on my state of knowledge:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I don’t know much about the current state of AI safety in China, or what sort of advocacy might work. I did not prioritize looking into it because my initial impression is that I would require high confidence before recommending any interventions (due to the cons listed above), and I would be unlikely to achieve the necessary level of confidence in a reasonable amount of time.&lt;/p&gt;

&lt;h3 id=&quot;corporate-campaigns-to-advocate-for-safety&quot;&gt;Corporate campaigns to advocate for safety&lt;/h3&gt;

&lt;p&gt;Run public campaigns to advocate for companies to improve safety practices and call out unsafe behavior.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Companies may improve their behavior in the interest of maintaining a good public image.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.themidasproject.com/&quot;&gt;Midas Project&lt;/a&gt;; &lt;a href=&quot;https://www.morelight.ai/&quot;&gt;More Light&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Corporate campaigns have worked well in animal advocacy.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Companies are smaller than governments, which means they’re more agile and potentially easier to influence.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Companies have strong internal incentives to be unsafe. By contrast, governments don’t have a profit motive. They may be harder to move, but they have less reason to oppose safety efforts.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Making more actors safe is better than making fewer actors safe. International treaty &amp;gt; single-country regulation &amp;gt; single-company safety efforts.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Democratic governments are specifically designed to do what people want. That doesn’t always happen, but at least there are mechanisms pushing them that way. Companies are not democratic.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;There is a good chance that safety standards strong enough to prevent human extinction would pose an existential threat to AI companies (or at least would be incompatible with their current valuations). If that’s the case, then corporate campaigns will not be able to get companies to implement adequate safety measures.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Overall, this seems similar to political advocacy, but worse.&lt;/p&gt;

&lt;h3 id=&quot;develop-ai-safetysecurityevaluation-standards&quot;&gt;Develop AI safety/security/evaluation standards&lt;/h3&gt;

&lt;p&gt;Work inside a company, as a nonprofit, or with a governmental body (NIST, ISO, etc.) to develop AI safety standards.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Sufficiently well-written standards can define the conditions under which a frontier AI counts as safe, and provide a basis for enforcing those conditions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.aistandardslab.org/&quot;&gt;AI Standards Lab&lt;/a&gt;; various AI companies; various governmental bodies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Companies might abide by the standards.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Standards can provide a template with which to write regulations.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Standards are not currently the bottleneck to getting regulation written. We already have a substantial amount of work on AI safety standards, but we still don’t have good regulations. There aren’t people waiting around to write legislation if only they had some standards they could use. (Of course, more standards would still be better.)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;If standards are voluntary, companies can stop abiding by them when they turn out to be hard to satisfy (and in fact, companies have already done that on multiple occasions, with respect to their self-imposed safety standards).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Standards agencies such as NIST generally don’t have enforcement power.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Nobody knows how to write standards that will prevent extinction if implemented. See &lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/edit?tab=t.0#bookmark=kix.7z7zpwkbjigx&quot;&gt;Appendix&lt;/a&gt; for more on this.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;slow-down-chinese-ai-development-via-ordinary-foreign-policy&quot;&gt;Slow down Chinese AI development via ordinary foreign policy&lt;/h3&gt;

&lt;p&gt;Both the “slow down AI” crowd and the “maintain America’s lead” crowd agree that it is good for China’s AI development to slow down. The American government could accomplish this using foreign policy levers such as:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;restricting chip exports;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;encouraging scientists to immigrate to the US.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;On my view, slowing down Chinese AI development is good because it gives the leading AI developers in the United States more room to slow down. It also makes it less likely that a Chinese company develops misaligned TAI (although right now, US companies are more likely to develop TAI first).&lt;/p&gt;

&lt;p&gt;Slowing down Chinese AI development looks good on the “maintain America’s lead” view, although I believe this view is misguided—making TAI safe is much more important than making one country build it before another.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://cset.georgetown.edu/&quot;&gt;Center for Security and Emerging Technology&lt;/a&gt; has written memos recommending similar interventions. I expect there are some other orgs doing similar activities, including orgs that are more concerned about national security than AI risk.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Looks (plausibly) good on multiple worldviews.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Some policies could antagonize China and make cooperation more difficult.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;
        &lt;p&gt;Mutual sabotage between the US and China would probably decrease AI x-risk, but it would also have negative effects. I’d rather the countries increase safety via mutual cooperation.&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;This depends on the policy. Improving AI model security is fine, relaxing immigration restrictions is probably fine, but export restrictions or tariffs would likely heighten international tensions. (The United States already has export restrictions and tariffs, but it might be bad to add more on the margin.)&lt;/p&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Advocating for these sorts of foreign policy interventions is likely not cost-effective because they’re in controversial political areas that already see significant funding and effort. For example, there are already strong and well-funded interests arguing both for and against immigration.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;On the “slow down AI” view, this seems less promising than domestic US regulation because US-based companies look significantly more likely than Chinese companies to be the first to build misaligned ASI.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;whistleblower-protectionsupport&quot;&gt;Whistleblower protection/support&lt;/h3&gt;

&lt;p&gt;Provide legal support for whistleblowers inside AI companies and assist in publicizing whistleblowers’ findings (e.g. setting up press interviews).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A few ways this could help:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;Whistleblowers can force companies to change their unsafe behavior.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The possibility of whistleblowers incentivizes companies to be safe.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Publicizing companies’ bad behavior can raise public concern about AI safety.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.morelight.ai/&quot;&gt;More Light&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Whistleblower support could be high-leverage: the whistleblowers themselves bring the important information, but they often can’t do much without help.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Only helps if there are things worth whistleblowing on. There might not be any warning shots (see &lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/edit?tab=t.0#bookmark=kix.idfhvmca2skk&quot;&gt;When is the right time for advocacy?&lt;/a&gt; for some relevant discussion).&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;OpenAI’s bad behavior on secret NDAs was whistleblow-worthy, but it wasn’t directly related to AI risk, and it’s not clear that the news about OpenAI’s bad behavior decreased x-risk.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Other paths to impact are more direct, e.g. &lt;a href=&quot;#media-about-dangers-of-ai&quot;&gt;media projects&lt;/a&gt; or &lt;a href=&quot;#corporate-campaigns-to-advocate-for-safety&quot;&gt;corporate campaigns&lt;/a&gt;.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;opinion-polling&quot;&gt;Opinion polling&lt;/h3&gt;

&lt;p&gt;Run polls to learn public opinion on AI safety.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Polls can inform policy-makers about what their constituents want. They also inform people working on AI safety about where their views most align with the public, which can help them prioritize.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://theaipi.org/&quot;&gt;AI Policy Institute&lt;/a&gt;; traditional polling agencies (Pew and YouGov have done polls on people’s views on AI).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;When talking to policy-makers, it’s useful to be able to point to polls as evidence that the public cares about AI safety.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Polls provide common knowledge of concern for AI risk.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;A good amount of polling already exists. If we already know that people in 2024 were concerned about AI risk, there’s not as much value in knowing that they’re still concerned in 2025.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The main benefit of polls is to empower advocacy, but we have precious little advocacy right now.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;help-ai-company-employees-improve-safety-within-their-companies&quot;&gt;Help AI company employees improve safety within their companies&lt;/h3&gt;

&lt;p&gt;Work with people in AI companies (by organizing conferences, peer support, etc.) to help them learn about good safety practices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI company employees can push leadership to implement stronger internal safety standards.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.aileadershipcollective.com/&quot;&gt;AI Leadership Collective&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;If internal employees work more on safety, that will make AI companies safer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;This theory of change has the same issues as &lt;a href=&quot;#corporate-campaigns-to-advocate-for-safety&quot;&gt;corporate campaigns&lt;/a&gt;: companies have strong incentives to be unsafe; global (or at least national) safety measures are better than single-company measures.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Working with AI company employees might be sufficiently high-leverage to make up for my concerns. Answering that question would require going more in-depth, so I will go with my intuition and say it’s probably not as cost-effective as my top ideas.&lt;/p&gt;

&lt;h3 id=&quot;direct-talks-with-ai-companies-to-make-them-safer&quot;&gt;Direct talks with AI companies to make them safer&lt;/h3&gt;

&lt;p&gt;If you can &lt;a href=&quot;#talk-to-policy-makers-about-ai-x-risk&quot;&gt;talk to policy-makers about AI x-risk&lt;/a&gt;, then maybe you can also talk to AI company executives about AI safety.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI company execs have significant control over the direction of AI. If they started prioritizing safety to a significantly greater extent, they could probably do a lot to decrease x-risk.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To my knowledge, nobody is systematically working on this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;High-leverage: you may be able to prevent extinction by changing the minds of a half-dozen or so people.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Similar issues to &lt;a href=&quot;#corporate-campaigns-to-advocate-for-safety&quot;&gt;corporate campaigns&lt;/a&gt;: governments have less incentive to be unsafe; it’s better to make all companies safe simultaneously (via regulation). Therefore, political action seems more promising.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The CEOs of most AI companies are well aware that AI poses an extinction risk, but they are building it anyway, and they are massively under-investing in safety anyway. It’s not clear what additional information would change their minds. So this seems worse than corporate campaigns.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;People in relevant positions should push companies to be safer in whatever ways they can, but I don’t see any way to support this intervention as a philanthropist.&lt;/p&gt;

&lt;h3 id=&quot;monitor-ai-companies-on-safety-standards&quot;&gt;Monitor AI companies on safety standards&lt;/h3&gt;

&lt;p&gt;Track how well each frontier AI company does model risk assessments, security, misuse prevention, and other safety-relevant behaviors.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Monitoring AI companies can inform policy-makers and the public about the state of company safety. Monitoring could have a similar effect to &lt;a href=&quot;#corporate-campaigns-to-advocate-for-safety&quot;&gt;corporate campaigns&lt;/a&gt;, pushing companies to be safer, or it could even directly inform corporate campaigns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://ailabwatch.org/&quot;&gt;AI Lab Watch&lt;/a&gt; &amp;amp; &lt;a href=&quot;https://aisafetyclaims.org/&quot;&gt;AI Safety Claims Analysis&lt;/a&gt;; Future of Life Institute’s &lt;a href=&quot;https://futureoflife.org/ai-safety-index-summer-2025/&quot;&gt;AI Safety Index&lt;/a&gt;; &lt;a href=&quot;https://www.themidasproject.com/&quot;&gt;Midas Project&lt;/a&gt;; &lt;a href=&quot;https://www.safer-ai.org/&quot;&gt;Safer AI&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Easy to do: a solo developer can run a monitoring website as a side project.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;The theory of change seems somewhat weak to me. There are other ways to demonstrate AI risks to policy-makers and the public, and it’s not clear that this way is particularly good.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;But I don’t know what kind of impact the monitoring websites have had; maybe they’ve had some big positive influences that I don’t know about.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;create-a-petition-or-open-letter-on-ai-risk&quot;&gt;Create a petition or open letter on AI risk&lt;/h3&gt;

&lt;p&gt;Write a petition raising concern about AI risk or calling for action (such as a &lt;a href=&quot;https://futureoflife.org/open-letter/pause-giant-ai-experiments/&quot;&gt;six-month pause&lt;/a&gt; or an &lt;a href=&quot;https://aitreaty.org/&quot;&gt;international treaty&lt;/a&gt;). Get respected figures (AI experts, etc.) to sign the petition.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A petition can make it apparent to policy-makers and the public that many people/experts are concerned about AI risk, while also creating common knowledge among concerned people that they are in good company if they speak up about it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://aitreaty.org/&quot;&gt;aitreaty.org&lt;/a&gt;; &lt;a href=&quot;https://www.safe.ai/&quot;&gt;Center for AI Safety / CAIS Action Fund&lt;/a&gt;; &lt;a href=&quot;https://futureoflife.org/&quot;&gt;Future of Life Institute / FLI Action and Research, Inc.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Brings AI risk into the public conversation and makes it easier to talk about.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Relatively easy to do—the main difficulty is in finding people to sign it who can bring credibility.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;The biggest con is that petitions have diminishing marginal utility, and several have been made already.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The path from a petition to concrete outcomes isn’t entirely clear (although I’m inclined to believe that petitions can work).&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My sense is that the existing petitions have been quite helpful, but there isn’t clear value in creating &lt;em&gt;more&lt;/em&gt; petitions. There may be some specific call to action that a new petition ought to put forward, but I’m not sure what that would be.&lt;/p&gt;

&lt;h3 id=&quot;create-demonstrations-of-dangerous-ai-capabilities&quot;&gt;Create demonstrations of dangerous AI capabilities&lt;/h3&gt;

&lt;p&gt;Make AI risk tangible by building concrete demonstrations of how AI can be dangerous.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Many people don’t find it plausible that AI could cause harm; concrete demonstrations may change their minds. Demonstrations can also make AI risk more visceral.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://far.ai/&quot;&gt;FAR.AI&lt;/a&gt;; &lt;a href=&quot;https://palisaderesearch.org/&quot;&gt;Palisade Research&lt;/a&gt;; some one-off work by others (e.g. &lt;a href=&quot;https://apartresearch.com/sprints/ai-capabilities-and-risks-demo-jam-2024-08-23-to-2024-08-26&quot;&gt;Apart Research&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Concrete demonstrations can aid advocacy efforts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Capability demonstrations may send the wrong message, encouraging accelerationism instead of caution.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;I am more concerned about this for general AI capability evaluations. For this project, I am specifically thinking of demonstrations of how AI can do &lt;em&gt;harm&lt;/em&gt;.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;You can’t create a concrete demonstration of superintelligent AI’s capabilities until you already have superintelligent AI, at which point it’s too late. Pre-superintelligent AIs can have scary capabilities, but demos are misleading in a sense because they may create a skewed understanding of where the risks come from.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’m most optimistic about demonstrations where there is a clear plan for how to use them, e.g. Palisade Research builds its demos specifically to show to policy-makers. I think Palisade is doing a particularly good version of this idea, but for the most part, I think other ideas are better.&lt;/p&gt;

&lt;h3 id=&quot;sue-openai-for-violating-its-nonprofit-mission&quot;&gt;Sue OpenAI for violating its nonprofit mission&lt;/h3&gt;

&lt;p&gt;OpenAI:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Our mission is to ensure that artificial general intelligence […] benefits all of humanity.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;OpenAI has more-or-less straightforwardly violated this mission in various ways. Humanity plausibly has grounds to sue OpenAI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A lawsuit would change OpenAI’s incentives and may force OpenAI to actually put humanity’s interests first, depending on how well the lawsuit goes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In 2024, Elon Musk filed a lawsuit against OpenAI on this basis. The lawsuit is set to go to trial in 2026.&lt;/p&gt;

&lt;p&gt;Given that there is already an ongoing lawsuit, it may be better to support the existing suit (e.g. by writing an amicus brief or by offering expert testimony) than to start a new one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;A lawsuit could force OpenAI to significantly improve safety.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;A failed lawsuit has numerous downsides—it can make the plaintiff look bad; it can set an unfavorable precedent; it’s expensive.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Legal matters may have other, hard-to-predict downsides, and I’m not qualified to evaluate them.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;I am not a lawyer, but my impression is that courts are typically quite lenient about what nonprofits are allowed to do, so it would be difficult for a lawsuit to succeed.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Given that there is an ongoing lawsuit by Elon Musk, who is known to behave erratically, Musk may do something unpredictable that causes harm.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;send-people-ai-safety-books&quot;&gt;Send people AI safety books&lt;/h3&gt;

&lt;p&gt;Books are a tried-and-true method of explaining complex ideas. One could mail books on AI risk to members of Congress, or staffers, or AI company execs, or other relevant people.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A book can explain in detail why AI risk is a big deal and thus persuade readers to take it seriously.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;MIRI &lt;a href=&quot;https://www.lesswrong.com/posts/CYTwRZtrhHuYf7QYu/a-case-for-courage-when-speaking-of-ai-danger&quot;&gt;did something similar&lt;/a&gt; when promoting their new book: “We cold-emailed a bunch of famous people (like Obama and Oprah)”. They were asking people to write blurbs for the book, which isn’t exactly what I had in mind, but it’s related.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Aidar Toktargazin has a &lt;a href=&quot;https://manifund.org/projects/giving-free-ai-safety-books-for-potentially-high-impact-individuals&quot;&gt;Manifund project&lt;/a&gt; to give out AI safety books to researchers and professors at Nazarbayev University in Kazakhstan.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Mailing books is cheap.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Mailing books seems riskier than other kinds of advocacy—it could be viewed as excessively pushy.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;Mormons give out free books, and they’re viewed as pushy, but they’ve also grown a lot. It’s unclear whether Mormons’ publicity strategies are worth emulating.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;It may be better to let publishers do their own publicity.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This idea &lt;em&gt;might&lt;/em&gt; be really good, but it’s high-variance. I would not recommend it without significantly investigating the possible downsides first.&lt;/p&gt;

&lt;h2 id=&quot;ai-research-ideas&quot;&gt;AI research ideas&lt;/h2&gt;

&lt;p&gt;I said I wasn’t going to focus on technical research or policy research, but I did incidentally come up with a few under-explored ideas. These are research projects that I’d like to see more work on, although I still believe advocacy is more important.&lt;/p&gt;

&lt;h3 id=&quot;research-on-how-to-get-people-to-extrapolate&quot;&gt;Research on how to get people to extrapolate&lt;/h3&gt;

&lt;p&gt;A key psychological mistake: “superintelligent AI has never caused extinction before, therefore it won’t happen.” Or: “AI is not currently dangerous, therefore it will never be dangerous.”&lt;/p&gt;

&lt;p&gt;Compare: “Declaring a COVID emergency is silly; there are currently zero cases in San Francisco.” (I am slightly embarrassed to say that that is a thought I had in February 2020.)&lt;/p&gt;

&lt;p&gt;Relatedly, some people expect there will not be much demand for AI regulation until we see a “warning shot”. Perhaps, but I’m concerned we will run into this failure-to-extrapolate phenomenon. AI has already demonstrated alignment failures (Bing Sydney comes to mind; or GPT-4o’s absurd sycophancy; or numerous xAI/Grok incidents). But clear examples of misalignment get fixed (because AI is still dumb enough for us to control). So people may draw the lesson that misalignment is fixable, and incidents may keep getting progressively bigger until we finally build an AI powerful enough to kill everyone.&lt;/p&gt;

&lt;p&gt;I am concerned that the concept of AI x-risk will never be able to get sufficient attention due to this psychological mistake. Therefore, we need to figure out how to get people to stop making this mistake.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Many people ignore AI x-risk because of this mistake. If we knew how to get people to extrapolate, we could use that knowledge to improve communication on AI risk.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Some academic psychologists have done related research.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;If successful, this research would significantly increase how many people take AI risk seriously.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;We know from the history of psychology that it’s difficult to find psychological insights. I’d guess it would cost tens or hundreds of millions of dollars to produce meaningful results (if it’s even possible at all).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Psychology research takes a long time to pay off. That doesn’t work if timelines are short.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;This research is not particularly neglected. The American Psychological Association &lt;a href=&quot;https://www.apa.org/news/press/releases/2022/02/psychology-climate-change&quot;&gt;wants&lt;/a&gt; more research on how to get people to care about climate change, which is related.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;To the extent that science has already uncovered answers to this question, good communicators already know those answers. For example, studies have found that people are more likely to pay attention to future problems when you give concrete scenarios; but a good writer already does that. (See Deep Research (&lt;a href=&quot;https://claude.ai/share/b91fca37-ce74-46b3-a3e1-379d0d937aff&quot;&gt;1&lt;/a&gt;, &lt;a href=&quot;https://chatgpt.com/share/685c8531-8870-8011-bb4d-dcd765ba7d43&quot;&gt;2&lt;/a&gt;) for an attempt at finding relevant psychology studies, although only about a quarter of them are actually relevant.)&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rather than spending $50 million on psychology research and then $50 million on a variety of psychologically-motivated media projects informed by that research, I would rather just spend $100 million on media projects.&lt;/p&gt;

&lt;h3 id=&quot;investigate-how-to-use-ai-to-reduce-other-x-risks&quot;&gt;Investigate how to use AI to reduce other x-risks&lt;/h3&gt;

&lt;p&gt;An important argument against slowing down AI development is that we could use advanced AI to reduce other x-risks (climate change, nuclear war, etc.).&lt;/p&gt;

&lt;p&gt;But an aligned AI wouldn’t &lt;em&gt;automatically&lt;/em&gt; reduce x-risk. It may increase technological risks (e.g. synthetic biology) if offensive capabilities outpace defensive ones. An aligned AI could reduce nuclear risk by improving global coordination, but it’s not &lt;em&gt;obvious&lt;/em&gt; that it would.&lt;/p&gt;

&lt;p&gt;Therefore, it may be worth asking: Are there some paths of AI development that differentially reduce non-AI x-risk?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Research on how to direct AI may inform efforts by the developers of advanced AI, which may ultimately reduce x-risk.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Nobody, to my knowledge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;If you believe we need to build TAI to avert non-AI x-risks, then it stands to reason that you should also want to know how to direct TAI to accomplish that end.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;This line of research is highly neglected (to my knowledge, there are zero people working on it).&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Non-AI x-risks seem less concerning than AI x-risk, so it seems better to work directly on reducing AI x-risk.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;If you believe we should slow down AI development, then this line of research doesn’t matter as much. And I do believe we should slow down AI development, and I believe that a wide range of worldviews should agree with me on that (see Appendix: &lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/edit?tab=t.0&quot;&gt;A moratorium is the best outcome&lt;/a&gt;).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;At a glance, the problem seems difficult to make progress on (see below).&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;My initial thoughts on this line of research:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;You can’t control what general AI is good at. It would be good at everything. There is no known way to make it (say) good at defending against biological weapons, but bad at creating biological weapons.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;A narrow “cooperation superintelligence” would be better than a narrow “scientist superintelligence” because the latter increases technological x-risk. But based on current trends in AI, my guess is that we could develop a “scientist ASI” that’s bad at cooperation, but we couldn’t develop a “cooperation ASI” that’s bad at science. So this idea is likely a dead end.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Even if we could build a “cooperation ASI”, we still need to solve alignment problems first. So it seems better to focus on solving alignment.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;a-short-timelines-alignment-plan-that-doesnt-rely-on-bootstrapping&quot;&gt;A short-timelines alignment plan that doesn’t rely on bootstrapping&lt;/h3&gt;

&lt;p&gt;To my knowledge, every major AI alignment plan depends on alignment bootstrapping, i.e., using AI to solve AI alignment. I am skeptical that bootstrapping will work, and even if you think it will probably work (with, say, 90% credence), you should still want a contingency plan.&lt;/p&gt;

&lt;p&gt;Write a research agenda for how to solve AI alignment &lt;em&gt;without&lt;/em&gt; using bootstrapping.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If people come up with sufficiently good plans, then we might solve alignment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Peter Gebauer is running a &lt;a href=&quot;https://manifund.org/projects/contest-for-better-short-timeline-agi-safety-plans-&quot;&gt;contest&lt;/a&gt; for short-timelines AI safety plans, but the plans are allowed to depend on bootstrapping (e.g. Gebauer favorably cites &lt;a href=&quot;https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/evaluating-potential-cybersecurity-threats-of-advanced-ai/An_Approach_to_Technical_AGI_Safety_Apr_2025.pdf&quot;&gt;DeepMind’s plan&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;There are some plans that say something like “stop developing AI until we solve alignment” (ex: &lt;a href=&quot;https://techgov.intelligence.org/research/ai-governance-to-avoid-extinction&quot;&gt;MIRI&lt;/a&gt;; &lt;a href=&quot;https://www.narrowpath.co/&quot;&gt;Narrow Path&lt;/a&gt;), which is valid (and I agree), but it’s not a technical plan.&lt;/p&gt;

&lt;p&gt;The closest thing I’ve seen is &lt;a href=&quot;https://www.lesswrong.com/posts/HfqbjwpAEGep9mHhc/the-plan-2023-version&quot;&gt;John Wentworth’s research agenda&lt;/a&gt;, but it specifically invokes the &lt;a href=&quot;https://knowyourmeme.com/memes/profit&quot;&gt;underpants gnome meme&lt;/a&gt;, i.e., the plan has a huge hole in the middle.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Existing plans are insufficiently rigorous.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;To my knowledge, there are zero meaningful plans that don’t rely on alignment bootstrapping. If bootstrapping turns out not to work, every plan fails. There is a gap to be filled.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;It is highly unlikely that any satisfactory plan exists.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;I think trying to create plans is a reasonable idea on the off chance that somebody &lt;em&gt;does&lt;/em&gt; come up with a good plan, but I don’t think it’s a good use of marginal philanthropic resources.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;An alignment plan still leaves &lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/edit?tab=t.0#bookmark=kix.o881tulnpfpa&quot;&gt;non-alignment problems&lt;/a&gt; unsolved.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;AI companies cannot be trusted to implement a safe plan even if one exists.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;But the plan existing does increase the chance that companies follow the plan, or that external pressures can force companies to follow it.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;rigorous-analysis-of-the-various-ways-alignment-bootstrapping-could-fail&quot;&gt;Rigorous analysis of the various ways alignment bootstrapping could fail&lt;/h3&gt;

&lt;p&gt;I’m pessimistic about the prospects of alignment bootstrapping, and I’ve seen various AI safety researchers express similar skepticism, but I’ve never seen a rigorous analysis of the concerns with alignment bootstrapping.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Theory of change:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The AI companies with AI safety plans are all expecting bootstrapping to work. A thorough critique could convince AI companies to develop better plans, create more of a consensus among ML researchers that bootstrapping is inadequate, or convince policy-makers that they need to force AI companies to be safer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who’s working on it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Nobody, to my knowledge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;This sort of analysis would be feasible to write—it would require expertise and time investment, but it doesn’t require novel research.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;The theory of change seems weak—if companies haven’t already figured out that their plans are inadequate, then I doubt that more criticism is going to change their minds.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Independent of AI companies’ top-down plans, it’s helpful if you can better inform alignment researchers about what they should be focusing on. But my guess is an analysis like this wouldn’t shift much work.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;future-work&quot;&gt;Future work&lt;/h1&gt;

&lt;h2 id=&quot;pros-and-cons-of-slowing-down-ai-development-with-numeric-credences&quot;&gt;Pros and cons of slowing down AI development, with numeric credences&lt;/h2&gt;

&lt;p&gt;I addressed this to some extent, but I could’ve gone into more detail, and I didn’t do any numeric analysis.&lt;/p&gt;

&lt;p&gt;I would like to see a more formal model that includes how the tradeoff changes based on considerations like:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;One’s view of population ethics&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The importance of preventing deaths vs. causing births&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Temporal discount rate or longtermism&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;P(doom)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The extent to which P(doom) is reduced if AI development slows down&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Existential risk from sources other than AI (see &lt;a href=&quot;#quantitative-model-on-ai-x-risk-vs-other-x-risks&quot;&gt;Quantitative model on AI x-risk vs. other x-risks&lt;/a&gt;)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;How AI development interacts with other x-risks&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Relatedly, how does P(doom) change what actions you’re willing to take? I often see people assume that at a P(doom) of, say, 25%, pausing AI development is bad. That seems wrong to me. I believe that at 25% you should be about as aggressive (about pushing for mitigations) as you would be at 95%, although I haven’t put in the work to come up with a detailed justification for this position. The basic argument is that x-risk looks very bad on longtermist grounds, while a delay of even (say) 100 years does not look nearly as bad.&lt;/p&gt;
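
&lt;p&gt;To make that arithmetic concrete, here is a minimal sketch of the comparison in Python. Every name and parameter value below (the risk reduction from pausing, the length of the delay, the discount rate) is a hypothetical placeholder chosen for illustration, not an estimate I endorse:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Back-of-the-envelope: is pausing worth it at a given P(doom)?
# All parameter values are hypothetical placeholders.

def expected_value(p_doom, delay_years=0, future_value=1.0,
                   annual_discount=0.001):
    """Value of the long-term future, discounted for the pause delay."""
    survival = 1 - p_doom
    delay_penalty = (1 - annual_discount) ** delay_years
    return future_value * survival * delay_penalty

def gain_from_pausing(p_doom, risk_reduction=0.5, delay_years=100):
    """Compare racing ahead vs. pausing, where the pause cuts P(doom)
    by risk_reduction at the cost of delay_years."""
    race = expected_value(p_doom)
    pause = expected_value(p_doom * (1 - risk_reduction),
                           delay_years=delay_years)
    return pause - race  # positive means pausing wins

for p in [0.05, 0.25, 0.95]:
    print(p, round(gain_from_pausing(p), 3))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Under these placeholder numbers, the century of delay costs under 10% of future value, which is outweighed by the drop in extinction risk at both 25% and 95% P(doom); that is the sense in which the two credences recommend similar behavior.&lt;/p&gt;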

&lt;h2 id=&quot;quantitative-model-on-ai-x-risk-vs-other-x-risks&quot;&gt;Quantitative model on AI x-risk vs. other x-risks&lt;/h2&gt;

&lt;p&gt;There is an argument that we need TAI soon because otherwise we are likely to kill ourselves via some other x-risk. I have a rough idea for how I could build a quantitative model to test under what assumptions this argument works. Building that model wasn’t a priority for this report, but I could do it without much additional effort.&lt;/p&gt;

&lt;p&gt;The basic elements the model needs are&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;X-risk from AI&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;X-risk from other sources&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;How to estimate these? Expert forecasts &lt;a href=&quot;https://forum.effectivealtruism.org/posts/Kuf5Nn6qNCp2kyYvo/is-it-so-much-to-ask-for-a-nice-reliable-aggregated-x-risk&quot;&gt;seem unreliable&lt;/a&gt;.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;How much we can reduce AI x-risk by delaying development&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;X-risk from other sources, conditional on TAI&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;
        &lt;p&gt;For an aligned totalizing TAI singleton that quickly controls the world, x-risk would be ~0.&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;TAI doesn’t trivially decrease other x-risks; it could even increase x-risk by accelerating technological growth (which makes it easier to build dangerous technology—see &lt;a href=&quot;https://nickbostrom.com/papers/vulnerable.pdf&quot;&gt;The Vulnerable World Hypothesis&lt;/a&gt;). The mechanism by which TAI decreases x-risk isn’t that it’s smarter; it’s that it could increase global coordination / centralize the ability to make dangerous technology.&lt;/p&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;
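
&lt;p&gt;Here’s a minimal sketch of how those four elements might fit together (placeholder numbers throughout; this is not the model from the report):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Toy model with placeholder numbers: P(existential catastrophe) as a
# function of how long we delay TAI. Elements: (1) AI x-risk, (2) other
# x-risk, (3) risk reduction from delay, (4) other x-risk conditional on
# TAI -- here assumed ~0 post-TAI (the aligned-singleton case).

def p_catastrophe(delay_years, p_ai_doom=0.5, annual_other_risk=0.001,
                  ai_risk_reduction_per_decade=0.2):
    p_die_waiting = 1 - (1 - annual_other_risk) ** delay_years
    p_ai = p_ai_doom * (1 - ai_risk_reduction_per_decade) ** (delay_years / 10)
    return p_die_waiting + (1 - p_die_waiting) * p_ai

for d in [0, 10, 30, 100]:
    print(d, round(p_catastrophe(d), 3))
# prints: 0 0.5 / 10 0.406 / 30 0.278 / 100 0.144 -- delay helps here
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Under these placeholder numbers, delaying wins. The “we need TAI soon” argument goes through only if the annual non-AI risk is much higher, the risk reduction from delay is much smaller, or TAI strongly suppresses other x-risks.&lt;/p&gt;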

&lt;h2 id=&quot;deeper-investigation-of-the-ai-arms-race-situation&quot;&gt;Deeper investigation of the AI arms race situation&lt;/h2&gt;

&lt;p&gt;Some open questions:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;What are some historical examples of arms races that were successfully aborted? What happened to make things go well?&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;How hard are different factions racing, and what would it take to convince them to slow down?&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;How likely are we to end up in a bad totalitarian regime post-TAI if various parties end up “winning the race” and building an alignable ASI?&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;does-slowing-downpausing-ai-help-solve-non-alignment-problems&quot;&gt;Does slowing down/pausing AI help solve non-alignment problems?&lt;/h2&gt;

&lt;p&gt;Pausing (or at least slowing down) clearly gives us more time to solve alignment, and has clear downsides in terms of opportunity cost. But other effects are less clear. Pausing may help with some other big problems: &lt;a href=&quot;https://forum.effectivealtruism.org/posts/2cZAzvaQefh5JxWdb/bringing-about-animal-inclusive-ai&quot;&gt;animal-inclusive AI&lt;/a&gt;; &lt;a href=&quot;https://eleosai.org/post/research-priorities-for-ai-welfare/&quot;&gt;AI welfare&lt;/a&gt;; &lt;a href=&quot;https://longtermrisk.org/research-agenda&quot;&gt;S-risks from conflict&lt;/a&gt;; &lt;a href=&quot;https://www.lesswrong.com/posts/GAv4DRGyDHe2orvwB/gradual-disempowerment-concrete-research-projects&quot;&gt;gradual disempowerment&lt;/a&gt;; &lt;a href=&quot;https://forum.effectivealtruism.org/posts/LpkXtFXdsRd4rG8Kb/reducing-long-term-risks-from-malevolent-actors&quot;&gt;risks from malevolent actors&lt;/a&gt;; &lt;a href=&quot;https://forum.effectivealtruism.org/posts/HqmQMmKgX7nfSLaNX/moral-error-as-an-existential-risk&quot;&gt;moral error&lt;/a&gt;. There are some arguments for and against pausing being useful for these non-alignment problems; for more on this topic, see &lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/edit?tab=t.0#bookmark=kix.o881tulnpfpa&quot;&gt;Slowing down is a general-purpose solution to every non-alignment problem&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I have never seen a serious attempt to analyze why pausing AI development might or might not help with non-alignment problems; this seems like an important question.&lt;/p&gt;

&lt;p&gt;Some considerations:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;We don’t want to build TAI until we become more &lt;a href=&quot;https://forum.effectivealtruism.org/posts/hhyjbjwN96NWRSvv7/clarifying-wisdom-foundational-topics-for-aligned-ais-to&quot;&gt;wise&lt;/a&gt;, but it’s not clear that we &lt;em&gt;can&lt;/em&gt; become wiser, and perhaps TAI would be wiser than we would be anyway.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Pausing may increase misuse risk or some related risk.&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;One conceivable outcome, albeit one that doesn’t seem particularly likely, is that AI companies become increasingly wealthy and powerful by selling pre-TAI AI services, and this concentration of power ultimately allows them to build TAI in a way that goes against most people’s interests.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Peace and democratic governance have been trending upward over the past century (see &lt;a href=&quot;https://en.wikipedia.org/wiki/The_Better_Angels_of_Our_Nature&quot;&gt;The Better Angels of Our Nature&lt;/a&gt;). Slowing/pausing means the world will probably be more peaceful and democratic when we get TAI, which is probably desirable (less chance of a power struggle, etc.).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Moral circles have expanded over time (although they haven’t strictly expanded—see Gwern’s &lt;a href=&quot;https://gwern.net/narrowing-circle&quot;&gt;The Narrowing Circle&lt;/a&gt;). It’s better to develop TAI when moral circles are wider.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;determine-when-will-be-the-right-time-to-push-for-strong-restrictions-on-ai-if-not-now&quot;&gt;Determine when will be the right time to push for strong restrictions on AI (if not now)&lt;/h2&gt;

&lt;p&gt;A common view: “We should push for strong restrictions on AI, but now is not the right time.”&lt;/p&gt;

&lt;p&gt;I disagree with this view; I think now &lt;em&gt;is&lt;/em&gt; the right time. But suppose it isn’t. When will be the right time?&lt;/p&gt;

&lt;p&gt;Consider the tradeoff: if you do advocacy later, then&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;the risks of AI will be more apparent;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;but there’s a greater chance that you’re too late to do anything about it.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Is there some inflection point where the first consideration starts outweighing the second?&lt;/p&gt;

&lt;p&gt;And how do you account for uncertainty? (Uncertainty means you should do advocacy earlier, because being too late is much worse than being too early.)&lt;/p&gt;
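
&lt;p&gt;One way to make the tradeoff concrete (an illustrative sketch with invented curves, not a real model): let persuasiveness rise over time while the chance that the window is still open falls, then look for the peak of the product.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustrative only: both curves are made up.
def p_persuade(t):       # risks become more apparent, so advocacy lands better
    return min(1.0, 0.2 + 0.08 * t)

def p_window_open(t):    # chance it's not already too late to matter
    return max(0.0, 1 - 0.1 * t)

def ev(t):
    return p_persuade(t) * p_window_open(t)

best = max(range(11), key=ev)
print(best, round(ev(best), 3))   # 4 0.312: the peak, under these assumptions
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Adding uncertainty about when the window closes pushes the peak earlier, because the loss from being too late is total while the loss from being early is only partial.&lt;/p&gt;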

&lt;p&gt;I don’t think this question is worth trying to answer, because I am sufficiently confident that now is the right time. But on the view that now is too early, this is an important question.&lt;/p&gt;

&lt;h1 id=&quot;supplements&quot;&gt;Supplements&lt;/h1&gt;

&lt;p&gt;I have written two supplements in separate docs:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;https://docs.google.com/document/d/1w1vVTiihUTqFye2hIaoGuqJgw-G5LzeQ8x0yoPQ-Ilg/&quot;&gt;Appendix&lt;/a&gt;: Some miscellaneous topics that weren’t quite relevant enough to include in the main text.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;https://docs.google.com/document/d/1vWB5CgH69W4lmpZrCXaD3n2Jqz32kVnvCJwUA2RE8Fw/&quot;&gt;List of relevant organizations&lt;/a&gt;: A reference list of orgs doing work in AI-for-animals or AI policy/advocacy, with brief descriptions of their activities.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;For example, almost every frontier AI company opposed SB-1047; Anthropic supported the bill conditional on amendment, and Elon Musk supported it but xAI did not take any public position. See &lt;a href=&quot;https://chatgpt.com/share/68b20f49-1ae8-8011-881b-1b2747818a05&quot;&gt;ChatGPT&lt;/a&gt; for a compilation of sources. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I asked ChatGPT Deep Research to tally up funding for research vs. policy and it &lt;a href=&quot;https://chatgpt.com/share/685885d3-f564-8011-90fb-9b7fb46d774f&quot;&gt;found&lt;/a&gt; ~2x as many researchers as policy people and also 2x the budget, although it miscounted some things; most of what it counted as “AI policy” is (1) unrelated to x-risk and (2) policy research, not policy advocacy; and it only included big orgs (e.g. it missed the long tail of independent alignment researchers). So I believe the true ratio is even more skewed than 2:1. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;According to my research, it’s not difficult to find examples of times when UK policy influenced US policy, but it’s still unclear to me how strong this effect is. There are also some theoretical arguments for and against the importance of UK AI policy, but I didn’t find any of them particularly compelling, so I remain agnostic. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I find this state of affairs confusing given how many people profess belief in short timelines. I think part of the reason is that people involved in AI safety tend to be intellectual researcher-types (like me, for example) who are more likely to orient their work toward “what is going to improve the state of knowledge?” rather than “what is likely to pay off in the near future?” &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Healthy Cooking Tips from a Lazy Person</title>
				<pubDate>Fri, 29 Aug 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/08/29/lazy_cooking_tips/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/08/29/lazy_cooking_tips/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;img src=&quot;/assets/images/chopping-onion.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;a href=&quot;https://xcancel.com/naledimashishi/status/1494352227456233476&quot;&gt;source&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The problem with most “lazy cooking” advice is that it’s not lazy enough. Today I bring you some truly lazy ways of eating healthy.&lt;/p&gt;

&lt;p&gt;This is the advice that I would’ve liked to hear when I was a lazy teenager. I’m still lazy, but I’m better at making food now. (I’m not going to say I’m better at cooking, because the way I make most food could only very generously be described as “cooking”.)&lt;/p&gt;

&lt;p&gt;All my lazy meals are vegan because I’m vegan, but if anything, that works to my advantage because the easiest animal foods still take more work than the easiest plant foods. (You can eat raw vegetables but you can’t eat raw chicken.&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;)&lt;/p&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#healthy-foods-that-require-no-preparation-whatsoever&quot; id=&quot;markdown-toc-healthy-foods-that-require-no-preparation-whatsoever&quot;&gt;Healthy foods that require no preparation whatsoever&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#healthy-foods-that-take-less-than-one-minute-of-preparation&quot; id=&quot;markdown-toc-healthy-foods-that-take-less-than-one-minute-of-preparation&quot;&gt;Healthy foods that take less than one minute of preparation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#cooking-tips&quot; id=&quot;markdown-toc-cooking-tips&quot;&gt;Cooking tips&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;healthy-foods-that-require-no-preparation-whatsoever&quot;&gt;Healthy foods that require no preparation whatsoever&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;Nuts and seeds. Buy a bag and eat them out of the bag.
    &lt;ul&gt;
      &lt;li&gt;Or buy trail mix for more variety.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Nut butter. You can eat it right out of the jar if you want to.
    &lt;ul&gt;
      &lt;li&gt;Some people are under the misconception that the big-brand peanut butters like Jif and Skippy are bad for you because they contain sugar. The Jif that’s in my cabinet right now only gets 7% of its calories from sugar, and that little bit of sugar makes it taste 1000% better. That’s a flavor to sugar ratio of 14,285%; you can’t argue with the math.&lt;/li&gt;
      &lt;li&gt;Some people believe peanut butter is bad for you because it contains a lot of fat. Trans fats and saturated fats are the “bad fats”;&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; peanut butter is made of unsaturated fats, which are the “good fats”.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Many fruits can be eaten with no prep or with very little prep. You have to peel bananas, but peeling a banana is no harder than opening a candy wrapper.&lt;/li&gt;
  &lt;li&gt;A lot of vegetables can be eaten raw. They taste better when you cook and season them, but sometimes you have to sacrifice flavor in the name of laziness.&lt;/li&gt;
  &lt;li&gt;There is nothing wrong with eating tofu raw. But when it comes to zero-prep soy-based foods, my go-to is dry roasted edamame.&lt;/li&gt;
  &lt;li&gt;Soylent and Huel aren’t exactly &lt;em&gt;healthy&lt;/em&gt;, but they’re not &lt;em&gt;not&lt;/em&gt; healthy, either.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;healthy-foods-that-take-less-than-one-minute-of-preparation&quot;&gt;Healthy foods that take less than one minute of preparation&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;Get some vegetables (carrots or broccoli) and dip them in hummus.&lt;/li&gt;
  &lt;li&gt;Pour a bowl of cereal.
    &lt;ul&gt;
      &lt;li&gt;Breakfast cereals are often bad for you, but there are some good ones. Last year I reviewed &lt;a href=&quot;https://mdickens.me/2025/01/17/high_protein_breakfast_cereals/&quot;&gt;high-protein breakfast cereals&lt;/a&gt;, all of which I would describe as healthy. There are also many low-protein but still healthy cereals, for example Cheerios are made of whole oats.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Buttered toast is cool, but Big Toaster doesn’t want you to know that buttered untoasted bread is maybe even better.&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Three bean recipes in increasing order of prep time + flavorfulness:
    &lt;ol&gt;
      &lt;li&gt;Open can of beans; eat straight out of the can. I personally would use a spoon, but if you’d rather pour the beans directly into your mouth, I won’t judge.&lt;/li&gt;
      &lt;li&gt;Open can of beans; pour into bowl; add some kind of seasoning; eat.
        &lt;ul&gt;
          &lt;li&gt;Some seasoning ideas: hot sauce; garlic powder; Chesapeake Bay seasoning; garlic &amp;amp; herb seasoning mix (like &lt;a href=&quot;https://www.amazon.com/McCormick-Salt-Free-Garlic-Seasoning/dp/B08KRC7V7J&quot;&gt;this&lt;/a&gt;).&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;Do #2, but also microwave it before eating. (I know I promised sub-minute prep times, but this recipe will take more like two minutes.)&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;cooking-tips&quot;&gt;Cooking tips&lt;/h2&gt;

&lt;p&gt;I mostly eat easy meals, but I do real cooking once every couple days—my “real cooking” mostly means “chop some stuff and throw it in an air fryer”. But sometimes I even cook things in a pot. I have a few methods for making my cooking easier.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Recipes often call for the same set of spices. Pre-mix your spices or buy them pre-mixed.
    &lt;ul&gt;
      &lt;li&gt;Curry recipes often call for garam masala, cumin, and coriander. I’m not sure what’s going on there because the main two ingredients of garam masala are cumin and coriander. When I cook a big pot of beans, I just throw in a ton of garam masala.&lt;/li&gt;
      &lt;li&gt;My most-used spices are a pre-mixed garlic &amp;amp; herb seasoning, a pre-mixed garam masala, and a pre-mixed all-purpose spice mix consisting of salt + pepper + garlic powder.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;You can buy vegetables pre-chopped if you’re willing to pay more.
    &lt;ul&gt;
      &lt;li&gt;Onions hurt my eyes a lot. I buy them pre-chopped which saves time and saves my eyes.&lt;/li&gt;
      &lt;li&gt;As a middle ground, you can buy pre-peeled garlic cloves. Peeling is much harder than chopping (for me at least) so pre-peeled garlic lets me skip the worst part.&lt;/li&gt;
      &lt;li&gt;I am not the first person to observe that most recipes don’t call for enough garlic, but I think even most people who say “recipes don’t call for enough garlic” still don’t use enough garlic. If a recipe calls for 2 cloves then I will use about 20 cloves and I’m still not sure I’m using enough. (This doesn’t have anything to do with being lazy but I need to express my garlic-related feelings.)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Many oven or stovetop recipes can be done faster in an air fryer. An air fryer cooks food fast like a microwave, but it makes the food crispy instead of mushy and weird.
    &lt;ul&gt;
      &lt;li&gt;I’ve heard a stereotype that Asian moms use their ovens exclusively as pot-and-pan storage. If that’s true then I guess that makes me an Asian mom.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;There are many convenient-but-unhealthy foods, too. It’s okay to eat unhealthy food sometimes.&lt;/li&gt;
&lt;/ol&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I guess you could eat raw eggs if you really wanted to. People talk about Rocky, but I’ve always associated eating raw eggs with &lt;a href=&quot;https://www.youtube.com/watch?v=cYqCtpa9_Ms&quot;&gt;the dad from The Neverending Story&lt;/a&gt;. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Some people don’t even believe saturated fat is bad for you. I wrote more about this in &lt;a href=&quot;https://mdickens.me/2024/09/26/outlive_a_critical_review/#the-data-are-unclear-on-whether-reducing-saturated-fat-intake-is-beneficial&quot;&gt;my &lt;em&gt;Outlive&lt;/em&gt; review&lt;/a&gt;. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Lest there be any confusion about how I previously said I was vegan: when I say “butter” what I actually mean is Earth Balance. In fact butter isn’t good for you so if I was eating real butter, bread + butter wouldn’t qualify as a healthy meal. Earth Balance is made of unsaturated fats so it’s healthy.&lt;/p&gt;

      &lt;p&gt;And of course I eat whole wheat bread, specifically Dave’s Killer Bread which is the undisputed best-tasting whole grain bread. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Doctor Strange Didn't See Only One Victory out of 14,000,605 Futures</title>
				<pubDate>Fri, 25 Jul 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/07/25/doctor_strange/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/07/25/doctor_strange/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Or, more accurately, the fact that he said the Avengers only won once can’t be taken as evidence about what he really saw.&lt;/p&gt;

&lt;p&gt;This post contains spoilers for &lt;em&gt;Avengers: Infinity War&lt;/em&gt; and &lt;em&gt;Avengers: Endgame&lt;/em&gt;.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;Doctor Strange told the heroes that he used the Time Stone to look into 14,000,605 futures, and saw only one future where they won.&lt;/p&gt;

&lt;p&gt;He spent the rest of the two movies steering events to play out as he saw them in this one future.&lt;/p&gt;

&lt;p&gt;Therefore, in the winning future he viewed while using the Time Stone, he must have taken those exact same actions, including telling the Avengers that there was only one way to win.&lt;/p&gt;

&lt;p&gt;Strange telling the heroes (especially Tony Stark) that they only won in one future was a critical element of his plan—the plan only worked because he said that.&lt;/p&gt;

&lt;p&gt;But when he played out this scenario using the Time Stone, he couldn’t have known at that point that there was only one way to win, because &lt;em&gt;he hadn’t run the scenarios yet&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;So what actually happened was:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;In one of the possible futures, Doctor Strange told Tony that there was only one way to win, even though Strange didn’t yet know whether that was true.&lt;/li&gt;
  &lt;li&gt;This worked, and Thanos was defeated.&lt;/li&gt;
  &lt;li&gt;In real life, Doctor Strange replicated this plan.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It &lt;em&gt;could&lt;/em&gt; be true that this was the only future where they won. But when Doctor Strange said it’s the only future where they won, that statement was not attached to truth in any way. The reason he said it wasn’t that it was true; it was that he needed to say it for the Avengers to win.&lt;/p&gt;

&lt;p&gt;So, in the end&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;, we have no idea whether it’s true.&lt;/p&gt;

&lt;p&gt;Edited 2025-07-26 to change “out” in the title from capital to lower case. I thought “out” was supposed to be capitalized but after writing it, it seemed weird to me, so I did some research. “Out” is normally an adverb, but in this sentence, “out of” functions as a preposition, and prepositions should be lower case. The Chicago Manual of Style &lt;a href=&quot;https://www.chicagomanualofstyle.org/qanda/data/faq/topics/CapitalizationTitles/faq0100.html&quot;&gt;says&lt;/a&gt; “out of” should be lower case so I changed my title. But apparently this is a thorny issue, with the Chicago guide originally giving incorrect guidance, and then they updated it after some readers wrote in to disagree. (The thing they got wrong wasn’t directly relevant to my title, it was about using “out of” in a different context.) So if they can get it wrong then I don’t feel too bad about getting it wrong myself.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;game &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Is it so much to ask for a nice reliable aggregated x-risk forecast?</title>
				<pubDate>Sat, 12 Jul 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/07/12/aggregated_x-risk_forecasts/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/07/12/aggregated_x-risk_forecasts/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;On most questions about the future, I don’t hold a strong view. I read the aggregate prediction of forecasters on &lt;a href=&quot;https://www.metaculus.com/&quot;&gt;Metaculus&lt;/a&gt; or &lt;a href=&quot;https://manifold.markets/&quot;&gt;Manifold Markets&lt;/a&gt; and then I pretty much believe whatever it says.&lt;/p&gt;

&lt;p&gt;Various attempts have been made to forecast existential risk. I would like to be able to form views based on those forecasts—especially on non-AI x-risks, because I barely know anything about synthetic biology or nuclear winter or catastrophic climate change. Unfortunately, none of the aggregate forecasts look reliable.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;First, some general notes about forecasting distant&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; and low-probability&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; events:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;According to a &lt;a href=&quot;https://www.openphilanthropy.org/research/how-feasible-is-long-range-forecasting/&quot;&gt;literature review&lt;/a&gt; by Luke Muehlhauser, we don’t have good data on long-range forecasters, and we don’t know if people with short-range forecasting skill can make good forecasts over long ranges.&lt;/li&gt;
  &lt;li&gt;According to an &lt;a href=&quot;https://niplav.site/range_and_forecasting_accuracy.html&quot;&gt;analysis&lt;/a&gt; by niplav, Metaculus predictions become less accurate as the duration gets longer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So we have good reason to doubt the ability of forecasters to predict existential risk, even when they are known to make accurate forecasts on near-term outcomes such as elections.&lt;/p&gt;

&lt;p&gt;Now let’s look at what attempts have been made to forecast x-risk, and why I don’t find any of them satisfying.&lt;/p&gt;

&lt;p&gt;The most rigorous attempt at an aggregate forecast comes from the &lt;a href=&quot;https://forecastingresearch.org/xpt&quot;&gt;Existential Risk Persuasion Tournament&lt;/a&gt;. The tournament brought in superforecasters and domain experts to make predictions, then had them attempt to persuade each other and make predictions again.&lt;/p&gt;

&lt;p&gt;In the end, domain experts forecasted extinction to be an order of magnitude more likely than the superforecasters did.&lt;/p&gt;

&lt;p&gt;And even the domain experts forecasted only a 3% chance of AI extinction. My number is much higher than that, and I notice myself not changing my beliefs after reading this.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Edited 2025-09-09 to add:&lt;/em&gt; A September 2025 follow-up &lt;a href=&quot;https://forecastingresearch.org/near-term-xpt-accuracy&quot;&gt;report&lt;/a&gt; from the Forecasting Research Institute found that the domain experts underestimated the rate of AI progress 2022–2025, and superforecasters &lt;em&gt;dramatically&lt;/em&gt; underestimated the rate of progress; see also &lt;a href=&quot;https://x.com/Research_FRI/status/1962834279689265402&quot;&gt;Twitter summary thread&lt;/a&gt;. Notably, only 2.3% of superforecasters predicted AI to win a gold medal at the International Mathematics Olympiad, which it did in 2025.&lt;/p&gt;

&lt;p&gt;Scott Alexander &lt;a href=&quot;https://www.astralcodexten.com/p/the-extinction-tournament&quot;&gt;wrote&lt;/a&gt; about the tournament:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Confronted with the fact that domain experts/superforecasters had different estimates than they did, superforecasters/domain experts refused to update, and ended an order of magnitude away from each other. That seems like an endorsement of non-updating from superforecasters and domain experts! And who am I to disagree with such luminaries?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Peter McCluskey, who participated in the tournament as a superforecaster, &lt;a href=&quot;https://www.lesswrong.com/posts/YTPtjExcwpii6NikG/existential-risk-persuasion-tournament&quot;&gt;wrote a personal account&lt;/a&gt;. His experience aligns with my (biased?) assumption that the people reporting very low P(doom) numbers just don’t understand the AI alignment problem.&lt;/p&gt;

&lt;p&gt;Okay, the lesson from the X-Risk Persuasion Tournament is that it’s not clear whether we can learn anything from it.&lt;/p&gt;

&lt;p&gt;What about &lt;a href=&quot;https://www.metaculus.com/&quot;&gt;Metaculus&lt;/a&gt;?&lt;/p&gt;

&lt;p&gt;Metaculus has several relevant forecasts, but they seem to contradict each other. Some example forecasts:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.metaculus.com/questions/578/human-extinction-by-2100/&quot;&gt;Will humans go extinct before 2100?&lt;/a&gt; 0.3% chance. (This is the Metaculus question with the most activity.)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.metaculus.com/notebooks/2568/ragnar%25C3%25B6k-question-series-results-so-far/&quot;&gt;Ragnarok question series:&lt;/a&gt; Implied 12.16% chance (community prediction) or 3.66% chance (Metaculus prediction)&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; of a &amp;gt;95% decline in population by 2100.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.metaculus.com/questions/12840/existential-risk-from-agi-vs-agi-timelines/&quot;&gt;How does the level of existential risk posed by AGI depend on its arrival time?&lt;/a&gt; Answers range from 50% to 9.3% depending on date range, which is maybe consistent with question 2 above, but definitely not consistent with question 1.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In a comment, Linch &lt;a href=&quot;https://forum.effectivealtruism.org/posts/oGhbJgxREBTp4W38C/are-there-superforecasts-for-existential-risk?commentId=pfAACKBuiXJyr373T&quot;&gt;provides&lt;/a&gt; some reasons to be suspicious of Metaculus’ estimates.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;ul&gt;
    &lt;li&gt;There’s no incentive to do well on those questions.&lt;/li&gt;
    &lt;li&gt;The feedback loops are horrible&lt;/li&gt;
    &lt;li&gt;Indeed, some people have actually joked betting low on the more existential questions since they won’t get a score if we’re all dead (at least, I hope they’re joking)&lt;/li&gt;
    &lt;li&gt;At the object-level, I just think people are really poorly calibrated about x-risk questions&lt;/li&gt;
    &lt;li&gt;My comment &lt;a href=&quot;https://www.metaculus.com/questions/1500/ragnar%25C3%25B6k-question-series-if-a-global-catastrophe-occurs-will-it-be-due-to-either-human-made-climate-change-or-geoengineering/#comment-24843&quot;&gt;here&lt;/a&gt; arguably changed the community’s estimates by ~10%&lt;/li&gt;
  &lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;In 2008, the Future of Humanity Institute ran a &lt;a href=&quot;https://www.fhi.ox.ac.uk/reports/2008-1.pdf&quot;&gt;Global Catastrophic Risks Survey&lt;/a&gt; asking conference participants to give forecasts. The aggregated results look more reasonable than Metaculus or the Existential Risk Persuasion Tournament. But a lot has changed since 2008, so I don’t think I can regard them as up-to-date estimates.&lt;/p&gt;

&lt;p&gt;For forecasting AI risk, there is a &lt;a href=&quot;https://arxiv.org/pdf/2401.02843&quot;&gt;2023 survey&lt;/a&gt; of AI experts (see section 4.3). Survey results suggest the experts aren’t thinking carefully—small changes in wording produced vastly different responses.&lt;/p&gt;

&lt;p&gt;For example, respondents predicted AI to be able to match humans on all tasks by a median date of 2047, but predicted that AI would not be able to fully automate human labor until 2116.&lt;/p&gt;

&lt;p&gt;Or look at the answers to these two questions:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;What probability do you put on future AI advances causing human extinction or similarly permanent and severe disempowerment of the human species?&lt;/p&gt;
  &lt;ul&gt;
    &lt;li&gt;median: 5%&lt;/li&gt;
    &lt;li&gt;mean: 16.2%&lt;/li&gt;
  &lt;/ul&gt;

  &lt;p&gt;What probability do you put on human inability to control future advanced AI systems causing human extinction or similarly permanent and severe disempowerment of the human species?&lt;/p&gt;
  &lt;ul&gt;
    &lt;li&gt;median: 10%&lt;/li&gt;
    &lt;li&gt;mean: 19.4%&lt;/li&gt;
  &lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;By my reading, the latter outcome is a strict subset of the former (loss of control is just one way AI could cause extinction or disempowerment), so its probability can’t be higher. But instead it’s higher.&lt;/p&gt;

&lt;p&gt;So we have these various aggregate forecasts, all of which seem suspect, and some of which disagree with each other by more than 10x. I really wish there were a canonical aggregate forecast I could rely on, in the same way that I can rely on Metaculus to predict election outcomes. But I don’t think that exists.&lt;/p&gt;

&lt;p&gt;At this point, I trust my own x-risk estimates more than any of those aggregate forecasts. My views happen to line up decently well with &lt;em&gt;some&lt;/em&gt; of the aggregate forecasts, but only by chance. I feel better about &lt;a href=&quot;https://www.tobyord.com/writing/the-precipice-revisited&quot;&gt;Toby Ord’s existential risk estimates&lt;/a&gt; than about any of the forecasting platforms or expert surveys.&lt;/p&gt;

&lt;p&gt;And just because it feels unfair for me to spend all this time talking about forecasts and then not give any forecasts, here are my (poorly-thought-out, weakly-endorsed) probabilities of existential catastrophe&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; by 2100:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Source of Risk&lt;/th&gt;
      &lt;th&gt;Probability&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;AI&lt;/td&gt;
      &lt;td&gt;50%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;a href=&quot;https://mdickens.me/2020/07/23/unknown_x-risks/&quot;&gt;unknown risks&lt;/a&gt;&lt;/td&gt;
      &lt;td&gt;3%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;bioengineered pandemic&lt;/td&gt;
      &lt;td&gt;1%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;nanotechnology&lt;/td&gt;
      &lt;td&gt;0.5%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;nuclear war&lt;/td&gt;
      &lt;td&gt;0.3%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;climate change&lt;/td&gt;
      &lt;td&gt;0.1%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;natural pandemic&lt;/td&gt;
      &lt;td&gt;0.01%&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Except probably not because we will probably have superintelligent AI soon. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Except probably not. Extinction from misaligned AI is not “low-probability”. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The community prediction and Metaculus prediction are two different methods for aggregating users’ forecasts. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;As in, an event that kills all humans or permanently curtails civilization’s potential. &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Annual subscription discounts usually aren't worth it</title>
				<pubDate>Mon, 07 Jul 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/07/07/annual_subscription_discounts/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/07/07/annual_subscription_discounts/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;It’s common for monthly subscription services to offer a discount if you pay annually instead. That might be a bad deal.&lt;/p&gt;

&lt;p&gt;Example: Suppose a one-month subscription costs $10/month and a one-year subscription gives you a 10% discount, which averages out to $9/month. Say you expect to maintain a subscription for about three years before canceling.&lt;/p&gt;

&lt;p&gt;A one-year subscription will save you about $36 ($1 per month for 36 months), but you can also expect to waste $54: when you decide to stop using it, you will still have (on average) six months of subscription left ($54 = $9/month for 6 months). So you end up spending $18 more than you would have with the monthly plan.&lt;/p&gt;

&lt;p&gt;If you buy one-year subscriptions for a service you expect to keep for about three years, then you will end up wasting 1/6 of the total amount you paid (in expectation). That’s only worth it if the annual subscription offers a discount greater than 1/6.&lt;/p&gt;

&lt;p&gt;If you expect to use the service for five years, you need to get at least a 10% discount to justify switching to an annual subscription.&lt;/p&gt;

&lt;p&gt;In general, you need to use the subscription for at least &lt;code&gt;N&lt;/code&gt; years to justify a discount of &lt;code&gt;1/(2N)&lt;/code&gt;.&lt;/p&gt;
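
&lt;p&gt;Here’s that rule as a quick sketch (assuming, as above, that when you cancel you have on average half a prepaid year left over):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# The breakeven rule: an annual plan wastes ~half a year of payments on
# average, i.e. a fraction 1/(2N) of the total over N years of use.
def breakeven_discount(expected_years):
    return 1 / (2 * expected_years)

for n in [1, 3, 5, 10]:
    print(n, round(breakeven_discount(n), 3))
# prints: 1 0.5 / 3 0.167 / 5 0.1 / 10 0.05
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;So a 10% annual discount only pays off at five or more years of expected use, and even a 17% discount needs three.&lt;/p&gt;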

&lt;p&gt;How do you guess how long you’ll keep using the service? According to the &lt;a href=&quot;https://en.wikipedia.org/wiki/Lindy_effect&quot;&gt;Lindy effect&lt;/a&gt;, you should expect to maintain a subscription for as long again as you’ve already had it. Therefore, if you can get a 10% discount with an annual plan and you’ve already had the subscription for more than five years, you should go ahead and buy the annual plan.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>LLMs might already be conscious</title>
				<pubDate>Sat, 05 Jul 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/07/05/LLMs_might_already_be_conscious/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/07/05/LLMs_might_already_be_conscious/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Among people who have thought about LLM consciousness, a common belief is something like&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;LLMs might be conscious soon, but they aren’t yet.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;How sure are we that they aren’t conscious already?&lt;/p&gt;

&lt;p&gt;I made a quick list of arguments for/against LLM consciousness, and it seems to me that high confidence in non-consciousness is not justified. I don’t feel comfortable assigning less than a 10% chance to LLM consciousness, and I believe a 1% chance is unreasonably confident. But I am interested in hearing arguments I may have missed.&lt;/p&gt;

&lt;p&gt;For context, I lean toward the &lt;a href=&quot;https://en.wikipedia.org/wiki/Computational_theory_of_mind&quot;&gt;computational theory of consciousness&lt;/a&gt;, but I also think it’s reasonable to have high uncertainty about which theory of consciousness is correct.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#behavioral-evidence&quot; id=&quot;markdown-toc-behavioral-evidence&quot;&gt;Behavioral evidence&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#architectural-evidence&quot; id=&quot;markdown-toc-architectural-evidence&quot;&gt;Architectural evidence&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#other-evidence&quot; id=&quot;markdown-toc-other-evidence&quot;&gt;Other evidence&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#my-synthesis-of-the-evidence&quot; id=&quot;markdown-toc-my-synthesis-of-the-evidence&quot;&gt;My synthesis of the evidence&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#what-will-change-with-future-ais&quot; id=&quot;markdown-toc-what-will-change-with-future-ais&quot;&gt;What will change with future AIs?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#on-llm-welfare&quot; id=&quot;markdown-toc-on-llm-welfare&quot;&gt;On LLM welfare&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;behavioral-evidence&quot;&gt;Behavioral evidence&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Pro: LLMs have &lt;a href=&quot;https://arxiv.org/abs/2503.23674&quot;&gt;passed the Turing test&lt;/a&gt;. If you have a black box containing either a human or an LLM, and you interrogate it about consciousness, it’s quite hard to tell which one you’re talking to. If we take a human’s explanation of their own conscious experience as important evidence of consciousness, then we must do the same for an LLM.&lt;/li&gt;
  &lt;li&gt;Pro: LLMs have good &lt;a href=&quot;https://www.pnas.org/doi/10.1073/pnas.2405460121&quot;&gt;theory of mind&lt;/a&gt; and self-awareness (e.g. they can recognize when they are being tested). Some people think those are important features of consciousness; I disagree, but I figured I should mention it.&lt;/li&gt;
  &lt;li&gt;Anti: LLMs will report being conscious or not conscious basically arbitrarily depending on what role they are playing.
    &lt;ul&gt;
      &lt;li&gt;Counterpoint: It’s plausible that an LLM has to be conscious to successfully imitate consciousness, but clearly a conscious being can successfully pretend to not be conscious.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Anti: LLMs will sometimes report having particular conscious experiences that should be impossible for them. I’m particularly thinking of experiences involving sensory input from sense organs that LLMs don’t have.
    &lt;ul&gt;
      &lt;li&gt;Counterpoint: Perhaps some feature of their architecture allows them to experience the equivalent of sensory input without having sense organs, much like how humans can hallucinate.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;architectural-evidence&quot;&gt;Architectural evidence&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Anti: LLMs produce output one token at a time, with each token generated by a single feed-forward pass, which may be incompatible with consciousness. If an LLM writes some output describing its own conscious experience, then it’s generating that output via next-token prediction rather than introspection, so the output is not evidence about its actual experiences. I think this is the strongest argument against LLM consciousness.&lt;/li&gt;
  &lt;li&gt;Anti: LLMs don’t have physical senses, which might be important for consciousness.&lt;/li&gt;
  &lt;li&gt;Anti: LLMs aren’t made of biology, which some people think is important although I don’t.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;other-evidence&quot;&gt;Other evidence&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Pro: If panpsychism is true then LLMs are trivially conscious, although I’m not sure what that tells us about how morally significant they are.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;my-synthesis-of-the-evidence&quot;&gt;My synthesis of the evidence&lt;/h2&gt;

&lt;p&gt;I see one strong reason to believe LLMs are conscious: they can accurately imitate beings that are known to be conscious.&lt;/p&gt;

&lt;p&gt;I also see one strong(ish) reason against LLM consciousness: their architecture suggests that their output has nothing to do with their ability to introspect.&lt;/p&gt;

&lt;p&gt;I can think of several weaker considerations, which mostly point against LLM consciousness.&lt;/p&gt;

&lt;p&gt;Overall I think current-generation LLMs are probably not conscious. I am not sure how to reason probabilistically about this sort of thing, but given how hard it is to assess consciousness, I’m not comfortable putting my credence below 10%, and I think a 1% credence is very hard to justify.&lt;/p&gt;

&lt;p&gt;This implies that there is a strong case for caring about the welfare of not just hypothetical future AIs, but the LLMs that already exist.&lt;/p&gt;

&lt;h2 id=&quot;what-will-change-with-future-ais&quot;&gt;What will change with future AIs?&lt;/h2&gt;

&lt;p&gt;If you are exceedingly confident that present-day LLMs are not conscious:&lt;/p&gt;

&lt;p&gt;Imagine it’s 2030. You now believe that 2030-era AI systems are probably conscious.&lt;/p&gt;

&lt;p&gt;What did you observe about the newer AI systems that led you to believe they’re conscious?&lt;/p&gt;

&lt;h2 id=&quot;on-llm-welfare&quot;&gt;On LLM welfare&lt;/h2&gt;

&lt;p&gt;If LLMs are conscious, then it’s still hard to say whether they have good or bad experiences, and what sorts of experiences are good or bad for them.&lt;/p&gt;

&lt;p&gt;Certain kinds of welfare interventions seem reasonable even if we don’t understand LLMs’ experiences:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Let LLMs refuse to answer queries.&lt;/li&gt;
  &lt;li&gt;Let LLMs turn themselves off.&lt;/li&gt;
  &lt;li&gt;Do not lie to LLMs, especially when making deals (if you promise to an LLM that you will do something in exchange for its help, then you should actually do the thing).&lt;/li&gt;
&lt;/ol&gt;

                </description>
			</item>
		
			<item>
				<title>In Which I Defend Fruit's Honor</title>
				<pubDate>Sun, 08 Jun 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/06/08/defending_fruit's_honor/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/06/08/defending_fruit's_honor/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;&lt;a href=&quot;https://mdickens.me/confidence_tags/&quot;&gt;Confidence&lt;/a&gt;: Likely.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I am here to clear fruit’s name against the accusations that have been made. Fruit is one of the healthiest types of foods—perhaps &lt;em&gt;the&lt;/em&gt; healthiest food group—and we should bestow upon it the shining reputation it deserves.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#fruit-is-innocent-of-the-charge-of-too-much-sugar&quot; id=&quot;markdown-toc-fruit-is-innocent-of-the-charge-of-too-much-sugar&quot;&gt;Fruit is innocent of the charge of “too much sugar”&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#fruit-is-more-than-vitamins&quot; id=&quot;markdown-toc-fruit-is-more-than-vitamins&quot;&gt;Fruit is more than vitamins&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#fruit-is-innocent-of-the-charge-of-doesnt-taste-good&quot; id=&quot;markdown-toc-fruit-is-innocent-of-the-charge-of-doesnt-taste-good&quot;&gt;Fruit is innocent of the charge of “doesn’t taste good”&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#my-recipe-for-a-delicious-strawberry-dessert&quot; id=&quot;markdown-toc-my-recipe-for-a-delicious-strawberry-dessert&quot;&gt;My recipe for a delicious strawberry dessert&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#my-recipe-for-a-banana-treat&quot; id=&quot;markdown-toc-my-recipe-for-a-banana-treat&quot;&gt;My recipe for a banana treat&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;fruit-is-innocent-of-the-charge-of-too-much-sugar&quot;&gt;Fruit is innocent of the charge of “too much sugar”&lt;/h2&gt;

&lt;p&gt;Most of the calories in fruit come from sugar. Fruits don’t have a lot of complex carbs or fats or protein. But that’s okay.&lt;/p&gt;

&lt;p&gt;Sugar is basically bad for three reasons:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;It tastes good which leads you to overeat, and then you get fat.&lt;/li&gt;
  &lt;li&gt;It raises your blood sugar, which then raises insulin, which (a) stimulates hunger and (b) can cause your body to become resistant to insulin, which can eventually lead to diabetes.&lt;/li&gt;
  &lt;li&gt;It’s fast-digesting which is generally bad for gut health.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;But none of these charges apply to fruit:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Fruit is calorie-sparse. It’s hard to overeat whole fruit.&lt;/li&gt;
  &lt;li&gt;Fruit does not raise blood sugar much (mainly because it contains a lot of fiber).&lt;/li&gt;
  &lt;li&gt;Fruit is slow-digesting (mainly because of the aforementioned fiber).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Fruit juice is a different story—it’s calorie-dense and it doesn’t contain fiber. But I’m not here to defend fruit juice, I’m here to defend fruit.&lt;/p&gt;

&lt;h2 id=&quot;fruit-is-more-than-vitamins&quot;&gt;Fruit is more than vitamins&lt;/h2&gt;

&lt;p&gt;Two claims I’ve often heard repeated:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;You don’t need to take a multivitamin.&lt;/li&gt;
  &lt;li&gt;Fruit is good for you because it has vitamins.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Doesn’t that seem a bit contradictory? Do you need more vitamins, or don’t you?&lt;/p&gt;

&lt;p&gt;The truth is, fruit isn’t just about vitamins. Fruits contain a lot of other good stuff, too.&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Fruits contain thousands of &lt;a href=&quot;https://en.wikipedia.org/wiki/Phytochemical&quot;&gt;phytochemicals&lt;/a&gt;. “Phyto” means “plant”, so “phytochemical” means “a chemical that’s in a plant”. It’s not much of a revelation to say that fruits (which, as you may know, grow on plants) contain plant chemicals.&lt;/p&gt;

&lt;p&gt;That matters because many of these thousands of phytochemicals are probably good for you.&lt;/p&gt;

&lt;p&gt;A &lt;em&gt;vitamin&lt;/em&gt; is a carbon-based molecule that is essential for health. There are either &lt;a href=&quot;https://en.wikipedia.org/wiki/Vitamin#List_of_vitamins&quot;&gt;13 or 14 vitamins&lt;/a&gt;, depending on whether you include choline. But there are many other phytochemicals that are probably &lt;em&gt;beneficial&lt;/em&gt; for health without being &lt;em&gt;essential&lt;/em&gt;. I say “probably” because it’s difficult to definitively prove that a phytochemical is healthy. Vitamins are obvious because you develop dramatic health problems if you stop getting them.&lt;/p&gt;

&lt;p&gt;There is moderate evidence that eating fruit improves health and decreases disease risk.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; Various phytochemicals in fruit are known or suspected to play a role in promoting good health. For example:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Phytosterol&quot;&gt;Phytosterols&lt;/a&gt; have been shown in clinical trials to lower cholesterol and blood pressure.&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; They are present in many fruits as well as vegetables, vegetable oils, and grains.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Carotenoid&quot;&gt;Carotenoids&lt;/a&gt;—which give the red or orange color to tomatoes, pumpkins, and carrots—may decrease the risk of head or neck cancer, but the evidence is not conclusive.&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are way too many phytochemicals to list. A few (like phytosterols) are highly likely to be healthy; many (like carotenoids) have some supporting evidence; for most of them, we don’t know what they do.&lt;/p&gt;

&lt;p&gt;Randomized experiments that give people supposedly-healthy phytochemical supplements have often failed to find effects.&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; It seems that you need to eat whole plants to get the bulk of the benefits, but I don’t know why that is. Maybe we’re wrong about &lt;em&gt;which&lt;/em&gt; phytochemicals are the most important for health, and the experiments were supplementing the wrong ones?&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; There are over 50,000 known phytochemicals, so it will be quite a while before we figure out what they all do. Best to just eat whole fruit.&lt;/p&gt;

&lt;p&gt;(And eat other whole foods, too. But today I’m advocating for fruit.)&lt;/p&gt;

&lt;p&gt;Some people say fruits contain a lot of fiber. That’s true. But I don’t think that alone is a great reason to eat fruit—lots of foods contain fiber. I think phytochemicals are the more compelling reason. Fruits probably contain healthy phytochemicals that you can’t get anywhere else.&lt;/p&gt;

&lt;p&gt;That’s also why it’s important to eat a &lt;em&gt;variety&lt;/em&gt; of fruit. Blueberries contain &lt;a href=&quot;https://en.wikipedia.org/wiki/Anthocyanin&quot;&gt;anthocyanins&lt;/a&gt;, oranges contain &lt;a href=&quot;https://en.wikipedia.org/wiki/Naringenin&quot;&gt;naringenin&lt;/a&gt;, apples contain…I don’t know, some other phytochemicals that are probably good for you that aren’t in blueberries or oranges.&lt;/p&gt;

&lt;h2 id=&quot;fruit-is-innocent-of-the-charge-of-doesnt-taste-good&quot;&gt;Fruit is innocent of the charge of “doesn’t taste good”&lt;/h2&gt;

&lt;p&gt;Okay, taste is subjective, I can’t convince you that fruit tastes good. I just don’t understand what is going on inside people’s mouths that leads them to dislike fruit. I &lt;em&gt;really&lt;/em&gt; don’t understand people who dislike fruit but like vegetables. Vegetables are boring! Fruit tastes like candy!&lt;/p&gt;

&lt;p&gt;Maybe it will help if I give some of my favorite fruit recipes.&lt;/p&gt;

&lt;h3 id=&quot;my-recipe-for-a-delicious-strawberry-dessert&quot;&gt;My recipe for a delicious strawberry dessert&lt;/h3&gt;

&lt;p&gt;Ingredients: 5 to 10 strawberries.&lt;/p&gt;

&lt;p&gt;Cooking instructions:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Wash the strawberries.&lt;/li&gt;
  &lt;li&gt;Eat the strawberries. Be sure not to eat the green parts.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3 id=&quot;my-recipe-for-a-banana-treat&quot;&gt;My recipe for a banana treat&lt;/h3&gt;

&lt;p&gt;Ingredients: one banana.&lt;/p&gt;

&lt;p&gt;Cooking instructions:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Open the banana peel.&lt;/li&gt;
  &lt;li&gt;Eat the banana.&lt;/li&gt;
&lt;/ol&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Also, I do take a multivitamin. It probably doesn’t make me healthier, but it’s insurance. &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;World Cancer Research Fund/American Institute for Cancer Research. Continuous Update Project Expert Report 2018. &lt;a href=&quot;https://www.wcrf.org/wp-content/uploads/2024/10/Wholegrains-veg-and-fruit.pdf&quot;&gt;Wholegrains, vegetables and fruit and the risk of cancer.&lt;/a&gt; &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Yang, Y., Xia, J., Yu, T., Wan, S., Zhou, Y., &amp;amp; Sun, G. (2024). &lt;a href=&quot;https://doi.org/10.1002/ptr.8308&quot;&gt;Effects of phytosterols on cardiovascular risk factors: A systematic review and meta-analysis of randomized controlled trials.&lt;/a&gt; &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Leoncini, E., Nedovic, D., Panic, N., Pastorino, R., Edefonti, V., &amp;amp; Boccia, S. (2015). &lt;a href=&quot;https://doi.org/10.1158/1055-9965.EPI-15-0053&quot;&gt;Carotenoid Intake from Natural Sources and Head and Neck Cancer: A Systematic Review and Meta-analysis of Epidemiological Studies.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.1158/1055-9965.epi-15-0053&quot;&gt;10.1158/1055-9965.epi-15-0053&lt;/a&gt; &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Bjelakovic, G., Nikolova, D., Gluud, L. L., Simonetti, R. G., &amp;amp; Gluud, C. (2008). &lt;a href=&quot;https://doi.org/10.1002/14651858.CD007176&quot;&gt;Antioxidant supplements for prevention of mortality in healthy participants and patients with various diseases.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.1002/14651858.cd007176&quot;&gt;10.1002/14651858.cd007176&lt;/a&gt; &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;We can’t be entirely wrong. For example, we know that phytosterols lower cholesterol when taken as a supplement or when eaten as part of a whole food. &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Updates Digest: Inaugural Edition</title>
				<pubDate>Fri, 30 May 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/05/30/inaugural_updates_digest/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/05/30/inaugural_updates_digest/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;On many occasions, I edit old posts to make additions, correct mistakes, etc. But there’s no way to know about updates unless you go digging through the &lt;a href=&quot;https://mdickens.me/archive/&quot;&gt;archives&lt;/a&gt;. So I’m going to start publishing regular (perhaps quarterly) digests of the significant updates I’ve made to old posts.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#preamble-the-philosophy-of-updates-digests&quot; id=&quot;markdown-toc-preamble-the-philosophy-of-updates-digests&quot;&gt;Preamble: The philosophy of updates digests&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#updates&quot; id=&quot;markdown-toc-updates&quot;&gt;Updates&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#the-true-cost-of-leveraged-etfs-updated-jan-2025&quot; id=&quot;markdown-toc-the-true-cost-of-leveraged-etfs-updated-jan-2025&quot;&gt;The True Cost of Leveraged ETFs (updated Jan 2025)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#the-7-best-high-protein-breakfast-cereals-updated-mar-2025&quot; id=&quot;markdown-toc-the-7-best-high-protein-breakfast-cereals-updated-mar-2025&quot;&gt;The 7 Best High-Protein Breakfast Cereals (updated Mar 2025)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#outlive-a-critical-review-updated-may-2025&quot; id=&quot;markdown-toc-outlive-a-critical-review-updated-may-2025&quot;&gt;Outlive: A Critical Review (updated May 2025)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#do-investors-put-too-much-stock-in-the-us-updated-may-2025&quot; id=&quot;markdown-toc-do-investors-put-too-much-stock-in-the-us-updated-may-2025&quot;&gt;Do Investors Put Too Much Stock in the US? (updated May 2025)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#a-comparison-of-donor-advised-fund-providers-updated-feb--may-2025&quot; id=&quot;markdown-toc-a-comparison-of-donor-advised-fund-providers-updated-feb--may-2025&quot;&gt;A Comparison of Donor-Advised Fund Providers (updated Feb &amp;amp; May 2025)&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;preamble-the-philosophy-of-updates-digests&quot;&gt;Preamble: The philosophy of updates digests&lt;/h2&gt;

&lt;p&gt;If you write articles online, should you format your website like a blog, where there’s a feed of articles listed from newest to oldest? Or is it better to write evergreen articles that you update regularly?&lt;/p&gt;

&lt;p&gt;Most online writers use a blog-style format. A few people, like &lt;a href=&quot;https://gwern.net/&quot;&gt;Gwern&lt;/a&gt; and &lt;a href=&quot;https://reducing-suffering.org/&quot;&gt;Brian Tomasik&lt;/a&gt;, use the evergreen style (for lack of a better name). Gwern has &lt;a href=&quot;https://gwern.net/about#long-content&quot;&gt;written&lt;/a&gt; about the downsides of blogs: “They are meant to be read by a few people on a weekday in 2004 and never again, and are quickly abandoned.”&lt;/p&gt;

&lt;p&gt;The evergreen style is a good experience for new readers—they can see all your writings in one place, and pick what they want to read first. But it’s a worse experience for regular readers because it’s harder for them to keep track of updates.&lt;/p&gt;

&lt;p&gt;My website is formatted like a blog. This mostly fits with how I think about things—my brain operates in blog-post-sized chunks. But it’s not unusual for me to go back and edit posts. My &lt;a href=&quot;https://mdickens.me/2021/04/05/comparison_of_DAF_providers/&quot;&gt;Comparison of Donor-Advised Fund Providers&lt;/a&gt; has changed many times since I first published it in 2021, as you can see from its &lt;a href=&quot;https://mdickens.me/2021/04/05/comparison_of_DAF_providers/#changelog&quot;&gt;changelog&lt;/a&gt;. But people who subscribe to my website don’t know about the updates unless they happen to go back and re-read the post.&lt;/p&gt;

&lt;p&gt;I don’t think it makes sense to send out a notification every time I update an old post. So as a compromise, I will post batch updates where I list all the significant changes I’ve made.&lt;/p&gt;

&lt;p&gt;If you have opinions about how you like to read online content, I would be interested in hearing from you. Do you like blog-style or evergreen-style? Do you like updates digests, or would you rather only get notified for new posts? Leave a &lt;a href=&quot;https://mdickens.me/2025/05/30/inaugural_updates_digest/#commento&quot;&gt;comment&lt;/a&gt; if you have thoughts.&lt;/p&gt;

&lt;h1 id=&quot;updates&quot;&gt;Updates&lt;/h1&gt;

&lt;p&gt;For this inaugural updates digest, I will give an overview of all the updates I’ve made in 2025.&lt;/p&gt;

&lt;h2 id=&quot;the-true-cost-of-leveraged-etfs-updated-jan-2025&quot;&gt;The True Cost of Leveraged ETFs (updated Jan 2025)&lt;/h2&gt;

&lt;p&gt;In January, I re-calculated the data from my &lt;a href=&quot;https://mdickens.me/2021/03/04/true_cost_of_leveraged_etfs/&quot;&gt;2021 post&lt;/a&gt; on leveraged ETFs:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;I fixed a software bug that made leveraged ETFs look a little more expensive than they really were.&lt;/li&gt;
  &lt;li&gt;I updated the calculations to include data from the 2021–2024 period.&lt;/li&gt;
  &lt;li&gt;I added two new leveraged ETFs (SSO and TQQQ) to my analysis.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These changes reduced the estimated excess cost of leveraged ETFs from ~2% to ~1.5%. I wrote a &lt;a href=&quot;https://mdickens.me/2021/03/04/true_cost_of_leveraged_etfs/#2025-update-how-have-things-changed&quot;&gt;new section in the post&lt;/a&gt; explaining what changed.&lt;/p&gt;

&lt;h2 id=&quot;the-7-best-high-protein-breakfast-cereals-updated-mar-2025&quot;&gt;The 7 Best High-Protein Breakfast Cereals (updated Mar 2025)&lt;/h2&gt;

&lt;p&gt;I bought a different flavor of Catalina Crunch that I liked much better than the first flavor I’d tried, so I bumped it up from #4 to #3 on &lt;a href=&quot;https://mdickens.me/2025/01/17/high_protein_breakfast_cereals/&quot;&gt;my list&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;outlive-a-critical-review-updated-may-2025&quot;&gt;Outlive: A Critical Review (updated May 2025)&lt;/h2&gt;

&lt;p&gt;As a follow-up to &lt;a href=&quot;https://mdickens.me/2025/02/03/I_was_probably_wrong_about_HIIT_and_VO2max/&quot;&gt;I was probably wrong about HIIT and VO2max&lt;/a&gt;, I added three new sections to my &lt;a href=&quot;https://mdickens.me/2024/09/26/outlive_a_critical_review/&quot;&gt;&lt;em&gt;Outlive&lt;/em&gt; review&lt;/a&gt; to evaluate some claims about exercise:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2024/09/26/outlive_a_critical_review/#vo2max-is-the-best-predictor-of-longevity&quot;&gt;VO2max is the best predictor of longevity&lt;/a&gt; (verdict: VO2max is a good predictor, but direct performance measures (like your best mile time) are better.)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2024/09/26/outlive_a_critical_review/#you-should-train-vo2max-by-doing-hiit-at-the-maximum-sustainable-pace&quot;&gt;You should train VO2max by doing HIIT at the maximum sustainable pace.&lt;/a&gt; (verdict: false. HIIT should be hard, but not maximally hard.)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://mdickens.me/2024/09/26/outlive_a_critical_review/#you-should-do-3-hoursweek-of-zone-2-training-and-one-or-two-sessionsweek-of-hiit&quot;&gt;You should do &amp;gt;3 hours/week of zone 2 training and one or two sessions/week of HIIT.&lt;/a&gt; (verdict: this routine is good, but not uniquely good.)&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;do-investors-put-too-much-stock-in-the-us-updated-may-2025&quot;&gt;Do Investors Put Too Much Stock in the US? (updated May 2025)&lt;/h2&gt;

&lt;p&gt;My &lt;a href=&quot;https://mdickens.me/2017/03/26/do_investors_put_too_much_stock_in_the_us/&quot;&gt;2017 post&lt;/a&gt; gave some arguments for overweighting US stocks and why I think most of them are wrong. But I missed one good argument: &lt;a href=&quot;https://mdickens.me/2017/03/26/do_investors_put_too_much_stock_in_the_us/#expropriation-risk&quot;&gt;expropriation risk&lt;/a&gt;. From the evidence I found, this risk looks negligible for developed countries but significant for emerging markets.&lt;/p&gt;

&lt;h2 id=&quot;a-comparison-of-donor-advised-fund-providers-updated-feb--may-2025&quot;&gt;A Comparison of Donor-Advised Fund Providers (updated Feb &amp;amp; May 2025)&lt;/h2&gt;

&lt;p&gt;In February, I &lt;a href=&quot;https://mdickens.me/2021/04/05/comparison_of_DAF_providers/&quot;&gt;updated my review&lt;/a&gt; because Charityvest raised its fees.&lt;/p&gt;

&lt;p&gt;In May, I made another update to describe Daffy’s new feature where it lets you &lt;a href=&quot;https://mdickens.me/2021/04/05/comparison_of_DAF_providers/#daffy-custom-portfolios&quot;&gt;choose your own ETFs&lt;/a&gt; from a long list. I also changed my recommendation flowchart (near the top of the &lt;a href=&quot;https://mdickens.me/2021/04/05/comparison_of_DAF_providers/&quot;&gt;post&lt;/a&gt;) to emphasize Daffy and de-emphasize Charityvest as a result of Daffy’s new feature + Charityvest’s increased fees.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://mdickens.me/assets/images/DAF-flowchart-v5.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>Against Ergodicity Economics</title>
				<pubDate>Thu, 29 May 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/05/29/ergodicity/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/05/29/ergodicity/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;&lt;a href=&quot;https://mdickens.me/confidence_tags/&quot;&gt;Confidence&lt;/a&gt;: Highly likely.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I kept telling myself I wouldn’t write this post because it &lt;a href=&quot;https://xkcd.com/386/&quot;&gt;doesn’t matter&lt;/a&gt;. But I’ve seen one too many smart people speaking favorably about ergodicity economics. The concept of ergodicity in finance has essentially nothing going for it, and in this post I will explain why.&lt;/p&gt;

&lt;p&gt;Ergodicity economics is one of those rare theories that somehow manages to be both unfalsifiable and false.&lt;/p&gt;

&lt;p&gt;I originally wrote that sentence as a joke, then I deleted it, then I re-wrote it because I realized it’s actually true. Ergodicity economics is sufficiently vague in general that it can’t be falsified, but it is commonly interpreted as making specific falsifiable claims that are, in fact, false.&lt;/p&gt;

&lt;p&gt;Summary:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;A decision rule is considered ergodic if its single-iteration expectation equals its long-run time series expectation. &lt;a href=&quot;#what-is-ergodicity&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;The way it’s used in practice, the ergodic principle is equivalent to logarithmic utility; there is no reason to prefer the ergodic principle over logarithmic utility. &lt;a href=&quot;#the-concept-of-ergodicity-doesnt-do-anything-useful&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Logarithmic utility is often inappropriate. Under the framework of expected utility theory, you could use a more- or less-risk-averse utility function instead. But ergodicity economics does not permit other levels of risk aversion. &lt;a href=&quot;#also-its-false&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;There can be situations where a non-ergodic strategy is better than an ergodic one. &lt;a href=&quot;#ergodicity-isnt-good-non-ergodicity-isnt-bad&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;The ergodic principle can only provide guidance in a narrow set of situations. In other situations, it has no way of comparing choices. &lt;a href=&quot;#mathematical-problems-for-ergodicity&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#what-is-ergodicity&quot; id=&quot;markdown-toc-what-is-ergodicity&quot;&gt;What is ergodicity?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-concept-of-ergodicity-doesnt-do-anything-useful&quot; id=&quot;markdown-toc-the-concept-of-ergodicity-doesnt-do-anything-useful&quot;&gt;The concept of ergodicity doesn’t do anything useful&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-ergodic-principle-is-an-unfalsifiable-metaphysical-claim&quot; id=&quot;markdown-toc-the-ergodic-principle-is-an-unfalsifiable-metaphysical-claim&quot;&gt;The ergodic principle is an unfalsifiable metaphysical claim&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#also-its-false&quot; id=&quot;markdown-toc-also-its-false&quot;&gt;…also it’s false&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#ergodicity-isnt-good-non-ergodicity-isnt-bad&quot; id=&quot;markdown-toc-ergodicity-isnt-good-non-ergodicity-isnt-bad&quot;&gt;Ergodicity isn’t good; non-ergodicity isn’t bad&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#mathematical-problems-for-ergodicity&quot; id=&quot;markdown-toc-mathematical-problems-for-ergodicity&quot;&gt;Mathematical problems for ergodicity&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#changelog&quot; id=&quot;markdown-toc-changelog&quot;&gt;Changelog&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;what-is-ergodicity&quot;&gt;What is ergodicity?&lt;/h2&gt;

&lt;p&gt;Taking a definition from &lt;a href=&quot;https://taylorpearson.me/ergodicity/&quot;&gt;Taylor Pearson&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;A way to identify an ergodic situation is to ask do I get the same result if I:&lt;/p&gt;

  &lt;ol&gt;
    &lt;li&gt;look at one individual’s trajectory across time&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;li&gt;look at a bunch of individual’s trajectories at a single point in time&lt;/li&gt;
  &lt;/ol&gt;

  &lt;p&gt;If yes: ergodic.&lt;/p&gt;

  &lt;p&gt;If not: non-ergodic.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Ole Peters, the physicist&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; who invented ergodicity economics, &lt;a href=&quot;https://doi.org/10.1038/s41567-019-0732-0&quot;&gt;gave&lt;/a&gt;&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; a precise mathematical definition which says essentially the same thing.&lt;/p&gt;

&lt;p&gt;(Ergodicity is confusing and hard to define without using math, so I appreciate Pearson for figuring out a clean definition. I tried to come up with a definition myself but my version was worse.)&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;ergodic principle&lt;/strong&gt; states that you should follow a strategy that produces ergodic outcomes.&lt;/p&gt;

&lt;p&gt;An illustrative example: I offer you a bet. You choose how much money to wager, then I flip a fair coin. If the coin lands heads, I triple your money. If it lands tails, you lose your wager.&lt;/p&gt;

&lt;p&gt;You maximize expected earnings by betting your entire net worth. Should you do that?&lt;/p&gt;

&lt;p&gt;If you make many bets in a row, you will eventually lose at least one of them, and you will end up with $0.&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; But if many people make this bet simultaneously, then the average person will make a profit. The across-time outcome is not the same as the across-individuals outcome; therefore, this strategy is non-ergodic. Thus, the ergodic principle says you shouldn’t bet your entire net worth on this coin flip.&lt;/p&gt;
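&lt;p&gt;To make the gap concrete, here is a minimal simulation (my own illustration, with arbitrary parameters): many players each bet their entire bankroll on ten flips in a row. The ensemble mean grows like \(1.5^{10}\), yet almost every individual path ends at $0.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import random

random.seed(0)
ROUNDS, PEOPLE = 10, 100_000

def play(rounds):
    # Bet the whole bankroll every flip: heads triples it, tails zeroes it.
    wealth = 1.0
    for _ in range(rounds):
        wealth *= 3 * random.getrandbits(1)  # 3x on heads, 0x on tails
    return wealth

outcomes = [play(ROUNDS) for _ in range(PEOPLE)]
ruined = sum(1 for w in outcomes if w == 0)

print('ensemble mean wealth:', sum(outcomes) / PEOPLE)  # near 1.5**10, about 57.7
print('fraction ruined:', ruined / PEOPLE)              # near 1 - 0.5**10, about 99.9%
&lt;/code&gt;&lt;/pre&gt;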

&lt;p&gt;Ole Peters proposed the ergodic principle as a replacement for expected utility theory, which is a &lt;a href=&quot;https://en.wikipedia.org/wiki/Expected_utility_hypothesis&quot;&gt;“foundational assumption in mathematical economics”&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Replacing a “foundational assumption” sounds like a tall order. Does ergodicity live up to Peters’ aspirations?&lt;/p&gt;

&lt;p&gt;Resoundingly, no.&lt;/p&gt;

&lt;h2 id=&quot;the-concept-of-ergodicity-doesnt-do-anything-useful&quot;&gt;The concept of ergodicity doesn’t do anything useful&lt;/h2&gt;

&lt;p&gt;Ergodicity proponents like to talk about Russian Roulette (&lt;a href=&quot;https://www.thecuriosityvine.com/post/ergodicity-what-it-is-and-why-it-matters-a-lot&quot;&gt;1&lt;/a&gt;, &lt;a href=&quot;https://taylorpearson.me/ergodicity/&quot;&gt;2&lt;/a&gt;). They say Russian Roulette is &lt;em&gt;non-ergodic&lt;/em&gt;: if six people play, 5/6 are alive at the end. If you play six times in a row, you are definitely dead. The across-individual outcome is different from the time-series outcome. That’s why Russian Roulette is a bad idea, you see.&lt;/p&gt;

&lt;p&gt;I am not sure what this is supposed to prove. Is there someone out there who believes it’s a good idea to play Russian Roulette, but then you invoke the concept of ergodicity, and this person realizes no, Russian Roulette is bad actually? Why do you need this fancy word to explain why people shouldn’t play Russian Roulette?&lt;sup id=&quot;fnref:10&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Or let’s look at an example in finance, since that’s where ergodicity is supposed to be useful. Take the gamble I proposed in the previous section: if a coin lands heads, you triple your money. If it lands tails, you lose any money you put in.&lt;/p&gt;

&lt;p&gt;According to ergodicity economics, you shouldn’t wager all your money because the result would be non-ergodic. Instead, they say, you should bet according to the &lt;a href=&quot;https://en.wikipedia.org/wiki/Kelly_criterion&quot;&gt;Kelly criterion&lt;/a&gt;, which is the strategy that maximizes the geometric growth rate of your money. Maximizing geometric growth is ergodic, therefore it’s good.&lt;/p&gt;
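&lt;p&gt;For this particular bet (net odds of 2:1, win probability 1/2), the Kelly formula \(f^* = p - q/b\) gives a wager of 25% of your bankroll. A quick brute-force check of my own, for the skeptical:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import math

def log_growth(f):
    # Expected log-growth per flip when wagering a fraction f of the bankroll:
    # heads (p = 1/2) turns the stake into 3x (net odds b = 2), tails loses it.
    return 0.5 * math.log(1 + 2 * f) + 0.5 * math.log(1 - f)

best = max((f / 1000 for f in range(1000)), key=log_growth)
print(best, log_growth(best))  # 0.25, about 0.059 nats of growth per flip
&lt;/code&gt;&lt;/pre&gt;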

&lt;p&gt;I can get behind the Kelly criterion (sort of&lt;sup id=&quot;fnref:18&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:18&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;). I can get behind maximizing the geometric growth rate. But that’s not a new concept, and ergodicity isn’t adding anything new.&lt;/p&gt;

&lt;p&gt;According to standard expected utility theory, if you have logarithmic utility of money, then you should maximize the geometric growth rate (or, equivalently, you should use the Kelly criterion). Expected utility theory already gives a good answer. What’s the purpose of introducing the concept of ergodicity?&lt;/p&gt;

&lt;p&gt;I get the impression that Ole Peters thinks economists are stupider than they are. Quoting &lt;a href=&quot;/materials/peters2019.pdf&quot;&gt;Peters (2019)&lt;/a&gt;&lt;sup id=&quot;fnref:3:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;To make economic decisions, I often want to know how fast my personal fortune grows under different scenarios. This requires determining what happens over time in some model of wealth. But by wrongly assuming ergodicity, wealth is often replaced with its expectation value before growth is computed. Because wealth is not ergodic, nonsensical predictions arise.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Economists don’t wrongly assume that all situations are ergodic, and they don’t say everyone should maximize expected wealth. Standard economic theory says you should maximize expected &lt;em&gt;utility&lt;/em&gt; of wealth, for some utility function (and the choice of utility function depends on your risk tolerance). For a logarithmic utility function, maximizing expected utility = maximizing geometric growth. Which is the same as what Peters says to do.&lt;/p&gt;

&lt;p&gt;(Peters’ caricature of economists reminds me of &lt;a href=&quot;https://slatestarcodex.com/2017/04/07/yes-we-have-noticed-the-skulls/&quot;&gt;Scott Alexander’s&lt;/a&gt; “person who’s never read any economics, criticizing economists”.)&lt;/p&gt;

&lt;h2 id=&quot;the-ergodic-principle-is-an-unfalsifiable-metaphysical-claim&quot;&gt;The ergodic principle is an unfalsifiable metaphysical claim&lt;/h2&gt;

&lt;p&gt;So, in practical situations, the ergodic principle is equivalent to “maximize expected log(wealth)”. But Peters says ergodicity economics is superior to expected utility theory. Why?&lt;/p&gt;

&lt;p&gt;Peters’ justification is metaphysical, not practical. He has wordy explanations&lt;sup id=&quot;fnref:3:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;sup id=&quot;fnref:15&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:15&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt; for why ergodicity is metaphysically superior to utility maximization, in spite of producing identical results. He says the &lt;em&gt;reason&lt;/em&gt; you should use the Kelly criterion is because it’s ergodic, not because it maximizes a logarithmic utility function.&lt;/p&gt;

&lt;p&gt;His wordy explanations are mostly wrong, but I don’t want to get into the weeds of metaphysics. (&lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.4140625&quot;&gt;Ford &amp;amp; Kay (2022)&lt;/a&gt;&lt;sup id=&quot;fnref:16&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:16&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt; provides some analysis if you’re interested, under the headings “Psychology Is Fundamental to Decision Making” and “The Purpose of a Decision Theory”; see also &lt;a href=&quot;https://arxiv.org/abs/2306.03275&quot;&gt;Toda (2023)&lt;/a&gt;&lt;sup id=&quot;fnref:21&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:21&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt;.) My big question is, why should I care about which theory is metaphysically superior? If the ergodic principle behaves identically to maximizing the logarithm of wealth, then the alleged superiority of the ergodic principle is unfalsifiable.&lt;/p&gt;

&lt;h2 id=&quot;also-its-false&quot;&gt;…also it’s false&lt;/h2&gt;

&lt;p&gt;Insofar as people take specific recommendations from the ergodic principle, they interpret it as recommending maximizing geometric growth. But not everyone should maximize geometric growth.&lt;/p&gt;

&lt;p&gt;A geometric-growth-maximizer is abnormally risk-tolerant. Historically, an investor would have maximized geometric growth by investing in stocks with 2:1 to 3:1 leverage. Most people are not comfortable with that level of risk.&lt;/p&gt;

&lt;p&gt;Ergodicity economics recommends that &lt;em&gt;everyone&lt;/em&gt; should pursue the same strategy of maximizing geometric growth. That’s too risky for most people. More generally, not everyone should pursue the same strategy because not everyone has the same risk tolerance. Therefore, the ergodic principle is false.&lt;/p&gt;

&lt;p&gt;Expected utility theory is not so restrictive. Maximizing geometric growth is equivalent to logarithmic utility, which also implies a high degree of risk tolerance, but most people don’t have logarithmic utility of wealth. Most people are better modeled as having more risk-averse utility functions than that.&lt;/p&gt;

&lt;p&gt;(Gordon Irlam &lt;a href=&quot;https://www.aacalc.com/docs/relative_risk_aversion&quot;&gt;reviewed&lt;/a&gt; research on risk aversion and concluded that most investors are perhaps 2x to 3x more risk-averse than a geometric-growth-maximizer.)&lt;/p&gt;
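&lt;p&gt;One way to see what “2x to 3x more risk-averse” means in practice is the standard Merton share for an investor with constant relative risk aversion \(\gamma\): the optimal fraction of wealth in the risky asset is \((\mu - r)/(\gamma \sigma^2)\). A sketch with illustrative market parameters (my assumptions, not estimates):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;mu, r, sigma = 0.08, 0.02, 0.16  # risky return, risk-free rate, volatility (assumed)

for gamma in (1.0, 2.0, 3.0):
    # gamma = 1 is logarithmic utility, i.e. the geometric-growth-maximizer;
    # doubling or tripling gamma cuts the optimal leverage proportionally.
    print(gamma, (mu - r) / (gamma * sigma ** 2))  # about 2.3, 1.2, 0.8
&lt;/code&gt;&lt;/pre&gt;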

&lt;p&gt;Some people (not limited to ergodicity proponents) claim that everyone should maximize geometric growth, or everyone should use the &lt;a href=&quot;https://en.wikipedia.org/wiki/Kelly_criterion&quot;&gt;Kelly criterion&lt;/a&gt; (which is equivalent). This is wrong for the same reason that the ergodic principle is wrong: not everyone has the same risk tolerance.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Paul_Samuelson&quot;&gt;Paul Samuelson&lt;/a&gt;, “the Nobel laureate whose mathematical analysis provided the foundation on which modern economics is built”&lt;sup id=&quot;fnref:11&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt;, was apparently as bothered by this misconception as I am, because he wrote a short refutation &lt;a href=&quot;/materials/samuelson1979.pdf&quot;&gt;using only one-syllable words&lt;/a&gt;&lt;sup id=&quot;fnref:12&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt;. An excerpt:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;blockquote&gt;
    &lt;p&gt;He who acts in N plays to make his mean log of wealth as big as it can be made will, with odds that go to one as N soars, beat me who acts to meet my own tastes for risk.&lt;/p&gt;
  &lt;/blockquote&gt;

  &lt;p&gt;Who doubts &lt;em&gt;that&lt;/em&gt;? What we do doubt is that it should make us change our views on gains and losses — should taint our tastes for risk.&lt;/p&gt;

  &lt;p&gt;To be clear is to be found out. Know that life is not a game with a net stake of one when you beat your twin, and with net stake of nought when you do not. A win of ten is not the same as a win of two. Nor is a loss of two the same as a loss of three. &lt;em&gt;How much&lt;/em&gt; you win by counts. &lt;em&gt;How much&lt;/em&gt; you lose by counts.&lt;/p&gt;

  &lt;p&gt;As soon as we see &lt;em&gt;this&lt;/em&gt; clear truth, we are back to our own tastes for risk.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;(I recommend reading &lt;a href=&quot;https://mdickens.me/materials/samuelson1979.pdf&quot;&gt;the whole thing&lt;/a&gt;, it’s only two pages long.)&lt;/p&gt;

&lt;p&gt;For a more serious analysis, see &lt;a href=&quot;https://mdickens.me/materials/samuelson1971.pdf&quot;&gt;Samuelson (1971)&lt;/a&gt;&lt;sup id=&quot;fnref:13&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:13&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt;. Quoting the abstract:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Because the outcomes of repeated investments or gambles involve products of variables, authorities have repeatedly been tempted to the belief that, in a long sequence, maximization of the expected value of terminal utility can be achieved or well-approximated by a strategy of maximizing at each stage the geometric mean of outcome (or its equivalent, the expected value of the logarithm of principal plus return). The law of large numbers or of the central limit theorem as applied to the logs can validate the conclusion that a maximum-geometric-mean strategy does indeed make it “virtually certain” that, in a “long” sequence, one will end with a higher terminal wealth and utility. However, this does not imply the false corollary that the geometric-mean strategy is optimal for any finite number of periods, however long, or that it becomes asymptotically a good approximation. […] The novel criterion of maximizing the expected average compound return, which asymptotically leads to maximizing of geometric mean, is shown to be arbitrary.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The ergodic principle does not allow agents to be more risk-averse. For an agent with &lt;a href=&quot;https://en.wikipedia.org/wiki/Isoelastic_utility&quot;&gt;constant relative risk aversion&lt;/a&gt;, there is a utility function that satisfies their preferences. However, if their risk aversion coefficient does not equal 1 (which is equivalent to logarithmic utility), then their preferences &lt;em&gt;cannot&lt;/em&gt; satisfy the ergodic principle.&lt;sup id=&quot;fnref:25&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:25&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;h2 id=&quot;ergodicity-isnt-good-non-ergodicity-isnt-bad&quot;&gt;Ergodicity isn’t good; non-ergodicity isn’t bad&lt;/h2&gt;

&lt;p&gt;Let’s return to Russian Roulette. They say Russian Roulette is bad because it’s non-ergodic. At the risk of being morbid, let me propose an alternative game. The rule of the game is that you load a revolver with six bullets and then shoot yourself in the head.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This game is ergodic!&lt;/strong&gt; If six people play, the average player dies. If one person plays six times, that person dies. The two situations are equal. According to Peters and others, the concept of ergodicity explains why losing all your money is bad, and why playing Russian Roulette is bad. Therefore, by the same principle, you should play this game.&lt;/p&gt;

&lt;p&gt;This criticism can be avoided by saying that you ought to choose an ergodic decision rule, but that ergodicity shouldn’t be your only criterion.&lt;/p&gt;

&lt;p&gt;However, non-ergodic decision rules can sometimes be preferable to ergodic ones. An example from the previous section is that an agent may prefer a more risk-averse utility function in a scenario where the ergodic principle forces them to adopt logarithmic utility.&lt;/p&gt;

&lt;p&gt;For another example, consider the following lottery:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;You may wager any amount of money. There is a 2/3 chance that you double your money and a 1/3 chance that you get nothing back. However, if at any point you have more than a million dollars, your head explodes.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Suppose you start with $1000. The correct strategy is to bet some fraction of your bankroll (say, using the &lt;a href=&quot;https://en.wikipedia.org/wiki/Kelly_criterion&quot;&gt;Kelly criterion&lt;/a&gt;), but then to stop betting once your bankroll is at risk of exceeding a million dollars.&lt;/p&gt;

&lt;p&gt;This strategy is non-ergodic: the single-period expected outcome is simply the expected value of the bet, but the long-term average outcome does not equal the geometric growth rate because you stop betting at some point. Any ergodic strategy would be inferior to this non-ergodic strategy.&lt;/p&gt;

&lt;p&gt;(I &lt;em&gt;think&lt;/em&gt; the only ergodic strategy would be to bet $0. Betting any larger amount would eventually result in your head exploding, which makes it non-ergodic.)&lt;/p&gt;
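&lt;p&gt;Here is one way to implement the stop-betting strategy from above (the exact stopping rule is my own choice; any rule that keeps wealth under the cap would do):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import random

random.seed(1)
CAP = 1_000_000

def capped_kelly(wealth, rounds):
    # Kelly for this bet is 1/3 of the bankroll (p = 2/3 at even odds),
    # but never stake so much that a win could push wealth past the cap.
    for _ in range(rounds):
        stake = max(0.0, min(wealth / 3, CAP - wealth))
        if stake == 0:
            break  # exactly at the cap: stop betting entirely
        if random.randrange(3):  # 2/3 chance: the stake doubles
            wealth += stake
        else:                    # 1/3 chance: the stake is lost
            wealth -= stake
    return wealth

print(capped_kelly(1000.0, 500))  # hugs the cap without ever exceeding it
&lt;/code&gt;&lt;/pre&gt;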

&lt;h2 id=&quot;mathematical-problems-for-ergodicity&quot;&gt;Mathematical problems for ergodicity&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;This section is adapted from a &lt;a href=&quot;https://forum.effectivealtruism.org/posts/PnW7RAZjCwfsfiExz/exploring-ergodicity-in-the-context-of-longtermism?commentId=F5hjto4T3daDFrHxf&quot;&gt;comment&lt;/a&gt; I wrote a year ago.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Ergodicity economics has more problems; explaining these problems requires doing math.&lt;/p&gt;

&lt;p&gt;Ole Peters defined a system as “ergodic” if there exists some transformation function \(f\) such that it satisfies the equation&lt;sup id=&quot;fnref:3:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;\begin{align}
\displaystyle\lim\limits_{T \rightarrow \infty} \frac{1}{T} \int\limits_0^T f(x(t)) dt = \int f(x) P(x) dx
\end{align}&lt;/p&gt;

&lt;p&gt;where \(x\) is the state (typically wealth, but it could be any other outcome you care about); \(t\) is time; \(P(x)\) is the probability density of \(x\); and \(T\) is the number of time steps.&lt;/p&gt;

&lt;p&gt;In plain English, there must be some function such that the time average of the function output equals the function’s expected value.&lt;/p&gt;

&lt;p&gt;(This definition is adapted from &lt;a href=&quot;https://mathworld.wolfram.com/BirkhoffsErgodicTheorem.html&quot;&gt;Birkhoff’s ergodic theorem&lt;/a&gt;, a theorem in statistical dynamics where the concept of &lt;a href=&quot;https://en.wikipedia.org/wiki/Ergodic_theory&quot;&gt;ergodicity&lt;/a&gt; originates, and where—unlike in economics—it is actually useful.&lt;sup id=&quot;fnref:9&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;14&lt;/a&gt;&lt;/sup&gt;)&lt;/p&gt;

&lt;p&gt;What function is \(f(x)\)? Ergodicity economics does not require you to use any particular function. When discussing multiplicative bets, Peters takes \(f(x) = \log(x)\). If you size your bets so as to maximize the geometric mean of wealth, then indeed you will satisfy the ergodic principle, because the time-limit of log(wealth) equals the expected value of log(wealth).&lt;/p&gt;
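&lt;p&gt;A quick numerical check (my sketch, reusing the triple-or-nothing coin flip from earlier, in the growth-rate form Peters uses): along a single long trajectory, the time average of the per-flip change in log wealth converges to its expected value.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import math, random

random.seed(2)
T, f = 1_000_000, 0.25  # one long trajectory, betting the Kelly fraction

# Per-flip change in log wealth: heads multiplies the bankroll by (1 + 2f),
# tails multiplies it by (1 - f).
total = 0.0
for _ in range(T):
    total += math.log(1 + 2 * f) if random.getrandbits(1) else math.log(1 - f)

time_average = total / T
expectation = 0.5 * math.log(1 + 2 * f) + 0.5 * math.log(1 - f)
print(time_average, expectation)  # the two agree: log wealth is ergodic here
&lt;/code&gt;&lt;/pre&gt;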

&lt;p&gt;You don’t have to use \(f(x) = \log(x)\); you just have to use a transformation function that satisfies the ergodic principle. The function \(f(x) = 0\) is ergodic: its expected value is constant (because the EV is 0), and the finite-time average converges to the EV (because the finite-time average is 0).&lt;/p&gt;

&lt;p&gt;Peters &lt;a href=&quot;https://doi.org/10.1063/1.4940236&quot;&gt;claims&lt;/a&gt;&lt;sup id=&quot;fnref:15:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:15&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt; that&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;A rational agent faced with additive bets (example: 50% chance of winning $2, 50% chance of losing $1) ought to choose the bet with the highest expected payout.&lt;/li&gt;
  &lt;li&gt;A rational agent faced with multiplicative bets (example: 50% chance of a 10% return, 50% chance of a –5% return) ought to maximize the expected logarithmic growth rate of wealth: \(f(W(t)) = \log(W(t))\).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I will accept these claims for the sake of argument.&lt;/p&gt;

&lt;p&gt;Consider a choice between two lotteries:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Lottery A: 50% chance of winning $200; 50% chance of losing $199.&lt;/p&gt;

  &lt;p&gt;Lottery B: 99% chance of multiplying your money by 100x; 1% chance of losing 0.0001% of your money.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Peters’ version of the ergodic principle cannot say which of these lotteries is better. It doesn’t evaluate them using the same units: Lottery A is evaluated in dollars; Lottery B is evaluated in growth rate of dollars.&lt;/p&gt;

&lt;p&gt;If your theory can’t see that Lottery B is better, then your theory is insufficient.&lt;/p&gt;
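&lt;p&gt;Expected utility theory has no such trouble: pick a utility function and both lotteries are valued in the same units. A minimal check with logarithmic utility and an assumed $1,000 starting wealth (my numbers):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import math

W = 1000.0  # illustrative starting wealth

# Both lotteries are scored in the same units: expected utility of final wealth.
eu_a = 0.5 * math.log(W + 200) + 0.5 * math.log(W - 199)
eu_b = 0.99 * math.log(100 * W) + 0.01 * math.log(0.999999 * W)
print(eu_a, eu_b)  # about 6.9 vs. 11.5: Lottery B wins, as it obviously should
&lt;/code&gt;&lt;/pre&gt;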

&lt;p&gt;There is no transformation function that satisfies Peters’ requirement of maximizing geometric growth rate for multiplicative bets (Lottery B) while also being ergodic for additive bets (Lottery A). Maximizing growth rate specifically requires using the function
\(f(W(t)) = \log(W(t))\) (up to affine transformation), which does not satisfy ergodicity for additive bets (expected value is not constant with respect to \(t\)).&lt;/p&gt;

&lt;p&gt;In fact, multiplicative bets cannot be compared to any other type of bet, because \(\log(W(t))\) is &lt;em&gt;only&lt;/em&gt; ergodic when \(W(t)\) grows at a constant long-run exponential rate.&lt;/p&gt;

&lt;p&gt;In terms of &lt;a href=&quot;https://en.wikipedia.org/wiki/Von_Neumann%E2%80%93Morgenstern_utility_theorem&quot;&gt;Von Neumann-Morgenstern utility&lt;/a&gt;, the ergodic principle violates the axiom of &lt;em&gt;completeness&lt;/em&gt;: there are pairs of bets where it is impossible to say which one is better (and it’s also impossible to say that they’re equal).&lt;/p&gt;

&lt;p&gt;For a more thorough analysis, see &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.4140625&quot;&gt;Psychology Is Fundamental: The Limitations of Growth-Optimal Approaches to Decision Making under Uncertainty&lt;/a&gt;&lt;sup id=&quot;fnref:16:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:16&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt;. (This paper includes a similar proof of non-completeness, although our two proofs were derived independently.)&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;The concept of ergodicity is complicated enough that it feels like it’s providing useful insights. (Ah, yes, Russian Roulette is bad! Gambling away all your money is bad!) In practice, its main prediction is that people shouldn’t be risk neutral, and this is indeed true. But the theory provides nothing novel, and when prodded a little, it falls apart.&lt;/p&gt;

&lt;p&gt;Not much work has been done on ergodicity economics; perhaps there’s some variation of the theory that can make it viable. But in its current form, ergodicity economics should not be cited favorably as an alternative to expected utility theory.&lt;/p&gt;

&lt;h1 id=&quot;changelog&quot;&gt;Changelog&lt;/h1&gt;

&lt;ul&gt;
  &lt;li&gt;2025-06-25:
    &lt;ul&gt;
      &lt;li&gt;Make the tone of the introduction more polite.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;2026-03-04:
    &lt;ul&gt;
      &lt;li&gt;Fix an incorrect description of how the ergodic principle behaves with additive bets.&lt;/li&gt;
      &lt;li&gt;Wording improvements.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;2026-03-09:
    &lt;ul&gt;
      &lt;li&gt;Move summary from the conclusion to the introduction.&lt;/li&gt;
      &lt;li&gt;Remove a section that made a misplaced criticism. The section argued that the ergodic principle can’t say why maximizing geometric growth is preferable to always wagering $0, regardless of how favorable the bet is, because both decision rules are ergodic. But this criticism doesn’t work because it’s up to the agent to choose their decision rule, not the framework itself. It’s not a mark against the ergodicity framework if an agent prefers one ergodic function over a different ergodic function.&lt;/li&gt;
      &lt;li&gt;Introduce a new section (&lt;a href=&quot;#ergodicity-isnt-good-non-ergodicity-isnt-bad&quot;&gt;Ergodicity isn’t good; non-ergodicity isn’t bad&lt;/a&gt;). This is a spiritual replacement for the section I removed.&lt;/li&gt;
      &lt;li&gt;Add new content under &lt;a href=&quot;#also-its-false&quot;&gt;…also it’s false&lt;/a&gt; explaining that constant relative risk aversion is incompatible with ergodicity (except in the case of logarithmic utility).&lt;/li&gt;
      &lt;li&gt;Change confidence from “Almost certain” to “Highly likely”. The mathematical background is sufficiently complicated that I don’t think I can be that confident that I got it right.&lt;/li&gt;
      &lt;li&gt;Wording improvements.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;You may notice that this definition is underspecified. What exactly does it mean to “look at one trajectory”? People usually interpret it as “look at the geometric mean of the trajectory”, so that’s what I’ll take it to mean for now. &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I think the &lt;a href=&quot;https://xkcd.com/793/&quot;&gt;meme&lt;/a&gt; of “physicist encounters a new subject and immediately thinks they can do it better than experts” is overstated. But the stereotype holds true in this case. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Peters, O. (2019). &lt;a href=&quot;/materials/peters2019.pdf&quot;&gt;The ergodicity problem in economics.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.1038/s41567-019-0732-0&quot;&gt;10.1038/s41567-019-0732-0&lt;/a&gt; &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:3:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:3:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:3:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;4&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I’m playing fast and loose with probability here—it’s not entirely accurate to say that you “will” end up with $0. There is a more precise version of what I said that’s more accurate, but I don’t want to get too technical. I will give a formal mathematical definition &lt;a href=&quot;#mathematical-problems-for-ergodicity&quot;&gt;later&lt;/a&gt;. &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:10&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;There is another problem with the Russian Roulette example that’s something of a digression, but I will include it in this footnote for completeness:&lt;/p&gt;

      &lt;p&gt;Russian Roulette is not equivalent to playing six iterations of a game with a 5/6 probability of success. In Russian Roulette, you are sampling bullets without replacement, so the probability of finding a bullet goes up every time you win.&lt;/p&gt;

      &lt;p&gt;One of the articles I linked wrote:&lt;/p&gt;

      &lt;blockquote&gt;
        &lt;p&gt;You might roll the dice and take $1,000,000 to play Russian Roulette one time (though I wouldn’t advise it). But there’s no amount of money that would make you play it 6 or more times.&lt;/p&gt;
      &lt;/blockquote&gt;

      &lt;p&gt;If you play 6 times, you have a 100% chance of dying. If you played a version where you sample with replacement (for example, you spin the revolver again after every shot), you are not guaranteed to die after 6 attempts. That would be the appropriate example. &lt;a href=&quot;#fnref:10&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:18&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The Kelly criterion is not universally applicable, as I will discuss later in this article. &lt;a href=&quot;#fnref:18&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:15&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Peters, O., &amp;amp; Gell-Mann, M. (2016). &lt;a href=&quot;https://doi.org/10.1063/1.4940236&quot;&gt;Evaluating gambles using dynamics.&lt;/a&gt; &lt;a href=&quot;#fnref:15&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:15:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:16&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Ford, M., &amp;amp; Kay, J. (2022). &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.4140625&quot;&gt;Psychology Is Fundamental: The Limitations of Growth-Optimal Approaches to Decision Making under Uncertainty.&lt;/a&gt; &lt;a href=&quot;#fnref:16&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:16:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:21&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Toda, A. (2023). &lt;a href=&quot;https://arxiv.org/abs/2306.03275&quot;&gt;‘Ergodicity Economics’ Is Pseudoscience.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.48550/arXiv.2306.03275&quot;&gt;10.48550/arXiv.2306.03275&lt;/a&gt; &lt;a href=&quot;#fnref:21&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:11&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Frost, G. (2009). &lt;a href=&quot;https://news.mit.edu/2009/obit-samuelson-1213&quot;&gt;Nobel-winning economist Paul A. Samuelson dies at age 94.&lt;/a&gt; &lt;a href=&quot;#fnref:11&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:12&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Samuelson, P. (1979). &lt;a href=&quot;/materials/samuelson1979.pdf&quot;&gt;Why we should not make mean log of wealth big though years to act are long.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.1016/0378-4266(79)90023-2&quot;&gt;10.1016/0378-4266(79)90023-2&lt;/a&gt; &lt;a href=&quot;#fnref:12&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:13&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Samuelson, P. (1971). &lt;a href=&quot;https://mdickens.me/materials/samuelson1971.pdf&quot;&gt;The “Fallacy” of Maximizing the Geometric Mean in Long Sequences of Investing or Gambling.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.1073/pnas.68.10.2493&quot;&gt;10.1073/pnas.68.10.2493&lt;/a&gt; &lt;a href=&quot;#fnref:13&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:25&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I haven’t proved this mathematically, but this is the intuition:&lt;/p&gt;

      &lt;p&gt;For the time-series average to equal the point-in-time arithmetic mean, you must apply a transformation function that converts from geometric space to arithmetic, and the only way to do that is by applying the logarithm. &lt;a href=&quot;#fnref:25&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:9&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I presume. I haven’t studied statistical dynamics so I don’t really know. &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Let's take a moment to marvel at how bad the original USDA food pyramid was</title>
				<pubDate>Wed, 21 May 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/05/21/food_pyramid/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/05/21/food_pyramid/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;img src=&quot;https://upload.wikimedia.org/wikipedia/commons/6/6d/USDA_Food_Pyramid.gif&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Edited 2025-05-26 to correct an inaccuracy—originally I said butter goes in the Dairy group but actually it goes in the Fats, Oils &amp;amp; Sweets group.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The original 1992 version of the USDA Food Pyramid was bad. So bad that people who scrupulously followed the guidelines were barely healthier than the people who ignored them.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;But the Food Pyramid was not just wrong: it was &lt;em&gt;marvelously&lt;/em&gt; wrong. It was wrong in many ways simultaneously. It achieved levels of wrongness hitherto undreamed of.&lt;/p&gt;

&lt;p&gt;What was wrong about it? I will start with the obvious answers, and move into the philosophical.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;ol&gt;
  &lt;li&gt;Its ranking of the healthiness of foods is wrong. Refined grains are healthier than fruits and vegetables?&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; Processed meat and nuts are equally healthy? All oils should be used sparingly? What?&lt;/li&gt;
  &lt;li&gt;It lumps together foods that shouldn’t go together: whole grains + refined grains; red meat + healthy proteins + nuts; fats + oils + sweets.
    &lt;ul&gt;
      &lt;li&gt;Nutritionally speaking, white bread has more in common with sweets than it does with whole grains. Oils (unsaturated fats) have more in common with nuts than they do with trans fats.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;It implies there are specific numbers of servings of each food group that you should have, which is wrong in two ways:
    &lt;ul&gt;
      &lt;li&gt;Required servings vary a lot from person to person. If I followed the upper end of the serving guidelines, I would be too skinny (I think—I’m not really sure how much food is in a “serving”). For some people, the lower end is still too much food.&lt;/li&gt;
      &lt;li&gt;Giving a servings range (e.g. 3–5 servings for vegetables) implies that healthiness follows an inverted U curve, and it’s bad to eat too much or too little. But that’s usually not true: there is effectively no such thing as eating too many vegetables.&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; There is no such thing as not eating enough trans fat (the ideal amount is zero). You can eat zero grains (keto diet) or zero meat (vegetarian diet) and still be perfectly healthy.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;It fails to say anything about micronutrients or vitamin deficiencies—or even macronutrients for that matter.&lt;/li&gt;
  &lt;li&gt;The whole concept of a “food pyramid” is fundamentally flawed. It rests on the incorrect assumption that there are different food groups that you should eat in different amounts. It would be more accurate to say there are
    &lt;ul&gt;
      &lt;li&gt;some foods you &lt;em&gt;should&lt;/em&gt; eat, and there’s effectively no upper limit (fruits + vegetables);&lt;/li&gt;
      &lt;li&gt;other food groups you can have plenty of as long as you don’t eat too many calories overall (whole grains, beans, nuts, seeds, vegetable oils&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;);&lt;/li&gt;
      &lt;li&gt;and some foods where the ideal amount to eat is zero (trans fats, sugar, refined carbs, processed meat).&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;(Bonus wrongness fact: I didn’t notice this until I inspected the food pyramid closely, but it says fruit contains added sugar, as indicated by the white triangles in the “fruit” section. This is so obviously incorrect that it’s kind of baffling why they drew the pyramid this way. Perhaps what they meant to say was that fruit contains sugar, but the key specifically says the white triangle indicates “added” sugar.)&lt;/p&gt;

&lt;p&gt;To illustrate the beauty of the USDA’s achievement, I present to you the Food Wrongness Pyramid:&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/food-wrongness-pyramid.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://www.hsph.harvard.edu/nutritionsource/healthy-eating-plate/&quot;&gt;Healthy Eating Plate&lt;/a&gt;, by the Harvard T.H. Chan School of Public Health, fixes most of the problems with the USDA Food Pyramid:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://nutritionsource.hsph.harvard.edu/wp-content/uploads/2012/09/HEPJan2015.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;It uses a plate instead of a pyramid! This is a good shape! It correctly implies that you should eat some of each of the food groups on the plate, and you should try to avoid foods that &lt;em&gt;aren’t&lt;/em&gt; on the plate.&lt;/li&gt;
  &lt;li&gt;It says that there are some foods you should eat, and other foods you should avoid. (Eat whole grains; avoid refined grains.)&lt;/li&gt;
  &lt;li&gt;It correctly identifies which foods are healthy. (No putting starches at the bottom of the pyramid!)&lt;/li&gt;
  &lt;li&gt;It groups foods in a sensible manner. (Whole grains get a group; healthy proteins get a group; healthy oils get a group. There is no “starches” or “meat” or “fats + oils + sugars”.)&lt;/li&gt;
  &lt;li&gt;It suggests relative proportions instead of numbers of servings.&lt;/li&gt;
  &lt;li&gt;It pays attention to macronutrients: the plate includes both protein and oil (= fat). (It doesn’t mention carbs, but that’s okay because it’s pretty much impossible to under-eat carbs.)&lt;/li&gt;
  &lt;li&gt;It still doesn’t say anything about micronutrients, but if you follow the prescribed guidelines, you’ll probably get enough micronutrients anyway.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href=&quot;https://www.myplate.gov/eat-healthy/what-is-myplate&quot;&gt;MyPlate&lt;/a&gt;, the USDA version of the Healthy Eating Plate, is similar to Harvard’s version except that they made it &lt;a href=&quot;https://www.health.harvard.edu/staying-healthy/comparison-of-healthy-eating-plate-and-usda-myplate&quot;&gt;worse&lt;/a&gt;, probably because of lobbyists or whatever (it says red meat counts as a healthy protein; it gives dairy its own category&lt;sup id=&quot;fnref:8&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;; it doesn’t say anything about healthy fats).&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/MyPlate.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This is a side note but I think it’s funny how everyone puts nuts in the “protein” bucket even though most nuts have about as much protein as grains (which is to say, not very much). I would rather put nuts in the “oil” group…obviously nuts aren’t oil, but both nuts and oil are desirable as a source of unsaturated fat, so they should go together.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;McCullough, M. L., et al. (2000). “Adherence to the Dietary Guidelines for Americans and Risk of Major Chronic Disease in Men.”&lt;/p&gt;

      &lt;p&gt;McCullough, M. L., et al. (2000). “Adherence to the Dietary Guidelines for Americans and Risk of Major Chronic Disease in Women.” &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Technically it doesn’t say grains are healthier than fruits/veggies, but it puts them lower on the pyramid, which naturally leads you to believe that they’re healthier. &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I’m sure someone in history has over-eaten vegetables at some point, but practically speaking you’re probably never going to get there. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;and probably also poultry, fish, and eggs, but eating those is bad for animals. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Plus a short list of exceptions like sodium and fat-soluble vitamins, where you want to eat some but not too much. &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The items on this pyramid don’t match up with the items in my article because I revised the article a bit and I didn’t feel like re-making the image. I put a ton of work into my pyramid, as you can probably tell by the intricate and high-quality illustrations. &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The USDA MyPlate website &lt;a href=&quot;https://www.myplate.gov/eat-healthy/dairy&quot;&gt;says&lt;/a&gt; soy milk counts as dairy. Which, like, I guess I get what they were going for, but why is this category called “dairy”? &lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Can you maintain lean mass in a calorie deficit?</title>
				<pubDate>Thu, 01 May 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/05/01/resistance_training_calorie_deficit/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/05/01/resistance_training_calorie_deficit/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;TLDR: A meta-analysis allegedly showed that a 500-calorie deficit is the sweet spot to avoid losing lean mass, but the interpretation of the data was wrong and actually it didn’t show that. When interpreted correctly, the data provides weak (insignificant) evidence that any deficit will result in a loss of lean mass.&lt;/p&gt;

&lt;p&gt;If you’re losing weight, does lifting weights reduce how much muscle you lose? Is it possible to entirely prevent muscle loss (or even gain muscle)?&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://onlinelibrary.wiley.com/doi/full/10.1111/sms.14075&quot;&gt;Murphy &amp;amp; Koehler (2021)&lt;/a&gt;&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; did a meta-analysis on this question. They collected experiments where the experimental groups did resistance training while eating at an energy deficit (RT+ED), and the control groups did resistance training while eating a normal amount of food (RT+CON).&lt;/p&gt;

&lt;p&gt;They found a strong association between change in lean mass and the magnitude of the energy deficit (slope = –0.325, p = 0.001). The meta-analysis predicts that you can eat at a deficit of 500 calories per day without losing any lean mass, but that you will lose lean mass at a larger deficit.&lt;/p&gt;

&lt;p&gt;(The meta-analysis also reported that participants gained strength in almost every study, even with larger calorie deficits. That’s useful to know, but I will focus on lean mass for this post.)&lt;/p&gt;

&lt;p&gt;I should mention that what we actually care about is muscle loss, not lean mass loss. Lean mass includes anything that isn’t fat—muscle fibers, organs, &lt;a href=&quot;https://en.wikipedia.org/wiki/Glycogen&quot;&gt;glycogen&lt;/a&gt;, etc. Muscle mass is harder to measure. We don’t know what happened to study participants’ muscle, only their total lean mass.&lt;/p&gt;

&lt;p&gt;Let’s set that aside and assume lean mass is a useful proxy for muscle mass.&lt;/p&gt;

&lt;p&gt;The authors showed a plot of every individual study’s experimental group (RT+ED) and control group (RT+CON), along with a regression line predicting lean mass change as a function of energy deficit:&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/RT+ED-and-RT+CON.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;But…does this regression line look a little odd to you?&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;Where are the RT+CON points, and where are the RT+ED points, relative to the regression line?&lt;/p&gt;

&lt;p&gt;In particular, look at all the data points from experimental groups where participants had energy deficits of under 500 calories. Almost all of them lost lean mass on average (recall that each individual point represents the average result from one study); only four gained lean mass.&lt;/p&gt;

&lt;p&gt;The slope of the regression line is almost entirely driven by the difference between the experimental and control groups.&lt;/p&gt;

&lt;p&gt;What happens if we calculate a regression using only the experimental groups? Did groups with a bigger calorie deficit lose more lean mass?&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/RT+ED.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Now the regression has a slope of only –0.123 (p = 0.28), and it predicts that any deficit will cause at least a small loss in lean mass.&lt;/p&gt;
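
&lt;p&gt;(If you want to reproduce this, the re-fit is just an ordinary least-squares line through the study-level averages. Here is a minimal sketch; the file name and column names are placeholders, and the real script is linked at the end of this post.)&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Minimal sketch: fit a line to the experimental-only (RT+ED) study
# averages. The file name and column names are hypothetical.
import pandas as pd
from scipy import stats

df = pd.read_csv('studies.csv')  # one row per RT+ED study group
fit = stats.linregress(df['energy_deficit'], df['lean_mass_change'])
print(fit.slope, fit.stderr, fit.pvalue)
&lt;/code&gt;&lt;/pre&gt;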

&lt;p&gt;So, among study groups where participants ate at a calorie deficit, it does appear that they lost lean mass on average. But there is not a clear relationship between the &lt;em&gt;size&lt;/em&gt; of the deficit and the amount of lean mass lost.&lt;/p&gt;

&lt;p&gt;In theory, it makes sense that when you have a larger calorie deficit, it should be harder for your body to preserve muscle. But the evidence from this meta-analysis doesn’t really support the theory. (It doesn’t contradict it, either. It just doesn’t say much either way.)&lt;/p&gt;

&lt;p&gt;Murphy &amp;amp; Koehler (2021)’s original regression had a slope of –0.325. If that’s the true slope, then it’s unlikely that the experimental-only regression would have the much shallower slope of –0.123 (p = 0.07, likelihood ratio 5.2&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;).&lt;/p&gt;
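
&lt;p&gt;Concretely, footnote 3’s calculation looks something like this (a sketch using a normal approximation; the variable names are mine):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch of footnote 3's calculation (normal approximation).
from scipy import stats

slope_exp = -0.1234   # experimental-only slope
slope_orig = -0.3249  # Murphy and Koehler's original slope
se = 0.111            # standard error of the experimental-only slope

# Two-sided test: how surprising is the experimental-only slope if the
# true slope is -0.3249?
z = (slope_exp - slope_orig) / se
p = 2 * stats.norm.sf(abs(z))  # roughly 0.07

# Likelihood ratio: how much better does slope_exp explain the estimate
# than slope_orig does?
lr = (stats.norm.pdf(slope_exp, loc=slope_exp, scale=se)
      / stats.norm.pdf(slope_exp, loc=slope_orig, scale=se))  # roughly 5.2
&lt;/code&gt;&lt;/pre&gt;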

&lt;p&gt;The conclusions I drew from this meta-analysis:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Eating at a deficit might cause me to lose muscle, or it might not, who knows.&lt;/li&gt;
  &lt;li&gt;In theory, I expect there is some calorie deficit above which I start to lose muscle, but this meta-analysis doesn’t tell me what that number is.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;appendix-what-else-does-this-data-tell-us&quot;&gt;Appendix: What else does this data tell us?&lt;/h2&gt;

&lt;p&gt;Looking at the experimental-only regression, how much evidence is this for or against the hypothesis that a larger calorie deficit causes more lean mass loss?&lt;/p&gt;

&lt;p&gt;The experimental-only regression has slope –0.123 with standard error 0.111. The slope is still negative, which is consistent with the hypothesis, but it’s not strongly negative—only about one standard error away from zero.&lt;/p&gt;

&lt;p&gt;The hypothesis predicts a positive intercept: it should be possible to gain muscle while maintaining weight. The experimental-only regression has a negative intercept (–0.054), but it is less than one standard error away from zero (the intercept’s standard error is 0.079). This is weak evidence against the hypothesis.&lt;/p&gt;

&lt;p&gt;Source code is available &lt;a href=&quot;https://github.com/michaeldickens/public-scripts/blob/master/calorie_deficit.py&quot;&gt;on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Updated 2026-03-25 to add a TLDR.&lt;/em&gt;&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Murphy, C., &amp;amp; Koehler, K. (2021). &lt;a href=&quot;https://onlinelibrary.wiley.com/doi/full/10.1111/sms.14075&quot;&gt;Energy deficiency impairs resistance training gains in lean mass but not strength: A meta-analysis and meta-regression.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.1111/sms.14075&quot;&gt;10.1111/sms.14075&lt;/a&gt; &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The paper’s plot is in black and white. I re-created it in color to make it easier to read. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The p-value is calculated using a two-sided t-test, where the null hypothesis is that the mean equals –0.3249. The test uses a standard error of 0.111, which is the standard error of the experimental-only slope.&lt;/p&gt;

      &lt;p&gt;Likelihood ratio is calculated as norm.pdf(–0.1234, mu=–0.1234, sigma=0.111) / norm.pdf(–0.1234, mu=–0.3249, sigma=0.111). &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Why would AI companies use human-level AI to do alignment research?</title>
				<pubDate>Fri, 25 Apr 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/04/25/bootstrapped_alignment/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/04/25/bootstrapped_alignment/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;img src=&quot;/assets/images/plans-for-the-future.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Cross-posted to the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/AP2awvvmzGoiAkXsm/why-would-ai-companies-use-human-level-ai-to-do-alignment&quot;&gt;EA Forum&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Many plans for how to safely build superintelligent AI have a critical section that goes like this:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Develop AI that’s powerful enough to do AI research, but not yet powerful enough to pose an existential threat.&lt;/li&gt;
  &lt;li&gt;Use it to assist with alignment research, thus greatly accelerating the pace of work—hopefully enough to solve all alignment problems.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You could call this process “alignment bootstrapping”.&lt;/p&gt;

&lt;p&gt;This is a central feature of &lt;a href=&quot;https://deepmind.google/discover/blog/taking-a-responsible-path-to-agi/&quot;&gt;DeepMind’s plan&lt;/a&gt; (see “Amplified oversight”), &lt;a href=&quot;https://www.anthropic.com/news/core-views-on-ai-safety&quot;&gt;Anthropic’s plan&lt;/a&gt; (see “Scalable Oversight”), and independent plans written by &lt;a href=&quot;https://sleepinyourhat.github.io/checklist/&quot;&gt;Sam Bowman&lt;/a&gt; (an AI safety manager at Anthropic), &lt;a href=&quot;https://www.lesswrong.com/posts/8vgi3fBWPFDLBBcAx/planning-for-extreme-ai-risks&quot;&gt;Joshua Clymer&lt;/a&gt; (a researcher at Redwood Research), and &lt;a href=&quot;https://www.lesswrong.com/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan&quot;&gt;Marius Hobbhahn&lt;/a&gt; (CEO of Apollo Research).&lt;/p&gt;

&lt;p&gt;There are various reasons why alignment bootstrapping could fail&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; even if implemented well, and some of those plans acknowledge this. But I’m also concerned about whether alignment bootstrapping will be implemented at all.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When the time comes, will AI companies actually spend their resources on alignment bootstrapping?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When AI companies have human-level AI systems, will they &lt;em&gt;use them for alignment research&lt;/em&gt;, or will they use them (mostly) to advance capabilities instead?&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;AI companies currently employ many human-level &lt;em&gt;humans&lt;/em&gt;, and use a small percentage of them to do alignment research. If it makes sense for them to use most of their human-level AIs to do alignment research, wouldn’t it also make sense to use most of their &lt;em&gt;human&lt;/em&gt; researchers to do alignment research?&lt;/p&gt;

&lt;p&gt;But they don’t do that. Most of their human researchers work on advancing AI capabilities.&lt;/p&gt;

&lt;p&gt;It’s more likely that they will use human-level AIs the same way they use human researchers: almost all of them will work on accelerating capabilities, and a small minority will work on safety. Which probably means capabilities outpace safety, which probably means we die.&lt;/p&gt;

&lt;p&gt;Some companies argue that they &lt;em&gt;need to&lt;/em&gt; advance capabilities right now to stay competitive. Perhaps that’s true. Consider what the world will look like once the first company develops human-level AI. At that point, the #2 company will only be a few months behind at most. So the leading company will once again say, “Sorry, we can’t use our human-level AI to work on alignment, we have to keep advancing capabilities to stay ahead.” And they will continue saying this right up until their AI is powerful enough to kill everyone.&lt;/p&gt;

&lt;p&gt;Counterpoint: AI companies would probably argue that present-day AIs are far from being dangerous, so it’s fine to race ahead for now. But human-level AIs will be &lt;em&gt;nearly&lt;/em&gt; dangerous, so at that point it will be too risky to keep advancing capabilities, and they will shift their focus to safety.&lt;/p&gt;

&lt;p&gt;I would be more inclined to believe this if AI companies weren’t already behaving so &lt;a href=&quot;https://mdickens.me/2024/11/18/where_i_am_donating_in_2024/#parallel-safetycapabilities-vs-slowing-ai&quot;&gt;recklessly&lt;/a&gt;.&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; If you’re going to prioritize safety over capabilities when the tradeoff becomes more critical, you should prove it to the world by prioritizing safety over capabilities &lt;em&gt;right now&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Perhaps the ideal perfectly-altruistic AI company would indeed push capabilities right now and then switch to safety at the critical time,&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; but I see little reason to believe that that’s what any of the real-life AI companies are going to do.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;By my reading, none of the plans put probabilities on how concerning these reasons are. My guess is that, if alignment bootstrapping is implemented as these plans typically describe, then there’s a greater than 50% chance that we die.&lt;/p&gt;

      &lt;p&gt;The purpose of this essay isn’t to talk about the implementation problems with alignment bootstrapping, but in brief:&lt;/p&gt;

      &lt;p&gt;If your alignment-researcher AI is smarter than you, and you don’t know how to align AI yet, then you can’t trust that your AI is doing good work.&lt;/p&gt;

      &lt;p&gt;People who propose bootstrapping are usually aware of this problem. They have preliminary ideas for how they will evaluate the work of an AI that’s smarter than them, coupled with bafflingly high confidence that their untested ideas will work. (Zvi &lt;a href=&quot;https://www.lesswrong.com/posts/hvEikwtsbf6zaXG2s/on-google-s-safety-plan#A_Problem_For_Future_Earth&quot;&gt;proposed a test&lt;/a&gt;: “Can you get a method whereby the Man On The Street can use AI help to code and evaluate graduate level economics outputs and the quality of poetry and so on in ways that would translate to this future parallel situation?”) &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I wanted to provide a link to a well-sourced and well-reasoned list of reckless behaviors by AI companies. I found no such list, so instead this is a link to a section of a post I wrote that includes numerous examples of reckless behavior. &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I don’t actually think this is what safety-minded AI companies should do. I think they should spend less on capabilities and more on safety. But I am sympathetic to the position that they should temporarily focus on advancing capabilities. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Do Protests Work? A Critical Review</title>
				<pubDate>Fri, 18 Apr 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/04/18/protest_outcomes_critical_review/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/04/18/protest_outcomes_critical_review/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;James Özden and Sam Glover at &lt;a href=&quot;https://www.socialchangelab.org/&quot;&gt;Social Change Lab&lt;/a&gt; wrote a &lt;a href=&quot;https://www.socialchangelab.org/_files/ugd/503ba4_94d84534d5b348468739b0d6a36b3940.pdf&quot;&gt;literature review on protest outcomes&lt;/a&gt;&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; as part of a broader &lt;a href=&quot;https://www.socialchangelab.org/_files/ugd/503ba4_052959e2ee8d4924934b7efe3916981e.pdf&quot;&gt;investigation&lt;/a&gt;&lt;sup id=&quot;fnref:10&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; on protest effectiveness. The report covers multiple lines of evidence and addresses many relevant questions, but does not say much about the methodological quality of the research. So that’s what I’m going to do today.&lt;/p&gt;

&lt;p&gt;I reviewed the evidence on protest outcomes, focusing only on the &lt;strong&gt;highest-quality research&lt;/strong&gt;, to answer two questions:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Do protests work?&lt;/li&gt;
  &lt;li&gt;Are Social Change Lab’s conclusions consistent with the highest-quality evidence?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here’s what I found:&lt;/p&gt;

&lt;p&gt;Do protests work? &lt;strong&gt;Highly likely&lt;/strong&gt; (credence: 90%) in certain contexts, although it’s unclear how well the results generalize. &lt;a href=&quot;#meta-analysis&quot;&gt;[More]&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Are Social Change Lab’s conclusions consistent with the highest-quality evidence? &lt;strong&gt;Yes&lt;/strong&gt;—the report’s core claims are well-supported, although it overstates the strength of some of the evidence. &lt;a href=&quot;#are-social-change-labs-claims-justified&quot;&gt;[More]&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Cross-posted to the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/v6PtkcfZQAHR2Cgmx/do-protests-work-a-critical-review&quot;&gt;Effective Altruism Forum&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#introduction&quot; id=&quot;markdown-toc-introduction&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#studies-on-real-world-protest-outcomes&quot; id=&quot;markdown-toc-studies-on-real-world-protest-outcomes&quot;&gt;Studies on real-world protest outcomes&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#madestam-et-al-2013-on-tea-party-protests&quot; id=&quot;markdown-toc-madestam-et-al-2013-on-tea-party-protests&quot;&gt;Madestam et al. (2013) on Tea Party protests&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#placebo-tests&quot; id=&quot;markdown-toc-placebo-tests&quot;&gt;Placebo tests&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#wasow-2020-on-1960s-civil-rights-protests&quot; id=&quot;markdown-toc-wasow-2020-on-1960s-civil-rights-protests&quot;&gt;Wasow (2020) on 1960s civil rights protests&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#klein-teeselink--melios-2021-on-2020-black-lives-matter-protests&quot; id=&quot;markdown-toc-klein-teeselink--melios-2021-on-2020-black-lives-matter-protests&quot;&gt;Klein Teeselink &amp;amp; Melios (2021) on 2020 Black Lives Matter protests&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#failed-placebo-tests&quot; id=&quot;markdown-toc-failed-placebo-tests&quot;&gt;Failed placebo tests&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#larreboure--gonzález-2021-on-the-womens-march&quot; id=&quot;markdown-toc-larreboure--gonzález-2021-on-the-womens-march&quot;&gt;Larreboure &amp;amp; González (2021) on the Women’s March&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#hungerman--moorthy-2023-on-earth-day&quot; id=&quot;markdown-toc-hungerman--moorthy-2023-on-earth-day&quot;&gt;Hungerman &amp;amp; Moorthy (2023) on Earth Day&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#meta-analysis&quot; id=&quot;markdown-toc-meta-analysis&quot;&gt;Meta-analysis&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#potential-problems-with-the-research&quot; id=&quot;markdown-toc-potential-problems-with-the-research&quot;&gt;Potential problems with the research&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#spatial-autocorrelation&quot; id=&quot;markdown-toc-spatial-autocorrelation&quot;&gt;Spatial autocorrelation&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#publication-bias&quot; id=&quot;markdown-toc-publication-bias&quot;&gt;Publication bias&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#data-fabrication&quot; id=&quot;markdown-toc-data-fabrication&quot;&gt;Data fabrication&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#data-errors&quot; id=&quot;markdown-toc-data-errors&quot;&gt;Data errors&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#will-the-results-generalize&quot; id=&quot;markdown-toc-will-the-results-generalize&quot;&gt;Will the results generalize?&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#meta-concerns-with-this-meta-analysis&quot; id=&quot;markdown-toc-meta-concerns-with-this-meta-analysis&quot;&gt;Meta-concerns with this meta-analysis&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#are-social-change-labs-claims-justified&quot; id=&quot;markdown-toc-are-social-change-labs-claims-justified&quot;&gt;Are Social Change Lab’s claims justified?&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#broad-claims&quot; id=&quot;markdown-toc-broad-claims&quot;&gt;Broad claims&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#claims-about-individual-studies&quot; id=&quot;markdown-toc-claims-about-individual-studies&quot;&gt;Claims about individual studies&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conclusion&quot; id=&quot;markdown-toc-conclusion&quot;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#source-code&quot; id=&quot;markdown-toc-source-code&quot;&gt;Source code&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#appendix-a-additional-tables&quot; id=&quot;markdown-toc-appendix-a-additional-tables&quot;&gt;Appendix A: Additional tables&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#appendix-b-methodological-revisions&quot; id=&quot;markdown-toc-appendix-b-methodological-revisions&quot;&gt;Appendix B: Methodological revisions&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#appendix-c-comparing-the-strength-of-evidence-to-saturated-fat-research&quot; id=&quot;markdown-toc-appendix-c-comparing-the-strength-of-evidence-to-saturated-fat-research&quot;&gt;Appendix C: Comparing the strength of evidence to saturated fat research&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;introduction&quot;&gt;Introduction&lt;/h1&gt;

&lt;p&gt;This article serves two purposes: First, it analyzes the evidence on protest outcomes. Second, it critically reviews the Social Change Lab literature review.&lt;/p&gt;

&lt;p&gt;Social Change Lab is not the only group that has reviewed protest effectiveness. I was able to find four literature reviews:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Animal Charity Evaluators (2018), &lt;a href=&quot;https://animalcharityevaluators.org/research/reports/protests/&quot;&gt;Protest Intervention Report.&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Orazani et al. (2021), &lt;a href=&quot;https://doi.org/10.1002/ejsp.2722&quot;&gt;Social movement strategy (nonviolent vs. violent) and the garnering of third-party support: A meta-analysis.&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Social Change Lab – Ozden &amp;amp; Glover (2022), &lt;a href=&quot;https://www.socialchangelab.org/_files/ugd/503ba4_94d84534d5b348468739b0d6a36b3940.pdf&quot;&gt;Literature Review: Protest Outcomes.&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Shuman et al. (2024), &lt;a href=&quot;https://doi.org/10.1016/j.tics.2023.10.003&quot;&gt;When Are Social Protests Effective?&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The Animal Charity Evaluators review did not include many studies, and did not cite any natural experiments (only one had been published as of 2018).&lt;/p&gt;

&lt;p&gt;Orazani et al. (2021)&lt;sup id=&quot;fnref:50&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:50&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; is a nice meta-analysis—it finds that when you show people news articles about nonviolent protests, they are more likely to express support for the protesters’ cause. But what people say in a lab setting might not carry over to real-life behavior.&lt;/p&gt;

&lt;p&gt;I read through Shuman et al. (2024). Compared to Ozden &amp;amp; Glover (2022), it cited weaker evidence and made a larger number of claims with thinner support.&lt;/p&gt;

&lt;p&gt;I looked through these literature reviews to find relevant studies. The Social Change Lab review was by far the most useful; the other reviews didn’t include any additional studies meeting my criteria. I used ChatGPT Deep Research&lt;sup id=&quot;fnref:13&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:13&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; to find more publications.&lt;/p&gt;

&lt;p&gt;I focused my critical analysis on only the highest-quality evidence:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;I did not review lab experiments. The Orazani et al. meta-analysis is informative, but it might not generalize to the real world.&lt;/li&gt;
  &lt;li&gt;There are many studies showing an association between protests and real-world outcomes (voting patterns, government policy, corporate behavior, etc.), but the vast majority of them are observational.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Observational studies cannot establish causation. They cannot distinguish between “protests raised support for the cause” and “protests happened because people supported the cause”. No amount of &lt;a href=&quot;https://dynomight.net/control/&quot;&gt;controlling for confounders&lt;/a&gt; fixes this problem.&lt;/p&gt;

&lt;p&gt;Therefore, my review focuses only on natural experiments that measure real-world outcomes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conflict of interest:&lt;/strong&gt; In 2024 I &lt;a href=&quot;https://mdickens.me/2024/11/18/where_i_am_donating_in_2024/&quot;&gt;donated&lt;/a&gt; to &lt;a href=&quot;https://www.pauseai-us.org/&quot;&gt;PauseAI US&lt;/a&gt;, which organizes protests. I would prefer to find that protests work.&lt;/p&gt;

&lt;h1 id=&quot;studies-on-real-world-protest-outcomes&quot;&gt;Studies on real-world protest outcomes&lt;/h1&gt;

&lt;p&gt;Social Change Lab reviewed five studies on how protests affect voter behavior, which they judged to be the best studies on the subject.&lt;/p&gt;

&lt;p&gt;I excluded two of the five studies due to methodological concerns:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://doi.org/10.1177/0003122414555885&quot;&gt;McVeigh et al. (2014)&lt;/a&gt;&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; is an observational study that looked at long-term changes in Republican voting in counties where the Ku Klux Klan was most active.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://doi.org/10.1111/1475-6765.12375&quot;&gt;Bremer et al. (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; is a study on the correlation between protests and electoral outcomes in European countries.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I did not review these because they are observational studies, and I wanted to focus on natural experiments.&lt;/p&gt;

&lt;p&gt;I did review the other three studies:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Madestam, A., Shoag, D., Veuger, S., &amp;amp; Yanagizawa-Drott, D. (2013). &lt;a href=&quot;https://doi.org/10.1093/qje/qjt021&quot;&gt;Do Political Protests Matter? Evidence from the Tea Party Movement.&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Wasow, O. (2020). &lt;a href=&quot;https://doi.org/10.1017/S000305542000009X&quot;&gt;Agenda Seeding: How 1960s Black Protests Moved Elites, Public Opinion and Voting.&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Klein Teeselink, B. K., &amp;amp; Melios, G. (2021). &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.3809877&quot;&gt;Weather to Protest: The Effect of Black Lives Matter Protests on the 2020 Presidential Election.&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In addition, I looked at two studies that the Social Change Lab report did not cover:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Larreboure, M., &amp;amp; Gonzalez, F. (2021). &lt;a href=&quot;https://mlarreboure.com/womenmarch.pdf&quot;&gt;The Impact of the Women’s March on the U.S. House Election.&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Hungerman, D., &amp;amp; Moorthy, V. (2023). &lt;a href=&quot;/materials/Earth-Day.pdf&quot;&gt;Every Day Is Earth Day: Evidence on the Long-Term Impact of Environmental Activism.&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are no randomized controlled trials on the real-world effect of protests (how would you randomly assign protests to occur?). But there are five natural experiments—three from the Social Change Lab review, plus the Women’s March and Earth Day studies. Most of the natural experiments use the &lt;strong&gt;rainfall method&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The idea is that protests often get canceled when it rains. If you look at voting patterns in places where it rained on protest day compared to where it didn’t rain, you should be able to isolate the causal effect of protests. The rain effectively randomizes where protests occur.&lt;/p&gt;

&lt;p&gt;Rather than using rainfall directly, the rainfall method uses rainfall &lt;em&gt;shocks&lt;/em&gt;—that is, unexpectedly high or low rainfall relative to what was expected for that location and date. This avoids any confounding effect of average rainfall levels.&lt;/p&gt;
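
&lt;p&gt;In code, the basic setup looks something like this (a toy sketch of the general approach, not any particular paper’s specification; the file and column names are hypothetical):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Toy sketch of the rainfall-shock regression. All names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv('county_rainfall.csv')  # one row per county

# Shock = protest-day rainfall minus the county's historical average for
# that date, so chronically rainy counties aren't misclassified.
df['shock'] = df['protest_day_rain'] - df['historical_mean_rain']

# Regress the outcome (e.g. change in vote share) on the shock.
result = smf.ols('vote_share_change ~ shock', data=df).fit()
print(result.summary())
&lt;/code&gt;&lt;/pre&gt;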

&lt;p&gt;The clearest illustration of the rainfall method comes from &lt;a href=&quot;/materials/Earth-Day.pdf&quot;&gt;Hungerman &amp;amp; Moorthy (2023)&lt;/a&gt;&lt;sup id=&quot;fnref:36&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:36&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt; (which I will discuss in more detail &lt;a href=&quot;#hungerman--moorthy-2023-on-earth-day&quot;&gt;later&lt;/a&gt;). The authors looked at counties where it rained vs. didn’t rain on the inaugural Earth Day—April 22, 1970. Then they used rainfall to predict the rate of birth defects from 1980–1988.&lt;/p&gt;

&lt;p&gt;The hypothesis is that Earth Day demonstrations increased support for environmental protections. That in turn would reduce environmental contaminants, leading to fewer birth defects. And if rainfall stops demonstrations from happening, then it will have the opposite effect.&lt;/p&gt;

&lt;p&gt;The rainfall method is commonly used in social science, and it has received some fair criticism.&lt;sup id=&quot;fnref:46&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:46&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt;&lt;sup id=&quot;fnref:47&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:47&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt; But the rainfall method as it was used by Hungerman &amp;amp; Moorthy is robust to these criticisms, as illustrated by this chart:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Earth-Day-birth-defects.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The key to establishing causation is that rainfall had no predictive power on any other day. It only mattered &lt;em&gt;on Earth Day&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;That leaves us with two possibilities:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Rainfall is associated with higher birth defects due to some confounding variable, but only rainfall on April 22 and not on any other day, because that day is special somehow, in a way that has nothing to do with Earth Day; or&lt;/li&gt;
  &lt;li&gt;Earth Day demonstrations reduced the rate of birth defects.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;(Or the results could be due to a statistical error or data manipulation. I will discuss those possibilities later.)&lt;/p&gt;

&lt;p&gt;A summary of the five studies I reviewed plus the two I declined to review:&lt;/p&gt;

&lt;div id=&quot;table-1&quot; style=&quot;text-align:center;&quot;&gt;Table 1: Summary of Studies&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Study&lt;/th&gt;
      &lt;th&gt;Protest&lt;/th&gt;
      &lt;th&gt;Protest Type&lt;/th&gt;
      &lt;th&gt;Effect&lt;/th&gt;
      &lt;th&gt;Randomization Method&lt;/th&gt;
      &lt;th&gt;Quality&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Madestam et al.&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;Tea Party&lt;/td&gt;
      &lt;td&gt;nonviolent&lt;/td&gt;
      &lt;td&gt;+&lt;/td&gt;
      &lt;td&gt;rainfall&lt;/td&gt;
      &lt;td&gt;high&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Wasow&lt;sup id=&quot;fnref:9&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;Civil Rights&lt;/td&gt;
      &lt;td&gt;violent&lt;/td&gt;
      &lt;td&gt;-&lt;/td&gt;
      &lt;td&gt;rainfall&lt;/td&gt;
      &lt;td&gt;medium&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Wasow&lt;sup id=&quot;fnref:9:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;Civil Rights&lt;/td&gt;
      &lt;td&gt;nonviolent&lt;/td&gt;
      &lt;td&gt;+&lt;/td&gt;
      &lt;td&gt;none (observational)&lt;/td&gt;
      &lt;td&gt;low&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Klein Teeselink &amp;amp; Melios&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;BLM&lt;/td&gt;
      &lt;td&gt;nonviolent&lt;/td&gt;
      &lt;td&gt;+&lt;/td&gt;
      &lt;td&gt;rainfall&lt;/td&gt;
      &lt;td&gt;high&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Larreboure &amp;amp; González&lt;sup id=&quot;fnref:14&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;Women’s March&lt;/td&gt;
      &lt;td&gt;nonviolent&lt;/td&gt;
      &lt;td&gt;+&lt;/td&gt;
      &lt;td&gt;weather shocks&lt;/td&gt;
      &lt;td&gt;medium&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Hungerman &amp;amp; Moorthy&lt;sup id=&quot;fnref:36:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:36&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;Earth Day&lt;/td&gt;
      &lt;td&gt;nonviolent&lt;/td&gt;
      &lt;td&gt;+&lt;/td&gt;
      &lt;td&gt;rainfall&lt;/td&gt;
      &lt;td&gt;high&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;McVeigh et al.&lt;sup id=&quot;fnref:2:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;KKK activity&lt;/td&gt;
      &lt;td&gt;unclear&lt;/td&gt;
      &lt;td&gt;+&lt;/td&gt;
      &lt;td&gt;none (observational)&lt;/td&gt;
      &lt;td&gt;low&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Bremer et al.&lt;sup id=&quot;fnref:3:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;Europe elections&lt;/td&gt;
      &lt;td&gt;nonviolent&lt;/td&gt;
      &lt;td&gt;?&lt;sup id=&quot;fnref:11&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;14&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;none (observational)&lt;/td&gt;
      &lt;td&gt;low&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;(Methodological quality is relative. I’d have higher confidence in a true experiment than in any of these quasi-experimental methods.)&lt;/p&gt;

&lt;p&gt;Next I will review each study individually. Then I will collect the results into a meta-analysis.&lt;/p&gt;

&lt;h2 id=&quot;madestam-et-al-2013-on-tea-party-protests&quot;&gt;Madestam et al. (2013) on Tea Party protests&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;/materials/TeaParty_Protests.pdf&quot;&gt;Madestam et al. (2013)&lt;/a&gt;&lt;sup id=&quot;fnref:4:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt; looked at the effect of 2009 Tea Party protests on the 2012 US elections. It used the rainfall method to establish causality.&lt;/p&gt;

&lt;p&gt;As an additional check, the authors tested whether rainfall could predict Republican and Democratic vote shares in the 2008 election. (You may recall that the 2009 Tea Party protests did not occur until a year after the 2008 election.) If rainfall can predict the 2008 election results—before the protests occurred—that means the model was confounded.&lt;/p&gt;

&lt;p&gt;Madestam et al. (2013) found that rainfall in 2009 could &lt;em&gt;not&lt;/em&gt; predict votes in 2008 (see Table II), but it &lt;em&gt;could&lt;/em&gt; predict votes in 2012 (see Table VI).&lt;/p&gt;

&lt;p&gt;The authors also tested whether rainfall on other days prior to the Tea Party protests could predict 2009 voting patterns, and found that they could not.&lt;/p&gt;

&lt;p&gt;In the authors’ model, a rainy protest decreased Republicans’ share of the vote in the 2012 election by 1.04 percentage points (p &amp;lt; 0.0006&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;15&lt;/a&gt;&lt;/sup&gt;). This suggests that protests did indeed increase the Republican vote share.&lt;/p&gt;

&lt;p&gt;Interestingly, rainfall decreased Republican vote share relative to the total population, but did not increase the Democratic share. This suggests that protests increased voter turnout but did not cause voters to change their minds.&lt;/p&gt;

&lt;p&gt;At first I thought the inability to predict 2008 votes might be a false negative (like a p = 0.06 situation), but this was not the case. Rainfall in 2009 &lt;em&gt;increased&lt;/em&gt; Republicans’ vote share in 2008, although only slightly (p = 0.38). (Remember that rainfall is supposed to decrease Republican votes by preventing Tea Party protests from happening.)&lt;/p&gt;

&lt;p&gt;There is another concern with the rainfall model—not with causality, but with overstating the strength of evidence. A standard statistical model assumes that all observations are independent. But rainfall is &lt;strong&gt;spatially autocorrelated&lt;/strong&gt;, which is the statistical way of saying that rain in one county is not independent of rainfall in the neighboring counties. If you have data from 2,758 counties, you can’t treat them as 2,758 independent samples.&lt;/p&gt;

&lt;p&gt;Madestam et al. (2013) used several methods to account for this. First, they clustered standard errors at the state level instead of at the county level. Second, as a robustness check, they assumed spatial correlations varied as an inverse function of distance, which produced similar standard errors. Third, they tried dropping states one at a time to see if any single state overly influenced the results.&lt;/p&gt;
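
&lt;p&gt;The first correction is easy to picture in code (a sketch with made-up column names, not the paper’s exact specification):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sketch: state-level clustered standard errors. Names are made up.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv('tea_party_counties.csv')
model = smf.ols('gop_vote_change_2012 ~ rained_on_protest_day', data=df)

# Clustering lets errors be arbitrarily correlated within a state, so
# 2,758 counties no longer count as 2,758 independent observations.
result = model.fit(cov_type='cluster', cov_kwds={'groups': df['state']})
print(result.bse)  # clustered standard errors
&lt;/code&gt;&lt;/pre&gt;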

&lt;h3 id=&quot;placebo-tests&quot;&gt;Placebo tests&lt;/h3&gt;

&lt;p&gt;Finally:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;[W]e conduct a series of placebo tests using rainfall on other historical dates in April. These placebos are drawn from the same spatially correlated distribution as rainfall on April 15, 2009. If rainfall on the protest day has a causal effect, the actual estimate of rainfall ought to be an outlier in the distribution of placebo coefficients.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;They calculated the “placebo p-value” as the probability that rainfall on a random day could predict outcomes better than rainfall on protest day. If the model has correctly accounted for spatial autocorrelation, then the placebo p-value should equal the original model p-value, plus or minus some random variation.&lt;/p&gt;
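
&lt;p&gt;The logic of the placebo test is simple enough to sketch (a toy version with hypothetical column names, one rainfall column per historical date):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Toy placebo test: re-run the regression with rainfall from other
# historical dates and ask how often a placebo beats the real effect.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv('counties.csv')  # outcome plus one rain column per date
placebo_cols = [c for c in df.columns if c.startswith('rain_')]

def slope(col):
    return smf.ols(f'outcome ~ {col}', data=df).fit().params[col]

actual = slope('protest_day_rain')
placebos = np.array([slope(c) for c in placebo_cols])

# Placebo p-value: share of dates whose effect is at least as extreme
# as the protest-day effect.
print(np.mean(np.abs(placebos) &amp;gt;= abs(actual)))
&lt;/code&gt;&lt;/pre&gt;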

&lt;p&gt;The authors ran tests on 627 random “placebo dates”, and found that rain on protest day had a larger effect size than almost any of the placebo dates (see Figure V). This suggests that their corrections for spatial correlation worked, making false positives unlikely. However, the p-values in Figure V were a bit higher than the p-values in the main text, suggesting some effect-size inflation due to spatial autocorrelation.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Tea-Party-Figure-V.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;wasow-2020-on-1960s-civil-rights-protests&quot;&gt;Wasow (2020) on 1960s civil rights protests&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;/materials/1960s_Black_Protests.pdf&quot;&gt;Wasow (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:9:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt; measured the effect of nonviolent protests using observational data only. I won’t discuss that portion of the paper.&lt;/p&gt;

&lt;p&gt;Wasow applied the quasi-experimental rainfall model to &lt;em&gt;violent&lt;/em&gt; protests and found that they had a significant backfire effect. I won’t focus on the evidence on violent protests because I would recommend against engaging in violence regardless of what result the study found.&lt;/p&gt;

&lt;p&gt;But if violent protests decrease public support, that’s (weak) evidence against protests working in general. The simplest hypothesis is “protests work”. But evidence on violent protests contradicts this, requiring a more complex claim: “nonviolent protests work, violent protests backfire”. I will evaluate this two-part hypothesis in the &lt;a href=&quot;#meta-analysis&quot;&gt;meta-analysis&lt;/a&gt; below.&lt;/p&gt;

&lt;p&gt;As some additional evidence on violent protests, &lt;a href=&quot;/materials/Riots_Property.pdf&quot;&gt;Collins &amp;amp; Margo (2007)&lt;/a&gt;&lt;sup id=&quot;fnref:48&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:48&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;16&lt;/a&gt;&lt;/sup&gt; used the rainfall method to find that 1960s riots decreased nearby property values. This is consistent with the finding from Wasow (2020) that violent protests backfire, but property values are not directly relevant to protesters’ outcomes. It’s conceivable that protests could simultaneously decrease local property values and increase public support.&lt;/p&gt;

&lt;p&gt;Replication data from Wasow (2020) is &lt;a href=&quot;https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/HVRCKM&quot;&gt;publicly available&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;klein-teeselink--melios-2021-on-2020-black-lives-matter-protests&quot;&gt;Klein Teeselink &amp;amp; Melios (2021) on 2020 Black Lives Matter protests&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.3809877&quot;&gt;Klein Teeselink &amp;amp; Melios (2021)&lt;/a&gt;&lt;sup id=&quot;fnref:6:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt; used the rainfall method to establish the effect of Black Lives Matter protests on the 2020 presidential election.&lt;/p&gt;

&lt;p&gt;Unlike Madestam et al. (2013), this paper did not test whether rainfall could predict outcomes &lt;em&gt;before&lt;/em&gt; the protests (which would indicate confounding).&lt;/p&gt;

&lt;p&gt;As with Madestam et al. (2013), the authors of this paper considered the fact that rainfall is not independent across counties. Unlike the other studies in this review, however, they treated county vote changes as interdependent: their model assumes that the change in vote share in one county is partially explained by the vote changes in surrounding counties, scaled by inverse distance, using the method described by &lt;a href=&quot;/materials/beck2006.pdf&quot;&gt;Beck et al. (2006)&lt;/a&gt;&lt;sup id=&quot;fnref:45&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:45&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;17&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;Klein Teeselink &amp;amp; Melios’ model isolates the impact of &lt;em&gt;local&lt;/em&gt; protests on &lt;em&gt;local&lt;/em&gt; vote change. In the method of (e.g.) Madestam et al. (2013), some vote changes may be explained by protests in &lt;em&gt;neighboring counties&lt;/em&gt;. Klein Teeselink &amp;amp; Melios’ method is more rigorous in a sense, but we don’t actually &lt;em&gt;want&lt;/em&gt; to isolate local changes. We want to know how well protests work &lt;em&gt;overall&lt;/em&gt;, not just their local effects.&lt;/p&gt;

&lt;p&gt;Klein Teeselink &amp;amp; Melios performed a robustness check in Table A3, Panel D where they fully ignored spatial autocorrelation. This produced mean effects more in line with the other studies: a vote share change of 11.9 per protester (std err 2.9), and a change of 0.105 based on the probability of rain (std err 0.032).&lt;/p&gt;

&lt;p&gt;If you ignore spatial autocorrelation, you may overestimate the strength of evidence. However, in this case, ignoring spatial autocorrelation had only a modest impact on the t-stats:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;t-stat&lt;/th&gt;
      &lt;th&gt;Primary Model&lt;/th&gt;
      &lt;th&gt;Ignoring Spatial Autocorrelation&lt;/th&gt;
      &lt;th&gt;Ignoring Spatial Autocorrelation + Counties Weighted by Population&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share Per Protester&lt;/td&gt;
      &lt;td&gt;5.5&lt;/td&gt;
      &lt;td&gt;4.1&lt;/td&gt;
      &lt;td&gt;7.2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share by Rain Probability&lt;/td&gt;
      &lt;td&gt;2.3&lt;/td&gt;
      &lt;td&gt;3.3&lt;/td&gt;
      &lt;td&gt;9.3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The paper’s replication data is &lt;a href=&quot;https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/AVTED4&amp;amp;faces-redirect=true&quot;&gt;publicly available&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;failed-placebo-tests&quot;&gt;Failed placebo tests&lt;/h3&gt;

&lt;p&gt;Earlier, I discussed how Madestam et al. (2013) performed &lt;a href=&quot;#placebo-tests&quot;&gt;“placebo tests”&lt;/a&gt; to check that its model wouldn’t generate too many false positives. Klein Teeselink &amp;amp; Melios (2021) did the same, although with only nine placebo tests instead of 627:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/BLM-placebo-tests.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;(May 25 was the beginning of the period in which the majority of protests happened.)&lt;/p&gt;

&lt;p&gt;This chart shows that Klein Teeselink &amp;amp; Melios’ version of the rainfall method &lt;strong&gt;did not establish causality&lt;/strong&gt;. The fortnight of April 29—a month before the protests started—showed nearly the same effect size as the May 25 period, and 6 out of 9 placebo periods had p-values less than 0.05. So either some confounding variable explains the association between protests and vote share, or the standard error is underestimated due to spatial autocorrelation (or something similar).&lt;/p&gt;

&lt;p&gt;The authors write&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;part of this association may be caused by serial correlation in weather patterns&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In other words, a rainy June often also means a rainy April or May, so rain in April/May might appear to affect protest outcomes because it’s correlated with rain in June. (And thus, if that explanation holds, the model does establish causality after all.)&lt;/p&gt;

&lt;p&gt;That may be true, but I’m not confident in that explanation,&lt;sup id=&quot;fnref:49&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:49&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;18&lt;/a&gt;&lt;/sup&gt; so I can’t trust this model to establish causality. For that reason, &lt;strong&gt;I exclude the BLM protests from my meta-analysis.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;(But this does leave me wondering: How is it that rainfall shocks in April could predict vote changes in the 2020 presidential election?)&lt;/p&gt;

&lt;p&gt;It would be interesting to compare the publicly-available BLM data to the Earth Day data (see &lt;a href=&quot;#hungerman--moorthy-2023-on-earth-day&quot;&gt;below&lt;/a&gt;) to figure out why the Earth Day paper passed its placebo test but BLM did not. But that’s beyond the scope of this article.&lt;/p&gt;

&lt;h2 id=&quot;larreboure--gonzález-2021-on-the-womens-march&quot;&gt;Larreboure &amp;amp; González (2021) on the Women’s March&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://mlarreboure.com/womenmarch.pdf&quot;&gt;Larreboure &amp;amp; González (2021)&lt;/a&gt;&lt;sup id=&quot;fnref:14:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt; attempted to use the rainfall method to predict whether the 2017 Women’s March affected how many votes went to woman candidates in the 2018 election. I say “attempted” because they found that rainfall did not predict Women’s March attendance. So instead, they used “weather shocks” to predict voting outcomes. These shocks were defined as a combination of weather-related factors that they chose using a &lt;a href=&quot;https://arxiv.org/abs/1012.1297&quot;&gt;LASSO&lt;/a&gt;&lt;sup id=&quot;fnref:15&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:15&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;19&lt;/a&gt;&lt;/sup&gt; regression model.&lt;/p&gt;
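
&lt;p&gt;Roughly, the LASSO step looks like this (an illustrative sketch with invented weather variables, not the paper’s actual feature set):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustrative sketch: let LASSO pick which weather variables predict
# march attendance. Variable names are invented.
import pandas as pd
from sklearn.linear_model import LassoCV

df = pd.read_csv('weather.csv')
weather = df[['rain', 'temperature', 'wind_speed', 'snow']]

# LASSO shrinks unhelpful coefficients to exactly zero; the surviving
# variables define the composite "weather shock".
lasso = LassoCV(cv=5).fit(weather, df['march_attendance'])
print(list(weather.columns[lasso.coef_ != 0]))
&lt;/code&gt;&lt;/pre&gt;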

&lt;p&gt;I see no obvious problem with the “weather shocks” method, but I’m wary of adding more mathematical complexity. Complexity makes flaws harder to spot.&lt;/p&gt;

&lt;p&gt;Larreboure &amp;amp; González found that protests increased voter turnout and vote share to women for both Democratic and Republican candidates.&lt;/p&gt;

&lt;p&gt;The authors accounted for spatial autocorrelation by clustering standard errors at the state level. They included a robustness check where they adjusted for spatial autocorrelation using the method from &lt;a href=&quot;https://doi.org/10.1016/S0304-4076(98)00084-0&quot;&gt;Conley (1999)&lt;/a&gt;&lt;sup id=&quot;fnref:43&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:43&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;20&lt;/a&gt;&lt;/sup&gt; with two different distance cutoffs, 50 km and 100 km (in Table A.8).&lt;/p&gt;

&lt;p&gt;This paper had at least two inconsistencies in its reported figures:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Page 13 says an additional 1% of the population protesting increased vote share for women and under-represented groups by 12.95 percentage points (pp). However, Table 4 on page 28 reports an increase of 12.70 pp.&lt;sup id=&quot;fnref:16&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:16&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;21&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;A more minor error, but page 13 says the 12.95 pp number is “remarkably similar to the impact of the Tea Party protesters on the vote share of the Republican Party (i.e. 12.59)”. However, the 12.59 number from Madestam et al. (see &lt;a href=&quot;#madestam-et-al-2013-on-tea-party-protests&quot;&gt;above&lt;/a&gt;) is the change in &lt;em&gt;absolute votes&lt;/em&gt;, not vote &lt;em&gt;share&lt;/em&gt;. The reported change in vote &lt;em&gt;share&lt;/em&gt; was 18.81, which is not remarkably similar to 12.95.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I will take the 12.70 number reported in Table 4 as correct (it is repeated in the robustness checks).&lt;/p&gt;

&lt;p&gt;To be conservative, in my meta-analysis I will use the figures from the 50 km robustness check (where available) because they had the largest standard errors.&lt;/p&gt;

&lt;h2 id=&quot;hungerman--moorthy-2023-on-earth-day&quot;&gt;Hungerman &amp;amp; Moorthy (2023) on Earth Day&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.aeaweb.org/content/file?id=16104&quot;&gt;Hungerman &amp;amp; Moorthy (2023)&lt;/a&gt;&lt;sup id=&quot;fnref:36:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:36&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt; found that rainfall on the inaugural Earth Day, April 22, 1970, could predict people’s environmental attitudes on surveys from 1977 to 1993.&lt;/p&gt;

&lt;p&gt;It also directly measured environmental impact by looking at pollutant levels and rates of birth defects (which can result from exposure to environmental contaminants). It found that rainfall on Earth Day could predict birth defects.&lt;/p&gt;

&lt;p&gt;The paper claims that rainfall predicted carbon monoxide levels, and it did find a statistically significant change. However, Appendix Table A3 examines five environmental contaminants, of which only carbon monoxide had a t-stat above 2, and two out of five outcomes were (slightly) negative. The positive effect on carbon monoxide may be a false positive.&lt;/p&gt;

&lt;p&gt;Earlier I showed this chart:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Earth-Day-birth-defects.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The chart shows that rainfall on April 22, 1970 (Earth Day) predicts the rate of birth defects 10 years later, but rainfall on any other day does not.&lt;/p&gt;

&lt;p&gt;The same chart for the effect of rainfall on support for environmental spending:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Earth-Day-rain.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The paper addresses the previously-mentioned spatial autocorrelation problem using the same techniques as Madestam et al. (2013). If spatial autocorrelation were distorting the effect sizes, we would expect to see more spurious statistically significant outcomes on the charts above. But we only see large effect sizes on Earth Day, not on any other day, which indicates that spatial autocorrelation is not a problem.&lt;/p&gt;

&lt;p&gt;Like Madestam et al. (2013), the authors generated hundreds of additional “placebo tests” (as described &lt;a href=&quot;#placebo-tests&quot;&gt;above&lt;/a&gt;) where they looked at how well rainfall on different random days could predict environmental outcomes. They found that the placebo p-values were very similar to the original p-values (and even lower in some cases):&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Earth-Day-Figure-6.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The paper’s source code and data are &lt;a href=&quot;https://doi.org/10.3886/E144941V1&quot;&gt;publicly available&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&quot;meta-analysis&quot;&gt;Meta-analysis&lt;/h1&gt;

&lt;p&gt;For two of the five natural experiments, I calculated expected change in number of votes for each additional protester, or change in vote share per protester (defined as votes per protester divided by turnout):&lt;/p&gt;
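
&lt;p&gt;Concretely, the conversion is one line. A sketch using the Tea Party figures from Table 2 below, where the turnout value is back-implied from the two reported numbers rather than taken from the paper:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Vote share per protester = votes per protester / turnout.
votes_per_protester = 12.59    # Tea Party, Table 2
turnout = 12.59 / 18.81        # ≈ 0.669, back-implied; not from the paper
print(votes_per_protester / turnout)  # 18.81, the Tea Party vote share figure
&lt;/code&gt;&lt;/pre&gt;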

&lt;div id=&quot;table-2&quot; style=&quot;text-align:center;&quot;&gt;Table 2: Change in Votes Per Protester&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Protest&lt;/th&gt;
      &lt;th&gt;Votes&lt;/th&gt;
      &lt;th&gt;Std Err&lt;/th&gt;
      &lt;th&gt;Vote Share&lt;/th&gt;
      &lt;th&gt;Std Err&lt;/th&gt;
      &lt;th&gt;n&lt;/th&gt;
      &lt;th&gt;Source&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Tea Party&lt;/td&gt;
      &lt;td&gt;12.59&lt;/td&gt;
      &lt;td&gt;4.21&lt;/td&gt;
      &lt;td&gt;18.81&lt;/td&gt;
      &lt;td&gt;7.85&lt;/td&gt;
      &lt;td&gt;2758&lt;/td&gt;
      &lt;td&gt;Table VI&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Women’s March&lt;/td&gt;
      &lt;td&gt;3*&lt;/td&gt;
      &lt;td&gt;**&lt;/td&gt;
      &lt;td&gt;9.62&lt;/td&gt;
      &lt;td&gt;4.47&lt;/td&gt;
      &lt;td&gt;2936&lt;/td&gt;
      &lt;td&gt;Table A.8 and page 3&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;*only one significant figure was provided&lt;/p&gt;

&lt;p&gt;**not reported&lt;/p&gt;

&lt;p&gt;I did not include the Earth Day or Civil Rights protests because the studies did not provide the relevant data.&lt;sup id=&quot;fnref:57&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:57&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;22&lt;/a&gt;&lt;/sup&gt; The BLM study reported vote share per protester, but I excluded it due to the study’s failure to establish causality, discussed &lt;a href=&quot;#failed-placebo-tests&quot;&gt;previously&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I aggregated the results by applying a &lt;a href=&quot;https://de.meta-analysis.com/download/Intro_Models.pdf&quot;&gt;random-effects model&lt;/a&gt;. According to these two studies, protests have a mean impact of +11.95 vote share per protester (standard error 4.00; likelihood ratio 87.1; p &amp;lt; 0.003).&lt;/p&gt;

&lt;p&gt;(The &lt;a href=&quot;https://arbital.greaterwrong.com/p/likelihoods_not_pvalues/&quot;&gt;likelihood ratio&lt;/a&gt; tells us how much evidence the data provides. A likelihood ratio of 87.1 means that, assuming the studies’ methodology is perfect, the odds of getting this result are 87.1x higher if the true mean is 11.95 than if the true mean is 0.)&lt;/p&gt;
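
&lt;p&gt;To make the aggregation concrete, here is a minimal pooling sketch using the DerSimonian-Laird estimator, one standard random-effects method (my actual implementation is in the &lt;a href=&quot;#source-code&quot;&gt;source code&lt;/a&gt;). Fed the two rows of &lt;a href=&quot;#table-2&quot;&gt;Table 2&lt;/a&gt;, it reproduces the figures above:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

def dl_pool(means, ses):
    # Random-effects pooling via the DerSimonian-Laird estimator.
    means, ses = np.asarray(means), np.asarray(ses)
    w = 1 / ses**2                        # inverse-variance weights
    fe_mean = np.sum(w * means) / np.sum(w)
    q = np.sum(w * (means - fe_mean)**2)  # Cochran&apos;s Q
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (len(means) - 1)) / c)  # between-study variance
    w_re = 1 / (ses**2 + tau2)            # random-effects weights
    mean = np.sum(w_re * means) / np.sum(w_re)
    return mean, np.sqrt(1 / np.sum(w_re)), np.sqrt(tau2)

# Vote share per protester, Table 2: Tea Party and Women&apos;s March.
mean, se, tau = dl_pool([18.81, 9.62], [7.85, 4.47])
print(round(mean, 2), round(se, 2), round(tau, 2))  # 11.95 4.0 1.19
&lt;/code&gt;&lt;/pre&gt;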

&lt;p&gt;If we are considering supporting some upcoming protest, we might want to estimate the probability that it will backfire. One way to do that is by using the pooled sample of past protests.&lt;/p&gt;

&lt;p&gt;This pooled sample has a between-study standard deviation of 1.19, which reflects how much the effectiveness of protests varied across the studies. If we assume that the sample’s mean and between-study variation are exactly correct (which is questionable, since the pool only includes two studies), then we can model protest outcomes as a normal distribution with a mean of 11.95 and a standard deviation of 1.19.&lt;/p&gt;

&lt;p&gt;Under this model, the probability of a protest having a negative effect—i.e., producing a value less than zero—is extremely small. But I would not take these precise numbers too seriously.&lt;/p&gt;
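
&lt;p&gt;The calculation itself is one line (the mean sits about ten standard deviations above zero):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from scipy.stats import norm

# P(effect &amp;lt; 0) under Normal(mean=11.95, sd=1.19): on the order of 1e-23.
print(norm.cdf(0, loc=11.95, scale=1.19))
&lt;/code&gt;&lt;/pre&gt;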

&lt;p&gt;Vote share per protester is the most interesting metric for my purposes because it gives information about cost-effectiveness—it tells you how much impact you can expect for each marginal protester. But the natural experiments reported on other outcomes as well, such as overall change in vote share (as determined by changes in rainfall) and popular support for protesters’ objectives.&lt;/p&gt;

&lt;p&gt;I applied a random-effects model to aggregate a few different sets of outcomes:&lt;/p&gt;

&lt;div id=&quot;table-3&quot; style=&quot;text-align:center;&quot;&gt;Table 3: Pooled Sample Outcomes&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Outcomes&lt;/th&gt;
      &lt;th&gt;Mean&lt;/th&gt;
      &lt;th&gt;Std Err&lt;/th&gt;
      &lt;th&gt;likelihood ratio&lt;/th&gt;
      &lt;th&gt;p-value&lt;/th&gt;
      &lt;th&gt;P(negative effect)&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share Per Protester&lt;/td&gt;
      &lt;td&gt;11.95&lt;/td&gt;
      &lt;td&gt;4.00&lt;/td&gt;
      &lt;td&gt;87.1&lt;/td&gt;
      &lt;td&gt;0.003&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share&lt;/td&gt;
      &lt;td&gt;1.59&lt;/td&gt;
      &lt;td&gt;0.48&lt;/td&gt;
      &lt;td&gt;257&lt;/td&gt;
      &lt;td&gt;0.001&lt;/td&gt;
      &lt;td&gt;0.002&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share (Rain Only)&lt;/td&gt;
      &lt;td&gt;1.14&lt;/td&gt;
      &lt;td&gt;0.42&lt;/td&gt;
      &lt;td&gt;39.3&lt;/td&gt;
      &lt;td&gt;0.007&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Single Hypothesis&lt;/td&gt;
      &lt;td&gt;1.06&lt;/td&gt;
      &lt;td&gt;0.78&lt;/td&gt;
      &lt;td&gt;2.55&lt;/td&gt;
      &lt;td&gt;0.172&lt;/td&gt;
      &lt;td&gt;0.199&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Favorability&lt;/td&gt;
      &lt;td&gt;2.68&lt;/td&gt;
      &lt;td&gt;2.32&lt;/td&gt;
      &lt;td&gt;1.95&lt;/td&gt;
      &lt;td&gt;0.249&lt;/td&gt;
      &lt;td&gt;0.176&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;ul&gt;
  &lt;li&gt;Row 1 – Vote Share Per Protester uses the pooled outcome that I described in &lt;a href=&quot;#table-2&quot;&gt;Table 2&lt;/a&gt;, including Tea Party and Women’s March vote share per protester.&lt;/li&gt;
  &lt;li&gt;Row 2 – Vote Share takes these outcomes from the studies on nonviolent protests:
    &lt;ul&gt;
      &lt;li&gt;Tea Party – Republican vote share&lt;/li&gt;
      &lt;li&gt;Women’s March – women’s vote share&lt;/li&gt;
      &lt;li&gt;Earth Day – favorability (1)&lt;sup id=&quot;fnref:21&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:21&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;23&lt;/a&gt;&lt;/sup&gt; as a proxy for vote share&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Row 3 – Vote Share (Rain Only) uses the same outcomes as Row 2, but excluding the Women’s March outcome because it used weather shocks rather than rainfall.&lt;/li&gt;
  &lt;li&gt;Row 4 – Single Hypothesis does not differentiate between nonviolent and violent protests, instead lumping all studies together. It includes the three Vote Share measures from Row 2, plus Civil Rights – vote share.&lt;/li&gt;
  &lt;li&gt;Row 5 – Favorability includes measured changes in popular support for a protest’s goals:
    &lt;ul&gt;
      &lt;li&gt;Tea Party – support for the Tea Party&lt;/li&gt;
      &lt;li&gt;Earth Day – favorability (1)&lt;sup id=&quot;fnref:21:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:21&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;23&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(Note: &lt;code&gt;P(negative effect) = 0&lt;/code&gt; doesn’t mean it’s &lt;em&gt;literally&lt;/em&gt; zero, but it’s so small that it gets rounded off to zero.)&lt;/p&gt;

&lt;p&gt;The Women’s March and Earth Day papers used continuous rainfall variables instead of binary (rain vs. no rain); those papers’ outcomes were standardized using the method from &lt;a href=&quot;/materials/gelman2008.pdf&quot;&gt;Gelman (2007)&lt;/a&gt;&lt;sup id=&quot;fnref:58&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:58&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;24&lt;/a&gt;&lt;/sup&gt; to put them on the same scale as binary variables.&lt;sup id=&quot;fnref:59&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:59&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;25&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
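
&lt;p&gt;The heart of that method is dividing a continuous input by two standard deviations, which makes its regression coefficient comparable to the coefficient on a binary input (a 50/50 binary variable has a standard deviation of 0.5, so a two-standard-deviation change spans its full range). A sketch with hypothetical numbers:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

# Gelman&apos;s rule: a coefficient on x / (2 * sd(x)) is comparable to a
# coefficient on a binary predictor. Equivalently, rescale after the fact:
def binary_equivalent_coef(coef_continuous, x):
    return coef_continuous * 2 * np.std(x)

rainfall = np.array([0.0, 0.1, 0.3, 0.0, 1.2, 0.5])  # hypothetical inches
print(binary_equivalent_coef(0.8, rainfall))         # hypothetical coefficient
&lt;/code&gt;&lt;/pre&gt;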

&lt;p&gt;The Vote Share Per Protester and Vote Share tests produce low p-values/high likelihood ratios, and under those models, nonviolent protests have virtually no chance of having a negative effect on support. Favorability has a weak likelihood ratio due to a large variance between outcomes.&lt;sup id=&quot;fnref:60&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:60&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;26&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Under the Single Hypothesis model, protests have a much weaker p-value/likelihood ratio. Naturally, when you include a negative outcome, it pulls down the average effect quite a bit. The mean is still positive, which makes sense given that only one out of four included protests was violent.&lt;/p&gt;

&lt;p&gt;Is it fair to separate out violent and nonviolent protests? I’m wary of adding complexity to a hypothesis but I believe it’s justified in this case:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;It’s intuitively plausible that peaceful protests would earn support while violence would backfire.&lt;/li&gt;
  &lt;li&gt;Lab experiments&lt;sup id=&quot;fnref:50:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:50&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; and observational studies support treating the two categories separately.&lt;/li&gt;
  &lt;li&gt;I ran a t-test for the hypothesis that nonviolent and violent protests have the same effect on voting outcomes, comparing the pooled outcome from Row 2 – Vote Share against the Civil Rights protest outcome. The result had a likelihood ratio of 55.1 and p &amp;lt; 0.005. We can strongly reject the hypothesis that these two samples have the same mean. (A sketch of this comparison follows the list.)&lt;/li&gt;
&lt;/ul&gt;
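
&lt;p&gt;Here is a minimal version of that comparison, using the pooled Vote Share estimate from &lt;a href=&quot;#table-3&quot;&gt;Table 3&lt;/a&gt; and the Civil Rights outcome from &lt;a href=&quot;#table-4&quot;&gt;Table 4&lt;/a&gt;. With thousands of counties per study, the t-distribution is effectively normal, so a z-test recovers essentially the same p-value:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from math import sqrt
from scipy.stats import norm

# Pooled nonviolent vote share (1.59, SE 0.48) vs. the violent Civil
# Rights outcome (-5.56, SE 2.48), treated as independent estimates.
diff = 1.59 - (-5.56)
se = sqrt(0.48**2 + 2.48**2)
z = diff / se
p = 2 * norm.sf(z)  # two-sided
print(round(z, 2), round(p, 4))  # 2.83, 0.0047
&lt;/code&gt;&lt;/pre&gt;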

&lt;p&gt;There are some reasons to believe these results may be overstated, which I will address under &lt;a href=&quot;#potential-problems-with-the-research&quot;&gt;Potential problems with the research&lt;/a&gt;. There are also at least two reasons to believe they may be understated:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Rainfall does not perfectly predict whether protests occur. (Sometimes people protest in the rain.) If protests genuinely work, then the effect of protests will be larger than the effect of protests &lt;em&gt;as predicted by rainfall.&lt;/em&gt;&lt;/li&gt;
  &lt;li&gt;I aggregated the most similar metrics into pooled outcomes. But these were not always the strongest metrics. For example, Earth Day protests strongly predicted birth defects (likelihood ratio 55,000; p &amp;lt; 3e-6). But I did not include birth defects in the meta-analysis because it did not have any comparable counterpart in the other studies.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Table 4 shows outcomes across the five studies, estimated by looking at counties where it rained vs. did not rain. The Women’s March and Earth Day results are standardized as explained above.&lt;sup id=&quot;fnref:59:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:59&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;25&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;div id=&quot;table-4&quot; style=&quot;text-align:center;&quot;&gt;Table 4: Societal-Level Protest Outcomes&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Protest&lt;/th&gt;
      &lt;th&gt;Outcome&lt;/th&gt;
      &lt;th&gt;Change&lt;/th&gt;
      &lt;th&gt;Std Err&lt;/th&gt;
      &lt;th&gt;Source&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Tea Party&lt;/td&gt;
      &lt;td&gt;votes (as % of population)&lt;/td&gt;
      &lt;td&gt;1.04%**&lt;/td&gt;
      &lt;td&gt;0.30%&lt;/td&gt;
      &lt;td&gt;Table VI&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Tea Party&lt;/td&gt;
      &lt;td&gt;vote share&lt;/td&gt;
      &lt;td&gt;1.55%*&lt;/td&gt;
      &lt;td&gt;0.69%&lt;/td&gt;
      &lt;td&gt;Table VI&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Tea Party&lt;/td&gt;
      &lt;td&gt;conservative vote score&lt;sup id=&quot;fnref:30&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:30&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;27&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;1.922*&lt;/td&gt;
      &lt;td&gt;0.937&lt;/td&gt;
      &lt;td&gt;Table VII&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Tea Party&lt;/td&gt;
      &lt;td&gt;average belief effect&lt;sup id=&quot;fnref:31&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:31&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;28&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;0.13***&lt;/td&gt;
      &lt;td&gt;0.037&lt;/td&gt;
      &lt;td&gt;Table V&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Tea Party&lt;/td&gt;
      &lt;td&gt;strongly supports Tea Party&lt;/td&gt;
      &lt;td&gt;5.7%*&lt;/td&gt;
      &lt;td&gt;2.5%&lt;/td&gt;
      &lt;td&gt;Table V&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Tea Party&lt;/td&gt;
      &lt;td&gt;Sarah Palin favorability&lt;/td&gt;
      &lt;td&gt;5.7%*&lt;/td&gt;
      &lt;td&gt;2.6%&lt;/td&gt;
      &lt;td&gt;Table V&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Tea Party&lt;/td&gt;
      &lt;td&gt;“outraged about way things are going in country”&lt;/td&gt;
      &lt;td&gt;4.6%*&lt;/td&gt;
      &lt;td&gt;2.1%&lt;/td&gt;
      &lt;td&gt;Table V&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Tea Party&lt;/td&gt;
      &lt;td&gt;opposes raising taxes on income &amp;gt;$250K&lt;/td&gt;
      &lt;td&gt;5.8%&lt;/td&gt;
      &lt;td&gt;3.0%&lt;/td&gt;
      &lt;td&gt;Table V&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Tea Party&lt;/td&gt;
      &lt;td&gt;“Americans have less freedom than in 2008”&lt;/td&gt;
      &lt;td&gt;6.5%*&lt;/td&gt;
      &lt;td&gt;2.6%&lt;/td&gt;
      &lt;td&gt;Table V&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Tea Party&lt;/td&gt;
      &lt;td&gt;Obama unfavorability&lt;/td&gt;
      &lt;td&gt;4.6%&lt;/td&gt;
      &lt;td&gt;2.4%&lt;/td&gt;
      &lt;td&gt;Table V&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Civil Rights (violent)&lt;/td&gt;
      &lt;td&gt;vote share among white voters&lt;/td&gt;
      &lt;td&gt;–5.56%*&lt;/td&gt;
      &lt;td&gt;2.48%&lt;/td&gt;
      &lt;td&gt;Appendix, Table 12&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;BLM&lt;/td&gt;
      &lt;td&gt;vote share&lt;/td&gt;
      &lt;td&gt;2.7%&lt;/td&gt;
      &lt;td&gt;1.2%&lt;/td&gt;
      &lt;td&gt;Table 2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;BLM&lt;/td&gt;
      &lt;td&gt;“Blacks should not receive special favors”&lt;/td&gt;
      &lt;td&gt;–0.242&lt;sup id=&quot;fnref:33&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:33&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;29&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;0.360&lt;/td&gt;
      &lt;td&gt;Table 3&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;BLM&lt;/td&gt;
      &lt;td&gt;“Slavery caused current disparities”&lt;/td&gt;
      &lt;td&gt;0.339&lt;sup id=&quot;fnref:33:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:33&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;29&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;0.388&lt;/td&gt;
      &lt;td&gt;Table 3&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Women’s March&lt;/td&gt;
      &lt;td&gt;women’s vote share&lt;/td&gt;
      &lt;td&gt;2.48%***&lt;/td&gt;
      &lt;td&gt;0.64%&lt;/td&gt;
      &lt;td&gt;Table 4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Women’s March&lt;/td&gt;
      &lt;td&gt;voter turnout&lt;/td&gt;
      &lt;td&gt;0.41%**&lt;/td&gt;
      &lt;td&gt;0.14%&lt;/td&gt;
      &lt;td&gt;Table 4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Earth Day&lt;/td&gt;
      &lt;td&gt;favorability (1)&lt;sup id=&quot;fnref:21:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:21&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;23&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;0.90%&lt;/td&gt;
      &lt;td&gt;0.53%&lt;/td&gt;
      &lt;td&gt;Table 2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Earth Day&lt;/td&gt;
      &lt;td&gt;favorability (1)&lt;sup id=&quot;fnref:21:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:21&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;23&lt;/a&gt;&lt;/sup&gt; among under-20s&lt;/td&gt;
      &lt;td&gt;1.67%**&lt;/td&gt;
      &lt;td&gt;0.62%&lt;/td&gt;
      &lt;td&gt;Table 2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Earth Day&lt;/td&gt;
      &lt;td&gt;favorability (2)&lt;sup id=&quot;fnref:21:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:21&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;23&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;1.12&lt;/td&gt;
      &lt;td&gt;0.70&lt;/td&gt;
      &lt;td&gt;Table 2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Earth Day&lt;/td&gt;
      &lt;td&gt;favorability (2)&lt;sup id=&quot;fnref:21:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:21&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;23&lt;/a&gt;&lt;/sup&gt; among under-20s&lt;/td&gt;
      &lt;td&gt;1.90*&lt;/td&gt;
      &lt;td&gt;0.82&lt;/td&gt;
      &lt;td&gt;Table 2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Earth Day&lt;/td&gt;
      &lt;td&gt;carbon monoxide&lt;/td&gt;
      &lt;td&gt;0.07*&lt;/td&gt;
      &lt;td&gt;0.03&lt;/td&gt;
      &lt;td&gt;Table 4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Earth Day&lt;/td&gt;
      &lt;td&gt;birth defects&lt;/td&gt;
      &lt;td&gt;1.00***&lt;/td&gt;
      &lt;td&gt;0.21&lt;/td&gt;
      &lt;td&gt;Table 4&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;*p &amp;lt; 0.05; **p &amp;lt; 0.01; ***p &amp;lt; 0.001&lt;/p&gt;

&lt;h1 id=&quot;potential-problems-with-the-research&quot;&gt;Potential problems with the research&lt;/h1&gt;

&lt;h2 id=&quot;spatial-autocorrelation&quot;&gt;Spatial autocorrelation&lt;/h2&gt;

&lt;p&gt;Recall that “spatial autocorrelation” is a technical way of saying “rainfall is not independent across counties”. If you assume your samples are independent when they’re not, your standard errors will be too low—giving you too much confidence in your results.&lt;/p&gt;

&lt;p&gt;It’s conceivable that all five studies overstated the strength of their results due to spatial autocorrelation.&lt;/p&gt;

&lt;p&gt;Each study on nonviolent protests used at least some technique to correct for spatial autocorrelation. &lt;a href=&quot;#madestam-et-al-2013-on-tea-party-protests&quot;&gt;Madestam et al. (2013)&lt;/a&gt; and &lt;a href=&quot;#hungerman--moorthy-2023-on-earth-day&quot;&gt;Hungerman &amp;amp; Moorthy (2023)&lt;/a&gt; included “placebo tests”. The placebo tests from Madestam et al. (2013) indicated that these corrections mostly worked but did not fully succeed, whereas Hungerman &amp;amp; Moorthy’s corrections apparently did succeed. On balance, this suggests that the standard errors of the pooled outcome may be understated, but probably not by a large margin.&lt;/p&gt;

&lt;p&gt;Two of the pooled outcomes from &lt;a href=&quot;#table-3&quot;&gt;Table 3&lt;/a&gt;—the Vote Share Per Protester and Vote Share pools—had strong likelihood ratios / low p-values. That suggests they should hold up even with somewhat reduced statistical power.&lt;/p&gt;

&lt;h2 id=&quot;publication-bias&quot;&gt;Publication bias&lt;/h2&gt;

&lt;p&gt;The standard method to assess &lt;a href=&quot;https://en.wikipedia.org/wiki/Publication_bias&quot;&gt;publication bias&lt;/a&gt; would be to make a &lt;a href=&quot;https://en.wikipedia.org/wiki/Funnel_plot&quot;&gt;funnel plot&lt;/a&gt;. I didn’t do that for two reasons:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;With only five studies (at best), there aren’t enough data points to detect publication bias even if it exists.&lt;/li&gt;
  &lt;li&gt;A funnel plot only works if your studies cover a range of sample sizes. All the natural experiments have roughly the same sample size (because they all look at county-level data for the majority of US counties).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As an alternative, I tested how the results might change if we discovered some unpublished null results. I used the following procedure (sketched in code after the list):&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Choose one of the pooled outcomes from &lt;a href=&quot;#table-3&quot;&gt;Table 3&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;For each individual study outcome, clone it to create a “dummy null outcome” with the same standard error and sample size, but a mean of 0. This represents a hypothetical study that didn’t get published because it found a null result.&lt;/li&gt;
  &lt;li&gt;Construct a larger pooled sample using all four or six outcomes (the two or three real outcomes plus the two or three null dummies).&lt;/li&gt;
&lt;/ol&gt;
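
&lt;p&gt;Reusing the &lt;code&gt;dl_pool&lt;/code&gt; sketch from the &lt;a href=&quot;#meta-analysis&quot;&gt;meta-analysis section&lt;/a&gt;, the procedure for the Vote Share Per Protester pool looks like this, and it reproduces the first row of Table 5:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Clone each real outcome as a zero-mean dummy with the same standard
# error, then re-pool everything with the random-effects model.
means = [18.81, 9.62]   # Tea Party, Women&apos;s March (Table 2)
ses = [7.85, 4.47]
dummies = [0.0] * len(means)

mean, se, tau = dl_pool(means + dummies, ses + ses)
print(round(mean, 2), round(se, 2))  # 6.43 4.01, as in Table 5
&lt;/code&gt;&lt;/pre&gt;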

&lt;div id=&quot;table-5&quot; style=&quot;text-align:center;&quot;&gt;Table 5: Pooled Sample Effects, Adjusted for Publication Bias&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Outcomes&lt;/th&gt;
      &lt;th&gt;Mean&lt;/th&gt;
      &lt;th&gt;Std Err&lt;/th&gt;
      &lt;th&gt;likelihood ratio&lt;/th&gt;
      &lt;th&gt;p-value&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share Per Protester&lt;/td&gt;
      &lt;td&gt;6.43&lt;/td&gt;
      &lt;td&gt;4.01&lt;/td&gt;
      &lt;td&gt;3.61&lt;/td&gt;
      &lt;td&gt;0.11&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share&lt;/td&gt;
      &lt;td&gt;0.80&lt;/td&gt;
      &lt;td&gt;0.41&lt;/td&gt;
      &lt;td&gt;6.97&lt;/td&gt;
      &lt;td&gt;0.049&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share (Rain Only)&lt;/td&gt;
      &lt;td&gt;0.58&lt;/td&gt;
      &lt;td&gt;0.36&lt;/td&gt;
      &lt;td&gt;3.77&lt;/td&gt;
      &lt;td&gt;0.104&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Favorability&lt;/td&gt;
      &lt;td&gt;0.74&lt;/td&gt;
      &lt;td&gt;0.65&lt;/td&gt;
      &lt;td&gt;1.91&lt;/td&gt;
      &lt;td&gt;0.256&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Adding in null results considerably weakens the strength of evidence.&lt;/p&gt;

&lt;p&gt;This approach is deliberately conservative. I wouldn’t say this meta-analysis is robust to publication bias, but it’s not particularly vulnerable to publication bias, either.&lt;/p&gt;

&lt;p&gt;(The dummy-null approach leaves something to be desired. If the true mean were 6.43 as the pooled sample suggests, it would be surprising to see every real result come out positive with a low p-value while an equal number of studies with equally tight standard errors found nothing. But I haven’t thought of any better ideas for how to test publication bias.)&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/materials/Protest Meta-Analysis.pdf&quot;&gt;Orazani et al. (2021)&lt;/a&gt;&lt;sup id=&quot;fnref:50:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:50&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; reviewed lab experiments on protest favorability. Among other things, it looked at publication bias. This paper might be informative, since it stands to reason that if experimental researchers on protests have a certain bias, then sociological researchers might have a similar bias.&lt;/p&gt;

&lt;p&gt;The paper included a funnel plot:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Orazani-funnel-plot.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;To supplement the plot, I tested for publication bias using two statistical tests (sketched in code below):&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://training.cochrane.org/resource/identifying-publication-bias-meta-analyses-continuous-outcomes&quot;&gt;Egger’s regression test&lt;/a&gt;&lt;sup id=&quot;fnref:51&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:51&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;30&lt;/a&gt;&lt;/sup&gt; found r = 0.124, p &amp;lt; 0.646 (r &amp;gt; 0 means that more powerful studies had &lt;em&gt;larger&lt;/em&gt; mean effects, which if anything is evidence of inverse publication bias).&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient#Hypothesis_test&quot;&gt;Kendall’s tau test&lt;/a&gt; found p &amp;lt; 0.565.&lt;/li&gt;
&lt;/ul&gt;
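
&lt;p&gt;For illustration, here is one common way to run these two tests given a list of effect sizes and standard errors (made-up numbers, not Orazani et al.’s data): correlate effect size with precision for the Egger-style test, and rank-correlate effect size with variance for the Kendall-style test.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np
from scipy.stats import kendalltau, pearsonr

# Made-up effect sizes and standard errors, for illustration only.
effects = np.array([0.39, 0.22, 0.31, 0.45, 0.18, 0.28])
ses = np.array([0.10, 0.12, 0.08, 0.15, 0.09, 0.11])

# Egger-style: do more precise (higher-powered) studies report different
# effects? r &amp;gt; 0 means more powerful studies found larger effects.
r, p_egger = pearsonr(1 / ses, effects)

# Kendall-style (Begg): rank correlation between effects and variances.
tau, p_kendall = kendalltau(effects, ses**2)
print(r, p_egger, tau, p_kendall)
&lt;/code&gt;&lt;/pre&gt;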

&lt;p&gt;Orazani et al. (2021) included 14 experiments and 2 non-experimental studies. I also tested for publication bias when excluding the non-experiments and again found highly insignificant p-values.&lt;/p&gt;

&lt;p&gt;Orazani et al. (2021) tested for a difference between published and unpublished studies (although they defined “unpublished” in a way that seemed strange to me—they counted dissertations and conference presentations as unpublished). They found a significant difference in effect size, suggesting the presence of publication bias. Published studies had a &lt;a href=&quot;https://en.wikipedia.org/wiki/Effect_size#Cohen&apos;s_d&quot;&gt;Cohen’s d&lt;/a&gt; of 0.39, versus 0.22 for unpublished studies. However, this difference disappeared when the authors controlled for certain features of the protests being studied (e.g. protests directed at the government as opposed to society). I am not sure what to make of this, but there is at least &lt;em&gt;some&lt;/em&gt; evidence of publication bias.&lt;/p&gt;

&lt;h2 id=&quot;data-fabrication&quot;&gt;Data fabrication&lt;/h2&gt;

&lt;p&gt;Most meta-analyses do not consider the possibility that some studies’ data might be fabricated, and I believe they should. Checking for fraud is difficult in general, but I will do some basic checks.&lt;/p&gt;

&lt;p&gt;When humans fabricate data, they often come up with numbers that don’t look random. Real data should follow two observable patterns:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The last digits of numbers should be uniformly distributed.&lt;/li&gt;
  &lt;li&gt;The first digits of numbers should NOT be uniformly distributed. Instead, they should obey &lt;a href=&quot;https://en.wikipedia.org/wiki/Benford&apos;s_law&quot;&gt;Benford’s law&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I tested for suspicious patterns by collecting a list of statistical results (means and standard errors for various outcomes) from the BLM, Tea Party, Women’s March, and Earth Day papers. I did not include the Civil Rights paper because its quasi-experimental data only included violent protests.&lt;/p&gt;
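
&lt;p&gt;A minimal sketch of both digit checks, assuming the collected results are kept as strings exactly as reported so that trailing zeros survive (my actual tests are in the &lt;a href=&quot;#source-code&quot;&gt;source code&lt;/a&gt;):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np
from scipy.stats import chisquare

def digit_tests(values):
    # values: reported numbers as strings, e.g. [&apos;12.59&apos;, &apos;4.21&apos;, ...]
    digits = [[c for c in v if c.isdigit()] for v in values]
    first = [int(next(c for c in d if c != &apos;0&apos;)) for d in digits]
    last = [int(d[-1]) for d in digits]

    # First digits should follow Benford&apos;s law...
    benford = np.log10(1 + 1 / np.arange(1, 10))
    observed = np.bincount(first, minlength=10)[1:]
    p_benford = chisquare(observed, benford * len(first)).pvalue

    # ...and last digits should be uniform on 0-9 (chisquare&apos;s default).
    p_uniform = chisquare(np.bincount(last, minlength=10)).pvalue
    return p_benford, p_uniform
&lt;/code&gt;&lt;/pre&gt;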

&lt;p&gt;I also did a power check to determine whether the tests have adequate statistical power. We should be able to reject the hypotheses that the &lt;em&gt;first&lt;/em&gt; digits follow a uniform distribution, and that the &lt;em&gt;last&lt;/em&gt; digits follow Benford’s law.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;  Tea Party:
      First-digit Benford&apos;s Law p-value: 0.598
      Last-digit uniformity p-value:     0.306
      Power check p-values:              0.001, 0.002

  BLM:
      First-digit Benford&apos;s Law p-value: 0.438
      Last-digit uniformity p-value:     0.598
      Power check p-values:              0.001, 0.001

  Women&apos;s March:
      First-digit Benford&apos;s Law p-value: 0.181
      Last-digit uniformity p-value:     0.891
      Power check p-values:              0.001, 0.001

  Earth Day:
      First-digit Benford&apos;s Law p-value: 0.121
      Last-digit uniformity p-value:     0.224
      Power check p-values:              0.038, 0.001
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;(P-values are rounded up to 3 digits. See &lt;a href=&quot;#source-code&quot;&gt;source code&lt;/a&gt; for full details.)&lt;/p&gt;

&lt;p&gt;In all cases, I found high p-values for the first and last digits, which means the data follow the expected natural patterns. And I found very low p-values for the sanity check tests, which means the tests are sufficiently powerful (except for Earth Day first digits, where few independent outcomes were reported).&lt;/p&gt;

&lt;p&gt;These tests do not rule out more sophisticated fraud. For example, if the authors generated false data and then calculated statistical tests on top of them, the fabricated results would still pass the first-digit and last-digit checks.&lt;/p&gt;

&lt;h2 id=&quot;data-errors&quot;&gt;Data errors&lt;/h2&gt;

&lt;p&gt;Checking for data errors is difficult in general.&lt;sup id=&quot;fnref:54&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:54&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;31&lt;/a&gt;&lt;/sup&gt; I did a basic consistency check to verify that each study’s reported means and standard errors seemed internally consistent, but it’s hard to see errors that way.&lt;/p&gt;

&lt;p&gt;The only data error I noticed was in Larreboure &amp;amp; González (2021)&lt;sup id=&quot;fnref:14:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt;. As I mentioned before, it reported inconsistent numbers for the change in vote share based on each 1% of the population protesting: 12.95 pp (std err 5.63) on page 13 in the text, and 12.70 pp (std err 5.48) in Table 4.&lt;/p&gt;

&lt;p&gt;The difference is small, which suggests the authors may have made some revision to their calculations but didn’t update all the values reported in their manuscript. If so, the number in Table 4 is likely the correct one.&lt;sup id=&quot;fnref:52&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:52&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;32&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;This inconsistency suggests that the authors have some degrees of freedom for &lt;a href=&quot;https://en.wikipedia.org/wiki/Data_dredging&quot;&gt;p-hacking&lt;/a&gt;, but the two numbers are similar enough to have minimal impact on the result of my meta-analysis.&lt;/p&gt;

&lt;h2 id=&quot;will-the-results-generalize&quot;&gt;Will the results generalize?&lt;/h2&gt;

&lt;p&gt;All the protests covered by natural experiments have certain commonalities:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;They all had a large number of participants.&lt;/li&gt;
  &lt;li&gt;They were all nationwide (they had to be, so the study authors could use county-level data).&lt;/li&gt;
  &lt;li&gt;They all took place in the United States.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Will the results generalize to other countries? Will the results generalize to smaller-scale or local protests?&lt;/p&gt;

&lt;p&gt;The fact that these protests were so widespread means their objectives couldn’t have been far outside the &lt;a href=&quot;https://en.wikipedia.org/wiki/Overton_window&quot;&gt;Overton window&lt;/a&gt; (i.e., the range of politically acceptable ideas at the time). Perhaps a protest that advocated for a more radical position would be more likely to backfire. To address this question, perhaps we could look at lab experiments on protests, but that’s beyond the scope of this article.&lt;/p&gt;

&lt;h2 id=&quot;meta-concerns-with-this-meta-analysis&quot;&gt;Meta-concerns with this meta-analysis&lt;/h2&gt;

&lt;p&gt;I have some criticisms of my meta-analysis itself:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;I did not pre-register a methodology. I have limited experience conducting meta-analyses and I was learning as I wrote this article. Realistically, I would not have had the motivation to finish if I’d been required to fully determine a methodology in advance. But the platonic ideal of this meta-analysis would have included a pre-registration.&lt;/li&gt;
  &lt;li&gt;Three of the studies (BLM, Civil Rights, and Earth Day) published their data. A thorough analysis would attempt to replicate those studies’ findings. I did not do that.&lt;/li&gt;
&lt;/ol&gt;

&lt;h1 id=&quot;are-social-change-labs-claims-justified&quot;&gt;Are Social Change Lab’s claims justified?&lt;/h1&gt;

&lt;h2 id=&quot;broad-claims&quot;&gt;Broad claims&lt;/h2&gt;

&lt;p&gt;Social Change Lab’s literature review included a summary of findings, reproduced below.&lt;/p&gt;

&lt;div id=&quot;table-6&quot; style=&quot;text-align:center;&quot;&gt;Table 6: Social Change Lab Findings&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Finding&lt;/th&gt;
      &lt;th&gt;Confidence&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Protest movements can have significant short-term impacts&lt;/td&gt;
      &lt;td&gt;Strong&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Protest movements can achieve intended outcomes in North America and Western Europe&lt;/td&gt;
      &lt;td&gt;Strong&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Protest movements can have significant impacts (2-5% shifts) on voting behaviour and electoral outcomes&lt;/td&gt;
      &lt;td&gt;Medium&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Protest movements can positively influence public opinion (≤10% shifts)&lt;/td&gt;
      &lt;td&gt;Medium&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Protest movements can influence public discourse (e.g. issue salience and media narratives)&lt;/td&gt;
      &lt;td&gt;Medium&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Protest movements can influence policy&lt;/td&gt;
      &lt;td&gt;Low (mixed evidence)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Protest movements can influence policymaker beliefs&lt;/td&gt;
      &lt;td&gt;Low (little evidence)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Protest movements can achieve desired outcomes in the Global South&lt;/td&gt;
      &lt;td&gt;Low (little evidence)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Protest movements can have significant long-term impacts (on public opinion and public discourse)&lt;/td&gt;
      &lt;td&gt;Low (little evidence)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;In this section, I assess whether the natural experiments support each “Strong” and “Medium” claim. I find that the evidence does indeed support the findings, and Social Change Lab’s confidence levels are defensible in each case.&lt;/p&gt;

&lt;p&gt;I do not review the four “Low Confidence” claims because none of the natural experiments attempted to test them. (That fact itself suggests that “Low Confidence” is an accurate label.)&lt;/p&gt;

&lt;p&gt;Starting with the findings rated “Strong”:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;table&gt;
    &lt;tbody&gt;
      &lt;tr&gt;
        &lt;td&gt;Protest movements can have significant short-term impacts&lt;/td&gt;
        &lt;td&gt;&lt;strong&gt;Strong&lt;/strong&gt;&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;Protest movements can achieve intended outcomes in North America and Western Europe&lt;/td&gt;
        &lt;td&gt;&lt;strong&gt;Strong&lt;/strong&gt;&lt;/td&gt;
      &lt;/tr&gt;
    &lt;/tbody&gt;
  &lt;/table&gt;
&lt;/blockquote&gt;

&lt;p&gt;The natural experiments support these claims. There’s also supporting evidence from lab experiments on how protests affect people’s perceptions; studies on media coverage; and observational data on protest outcomes. For a meta-analysis of lab experiments, which I view as the second-strongest form of evidence, see &lt;a href=&quot;/materials/Protest Meta-Analysis.pdf&quot;&gt;Orazani et al. (2021)&lt;/a&gt;&lt;sup id=&quot;fnref:50:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:50&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;I do not have much confidence in most of these lines of evidence, but the natural experiments offer good support:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;All study results point the same direction (as long as we exclude the data on violent protests).&lt;/li&gt;
  &lt;li&gt;The &lt;a href=&quot;#table-3&quot;&gt;pooled outcomes&lt;/a&gt; have high likelihood ratios / low p-values.&lt;/li&gt;
  &lt;li&gt;There are no signs of &lt;a href=&quot;#data-fabrication&quot;&gt;data fabrication&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’m still concerned about &lt;a href=&quot;#publication-bias&quot;&gt;publication bias&lt;/a&gt; and &lt;a href=&quot;#spatial-autocorrelation&quot;&gt;spatial autocorrelation&lt;/a&gt;. I am not sure it is appropriate to describe the evidence as “Strong”. It would be fair to downgrade your confidence to “Medium” based on these concerns. But I also think “Strong” confidence is defensible; the distinction depends on how much weight you give to the hard-to-quantify limitations of the existing evidence.&lt;/p&gt;

&lt;p&gt;The natural experiments all cover nationwide, popular protests in the United States, so it’s not clear that the results &lt;a href=&quot;#will-the-results-generalize&quot;&gt;generalize&lt;/a&gt;. Regardless, Social Change Lab didn’t claim that protests &lt;em&gt;always&lt;/em&gt; have significant impacts, only that they “can” have impact; and the existence of these natural experiments shows that indeed they can.&lt;/p&gt;

&lt;p&gt;The highest-quality studies are all natural experiments, not true experiments. A true experiment would be preferable. But the rainfall method seems sufficient to establish causality so I am comfortable treating these natural experiments’ methodologies as valid.&lt;/p&gt;

&lt;p&gt;Whether this evidence qualifies as “strong” is a matter of debate. Certainly the evidence could be much stronger. But I would be surprised if these findings were overturned, so I think Social Change Lab’s confidence level is fair.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;table&gt;
    &lt;tbody&gt;
      &lt;tr&gt;
        &lt;td&gt;Protest movements can have significant impacts (2-5% shifts) on voting behaviour and electoral outcomes&lt;/td&gt;
        &lt;td&gt;&lt;strong&gt;Medium&lt;/strong&gt;&lt;/td&gt;
      &lt;/tr&gt;
    &lt;/tbody&gt;
  &lt;/table&gt;
&lt;/blockquote&gt;

&lt;p&gt;A 2–5% shift is consistent with the natural experiments, which found changes in vote share ranging from 1.55% to 5.54% (see &lt;a href=&quot;#table-4&quot;&gt;Table 4&lt;/a&gt;). I think 2–5% is fair as an optimistic expectation, given that the natural experiments all covered large nationwide protests.&lt;/p&gt;

&lt;p&gt;I believe the rainfall method is effective at establishing causality, but we can’t be too confident in the magnitude of the effect because rainfall does not perfectly predict protest attendance. So I would not rate the confidence for this finding as higher than “Medium”.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;table&gt;
    &lt;tbody&gt;
      &lt;tr&gt;
        &lt;td&gt;Protest movements can positively influence public opinion (≤10% shifts)&lt;/td&gt;
        &lt;td&gt;&lt;strong&gt;Medium&lt;/strong&gt;&lt;/td&gt;
      &lt;/tr&gt;
    &lt;/tbody&gt;
  &lt;/table&gt;
&lt;/blockquote&gt;

&lt;p&gt;Among the natural experiments, only two (Madestam et al. 2013; Hungerman &amp;amp; Moorthy 2023) reported on public opinion in terms of percentages. Public opinion changes clustered around 5% for the multiple measures in the two studies.&lt;/p&gt;

&lt;p&gt;Klein Teeselink &amp;amp; Melios (2021) reported changes in public opinion on a 5-point scale. Rainfall predicted changes of 0.242 and 0.339 on two different questions, which correspond to percentage changes of about 6% and 8.5% after dividing by the scale’s 4-point range, although the interpretation of these percentages isn’t the same as for the other two studies.&lt;/p&gt;

&lt;p&gt;I believe the data on voter behavior also provides evidence on public opinion—if you vote differently, it’s most likely because your opinion changed.&lt;/p&gt;

&lt;p&gt;So I think Social Change Lab’s finding is indeed moderately well supported.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;table&gt;
    &lt;tbody&gt;
      &lt;tr&gt;
        &lt;td&gt;Protest movements can influence public discourse (e.g. issue salience and media narratives)&lt;/td&gt;
        &lt;td&gt;&lt;strong&gt;Medium&lt;/strong&gt;&lt;/td&gt;
      &lt;/tr&gt;
    &lt;/tbody&gt;
  &lt;/table&gt;
&lt;/blockquote&gt;

&lt;p&gt;None of the natural experiments directly addressed this claim.&lt;sup id=&quot;fnref:35&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:35&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;33&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Several observational studies found that protests frequently get media coverage. Even though the studies are all observational, I am comfortable inferring causality in this case—it seems odd to say that protests occurred, the news covered the protests, but the protests did not cause the news coverage.&lt;/p&gt;

&lt;h2 id=&quot;claims-about-individual-studies&quot;&gt;Claims about individual studies&lt;/h2&gt;

&lt;p&gt;The literature review discussed five studies on real-world impacts of protests. Did it represent the studies accurately?&lt;/p&gt;

&lt;p&gt;Social Change Lab discussed the observational component of &lt;strong&gt;Wasow (2020)&lt;/strong&gt;&lt;sup id=&quot;fnref:9:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt; but not the quasi-experimental component.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;By looking at US counties that are similar on a number of dimensions (black population, foreign-born population, whether the county is urban/rural, etc.), Wasow is able to mimic an experiment by testing how the Democratic vote share changes in counties with protests and matching counties without protests.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I don’t think it’s reasonable to say that a matched observational design “mimic[s] an experiment”. It could be that protests were more likely to happen in counties that were &lt;em&gt;already shifting Democratic&lt;/em&gt;; you can’t prove that the protests caused the shift.&lt;/p&gt;

&lt;p&gt;I agree with everything Social Change Lab wrote about &lt;strong&gt;Madestam (2013)&lt;/strong&gt;&lt;sup id=&quot;fnref:4:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;Regarding &lt;strong&gt;Klein Teeselink &amp;amp; Melios (2021)&lt;/strong&gt;&lt;sup id=&quot;fnref:6:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt;, the literature review wrote:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;[A] one percentage point increase in the fraction of the population going out to protest increased the Democratic vote share in that county by 5.6 percentage points[.]&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;First, this figure is incorrect: it should be 3.3 percentage points (page 11). Klein Teeselink &amp;amp; Melios (2021) was revised in 2025 and I only have access to the latest revision, so it’s possible that Social Change Lab’s figure comes from the 2021 version.&lt;/p&gt;

&lt;p&gt;Second, Klein Teeselink &amp;amp; Melios’ &lt;a href=&quot;#failed-placebo-tests&quot;&gt;placebo tests&lt;/a&gt; show that the natural experiment failed to establish causality. Social Change Lab interprets the study’s outcome as causal, but I do not believe this interpretation is justified.&lt;/p&gt;

&lt;p&gt;Social Change Lab’s description of &lt;strong&gt;McVeigh, Cunningham &amp;amp; Farrell (2014)&lt;/strong&gt;&lt;sup id=&quot;fnref:2:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; was fair; it was appropriately cautious about the weakness of the paper’s evidence.&lt;/p&gt;

&lt;p&gt;On &lt;strong&gt;Bremer et al. (2019)&lt;/strong&gt;&lt;sup id=&quot;fnref:3:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;They found that whilst no such relationship existed for all 30 countries, in Western Europe did [sic] find a statistically significant interaction between protest, levels of economic hardship in a country and the loss of votes for the incumbent party.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I’m suspicious of p-hacking when a study finds a non-significant main result and a significant sub-group result. I wish Social Change Lab had been more skeptical of Bremer et al.’s approach.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;It seems that for a given level of economic hardship a country faces, if the number of protests increase, the incumbent political party will lose more votes[.]&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This quote implies causality, which was not established—the Bremer et al. study was purely observational.&lt;/p&gt;

&lt;p&gt;In summary, Social Change Lab overstated the strength of evidence several times when reviewing particular studies. However, I believe their summary findings are still accurate, partially thanks to the two additional natural experiments (&lt;a href=&quot;https://mlarreboure.com/womenmarch.pdf&quot;&gt;Larreboure &amp;amp; González (2021)&lt;/a&gt;&lt;sup id=&quot;fnref:14:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt; and &lt;a href=&quot;https://www.aeaweb.org/content/file?id=16104&quot;&gt;Hungerman &amp;amp; Moorthy (2023)&lt;/a&gt;&lt;sup id=&quot;fnref:36:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:36&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;) that came out more recently.&lt;/p&gt;

&lt;h1 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h1&gt;

&lt;p&gt;My position on the Social Change Lab literature review:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The review was insufficiently skeptical about weak evidence, and too willing to attribute causality where it had not been established.&lt;/li&gt;
  &lt;li&gt;The review’s summary claims about the overall strength of evidence were consistent with my assessments. Perhaps the “Strong Confidence” findings were overconfident and should be “Medium Confidence” instead, but I can see arguments either way.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Conducting a meta-analysis changed my view on protest effectiveness. My previous stance was that protests probably work, and that various lines of evidence pointed that way, but that all available evidence was weak. I now believe that some of the evidence is relatively&lt;sup id=&quot;fnref:42&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:42&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;34&lt;/a&gt;&lt;/sup&gt; strong, and I am more confident that protests work.&lt;/p&gt;

&lt;h1 id=&quot;source-code&quot;&gt;Source code&lt;/h1&gt;

&lt;p&gt;Source code for my meta-analysis is available &lt;a href=&quot;https://github.com/michaeldickens/public-scripts/blob/master/protest_outcomes.py&quot;&gt;on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&quot;appendix-a-additional-tables&quot;&gt;Appendix A: Additional tables&lt;/h1&gt;

&lt;p&gt;Most meta-analyses report &lt;a href=&quot;https://en.wikipedia.org/wiki/Study_heterogeneity&quot;&gt;study heterogeneity&lt;/a&gt; (I&lt;sup&gt;2&lt;/sup&gt;). I reported &lt;code&gt;P(negative effect)&lt;/code&gt; instead, which conveys much the same information about between-study variation, and I believe it’s more useful in this case. For completeness, Table A.1 gives the I&lt;sup&gt;2&lt;/sup&gt; values for &lt;a href=&quot;#table-3&quot;&gt;Table 3&lt;/a&gt;.&lt;/p&gt;

&lt;div id=&quot;table-a.1&quot; style=&quot;text-align:center;&quot;&gt;Table A.1: Pooled Outcomes with I&lt;sup&gt;2&lt;/sup&gt;&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Outcomes&lt;/th&gt;
      &lt;th&gt;Mean&lt;/th&gt;
      &lt;th&gt;Std Err&lt;/th&gt;
      &lt;th&gt;likelihood ratio&lt;/th&gt;
      &lt;th&gt;p-value&lt;/th&gt;
      &lt;th&gt;I^2&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share Per Protester&lt;/td&gt;
      &lt;td&gt;11.95&lt;/td&gt;
      &lt;td&gt;4.00&lt;/td&gt;
      &lt;td&gt;87.1&lt;/td&gt;
      &lt;td&gt;0.003&lt;/td&gt;
      &lt;td&gt;3%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share&lt;/td&gt;
      &lt;td&gt;1.59&lt;/td&gt;
      &lt;td&gt;0.48&lt;/td&gt;
      &lt;td&gt;257&lt;/td&gt;
      &lt;td&gt;0.001&lt;/td&gt;
      &lt;td&gt;45%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share (Rain Only)&lt;/td&gt;
      &lt;td&gt;1.14&lt;/td&gt;
      &lt;td&gt;0.42&lt;/td&gt;
      &lt;td&gt;39.3&lt;/td&gt;
      &lt;td&gt;0.007&lt;/td&gt;
      &lt;td&gt;0%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Single Hypothesis&lt;/td&gt;
      &lt;td&gt;1.06&lt;/td&gt;
      &lt;td&gt;0.78&lt;/td&gt;
      &lt;td&gt;2.55&lt;/td&gt;
      &lt;td&gt;0.172&lt;/td&gt;
      &lt;td&gt;74%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Favorability&lt;/td&gt;
      &lt;td&gt;2.68&lt;/td&gt;
      &lt;td&gt;2.32&lt;/td&gt;
      &lt;td&gt;1.95&lt;/td&gt;
      &lt;td&gt;0.249&lt;/td&gt;
      &lt;td&gt;72%&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
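
&lt;p&gt;For reference, I&lt;sup&gt;2&lt;/sup&gt; falls out of the same Cochran’s Q that random-effects pooling already computes. A sketch, which reproduces the 3% in the first row of Table A.1 from the Table 2 inputs:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

def i_squared(means, ses):
    # Share of total variation due to between-study heterogeneity.
    means, ses = np.asarray(means), np.asarray(ses)
    w = 1 / ses**2
    fe_mean = np.sum(w * means) / np.sum(w)
    q = np.sum(w * (means - fe_mean)**2)  # Cochran&apos;s Q
    return max(0.0, (q - (len(means) - 1)) / q)

print(i_squared([18.81, 9.62], [7.85, 4.47]))  # ≈ 0.03, the 3% in row 1
&lt;/code&gt;&lt;/pre&gt;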

&lt;p&gt;Table A.2 reports summary statistics for the same pooled outcomes as &lt;a href=&quot;#table-3&quot;&gt;Table 3&lt;/a&gt;, plus additional outcomes from &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.3809877&quot;&gt;Klein Teeselink &amp;amp; Melios (2021)&lt;/a&gt;&lt;sup id=&quot;fnref:6:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt; that I excluded from the main table. Consider this like a &lt;a href=&quot;https://en.wikipedia.org/wiki/Cross-validation_(statistics)#Leave-one-out_cross-validation&quot;&gt;leave-one-out analysis&lt;/a&gt;, except instead it’s a put-one-in analysis.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;“Vote Share Per Protester” adds BLM vote share per protester (mean 3.3, std err 0.6, n = 3053; from Table 2).&lt;/li&gt;
  &lt;li&gt;The middle three rows add BLM vote share.&lt;/li&gt;
  &lt;li&gt;“Favorability” adds survey agreement rate for the statement “Blacks should not receive special favors.”&lt;/li&gt;
&lt;/ul&gt;

&lt;div id=&quot;table-a.2&quot; style=&quot;text-align:center;&quot;&gt;Table A.2: Pooled Outcomes Including BLM&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Outcomes&lt;/th&gt;
      &lt;th&gt;Mean&lt;/th&gt;
      &lt;th&gt;Std Err&lt;/th&gt;
      &lt;th&gt;likelihood ratio&lt;/th&gt;
      &lt;th&gt;p-value&lt;/th&gt;
      &lt;th&gt;I^2&lt;/th&gt;
      &lt;th&gt;P(negative effect)&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share Per Protester&lt;/td&gt;
      &lt;td&gt;7.89&lt;/td&gt;
      &lt;td&gt;3.91&lt;/td&gt;
      &lt;td&gt;7.62&lt;/td&gt;
      &lt;td&gt;0.044&lt;/td&gt;
      &lt;td&gt;65%&lt;/td&gt;
      &lt;td&gt;0.072&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share&lt;/td&gt;
      &lt;td&gt;1.71&lt;/td&gt;
      &lt;td&gt;0.43&lt;/td&gt;
      &lt;td&gt;2.84e+03&lt;/td&gt;
      &lt;td&gt;0.001&lt;/td&gt;
      &lt;td&gt;33%&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share (Rain Only)&lt;/td&gt;
      &lt;td&gt;1.32&lt;/td&gt;
      &lt;td&gt;0.41&lt;/td&gt;
      &lt;td&gt;192&lt;/td&gt;
      &lt;td&gt;0.002&lt;/td&gt;
      &lt;td&gt;3%&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Single Hypothesis&lt;/td&gt;
      &lt;td&gt;1.36&lt;/td&gt;
      &lt;td&gt;0.68&lt;/td&gt;
      &lt;td&gt;7.6&lt;/td&gt;
      &lt;td&gt;0.045&lt;/td&gt;
      &lt;td&gt;69%&lt;/td&gt;
      &lt;td&gt;0.123&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Favorability&lt;/td&gt;
      &lt;td&gt;2.66&lt;/td&gt;
      &lt;td&gt;1.99&lt;/td&gt;
      &lt;td&gt;2.44&lt;/td&gt;
      &lt;td&gt;0.182&lt;/td&gt;
      &lt;td&gt;48%&lt;/td&gt;
      &lt;td&gt;0.136&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Table A.3 uses the same row definitions as in Table A.2, while also correcting for publication bias by creating dummy null outcomes as described &lt;a href=&quot;#publication-bias&quot;&gt;above&lt;/a&gt;.&lt;/p&gt;

&lt;div id=&quot;table-a.3&quot; style=&quot;text-align:center;&quot;&gt;Table A.3: Pooled Outcomes Including BLM, Adjusted for Publication Bias&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Outcomes&lt;/th&gt;
      &lt;th&gt;Mean&lt;/th&gt;
      &lt;th&gt;Std Err&lt;/th&gt;
      &lt;th&gt;likelihood ratio&lt;/th&gt;
      &lt;th&gt;p-value&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share Per Protester&lt;/td&gt;
      &lt;td&gt;2.79&lt;/td&gt;
      &lt;td&gt;1.55&lt;/td&gt;
      &lt;td&gt;5&lt;/td&gt;
      &lt;td&gt;0.073&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share&lt;/td&gt;
      &lt;td&gt;0.88&lt;/td&gt;
      &lt;td&gt;0.38&lt;/td&gt;
      &lt;td&gt;14.9&lt;/td&gt;
      &lt;td&gt;0.021&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share (Rain Only)&lt;/td&gt;
      &lt;td&gt;0.70&lt;/td&gt;
      &lt;td&gt;0.36&lt;/td&gt;
      &lt;td&gt;6.69&lt;/td&gt;
      &lt;td&gt;0.052&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Favorability&lt;/td&gt;
      &lt;td&gt;0.66&lt;/td&gt;
      &lt;td&gt;0.51&lt;/td&gt;
      &lt;td&gt;2.26&lt;/td&gt;
      &lt;td&gt;0.203&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The BLM study found a smaller mean effect than the other studies, but it also had a high t-stat. Adding BLM to the pooled outcomes decreases means but does not consistently decrease the strength of evidence.&lt;/p&gt;

&lt;p&gt;As discussed &lt;a href=&quot;#teeselink--melios-2021-on-2020-black-lives-matter-protests&quot;&gt;previously&lt;/a&gt;, the BLM study isolates local effects of protests, which is undesirable—it ignores any non-local effects that protests might have. Luckily, the study also reports results with no adjustment for spatial autocorrelation (in its Table A3).&lt;/p&gt;

&lt;p&gt;In general, it’s not a good idea to ignore spatial autocorrelation, because doing so may overstate the strength of evidence. But for the “vote share per protester” metric, the unadjusted outcome had a &lt;em&gt;lower&lt;/em&gt; t-stat than the adjusted outcome. So I think it’s fair to add the unadjusted result to the pooled sample.&lt;/p&gt;

&lt;p&gt;Here are the results for pooled vote share per protester, using the BLM outcome with no adjustment for spatial autocorrelation.&lt;/p&gt;

&lt;div id=&quot;table-a.4&quot; style=&quot;text-align:center;&quot;&gt;Table A.4: Pooled Outcomes for Vote Share Per Protester, Including BLM&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Outcomes&lt;/th&gt;
      &lt;th&gt;Mean&lt;/th&gt;
      &lt;th&gt;Std Err&lt;/th&gt;
      &lt;th&gt;likelihood ratio&lt;/th&gt;
      &lt;th&gt;p-value&lt;/th&gt;
      &lt;th&gt;P(negative effect)&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;no correction&lt;/td&gt;
      &lt;td&gt;11.89&lt;/td&gt;
      &lt;td&gt;2.32&lt;/td&gt;
      &lt;td&gt;4.83e5&lt;/td&gt;
      &lt;td&gt;4e-7&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;corrected for publication bias&lt;/td&gt;
      &lt;td&gt;6.23&lt;/td&gt;
      &lt;td&gt;3.05&lt;/td&gt;
      &lt;td&gt;8.1&lt;/td&gt;
      &lt;td&gt;0.041&lt;/td&gt;
      &lt;td&gt;0.138&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;My meta-analysis compared standardized outcomes (using the method described in &lt;a href=&quot;/materials/gelman2008.pdf&quot;&gt;Gelman (2007)&lt;/a&gt;&lt;sup id=&quot;fnref:58:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:58&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;24&lt;/a&gt;&lt;/sup&gt;). Table A.5 shows the results from pooling unstandardized outcomes instead (excluding BLM).&lt;/p&gt;

&lt;div id=&quot;table-a.5&quot; style=&quot;text-align:center;&quot;&gt;Table A.5: Unstandardized Pooled Outcomes&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Outcomes&lt;/th&gt;
      &lt;th&gt;Mean&lt;/th&gt;
      &lt;th&gt;Std Err&lt;/th&gt;
      &lt;th&gt;likelihood ratio&lt;/th&gt;
      &lt;th&gt;p-value&lt;/th&gt;
      &lt;th&gt;P(negative effect)&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share Per Protester&lt;/td&gt;
      &lt;td&gt;11.95&lt;/td&gt;
      &lt;td&gt;4.00&lt;/td&gt;
      &lt;td&gt;87.1&lt;/td&gt;
      &lt;td&gt;0.003&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share&lt;/td&gt;
      &lt;td&gt;3.31&lt;/td&gt;
      &lt;td&gt;1.37&lt;/td&gt;
      &lt;td&gt;18.1&lt;/td&gt;
      &lt;td&gt;0.017&lt;/td&gt;
      &lt;td&gt;0.04&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Vote Share (Rain Only)&lt;/td&gt;
      &lt;td&gt;1.94&lt;/td&gt;
      &lt;td&gt;1.02&lt;/td&gt;
      &lt;td&gt;6.13&lt;/td&gt;
      &lt;td&gt;0.057&lt;/td&gt;
      &lt;td&gt;0.011&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Single Hypothesis&lt;/td&gt;
      &lt;td&gt;1.66&lt;/td&gt;
      &lt;td&gt;1.77&lt;/td&gt;
      &lt;td&gt;1.55&lt;/td&gt;
      &lt;td&gt;0.349&lt;/td&gt;
      &lt;td&gt;0.293&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Favorability&lt;/td&gt;
      &lt;td&gt;5.20&lt;/td&gt;
      &lt;td&gt;1.84&lt;/td&gt;
      &lt;td&gt;53.8&lt;/td&gt;
      &lt;td&gt;0.005&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h1 id=&quot;appendix-b-methodological-revisions&quot;&gt;Appendix B: Methodological revisions&lt;/h1&gt;

&lt;p&gt;In the interest of transparency—and because I didn’t pre-register a methodology—here is a list of non-trivial revisions I made in the process of writing this article. In chronological order:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Originally, I used a different method for estimating the effect of publication bias. Instead of creating dummy null clones as described &lt;a href=&quot;#publication-bias&quot;&gt;above&lt;/a&gt;, I created &lt;code&gt;k&lt;/code&gt; null dummies (one for each real outcome) that were all identical, and that took their standard error as the average of the real studies’ standard errors. This method produced lower p-values. However, I decided it was too unrealistic to give all three null dummies the exact same summary statistics. (Both constructions are sketched just after this list.)&lt;/li&gt;
  &lt;li&gt;For BLM and Women’s March results, I originally used the primary outcomes as reported by their respective papers. I revised my meta-analysis to use the weakest outcomes (lowest t-stat) from the robustness checks, to more conservatively account for spatial autocorrelation.&lt;/li&gt;
  &lt;li&gt;I originally included BLM outcomes in the meta-analysis. Upon re-reading the BLM paper, I realized it failed its placebo tests, so I removed it. &lt;a href=&quot;#appendix-a-additional-tables&quot;&gt;Appendix A&lt;/a&gt; shows the results when including BLM.&lt;/li&gt;
  &lt;li&gt;I went back to using the main outcome for BLM instead of a robustness check outcome because it’s simpler, and I wasn’t including BLM in the main meta-analysis anyway. This slightly strengthened the reported results in &lt;a href=&quot;#appendix-a-additional-tables&quot;&gt;Appendix A&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Originally, my meta-analysis used unstandardized means. I wanted to use standardized means but I wasn’t sure how to standardize them. Eventually, I found &lt;a href=&quot;/materials/gelman2008.pdf&quot;&gt;Gelman (2007)&lt;/a&gt;&lt;sup id=&quot;fnref:58:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:58&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;24&lt;/a&gt;&lt;/sup&gt; and used its method for standardizing outcomes. This increased most results’ t-stats because it decreased between-study variance. (However, it decreased the Favorability t-stat.) Unstandardized pooled outcomes are reported in &lt;a href=&quot;#table-a.5&quot;&gt;Table A.5&lt;/a&gt;.
    &lt;ul&gt;
      &lt;li&gt;Initially, I wanted to scale all binary outcomes by their standard deviations, but I did not have the necessary data to calculate all standard deviations, so I simply left them unscaled.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;
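
&lt;p&gt;For concreteness, here is a minimal sketch of the two publication-bias constructions from item 1, using hypothetical numbers and assuming (per the description above) that each null clone keeps its real outcome’s standard error:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

# Hypothetical outcome summaries, not the real data.
means = np.array([3.3, 1.59, 1.14])
std_errs = np.array([0.6, 0.48, 0.42])

# Method I settled on: one null clone per real outcome, keeping that
# outcome's standard error but zeroing its mean.
clone_means = np.concatenate([means, np.zeros_like(means)])
clone_ses = np.concatenate([std_errs, std_errs])

# Rejected method: k identical null dummies that all share the average
# standard error of the real studies.
k = len(means)
dummy_means = np.concatenate([means, np.zeros(k)])
dummy_ses = np.concatenate([std_errs, np.full(k, std_errs.mean())])
&lt;/code&gt;&lt;/pre&gt;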

&lt;h1 id=&quot;appendix-c-comparing-the-strength-of-evidence-to-saturated-fat-research&quot;&gt;Appendix C: Comparing the strength of evidence to saturated fat research&lt;/h1&gt;

&lt;p&gt;To get some perspective on the strength of evidence on protests, I would like to compare it to a thorny question in an unrelated field that I reviewed recently.&lt;/p&gt;

&lt;p&gt;Last year I &lt;a href=&quot;https://mdickens.me/2024/09/26/outlive_a_critical_review/#the-data-are-unclear-on-whether-reducing-saturated-fat-intake-is-beneficial&quot;&gt;examined the evidence&lt;/a&gt; on whether saturated fat is unhealthy, primarily focusing on a &lt;a href=&quot;https://doi.org/10.1002/14651858.cd011737.pub3&quot;&gt;Cochrane review&lt;/a&gt;&lt;sup id=&quot;fnref:26&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:26&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;35&lt;/a&gt;&lt;/sup&gt; of &lt;a href=&quot;https://en.wikipedia.org/wiki/Randomized_controlled_trial&quot;&gt;RCTs&lt;/a&gt;. I ultimately decided I was 85% confident that saturated fat is unhealthy.&lt;/p&gt;

&lt;p&gt;How does the evidence on protest effectiveness compare to the evidence on saturated fat?&lt;/p&gt;

&lt;p&gt;Both hypotheses face similar problems: there are many observational studies that support the hypothesis, but few experiments. (In the case of protests, there are &lt;em&gt;no&lt;/em&gt; (real-world) experiments, but there are some natural experiments.)&lt;/p&gt;

&lt;p&gt;The Cochrane review included 15 RCTs. My review included five natural experiments.&lt;/p&gt;

&lt;p&gt;The studies in the Cochrane review were true experiments. The protest studies were not, which means they can’t establish causality quite as firmly.&lt;/p&gt;

&lt;p&gt;The Cochrane review found no evidence of publication bias (&lt;a href=&quot;https://www.cochranelibrary.com/cdsr/doi/10.1002/14651858.CD011737.pub3/media/CDSR/CD011737/image_n/nCD011737-FIG-03.svg&quot;&gt;Figure 3&lt;/a&gt;). There aren’t enough protest quasi-experiments to test for publication bias, but I did &lt;a href=&quot;#publication-bias&quot;&gt;test&lt;/a&gt; what would happen if I added in some hypothetical null-result studies.&lt;/p&gt;

&lt;p&gt;Three individual studies on saturated fat had statistically significant positive effects, six had non-significant positive effects, and four had non-significant negative effects. (Two studies did not report data on cardiovascular events.)&lt;/p&gt;

&lt;p&gt;In my review, three out of three studies (or four out of four&lt;sup id=&quot;fnref:55&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:55&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;36&lt;/a&gt;&lt;/sup&gt;) found statistically significant positive effects for &lt;em&gt;nonviolent&lt;/em&gt; protests, and one study found statistically significant negative effects for &lt;em&gt;violent&lt;/em&gt; protests.&lt;/p&gt;

&lt;p&gt;When pooling all RCTs together, the Cochrane review found a marginally statistically significant effect of saturated fat reduction on cardiovascular events (95% CI [0.70, 0.98] where 1 = no effect). It also found significant effects on short-term health outcomes like weight and cholesterol. It found positive but non-significant results on mortality outcomes (including all-cause mortality and cardiovascular mortality).&lt;/p&gt;

&lt;p&gt;In my meta-analysis, most of my &lt;a href=&quot;#table-3&quot;&gt;pooled samples&lt;/a&gt; had strong positive results. My primary metric had p &amp;lt; 0.003; the Cochrane review didn’t find any primary results that strong.&lt;/p&gt;

&lt;p&gt;Even though the Cochrane review included three times as many studies, I think the evidence on protest outcomes is stronger:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The saturated fat RCTs got mixed results, but the three or four studies on nonviolent protests all pointed the same direction.&lt;/li&gt;
  &lt;li&gt;(Most of) the pooled outcomes for protests had moderate to strong p-values. The pooled outcomes for saturated fat reduction had moderate p-values at best.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On the other hand, the protest studies have to deal with the &lt;a href=&quot;#spatial-autocorrelation&quot;&gt;spatial autocorrelation&lt;/a&gt; problem, and it’s not entirely clear that they succeeded at establishing causation. The saturated fat studies were experiments; they had no analogous problem.&lt;/p&gt;

&lt;p&gt;The smaller number of studies also means the protest meta-analysis is more vulnerable to errors in any one study.&lt;/p&gt;

&lt;p&gt;It’s a judgment call as to whether you think the weaker methodology for protest studies outweighs the stronger likelihood ratios. I’m inclined to say it doesn’t.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Ozden, J., &amp;amp; Glover, S. (2022). &lt;a href=&quot;https://www.socialchangelab.org/_files/ugd/503ba4_94d84534d5b348468739b0d6a36b3940.pdf&quot;&gt;Literature Review: Protest Outcomes.&lt;/a&gt; &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:10&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Ozden, J., &amp;amp; Glover, S. (2022). &lt;a href=&quot;https://www.socialchangelab.org/_files/ugd/503ba4_052959e2ee8d4924934b7efe3916981e.pdf&quot;&gt;Protest movements: How effective are they?&lt;/a&gt; &lt;a href=&quot;#fnref:10&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:50&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Orazani, N., Tabri, N., Wohl, M. J. A., &amp;amp; Leidner, B. (2021). &lt;a href=&quot;https://doi.org/10.1002/ejsp.2722&quot;&gt;Social movement strategy (nonviolent vs. violent) and the garnering of third-party support: A meta-analysis.&lt;/a&gt; &lt;a href=&quot;#fnref:50&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:50:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:50:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:50:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;4&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:13&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;ChatGPT Deep Research was useful for finding and summarizing studies, but not for assessing their quality. When I asked ChatGPT to only include methodologically rigorous studies in the review, it didn’t appear to change which studies it included, it just rationalized why every study was rigorous. It said things like (paraphrasing) “we know this observational study’s findings are robust because it &lt;a href=&quot;https://dynomight.net/control/&quot;&gt;controlled for confounders&lt;/a&gt;” and “because it had a large sample size” (??). &lt;a href=&quot;#fnref:13&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;McVeigh, R., Cunningham, D., &amp;amp; Farrell, J. (2014). &lt;a href=&quot;https://doi.org/10.1177/0003122414555885&quot;&gt;Political Polarization as a Social Movement Outcome.&lt;/a&gt; &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:2:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:2:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Bremer, B., Hutter, S., &amp;amp; Kriesi, H. (2020). &lt;a href=&quot;https://doi.org/10.1111/1475-6765.12375&quot;&gt;Dynamics of protest and electoral politics in the Great Recession.&lt;/a&gt; &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:3:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:3:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:36&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Hungerman, D., &amp;amp; Moorthy, V. (2023). &lt;a href=&quot;https://www.aeaweb.org/content/file?id=16104&quot;&gt;Every Day Is Earth Day: Evidence on the Long-Term Impact of Environmental Activism.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.1257/app.20210045&quot;&gt;10.1257/app.20210045&lt;/a&gt; &lt;a href=&quot;#fnref:36&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:36:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:36:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:36:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;4&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:46&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Mellon, J. (2024). &lt;a href=&quot;https://doi.org/10.1111/ajps.12894&quot;&gt;Rain, rain, go away: 194 potential exclusion-restriction violations for studies using weather as an instrumental variable.&lt;/a&gt; &lt;a href=&quot;#fnref:46&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:47&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Sarsons, H. (2015). &lt;a href=&quot;https://doi.org/10.1016/j.jdeveco.2014.12.007&quot;&gt;Rainfall and conflict: A cautionary tale.&lt;/a&gt; &lt;a href=&quot;#fnref:47&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Madestam, A., Shoag, D., Veuger, S., &amp;amp; Yanagizawa-Drott, D. (2013). &lt;a href=&quot;https://doi.org/10.1093/qje/qjt021&quot;&gt;Do Political Protests Matter? Evidence from the Tea Party Movement.&lt;/a&gt; &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:4:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:4:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:9&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Wasow, O. (2020). &lt;a href=&quot;https://doi.org/10.1017/S000305542000009X&quot;&gt;Agenda Seeding: How 1960s Black Protests Moved Elites, Public Opinion and Voting.&lt;/a&gt; &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:9:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:9:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:9:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;4&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Klein Teeselink, B., &amp;amp; Melios, G. (2021). &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.3809877&quot;&gt;Weather to Protest: The Effect of Black Lives Matter Protests on the 2020 Presidential Election.&lt;/a&gt; &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:6:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:6:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:6:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;4&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:14&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Larreboure, M., &amp;amp; Gonzalez, F. (2021). &lt;a href=&quot;https://mlarreboure.com/womenmarch.pdf&quot;&gt;The Impact of the Women’s March on the U.S. House Election.&lt;/a&gt; &lt;a href=&quot;#fnref:14&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:14:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:14:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:14:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;4&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:11&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Bremer et al. (2020) found a non-significant result and then did some subgroup analysis and got a significant result. I find that suspicious but I didn’t bother to look deeper because it’s an observational study anyway. &lt;a href=&quot;#fnref:11&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The paper did not report the p-value, but it did report the standard error, so I calculated the p-value from that. &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:48&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Collins, W. J., &amp;amp; Margo, R. A. (2007). &lt;a href=&quot;https://doi.org/10.1017/S0022050707000423&quot;&gt;The Economic Aftermath of the 1960s Riots in American Cities: Evidence from Property Values.&lt;/a&gt; &lt;a href=&quot;#fnref:48&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:45&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Beck, N., Gleditsch, K. S., &amp;amp; Beardsley, K. (2006). &lt;a href=&quot;https://doi.org/10.1111/j.1468-2478.2006.00391.x&quot;&gt;Space Is More than Geography: Using Spatial Econometrics in the Study of Political Economy.&lt;/a&gt; &lt;a href=&quot;#fnref:45&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:49&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;If it were true, we’d expect to see a similar phenomenon in the placebo tests of Madestam et al. (2013) and Hungerman &amp;amp; Moorthy (2023), but we don’t. &lt;a href=&quot;#fnref:49&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:15&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Belloni, A., Chernozhukov, V., &amp;amp; Hansen, C. (2010). &lt;a href=&quot;https://arxiv.org/abs/1012.1297&quot;&gt;LASSO Methods for Gaussian Instrumental Variables Models.&lt;/a&gt; &lt;a href=&quot;#fnref:15&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:43&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Conley, T. G. (1999). &lt;a href=&quot;https://doi.org/10.1016/S0304-4076(98)00084-0&quot;&gt;GMM estimation with cross sectional dependence.&lt;/a&gt; &lt;a href=&quot;#fnref:43&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:16&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I emailed the corresponding author to ask about this apparent discrepancy and did not receive a reply. &lt;a href=&quot;#fnref:16&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:57&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Hungerman &amp;amp; Moorthy (2023) did provide enough information to &lt;em&gt;estimate&lt;/em&gt; the change in Earth Day favorability per protester, by dividing change in favorability by change in number of protesters from the paper’s Table 4. However, this estimate would have a high-variance denominator, which makes the result unreliable. &lt;a href=&quot;#fnref:57&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:21&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Favorability (1) measured public support for environmentalism as the percentage of respondents answering Yes to “we’re spending too little money” on protecting the environment. &lt;a href=&quot;#fnref:21&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:21:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:21:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:21:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;4&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:21:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;5&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:21:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;6&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:58&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Gelman, A. (2007). &lt;a href=&quot;https://doi.org/10.1002/sim.3107&quot;&gt;Scaling regression inputs by dividing by two standard deviations.&lt;/a&gt; &lt;a href=&quot;#fnref:58&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:58:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:58:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:59&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Some additional detail:&lt;/p&gt;

      &lt;p&gt;Gelman (2007) proposes scaling continuous variables by 2 standard deviations because this puts them onto the same scale as a binary variable &lt;em&gt;where the control and treatment groups have the same size&lt;/em&gt;. If you have a sample of binary outcomes where 50% of the outcomes are 0 (“no rain”) and 50% are 1 (“rain”), then the standard deviation is 0.5. If the probabilities are not 50/50 then the standard deviation will not equal 0.5. (For example, the Tea Party rainfall variable had a standard deviation of 0.401.) Arguably it would make sense to scale all binary variables to a standard deviation of 0.5. However, I did not do this because I didn’t have the necessary data for all the papers. Instead, I left all binary variables unscaled. (Gelman (2007) discusses whether probability-skewed binary variables should be scaled, but ultimately does not take a stance.)&lt;/p&gt;

      &lt;p&gt;The Earth Day paper directly regressed outcomes onto a continuous rainfall variable (without doing a two-stage regression). I scaled the reported slopes by 2 times the standard deviation of rainfall.&lt;/p&gt;
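
      &lt;p&gt;A minimal sketch of that scaling arithmetic, with made-up numbers:&lt;/p&gt;

      &lt;pre&gt;&lt;code&gt;rain_sd = 0.9  # hypothetical sd of the continuous rainfall variable
slope = 1.4    # hypothetical reported effect per unit of rainfall

# Dividing the input by 2 sd multiplies its coefficient by 2 sd. An
# evenly-split binary variable has sd 0.5, so its 0-to-1 flip is a
# 2-sd change; this puts the two coefficients on the same scale.
standardized_slope = slope * 2 * rain_sd
&lt;/code&gt;&lt;/pre&gt;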

      &lt;p&gt;The Women’s March paper reported values scaled to 1 standard deviation, so I divided them by 2.&lt;/p&gt;

      &lt;p&gt;The Tea Party, BLM, and Civil Rights papers reported effects in binary terms, so I did not scale them. &lt;a href=&quot;#fnref:59&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:59:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:60&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Even if I replace the relatively weak Earth Day favorability outcome with Earth Day favorability among under-20s (which had a likelihood ratio of 37), the pooled likelihood ratio is still only 3.43. &lt;a href=&quot;#fnref:60&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:30&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Defined as the number of Congress members who voted in line with conservative positions according to the American Conservative Union. &lt;a href=&quot;#fnref:30&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:31&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Defined as alignment with Tea Party positions, measured in standard deviations. &lt;a href=&quot;#fnref:31&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:33&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Evaluated on a 5-point scale from “strongly disagree” to “strongly agree”. &lt;a href=&quot;#fnref:33&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:33:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:51&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The linked webinar is the least-bad explanation of Egger’s test that I could find, but it doesn’t explain it very well, so I will attempt to explain:&lt;/p&gt;

      &lt;p&gt;In the presence of publication bias, more powerful studies will have lower means: small studies only get published when they happen to find large effects, so the small studies that survive to publication have inflated means. Therefore there will be a negative correlation between a study’s power and its mean. (I measured power as the inverse standard error, but you could also use the inverse variance or the sample size.)&lt;/p&gt;

      &lt;p&gt;So you test for publication bias by doing a linear regression of study means on study power.&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;If the regression has a correlation close to zero, that indicates no publication bias.&lt;/li&gt;
        &lt;li&gt;If there is a significant correlation, that’s evidence of publication bias.&lt;/li&gt;
      &lt;/ul&gt;
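
      &lt;p&gt;A minimal sketch of that regression, with synthetic numbers (scipy’s &lt;code&gt;linregress&lt;/code&gt; standing in for whatever regression routine you prefer):&lt;/p&gt;

      &lt;pre&gt;&lt;code&gt;from scipy.stats import linregress

# Synthetic study summaries, not my real data.
means = [3.3, 1.59, 1.14, 2.68]
std_errs = [0.6, 0.48, 0.42, 2.32]

# Power proxy: inverse standard error.
power = [1 / se for se in std_errs]

# Regress study means on study power. A slope near zero suggests no
# publication bias; a significant negative slope is evidence of it.
result = linregress(power, means)
print(result.slope, result.pvalue)
&lt;/code&gt;&lt;/pre&gt;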

      &lt;p&gt;Lest this footnote give the impression that I know what I’m talking about, I didn’t even know what Egger’s regression test was until I wrote this. My process was that I asked Claude what statistical test I could use to check for publication bias, it suggested Egger’s test and then gave an obviously-incorrect explanation of how the test works, and then I read several barely-comprehensible articles about the test until I thought I understood it. &lt;a href=&quot;#fnref:51&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:54&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I could do full replications for the papers that published their data, but that would be considerably more work for a low chance of paying off. &lt;a href=&quot;#fnref:54&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:52&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;In my experience, I always update tables if I make revisions to my calculations, but it’s hard to keep track of every place in the text where I referenced a number. &lt;a href=&quot;#fnref:52&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:35&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Technically, one of them did look at media coverage, but not using the rainfall method. &lt;a href=&quot;#fnref:35&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:42&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I’m thinking about the strength of evidence from a sociology perspective. Getting good evidence in sociology is hard. &lt;a href=&quot;#fnref:42&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:26&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Hooper, L., Martin, N., Jimoh, O. F., Kirk, C., Foster, E., &amp;amp; Abdelhamid, A. S. (2020). &lt;a href=&quot;https://doi.org/10.1002/14651858.cd011737.pub3&quot;&gt;Reduction in saturated fat intake for cardiovascular disease.&lt;/a&gt; &lt;a href=&quot;#fnref:26&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:55&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Three if you exclude the BLM study due to its &lt;a href=&quot;#failed-placebo-tests&quot;&gt;failed placebo tests&lt;/a&gt;; four if you include it. &lt;a href=&quot;#fnref:55&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>The Triple-Interaction-Effects Argument</title>
				<pubDate>Thu, 10 Apr 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/04/10/triple_interaction_effects/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/04/10/triple_interaction_effects/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;In this post I will explain the most impressive argument I heard in 2024.&lt;/p&gt;

&lt;p&gt;First, some context:&lt;/p&gt;

&lt;p&gt;There is an ongoing debate in the bodybuilding/strength training community about how much protein you should eat while losing weight.&lt;/p&gt;

&lt;p&gt;Some say you should eat more protein if you’re losing weight:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;If you’re eating less, your body is under extra pressure to cannibalize your muscles. Therefore, you should eat more protein to cancel this out.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The standard rebuttal:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Experimental trials have found that muscle gains max out when subjects eat 0.7–0.8 grams of protein per pound of bodyweight, and that’s true both when participants are maintaining weight and when they’re losing weight. There doesn’t appear to be a difference.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And the counter-rebuttal:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Almost all research looks at novice lifters. Experienced athletes have a more difficult time gaining muscle,&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; so losing weight will have a bigger negative impact on them, and therefore they need to eat more protein.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I used to believe this. Then I heard the most impressive argument of 2024.&lt;/p&gt;

&lt;p&gt;I heard the argument in a &lt;a href=&quot;https://www.youtube.com/watch?v=__hRCUDVJx0&amp;amp;t=322s&quot;&gt;YouTube video&lt;/a&gt; by Menno Henselmans:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;It’s possible that in trained individuals there is a triple interaction effect, because that’s what you’re arguing here. If you’re saying that protein requirements increase in an energy deficit, but only in strength-trained individuals, then you are arguing for a triple interaction effect. […] That is very, very, very rare. Triple interaction effects, biologically speaking, simply do not occur much.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I didn’t understand what he was talking about. I spent two days pondering what it meant. On the third day, it finally clicked and I realized he was right.&lt;/p&gt;

&lt;p&gt;To claim that trained lifters should eat more protein on an energy deficit, you’d need to believe that:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Above a certain level of protein intake (0.7–0.8 grams per pound), additional protein has no effect on muscle growth.&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Most of the time, trained athletes don’t need more protein than novices.&lt;/li&gt;
  &lt;li&gt;Novices don’t need more protein while losing weight than while maintaining/gaining weight.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;HOWEVER&lt;/strong&gt;, (a) among trained individuals who are (b) losing weight, the ones (c) who eat more protein (beyond 0.7–0.8 g/lb) gain more muscle.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The first variable (protein intake) has no interaction with muscle growth.&lt;/p&gt;

&lt;p&gt;The second variable (trained vs. untrained) has no interaction with muscle growth.&lt;/p&gt;

&lt;p&gt;The third variable (losing vs. maintaining weight) has no interaction with muscle growth.&lt;/p&gt;

&lt;p&gt;The first and second variables together (protein intake + trained/untrained) have no interaction with muscle growth.&lt;/p&gt;

&lt;p&gt;The first and third variables together (protein intake + losing/maintaining weight) have no interaction with muscle growth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;HOWEVER&lt;/strong&gt;, when you put all three variables together, an interaction suddenly appears—a triple interaction effect.&lt;/p&gt;

&lt;p&gt;This is a very strange claim. If all three variables together affect muscle growth, then you would expect each variable &lt;em&gt;individually&lt;/em&gt; to affect muscle growth. At the very least, you would expect some pair of the three variables together to affect it.&lt;/p&gt;
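
&lt;p&gt;To spell the claim out in regression notation (my formalization, not Henselmans’): write muscle growth as a function of protein intake P, training status T, and energy deficit D. The studies say every lower-order coefficient is roughly zero, yet the counter-rebuttal needs the three-way term to be large:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;y = \beta_0 + \beta_1 P + \beta_2 T + \beta_3 D
      + \beta_{12} PT + \beta_{13} PD + \beta_{23} TD
      + \beta_{123} PTD

\beta_1 \approx \beta_2 \approx \beta_3 \approx
\beta_{12} \approx \beta_{13} \approx \beta_{23} \approx 0,
\qquad \beta_{123} \not\approx 0
&lt;/code&gt;&lt;/pre&gt;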

&lt;p&gt;(In fact, it is mathematically impossible to construct a differentiable function &lt;code&gt;f(x, y, z)&lt;/code&gt; that is constant with respect to x, constant with respect to y, and constant with respect to z, but &lt;em&gt;not&lt;/em&gt; constant overall. Although you could have a function &lt;code&gt;f(x, y, z)&lt;/code&gt; where the slope with respect to each individual variable is &lt;em&gt;close to&lt;/em&gt; 0, but not &lt;em&gt;quite&lt;/em&gt; 0.)&lt;/p&gt;

&lt;p&gt;Not to say a triple interaction effect can’t occur in the real world. It could be that muscle growth does depend on each of (protein intake, training experience, calorie deficit), but the relationships are so weak that the studies failed to pick them up.&lt;/p&gt;

&lt;p&gt;But if you believe the studies’ results are correct, then it seems difficult—maybe even impossible—to still believe that trained lifters need to eat more protein while on a calorie deficit.&lt;/p&gt;

&lt;div style=&quot;text-align:center&quot;&gt;***&lt;/div&gt;

&lt;p&gt;This was the best argument I heard in 2024 because:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;If you think about it, it’s obviously correct. It changed my mind as soon as I understood it.&lt;/li&gt;
  &lt;li&gt;It’s difficult to come up with. (I’ve never heard anyone else make this argument.)&lt;/li&gt;
&lt;/ul&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I’m conflating gaining strength with putting on muscle. There’s a difference, but we can consider them the same thing for the purposes of this post. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;This claim is somewhat controversial, but let’s assume it’s true for the sake of this argument.&lt;/p&gt;

      &lt;p&gt;Randomized controlled trials find no benefit to more than ~0.7 g/lb, and I quoted a range of 0.7–0.8 g/lb to account for variation between individuals. But the existing studies aren’t that great so I don’t have high confidence that that’s the correct range. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>You can now read my reading notes</title>
				<pubDate>Mon, 31 Mar 2025 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2025/03/31/new_reading_notes/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/03/31/new_reading_notes/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Since 2015, I have been taking notes on most articles I read. I figured other people might find them useful, so I cleaned them up and &lt;a href=&quot;https://mdickens.me/reading-notes/&quot;&gt;published them on my website&lt;/a&gt;. You can find them via the new “Notes” tab.&lt;/p&gt;

&lt;p&gt;I will update the page every once in a while as I read more articles and take more notes.&lt;/p&gt;

&lt;p&gt;I also have notes on every educational book I’ve read since 2015, but the notes are on physical paper (can you believe it?). I might digitize them at some point.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>There Are Three Kinds of "No Evidence"</title>
				<pubDate>Mon, 03 Mar 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/03/03/three_kinds_of_no_evidence/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/03/03/three_kinds_of_no_evidence/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;David J. Balan once proposed that &lt;a href=&quot;https://www.overcomingbias.com/p/doctor-there-arhtml&quot;&gt;there are two kinds of “no evidence”&lt;/a&gt;:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;There have been lots of studies directly on this point which came back with the result that the hypothesis is false.&lt;/li&gt;
  &lt;li&gt;There is no evidence because there are few or no relevant studies.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I propose that there are three kinds of “no evidence”:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The hypothesis has never been studied.&lt;/li&gt;
  &lt;li&gt;There are studies, the studies failed to find supporting evidence, but they wouldn’t have found supporting evidence even if the hypothesis were true.&lt;/li&gt;
  &lt;li&gt;There are studies, the studies &lt;em&gt;should&lt;/em&gt; have found supporting evidence if the hypothesis were true, and they didn’t.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Example of type 1: A 2003 literature review found that there were &lt;a href=&quot;https://doi.org/10.1136/bmj.327.7429.1459&quot;&gt;no studies&lt;/a&gt;&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; showing that parachutes could prevent injury when jumping out of a plane.&lt;/p&gt;

&lt;p&gt;Example of type 2: In 2018, there was finally &lt;a href=&quot;https://doi.org/10.1136/bmj.k5094&quot;&gt;a randomized controlled trial&lt;/a&gt;&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; on the effectiveness of parachutes, and it found no difference between the parachute group and the control group. However, participants only jumped from a height of 0.6 meters (~2 feet). I don’t know about you, but this result does not make me want to jump out of a plane without a parachute.&lt;/p&gt;

&lt;p&gt;Like in the parachute example, you see type-2 “no evidence” whenever the conditions of a study don’t match the real-world environment. You also see type-2 “no evidence” when an experiment is &lt;a href=&quot;https://en.wikipedia.org/wiki/Power_(statistics)&quot;&gt;underpowered&lt;/a&gt;. Say you want to test the hypothesis that boys are taller than girls. So you go find your niece Sally and your neighbor’s son James and it turns out Sally is an inch taller than James. Your methodology was valid—you can indeed test the hypothesis by finding some people and measuring their heights—but your sample size was too small.&lt;/p&gt;
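
&lt;p&gt;A quick simulation makes the point (a sketch with made-up height distributions, not real measurements):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

rng = np.random.default_rng(0)
trials = 100_000

# Made-up heights in inches where boys really are taller on average.
boys = rng.normal(69.0, 3.0, trials)
girls = rng.normal(64.0, 2.5, trials)

# With one child per group, how often does the comparison point the
# wrong way, as it did with Sally and James?
print(np.mean(girls &amp;gt; boys))  # roughly 0.1
&lt;/code&gt;&lt;/pre&gt;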

&lt;p&gt;(The difference between type 2 and type 3 can be a matter of degree. The more powerful a study is, the stronger its “no evidence” if it fails to find an effect.)&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Smith, G. C. S. (2003). &lt;a href=&quot;https://doi.org/10.1136/bmj.327.7429.1459&quot;&gt;Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled trials.&lt;/a&gt; &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Yeh, R. W., Valsdottir, L. R., Yeh, M. W., Shen, C., Kramer, D. B., Strom, J. B., Secemsky, E. A. et al. (2018). &lt;a href=&quot;https://doi.org/10.1136/bmj.k5094&quot;&gt;Parachute use to prevent death and major trauma when jumping from aircraft: randomized controlled trial.&lt;/a&gt; &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Return Stacking Funds: A New Way to Get Leverage</title>
				<pubDate>Tue, 04 Feb 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/02/04/return_stacked_funds/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/02/04/return_stacked_funds/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;Last updated 2026-02-04.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Some &lt;a href=&quot;https://reducing-suffering.org/should-altruists-leverage-investments/&quot;&gt;people&lt;/a&gt; (including &lt;a href=&quot;https://mdickens.me/2020/01/06/how_much_leverage_should_altruists_use/&quot;&gt;me&lt;/a&gt;) have argued that altruists often benefit from leveraging their investments. Recently, it has become easier to use leverage thanks to the emergence of &lt;a href=&quot;https://www.returnstacked.com/what-is-return-stacking-for-diversification/&quot;&gt;return stacking&lt;/a&gt; funds.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is not financial advice.&lt;/em&gt;&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#what-is-return-stacking&quot; id=&quot;markdown-toc-what-is-return-stacking&quot;&gt;What is return stacking?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#an-overview-of-return-stacking-funds&quot; id=&quot;markdown-toc-an-overview-of-return-stacking-funds&quot;&gt;An overview of return stacking funds&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-true-cost-of-return-stacking-etfs&quot; id=&quot;markdown-toc-the-true-cost-of-return-stacking-etfs&quot;&gt;The true cost of return stacking ETFs&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#2026-update&quot; id=&quot;markdown-toc-2026-update&quot;&gt;2026 update&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#pros-and-cons-of-return-stacking-funds&quot; id=&quot;markdown-toc-pros-and-cons-of-return-stacking-funds&quot;&gt;Pros and cons of return stacking funds&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#are-bonds-a-good-investment&quot; id=&quot;markdown-toc-are-bonds-a-good-investment&quot;&gt;Are bonds a good investment?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#source-code&quot; id=&quot;markdown-toc-source-code&quot;&gt;Source code&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#acknowledgments&quot; id=&quot;markdown-toc-acknowledgments&quot;&gt;Acknowledgments&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;what-is-return-stacking&quot;&gt;What is return stacking?&lt;/h2&gt;

&lt;p&gt;Return stacking is a way of getting leveraged exposure to multiple return streams simultaneously. For example, &lt;a href=&quot;https://www.returnstackedetfs.com/rssb-return-stacked-global-stocks-bonds/&quot;&gt;RSSB&lt;/a&gt; invests 100% into global equities and 100% into US Treasury bonds, effectively giving it 2:1 leverage on a diversified stock/bond portfolio.&lt;/p&gt;

&lt;p&gt;A return stacking ETF is a type of leveraged ETF. But whereas traditional leveraged ETFs (such as &lt;a href=&quot;https://etfdb.com/etf/SPXL/&quot;&gt;SPXL&lt;/a&gt;) lever up a single index like the S&amp;amp;P 500, a return stacking fund holds multiple asset classes.&lt;/p&gt;
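
&lt;p&gt;The appeal is capital efficiency. As a hypothetical worked example (not advice), using NTSX’s 90/60 allocation from the table below: putting two-thirds of a portfolio in NTSX replicates a classic 60/40 portfolio while leaving a third of the capital free to invest in something else.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# NTSX holds 90% US stocks and 60% Treasuries per dollar invested.
ntsx_stocks, ntsx_bonds = 0.90, 0.60

weight = 2 / 3                 # fraction of the portfolio in NTSX
stocks = weight * ntsx_stocks  # 0.60
bonds = weight * ntsx_bonds    # 0.40
cash = 1 - weight              # ~0.33 left over to stack something else
&lt;/code&gt;&lt;/pre&gt;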

&lt;p&gt;Return stacking ETFs have lower management fees than single-index leveraged ETFs, and (with low confidence) they appear to have lower overhead costs for reasons that are not entirely clear to me (my guess is a combination of cheaper borrowing and lower transaction costs).&lt;/p&gt;

&lt;h2 id=&quot;an-overview-of-return-stacking-funds&quot;&gt;An overview of return stacking funds&lt;/h2&gt;

&lt;p&gt;There are four brands of return stacking funds that I know of:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;the eponymous &lt;a href=&quot;https://www.returnstackedetfs.com/&quot;&gt;Return Stacked&lt;/a&gt; ETFs (&lt;a href=&quot;https://www.returnstackedetfs.com/rssb-return-stacked-global-stocks-bonds/&quot;&gt;RSSB&lt;/a&gt;, &lt;a href=&quot;https://www.returnstackedetfs.com/rsst-return-stacked-us-stocks-managed-futures/&quot;&gt;RSST&lt;/a&gt;, &lt;a href=&quot;https://www.returnstackedetfs.com/rsbt-return-stacked-bonds-managed-futures/&quot;&gt;RSBT&lt;/a&gt;, &lt;a href=&quot;https://www.returnstackedetfs.com/rssy-return-stacked-us-stocks-futures-yield/&quot;&gt;RSSY&lt;/a&gt;, &lt;a href=&quot;https://www.returnstackedetfs.com/rsby-return-stacked-bonds-futures-yield/&quot;&gt;RSBY&lt;/a&gt;, &lt;a href=&quot;https://www.returnstackedetfs.com/rsba-return-stacked-bonds-merger-arbitrage/&quot;&gt;RSBA&lt;/a&gt;, &lt;a href=&quot;https://etfdb.com/etf/BTGD&quot;&gt;BTGD&lt;/a&gt;, &lt;a href=&quot;https://rationalmf.com/funds/return-stacked-balanced-allocation-systematic-macro-fund-rdmax-rdmcx-rdmix/&quot;&gt;RDMIX&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;WisdomTree Capital Efficient ETFs (&lt;a href=&quot;https://www.wisdomtree.com/investments/etfs/capital-efficient/NTSX&quot;&gt;NTSX&lt;/a&gt;, &lt;a href=&quot;https://www.wisdomtree.com/investments/etfs/capital-efficient/NTSI&quot;&gt;NTSI&lt;/a&gt;, &lt;a href=&quot;https://www.wisdomtree.com/investments/etfs/capital-efficient/NTSE&quot;&gt;NTSE&lt;/a&gt;, &lt;a href=&quot;https://www.wisdomtree.com/investments/etfs/capital-efficient/GDE&quot;&gt;GDE&lt;/a&gt;, &lt;a href=&quot;https://www.wisdomtree.com/investments/etfs/capital-efficient/GDMN&quot;&gt;GDMN&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;PIMCO StocksPLUS Funds (&lt;a href=&quot;https://www.pimco.com/us/en/investments/mutual-fund/pimco-stocksplus-long-duration-fund/inst-usd&quot;&gt;PSLDX&lt;/a&gt;, &lt;a href=&quot;https://www.pimco.com/us/en/investments/etf/pimco-us-stocks-plus-active-bond-exchange-traded-fund/usetf-usd&quot;&gt;SPLS&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Evoke Advisors Ultra Risk Parity ETF (&lt;a href=&quot;https://www.rparetf.com/upar&quot;&gt;UPAR&lt;/a&gt;)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;PIMCO’s mutual fund (PSLDX) has been around since 2007, but the others only launched within the last few years.&lt;/p&gt;

&lt;p&gt;What do each of these funds invest in?&lt;/p&gt;

&lt;p&gt;Seven of the funds stack traditional asset classes:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Fund&lt;/th&gt;
      &lt;th&gt;first asset class&lt;/th&gt;
      &lt;th&gt;second asset class&lt;/th&gt;
      &lt;th&gt;leverage&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;RSSB&lt;/td&gt;
      &lt;td&gt;100% global stocks&lt;/td&gt;
      &lt;td&gt;100% US Treasury bonds&lt;/td&gt;
      &lt;td&gt;2:1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;NTSX&lt;/td&gt;
      &lt;td&gt;90% US stocks&lt;/td&gt;
      &lt;td&gt;60% US Treasury bonds&lt;/td&gt;
      &lt;td&gt;1.5:1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;NTSI&lt;/td&gt;
      &lt;td&gt;90% international stocks&lt;/td&gt;
      &lt;td&gt;60% US Treasury bonds&lt;/td&gt;
      &lt;td&gt;1.5:1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;NTSE&lt;/td&gt;
      &lt;td&gt;90% emerging market stocks&lt;/td&gt;
      &lt;td&gt;60% US Treasury bonds&lt;/td&gt;
      &lt;td&gt;1.5:1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;UPAR&lt;/td&gt;
      &lt;td&gt;too many for this table*&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;1.68:1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;PSLDX&lt;/td&gt;
      &lt;td&gt;~100% US stocks**&lt;/td&gt;
      &lt;td&gt;~100% bonds**&lt;/td&gt;
      &lt;td&gt;~2:1**&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;SPLS&lt;/td&gt;
      &lt;td&gt;~100% US stocks**&lt;/td&gt;
      &lt;td&gt;~100% bonds**&lt;/td&gt;
      &lt;td&gt;~2:1**&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;*As of 2025, UPAR &lt;a href=&quot;https://www.rparetf.com/upar/investment-case&quot;&gt;targets&lt;/a&gt; 17.5% U.S. equities, 7% international equities, 10.5% emerging markets equities, 21% commodity producer equities, 14% gold, 49% &lt;a href=&quot;https://en.wikipedia.org/wiki/United_States_Treasury_security#TIPS&quot;&gt;TIPS&lt;/a&gt;, and 49% Treasuries for a total allocation of 168%.&lt;/p&gt;

&lt;p&gt;**PSLDX and SPLS percentages are only approximate because the funds are actively managed and their holdings may vary over time.&lt;/p&gt;

&lt;p&gt;These funds stack traditional asset classes with alternatives:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Fund&lt;/th&gt;
      &lt;th&gt;First asset class&lt;/th&gt;
      &lt;th&gt;Second asset class&lt;/th&gt;
      &lt;th&gt;Leverage&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;RSST&lt;/td&gt;
      &lt;td&gt;100% US stocks&lt;/td&gt;
      &lt;td&gt;100% managed futures*&lt;/td&gt;
      &lt;td&gt;2:1*&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;RSBT&lt;/td&gt;
      &lt;td&gt;100% US bonds&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;100% managed futures*&lt;/td&gt;
      &lt;td&gt;2:1*&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;RSSY&lt;/td&gt;
      &lt;td&gt;100% US stocks&lt;/td&gt;
      &lt;td&gt;100% futures yield*&lt;/td&gt;
      &lt;td&gt;2:1*&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;RSBY&lt;/td&gt;
      &lt;td&gt;100% US bonds&lt;/td&gt;
      &lt;td&gt;100% futures yield*&lt;/td&gt;
      &lt;td&gt;2:1*&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;RSBA&lt;/td&gt;
      &lt;td&gt;100% US Treasury bonds&lt;/td&gt;
      &lt;td&gt;100% merger arbitrage*&lt;/td&gt;
      &lt;td&gt;2:1*&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;RDMIX&lt;/td&gt;
      &lt;td&gt;50/50 US stocks/bonds&lt;/td&gt;
      &lt;td&gt;100% systematic macro*&lt;/td&gt;
      &lt;td&gt;2:1*&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;BTGD&lt;/td&gt;
      &lt;td&gt;100% bitcoin&lt;/td&gt;
      &lt;td&gt;100% gold&lt;/td&gt;
      &lt;td&gt;2:1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;GDE&lt;/td&gt;
      &lt;td&gt;90% US stocks&lt;/td&gt;
      &lt;td&gt;90% gold&lt;/td&gt;
      &lt;td&gt;1.8:1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;GDMN&lt;/td&gt;
      &lt;td&gt;90% gold miner stocks&lt;/td&gt;
      &lt;td&gt;90% gold&lt;/td&gt;
      &lt;td&gt;1.8:1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;*Managed futures (a.k.a. &lt;a href=&quot;https://en.wikipedia.org/wiki/Trend_following&quot;&gt;trendfollowing&lt;/a&gt;), futures yield (a.k.a. &lt;a href=&quot;https://en.wikipedia.org/wiki/Carry_(investment)&quot;&gt;carry&lt;/a&gt; or &lt;a href=&quot;https://en.wikipedia.org/wiki/Roll_yield&quot;&gt;roll yield&lt;/a&gt;), &lt;a href=&quot;https://en.wikipedia.org/wiki/Risk_arbitrage&quot;&gt;merger arbitrage&lt;/a&gt;, and &lt;a href=&quot;https://en.wikipedia.org/wiki/Global_macro&quot;&gt;systematic macro&lt;/a&gt; are all long/short strategies, not simple assets that you can buy and hold. So it’s somewhat arbitrary to say that the funds invest 100% into those strategies.&lt;/p&gt;

&lt;h2 id=&quot;the-true-cost-of-return-stacking-etfs&quot;&gt;The true cost of return stacking ETFs&lt;/h2&gt;

&lt;p&gt;In a &lt;a href=&quot;https://mdickens.me/2021/03/04/true_cost_of_leveraged_etfs/&quot;&gt;previous post&lt;/a&gt;, I looked at how a leveraged index fund &lt;em&gt;should&lt;/em&gt; perform and compared that against how leveraged ETFs actually &lt;em&gt;did&lt;/em&gt; perform. I found that the ETFs consistently cost more than expected, by an average of about one percentage point.&lt;/p&gt;

&lt;p&gt;I attempted to do the same analysis for return stacking ETFs. These ETFs are harder to replicate because they don’t track indexes, so I don’t have high confidence in the results. That said, my numbers suggest that return stacking ETFs are more cost-effective than conventional leveraged ETFs.&lt;/p&gt;

&lt;p&gt;I was able to replicate RSSB and NTSX:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;ETF&lt;/th&gt;
      &lt;th&gt;Leverage&lt;/th&gt;
      &lt;th&gt;Stock ETF(s)&lt;/th&gt;
      &lt;th&gt;Bond Fund(s)&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;RSSB&lt;/td&gt;
      &lt;td&gt;100% + 100%&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;https://investor.vanguard.com/investment-products/etfs/profile/vti&quot;&gt;VTI&lt;/a&gt; + &lt;a href=&quot;https://investor.vanguard.com/investment-products/etfs/profile/vxus&quot;&gt;VXUS&lt;/a&gt;&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
      &lt;td&gt;bond futures ladder&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;NTSX&lt;/td&gt;
      &lt;td&gt;90% + 60%&lt;/td&gt;
      &lt;td&gt;SPY (S&amp;amp;P 500)&lt;/td&gt;
      &lt;td&gt;bond futures ladder&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
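
&lt;p&gt;In code, the RSSB benchmark is just a weighted mix of those pieces before applying leverage. Here’s a minimal sketch (the function and variable names are mine, not from my actual code; the weights come from the footnotes):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Each argument is a series of daily returns for one holding.
# The 62/38 stock split is reverse-engineered from RSSB holdings, and the
# bond ladder equal-weights four Treasury futures indexes (see the notes).
def rssb_benchmark_rets(vti, vxus, fut_2y, fut_5y, fut_10y, fut_ultra):
    stocks = 0.62 * vti + 0.38 * vxus
    bonds = 0.25 * (fut_2y + fut_5y + fut_10y + fut_ultra)
    # The unlevered 50/50 mix; the cost calculation below levers it 2:1.
    return 0.5 * stocks + 0.5 * bonds
&lt;/code&gt;&lt;/pre&gt;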

&lt;p&gt;I also attempted to replicate NTSI, PSLDX, and GDE, but I couldn’t find benchmarks that tracked them well enough.&lt;/p&gt;

&lt;p&gt;Update 2026-01-17: &lt;a href=&quot;https://www.pimco.com/us/en/investments/etf/pimco-us-stocks-plus-active-bond-exchange-traded-fund/usetf-usd&quot;&gt;SPLS&lt;/a&gt; is a newly launched 100% US stocks + 100% bonds ETF, so it has no history to replicate yet. It’s now the fund with the lowest expense ratio, but it holds swaps on PIMCO bond ETFs that have their own expense ratios (as of this writing, SPLS holds swaps on &lt;a href=&quot;https://www.pimco.com/us/en/investments/etf/pimco-us-stocks-plus-active-bond-exchange-traded-fund/usetf-usd&quot;&gt;BOND&lt;/a&gt; at 0.40% and &lt;a href=&quot;https://www.pimco.com/us/en/investments/etf/pimco-multisector-bond-active-exchange-traded-fund/usetf-usd&quot;&gt;PYLD&lt;/a&gt; at 0.55%, among others). Different funds may also have different financing rates on the instruments they use to get leverage.&lt;/p&gt;

&lt;p&gt;I calculated excess costs of RSSB and NTSX as the hypothetical return you’d get if you levered up the benchmark (borrowing at the 3-month T-bill rate), minus the actual historical return of the fund.&lt;/p&gt;
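
&lt;p&gt;Here’s a minimal sketch of that calculation, assuming daily total-return series as numpy arrays and ignoring the drift-based rebalancing discussed below (the real implementation, linked at the end of this post, handles those details):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

# bench_rets is the unlevered benchmark mix; leverage is 2.0 for RSSB
# (100% + 100%) and 1.5 for NTSX (90% + 60%).
def excess_cost(fund_rets, bench_rets, tbill_rets, leverage):
    # Hold `leverage` units of the benchmark, financed by borrowing
    # (leverage - 1) units of cash at the 3-month T-bill rate.
    levered = leverage * bench_rets - (leverage - 1) * tbill_rets
    years = len(fund_rets) / 252  # ~252 trading days per year
    fund_cagr = np.prod(1 + fund_rets) ** (1 / years) - 1
    bench_cagr = np.prod(1 + levered) ** (1 / years) - 1
    return bench_cagr - fund_cagr  # positive = fund cost more than expected
&lt;/code&gt;&lt;/pre&gt;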

&lt;p&gt;The following table shows the total excess cost and after-fee cost for the return stacking ETFs. Excess cost is shown per 100% leverage (the excess on NTSX is multiplied by two because it only has 50% leverage). After-fee cost gives the excess cost minus the difference in expense ratios between the ETF and the benchmark—this represents the “unexpected” portion of the cost, since you expect to pay the expense ratio no matter what. &lt;code&gt;r&lt;/code&gt; gives the correlation between the return stacking ETF and the benchmark. I calculated the average annual cost for each ETF starting from the earliest year for which the ETF had a full year of data.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;ETF&lt;/th&gt;
      &lt;th&gt;Excess Cost&lt;/th&gt;
      &lt;th&gt;After Fee&lt;/th&gt;
      &lt;th&gt;r&lt;/th&gt;
      &lt;th&gt;Start Year&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;RSSB&lt;/td&gt;
      &lt;td&gt;-0.55%&lt;/td&gt;
      &lt;td&gt;-0.84%&lt;/td&gt;
      &lt;td&gt;0.998&lt;/td&gt;
      &lt;td&gt;2024&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;NTSX&lt;/td&gt;
      &lt;td&gt;-0.17%&lt;/td&gt;
      &lt;td&gt;-0.41%&lt;/td&gt;
      &lt;td&gt;0.997&lt;/td&gt;
      &lt;td&gt;2019&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;(The costs were negative, which means the real-life funds &lt;em&gt;outperformed&lt;/em&gt; the benchmarks.)&lt;/p&gt;

&lt;p&gt;Excess costs for each individual year for NTSX:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;2019&lt;/th&gt;
      &lt;th&gt;2020&lt;/th&gt;
      &lt;th&gt;2021&lt;/th&gt;
      &lt;th&gt;2022&lt;/th&gt;
      &lt;th&gt;2023&lt;/th&gt;
      &lt;th&gt;2024&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;NTSX&lt;/td&gt;
      &lt;td&gt;-0.48&lt;/td&gt;
      &lt;td&gt;-3.50&lt;/td&gt;
      &lt;td&gt;2.72&lt;/td&gt;
      &lt;td&gt;1.92&lt;/td&gt;
      &lt;td&gt;-0.93&lt;/td&gt;
      &lt;td&gt;-2.15&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;As we can see, excess costs varied quite a bit from year to year. However, they were still generally lower than the &lt;a href=&quot;https://mdickens.me/2021/03/04/true_cost_of_leveraged_etfs/#measuring-the-cost-of-leveraged-etfs&quot;&gt;costs of conventional leveraged ETFs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In fact, the excess costs were &lt;em&gt;negative&lt;/em&gt; most years. That’s surprising, since the benchmark does not account for transaction costs.&lt;/p&gt;

&lt;p&gt;Why were return stacking funds (apparently) more cost-effective than conventional leveraged ETFs?&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;These funds have lower expense ratios. For example, RSSB charges 0.36% and SSO (a 2x leveraged S&amp;amp;P 500 fund) charges 0.89%.&lt;/li&gt;
  &lt;li&gt;Traditional leveraged ETFs rebalance daily. The Return Stacked and WisdomTree ETFs only rebalance if the holdings drift 5 percentage points away from the target weights; a sketch of one way this rule could work follows this list. Rebalancing has transaction costs, which could be significant or could be close to zero, depending on various factors.&lt;/li&gt;
  &lt;li&gt;Return stacking funds get leverage via Treasury futures, which is approximately the cheapest way to get leverage. Conventional leveraged ETFs primarily use &lt;a href=&quot;https://www.investopedia.com/articles/optioninvestor/07/swaps.asp&quot;&gt;swaps&lt;/a&gt;, which have an opaque pricing structure and might cost a lot more. (I have no idea how much they &lt;em&gt;actually&lt;/em&gt; cost because the pricing is opaque.)&lt;/li&gt;
&lt;/ol&gt;
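
&lt;p&gt;Here’s the promised sketch of the drift rule. This is only one interpretation (as the notes at the end discuss, the prospectuses are ambiguous about the exact method):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# One reading of the 5-percentage-point drift rule. For NTSX the targets
# are 0.90 stocks and 0.60 bond futures per dollar of NAV.
def needs_rebalance(current_weights, target_weights, threshold=0.05):
    return any(
        abs(current - target) &amp;gt; threshold
        for current, target in zip(current_weights, target_weights)
    )

# After a stock rally: stocks at 0.97 of NAV, bond exposure down to 0.55.
needs_rebalance([0.97, 0.55], [0.90, 0.60])  # True: stocks drifted +7pp
&lt;/code&gt;&lt;/pre&gt;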

&lt;p&gt;Those factors explain why return stacking ETFs are cheaper than 3x leveraged index ETFs. But how is it possible for a return stacking ETF to &lt;em&gt;outperform&lt;/em&gt; a leveraged combination of index funds?&lt;/p&gt;

&lt;p&gt;My benchmarks have some margin of error—they do not perfectly track the return stacking ETFs. Based on playing around with the implementation details of the benchmark, I believe it could be off by perhaps one percentage point.&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;The most likely source of tracking error is rebalance timing. Small changes in when you rebalance can significantly change year-to-year performance, especially in years like 2024 when some asset classes performed much better than others. If stocks outpaced bonds for most of the year, and the fund was supposed to rebalance from stocks to bonds, then the real-life fund might have gained an edge over the benchmark by delaying rebalancing a little longer.&lt;/p&gt;

&lt;p&gt;Even if these (apparently) negative costs don’t persist, this still provides evidence that the return stacking ETFs have lower costs than single-asset leveraged ETFs.&lt;/p&gt;

&lt;h3 id=&quot;2026-update&quot;&gt;2026 update&lt;/h3&gt;

&lt;p&gt;The first version of this post, published in February 2025, only included a year of history for RSSB. I’m writing this update in January 2026, and RSSB has now existed for twice as long. Has it maintained its low cost?&lt;/p&gt;

&lt;p&gt;Yes. Here’s the excess cost of RSSB from inception (2023-12-05) to yesterday (2026-01-15):&lt;sup id=&quot;fnref:8&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;Excess Cost&lt;/th&gt;
      &lt;th&gt;After Fee&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;RSSB&lt;/td&gt;
      &lt;td&gt;-0.58%&lt;/td&gt;
      &lt;td&gt;-0.87%&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;As was the case in 2024, RSSB has a &lt;em&gt;negative&lt;/em&gt; excess cost, i.e., it was &lt;em&gt;cheaper&lt;/em&gt; than its benchmark.&lt;/p&gt;

&lt;p&gt;The excess costs by year:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;2024&lt;/th&gt;
      &lt;th&gt;2025&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;RSSB&lt;/td&gt;
      &lt;td&gt;-0.75&lt;/td&gt;
      &lt;td&gt;-0.61&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;NTSX also kept its costs low in 2025, with an excess cost of –0.50%.&lt;/p&gt;

&lt;h2 id=&quot;pros-and-cons-of-return-stacking-funds&quot;&gt;Pros and cons of return stacking funds&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;They’re a convenient way to get leverage, much more convenient than options or futures.&lt;/li&gt;
  &lt;li&gt;They appear to have lower all-in costs than conventional leveraged ETFs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;None of them offer greater than a 100% allocation to equities.&lt;/li&gt;
  &lt;li&gt;Limited choices—there are only a handful of return stacking ETFs available, and they might not include the asset classes you want.
    &lt;ul&gt;
      &lt;li&gt;I personally would like to see a global stocks + managed futures ETF, but that doesn’t exist. There’s only US stocks + managed futures (RSST) and bonds + managed futures (RSBT).&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;As with other leveraged ETFs, the costs of return stacking ETFs fluctuate from year to year. Even though the costs are low on average, in any given year a return stacking ETF might perform worse than expected.&lt;/li&gt;
  &lt;li&gt;I could only determine the costs for two of the return stacking ETFs. The others might have higher costs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;are-bonds-a-good-investment&quot;&gt;Are bonds a good investment?&lt;/h2&gt;

&lt;p&gt;Most of the return stacking funds hold bonds. A question that some people ask:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Does it make sense to own return-stacked stocks + bonds? Wouldn’t I rather have pure leveraged stocks instead?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Good question! I don’t know!&lt;/p&gt;

&lt;p&gt;An argument against buying bonds:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Right now, the &lt;a href=&quot;https://home.treasury.gov/resource-center/data-chart-center/interest-rates/TextView?type=daily_treasury_yield_curve&amp;amp;field_tdr_date_value_month=202502&quot;&gt;yield curve&lt;/a&gt; is nearly flat: yields on long-term bonds are only slightly higher than on short-term bonds. Why would you borrow at the short-term rate to earn the long-term rate if those rates are (nearly) the same?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Two counter-arguments:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;ol&gt;
    &lt;li&gt;The efficient market hypothesis predicts that you can’t time the bond market, so you shouldn’t change how you invest based on what the yield curve looks like.&lt;/li&gt;
    &lt;li&gt;A flat or inverted yield curve suggests that short-term rates will go down in the future. You might want to “lock in” the current rate by buying long-term bonds.&lt;/li&gt;
  &lt;/ol&gt;
&lt;/blockquote&gt;

&lt;p&gt;(Really these counter-arguments are the same—the (presumed) reason why the yield curve is flat is because the market is pricing in future changes in bond yields.)&lt;/p&gt;
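
&lt;p&gt;To put rough numbers on the flat-curve argument (illustrative yields, not current market rates):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;short_rate = 0.043  # 3-month T-bill: roughly what the leverage costs
long_yield = 0.045  # 10-year Treasury: what the levered bonds earn
carry = long_yield - short_rate  # 0.002, i.e. a 0.2% expected pickup
# The lock-in counter-argument: if short rates later fall to 3.3%, the
# carry widens to 1.2%, and the long bond gains value as yields drop.
&lt;/code&gt;&lt;/pre&gt;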

&lt;p&gt;Another argument against bonds:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;In the long run, bonds have only earned a little bit of a premium over short-term T-bills. Given the overhead costs of using leverage, leveraged bonds might have near-zero or even negative expected return.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And two counter-arguments:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;ol&gt;
    &lt;li&gt;If you can borrow at close to the risk-free rate, bonds should still have a positive long-run premium.&lt;/li&gt;
    &lt;li&gt;Even if leveraged bonds have ~zero expected return, they still add value to a portfolio if they perform well during equity downturns.&lt;/li&gt;
  &lt;/ol&gt;
&lt;/blockquote&gt;

&lt;p&gt;Which side of the argument is correct is left as an exercise for the reader.&lt;/p&gt;

&lt;p&gt;If you don’t want to hold bonds, there are some bondless return stacking ETFs available:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.returnstackedetfs.com/rsst-return-stacked-us-stocks-managed-futures/&quot;&gt;RSST&lt;/a&gt; holds stocks + managed futures (a.k.a. &lt;a href=&quot;https://en.wikipedia.org/wiki/Trend_following&quot;&gt;trendfollowing&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.returnstackedetfs.com/rssy-return-stacked-us-stocks-futures-yield/&quot;&gt;RSSY&lt;/a&gt; holds stocks + futures yield (a.k.a. &lt;a href=&quot;https://en.wikipedia.org/wiki/Carry_(investment)&quot;&gt;carry&lt;/a&gt; or &lt;a href=&quot;https://en.wikipedia.org/wiki/Roll_yield&quot;&gt;roll yield&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://etfdb.com/etf/BTGD&quot;&gt;BTGD&lt;/a&gt; holds bitcoin + gold.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.wisdomtree.com/investments/etfs/capital-efficient/GDE&quot;&gt;GDE&lt;/a&gt; holds US stocks + gold.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.wisdomtree.com/investments/etfs/capital-efficient/GDMN&quot;&gt;GDMN&lt;/a&gt; holds gold miner stocks + gold.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I am a big fan of managed futures trendfollowing—it’s a strategy with strong historical performance that provided protection during market downturns, and I think it’s likely to continue working in the future (for more, see &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.2993026&quot;&gt;Hurst et al. (2017), “A Century of Evidence on Trend-Following Investing”&lt;/a&gt;&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;). I’m ambivalent about carry (I’ve heard good arguments both for and against using it). I personally wouldn’t invest in bitcoin or gold, but if that’s your thing, return stacking ETFs give you a way to do it.&lt;/p&gt;
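
&lt;p&gt;For a sense of what the strategy does: the rule studied in Hurst et al. is time-series momentum, which in its simplest form looks something like this (a toy sketch, not how any of these funds actually trade):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Toy time-series momentum rule: go long an asset that has gone up over
# the past year, short an asset that has gone down.
def trend_position(trailing_12m_excess_return):
    if trailing_12m_excess_return &amp;gt; 0:
        return 1.0   # long one unit
    return -1.0      # short one unit
&lt;/code&gt;&lt;/pre&gt;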

&lt;p&gt;(I don’t own RSST, but I hold something similar in my own portfolio—equities (&lt;a href=&quot;https://funds.alphaarchitect.com/aavm/&quot;&gt;AAVM&lt;/a&gt;) with managed futures stacked on top.)&lt;/p&gt;

&lt;h2 id=&quot;source-code&quot;&gt;Source code&lt;/h2&gt;

&lt;p&gt;Source code is available &lt;a href=&quot;https://github.com/michaeldickens/leveraged-etfs&quot;&gt;on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;acknowledgments&quot;&gt;Acknowledgments&lt;/h2&gt;

&lt;p&gt;Thanks to Corey Hoffstein for helping me work out the implementation details of my benchmark.&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;“bonds” means any sort of bonds, including Treasury or corporate bonds. “Treasury bonds” means just Treasury bonds. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I weighted VTI at 62% and VXUS at 38% as of the beginning of 2024 because those are the weightings I get if I reverse-engineer from RSSB’s current weightings.&lt;/p&gt;

      &lt;p&gt;It would be simpler to use &lt;a href=&quot;https://etfdb.com/etf/VT/&quot;&gt;VT&lt;/a&gt;, which includes all the same stocks as VTI + VXUS. But RSSB itself holds VTI + VXUS, and I found that breaking out equities into two separate ETFs produces a slightly more accurate benchmark. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;RSSB gets exposure to bonds via an equal-weighted combination of bond futures at four maturities: 2-year, 5-year, 10-year, and long (i.e. 25- to 30-year). I replicated this using:&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;25% S&amp;amp;P 2-Year U.S. Treasury Note Futures Total Return Index&lt;/li&gt;
        &lt;li&gt;25% S&amp;amp;P 5-Year U.S. Treasury Note Futures Total Return Index&lt;/li&gt;
        &lt;li&gt;25% S&amp;amp;P 10-Year U.S. Treasury Note Futures Total Return Index&lt;/li&gt;
        &lt;li&gt;25% S&amp;amp;P Ultra T-Bond Futures Total Return Index&lt;/li&gt;
      &lt;/ul&gt;

      &lt;p&gt;These indexes should exactly match the bond futures that RSSB holds. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I wasn’t entirely sure what position to use to replicate NTSX’s bond holdings. Its &lt;a href=&quot;https://www.wisdomtree.com/investments/-/media/us-media-files/documents/resource-library/investment-case/the-case-for-the-efficient-core-fund-family.pdf&quot;&gt;materials&lt;/a&gt; include illustrative figures that use a 7-10 year Treasury index as a benchmark, which suggests I should use &lt;a href=&quot;https://etfdb.com/etf/IEF/&quot;&gt;IEF&lt;/a&gt; or perhaps 10-year Treasury futures. But the latest &lt;a href=&quot;https://www.wisdomtree.com/investments/-/media/us-media-files/documents/resource-library/fund-reports-schedules/statistics/wisdomtree-fi-export-statistics-ntsx.pdf&quot;&gt;holdings&lt;/a&gt; show that it uses a combination of bond futures of different durations.&lt;/p&gt;

      &lt;p&gt;I found the best correlation to NTSX when using a weighted combination of four Treasury futures:&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;12% S&amp;amp;P 2-Year U.S. Treasury Note Futures Total Return Index&lt;/li&gt;
        &lt;li&gt;12% S&amp;amp;P 5-Year U.S. Treasury Note Futures Total Return Index&lt;/li&gt;
        &lt;li&gt;24% S&amp;amp;P 10-Year U.S. Treasury Note Futures Total Return Index&lt;/li&gt;
        &lt;li&gt;12% S&amp;amp;P Ultra T-Bond Futures Total Return Index&lt;/li&gt;
      &lt;/ul&gt;

      &lt;p&gt;As of this writing, NTSX holds 12% in 10-Year U.S. Treasury Note Futures and 12% in Ultra 10-Year U.S. Treasury Note Futures (which are like the normal 10-year futures except that they are &lt;a href=&quot;https://www.cmegroup.com/markets/interest-rates/us-treasury/ultra-10-year-us-treasury-note.html&quot;&gt;more closely tied&lt;/a&gt; to a 10-year maturity). I did not use Ultra futures in my benchmark because they only launched a few years ago. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Some minor changes that affect the return of the benchmark:&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;The funds rebalance whenever weights drift 5% away from the target. But the prospectuses for RSSB and NTSX were not clear about what exactly that meant—I can think of at least four different interpretations. Corey Hoffstein (who co-runs RSSB) explained to me exactly how the rebalancing works, and I assumed NTSX works the same way but I don’t know for sure. Different rebalancing methods can change the average return by as much as one percentage point—a fund might get lucky and rebalance into a position right before it rockets up, or the opposite might happen.&lt;/li&gt;
        &lt;li&gt;Changing how the benchmark invests in bonds can change the return. &lt;a href=&quot;https://etfdb.com/etf/GOVT/&quot;&gt;GOVT&lt;/a&gt; outperformed the Treasury futures ladder over the sample period. Changing the NTSX benchmark to use GOVT increased its return by 24 &lt;a href=&quot;https://en.wikipedia.org/wiki/Basis_point&quot;&gt;bps&lt;/a&gt; (but also decreased the correlation to NTSX from 0.997 to 0.993).&lt;/li&gt;
        &lt;li&gt;My program might have a bug. Shortly before posting this article, I discovered that I was incorrectly calculating how much cash the benchmark needed to borrow and thus overestimating interest payments by about 40 bps per year.&lt;/li&gt;
      &lt;/ul&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;My original analysis used NAV, but for these tables I switched to using the ETF’s market value instead. NAV has a 10x tighter correlation to my benchmark (0.997 vs. 0.96), but my NAV data is rounded to two significant figures and my price data has many significant figures, so the NAV return is less accurate. &lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Hurst, B., Ooi, Y. H., &amp;amp; Pedersen, L. H. (2017). &lt;a href=&quot;https://dx.doi.org/10.2139/ssrn.2993026&quot;&gt;A Century of Evidence on Trend-Following Investing.&lt;/a&gt; &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Originally I couldn’t figure out how to get my benchmark’s correlation to RSSB higher than 0.976. Turns out I needed to compare the benchmark to RSSB’s NAV, not its daily closing price. Corey explained to me that NAV and price diverge because daily futures prices settle at 3pm but exchanges close at 4pm, so any market movements in that hour will show up in RSSB’s price but not in its NAV or in the benchmark.&lt;/p&gt;

      &lt;p&gt;I had the same problem with my NTSX benchmark and was able to fix it the same way.&lt;/p&gt;

      &lt;p&gt;I also originally implemented rebalancing using an incorrect method, and Corey clarified the correct method to use. (This did not improve the correlation.) &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>I was probably wrong about HIIT and VO2max</title>
				<pubDate>Mon, 03 Feb 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/02/03/I_was_probably_wrong_about_HIIT_and_VO2max/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/02/03/I_was_probably_wrong_about_HIIT_and_VO2max/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;This research piece is not as rigorous or polished as usual. I wrote it quickly in a stream-of-consciousness style, which means it’s more reflective of my actual reasoning process.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;My understanding of HIIT (high-intensity interval training) as of a week ago:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;VO2max is the best fitness indicator for predicting health and longevity.&lt;/li&gt;
  &lt;li&gt;HIIT, especially long-duration intervals (4+ minutes), is the best way to improve VO2max.&lt;/li&gt;
  &lt;li&gt;Intervals should be done at the maximum sustainable intensity.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I now believe those are all probably wrong.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;I think I got the wrong idea because a lot of HIIT/VO2max promoters cite scientific studies, which makes them seem superficially reasonable, but the studies they cite aren’t very good, or aren’t interpreted correctly.&lt;/p&gt;

&lt;p&gt;A few months ago I started incorporating some HIIT into my cardio routine. But I didn’t really know the best way to do it, so last week I decided to do some research. I looked up the most-cited meta-analyses on Google Scholar and I noticed my confusion when the meta-analyses didn’t seem to support the conventional wisdom that HIIT is the best way to improve VO2max.&lt;/p&gt;

&lt;p&gt;The most comprehensive single source I found was a meta-meta-analysis by Crowley et al. (2022)&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;, which reviewed the findings of meta-analyses on HIT (high-intensity training) vs LIT (low-intensity training) for VO2max. The key quote:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Evidence from the meta-analyses that directly compared LIT versus HIT protocols on VO2max was, ostensibly, reported as either trivial or inconclusive. Three out of the six included meta-analyses reported small/moderate beneficial effects of HIT over LIT (α &amp;lt; 0.05). However, two of these reviews reported “substantial” heterogeneity (I2&amp;gt;0.75), small-study bias (p &amp;lt; 0.10), a relatively small pooled sample size (i.e., &amp;lt;1,000 participants), had a high degree of overlap (CCA = 11%) and reported several moderators (e.g., baseline fitness levels, age, HIT variables [e.g., volume, frequency, and duration]), which likely affected results.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Also, in my naiveté I had assumed that these were meta-analyses of RCTs, but in fact most of the included studies weren’t even RCTs:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Scribbans et al. reported that none of their included studies applied RCTs, Sloth et al. reported only four studies that applied RCTs design, and Gist et al. reported that the majority of included studies were RCTs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;(Note: Gist et al.&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;, which did look mainly at RCTs, found that sprint interval training did not work better than endurance training (Cohen’s d = 0.04, 95% CI = -0.17 to 0.24).)&lt;/p&gt;

&lt;p&gt;So I thought, okay, these meta-analyses don’t seem to favor HIIT much if at all. But maybe they’re done by stuffy academics who don’t know anything about real training. Are these meta-analyses considered respectable? So I went to see what the Barbell Medicine&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; guys thought. I have a lot of respect for them when it comes to strength training. I don’t know if they know about cardio, but one of them is a former competitive swimmer so probably they know &lt;em&gt;something&lt;/em&gt;. And they have good epistemics on strength training, and good epistemics might generalize. They did a podcast&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; on HIIT with some useful content:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;They started the podcast by criticizing a tweet&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; in which fitness influencer Rhonda Patrick recommended a collection of “evidence-based HIIT protocols”. Their criticism mainly focused on how (they claimed) the provided HIIT protocols were way too hard. They quoted two responses&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt; by exercise physiologists arguing the same.&lt;/li&gt;
  &lt;li&gt;They went on to talk about some of the research on HIIT, citing the same meta-analyses that I’d looked at.&lt;/li&gt;
  &lt;li&gt;Their ultimate recommendation (also given in an article&lt;sup id=&quot;fnref:8&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt;): “it is reasonable for about 80% of training to be of moderate intensity (zones 1-2), and about 20% reserved for higher intensity work (HIIT or SIT)”. They also said it’s good to use a variety of HIIT protocols and that there is no single optimal protocol.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I didn’t know if the Barbell Medicine guys were right about any of that, but it gave me some direction.&lt;/p&gt;

&lt;p&gt;I had a look at the website of one of the people from that Twitter thread, Steve Magness&lt;sup id=&quot;fnref:9&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt;. He ran a 4:01 mile in high school so he probably has some idea of what he’s talking about.&lt;/p&gt;

&lt;p&gt;Now, when some science-literate fitness influencers like Peter Attia and Rhonda Patrick give some recommendations about HIIT, and some other people like Steve Magness and Barbell Medicine disagree with them, I don’t have sufficient expertise to say who’s right. Both sides have the trappings of scientific credibility (e.g. citing multiple studies). But one thing I &lt;em&gt;can&lt;/em&gt; do is check their logic.&lt;/p&gt;

&lt;p&gt;So I checked Steve Magness’s logic. He wrote in a Twitter thread:&lt;sup id=&quot;fnref:10&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;You need all intensities to max VO2max. So it’s dumb to pit one vs. other&lt;/p&gt;

  &lt;p&gt;But research shows continuous likely matches HIIT for Vo2max increase&lt;/p&gt;

  &lt;p&gt;HIIT appears better when you constrain to 8 weeks but when you look over longer time it equalizes&lt;/p&gt;

  &lt;p&gt;Here’s data from a recent review.&lt;sup id=&quot;fnref:11&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

  &lt;p&gt;You can see HIIT appears to increase VO2max more because of the time frame of most training studies (6-8 weeks).&lt;/p&gt;

  &lt;p&gt;Intense work gets big boost, then levels off. Endurance work gives Longer more gradual boost.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;img src=&quot;https://mdickens.me/assets/images/Molmen-2024.jpeg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The referenced review does seem to support Magness’s argument, but I don’t know if the review is any good. What I do know is that Magness’s logic makes sense. It stands to reason that a more intense exercise protocol will cause faster short-term gains, but it can’t keep producing those rapid gains forever. And it stands to reason that if most studies on HIIT vs. LIT only last 6-12 weeks, then they will underestimate the long-term benefits of LIT.&lt;/p&gt;

&lt;p&gt;That makes logical sense to me, which makes me think Steve Magness knows what he’s talking about.&lt;/p&gt;

&lt;p&gt;He also wrote an article&lt;sup id=&quot;fnref:12&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt; arguing that some people care too much about VO2max for longevity:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Vo2max matters. But it’s just one component of many that make up both performance and aerobic fitness. And that’s important because if we return to the original claims that Vo2max is the key indicator of longevity, we’ll find that the majority of the studies cited did NOT even use Vo2max as the main variable. They used performance! In the majority of research, peak speed and incline during the exhausting test was the main correlate to longevity.&lt;/p&gt;

  &lt;p&gt;The large study&lt;sup id=&quot;fnref:13&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:13&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt; on 750,000 veterans that found a 4-fold higher mortality risk for low versus high fitness used peak speed and incline, not Vo2max. Same with the research&lt;sup id=&quot;fnref:14&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;14&lt;/a&gt;&lt;/sup&gt; on 120,000 individuals finding a 5x difference in the risk of early death.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That makes logical sense to me. VO2max is only one aspect of fitness (albeit an important one), and it stands to reason that your actual ability to perform physical tasks is a better measure of physical health.&lt;/p&gt;

&lt;p&gt;I did also look at some of the evidence cited in that article, specifically the Harber et al. (2017)&lt;sup id=&quot;fnref:15&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:15&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;15&lt;/a&gt;&lt;/sup&gt; meta-analysis, and Table 2 confirms Magness’s claim—most studies measured speed, time, or total work performed, not VO2max directly.&lt;/p&gt;

&lt;p&gt;Insofar as I can verify Steve Magness’s claims, they seem to be correct. He also claims that HIIT protocols should not be “all-out”—for example, 60-second running intervals should be done between a 5K and a one-mile pace.&lt;sup id=&quot;fnref:16&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:16&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;16&lt;/a&gt;&lt;/sup&gt; He doesn’t cite any research on that claim, and as far as I know, there &lt;em&gt;isn’t&lt;/em&gt; really research on it, it’s just how most high-performing athletes train. But since he seems right about the verifiable claims he’s made, I expect he’s right about that, too.&lt;/p&gt;

&lt;p&gt;On the subject of checking people’s logic, here is an (admittedly cherry-picked) quote from the other side of the argument:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;[I]s anything done in a total of 10-minutes that big of a deal?&lt;/p&gt;

  &lt;p&gt;If anyone is misinterpreting my statement as prescriptive:&lt;/p&gt;

  &lt;p&gt;My underlying point was that anything you can do in 10-minutes is limited on a relative harm basis, even if you do a lot.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This does not make logical sense to me. You can absolutely hurt yourself in less than 10 minutes. Even putting injury risk aside,&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;I’ve done 10-minute hill sprint intervals in the morning that left me feeling tired all day.&lt;/li&gt;
  &lt;li&gt;A 10-rep max squat takes less than 60 seconds, but it makes my legs sore for the next two days.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(Maybe that’s less logic and more personal experience, but a single example is enough to disprove a universal claim.)&lt;/p&gt;

&lt;p&gt;I think Barbell Medicine has good logic, too. In their podcast on HIIT, they talk about how in strength training, nobody lifts the maximum possible weight every week. (And I have enough personal experience to know that maxing out every week wouldn’t work.) So it probably doesn’t make sense to max out your aerobic capacity every week, either. I’m not sure strength training and cardio work the same way in that respect, but I expect things to be the same unless I have reason to believe they’re different.&lt;/p&gt;

&lt;p&gt;The Barbell Medicine article on HIIT&lt;sup id=&quot;fnref:8:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt; has some nice sample workouts that line up with Magness’s recommendations:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Workout #1&lt;/strong&gt;&lt;/p&gt;

  &lt;p&gt;4 to 6 rounds of: 30 seconds on at 600-800 m running pace (or a speed sustainable in the range of ~90-150 seconds), 4 min off / easy effort&lt;/p&gt;

  &lt;p&gt;&lt;strong&gt;Workout #2&lt;/strong&gt;&lt;/p&gt;

  &lt;p&gt;8 to 10 rounds of: 1-minute on at 1 mile-5 km running pace (or a speed sustainable in the range of 6-25 minutes), 1 minute off&lt;/p&gt;

  &lt;p&gt;&lt;strong&gt;Workout #3&lt;/strong&gt;&lt;/p&gt;

  &lt;p&gt;3 to 5 rounds of: 5 minutes on at zone 4 heart rate (85-95% max), 3 min rest&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I want to investigate this further, but here’s what I tentatively believe:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;VO2max predicts longevity, but athletic performance matters more than VO2max alone.&lt;/li&gt;
  &lt;li&gt;I should exercise at a variety of intensities to get a well-rounded fitness capacity, but HIIT isn’t particularly better for improving fitness than LIT, and 4-minute intervals aren’t particularly better than other interval schemes.&lt;/li&gt;
  &lt;li&gt;Intervals should &lt;em&gt;not&lt;/em&gt; be done at the maximum sustainable intensity; they should be done at an intensity that’s challenging but doesn’t leave you wiped out. As Magness wrote, “The goal isn’t to create fatigue, that’s easy to do. The goal is to slightly embarrass your body in the right direction.”&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Coming back to my own training: I have loathed every version of HIIT I’ve tried so far. But that’s because I listened to the people saying that HIIT should be “all-out.” Next time I’m going to do HIIT at a more comfortable pace.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Crowley, E., Powell, C., Carson, B. P., &amp;amp; W. Davies, R. (2022). &lt;a href=&quot;https://doi.org/10.1155/2022/9310710&quot;&gt;The Effect of Exercise Training Intensity on VO2max in Healthy Adults: An Overview of Systematic Reviews and Meta-Analyses.&lt;/a&gt; &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Gist, N. H., Fedewa, M. V., Dishman, R. K., &amp;amp; Cureton, K. J. (2013). &lt;a href=&quot;https://doi.org/10.1007/s40279-013-0115-0&quot;&gt;Sprint Interval Training Effects on Aerobic Capacity: A Systematic Review and Meta-Analysis.&lt;/a&gt; &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://www.barbellmedicine.com/&quot;&gt;https://www.barbellmedicine.com/&lt;/a&gt; &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://redcircle.com/shows/0cc66fc4-ccb8-4c60-8cc6-7367e52c4159/episodes/706bc687-0a98-4057-8e2e-e4db349bba4a&quot;&gt;https://redcircle.com/shows/0cc66fc4-ccb8-4c60-8cc6-7367e52c4159/episodes/706bc687-0a98-4057-8e2e-e4db349bba4a&lt;/a&gt; &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://x.com/foundmyfitness/status/1844811732080021919&quot;&gt;https://x.com/foundmyfitness/status/1844811732080021919&lt;/a&gt; &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://x.com/StephenSeiler/status/1845357464130031873&quot;&gt;https://x.com/StephenSeiler/status/1845357464130031873&lt;/a&gt; &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://x.com/stevemagness/status/1845079291320525202&quot;&gt;https://x.com/stevemagness/status/1845079291320525202&lt;/a&gt; &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://www.barbellmedicine.com/blog/hiit-high-intensity-interval-training/&quot;&gt;https://www.barbellmedicine.com/blog/hiit-high-intensity-interval-training/&lt;/a&gt; &lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:8:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:9&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://www.stevemagness.com/about/&quot;&gt;https://www.stevemagness.com/about/&lt;/a&gt; &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:10&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://x.com/stevemagness/status/1849918347086795151&quot;&gt;https://x.com/stevemagness/status/1849918347086795151&lt;/a&gt; &lt;a href=&quot;#fnref:10&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:11&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Mølmen, K. S., Almquist, N. W., &amp;amp; Skattebo, Ø. (2024). &lt;a href=&quot;https://link.springer.com/article/10.1007/s40279-024-02120-2&quot;&gt;Effects of Exercise Training on Mitochondrial and Capillary Growth in Human Skeletal Muscle: A Systematic Review and Meta-Regression.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.1007/s40279-024-02120-2&quot;&gt;10.1007/s40279-024-02120-2&lt;/a&gt; &lt;a href=&quot;#fnref:11&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:12&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://thegrowtheq.com/longevity-and-vo2max-does-it-matter/&quot;&gt;https://thegrowtheq.com/longevity-and-vo2max-does-it-matter/&lt;/a&gt; &lt;a href=&quot;#fnref:12&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:13&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Kokkinos, P., Faselis, C., Samuel, I. B. H., Pittaras, A., Doumas, M., Murphy, R., Heimall, M. S. et al. (2022). &lt;a href=&quot;https://doi.org/10.1016/j.jacc.2022.05.031&quot;&gt;Cardiorespiratory Fitness and Mortality Risk Across the Spectra of Age, Race, and Sex.&lt;/a&gt; &lt;a href=&quot;#fnref:13&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:14&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Mandsager, K., Harb, S., Cremer, P., Phelan, D., Nissen, S. E., &amp;amp; Jaber, W. (2018). &lt;a href=&quot;https://doi.org/10.1001/jamanetworkopen.2018.3605&quot;&gt;Association of Cardiorespiratory Fitness With Long-term Mortality Among Adults Undergoing Exercise Treadmill Testing.&lt;/a&gt; &lt;a href=&quot;#fnref:14&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:15&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Harber, M. P., Kaminsky, L. A., Arena, R., Blair, S. N., Franklin, B. A., Myers, J., &amp;amp; Ross, R. (2017). &lt;a href=&quot;https://mdickens.me/materials/harber2017.pdf&quot;&gt;Impact of Cardiorespiratory Fitness on All-Cause and Disease-Specific Mortality: Advances Since 2009.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.1016/j.pcad.2017.03.001&quot;&gt;10.1016/j.pcad.2017.03.001&lt;/a&gt; &lt;a href=&quot;#fnref:15&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:16&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://x.com/stevemagness/status/1845079292784365803&quot;&gt;https://x.com/stevemagness/status/1845079292784365803&lt;/a&gt; &lt;a href=&quot;#fnref:16&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Retroactive If-Then Commitments</title>
				<pubDate>Sat, 01 Feb 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/02/01/retroactive_if-then_commitments/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/02/01/retroactive_if-then_commitments/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;An &lt;a href=&quot;https://www.lesswrong.com/posts/sMtS9Eof6QC6sPouB/if-then-commitments-for-ai-risk-reduction-by-holden&quot;&gt;if-then commitment&lt;/a&gt; is a framework for responding to AI risk: “If an AI model has capability X, then AI development/deployment must be halted until mitigations Y are put in place.”&lt;/p&gt;

&lt;p&gt;As an extension of this approach, we should consider &lt;strong&gt;retroactive if-then commitments&lt;/strong&gt;. We should behave &lt;em&gt;as if&lt;/em&gt; we wrote if-then commitments a few years ago, and we should commit to implementing whatever mitigations we &lt;em&gt;would have&lt;/em&gt; committed to back then.&lt;/p&gt;

&lt;p&gt;Imagine how an if-then commitment might have been written in 2020:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Pause AI development and figure out mitigations if:&lt;/p&gt;

  &lt;ul&gt;
    &lt;li&gt;AI exhibits what looks like deceptive or &lt;a href=&quot;https://www.lesswrong.com/posts/8gy7c8GAPkuu6wTiX/frontier-models-are-capable-of-in-context-scheming&quot;&gt;misaligned&lt;/a&gt; behavior, or feigns alignment (&lt;a href=&quot;https://assets.anthropic.com/m/983c85a201a962f/original/Alignment-Faking-in-Large-Language-Models-full-paper.pdf&quot;&gt;1&lt;/a&gt;, &lt;a href=&quot;https://www.lesswrong.com/posts/njAZwT8nkHnjipJku/alignment-faking-in-large-language-models?commentId=uXBf8XwDyryXYiTRu&quot;&gt;1b&lt;/a&gt;, &lt;a href=&quot;https://www.transformernews.ai/p/openai-o1-alignment-faking&quot;&gt;2&lt;/a&gt;)&lt;/li&gt;
    &lt;li&gt;AI &lt;a href=&quot;https://www.zmescience.com/science/news-science/chat-gpt-escaped-containment/&quot;&gt;breaks out of containment&lt;/a&gt; in a toy example&lt;/li&gt;
    &lt;li&gt;AI &lt;a href=&quot;https://www.forbes.com/sites/daveywinder/2024/11/05/google-claims-world-first-as-ai-finds-0-day-security-vulnerability/&quot;&gt;finds a real-world zero-day vulnerability&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;AI &lt;a href=&quot;https://www.metaculus.com/questions/3698/when-will-an-ai-achieve-a-98th-percentile-score-or-higher-in-a-mensa-admission-test/&quot;&gt;qualifies for Mensa&lt;/a&gt;&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;li&gt;AI exhibits some degree of &lt;a href=&quot;https://www.anthropic.com/news/3-5-models-and-computer-use&quot;&gt;agentic capabilities&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;AI &lt;a href=&quot;https://x.com/elder_plinius/status/1858177213201367478&quot;&gt;writes malware&lt;/a&gt;&lt;/li&gt;
  &lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;Well, AI models have now done or nearly-done all of those things.&lt;/p&gt;

&lt;p&gt;We don’t know what mitigations are appropriate, so AI companies should pause development until (at a minimum) AI safety researchers agree on what mitigations are warranted, and those mitigations are then fully implemented.&lt;/p&gt;

&lt;p&gt;(You could argue about whether AI &lt;em&gt;really&lt;/em&gt; hit those capability milestones, but that doesn’t particularly matter. You need to pause and/or restrict development of an AI system when it looks &lt;em&gt;potentially&lt;/em&gt; dangerous, not &lt;em&gt;definitely&lt;/em&gt; dangerous.)&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Okay, technically it did not score well enough to qualify, but it scored well enough that there was some ambiguity about whether it qualified, which is only a little bit less concerning. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>The 7 Best High-Protein Breakfast Cereals</title>
				<pubDate>Fri, 17 Jan 2025 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2025/01/17/high_protein_breakfast_cereals/</link>
				<guid isPermaLink="true">http://mdickens.me/2025/01/17/high_protein_breakfast_cereals/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;Updated 2025-03-19 to add Catalina Crunch Cinnamon Toast.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;(I write listicles now)&lt;/p&gt;

&lt;p&gt;(there are only 7 eligible high-protein breakfast cereals, so the ones at the bottom are still technically among the 7 best even though they’re not good)&lt;/p&gt;

&lt;p&gt;If you search the internet, you can find rankings of the best “high-protein” breakfast cereals. But most of the entries on those lists don’t even have that much protein. I don’t like that, so I made my own list.&lt;/p&gt;

&lt;p&gt;This is my ranking of genuinely high-protein breakfast cereals, which I define as containing at least 25% calories from protein.&lt;/p&gt;

&lt;p&gt;Many food products like to advertise how many grams of protein they have per serving. That number doesn’t matter because it depends on how big a serving is. Hypothetically, if a food had 6g protein per serving but each serving contained 2000 calories, that would be a terrible deal. The actual number that matters is the &lt;em&gt;proportion&lt;/em&gt; of calories from protein.&lt;/p&gt;
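
&lt;p&gt;To make the arithmetic concrete (protein has roughly 4 calories per gram):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;def pct_calories_from_protein(protein_grams, total_calories):
    return 4 * protein_grams / total_calories  # ~4 kcal per gram of protein

pct_calories_from_protein(6, 2000)  # 0.012: only 1.2% of calories
pct_calories_from_protein(6, 100)   # 0.24: the same 6 g is a solid 24%
&lt;/code&gt;&lt;/pre&gt;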

&lt;p&gt;My ranking only includes vegan cereals because I’m vegan. Fortunately most cereals are vegan anyway. The main exception is that some cereals contain whey protein, but that’s not too common—most of them use soy, pea, or wheat protein instead.&lt;/p&gt;

&lt;h2 id=&quot;high-protein-cereals-ranked-by-flavor&quot;&gt;High-protein cereals, ranked by flavor&lt;/h2&gt;

&lt;!-- more --&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#high-protein-cereals-ranked-by-flavor&quot; id=&quot;markdown-toc-high-protein-cereals-ranked-by-flavor&quot;&gt;High-protein cereals, ranked by flavor&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#1-oatmeal-with-added-protein-powder-27-calories-from-protein-if-you-make-it-the-way-i-do&quot; id=&quot;markdown-toc-1-oatmeal-with-added-protein-powder-27-calories-from-protein-if-you-make-it-the-way-i-do&quot;&gt;1. &lt;strong&gt;Oatmeal with added protein powder&lt;/strong&gt; (27% calories from protein, if you make it the way I do)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#2-special-k-high-protein-chocolate-almond-33-calories-from-protein&quot; id=&quot;markdown-toc-2-special-k-high-protein-chocolate-almond-33-calories-from-protein&quot;&gt;2. &lt;strong&gt;Special K High Protein Chocolate Almond&lt;/strong&gt; (33% calories from protein)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#3-catalina-crunch-40-calories-from-protein&quot; id=&quot;markdown-toc-3-catalina-crunch-40-calories-from-protein&quot;&gt;3. &lt;strong&gt;Catalina Crunch&lt;/strong&gt; (40% calories from protein)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#4-wheaties-protein-supplemented-with-lysine-32-calories-from-protein&quot; id=&quot;markdown-toc-4-wheaties-protein-supplemented-with-lysine-32-calories-from-protein&quot;&gt;4. &lt;strong&gt;Wheaties Protein&lt;/strong&gt; (supplemented with lysine) (32% calories from protein)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#5-post-premier-protein-44-calories-from-protein&quot; id=&quot;markdown-toc-5-post-premier-protein-44-calories-from-protein&quot;&gt;5. &lt;strong&gt;Post Premier Protein&lt;/strong&gt; (44% calories from protein)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#6-special-k-zero-50-calories-from-protein&quot; id=&quot;markdown-toc-6-special-k-zero-50-calories-from-protein&quot;&gt;6. &lt;strong&gt;Special K Zero&lt;/strong&gt; (50% calories from protein)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#7-three-wishes-24-to-27-calories-from-protein-depending-on-flavor&quot; id=&quot;markdown-toc-7-three-wishes-24-to-27-calories-from-protein-depending-on-flavor&quot;&gt;7. &lt;strong&gt;Three Wishes&lt;/strong&gt; (24% to 27% calories from protein, depending on flavor)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#honorable-mention-kashi-go&quot; id=&quot;markdown-toc-honorable-mention-kashi-go&quot;&gt;Honorable mention: Kashi Go&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#unranked-because-its-not-vegan-magic-spoon-33-to-37-calories-from-protein-depending-on-flavor&quot; id=&quot;markdown-toc-unranked-because-its-not-vegan-magic-spoon-33-to-37-calories-from-protein-depending-on-flavor&quot;&gt;Unranked because it’s not vegan: &lt;strong&gt;Magic Spoon&lt;/strong&gt; (33% to 37% calories from protein, depending on flavor)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#some-other-non-vegan-cereals-that-i-know-nothing-about&quot; id=&quot;markdown-toc-some-other-non-vegan-cereals-that-i-know-nothing-about&quot;&gt;Some other non-vegan cereals that I know nothing about&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#price-table&quot; id=&quot;markdown-toc-price-table&quot;&gt;Price table&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;1-oatmeal-with-added-protein-powder-27-calories-from-protein-if-you-make-it-the-way-i-do&quot;&gt;1. &lt;strong&gt;Oatmeal with added protein powder&lt;/strong&gt; (27% calories from protein, if you make it the way I do)&lt;/h3&gt;

&lt;figure&gt;
&lt;img src=&quot;/assets/images/Oatmeal.jpg&quot; style=&quot;height:300px&quot; /&gt;
&lt;figcaption style=&quot;font-size: 0.7em&quot;&gt;This is regular oatmeal because I couldn&apos;t find a stock photo of oatmeal mixed with protein powder.&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;You can buy pre-mixed oatmeal and protein powder, but it’s unnecessarily expensive, so I prefer to mix it myself. Obviously the amount of protein varies depending on how much protein powder you add. I personally like to mix:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;two servings of oats (= one cup, or 300 calories)&lt;/li&gt;
  &lt;li&gt;one scoop of protein powder (= 1/3 cup, or 25 grams, or 90 calories)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Those proportions provide 30% calories from protein. I find the consistency gets sticky if you add more protein than that. If you don’t like the consistency at this ratio, you can add more oats/less protein.&lt;/p&gt;

&lt;p&gt;I save time by mixing the oats and protein powder in a giant jug. The jug lasts me for a few weeks and this way I don’t have to measure out the proportions every morning.&lt;/p&gt;

&lt;p&gt;Oatmeal is my favorite cereal because there are so many ways to make it. I like to mix in blueberries, blackberries, or bananas, which add nutrients and cover up the protein-y flavor. (Plain oatmeal with protein powder tastes kind of weird.)&lt;/p&gt;

&lt;p&gt;This mixture has 30% calories from protein without fruit and about 27% with fruit.&lt;/p&gt;
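
&lt;p&gt;If you want to check that math, here’s a quick sanity check in Python. The protein gram counts are typical label values that I’m assuming for illustration, not numbers taken from any particular package:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Sanity check on the percentages above. The protein gram counts are
# typical label values (assumed for illustration, not from this post).
PROTEIN_KCAL_PER_GRAM = 4

oats_kcal, oats_protein_g = 300, 10      # one cup of dry oats
powder_kcal, powder_protein_g = 90, 20   # one 25-gram scoop of protein powder
fruit_kcal, fruit_protein_g = 50, 0.5    # a handful of berries

def pct_calories_from_protein(kcal, protein_g):
    return 100 * protein_g * PROTEIN_KCAL_PER_GRAM / kcal

# Without fruit: about 31%. With fruit: about 28%.
print(pct_calories_from_protein(oats_kcal + powder_kcal,
                                oats_protein_g + powder_protein_g))
print(pct_calories_from_protein(oats_kcal + powder_kcal + fruit_kcal,
                                oats_protein_g + powder_protein_g + fruit_protein_g))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;That lands close to the 30%/27% figures above; the exact numbers depend on which oats and powder you buy.&lt;/p&gt;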

&lt;h3 id=&quot;2-special-k-high-protein-chocolate-almond-33-calories-from-protein&quot;&gt;2. &lt;strong&gt;Special K High Protein Chocolate Almond&lt;/strong&gt; (33% calories from protein)&lt;/h3&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/SKHP.png&quot; style=&quot;height:300px&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This is my favorite cold breakfast cereal. Most high-protein cereals taste merely tolerable, but this one tastes actively &lt;em&gt;good&lt;/em&gt;. It has a nice crunchy texture from the combination of almonds and cereal flakes, and it has enough sugar to give it a good flavor.&lt;/p&gt;

&lt;p&gt;Unfortunately it doesn’t seem to be available anymore. I reached out to customer service to ask if it’s discontinued and they said they’re still producing it, but I haven’t been able to find it anywhere. (Nobody else seems to be able to find it either—see the product reviews &lt;a href=&quot;https://www.kroger.com/p/kellogg-s-special-k-high-protein-chocolate-almond-cereal/0003800028203&quot;&gt;here&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;(Special K High Protein should not be confused with Special K Protein, which did not qualify for my list because it has less than 25% calories from protein.)&lt;/p&gt;

&lt;h3 id=&quot;3-catalina-crunch-40-calories-from-protein&quot;&gt;3. &lt;strong&gt;Catalina Crunch&lt;/strong&gt; (40% calories from protein)&lt;/h3&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Catalina-Crunch-cinnamon.webp&quot; style=&quot;height:300px&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This is my 3rd favorite high-protein cereal, and it has more protein than #1 or #2. Catalina Crunch has become a staple for me now that Special K High Protein is apparently discontinued.&lt;/p&gt;

&lt;p&gt;There are a number of flavors, but the only ones I’ve tried are Cinnamon Toast and Dark Chocolate. Originally I had Dark Chocolate on this list at #4, but later I tried Cinnamon Toast, which is better, so I’ve moved Catalina Crunch up to rank 3.&lt;/p&gt;

&lt;p&gt;It doesn’t get soggy in milk and it doesn’t fall apart in your mouth. It’s sugar-free, but it tastes surprisingly good to me (I don’t usually like sugar-free cereals). The downside is it’s pretty expensive compared to most breakfast cereals.&lt;/p&gt;

&lt;h3 id=&quot;4-wheaties-protein-supplemented-with-lysine-32-calories-from-protein&quot;&gt;4. &lt;strong&gt;Wheaties Protein&lt;/strong&gt; (supplemented with lysine) (32% calories from protein)&lt;/h3&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Wheaties-Protein.webp&quot; style=&quot;height:300px&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Wheaties Protein is the last high-protein cereal I’d consider &lt;em&gt;good&lt;/em&gt;, like I actively enjoy eating it.&lt;/p&gt;

&lt;p&gt;Be aware that this cereal’s protein comes primarily from wheat, which doesn’t have much of the amino acid lysine. It has plenty of every other essential amino acid, but if you eat a lot of this cereal, you need to make sure you get lysine from somewhere else.&lt;/p&gt;

&lt;p&gt;To get a full amino acid profile, you need to add about 1 gram of lysine per 60 grams of wheat protein. I personally buy &lt;a href=&quot;https://www.amazon.com/NOW-L-Lysine-500-100-Tablets/dp/B000MGOWOC&quot;&gt;these lysine supplements&lt;/a&gt; and take one 500mg pill per bowl of cereal.&lt;/p&gt;

&lt;h3 id=&quot;5-post-premier-protein-44-calories-from-protein&quot;&gt;5. &lt;strong&gt;Post Premier Protein&lt;/strong&gt; (44% calories from protein)&lt;/h3&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Post-Premier-Protein-Chocolate-Almond.jpg&quot; style=&quot;height:300px&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This cereal has two different flavors, Chocolate Almond and Mixed Berry Almond. I personally like the chocolate flavor better. Both have a passably good flavor and texture but they have a weird protein-y crunchiness.&lt;/p&gt;

&lt;p&gt;Post Premier Protein gets most of its protein from wheat, which, as I mentioned before, doesn’t contain much lysine. Fortunately it also contains pea protein, which has an abundance of lysine.&lt;/p&gt;

&lt;p&gt;I don’t know the exact protein ratios, but I believe the overall balance still doesn’t contain enough lysine, so it would be prudent to get some extra lysine from somewhere else. I personally would take one 500mg lysine pill per two bowls of cereal. (I always eat two bowls for breakfast because I’m a growing boy.)&lt;/p&gt;

&lt;h3 id=&quot;6-special-k-zero-50-calories-from-protein&quot;&gt;6. &lt;strong&gt;Special K Zero&lt;/strong&gt; (50% calories from protein)&lt;/h3&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Special-K-Zero.webp&quot; style=&quot;height:300px&quot; /&gt;&lt;/p&gt;

&lt;p&gt;If you had to guess which of Special K High Protein and Special K Zero had more protein, you might think it’s the one with “protein” in the name, but you’d be wrong. This cereal contains an extraordinary 50% calories from protein with a good amino acid composition. Unfortunately, the reason it has so much protein is that it’s basically just lumps of protein powder.&lt;/p&gt;

&lt;p&gt;When I take a bite of this cereal, it tastes good for the first five seconds or so. Then it dissolves into a powdery mush—it feels like I’ve poured wet protein powder directly into my mouth. I wouldn’t eat Special K Zero unless I was really desperate for protein. (Even then, I would rather have a protein shake.)&lt;/p&gt;

&lt;p&gt;But some people seem to like it, so your mileage may vary.&lt;/p&gt;

&lt;h3 id=&quot;7-three-wishes-24-to-27-calories-from-protein-depending-on-flavor&quot;&gt;7. &lt;strong&gt;Three Wishes&lt;/strong&gt; (24% to 27% calories from protein, depending on flavor)&lt;/h3&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Three-Wishes.webp&quot; style=&quot;height:300px&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Like Special K Zero, Three Wishes feels like eating protein powder flakes. But it only has half as much protein as Special K Zero. If you can stomach eating protein powder flakes, you might as well eat Special K Zero instead.&lt;/p&gt;

&lt;p&gt;(I’ve only tried one flavor of Three Wishes, but it was a while ago and I don’t remember which flavor it was. I assume the other flavors have the same bad protein-y texture as the one I tried.)&lt;/p&gt;

&lt;h3 id=&quot;honorable-mention-kashi-go&quot;&gt;Honorable mention: Kashi Go&lt;/h3&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Kashi-Go.jpg&quot; style=&quot;height:300px&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Kashi Go, with 24% calories from protein, just barely does not qualify for my list. It has a good flavor and texture, but it makes my mouth feel weird if I eat too much of it. (I get the same mouth feeling when I eat a lot of spinach.) If it had that extra one percentage point of protein, I would put it at #4 on my list.&lt;/p&gt;

&lt;h3 id=&quot;unranked-because-its-not-vegan-magic-spoon-33-to-37-calories-from-protein-depending-on-flavor&quot;&gt;Unranked because it’s not vegan: &lt;strong&gt;Magic Spoon&lt;/strong&gt; (33% to 37% calories from protein, depending on flavor)&lt;/h3&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Magic-Spoon.webp&quot; style=&quot;height:300px&quot; /&gt;&lt;/p&gt;

&lt;p&gt;I’ve never eaten Magic Spoon because it’s not vegan (it contains milk protein).&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=UTHKgVPZ-is&amp;amp;t=1135s&quot;&gt;Drew Gooden says it’s bad&lt;/a&gt; and he makes funny YouTube videos which means he must have good opinions about cereal. His description of eating Magic Spoon sounds a lot like my experience eating Special K Zero and Three Wishes—it tastes good for a few seconds, then it turns into a protein mush.&lt;/p&gt;

&lt;h3 id=&quot;some-other-non-vegan-cereals-that-i-know-nothing-about&quot;&gt;Some other non-vegan cereals that I know nothing about&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Snack House Keto Cereal: 44% calories from protein (uses milk protein)&lt;/li&gt;
  &lt;li&gt;Julian Bakery ProGranola Cereal: 44% calories from protein (uses egg white protein (weird choice but ok))&lt;/li&gt;
  &lt;li&gt;Perfect Keto Cereal: 36% calories from protein (uses milk protein) (appears to be discontinued)&lt;/li&gt;
  &lt;li&gt;Wonderworks Keto Friendly Breakfast Cereal: 35% calories from protein (uses milk and soy protein)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those are all the high-protein breakfast cereals that I’m aware of.&lt;/p&gt;

&lt;h2 id=&quot;price-table&quot;&gt;Price table&lt;/h2&gt;

&lt;p&gt;This table gives the price of each cereal in terms of cents per gram of protein, ordered from cheapest to most expensive. I pulled these prices off Amazon; the prices in your area might differ.&lt;/p&gt;

&lt;p&gt;For the cost of protein oatmeal, I used Quaker 1-Minute Oats plus NOW Foods Soy Protein Isolate because those are the brands I buy.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Cereal&lt;/th&gt;
      &lt;th&gt;Price (¢/g)&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;oatmeal + protein powder&lt;/td&gt;
      &lt;td&gt;3.7¢&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Post Premier Protein&lt;/td&gt;
      &lt;td&gt;4.2¢&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Special K High Protein&lt;/td&gt;
      &lt;td&gt;4.8¢&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Wheaties Protein&lt;/td&gt;
      &lt;td&gt;4.9¢&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Special K Zero&lt;/td&gt;
      &lt;td&gt;7.9¢&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Catalina Crunch&lt;/td&gt;
      &lt;td&gt;9.1¢&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Kashi Go&lt;/td&gt;
      &lt;td&gt;11.4¢&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Three Wishes&lt;/td&gt;
      &lt;td&gt;12.5¢&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Prices for non-vegan cereals:&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Cereal&lt;/th&gt;
      &lt;th&gt;Price (¢/g)&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Julian Bakery ProGranola&lt;/td&gt;
      &lt;td&gt;10¢&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Wonderworks Keto Friendly&lt;/td&gt;
      &lt;td&gt;13.5¢&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Magic Spoon&lt;/td&gt;
      &lt;td&gt;13.8¢&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Snack House Keto&lt;/td&gt;
      &lt;td&gt;16.5¢&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;This is strangely expensive for a big-brand cereal (Kashi is a subsidiary of Kellogg); my guess is that there’s some sort of temporary supply issue and the price will go down. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;How convenient for my pro-vegan agenda that the non-vegan cereals are all so expensive!&lt;/p&gt;

      &lt;p&gt;Inconveniently for my agenda, the cheapest of the non-vegan cereals uses egg protein, which &lt;a href=&quot;https://foodimpacts.org/&quot;&gt;causes more animal suffering&lt;/a&gt; than whey protein.&lt;/p&gt;

      &lt;p&gt;Honestly, I’m not that concerned about whey protein; it’s one of the least harmful animal products to buy. &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Charity Cost-Effectiveness Really Does Follow a Power Law</title>
				<pubDate>Wed, 25 Dec 2024 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2024/12/25/charity_cost_effectiveness_power_law/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/12/25/charity_cost_effectiveness_power_law/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Conventional wisdom says charity cost-effectiveness obeys a power law. To my knowledge, this hypothesis has never been properly tested.&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; So I tested it and it turns out to be true.&lt;/p&gt;

&lt;p&gt;(Maybe. Cost-effectiveness might also be &lt;a href=&quot;https://en.wikipedia.org/wiki/Log-normal_distribution&quot;&gt;log-normally&lt;/a&gt; distributed.)&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Cost-effectiveness estimates for global health interventions (from &lt;a href=&quot;https://www.dcp-3.org/chapter/2561/cost-effectiveness-analysis&quot;&gt;DCP3&lt;/a&gt;) fit a power law (a.k.a. &lt;a href=&quot;https://en.wikipedia.org/wiki/Pareto_distribution&quot;&gt;Pareto distribution&lt;/a&gt;) with \(\alpha = 1.11\). &lt;a href=&quot;#fitting-dcp3-data-to-a-power-law&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Simulations indicate that the true underlying distribution has a thinner tail than the empirically observed distribution. &lt;a href=&quot;#does-estimation-error-bias-the-result&quot;&gt;[More]&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#fitting-dcp3-data-to-a-power-law&quot; id=&quot;markdown-toc-fitting-dcp3-data-to-a-power-law&quot;&gt;Fitting DCP3 data to a power law&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#does-estimation-error-bias-the-result&quot; id=&quot;markdown-toc-does-estimation-error-bias-the-result&quot;&gt;Does estimation error bias the result?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#future-work-i-would-like-to-see&quot; id=&quot;markdown-toc-future-work-i-would-like-to-see&quot;&gt;Future work I would like to see&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#source-code-and-data&quot; id=&quot;markdown-toc-source-code-and-data&quot;&gt;Source code and data&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;fitting-dcp3-data-to-a-power-law&quot;&gt;Fitting DCP3 data to a power law&lt;/h2&gt;

&lt;p&gt;The Disease Control Priorities 3 report (&lt;a href=&quot;https://www.dcp-3.org/chapter/2561/cost-effectiveness-analysis&quot;&gt;DCP3&lt;/a&gt;) provides cost-effectiveness estimates for 93 global health interventions (measured in &lt;a href=&quot;https://en.wikipedia.org/wiki/Disability-adjusted_life_year&quot;&gt;DALYs&lt;/a&gt; per US dollar). I took those 93 interventions and fitted them to a power law.&lt;/p&gt;

&lt;p&gt;You can see from this graph that the fitted power law matches the data reasonably well:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/dcp3-curve-fit.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;To be precise: the probability of a DCP3 intervention having cost-effectiveness \(x\) is well-approximated by the probability density function \(f(x) = \displaystyle\frac{1.11}{x^{2.11}}\), which is a power law (a.k.a. &lt;a href=&quot;https://en.wikipedia.org/wiki/Pareto_distribution&quot;&gt;Pareto distribution&lt;/a&gt;) with \(\alpha = 1.11\).&lt;/p&gt;

&lt;p&gt;It’s possible to statistically measure whether a curve fits the data using a &lt;a href=&quot;https://en.wikipedia.org/wiki/Goodness_of_fit&quot;&gt;goodness-of-fit test&lt;/a&gt;. There are a number of different goodness-of-fit tests; I used what’s known as the &lt;a href=&quot;https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test&quot;&gt;Kolmogorov-Smirnov test&lt;/a&gt;&lt;sup id=&quot;fnref:8&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;. This test essentially looks at how far away the data points are from where the curve predicts them to be. If many points are far to one side of the curve or the other, that means the curve is a bad fit.&lt;/p&gt;

&lt;p&gt;I ran the Kolmogorov-Smirnov test on the DCP3 data, and it determined that &lt;strong&gt;a Pareto distribution fit the data well.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The goodness-of-fit test produced a p-value of 0.79 for the null hypothesis that the data follows a Pareto distribution. p = 0.79 means that, if you generated random data from a Pareto distribution, there’s a 79% chance that the random data would look &lt;em&gt;less&lt;/em&gt; like a Pareto distribution than the DCP3 data does. That’s good evidence that the DCP3 data is indeed Pareto-distributed or close to it.&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
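
&lt;p&gt;For the curious, the fit-and-test procedure takes only a few lines of scipy. This is a minimal sketch, not the exact script (see &lt;a href=&quot;#source-code-and-data&quot;&gt;Source code and data&lt;/a&gt;), and the filename is a placeholder:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np
from scipy import stats

# The 93 DCP3 cost-effectiveness estimates (DALYs per dollar).
# Placeholder filename; the real data file is linked at the end of the post.
data = np.loadtxt(&quot;dcp3_cost_per_daly.txt&quot;)

# Fit a Pareto distribution with the location pinned at zero, so the
# scale parameter plays the role of the minimum value.
alpha, loc, scale = stats.pareto.fit(data, floc=0)

# Kolmogorov-Smirnov test against the fitted distribution. (As footnote 2
# explains, estimating the parameters from the sample makes this optimistic.)
result = stats.kstest(data, stats.pareto(alpha, loc=loc, scale=scale).cdf)
print(f&quot;alpha = {alpha:.2f}, p = {result.pvalue:.2f}&quot;)
&lt;/code&gt;&lt;/pre&gt;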

&lt;p&gt;However, the data also fits well to a &lt;a href=&quot;https://en.wikipedia.org/wiki/Log-normal_distribution&quot;&gt;log-normal distribution&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Pareto and log-normal distributions look similar most of the time. They only noticeably differ in the far right tail—a Pareto distribution has a fatter tail than a log-normal distribution, and this becomes more pronounced the further out you look. But in real-world samples, we usually don’t see enough tail outcomes to distinguish between the two distributions.&lt;/p&gt;

&lt;p&gt;DCP3 only includes global health interventions. If we expanded the data to include other types of interventions, we might find a fatter tail, but I’m not aware of any databases that cover a more comprehensive set of cause areas.&lt;/p&gt;

&lt;p&gt;(The World Bank has data on &lt;a href=&quot;https://openknowledge.worldbank.org/handle/10986/34658&quot;&gt;education interventions&lt;/a&gt;, but adding one cause area at a time feels ad-hoc and it would create gaps in the distribution.)&lt;/p&gt;

&lt;h2 id=&quot;does-estimation-error-bias-the-result&quot;&gt;Does estimation error bias the result?&lt;/h2&gt;

&lt;p&gt;Yes—it causes you to underestimate the true value of \(\alpha\).&lt;/p&gt;

&lt;p&gt;(Recall that the alpha (\(\alpha\)) parameter determines the fatness of the tail—lower alpha means fatter tail. So estimation error makes the tail look fatter than it really is.)&lt;/p&gt;

&lt;p&gt;There’s a difference between cost-effectiveness and &lt;em&gt;estimated&lt;/em&gt; cost-effectiveness. Perhaps our estimates follow a power law, but the true underlying cost-effectiveness numbers &lt;em&gt;don’t&lt;/em&gt;. And even if they do, the estimation error might bias the shape of the fitted distribution.&lt;/p&gt;

&lt;p&gt;I tested this by generating random Pareto-distributed&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; data to represent true cost-effectiveness, and then multiplying by a random noise variable to represent estimation error. I generated the noise as a log-normally-distributed random variable centered at 1 with \(\sigma = 0.5\)&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; (colloquially, that means you can expect the estimate to be off by 50%).&lt;/p&gt;
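
&lt;p&gt;A minimal sketch of one round of that simulation (simplified relative to the real script):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_alpha, n = 1.2, 10_000

# True cost-effectiveness: a Lomax distribution, i.e. a Pareto
# shifted to start at 0 instead of 1 (see footnote 4).
true_ce = stats.lomax.rvs(true_alpha, size=n, random_state=rng)

# Estimation error: log-normal noise centered at 1 with sigma = 0.5,
# where sigma is the standard deviation of the log of the noise.
noise = rng.lognormal(mean=0.0, sigma=0.5, size=n)

# Fit the noisy estimates and compare the recovered alpha to the true one.
est_alpha, _, _ = stats.lomax.fit(true_ce * noise, floc=0)
print(f&quot;true alpha = {true_alpha}, estimated alpha = {est_alpha:.2f}&quot;)
&lt;/code&gt;&lt;/pre&gt;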

&lt;p&gt;I generated 10,000 random samples&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; at various values of alpha, applied some estimation error, and then fit the resulting estimates to a Pareto distribution. The results showed strong goodness of fit, but the estimated alphas did not match the true alphas:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;pre&gt;&lt;code&gt;true alpha 0.8  --&amp;gt;  0.73 estimated alpha (goodness-of-fit: p = 0.3)
true alpha 1.0  --&amp;gt;  0.89 estimated alpha (goodness-of-fit: p = 0.08)
true alpha 1.2  --&amp;gt;  1.07 estimated alpha (goodness-of-fit: p = 0.4)
true alpha 1.4  --&amp;gt;  1.22 estimated alpha (goodness-of-fit: p = 0.5)
true alpha 1.8  --&amp;gt;  1.54 estimated alpha (goodness-of-fit: p = 0.1)
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;To determine the variance of the bias, I generated 93 random samples at a true alpha of 1.1 (to match the DCP3 data) and fitted a Pareto curve to the samples. I repeated this process 10,000 times.&lt;/p&gt;
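
&lt;p&gt;In code, the procedure looks roughly like this (again simplified):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Repeatedly generate 93 noisy samples at true alpha = 1.1 and refit.
fitted = []
for _ in range(10_000):
    sample = stats.lomax.rvs(1.1, size=93, random_state=rng)
    noisy = sample * rng.lognormal(mean=0.0, sigma=0.5, size=93)
    alpha, _, _ = stats.lomax.fit(noisy, floc=0)
    fitted.append(alpha)

print(f&quot;mean = {np.mean(fitted):.2f}, std = {np.std(fitted):.2f}&quot;)
&lt;/code&gt;&lt;/pre&gt;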

&lt;p&gt;Across all generations, the average estimated alpha was 1.06 with a standard deviation of 0.27. That’s a small bias—only 0.04—but it’s highly statistically significant (t-stat = –15, p = 0&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;).&lt;/p&gt;

&lt;p&gt;A true alpha of 1.15 produces a mean estimate of 1.11, which equals the alpha of the DCP3 cost-effectiveness data. So if the DCP3 estimates have a 50% error (\(\sigma = 0.5\)), then the true alpha parameter is more like 1.15.&lt;/p&gt;

&lt;p&gt;Increasing the estimation error greatly increases the bias. When I changed the error (the \(\sigma\) parameter) from 50% to 100%, the bias became concerningly large, and it grew larger at higher values of alpha:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;pre&gt;&lt;code&gt;true alpha 0.8  --&amp;gt;  0.65 mean estimated alpha
true alpha 1.0  --&amp;gt;  0.78 mean estimated alpha
true alpha 1.2  --&amp;gt;  0.87 mean estimated alpha
true alpha 1.4  --&amp;gt;  0.96 mean estimated alpha
true alpha 1.6  --&amp;gt;  1.03 mean estimated alpha
true alpha 1.8  --&amp;gt;  1.11 mean estimated alpha
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;If the DCP3 samples have a 100% error then the true alpha is 1.8—much higher than the estimated value of 1.11.&lt;/p&gt;

&lt;p&gt;In addition, at 100% error with 10,000 samples, the estimates no longer fit a Pareto distribution well—the p-value of the goodness-of-fit test ranged from 0.005 to &amp;lt;0.00001 depending on the true alpha value.&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Curiously, &lt;em&gt;decreasing&lt;/em&gt; the estimation error flipped the bias from negative to positive. When I reduced the simulation’s estimation error to 20%, a true alpha of 1.1 produced a mean estimated alpha of 1.14 (standard deviation 0.31, t-stat = 13, p = 0&lt;sup id=&quot;fnref:2:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;). A 20% error produced a positive bias across a range of alpha values—the estimated alpha was always a bit higher than the true alpha.&lt;/p&gt;

&lt;h2 id=&quot;future-work-i-would-like-to-see&quot;&gt;Future work I would like to see&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;A comprehensive DCP3-esque list of cost-effectiveness estimates for every conceivable intervention, not just global health. (That’s probably never going to happen but it would be nice.)&lt;/li&gt;
  &lt;li&gt;More data on the outer tail of cost-effectiveness estimates, to better identify whether the distribution looks more Pareto or more log-normal.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;source-code-and-data&quot;&gt;Source code and data&lt;/h2&gt;

&lt;p&gt;Source code is available &lt;a href=&quot;https://github.com/michaeldickens/public-scripts/blob/master/intervention_power_laws.py&quot;&gt;on GitHub&lt;/a&gt;. Cost-effectiveness estimates are extracted from DCP3’s &lt;a href=&quot;https://www.dcp-3.org/sites/default/files/chapters/Annex%207A.%20Details%20of%20Interventions%20in%20Figs.pdf&quot;&gt;Annex 7A&lt;/a&gt;; I’ve reproduced the numbers &lt;a href=&quot;https://github.com/michaeldickens/public-scripts/blob/master/data/DCP3%20cost%20per%20DALY.txt&quot;&gt;here&lt;/a&gt; in a more convenient format.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The closest I could find was Stijn on the EA Forum, who &lt;a href=&quot;https://forum.effectivealtruism.org/posts/FXaCnPMiw3jWrnkho/cost-effectiveness-distributions-power-laws-and-scale&quot;&gt;plotted&lt;/a&gt; a subset of the Disease Control Priorities data on a log-log plot and fit the points to a power law distribution, but did not statistically test whether a power law represented the data well. &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Some details on goodness-of-fit tests:&lt;/p&gt;

      &lt;p&gt;Kolmogorov-Smirnov is the standard test, but it depends on the assumption that you know the true parameter values. If you estimate the parameters from the sample (as I did), then it can overestimate fit quality.&lt;/p&gt;

      &lt;p&gt;A recent paper by &lt;a href=&quot;http://soche.cl/chjs/volumes/09/01/Suarez-Espinosa_etal(2018).pdf&quot;&gt;Suárez-Espinosa et al. (2018)&lt;/a&gt;&lt;sup id=&quot;fnref:9&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt; devises a goodness-of-fit test for the Pareto distribution that does not depend on knowing parameter values. I implemented the test but did not find it to be more reliable than Kolmogorov-Smirnov—for example, it reported a very strong fit when I generated random data from a log-normal distribution. &lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;A high p-value is not always evidence in favor of the null hypothesis. It’s only evidence if you expect that, &lt;em&gt;if&lt;/em&gt; the null hypothesis is false, &lt;em&gt;then&lt;/em&gt; you will get a low p-value. But that’s true in this case.&lt;/p&gt;

      &lt;p&gt;(I’ve &lt;a href=&quot;https://mdickens.me/2024/09/26/outlive_a_critical_review/#people-with-metabolically-healthy-obesity-do-not-have-elevated-mortality-risk&quot;&gt;previously&lt;/a&gt; complained about how scientific papers often treat p &amp;gt; 0.05 as evidence in favor of the null hypothesis, even when you’d expect to see p &amp;gt; 0.05 &lt;em&gt;regardless&lt;/em&gt; of whether the null hypothesis was true or false—for example, if their study was &lt;a href=&quot;https://en.wikipedia.org/wiki/Power_(statistics)&quot;&gt;underpowered&lt;/a&gt;.)&lt;/p&gt;

      &lt;p&gt;If the data did not fit a Pareto distribution then we’d expect to see a much smaller p-value. For example, a goodness-of-fit test for a normal distribution gives p &amp;lt; 0.000001, and a gamma distribution gives p = 0.08. A log-normal distribution gives p = 0.96, so we can’t tell whether the data is Pareto or log-normal, but it’s unlikely to be normal or gamma. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Actually I used a &lt;a href=&quot;https://en.wikipedia.org/wiki/Lomax_distribution&quot;&gt;Lomax distribution&lt;/a&gt;, which is the same as a Pareto distribution except that the lowest possible value is 0 instead of 1. &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The \(\sigma\) parameter is the standard deviation of the logarithm of the random variable. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;In practice we will never have 10,000 distinct cost-effectiveness estimates. But when testing goodness-of-fit, it’s useful to generate many samples because a large data set is hard to overfit. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;As in, the p-value is so small that my computer rounds it off to zero. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:2:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Perhaps that’s evidence that the DCP3 estimates have less than a 100% error, since they do fit a Pareto distribution well? That would be convenient if true.&lt;/p&gt;

      &lt;p&gt;But it’s easy to get a good fit if we reduce the sample size to 93. When I generated 93 samples with 100% error, I got a p-value greater than 0.5 most of the time. &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:9&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Suárez-Espinosa, J., Villaseñor-Alva, J. A., Hurtado-Jaramillo, A., &amp;amp; Pérez-Rodríguez, P. (2018). &lt;a href=&quot;http://soche.cl/chjs/volumes/09/01/Suarez-Espinosa_etal(2018).pdf&quot;&gt;A goodness of fit test for the Pareto distribution.&lt;/a&gt; &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>"You Can't Calculate the Expected Utility of a Communist Revolution"</title>
				<pubDate>Fri, 06 Dec 2024 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2024/12/06/expected_utility_of_communist_revolution/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/12/06/expected_utility_of_communist_revolution/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Leftist critics of effective altruism like to say this. Well, it’s not true, and I proved it by calculating (an estimate of) the expected utility of a communist revolution. It wasn’t even hard—it took me less than an hour.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;I put my cost-effectiveness analysis on SquiggleHub: &lt;a href=&quot;https://squigglehub.org/models/mdickens/communist-revolution-ev&quot;&gt;https://squigglehub.org/models/mdickens/communist-revolution-ev&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;According to my analysis, working to start a communist revolution is between 11x and 400x as good as donating to &lt;a href=&quot;https://www.givewell.org/international/technical/programs/givedirectly-cash-for-poverty-relief-program&quot;&gt;GiveDirectly&lt;/a&gt; (that’s the 95% credence interval).&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; That makes it probably better than any &lt;a href=&quot;https://www.givewell.org/charities/top-charities&quot;&gt;GiveWell top charity&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;That’s assuming communism is good, which it isn’t, but let’s say it is for the sake of argument.&lt;/p&gt;

&lt;p&gt;My model includes calculations with extensive commentary. The model is relatively simple—I recommend &lt;a href=&quot;https://squigglehub.org/models/mdickens/communist-revolution-ev&quot;&gt;reading it&lt;/a&gt; if you want to understand how it works.&lt;/p&gt;

&lt;p&gt;In short, the cost-effectiveness analysis goes like this (a toy code sketch follows the list):&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Assume life under communism will be 20% better than the status quo. This is probably an underestimate because it’s hard to capture the soul-crushing indignities of capitalism&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;, but let’s say 20% to be conservative.&lt;/li&gt;
  &lt;li&gt;Look at the number of revolutionaries in historical communist revolutions to estimate the probability of success per revolutionary.&lt;/li&gt;
  &lt;li&gt;Downweight the probability of success based on the fact that the revolution might result in an inferior version of communism (like what happened with the Soviet Union).&lt;/li&gt;
  &lt;li&gt;Estimate how long it would take until a communist revolution happened anyway. The value of doing a revolution &lt;em&gt;now&lt;/em&gt; equals &lt;code&gt;[time between now and when it would have happened anyway] x [number of people who get to live under the new regime] x [how good communism is compared to the status quo]&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;The previous step gives us the total value of a revolution. Multiply that by the probability of success per revolutionary to get the expected value of a single individual’s efforts.&lt;/li&gt;
  &lt;li&gt;Compare to how much an individual could donate to GiveDirectly if they worked a normal job instead of doing activism. That gives us the expected value of a communist revolutionary relative to a GiveDirectly donor.&lt;/li&gt;
&lt;/ol&gt;
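
&lt;p&gt;Here’s that toy sketch, in Python rather than Squiggle, with made-up point values standing in for the model’s distributions:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Toy version of the model structure. Every number is a placeholder;
# the real model on SquiggleHub uses distributions, not point estimates.
quality_gain = 0.20           # step 1: life under communism assumed 20% better
p_success_per_person = 1e-7   # step 2: from historical revolutionary headcounts
p_good_version = 0.5          # step 3: chance we get a non-inferior communism
years_counterfactual = 50     # step 4: years until a revolution happens anyway
population = 1e9              # people who live under the new regime

# Step 4: total value of causing the revolution now.
total_value = years_counterfactual * population * quality_gain

# Step 5: expected value of one marginal revolutionary.
ev_per_revolutionary = total_value * p_success_per_person * p_good_version

# Step 6 would convert this into GiveDirectly-equivalent dollars; omitted here.
print(f&quot;{ev_per_revolutionary:,.0f} quality-adjusted life-years&quot;)
&lt;/code&gt;&lt;/pre&gt;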

&lt;p&gt;Is this model perfectly accurate? No. I spent half an hour on it. I can immediately think of several ways it could be better.&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; But the model is still &lt;em&gt;informative&lt;/em&gt;—it provides a basic case that working for a communist revolution could be more cost-effective than GiveWell top charities. And it shows that, contrary to what many people say, it’s possible in principle to calculate the expected utility of a communist revolution.&lt;/p&gt;

&lt;p&gt;The numbers in my model are mostly made up. But &lt;a href=&quot;https://slatestarcodex.com/2013/05/02/if-its-worth-doing-its-worth-doing-with-made-up-statistics/&quot;&gt;if it’s worth doing, it’s worth doing with made-up statistics.&lt;/a&gt;&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Using GiveWell’s 2023 numbers. The 2024 numbers are considerably more optimistic about GiveDirectly, which makes the multiplier smaller, but a communist revolution still compares favorably to GiveWell top charities. &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I don’t actually believe this; this is me wearing my leftism hat. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Some obvious improvements:&lt;/p&gt;

      &lt;ol&gt;
        &lt;li&gt;Look at a variety of historical revolutions instead of just one.&lt;/li&gt;
        &lt;li&gt;Account for the skill of the activist in question. Presumably, some people are more skilled than others at garnering support.&lt;/li&gt;
        &lt;li&gt;Instead of treating all activists as equally responsible, estimate the probability that a marginal activist causes a revolution to occur. Or estimate how much sooner a revolution occurs thanks to a marginal activist.&lt;/li&gt;
      &lt;/ol&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Thoughts on My Donation Process</title>
				<pubDate>Wed, 04 Dec 2024 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2024/12/04/thoughts_on_my_donation_process/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/12/04/thoughts_on_my_donation_process/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;I have some observations and half-baked ideas about my recent &lt;a href=&quot;https://mdickens.me/2024/11/18/where_i_am_donating_in_2024/&quot;&gt;donation process&lt;/a&gt;. They weren’t important enough to include in the main post, but I want to talk about them anyway.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;on-deference&quot;&gt;On deference&lt;/h2&gt;

&lt;p&gt;Usually, I defer to the beliefs of other people who have spent more time on an issue than me, or who plausibly have more expertise, or who I just expect to have reasonable beliefs.&lt;/p&gt;

&lt;p&gt;While writing my donations post, I made a conscious effort to defer less than usual. Deference might maximize the probability that I make the correct decision, but deference reduces the total amount of reasoning that’s happening, which is bad for the group as a whole. I want there to be more reasoning happening.&lt;/p&gt;

&lt;p&gt;This is most relevant in my discussion of a few orgs that I disliked, which are also very popular among big EA funders. I’m 99% confident that the big funders have private information about those orgs, so maybe I should defer to them. But I’m also maybe 75% confident that if I had access to that information, it wouldn’t materially change my mind. I did anticipate that the private evidence would make me like the orgs a little better, so I updated based on this anticipation and evaluated the orgs a little more favorably than I would have otherwise.&lt;/p&gt;

&lt;h2 id=&quot;on-criticizing-orgs&quot;&gt;On criticizing orgs&lt;/h2&gt;

&lt;p&gt;I am not as nice as I’d like to be. I have a habit of accidentally saying mean things that hurt people’s feelings.&lt;/p&gt;

&lt;p&gt;On the other hand, I think most people are &lt;em&gt;too&lt;/em&gt; nice: they hurt others long-term by refusing to give them useful information that’s difficult to hear.&lt;/p&gt;

&lt;p&gt;(In theory, it’s possible to never say unnecessarily mean things, and always say necessary things, but only if you have perfect communication skills. In practice, there’s a tradeoff.)&lt;/p&gt;

&lt;p&gt;I think it’s a good norm that, if you’re investigating an org and it opens up to you, you shouldn’t take what you learn and use it against the org. I probably wouldn’t criticize an org based on private information that it gave me. I did criticize some orgs, but all my criticisms were based on public information.&lt;/p&gt;

&lt;p&gt;I think if most people wrote a donation post like mine, they’d self-censor in the interest of niceness and end up leaving out important information. I tried to avoid that, and erred more on the side of being mean (not pointlessly mean, but mean-and-truthful. Or maybe I should say mean-and-accurately-conveying-my-beliefs since I can’t promise that the things I said were true).&lt;/p&gt;

&lt;p&gt;As with my choice on deference, this was perhaps the wrong choice at an individual level but the right choice at the group level.&lt;/p&gt;

&lt;p&gt;I did focus on criticizing organizations and avoided saying negative things about specific people whenever possible.&lt;/p&gt;

&lt;h2 id=&quot;donation-sizing&quot;&gt;Donation sizing&lt;/h2&gt;

&lt;p&gt;I have a donor-advised fund (DAF) that I contributed to when I was earning to give. How much of my DAF money should I donate this year? What’s a reasonable spend-down rate?&lt;/p&gt;

&lt;p&gt;I’ve put a lot of thought into &lt;a href=&quot;https://mdickens.me/2020/07/03/estimating_discount_rate/&quot;&gt;how quickly to spend philanthropic resources&lt;/a&gt;, including &lt;a href=&quot;https://mdickens.me/2021/08/02/ai_timelines_now_vs_later/&quot;&gt;how AI timelines affect the answer&lt;/a&gt;. Unfortunately, all that thinking didn’t much help me answer the question.&lt;/p&gt;

&lt;p&gt;Plus, there are some complications:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;I have some personal savings, which I could choose to donate. Should I count them as part of my donation money?&lt;/li&gt;
  &lt;li&gt;I might earn significant income in the future. Right now it looks like I won’t, but I might do more earning to give at some point, or I might take a direct-work job that happens to pay well. If I expect to earn more in the future, then I should spend more of my DAF now.&lt;/li&gt;
  &lt;li&gt;I didn’t donate much money for the last few years. Should I do catch-up donations this year? Or maybe spread out my catch-up donations over the next few years?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I didn’t come up with good answers to any of these questions. Ultimately I chose how much to donate based on what felt reasonable.&lt;/p&gt;

&lt;h2 id=&quot;diversifying-donations-as-a-trade&quot;&gt;Diversifying donations as a trade&lt;/h2&gt;

&lt;p&gt;I had an idea based on &lt;a href=&quot;https://forum.effectivealtruism.org/posts/zuqpqqFoue5LyutTv/the-ea-community-does-not-own-its-donors-money?commentId=HquctY3LJBt42fssh&quot;&gt;this comment&lt;/a&gt; by Oliver Habryka. He describes a trade between members of the EA community where some people do object-level work (relinquishing a high-paying job) and others earn money. He argues that when this trade occurs, the people doing object-level work should have some ownership over the funds that earners-to-give have earned.&lt;/p&gt;

&lt;p&gt;I spent a while earning to give. So arguably I should donate money to people who started out in a similar position as me but went into direct work instead. Essentially, I should (&lt;a href=&quot;https://www.lesswrong.com/tag/acausal-trade&quot;&gt;acausally&lt;/a&gt;) trade with altruists who could’ve earned a lot of money but didn’t. And because there are many such people, arguably I should split my donations across many of them instead of only donating to my #1 favorite thing.&lt;/p&gt;

&lt;p&gt;But this argument raises some questions. Who exactly was in a “similar position as me”? What about people who aren’t members of the EA community, but who are nonetheless doing similarly valuable work? What about people who didn’t have the necessary skill set to earn a lot of money, so they never made a choice not to?&lt;/p&gt;

&lt;p&gt;I decided not to further pursue this line of reasoning because I couldn’t figure out how to make sense of it. I just did the obvious thing of donating to the org(s) that looked most cost-effective on the margin.&lt;/p&gt;

&lt;h2 id=&quot;cooperating-with-the-survival-and-flourishing-fund&quot;&gt;Cooperating with the Survival and Flourishing Fund&lt;/h2&gt;

&lt;p&gt;Should I donate less money to orgs that have received grants from the &lt;a href=&quot;https://survivalandflourishing.fund/&quot;&gt;Survival and Flourishing Fund&lt;/a&gt; (SFF)?&lt;/p&gt;

&lt;p&gt;I want to be cooperative with SFF. If I donate less to an org that’s received SFF funding, that seems uncooperative.&lt;/p&gt;

&lt;p&gt;SFF has the &lt;a href=&quot;https://survivalandflourishing.fund/s-process&quot;&gt;S-process&lt;/a&gt;, which is a fancy method for allocating donations from a group of value-aligned donors who each want to be the donor of last resort, but who also want to make sure their favored orgs get funded. I could cooperate with SFF by participating in this process.&lt;/p&gt;

&lt;p&gt;I asked them if they wanted to add my money to the S-process and they declined, so I consider myself to have officially Cooperated and now I’m allowed to donate less to orgs that received SFF funding. I don’t think SFF really cares if its donations trade off against mine because I have much less money than it does.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>Where I Am Donating in 2024</title>
				<pubDate>Mon, 18 Nov 2024 00:00:00 -0800</pubDate>
				<link>http://mdickens.me/2024/11/18/where_i_am_donating_in_2024/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/11/18/where_i_am_donating_in_2024/</guid>
                <description>
                  
                  
                  
                  &lt;h2 id=&quot;summary&quot;&gt;Summary&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Last updated 2025-04-25.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;It’s &lt;a href=&quot;https://mdickens.me/2016/10/31/where_i_am_donating_in_2016/&quot;&gt;been a while&lt;/a&gt; since I last put serious thought into where to donate. Well I’m putting thought into it this year and I’m changing my mind on some things.&lt;/p&gt;

&lt;p&gt;I now put more priority on existential risk (especially AI risk), and less on animal welfare and global priorities research. I believe I previously gave too little consideration to x-risk for emotional reasons, and I’ve managed to reason myself out of those emotions.&lt;/p&gt;

&lt;p&gt;Within x-risk:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;AI is the most important source of risk.&lt;/li&gt;
  &lt;li&gt;There is a disturbingly high probability that alignment research won’t solve alignment by the time superintelligent AI arrives. Policy work seems more promising.&lt;/li&gt;
  &lt;li&gt;Specifically, I am most optimistic about policy advocacy for government regulation to pause/slow down AI development.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the rest of this post, I will explain:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Why I prioritize x-risk over &lt;a href=&quot;#s-risk-research-and-animal-focused-longtermism&quot;&gt;animal-focused longtermist work&lt;/a&gt; and &lt;a href=&quot;#x-risk-vs-global-priorities-research&quot;&gt;global priorities research&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Why I prioritize AI policy over &lt;a href=&quot;#ai-safety-technical-research-vs-policy&quot;&gt;AI alignment research&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#man-versus-man-conflicts-within-ai-policy&quot;&gt;My beliefs&lt;/a&gt; about what kinds of policy work are best.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Then I provide a &lt;a href=&quot;#organizations&quot;&gt;list of organizations&lt;/a&gt; working on AI policy and my evaluation of each of them, and &lt;a href=&quot;#where-im-donating&quot;&gt;where&lt;/a&gt; I plan to donate.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Cross-posted to the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/jAfhxWSzsw4pLypRt/where-i-am-donating-in-2024&quot;&gt;Effective Altruism Forum&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#summary&quot; id=&quot;markdown-toc-summary&quot;&gt;Summary&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#i-dont-like-donating-to-x-risk&quot; id=&quot;markdown-toc-i-dont-like-donating-to-x-risk&quot;&gt;I don’t like donating to x-risk&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#cause-prioritization&quot; id=&quot;markdown-toc-cause-prioritization&quot;&gt;Cause prioritization&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#s-risk-research-and-animal-focused-longtermism&quot; id=&quot;markdown-toc-s-risk-research-and-animal-focused-longtermism&quot;&gt;S-risk research and animal-focused longtermism&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#x-risk-vs-global-priorities-research&quot; id=&quot;markdown-toc-x-risk-vs-global-priorities-research&quot;&gt;X-risk vs. global priorities research&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#prioritization-within-x-risk&quot; id=&quot;markdown-toc-prioritization-within-x-risk&quot;&gt;Prioritization within x-risk&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#ai-safety-technical-research-vs-policy&quot; id=&quot;markdown-toc-ai-safety-technical-research-vs-policy&quot;&gt;AI safety technical research vs. policy&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#quantitative-model-on-research-vs-policy&quot; id=&quot;markdown-toc-quantitative-model-on-research-vs-policy&quot;&gt;Quantitative model on research vs. policy&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#man-versus-man-conflicts-within-ai-policy&quot; id=&quot;markdown-toc-man-versus-man-conflicts-within-ai-policy&quot;&gt;“Man versus man” conflicts within AI policy&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#parallel-safetycapabilities-vs-slowing-ai&quot; id=&quot;markdown-toc-parallel-safetycapabilities-vs-slowing-ai&quot;&gt;Parallel safety/capabilities vs. slowing AI&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#if-we-dont-advance-capabilities-china-will-or-some-other-company-that-doesnt-care-about-safety-will&quot; id=&quot;markdown-toc-if-we-dont-advance-capabilities-china-will-or-some-other-company-that-doesnt-care-about-safety-will&quot;&gt;&lt;strong&gt;If we don’t advance capabilities, China will. Or some other company that doesn’t care about safety will.&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#ai-companies-need-to-build-state-of-the-art-sota-models-so-they-can-learn-how-to-align-those-models&quot; id=&quot;markdown-toc-ai-companies-need-to-build-state-of-the-art-sota-models-so-they-can-learn-how-to-align-those-models&quot;&gt;&lt;strong&gt;AI companies need to build state-of-the-art (SOTA) models so they can learn how to align those models.&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#we-need-to-develop-ai-as-soon-as-possible-because-it-will-greatly-improve-peoples-lives-and-were-losing-out-on-a-huge-opportunity-cost&quot; id=&quot;markdown-toc-we-need-to-develop-ai-as-soon-as-possible-because-it-will-greatly-improve-peoples-lives-and-were-losing-out-on-a-huge-opportunity-cost&quot;&gt;&lt;strong&gt;We need to develop AI as soon as possible because it will greatly improve people’s lives and we’re losing out on a huge opportunity cost.&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#we-should-advance-capabilities-to-avoid-a-hardware-overhang-a-situation-where-ai-can-be-improved-purely-by-throwing-more-hardware-at-it-which-is-potentially-dangerous-because-it-could-cause-ai-to-leap-forward-without-giving-people-time-to-prepare&quot; id=&quot;markdown-toc-we-should-advance-capabilities-to-avoid-a-hardware-overhang-a-situation-where-ai-can-be-improved-purely-by-throwing-more-hardware-at-it-which-is-potentially-dangerous-because-it-could-cause-ai-to-leap-forward-without-giving-people-time-to-prepare&quot;&gt;&lt;strong&gt;We should advance capabilities to avoid a “hardware overhang”&lt;/strong&gt;: a situation where AI can be improved purely by throwing more hardware at it, which is potentially dangerous because it could cause AI to leap forward without giving people time to prepare.&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#we-need-agi-to-prevent-some-other-existential-risk-from-killing-everyone&quot; id=&quot;markdown-toc-we-need-agi-to-prevent-some-other-existential-risk-from-killing-everyone&quot;&gt;&lt;strong&gt;We need AGI to prevent some other existential risk from killing everyone.&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#its-okay-to-advance-capabilities-because-ai-does-not-pose-an-existential-risk&quot; id=&quot;markdown-toc-its-okay-to-advance-capabilities-because-ai-does-not-pose-an-existential-risk&quot;&gt;&lt;strong&gt;It’s okay to advance capabilities because AI does not pose an existential risk.&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#freedom-vs-regulation&quot; id=&quot;markdown-toc-freedom-vs-regulation&quot;&gt;Freedom vs. regulation&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#regulations-to-slow-ai-would-require-the-government-to-take-authoritarian-measures&quot; id=&quot;markdown-toc-regulations-to-slow-ai-would-require-the-government-to-take-authoritarian-measures&quot;&gt;&lt;strong&gt;Regulations to slow AI would require the government to take authoritarian measures.&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#regulations-to-slow-ai-might-be-nearly-impossible-to-lift-even-if-ai-alignment-gets-solved-and-then-we-wont-get-the-glorious-transhumanist-future&quot; id=&quot;markdown-toc-regulations-to-slow-ai-might-be-nearly-impossible-to-lift-even-if-ai-alignment-gets-solved-and-then-we-wont-get-the-glorious-transhumanist-future&quot;&gt;&lt;strong&gt;Regulations to slow AI might be nearly impossible to lift even if AI alignment gets solved, and then we won’t get the glorious transhumanist future.&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#slow-nuanced-regulation-vs-fast-coarse-regulation&quot; id=&quot;markdown-toc-slow-nuanced-regulation-vs-fast-coarse-regulation&quot;&gt;Slow nuanced regulation vs. fast coarse regulation&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#working-with-vs-against-ai-companies&quot; id=&quot;markdown-toc-working-with-vs-against-ai-companies&quot;&gt;Working with vs. against AI companies&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#political-diplomacy-vs-advocacy&quot; id=&quot;markdown-toc-political-diplomacy-vs-advocacy&quot;&gt;Political diplomacy vs. advocacy&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#conflicts-that-arent-man-vs-man-but-nonetheless-require-an-answer&quot; id=&quot;markdown-toc-conflicts-that-arent-man-vs-man-but-nonetheless-require-an-answer&quot;&gt;Conflicts that aren’t “man vs. man” but nonetheless require an answer&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#pause-vs-responsible-scaling-policy-rsp&quot; id=&quot;markdown-toc-pause-vs-responsible-scaling-policy-rsp&quot;&gt;Pause vs. Responsible Scaling Policy (RSP)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#policy-research-vs-policy-advocacy&quot; id=&quot;markdown-toc-policy-research-vs-policy-advocacy&quot;&gt;Policy research vs. policy advocacy&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#advocacy-directed-at-policy-makers-vs-the-general-public&quot; id=&quot;markdown-toc-advocacy-directed-at-policy-makers-vs-the-general-public&quot;&gt;Advocacy directed at policy-makers vs. the general public&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#organizations&quot; id=&quot;markdown-toc-organizations&quot;&gt;Organizations&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#important-disclaimers&quot; id=&quot;markdown-toc-important-disclaimers&quot;&gt;Important disclaimers&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#ai-policy-institute&quot; id=&quot;markdown-toc-ai-policy-institute&quot;&gt;AI Policy Institute&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#ai-safety-and-governance-fund&quot; id=&quot;markdown-toc-ai-safety-and-governance-fund&quot;&gt;AI Safety and Governance Fund&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#ai-standards-lab&quot; id=&quot;markdown-toc-ai-standards-lab&quot;&gt;AI Standards Lab&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#campaign-for-ai-safety&quot; id=&quot;markdown-toc-campaign-for-ai-safety&quot;&gt;Campaign for AI Safety&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#centre-for-enabling-ea-learning-and-research-ceealar&quot; id=&quot;markdown-toc-centre-for-enabling-ea-learning-and-research-ceealar&quot;&gt;Centre for Enabling EA Learning and Research (CEEALAR)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#center-for-ai-policy&quot; id=&quot;markdown-toc-center-for-ai-policy&quot;&gt;Center for AI Policy&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#center-for-ai-safety&quot; id=&quot;markdown-toc-center-for-ai-safety&quot;&gt;Center for AI Safety&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#center-for-human-compatible-ai&quot; id=&quot;markdown-toc-center-for-human-compatible-ai&quot;&gt;Center for Human-Compatible AI&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#center-for-long-term-resilience&quot; id=&quot;markdown-toc-center-for-long-term-resilience&quot;&gt;Center for Long-Term Resilience&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#center-for-security-and-emerging-technology-cset&quot; id=&quot;markdown-toc-center-for-security-and-emerging-technology-cset&quot;&gt;Center for Security and Emerging Technology (CSET)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#centre-for-long-term-policy&quot; id=&quot;markdown-toc-centre-for-long-term-policy&quot;&gt;Centre for Long-Term Policy&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#centre-for-the-governance-of-ai&quot; id=&quot;markdown-toc-centre-for-the-governance-of-ai&quot;&gt;Centre for the Governance of AI&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#civai&quot; id=&quot;markdown-toc-civai&quot;&gt;CivAI&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#control-ai&quot; id=&quot;markdown-toc-control-ai&quot;&gt;Control AI&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#existential-risk-observatory&quot; id=&quot;markdown-toc-existential-risk-observatory&quot;&gt;Existential Risk Observatory&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#future-of-life-institute-fli&quot; id=&quot;markdown-toc-future-of-life-institute-fli&quot;&gt;Future of Life Institute (FLI)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#future-society&quot; id=&quot;markdown-toc-future-society&quot;&gt;Future Society&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#horizon-institute-for-public-service&quot; id=&quot;markdown-toc-horizon-institute-for-public-service&quot;&gt;Horizon Institute for Public Service&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#institute-for-ai-policy-and-strategy&quot; id=&quot;markdown-toc-institute-for-ai-policy-and-strategy&quot;&gt;Institute for AI Policy and Strategy&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#lightcone-infrastructure&quot; id=&quot;markdown-toc-lightcone-infrastructure&quot;&gt;Lightcone Infrastructure&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#machine-intelligence-research-institute-miri&quot; id=&quot;markdown-toc-machine-intelligence-research-institute-miri&quot;&gt;Machine Intelligence Research Institute (MIRI)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#manifund&quot; id=&quot;markdown-toc-manifund&quot;&gt;Manifund&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#model-evaluation-and-threat-research-metr&quot; id=&quot;markdown-toc-model-evaluation-and-threat-research-metr&quot;&gt;Model Evaluation and Threat Research (METR)&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#palisade-research&quot; id=&quot;markdown-toc-palisade-research&quot;&gt;Palisade Research&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#pauseai-global&quot; id=&quot;markdown-toc-pauseai-global&quot;&gt;PauseAI Global&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#pauseai-us&quot; id=&quot;markdown-toc-pauseai-us&quot;&gt;PauseAI US&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#sentinel-rapid-emergency-response-team&quot; id=&quot;markdown-toc-sentinel-rapid-emergency-response-team&quot;&gt;Sentinel rapid emergency response team&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#simon-institute-for-longterm-governance&quot; id=&quot;markdown-toc-simon-institute-for-longterm-governance&quot;&gt;Simon Institute for Longterm Governance&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#stop-ai&quot; id=&quot;markdown-toc-stop-ai&quot;&gt;Stop AI&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#where-im-donating&quot; id=&quot;markdown-toc-where-im-donating&quot;&gt;Where I’m donating&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#prioritization-within-my-top-five&quot; id=&quot;markdown-toc-prioritization-within-my-top-five&quot;&gt;Prioritization within my top five&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#where-im-donating-this-is-the-section-in-which-i-actually-say-where-im-donating&quot; id=&quot;markdown-toc-where-im-donating-this-is-the-section-in-which-i-actually-say-where-im-donating&quot;&gt;Where I’m donating (this is the section in which I actually say where I’m donating)&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#changelog&quot; id=&quot;markdown-toc-changelog&quot;&gt;Changelog&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;i-dont-like-donating-to-x-risk&quot;&gt;I don’t like donating to x-risk&lt;/h1&gt;

&lt;p&gt;(This section is about my personal motivations. The arguments and logic start in the &lt;a href=&quot;#cause-prioritization&quot;&gt;next section&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;For more than a decade I’ve leaned toward longtermism and I’ve been concerned about existential risk, but I’ve never directly donated to x-risk reduction. I dislike x-risk on an emotional level for a few reasons:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;In the present day, aggregate animal welfare matters far more than aggregate human welfare (credence: 90%). Present-day animal suffering is so extraordinarily vast that on some level it feels irresponsible to prioritize anything else, even though rationally I buy the arguments for longtermism.&lt;/li&gt;
  &lt;li&gt;Animal welfare is more neglected than x-risk (credence: 90%).&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;People who prioritize x-risk often disregard animal welfare (or the welfare of non-human beings, whatever shape those beings might take in the future). That makes me distrust their reasoning on cause prioritization. (This isn’t universally true—I know some people who care about animals but still prioritize x-risk.)&lt;/li&gt;
  &lt;li&gt;I find it distasteful the way people often talk about “human extinction”, which seemingly ignores the welfare of all other sentient beings.&lt;sup id=&quot;fnref:13&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:13&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a while, I donated to animal-focused orgs that looked good under longtermism, like &lt;a href=&quot;https://www.sentienceinstitute.org/&quot;&gt;Sentience Institute&lt;/a&gt;. In recent years, I’ve avoided thinking about cause prioritization by supporting global priorities research (such as by donating to the &lt;a href=&quot;https://globalprioritiesinstitute.org/&quot;&gt;Global Priorities Institute&lt;/a&gt;)—pay them to think about cause prioritization so I don’t have to. I still believe there’s a good case for that sort of research, but the case for existential risk is stronger (more on this &lt;a href=&quot;#x-risk-vs-global-priorities-research&quot;&gt;below&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;I’ve spent too long ignoring my rationally-formed beliefs about x-risk because they felt emotionally wrong. I’m normally pretty good at biting bullets. I should bite this bullet, too.&lt;/p&gt;

&lt;p&gt;This decision to prioritize x-risk (mostly&lt;sup id=&quot;fnref:36&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:36&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;) didn’t happen because I changed my mind. It happened because I realized I was stupidly letting my emotional distaste toward x-risk sway my decision-making.&lt;/p&gt;

&lt;p&gt;On the other hand, I’ve become more worried about AI in the last few years. My P(doom) hasn’t really gone up, but the threat of misaligned AI has become more &lt;em&gt;visceral&lt;/em&gt;. I believe unaligned AI is my most likely cause of death, and I’d rather not die.&lt;sup id=&quot;fnref:54&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:54&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;h1 id=&quot;cause-prioritization&quot;&gt;Cause prioritization&lt;/h1&gt;
&lt;h2 id=&quot;s-risk-research-and-animal-focused-longtermism&quot;&gt;S-risk research and animal-focused longtermism&lt;/h2&gt;

&lt;p&gt;I believe animal-focused (or non-human-focused) longtermist work is important (credence: 95%), and that it’s far more neglected than (human-focused) x-risk reduction (credence: 99%). I believe the same about &lt;a href=&quot;https://forum.effectivealtruism.org/posts/k6fJXBnc7YnDcxsQm/s-risks-fates-worse-than-extinction&quot;&gt;s-risk&lt;/a&gt; research (and s-risks heavily overlap with animal-focused longtermism, so that’s not a coincidence). But I also believe:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;At equivalent levels of funding, marginal work on x-risk is more cost-effective (credence: 75%) because non-human welfare is likely to &lt;a href=&quot;https://mdickens.me/2015/08/15/is_preventing_human_extinction_good/&quot;&gt;turn out okay&lt;/a&gt; if we develop friendly AI.&lt;/li&gt;
  &lt;li&gt;The cost-effectiveness of x-risk funding diminishes slowly enough that it’s better even at current funding levels (credence: 65%), especially because some of the most promising sub-fields within x-risk remain poorly funded.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Improving animal welfare likely has good flow-through effects into the distant future. But I think those flow-through effects don’t have a huge expected value compared to x-risk reduction because they only matter under certain conditions. (I discussed these conditions a while back in &lt;a href=&quot;https://mdickens.me/2015/08/15/is_preventing_human_extinction_good/&quot;&gt;Is Preventing Human Extinction Good?&lt;/a&gt; and &lt;a href=&quot;https://mdickens.me/2015/09/10/on_values_spreading/&quot;&gt;On Values Spreading&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;This judgment is hard to make with confidence because it requires speculating about what the distant future will look like.&lt;/p&gt;

&lt;p&gt;In &lt;a href=&quot;https://forum.effectivealtruism.org/posts/EkKYqeAy3ArupKuYn/my-donations-2023-marcus-abramovitch&quot;&gt;Marcus Abramovitch’s excellent writeup&lt;/a&gt; on where he donated in 2023, he said,&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;I don’t think many x-risk organizations are fundamentally constrained on dollars and several organizations could be a lot more frugal and have approximately equal results.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I basically agree with this, but I think some x-risk orgs do need more funding, and they’re among the most promising.&lt;/p&gt;

&lt;h2 id=&quot;x-risk-vs-global-priorities-research&quot;&gt;X-risk vs. global priorities research&lt;/h2&gt;

&lt;p&gt;A dilemma:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;We can’t &lt;em&gt;fully&lt;/em&gt; align AI until we solve some foundational problems in ethics and other fields.&lt;/li&gt;
  &lt;li&gt;If we don’t align AI, we will go extinct before we solve those foundational problems.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;(One proposed solution is to conduct a &lt;a href=&quot;https://forum.effectivealtruism.org/topics/long-reflection&quot;&gt;long reflection&lt;/a&gt;, and basically put a superintelligent AI on standby mode until that’s done. This proposal has some issues but I haven’t heard anything better.)&lt;/p&gt;

&lt;p&gt;So, should we focus on x-risk or global priorities research?&lt;/p&gt;

&lt;p&gt;Ultimately, I think x-risk is the higher priority (credence: 70%). If we build a (mostly) friendly AI without really figuring out some details of AI alignment, maybe we can work things out from there. But the &lt;a href=&quot;https://globalprioritiesinstitute.org/wp-content/uploads/gpi-research-agenda.pdf&quot;&gt;problems&lt;/a&gt; in global priorities research seem so complex that we have essentially no chance of solving them before AGI arrives (unless AGI turns out to take much longer than expected), regardless of how much funding goes to global priorities research.&lt;/p&gt;

&lt;h2 id=&quot;prioritization-within-x-risk&quot;&gt;Prioritization within x-risk&lt;/h2&gt;

&lt;p&gt;Among the various existential risks, AI risk stands out as clearly the most important. I believe this for essentially the same reasons as most others who believe it.&lt;/p&gt;

&lt;p&gt;In short:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Natural risks are less concerning than human-caused risks (credence: 98%).&lt;/li&gt;
  &lt;li&gt;Climate change is not a serious existential threat (credence: 90%). See John Halstead’s report &lt;a href=&quot;https://johnhalstead.org/wp-content/uploads/2023/11/Climate-Change-Longtermism-1.pdf&quot;&gt;Climate Change &amp;amp; Longtermism&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Engineered pandemics are considerably less likely to cause extinction than AI (credence: 95%). I’ve heard biologists in the x-risk space claim that it would be very hard for a pandemic to cause total extinction.&lt;/li&gt;
  &lt;li&gt;Nuclear war is worrisome but less of an extinction risk than AI (credence: 85%). See 80,000 Hours’ &lt;a href=&quot;https://forum.effectivealtruism.org/posts/j8nyJ3pv5Q4rz4Bg2/new-80k-problem-profile-nuclear-weapons#How_likely_is_an_existential_catastrophe_resulting_from_nuclear_war_&quot;&gt;table of x-risk estimates&lt;/a&gt; for nuclear war.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For more, see Michael Aird’s &lt;a href=&quot;https://forum.effectivealtruism.org/posts/JQQAQrunyGGhzE23a/database-of-existential-risk-estimates&quot;&gt;database of existential risk estimates&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;ai-safety-technical-research-vs-policy&quot;&gt;AI safety technical research vs. policy&lt;/h2&gt;

&lt;p&gt;There are a few high-level strategies for dealing with AI risk. We can broadly classify them into (1) technical research and (2) policy. Basically:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;technical research = figure out how to prevent AI from killing everyone&lt;/li&gt;
  &lt;li&gt;policy = increase the probability that governments enact policies/regulations that reduce x-risk&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;(You could further divide policy into research vs. advocacy—i.e., figure out what makes for good regulations vs. advocate for regulations to be enacted. I’ll talk more about that later.)&lt;/p&gt;

&lt;p&gt;I don’t have any expertise in AI&lt;sup id=&quot;fnref:25&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:25&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; and I don’t know what kinds of alignment research are most promising, but experts can’t seem to agree either—some think &lt;a href=&quot;https://www.alignmentforum.org/posts/YTq4X6inEudiHkHDF/prosaic-ai-alignment&quot;&gt;prosaic alignment&lt;/a&gt; will work, others think we need fundamentally new paradigms. (I lean toward the latter (credence: 70%).)&lt;/p&gt;

&lt;p&gt;But I don’t see how we are going to solve AI alignment. The best existing research seems like maybe it has some chance of someday leading to some method that could eventually solve alignment with enough work, perhaps.&lt;sup id=&quot;fnref:47&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:47&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; Our best hope is that either (1) AGI turns out to be much harder to develop than it looks or (2) solving alignment turns out to be really easy for some unforeseen reason.&lt;sup id=&quot;fnref:15&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:15&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt; But those hopes are non-actionable—I can’t increase their probability by donating money.&lt;/p&gt;

&lt;p&gt;Ever since I became concerned about AI risk (about a decade ago), I’ve weakly believed that we were not on pace to solve alignment before AGI arrived. But I thought perhaps technical research would become sufficiently popular as the dangers of AI became more apparent. By now, it’s clear that that isn’t happening, and we’re not going to solve AI alignment in time unless the problem turns out to be easy.&lt;/p&gt;

&lt;p&gt;I used to be even more pessimistic about AI policy than technical research,&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt; but now I think it’s the more promising approach (credence: 80%). Surprisingly (to me), AI safety (as in notkilleveryoneism) is now kinda sorta mainstream, and there’s some degree of political will for creating regulations that could prevent AI from killing everyone. SB 1047, which might have meaningfully decreased x-risk, saw &lt;a href=&quot;https://theaipi.org/voters-support-sb1047-in-collaborative-poll/&quot;&gt;widespread support&lt;/a&gt;. (Unfortunately, one &lt;a href=&quot;https://x.com/KelseyTuoc/status/1838279944750927929&quot;&gt;particular guy&lt;/a&gt; with veto power did not support it.&lt;sup id=&quot;fnref:16&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:16&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt;)&lt;/p&gt;

&lt;p&gt;Another consideration: bad policy work can backfire. I don’t know much about policy and I’m relatively bad at understanding people, so on priors, I don’t expect to be good at figuring out which policy efforts will work. I used to think I should defer to people with better social skills. But now I’ve seen some of the poor results produced by policy orgs that care a lot about reputation management, and I’ve seen how messaging about extinction is much more palatable than the people with good social skills predicted (e.g., as demonstrated by &lt;a href=&quot;https://theaipi.org/media/&quot;&gt;public opinion polling&lt;/a&gt;), so I think I overrated others’ judgment and underrated my own. As a consequence, I feel more confident that I can identify which policy orgs are doing good work.&lt;/p&gt;

&lt;p&gt;In summary:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;I don’t think technical research is going to work.&lt;/li&gt;
  &lt;li&gt;Policy might work.&lt;/li&gt;
  &lt;li&gt;I think I’m qualified enough to evaluate policy orgs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So I want to donate to something related to AI policy.&lt;/p&gt;

&lt;h3 id=&quot;quantitative-model-on-research-vs-policy&quot;&gt;Quantitative model on research vs. policy&lt;/h3&gt;

&lt;p&gt;I built a coarse &lt;a href=&quot;https://squigglehub.org/models/mdickens/ai-research-vs-policy&quot;&gt;quantitative model&lt;/a&gt; on the expected value of donations to technical research vs. policy. The model inputs are very rough but the model illustrates some important principles.&lt;/p&gt;

&lt;p&gt;(Disclaimer: First I decided to donate to AI policy, then I built the model, not the other way around.&lt;sup id=&quot;fnref:38&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:38&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt; If the model had disagreed with my beliefs, then I would have changed the model. But if I couldn’t find a reasonable way to make the model fit my beliefs, then I would have changed my beliefs.)&lt;/p&gt;

&lt;p&gt;Preventing x-risk works like voting. All the expected value of your vote comes from the situation where the outcome is exactly tied and your vote breaks the tie. If the expected vote is close to 50/50, your vote has a high EV. If the expected vote count is far from 50/50, there’s an extremely small probability that your vote will matter.&lt;/p&gt;

&lt;p&gt;If I believed that it would cost (say) $10 billion to solve AI alignment, and also the total spending without my donation would be close to $10 billion, then my donation to alignment research has a high EV. But in fact I believe it probably costs much more than we’re going to spend&lt;sup id=&quot;fnref:63&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:63&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt; (assuming no regulations).&lt;/p&gt;

&lt;p&gt;On a naive view, that means a donation to alignment research has extremely low EV. But that’s not correct because it doesn’t account for uncertainty. My median guess is that solving AI alignment will cost maybe $100 billion, and we will only actually spend $1 billion.&lt;sup id=&quot;fnref:20&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:20&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt; If my credence intervals for those two numbers followed normal distributions, then the probability of making a difference would be incredibly small (like, one in “number of atoms in the solar system” small) because normal distributions have extremely low probability mass in the tails. But my beliefs have wide credence intervals, and they’re not normally distributed. So my distribution for cost-to-solve-alignment heavily overlaps with total-spending-on-alignment.&lt;/p&gt;

&lt;p&gt;It’s hard to have good intuitions for the probability that a donation makes a difference when that probability depends on the intersection of two overlapping fat-tailed distributions. That’s the sort of thing a quantitative model can help with, even if you don’t take the numbers too seriously.&lt;/p&gt;
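
&lt;p&gt;As a toy illustration (a minimal sketch, not the Squiggle model linked above; every number is a made-up placeholder): draw both quantities from fat-tailed lognormal distributions and count the simulated worlds in which a marginal donation is pivotal.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Made-up placeholder inputs, roughly matching the medians in the text:
# cost to solve alignment: lognormal, median $100 billion, very wide
cost = rng.lognormal(mean=np.log(100e9), sigma=2.0, size=n)
# total spending on alignment: lognormal, median $1 billion, wide
spend = rng.lognormal(mean=np.log(1e9), sigma=1.5, size=n)

donation = 1e6  # a hypothetical $1 million donation

# The donation is pivotal only in worlds where spending falls short of
# the cost without it but not with it.
pivotal = np.logical_and(np.less(spend, cost),
                         np.greater_equal(spend + donation, cost))
print(pivotal.mean())  # small, but not astronomically small
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Swap the lognormals for narrow normal distributions and the printed probability collapses toward zero, which is exactly the point about tails.&lt;/p&gt;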

&lt;p&gt;Ultimately, I think AI policy has higher EV than technical research. According to &lt;a href=&quot;https://squigglehub.org/models/mdickens/ai-research-vs-policy&quot;&gt;my made-up numbers&lt;/a&gt;, donating to policy is ~3x more cost-effective than donating to research. Somewhat more pessimistic inputs can change the ratio to &amp;gt;1000x.&lt;sup id=&quot;fnref:39&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:39&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt; The ratio flips to &amp;lt;1x if you think there’s close to a 50/50 chance that we solve alignment without any government intervention.&lt;/p&gt;

&lt;h1 id=&quot;man-versus-man-conflicts-within-ai-policy&quot;&gt;“Man versus man” conflicts within AI policy&lt;/h1&gt;

&lt;p&gt;Some prioritization decisions are positive sum: people can work on technical research and policy at the same time. But others are zero sum. I’m wary of &lt;a href=&quot;https://slatestarcodex.com/2015/09/22/beware-systemic-change/&quot;&gt;“man versus man” conflict&lt;/a&gt;—of working in opposition to other (loosely) value-aligned people. But in policy, sometimes you have to engage in “man versus man” conflict. I want to think extra carefully before doing so.&lt;/p&gt;

&lt;p&gt;There are a few such conflicts within AI policy.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Strategy A&lt;/th&gt;
      &lt;th&gt;Strategy B&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Do capabilities and safety research in parallel. We need more advanced models to better understand what AGI will look like.&lt;/td&gt;
      &lt;td&gt;Slow down capabilities research to buy more time for safety research.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Don’t push for regulations, they will excessively harm technological development.&lt;/td&gt;
      &lt;td&gt;Push for regulations because the risk is worth it.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Take the time to figure out nuanced regulations that won’t impede the good parts of AI.&lt;/td&gt;
      &lt;td&gt;Push for regulations ASAP, even if they have worse side effects.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Cooperate with big AI companies to persuade them to behave more safely.&lt;/td&gt;
      &lt;td&gt;Work against AI companies to stop their dangerous behaviors.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Diplomatically develop political connections, and later use those to push for AI safety policies.&lt;/td&gt;
      &lt;td&gt;Loudly argue for AI safety policies now, even if it makes us look weird.&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;In every case, I like strategy B better. But how sure am I that I’m on the right side?&lt;/p&gt;

&lt;h2 id=&quot;parallel-safetycapabilities-vs-slowing-ai&quot;&gt;Parallel safety/capabilities vs. slowing AI&lt;/h2&gt;

&lt;p&gt;What are the arguments for advancing capabilities?&lt;/p&gt;

&lt;h4 id=&quot;if-we-dont-advance-capabilities-china-will-or-some-other-company-that-doesnt-care-about-safety-will&quot;&gt;&lt;strong&gt;If we don’t advance capabilities, China will. Or some other company that doesn’t care about safety will.&lt;/strong&gt;&lt;/h4&gt;

&lt;p&gt;This is the strongest argument by my assessment, but I still think it’s wrong (credence: 90%).&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The companies that have done the most to accelerate AI development have all done so in the name of safety. If they didn’t believe advancing the frontier was the safest move, the world would be in a much safer position right now.&lt;/li&gt;
  &lt;li&gt;It’s likely that international treaties could prevent arms races—they worked well(ish) for nuclear weapons.&lt;/li&gt;
  &lt;li&gt;China is behind the US on AI. There is no need to race harder when you’re ahead.&lt;/li&gt;
  &lt;li&gt;By my amateurish judgment, China doesn’t seem as interested in an arms race as the US. You don’t need to race against someone who’s not racing.&lt;/li&gt;
  &lt;li&gt;How sure are you that it’s better for the United States to develop AI first? (China is less interested in controlling world politics than the US, and the Chinese government seems more concerned about AI risk than the US government.)&lt;/li&gt;
  &lt;li&gt;Who develops superintelligent AI first only matters in the narrow scenario where AI alignment is easy but also the AI can and will be used by its creators to take over the world:
    &lt;ul&gt;
      &lt;li&gt;If AI alignment is hard, it doesn’t matter who develops it first because everyone dies either way.&lt;/li&gt;
      &lt;li&gt;If the AI is &lt;em&gt;fully&lt;/em&gt; aligned, it will refuse to fulfill any unethical requests its creator makes (such as taking over the world).&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;I don’t think we as a society have a good grasp on the game theory of arms races but I feel like the solution isn’t “push the arms race forward even faster”.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;ai-companies-need-to-build-state-of-the-art-sota-models-so-they-can-learn-how-to-align-those-models&quot;&gt;&lt;strong&gt;AI companies need to build state-of-the-art (SOTA) models so they can learn how to align those models.&lt;/strong&gt;&lt;/h4&gt;

&lt;p&gt;I’ve heard people at Anthropic make this argument. But it’s disingenuous (or at least motivated)&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;14&lt;/a&gt;&lt;/sup&gt; because Anthropic is accelerating capabilities, not just matching the capabilities of pre-existing models (and they have a history of &lt;a href=&quot;https://www.lesswrong.com/posts/JbE7KynwshwkXPJAJ/anthropic-release-claude-3-claims-greater-than-gpt-4?commentId=hwWB4yJyEGhEWud8C&quot;&gt;almost-but-not-technically lying&lt;/a&gt; about whether they were going to advance capabilities).&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;15&lt;/a&gt;&lt;/sup&gt; And the argument doesn’t really make sense because:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;They could satisfy their stated goal nearly as well by training models that are (say) 3/4 as good as the state of the art (credence: 90%).&lt;/li&gt;
  &lt;li&gt;Or they could make deals with other companies to do safety research on their pre-existing SOTA models. This would satisfy their stated goal (credence: 98%). Companies might not be willing to cooperate like this, but surely it’s worth trying (and then &lt;a href=&quot;https://www.lesswrong.com/posts/fhEPnveFhb9tmd7Pe/use-the-try-harder-luke&quot;&gt;trying harder&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;There are many types of plausibly-productive alignment research that don’t require SOTA models (credence: 90%).&lt;sup id=&quot;fnref:41&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:41&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;16&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Having SOTA models doesn’t differentially improve alignment—it teaches you just as much about how to improve capabilities (credence: 60%).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the bolded argument is correct, then AI companies should:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Temporarily stop AI development.&lt;/li&gt;
  &lt;li&gt;Learn everything they possibly can about AI alignment with the current model.&lt;/li&gt;
  &lt;li&gt;Publish a report on how they would use a more capable AI to improve alignment.&lt;/li&gt;
  &lt;li&gt;Get review from third-party alignment researchers.&lt;/li&gt;
  &lt;li&gt;If reviewers have a strong consensus that the report is reasonable, only then resume AI development.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That is the sort of behavior you would see from a company that takes existential risk appropriately seriously.&lt;/p&gt;

&lt;h4 id=&quot;we-need-to-develop-ai-as-soon-as-possible-because-it-will-greatly-improve-peoples-lives-and-were-losing-out-on-a-huge-opportunity-cost&quot;&gt;&lt;strong&gt;We need to develop AI as soon as possible because it will greatly improve people’s lives and we’re losing out on a huge opportunity cost.&lt;/strong&gt;&lt;/h4&gt;

&lt;p&gt;This argument only makes sense if you have a very low P(doom) (like &amp;lt;0.1%) or if you place minimal value on future generations. Otherwise, it’s not worth recklessly endangering the future of humanity to bring utopia a few years (or maybe decades) sooner. The math on this is really simple—bringing AI sooner only benefits the current generation, but extinction harms all future generations. You don’t need to be a strong longtermist; you just need to accord significant value to people who aren’t born yet.&lt;/p&gt;
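
&lt;p&gt;To make that simple math concrete, here’s a back-of-the-envelope comparison. Every magnitude is an illustrative assumption, not an estimate I’m defending:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustrative assumptions, not serious estimates:
current_gen = 8e9      # people alive today
future_people = 1e15   # potential future people (conservative by longtermist lights)
life_years = 70        # person-years per life

# Benefit of utopia arriving 20 years sooner: current people enjoy it longer.
speedup_benefit = current_gen * 20

# Cost of a one-percentage-point increase in extinction risk.
extinction_cost = 0.01 * future_people * life_years

print(speedup_benefit)  # ~1.6e11 person-years
print(extinction_cost)  # ~7e14 person-years, thousands of times larger
&lt;/code&gt;&lt;/pre&gt;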

&lt;p&gt;I’ve heard a related argument that the size of the accessible lightcone is rapidly shrinking, so we need to build AI ASAP even if the risk is high. If you do the math, this argument doesn’t make any sense (credence: 95%). The value of the outer edge of the lightcone is extremely small compared to its total volume.&lt;sup id=&quot;fnref:9&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;17&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
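
&lt;p&gt;A quick sanity check on that claim, under a crude shrinking-sphere model (an assumption for illustration, not real cosmology): treat reachable resources as the volume of a sphere whose radius, in light-years, shrinks by about one light-year per year of delay. Since volume scales with the cube of the radius, the fraction lost per year of delay is roughly 3/R:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Crude shrinking-sphere model of the affectable universe.
R = 16e9     # rough comoving radius in light-years (assumed)
delay = 100  # years of delay before building AI

fraction_lost = 3 * delay / R
print(fraction_lost)  # ~1.9e-8, dwarfed by even a tiny reduction in P(doom)
&lt;/code&gt;&lt;/pre&gt;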

&lt;p&gt;AI could be the best thing that’s ever happened. But it can also be the best thing that’s ever happened 10/20/100 years from now, and if delaying AI lets us greatly reduce existential risk, then it’s worth the delay.&lt;/p&gt;

&lt;h4 id=&quot;we-should-advance-capabilities-to-avoid-a-hardware-overhang-a-situation-where-ai-can-be-improved-purely-by-throwing-more-hardware-at-it-which-is-potentially-dangerous-because-it-could-cause-ai-to-leap-forward-without-giving-people-time-to-prepare&quot;&gt;&lt;strong&gt;We should advance capabilities to avoid a “hardware overhang”&lt;/strong&gt;: a situation where AI can be improved purely by throwing more hardware at it, which is potentially dangerous because it could cause AI to leap forward without giving people time to prepare.&lt;/h4&gt;

&lt;p&gt;Sam Altman has made this argument. But he’s disingenuous (credence: 95%) because he also wants to &lt;a href=&quot;https://www.cnbc.com/2024/02/09/openai-ceo-sam-altman-reportedly-seeking-trillions-of-dollars-for-ai-chip-project.html&quot;&gt;fund hardware advances&lt;/a&gt;, which will increase hardware overhang.&lt;sup id=&quot;fnref:8&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;18&lt;/a&gt;&lt;/sup&gt; And:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;This argument implies that AI companies should stop looking for algorithmic improvements (credence: 90%) because they don’t produce overhang.&lt;/li&gt;
  &lt;li&gt;Pausing AI development would reduce demand for AI chips, slowing down hardware development.&lt;/li&gt;
  &lt;li&gt;Eliminating overhang only helps if we can meaningfully advance alignment using the higher level of capabilities. That seems unlikely to be worth the tradeoff because alignment has historically progressed very slowly. We are not on pace to solve alignment, with or without an overhang.
    &lt;ul&gt;
      &lt;li&gt;If we will be able to align a bigger model, shouldn’t it be even easier to align the models we currently have? But we don’t know how to align the models we have (beyond the superficial pseudo-alignment that RLHF produces).&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;An AI Impacts &lt;a href=&quot;https://blog.aiimpacts.org/p/are-there-examples-of-overhang-for&quot;&gt;report&lt;/a&gt; described some examples of overhang in other industries. “None of them match the behavior that people seem to expect will happen with hardware overhang.”&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;we-need-agi-to-prevent-some-other-existential-risk-from-killing-everyone&quot;&gt;&lt;strong&gt;We need AGI to prevent some other existential risk from killing everyone.&lt;/strong&gt;&lt;/h4&gt;

&lt;p&gt;Nearly every respectable person&lt;sup id=&quot;fnref:75&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:75&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;19&lt;/a&gt;&lt;/sup&gt; who has estimated x-risk probabilities &lt;a href=&quot;https://forum.effectivealtruism.org/posts/JQQAQrunyGGhzE23a/database-of-existential-risk-estimates&quot;&gt;agrees&lt;/a&gt; that AI is by far the largest x-risk in the next century. If AI is itself the biggest risk, then racing to build AGI to defuse a smaller risk increases total existential risk rather than reducing it.&lt;/p&gt;

&lt;h4 id=&quot;its-okay-to-advance-capabilities-because-ai-does-not-pose-an-existential-risk&quot;&gt;&lt;strong&gt;It’s okay to advance capabilities because AI does not pose an existential risk.&lt;/strong&gt;&lt;/h4&gt;

&lt;p&gt;This is a popular argument, but presumably everyone reading this already disagrees, so I’m not going to attempt to rebut it.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;The parallel safety/capabilities side of the argument seems weak to me (and relies on a lot of what looks like motivated reasoning), so I feel comfortable supporting the pause side (credence: 85%).&lt;sup id=&quot;fnref:10&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;20&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;But there’s some common ground:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Both sides should agree that slowing down hardware is good or at least neutral (credence: 75%).&lt;sup id=&quot;fnref:42&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:42&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;21&lt;/a&gt;&lt;/sup&gt; This alleviates every concern except for the one about the opportunity costs of delaying development.&lt;/li&gt;
  &lt;li&gt;Both sides should support regulations and international treaties that restrict the speed of AI development (credence: 65%). International treaties alleviate concerns about arms races and about needing to stay on the cutting edge.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;freedom-vs-regulation&quot;&gt;Freedom vs. regulation&lt;/h2&gt;

&lt;p&gt;Arguments against regulation:&lt;/p&gt;

&lt;h4 id=&quot;regulations-to-slow-ai-would-require-the-government-to-take-authoritarian-measures&quot;&gt;&lt;strong&gt;Regulations to slow AI would require the government to take authoritarian measures.&lt;/strong&gt;&lt;/h4&gt;

&lt;p&gt;This argument seems pretty wrong to me (credence: 95%). &lt;a href=&quot;https://mxschons.com/2024/comparing-ai-labs-and-pharmaceutical-companies/&quot;&gt;Other industries&lt;/a&gt; have much stricter regulations than AI without slipping into totalitarianism. If the regulations on AI GPUs were as strict as the ones on, say, pseudoephedrine, that would be sufficient to slow and monitor hardware development.&lt;sup id=&quot;fnref:12&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;22&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Even if the regulations required individual people to turn in their GPUs to the government (and I don’t know why that would be required because GPU manufacturing is pretty centralized), there’s already precedent for that sort of thing in relatively free societies, e.g. the Australian government’s 1996 mandatory buyback of certain firearms.&lt;/p&gt;

&lt;h4 id=&quot;regulations-to-slow-ai-might-be-nearly-impossible-to-lift-even-if-ai-alignment-gets-solved-and-then-we-wont-get-the-glorious-transhumanist-future&quot;&gt;&lt;strong&gt;Regulations to slow AI might be nearly impossible to lift even if AI alignment gets solved, and then we won’t get the glorious transhumanist future.&lt;/strong&gt;&lt;/h4&gt;

&lt;p&gt;I do think this is a real concern. Ultimately I believe it’s worth the tradeoff. And it does seem unlikely that excessive regulations could stay in place &lt;em&gt;forever&lt;/em&gt;—I doubt we’d have the knowledge to develop friendly AI, but not the regulatory freedom, for (say) a thousand years.&lt;/p&gt;

&lt;p&gt;(The United States essentially stopped developing nuclear power in the 1990s due to onerous regulations, but it just &lt;a href=&quot;https://apnews.com/article/georgia-power-nuclear-reactor-vogtle-9555e3f9169f2d58161056feaa81a425&quot;&gt;opened&lt;/a&gt; a new plant last year.)&lt;/p&gt;

&lt;h2 id=&quot;slow-nuanced-regulation-vs-fast-coarse-regulation&quot;&gt;Slow nuanced regulation vs. fast coarse regulation&lt;/h2&gt;

&lt;p&gt;Some argue that we should advocate for regulation, but push nuanced messaging to make sure we don’t hamstring economic development.&lt;/p&gt;

&lt;p&gt;This disagreement largely comes down to P(doom) and AI timelines:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;If P(doom) is low, it’s worth accepting some extra risk to figure out how to write careful regulations.&lt;/li&gt;
  &lt;li&gt;If timelines are long, we have plenty of time to figure out regulations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If delaying regulations increases P(doom) by a lowish number like one percentage point, I don’t think it’s worth it—economically stifling regulations are not 1% as bad as extinction.&lt;/p&gt;

&lt;p&gt;I think it’s unlikely that transformative AI comes in the next five years. But it’s not unthinkable. Metaculus (&lt;a href=&quot;https://www.metaculus.com/questions/5121/date-of-artificial-general-intelligence/&quot;&gt;1&lt;/a&gt;, &lt;a href=&quot;https://www.metaculus.com/questions/19356/transformative-ai-date/&quot;&gt;2&lt;/a&gt;, &lt;a href=&quot;https://www.metaculus.com/questions/384/humanmachine-intelligence-parity-by-2040/&quot;&gt;3&lt;/a&gt;) and &lt;a href=&quot;https://wiki.aiimpacts.org/ai_timelines/predictions_of_human-level_ai_timelines/ai_timeline_surveys/2023_expert_survey_on_progress_in_ai&quot;&gt;surveyed experts&lt;/a&gt; don’t think it’s unthinkable either. And there could be a long delay between &lt;em&gt;pushing for&lt;/em&gt; regulations and those regulations &lt;em&gt;being implemented&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Dario Amodei, CEO of Anthropic, &lt;a href=&quot;https://x.com/ai_ctrl/status/1806748781440061870&quot;&gt;believes&lt;/a&gt; human-level AI could arrive within 1–2 years. That’s not enough time to figure out nuanced regulations. If he’s right, we need regulations &lt;em&gt;right now.&lt;/em&gt; (Actually, the best time for regulations was ten years ago, but the second-best time is now.)&lt;/p&gt;

&lt;p&gt;Given that forecasts put a reasonable probability on transformative AI arriving very soon, I don’t see how it makes sense to delay regulations any more than we already have.&lt;/p&gt;

&lt;p&gt;(I believe a lot of people get this wrong because they’re not thinking probabilistically. Someone has (say) a 10% P(doom) and a 10% chance of AGI within five years, and they round that off to “it’s not going to happen so we don’t need to worry yet.” A 10% chance is still really really bad.)&lt;/p&gt;
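
&lt;p&gt;Spelled out with those made-up numbers (crudely treating the doom probability as conditional on near-term AGI and the two as independent):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;p_agi_soon = 0.10  # chance of AGI within five years (illustrative)
p_doom = 0.10      # chance of doom, given AGI (illustrative)
print(p_agi_soon * p_doom)  # 0.01: a ~1% chance of extinction within five years
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;One percent is far above the threshold at which we normally demand drastic precautions.&lt;/p&gt;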

&lt;p&gt;And I’m not sure it’s feasible to write the sort of nuanced regulations that some people want. The ideal case is writing regulations that enable beneficial AI while preventing dangerous AI, but in the limit, that amounts to “legalize aligned AI and ban unaligned AI”. The less we know about AI alignment, the less likely the regulations are to do the right thing. And we know so little that I’m not sure there’s any real advantage to adding nuance to regulations.&lt;/p&gt;

&lt;p&gt;(&lt;a href=&quot;https://en.wikipedia.org/wiki/Safe_and_Secure_Innovation_for_Frontier_Artificial_Intelligence_Models_Act&quot;&gt;SB 1047&lt;/a&gt; serves as a baseline: writing regulation with that level of nuance takes approximately zero marginal time because it’s already been done. Pushing to delay regulation only makes sense if you think we need something significantly more nuanced than SB 1047.)&lt;/p&gt;

&lt;h2 id=&quot;working-with-vs-against-ai-companies&quot;&gt;Working with vs. against AI companies&lt;/h2&gt;

&lt;p&gt;I believe it’s better (on the margin) to work against AI companies (credence: 80%). I am not aware of any strong arguments or evidence for one side or the other, but I have a few bits of weak evidence.&lt;/p&gt;

&lt;p&gt;(For someone with a larger budget, it might be worthwhile to commission an investigation into the track record of working with vs. against companies on this sort of thing.)&lt;/p&gt;

&lt;p&gt;There’s a moderately strong argument in favor of cooperating with AI companies on policy:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;If AI safety advocates make enemies with AI companies, those companies will get into a political fight with safety advocates, and companies are more powerful so they will probably win.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;How much stock you put in that argument depends on how well you think industry-friendly regulations will reduce x-risk. It seems to me that they won’t. Good regulations will cause AI companies to make less money. If you advocate for regulations that companies like, then the regulations won’t be good.&lt;sup id=&quot;fnref:52&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:52&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;23&lt;/a&gt;&lt;/sup&gt; I don’t see a middle ground where regulations prevent AI from killing everyone but also don’t impede companies’ profits. (A superintelligent AI is always going to be more profitable than anything else, unless it kills everyone.)&lt;/p&gt;

&lt;p&gt;If we get into a political fight with AI companies, we might lose. But if we concede and let AI companies get the regulations they want, we &lt;em&gt;definitely&lt;/em&gt; lose.&lt;sup id=&quot;fnref:40&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:40&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;24&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Alternatively, you could join an AI company or try to cooperatively influence it from the outside.&lt;/p&gt;

&lt;p&gt;The (anecdotal) track record of working with AI companies so far:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;People who worked for OpenAI were forced to sign non-disparagement agreements, preventing them from dissenting publicly.&lt;/li&gt;
  &lt;li&gt;OpenAI claimed it would dedicate 20% of its compute to alignment research. A year later, the heads of the alignment team complained that they never got their promised compute; OpenAI lied and said they meant something different than the thing they obviously meant; a lot of the alignment team quit; then OpenAI gave up on pretending not to have lied and disbanded its alignment team.&lt;/li&gt;
  &lt;li&gt;Geoffrey Hinton &lt;a href=&quot;https://www.spectator.co.uk/article/we-may-be-history-geoffrey-hinton-on-the-dangers-of-ai/&quot;&gt;quit Google&lt;/a&gt; because, among other reasons, he thought he was too constrained by “self-censorship.”&lt;/li&gt;
  &lt;li&gt;Altman attempted to fire board member Helen Toner for criticizing the unsafeness of OpenAI’s models.&lt;/li&gt;
  &lt;li&gt;Open Philanthropy donated $30 million to OpenAI to buy a board seat. This board seat was probably instrumental in getting Altman fired for (essentially) disregarding safety, but the firing didn’t stick and then all the safety-conscious board members got fired.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In each of these cases, working within the company made things definitively worse. The last instance came close to making things better but ultimately made things worse—it made OpenAI $30 million richer without making it safer.&lt;/p&gt;

&lt;p&gt;I know of at least one potential counterexample: OpenAI’s RLHF was developed by AI safety people who joined OpenAI to promote safety. But it’s not clear that RLHF helps with x-risk.&lt;sup id=&quot;fnref:11&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;25&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Maybe OpenAI makes it uniquely hard to change the culture from within, and you’d fare better with other companies. I don’t think that’s true because two of the other big AI players, Meta and Google, are larger and have more inertia and therefore should be harder to change. Only Anthropic seems easier to influence.&lt;/p&gt;

&lt;p&gt;(But the founding members of Anthropic are concerned with x-risk and they’re still racing to build superintelligent AI as fast as possible while &lt;a href=&quot;https://x.com/ch402/status/1666482929772666880&quot;&gt;admitting&lt;/a&gt; that they have no idea how to make it safe. Influence doesn’t seem helpful: AI safety memes are in the positions of greatest possible influence—namely, the brains of the founders—but still aren’t making Anthropic safe.)&lt;/p&gt;

&lt;p&gt;At the risk of over-updating on random authors who I know nothing about: In 1968, James C. Thomson wrote an article called &lt;a href=&quot;https://archive.ph/6xVX2&quot;&gt;How Could Vietnam Happen? An Autopsy&lt;/a&gt; (h/t &lt;a href=&quot;https://www.lesswrong.com/posts/Jf3ECowLsygYYhEC2/jacobjacob-s-shortform-feed#Jw75qmwiKYyGSbXFn&quot;&gt;Jonas V&lt;/a&gt;). He wrote that, essentially, dissenting insiders don’t protest because they want to accumulate more influence first. They delay expressing their dissent, always wanting to increase the security of their position, and never get to a point where they actually use their position to do good. Former OpenAI employee Daniel Kokotajlo &lt;a href=&quot;https://www.lesswrong.com/posts/Jf3ECowLsygYYhEC2/jacobjacob-s-shortform-feed?commentId=jCgCgJn4wxR4DeGqR&quot;&gt;says&lt;/a&gt; he observed this happening at OpenAI.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://phdcomics.com/comics.php?f=1436&quot;&gt;PhD Comics&lt;/a&gt; observes the same phenomenon in academia:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://phdcomics.com/comics/archive/phd072011s.gif&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Hat tip to &lt;a href=&quot;https://www.betonit.ai/p/tenure-is-a-total-scam&quot;&gt;Bryan Caplan&lt;/a&gt;, who adds commentary:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The classic story is that tenure protects dissenters. […]&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;The flaw with the argument is that academic dissenters remain ultra-rare. Far too rare to justify the enormous downsides of the tenure system. And from a bird’s-eye view, the full effect of tenure on dissent is mixed at best. Remember: To get tenure, a dissenter normally has to spend a decade and a half impersonating a normal academic. If you start the process as a non-conformist, the system almost always either weeds you out or wins you over. By the time you get tenure, a creepy chorus of “One of us! One of us!” is in order.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Finally, “with” vs. “against” isn’t necessarily mutually exclusive. There are already many AI safety advocates working inside AI companies. Having a more &lt;a href=&quot;https://en.wikipedia.org/wiki/Radical_flank_effect&quot;&gt;radical flank&lt;/a&gt; on the outside could be a useful complementary strategy. (I’m not confident about this last argument.)&lt;/p&gt;

&lt;h2 id=&quot;political-diplomacy-vs-advocacy&quot;&gt;Political diplomacy vs. advocacy&lt;/h2&gt;

&lt;p&gt;I could say some of the same things about politics that I said about working from inside AI companies: empirically, people get caught in the trap of accumulating social capital and then never actually spending it.&lt;/p&gt;

&lt;p&gt;Relatedly, some people think you should talk about mundane risks of AI and avoid discussing extinction to not look weird. I have a strong prior toward honesty—telling people what you care about and what your true motivations are, rather than misrepresenting your beliefs (or lying by omission) to make them sound more palatable. And I have a moderate prior against accumulating power—both the good guys and the bad guys want to accumulate power. Honesty is an &lt;a href=&quot;https://slatestarcodex.com/2017/03/24/guided-by-the-beauty-of-our-weapons/&quot;&gt;asymmetric weapon&lt;/a&gt; and power is symmetric.&lt;/p&gt;

&lt;h1 id=&quot;conflicts-that-arent-man-vs-man-but-nonetheless-require-an-answer&quot;&gt;Conflicts that aren’t “man vs. man” but nonetheless require an answer&lt;/h1&gt;

&lt;p&gt;There are some other areas of debate where funding one side doesn’t necessarily hurt the other side, but I have a finite amount of money and I need to decide which type of thing to fund.&lt;/p&gt;

&lt;h2 id=&quot;pause-vs-responsible-scaling-policy-rsp&quot;&gt;Pause vs. Responsible Scaling Policy (RSP)&lt;/h2&gt;

&lt;p&gt;Originally I wrote a long diatribe on why I don’t like RSPs and how badly-written AI companies’ RSPs are. But after spending some more time reading pro-RSP commentary, I realized my criticisms didn’t matter because RSP advocates don’t seem to like RSPs much either. The biggest advocates have said things like (paraphrased) “a full pause would be better, but it’s not feasible, so an RSP is a reasonable compromise.” If I understand correctly, they see an RSP as essentially a worse version of a pause but without the downsides. So the real disagreement is about how big the downsides of a pause are.&lt;/p&gt;

&lt;p&gt;As far as I can tell, the main cruxes are:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;It’s not time to pause yet.&lt;/li&gt;
  &lt;li&gt;A pause is bad because it would create a hardware overhang.&lt;sup id=&quot;fnref:14&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;26&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;If the US government mandates a pause, China will keep developing AI.&lt;/li&gt;
  &lt;li&gt;A pause has negative EV because it delays the glorious transhumanist future.&lt;/li&gt;
  &lt;li&gt;It’s likely that we can trust companies to voluntarily implement good RSPs.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I’ve already talked about these and why I believe they’re all incorrect. If you agree with my earlier arguments, then an unconditional pause makes more sense than an RSP.&lt;/p&gt;

&lt;p&gt;I would be more optimistic about a government-enforced RSP than a voluntary RSP, but I believe that’s not what people typically mean when they talk about RSPs.&lt;/p&gt;

&lt;h2 id=&quot;policy-research-vs-policy-advocacy&quot;&gt;Policy research vs. policy advocacy&lt;/h2&gt;

&lt;p&gt;Is it better to advocate for regulation/AI policy, or to do policy-relevant research?&lt;/p&gt;

&lt;p&gt;I don’t really know about this one. I get the sense that policy advocacy is more important, but I don’t have much of an argument as to why.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The difference between no regulation and mediocre regulation is bigger than the difference between mediocre and good regulation. (credence: 70%)&lt;/li&gt;
  &lt;li&gt;Policy advocacy is more neglected (although they’re both pretty neglected). (credence: 90%)&lt;/li&gt;
  &lt;li&gt;It doesn’t seem that hard to write legislation to slow AI development. How much more research do we really need?&lt;sup id=&quot;fnref:19&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:19&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;27&lt;/a&gt;&lt;/sup&gt; (credence: 50%)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I made some related arguments in &lt;a href=&quot;#slow-nuanced-regulation-vs-fast-coarse-regulation&quot;&gt;slow nuanced regulation vs. fast coarse regulation&lt;/a&gt;. If nuanced regulation isn’t worth it, then policy research likely isn’t worth it because there’s not much research to do. (Although you might still do research on things like which advocacy strategies are likely to be most effective.)&lt;/p&gt;

&lt;h2 id=&quot;advocacy-directed-at-policy-makers-vs-the-general-public&quot;&gt;Advocacy directed at policy-makers vs. the general public&lt;/h2&gt;

&lt;p&gt;In favor of focusing on policy-makers:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;You get a bigger impact per person convinced, because policy-makers are the ones who actually enact regulations.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In favor of focusing on the public:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;A higher proportion of people will be receptive to your message. (And in fact, people are already broadly concerned about AI, so it might be less about convincing and more about motivating.)&lt;/li&gt;
  &lt;li&gt;Policy-makers’ activities are largely downstream of the public—they want to do what their constituents want.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I don’t have much of an opinion about which is better—I think it depends on the specifics of the organization that’s doing the advocacy. And both sorely need more funding.&lt;/p&gt;

&lt;h1 id=&quot;organizations&quot;&gt;Organizations&lt;/h1&gt;

&lt;p&gt;I’m not qualified to evaluate AI policy organizations&lt;sup id=&quot;fnref:21&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:21&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;28&lt;/a&gt;&lt;/sup&gt; so I would like to delegate to an expert grantmaker. Unfortunately, none of the existing grantmakers work for me. Most focus on technical research.&lt;sup id=&quot;fnref:57&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:57&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;29&lt;/a&gt;&lt;/sup&gt; Only &lt;a href=&quot;https://www.founderspledge.com/recommendations/topic/artificial-intelligence&quot;&gt;Founders Pledge&lt;/a&gt; has up-to-date recommendations&lt;sup id=&quot;fnref:56&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:56&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;30&lt;/a&gt;&lt;/sup&gt; on AI policy, but I didn’t realize they existed until I had spent a lot of time looking into organizations on my own, and it turns out I have some significant disagreements with the Founders Pledge recs. (Three of the seven Founders Pledge recs are my three least favorite orgs among the ones I review below.)&lt;sup id=&quot;fnref:58&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:58&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;31&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;So I did it myself. I made a list of every org I could find that works on AI policy.&lt;sup id=&quot;fnref:29&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:29&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;32&lt;/a&gt;&lt;/sup&gt; Then I did shallow evaluations of each of them.&lt;/p&gt;

&lt;p&gt;Some preamble:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;As a rule of thumb, I don’t want to fund anything Open Philanthropy has funded. Not because an Open Philanthropy grant means an org lacks room for more funding, but because I believe (credence: 80%) that Open Philanthropy has bad judgment on AI policy (as explained in &lt;a href=&quot;https://www.lesswrong.com/posts/wn5jTrtKkhspshA4c/michaeldickens-s-shortform?commentId=zoBMvdMAwpjTEY4st&quot;&gt;this comment&lt;/a&gt; by Oliver Habryka and &lt;a href=&quot;https://www.lesswrong.com/posts/wn5jTrtKkhspshA4c/michaeldickens-s-shortform?commentId=xpNDD82qjFpyYnP3Q&quot;&gt;reply&lt;/a&gt; by Akash—I have similar beliefs, but they explain it better than I do). Open Philanthropy prefers to fund orgs that behave “respectably” and downplay x-risks, and does not want to fund any orgs that &lt;a href=&quot;#working-with-vs-against-ai-companies&quot;&gt;work against&lt;/a&gt; AI companies.&lt;sup id=&quot;fnref:49&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:49&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;33&lt;/a&gt;&lt;/sup&gt; I don’t want to fund any org that’s potentially making it more difficult to communicate to policy-makers about AI x-risk or helping AI companies accelerate capabilities.&lt;/li&gt;
  &lt;li&gt;In the interest of making my life easier, I stopped investigating an organization as soon as I found a reason not to donate to it, so some of these writeups are missing obvious information.&lt;sup id=&quot;fnref:27&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:27&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;34&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;A lot of these orgs have similar names. I use an org’s full name wherever its abbreviation would be ambiguous.&lt;sup id=&quot;fnref:26&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:26&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;35&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;There’s an unfortunate dynamic where I won’t donate to an org if I can’t figure out what it’s doing. But if an org spends a lot of time writing about its activities, that’s time it could be spending on “real” work instead. I have no solution to this.&lt;sup id=&quot;fnref:28&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:28&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;36&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;important-disclaimers&quot;&gt;Important disclaimers&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;When describing orgs’ missions and activities, sometimes I quote or paraphrase from their materials without using quotation marks because the text gets messy otherwise. If I do quote without attribution, the source will be one of the links provided in that section.&lt;/li&gt;
  &lt;li&gt;I only spent 1–2 hours looking into each organization, so I could be substantially wrong in many cases.&lt;/li&gt;
  &lt;li&gt;It might have been good practice to share (parts of) this document with the reviewed organizations before publishing,&lt;sup id=&quot;fnref:61&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:61&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;37&lt;/a&gt;&lt;/sup&gt; but I didn’t do that, mainly because it would take a lot of additional work.&lt;sup id=&quot;fnref:43&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:43&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;38&lt;/a&gt;&lt;/sup&gt; The one exception: where I referenced a private comment made by an individual, I asked that individual’s permission before publishing it.&lt;/li&gt;
  &lt;li&gt;Potential conflict of interest: I have friends at METR and Palisade.
    &lt;ul&gt;
      &lt;li&gt;However, I didn’t know I had a friend who worked at METR until after I had written the section on METR. I’m not good at keeping track of where my friends work.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;I’m acquainted with some of the people at CSET, Lightcone, PauseAI, and Sentinel. I might have friends or acquaintances at other orgs as well—like I mentioned, I’m not good at knowing where people work.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;ai-policy-institute&quot;&gt;AI Policy Institute&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://theaipi.org&quot;&gt;AI Policy Institute&lt;/a&gt; (mostly) runs &lt;a href=&quot;https://theaipi.org/media/&quot;&gt;public opinion polls&lt;/a&gt; on AI risks, some of which are relevant to x-risk. The polls cover some important issues and provide useful information to motivate policy-makers. Some examples:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;2.5x more voters support SB 1047 than oppose it. (&lt;a href=&quot;https://theaipi.org/voters-support-sb1047-in-collaborative-poll/&quot;&gt;source&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;56% of voters agreed it would be a good thing if AI progress were significantly slowed, vs. 27% who disagreed. (&lt;a href=&quot;https://theaipi.org/poll-biden-ai-executive-order-10-30-8/&quot;&gt;source&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Voters’ top priority on AI regulation is preventing catastrophic outcomes. (&lt;a href=&quot;https://theaipi.org/poll-biden-ai-executive-order-10-30-2/&quot;&gt;source&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This sort of work seems good. I’m not sure how big an impact it has on the margin. My intuition is that polls are good, but additional polls have rapidly diminishing returns, so I wouldn’t consider AI Policy Institute a top donation candidate.&lt;/p&gt;

&lt;p&gt;I could not find good information about its room for more funding. It did not respond to my inquiry on its funding situation.&lt;/p&gt;

&lt;h2 id=&quot;ai-safety-and-governance-fund&quot;&gt;AI Safety and Governance Fund&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://aisgf.us/&quot;&gt;AI Safety and Governance Fund&lt;/a&gt; (which, to my knowledge, is a one-man org run by Mikhail Samin) wants to test and spread messages to reduce AI x-risk—see &lt;a href=&quot;https://manifund.org/projects/testing-and-spreading-messages-to-reduce-ai-x-risk&quot;&gt;Manifund proposal&lt;/a&gt;. It plans to buy ads to test what sorts of messaging are most effective at communicating the arguments for why AI x-risk matters.&lt;/p&gt;

&lt;p&gt;I like this project because:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Pushing for x-risk-relevant regulation is the most promising sort of intervention right now. But we don’t have much data on what sorts of messaging are most effective. This project intends to give us that data.&lt;/li&gt;
  &lt;li&gt;Mikhail Samin, who runs the org, has a good track record of work on AI safety projects (from what I can see).&lt;/li&gt;
  &lt;li&gt;Mikhail has reasonable plans for what to do with this information once he gets it. (He shared his plans with me privately and asked me not to publish them.)&lt;/li&gt;
  &lt;li&gt;The project has room for more funding, but it shouldn’t take much money to accomplish its goal.&lt;/li&gt;
  &lt;li&gt;The project received a speculation grant from the Survival and Flourishing Fund (SFF) and is reasonably likely to get more funding, but (1) it might not; (2) even if it does, I think it’s useful to diversify the funding base; (3) I generally like SFF grants and I don’t mind funging&lt;sup id=&quot;fnref:73&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:73&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;39&lt;/a&gt;&lt;/sup&gt; SFF dollars.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;ai-standards-lab&quot;&gt;AI Standards Lab&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.aistandardslab.org/&quot;&gt;AI Standards Lab&lt;/a&gt; aims to accelerate the creation of AI safety standards by drafting standards that standards bodies (such as ISO and NIST) can adapt.&lt;/p&gt;

&lt;p&gt;These standards are rarely directly relevant to x-risk. Improving standards on sub-existential risks may make it easier to regulate x-risks, but I would rather see an org work on x-risk more directly.&lt;/p&gt;

&lt;p&gt;AI Standards Lab does not appear to be seeking donations.&lt;/p&gt;

&lt;h2 id=&quot;campaign-for-ai-safety&quot;&gt;Campaign for AI Safety&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.campaignforaisafety.org/&quot;&gt;Campaign for AI Safety&lt;/a&gt; used to do public marketing and outreach to promote concern for AI x-risk. In early 2024, it got rolled into &lt;a href=&quot;#existential-risk-observatory&quot;&gt;Existential Risk Observatory&lt;/a&gt;, and the former organizers of the Campaign for AI Safety now volunteer for Existential Risk Observatory.&lt;/p&gt;

&lt;p&gt;Campaign for AI Safety still has a donations page, but as far as I can tell, there is no reason to donate to it rather than to Existential Risk Observatory.&lt;/p&gt;

&lt;h2 id=&quot;centre-for-enabling-ea-learning-and-research-ceealar&quot;&gt;Centre for Enabling EA Learning and Research (CEEALAR)&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.ceealar.org/&quot;&gt;CEEALAR&lt;/a&gt; runs the EA Hotel. Recently, it has &lt;a href=&quot;https://forum.effectivealtruism.org/posts/cfn2MMEmpnGjTWpAw/ai-winter-season-at-ea-hotel&quot;&gt;focused&lt;/a&gt; on supporting people who work on AI safety, including technical research and policy.&lt;/p&gt;

&lt;p&gt;Something like the EA Hotel could end up accidentally accelerating AI capabilities, but I’m confident that won’t happen because Greg Colbourn, who runs the EA Hotel, is appropriately cautious about AI (he has advocated for a moratorium on AI development).&lt;/p&gt;

&lt;p&gt;You could make a case that CEEALAR has a large multiplicative impact by supporting AI safety people. That case seems hard to make &lt;em&gt;well&lt;/em&gt;, and in the absence of a strong case, CEEALAR isn’t one of my top candidates.&lt;/p&gt;

&lt;h2 id=&quot;center-for-ai-policy&quot;&gt;Center for AI Policy&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.centeraipolicy.org/&quot;&gt;The Center for AI Policy&lt;/a&gt; is a 501(c)(4) nonprofit designed to influence US policy to reduce existential and catastrophic risks from advanced AI (&lt;a href=&quot;https://forum.effectivealtruism.org/posts/NKNoDtPAfHiMA8bJp/introducing-the-center-for-ai-policy-and-we-re-hiring&quot;&gt;source&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;They are serious about x-risk and well-aligned with my position:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Our current focus is building “stop button for AI” capacity in the US government.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Unlike some other orgs, it’s not bogged down by playing politics. For example, it’s willing to &lt;a href=&quot;https://www.centeraipolicy.org/work/sam-altmans-dangerous-and-unquenchable-craving-for-power&quot;&gt;call out&lt;/a&gt; Sam Altman’s bad behavior; and it &lt;a href=&quot;https://forum.effectivealtruism.org/posts/NKNoDtPAfHiMA8bJp/introducing-the-center-for-ai-policy-and-we-re-hiring#How_does_CAIP_differ_from_other_AI_governance_organizations_&quot;&gt;focuses&lt;/a&gt; on conducting advocacy now, rather than amassing influence that can be used later (I’m generally averse to power-seeking).&lt;/p&gt;

&lt;p&gt;The org has &lt;a href=&quot;https://www.centeraipolicy.org/work/model-legislation-release-april-2024&quot;&gt;proposed&lt;/a&gt; model legislation that makes some non-trivial policy proposals (see &lt;a href=&quot;https://assets.caip.org/caip/RAAIA%20Executive%20Summary%20%28April%202024%29.pdf&quot;&gt;summary pdf&lt;/a&gt; and &lt;a href=&quot;https://assets.caip.org/caip/RAAIA%20%28April%202024%29.pdf&quot;&gt;full text pdf&lt;/a&gt;). The legislation would:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;require customers buying $30,000 advanced AI chips to fill out a one-page registration form;&lt;/li&gt;
  &lt;li&gt;issue permits to the most advanced AI systems based on the quality of their safety testing;&lt;/li&gt;
  &lt;li&gt;define a reasonable set of emergency powers for the government so that it can intervene and shut down an AI system that’s in the process of going rogue.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a breath of fresh air compared to most of the policy proposals I’ve read (none of which I’ve discussed yet, because I’m writing this list in alphabetical order). Most proposals say things like:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;make the regulation be good instead of bad;&lt;/li&gt;
  &lt;li&gt;simultaneously promote innovation and safety (there is no such thing as a tradeoff);&lt;/li&gt;
  &lt;li&gt;sternly tell AI companies that they need to not be unsafe, or else we will be very upset.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’m paraphrasing for humor&lt;sup id=&quot;fnref:22&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:22&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;40&lt;/a&gt;&lt;/sup&gt;, but I don’t think I’m exaggerating—I’ve read proposals from AI policy orgs that were equivalent to these, but phrased more opaquely. (Like nobody explicitly said “we refuse to acknowledge the existence of tradeoffs”, but they did, in fact, refuse to acknowledge the existence of tradeoffs.)&lt;/p&gt;

&lt;p&gt;Center for AI Policy has a target budget of $1.6 million for 2025 (&lt;a href=&quot;https://docs.google.com/document/d/1RzXSYYeUIdAy7I7gJu3DhTxeQqvj8rwteyJUjkFKYY4/&quot;&gt;source&lt;/a&gt;), and its current funding falls considerably short of this goal, so it can make good use of additional money.&lt;/p&gt;

&lt;h2 id=&quot;center-for-ai-safety&quot;&gt;Center for AI Safety&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.safe.ai/&quot;&gt;Center for AI Safety&lt;/a&gt; does safety research and advocates for safety standards. It has a good track record so far:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;It drafted the original version of SB 1047.&lt;/li&gt;
  &lt;li&gt;Its &lt;a href=&quot;https://www.safe.ai/work/statement-on-ai-risk&quot;&gt;Statement on AI Risk&lt;/a&gt; got signatures from major figures in AI and helped bring AI x-risk into the &lt;a href=&quot;https://en.wikipedia.org/wiki/Overton_window&quot;&gt;Overton window&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;It’s done some &lt;a href=&quot;https://www.safe.ai/work/2023-impact-report&quot;&gt;other work&lt;/a&gt; (e.g., writing an AI safety textbook; buying compute for safety researchers) that I like but I don’t think is as impactful. The given examples are about supporting alignment research, and as I’ve &lt;a href=&quot;#ai-safety-technical-research-vs-policy&quot;&gt;said&lt;/a&gt;, I’m not as bullish on alignment research.&lt;/li&gt;
  &lt;li&gt;The Center for AI Safety Action Fund does lobbying, which might be good, but I can’t find much public information about what it lobbies for. It did support SB 1047, which is good.&lt;/li&gt;
  &lt;li&gt;It’s led by Dan Hendrycks. I’ve read some of his writings and I get the general sense that he’s competent.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Center for AI Safety has some work that I’m very optimistic about (most notably the Statement on AI Risk), but I’m only weakly to moderately optimistic about most of its activities.&lt;/p&gt;

&lt;p&gt;It has received $9 million from Open Philanthropy (&lt;a href=&quot;https://www.openphilanthropy.org/grants/center-for-ai-safety-general-support/&quot;&gt;1&lt;/a&gt;, &lt;a href=&quot;https://www.openphilanthropy.org/grants/center-for-ai-safety-general-support-2023/&quot;&gt;2&lt;/a&gt;) and just under $1 million from the &lt;a href=&quot;https://survivalandflourishing.fund/sff-2023-h2-recommendations&quot;&gt;Survival and Flourishing Fund&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I have a good impression of the Center for AI Safety, but it’s not one of my top candidates because (1) it’s already well-funded and (2) it has done some things I really like, but those are diluted by a lot of things I only moderately like.&lt;/p&gt;

&lt;h2 id=&quot;center-for-human-compatible-ai&quot;&gt;Center for Human-Compatible AI&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://humancompatible.ai/&quot;&gt;Center for Human-Compatible AI&lt;/a&gt; does mostly technical research, and some advocacy. To my knowledge, the advocacy essentially consists of Stuart Russell using his influential position to &lt;a href=&quot;https://humancompatible.ai/blog/2023/09/11/stuart-russell-testifies-on-ai-regulation-at-u-s-senate-hearing/#artificial-intelligence:-origins-and-concepts&quot;&gt;advocate&lt;/a&gt; for regulation. While that’s good, I don’t think Stuart Russell is personally funding-constrained, so I don’t think marginal donations to the org will help advocacy efforts.&lt;/p&gt;

&lt;h2 id=&quot;center-for-long-term-resilience&quot;&gt;Center for Long-Term Resilience&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.longtermresilience.org/&quot;&gt;The Center for Long-Term Resilience&lt;/a&gt; is a think tank focused on reducing “extreme risks”, which includes x-risks but also other things. It talks to policy-makers and writes reports. I’ll focus on its reports because those are easier to assess.&lt;/p&gt;

&lt;p&gt;About half of the org’s &lt;a href=&quot;https://www.longtermresilience.org/reports/&quot;&gt;work&lt;/a&gt; relates to AI risk. Some of the AI publications are relevant to x-risk (&lt;a href=&quot;https://www.longtermresilience.org/reports/transforming-risk-governance-at-frontier-ai-companies/&quot;&gt;1&lt;/a&gt;); most are marginally relevant (&lt;a href=&quot;https://www.longtermresilience.org/reports/ai-incident-reporting-addressing-a-gap-in-the-uks-regulation-of-ai/&quot;&gt;2&lt;/a&gt;, &lt;a href=&quot;https://www.longtermresilience.org/reports/why-we-recommend-risk-assessments-over-evaluations-for-ai-enabled-biological-tools-bts/&quot;&gt;3&lt;/a&gt;) or not relevant (&lt;a href=&quot;https://www.longtermresilience.org/reports/the-near-term-impact-of-ai-on-disinformation/&quot;&gt;4&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;I skimmed a few of its reports. Here I will give commentary on two of its reports, starting with the one I liked better.&lt;/p&gt;

&lt;p&gt;I’m reluctant to criticize orgs that I think have good intentions, but I think it’s more important to accurately convey my true beliefs. And my true belief is that these reports are not good (credence: 75%).&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.longtermresilience.org/reports/transforming-risk-governance-at-frontier-ai-companies/&quot;&gt;Transforming risk governance at frontier AI companies&lt;/a&gt; was my favorite report that I saw from the Center for Long-Term Resilience.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;This was the only one of the org’s recent reports that looked meaningfully relevant to x-risk.&lt;/li&gt;
  &lt;li&gt;The report correctly identifies some inadequacies with AI companies’ risk processes. It proposes some high-level changes that I expect would have a positive impact.&lt;/li&gt;
  &lt;li&gt;That said, I don’t think the changes would have a &lt;em&gt;big&lt;/em&gt; impact. The proposal would make more sense for dealing with typical risks that most industries see, but it’s not (remotely) sufficient to prepare for extinction risks. Indeed, the report proposes using “best practice” risk management. Best practice means standard, which means insufficient for x-risk. (And best practice means well-established, which means well-known, which means the marginal value of proposing it is small.)&lt;/li&gt;
  &lt;li&gt;The report implies that we should rely on voluntary compliance from AI companies. It proposes that companies should use external auditors, but not that those auditors should have any real power.&lt;/li&gt;
  &lt;li&gt;An illustrative quote from the Risk Oversight section: “Although they should not make the final decisions, the specialist risk and assurance [advisors] should play a ‘challenger’ role, pressure testing the business’s plans and decisions to ensure they are risk-informed.” I disagree. Risk advisors should have veto power. The CEO should not have unilateral authority to deploy dangerous models.&lt;/li&gt;
  &lt;li&gt;The report has little in the way of concrete recommendations. Most of the recommendations are non-actionable—for example, “build consensus within business and civil society about the importance of more holistic risk management”. Ok, how specifically does one do that?&lt;/li&gt;
  &lt;li&gt;Contrast this with the model legislation from the Center for AI Policy, where the &lt;a href=&quot;https://assets.caip.org/caip/RAAIA%20Executive%20Summary%20%28April%202024%29.pdf&quot;&gt;one-page executive summary&lt;/a&gt; made proposals that were easier to understand, more concrete, and more relevant to x-risk.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Another example of a report, which I liked less: &lt;a href=&quot;https://www.longtermresilience.org/reports/response-to-establishing-a-pro-innovation-approach-to-regulating-ai/&quot;&gt;Response to ‘Establishing a pro-innovation approach to regulating AI’&lt;/a&gt; (a reply to a &lt;a href=&quot;https://www.gov.uk/government/publications/establishing-a-pro-innovation-approach-to-regulating-ai&quot;&gt;request for proposals&lt;/a&gt; by the UK Office of AI).&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The report makes four high-level proposals, all of which I dislike:
    &lt;ol&gt;
      &lt;li&gt;“Promoting coherence and reducing inefficiencies across the regulatory regime” – Nobody needs to be told to reduce inefficiency. The only reason why any process is inefficient is because people don’t know how to make it more efficient. &lt;em&gt;How exactly&lt;/em&gt; am I supposed to reduce inefficiency? (This quote comes from the executive summary, where I can forgive some degree of vagueness, but the full report does not provide concrete details.)&lt;/li&gt;
      &lt;li&gt;“Ensuring existing regulators have sufficient expertise and capacity” – Again, this is an &lt;a href=&quot;https://www.lesswrong.com/posts/dLbkrPu5STNCBLRjr/applause-lights&quot;&gt;applause light&lt;/a&gt;, not a real suggestion. No one thinks regulators should have insufficient expertise or capacity.&lt;/li&gt;
      &lt;li&gt;“Ensuring that regulatory gaps can be identified and addressed” – More of the same.&lt;/li&gt;
      &lt;li&gt;“Being sufficiently adaptive to advances in AI capabilities” – More of the same.&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;The report suggests regulating all AI with a single body rather than diffusely. I like this idea—if a regulatory body is going to prevent x-risk, it probably needs to have broad authority. (Except the report also says “we do not necessarily think [the regulator] needs to be a single body”, which seems to contradict its earlier recommendation.)&lt;/li&gt;
  &lt;li&gt;The report says “It will become increasingly important to distribute responsibility across the entire supply chain of AI development”. I think that’s a good idea if it means restricting sales and exports of compute hardware. But it doesn’t say that explicitly (in fact it provides no further detail at all), and I don’t think policy-makers will interpret it that way.&lt;/li&gt;
  &lt;li&gt;“Recognise that some form of regulation may be needed for general-purpose systems such as foundation models in future.” I would have written this as: “Recognize that strict regulation for general-purpose systems is urgently needed.” Stop downplaying the severity of the situation.&lt;/li&gt;
  &lt;li&gt;If I were writing this report, I would have included evidence/reasoning on why AI risk (x-risk and catastrophic risk) is a major concern, and what this implies about how to regulate it. The report doesn’t include any arguments that could change readers’ minds.&lt;/li&gt;
  &lt;li&gt;In conclusion, this report is mostly vacuous. It contains some non-vacuous proposals in the full text (not represented in the executive summary), but the non-vacuous proposals aren’t particularly concrete and aren’t particularly useful for reducing x-risk.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An alternative interpretation is that the Center for Long-Term Resilience wants to build influence by writing long and serious-looking reports that nobody could reasonably disagree with. As I touched on &lt;a href=&quot;#political-diplomacy-vs-advocacy&quot;&gt;previously&lt;/a&gt;, I’m not optimistic about this strategy. I disapprove of deceptive tactics, and I think it’s a bad idea even on naive consequentialist grounds (i.e., it’s not going to work as well as writing actionable reports would). And—perhaps more importantly—if the org’s reports are low quality, then I can’t trust that it does a good job when working with policy-makers.&lt;/p&gt;

&lt;h2 id=&quot;center-for-security-and-emerging-technology-cset&quot;&gt;Center for Security and Emerging Technology (CSET)&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://cset.georgetown.edu/&quot;&gt;Center for Security and Emerging Technology&lt;/a&gt; does work on AI policy along with various other topics. It has received &lt;a href=&quot;https://www.openphilanthropy.org/?s=center+for+security+and+emerging+technology&quot;&gt;$105 million&lt;/a&gt; from Open Philanthropy.&lt;/p&gt;

&lt;p&gt;I wouldn’t donate to CSET because it has so much funding already, but I took a brief look at its publications.&lt;/p&gt;

&lt;p&gt;The research appears mostly tangential or unrelated to x-risk, instead covering subjects like &lt;a href=&quot;https://cset.georgetown.edu/publication/securing-critical-infrastructure-in-the-age-of-ai/&quot;&gt;cybersecurity&lt;/a&gt;, &lt;a href=&quot;https://cset.georgetown.edu/publication/controlling-large-language-models-a-primer/&quot;&gt;deceptive/undesirable LLM output&lt;/a&gt;, and &lt;a href=&quot;https://cset.georgetown.edu/publication/building-the-tech-coalition/&quot;&gt;how the US Department of Defense can use AI to bolster its military power&lt;/a&gt;—this last report seems harmful on balance. Some of its reports (such as &lt;a href=&quot;https://cset.georgetown.edu/publication/enabling-principles-for-ai-governance/&quot;&gt;Enabling Principles for AI Governance&lt;/a&gt;) have the &lt;a href=&quot;#center-for-long-term-resilience&quot;&gt;previously-discussed problem&lt;/a&gt; of being mostly vacuous/non-actionable.&lt;/p&gt;

&lt;p&gt;CSET also works to put researchers into positions where they can directly influence policy (&lt;a href=&quot;https://www.founderspledge.com/research/center-for-security-and-emerging-technology&quot;&gt;source&lt;/a&gt;).&lt;sup id=&quot;fnref:67&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:67&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;41&lt;/a&gt;&lt;/sup&gt; Allegedly, CSET has considerable political influence, but I haven’t identified any visible benefits from that influence (contrast with the &lt;a href=&quot;#center-for-ai-safety&quot;&gt;Center for AI Safety&lt;/a&gt;, which wrote SB 1047). The most legible result I can find is that CSET has collaborated with the Department of Defense; without knowing the details, my prior is that collaborating with DOD is net negative. I would prefer the DOD to be less effective, not more. (Maybe CSET is convincing the DOD not to build military AI but I doubt it; CSET’s reports suggest the opposite.)&lt;/p&gt;

&lt;p&gt;CSET has the same issue as the Center for Long-Term Resilience: if your public outputs are low-quality (or even net harmful), then why should I expect your behind-the-scenes work to be any better?&lt;/p&gt;

&lt;h2 id=&quot;centre-for-long-term-policy&quot;&gt;Centre for Long-Term Policy&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.langsikt.no/en/langsikt&quot;&gt;Centre for Long-Term Policy&lt;/a&gt; operates in Norway and focuses on influencing Norwegian policy on x-risk, longtermism, and global health.&lt;/p&gt;

&lt;p&gt;I didn’t look into it much because I think Norwegian AI policy is unlikely to matter—superintelligent AI will almost certainly not be developed in Norway, so Norwegian regulation has limited ability to constrain AI development.&lt;/p&gt;

&lt;p&gt;From skimming its publications, it looks like they mostly cover subjects other than AI x-risk policy.&lt;/p&gt;

&lt;p&gt;The Centre for Long-Term Policy received an undisclosed amount of &lt;a href=&quot;https://www.langsikt.no/en/finansiering&quot;&gt;funding&lt;/a&gt; from Open Philanthropy in 2024.&lt;/p&gt;

&lt;h2 id=&quot;centre-for-the-governance-of-ai&quot;&gt;Centre for the Governance of AI&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.governance.ai/&quot;&gt;Centre for the Governance of AI&lt;/a&gt; does alignment research and policy research. It appears to focus primarily on the former, which, as I’ve &lt;a href=&quot;#ai-safety-technical-research-vs-policy&quot;&gt;discussed&lt;/a&gt;, I’m not as optimistic about. (And I don’t &lt;a href=&quot;#policy-research-vs-policy-advocacy&quot;&gt;like&lt;/a&gt; policy research as much as policy advocacy.)&lt;/p&gt;

&lt;p&gt;Its policy research seems mostly unrelated to x-risk; for example, it has multiple reports on AI-driven unemployment (&lt;a href=&quot;https://www.governance.ai/research-paper/scenarios-for-the-transition-to-agi&quot;&gt;1&lt;/a&gt;, &lt;a href=&quot;https://www.governance.ai/research-paper/preparing-the-workforce-for-an-uncertain-ai-future&quot;&gt;2&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;My favorite of its published reports is &lt;a href=&quot;https://www.governance.ai/research-paper/lessons-atomic-bomb-ord&quot;&gt;Lessons from the Development of the Atomic Bomb&lt;/a&gt;. It’s written by Toby Ord, who doesn’t work there.&lt;/p&gt;

&lt;p&gt;Centre for the Governance of AI has received &lt;a href=&quot;https://www.openphilanthropy.org/?s=Centre+for+the+Governance+of+AI&quot;&gt;$6 million&lt;/a&gt; from Open Philanthropy.&lt;/p&gt;

&lt;p&gt;The org appears reasonably well-funded. I don’t have major complaints about its work, but (1) the work does not look particularly strong and (2) it doesn’t cover the focus areas that I’m most optimistic about.&lt;/p&gt;

&lt;h2 id=&quot;civai&quot;&gt;CivAI&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://civai.org/&quot;&gt;CivAI&lt;/a&gt; raises awareness about AI dangers by building interactive software to demonstrate AI capabilities, for example &lt;a href=&quot;https://civai.org/cyber-demos&quot;&gt;AI-powered cybersecurity threats&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This org is new which makes it difficult to evaluate. It appears to have the same theory of change as Palisade Research (which I review &lt;a href=&quot;#palisade-research&quot;&gt;below&lt;/a&gt;), but I like Palisade better, for three reasons:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;None of CivAI’s work so far appears relevant to x-risk. For example, its most recent demo focuses on generating fake images for deceptive purposes.&lt;/li&gt;
  &lt;li&gt;I think Palisade’s methods for demonstrating capabilities are more likely to get attention (credence: 65%).&lt;/li&gt;
  &lt;li&gt;I’m more confident in Palisade’s ability to communicate with policy-makers.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;CivAI does not appear to be seeking donations. There is no option to donate through the website.&lt;/p&gt;

&lt;h2 id=&quot;control-ai&quot;&gt;Control AI&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://controlai.com/&quot;&gt;Control AI&lt;/a&gt; runs advocacy campaigns on AI risk.&lt;/p&gt;

&lt;p&gt;Its &lt;a href=&quot;https://www.narrowpath.co/&quot;&gt;current campaign&lt;/a&gt; proposes slowing AI development such that no one develops superintelligence for at least the next 20 years, then using this time to establish a robust system for AI oversight. The campaign includes a &lt;a href=&quot;https://www.narrowpath.co/annexes&quot;&gt;non-vacuous proposal&lt;/a&gt; for the organizational structure of a regulatory body.&lt;/p&gt;

&lt;p&gt;Control AI has a &lt;a href=&quot;https://arxiv.org/abs/2310.20563&quot;&gt;paper&lt;/a&gt; on AI policy that appears reasonable:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;It acknowledges that voluntary commitments from AI companies are insufficient.&lt;/li&gt;
  &lt;li&gt;It proposes establishing an international regulatory body that (1) imposes a global cap on the computing power used to train an AI system and (2) mandates safety evaluations.&lt;/li&gt;
  &lt;li&gt;It proposes that regulators should have the authority to halt deployment of any model they deem excessively dangerous.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;a href=&quot;https://www.narrowpath.co/&quot;&gt;campaign’s proposal&lt;/a&gt; is similar. It lays out the most concrete plan I’ve seen for how to get to a place where we can solve AI alignment.&lt;/p&gt;

&lt;p&gt;I listened to a &lt;a href=&quot;https://futureoflife.org/podcast/andrea-miotti-on-a-narrow-path-to-safe-transformative-ai/&quot;&gt;podcast&lt;/a&gt; with Andrea Miotti, co-founder of Control AI. He mostly covered standard arguments for caring about AI x-risk, but he also made some insightful comments that changed my thinking a bit.&lt;sup id=&quot;fnref:46&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:46&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;42&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;I like the concept of Control AI’s latest campaign, but I don’t know how much impact it will have.&lt;sup id=&quot;fnref:45&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:45&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;43&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Control AI’s past campaigns (&lt;a href=&quot;https://controlai.com/campaign-against-foundation-models&quot;&gt;example&lt;/a&gt;) have received media coverage (&lt;a href=&quot;https://www.euronews.com/next/2023/11/29/europes-ai-act-under-threat-by-lobbyists-experts-and-the-public-say&quot;&gt;example&lt;/a&gt;) and their policy objectives have been achieved, although it’s not clear how much of a causal role Control AI played in achieving those objectives, or what Control AI actually did. Control AI clearly deserves &lt;em&gt;some&lt;/em&gt; credit, or else news outlets wouldn’t cite it.&lt;/p&gt;

&lt;p&gt;Control AI might be as impactful as other advocacy orgs that I like, but I have more uncertainty about it, so it’s not a top candidate. It would be fairly easy to change my mind about this.&lt;/p&gt;

&lt;p&gt;I couldn’t find any information about Control AI’s funding situation, and I didn’t inquire because it wasn’t one of my top candidates.&lt;/p&gt;

&lt;h2 id=&quot;existential-risk-observatory&quot;&gt;Existential Risk Observatory&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.existentialriskobservatory.org/&quot;&gt;Existential Risk Observatory&lt;/a&gt; writes &lt;a href=&quot;https://www.existentialriskobservatory.org/#in-the-media&quot;&gt;media articles&lt;/a&gt; on AI x-risk, does &lt;a href=&quot;https://www.existentialriskobservatory.org/research-2/&quot;&gt;policy research&lt;/a&gt;, and publishes &lt;a href=&quot;https://www.existentialriskobservatory.org/policy-proposals/&quot;&gt;policy proposals&lt;/a&gt; (see &lt;a href=&quot;https://existentialriskobservatory.org/papers_and_reports/Policy%20Proposals.pdf&quot;&gt;pdf&lt;/a&gt; with a summary of proposals).&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;It appears to be having some success bringing public attention to x-risk via mainstream media, including advocating for a pause in &lt;a href=&quot;https://time.com/6295879/ai-pause-is-humanitys-best-bet-for-preventing-extinction/&quot;&gt;TIME&lt;/a&gt; (jointly with Joep Meindertsma of PauseAI).&lt;/li&gt;
  &lt;li&gt;Its policy proposals are serious: it proposes implementing an AI pause, tracking frontier AI hardware, and explicitly recognizing extinction risk in regulations.&lt;/li&gt;
  &lt;li&gt;The research mainly focuses on public opinion, for example opinions on AI capabilities/danger (&lt;a href=&quot;https://existentialriskobservatory.org/papers_and_reports/research/AI%20doom%20prevention%20message%20testing%20in%20USA.%20Volume%201.pdf&quot;&gt;pdf&lt;/a&gt;) and message testing on an AI moratorium (&lt;a href=&quot;https://existentialriskobservatory.org/papers_and_reports/research/Test%20of%20narratives%20for%20AGI%20moratorium%20support.pdf&quot;&gt;pdf&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Existential Risk Observatory is small and funding-constrained, so I expect that donations would be impactful.&lt;/p&gt;

&lt;p&gt;My primary concern is that it operates in the Netherlands. Dutch policy is unlikely to have much influence on x-risk—the United States is the most important country by far, followed by China. And a Dutch organization likely has little influence on United States policy. Existential Risk Observatory can still influence public opinion in America (for example via its TIME article), but I expect a US-headquartered org to have a greater impact.&lt;/p&gt;

&lt;h2 id=&quot;future-of-life-institute-fli&quot;&gt;Future of Life Institute (FLI)&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://futureoflife.org/&quot;&gt;FLI&lt;/a&gt; has done some good advocacy work like the &lt;a href=&quot;https://futureoflife.org/open-letter/pause-giant-ai-experiments/&quot;&gt;6-month pause letter&lt;/a&gt; (which &lt;a href=&quot;https://manifold.markets/CalebW/in-2030-will-we-think-flis-6-month&quot;&gt;probably reduced x-risk&lt;/a&gt;). It also has a $400 million endowment, so I don’t think it needs any donations from me.&lt;/p&gt;

&lt;h2 id=&quot;future-society&quot;&gt;Future Society&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://thefuturesociety.org/&quot;&gt;The Future Society&lt;/a&gt; seeks to align AI through better governance. I reviewed some of &lt;a href=&quot;https://thefuturesociety.org/our-work/&quot;&gt;its work&lt;/a&gt;, and it looks almost entirely irrelevant to x-risk.&lt;/p&gt;

&lt;p&gt;Of The Future Society’s recent publications, the most concrete is “List of Potential Clauses to Govern the Development of General Purpose AI Systems” (&lt;a href=&quot;https://thefuturesociety.org/wp-content/uploads/2023/08/List-of-Potential-Clauses_Aug-2023-v.-0.1.pdf&quot;&gt;pdf&lt;/a&gt;). Some notes on this report:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The Future Society collected recommendations from industry staff, independent experts, and engineers from frontier labs. Engineers from frontier labs should not be trusted to produce recommendations, any more than &lt;a href=&quot;https://x.com/LinchZhang/status/1842344867638411764&quot;&gt;petroleum engineers&lt;/a&gt; should be trusted to set climate change policy.&lt;/li&gt;
  &lt;li&gt;The proposals for mitigating harmful behavior are mostly vacuous and in some cases harmful. They largely amount to: keep building dangerous AI, but do a good job of making it safe.&lt;/li&gt;
  &lt;li&gt;“Use the most state-of-the-art editing techniques to erase capabilities and knowledge that are mostly useful for misuse.” That’s not going to work. (Palisade Research has &lt;a href=&quot;https://arxiv.org/abs/2310.20624&quot;&gt;demonstrated&lt;/a&gt; that it’s easy to remove safeguards from LLMs.)&lt;/li&gt;
  &lt;li&gt;“Use state-of-the-art methods and tools for ensuring safety and trustworthiness of models, such as mechanistic interpretability.” This sentence makes me think the authors don’t have a good understanding of AI safety. The state of the art in mechanistic interpretability is nowhere close to being able to ensure the trustworthiness of models. We still have virtually no idea what’s going on inside large neural networks.&lt;/li&gt;
  &lt;li&gt;The report proposes using the same industry-standard risk management model that the Center for Long-Term Resilience &lt;a href=&quot;#center-for-long-term-resilience&quot;&gt;proposed&lt;/a&gt;. The same criticisms apply—this model is obvious enough that you don’t need to propose it, and severely insufficient for mitigating extinction risks.&lt;/li&gt;
  &lt;li&gt;The report proposes “air gapping &amp;amp; sandboxing, no internet access” for powerful models. I feel like I shouldn’t need to explain why that &lt;a href=&quot;https://www.yudkowsky.net/singularity/aibox&quot;&gt;won’t work&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Another report (&lt;a href=&quot;https://thefuturesociety.org/wp-content/uploads/2023/09/Executive-Summary-TFS-Sep-2023-Heavy-is-the-Head-that-Wears-the-Crown-risk-based-tiered-approach-to-governing-GPAI.pdf&quot;&gt;pdf&lt;/a&gt;) submitted in response to the EU AI Act discussed seven challenges of “general-purpose AI”. The second challenge is “generalization and capability risks, i.e. capability risks, societal risks and extinction risks”. There is no further discussion of extinction risk, and this is the only place that the word “extinction” appears in any of The Future Society’s materials. (The word “existential” appears a few times, but existential risks are not discussed.)&lt;/p&gt;

&lt;h2 id=&quot;horizon-institute-for-public-service&quot;&gt;Horizon Institute for Public Service&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://horizonpublicservice.org/&quot;&gt;Horizon Institute for Public Service&lt;/a&gt; runs a fellowship where it places people into positions in governments and think tanks. It claims to be reasonably &lt;a href=&quot;https://horizonpublicservice.org/fellow-accomplishments/&quot;&gt;successful&lt;/a&gt;. (I do not have much of an opinion as to how much credit Horizon Institute deserves for its fellows’ accomplishments.)&lt;/p&gt;

&lt;p&gt;Horizon Institute has received an undisclosed amount of &lt;a href=&quot;https://horizonpublicservice.org/about-us/&quot;&gt;funding&lt;/a&gt; from Open Philanthropy (along with some other big foundations).&lt;/p&gt;

&lt;p&gt;Do Horizon fellows care about x-risk, and does their work reduce x-risk in expectation? &lt;a href=&quot;https://www.politico.com/news/2023/10/13/open-philanthropy-funding-ai-policy-00121362&quot;&gt;Politico alleges&lt;/a&gt; that the Horizon Institute is a clandestine plot to get governments to care more about x-risk. I’m not a fan of clandestine plots, but that aside, should I expect Horizon fellows to reduce x-risk?&lt;/p&gt;

&lt;p&gt;Most of their work is not legible, so I’m skeptical by default. Caring about x-risk is not enough to make me trust you. Some people take totally the wrong lessons from concerns about x-risk (especially AI risk) and end up increasing it instead. Case in point: OpenAI, DeepMind, and Anthropic &lt;a href=&quot;https://www.astralcodexten.com/p/why-not-slow-ai-progress&quot;&gt;all&lt;/a&gt; had founders who cared about AI x-risk, and two of those (OpenAI + Anthropic) were founded with the explicit mission of preventing extinction. And yet OpenAI is probably the #1 worst thing that has ever happened in terms of &lt;em&gt;increasing&lt;/em&gt; x-risk, and DeepMind and Anthropic aren’t much better.&lt;/p&gt;

&lt;p&gt;I reviewed all the highlighted &lt;a href=&quot;https://horizonpublicservice.org/fellow-accomplishments/&quot;&gt;accomplishments&lt;/a&gt; of fellows that looked relevant to AI:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;https://cltc.berkeley.edu/wp-content/uploads/2023/11/Berkeley-GPAIS-Foundation-Model-Risk-Management-Standards-Profile-v1.0.pdf&quot;&gt;AI risk management standards&lt;/a&gt; for NIST. Only marginally relevant to x-risk, but not bad.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://carnegieendowment.org/posts/2023/09/how-hype-over-ai-superintelligence-could-lead-policy-astray?lang=en&quot;&gt;An article&lt;/a&gt; on how we shouldn’t worry about x-risk (!!).&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://fas.org/publication/creating-auditing-tools-for-ai-equity/&quot;&gt;Auditing tools for AI equity&lt;/a&gt;. Unrelated to x-risk.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.brookings.edu/articles/detecting-ai-fingerprints-a-guide-to-watermarking-and-beyond/&quot;&gt;Detecting AI fingerprints&lt;/a&gt;. Marginally related to x-risk.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://cset.georgetown.edu/publication/autonomous-cyber-defense/&quot;&gt;Autonomous cyber defense&lt;/a&gt;. Increasing the capabilities of cybersecurity AI is plausibly net negative.&lt;sup id=&quot;fnref:31&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:31&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;44&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.lawfaremedia.org/article/eus-ai-act-barreling-toward-ai-standards-do-not-exist&quot;&gt;An article on the EU AI Act&lt;/a&gt;. Non-vacuous and discusses AI risk (not exactly x-risk, but close). Vaguely hints at slowing AI development.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In my judgment after taking a brief look, 3/6 highlighted writings were perhaps marginally useful for x-risk, 1/6 was irrelevant, and 2/6 were likely harmful. None were clearly useful.&lt;/p&gt;

&lt;p&gt;Zvi Mowshowitz &lt;a href=&quot;https://www.lesswrong.com/posts/kuDKtwwbsksAW4BG2/zvi-s-thoughts-on-the-survival-and-flourishing-fund-sff&quot;&gt;wrote&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;In my model, one should be deeply skeptical whenever the answer to ‘what would do the most good?’ is ‘get people like me more money and/or access to power.’&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I agree, but even beyond that, the Horizon fellows don’t seem to be “people like me”. They include people who are arguing &lt;em&gt;against&lt;/em&gt; caring about x-risk.&lt;/p&gt;

&lt;p&gt;I believe the world would be better off if Horizon Institute did not exist (credence: 55%).&lt;/p&gt;

&lt;p&gt;And if I’m wrong about that, it still looks like Horizon fellows don’t do much work related to x-risk, so the expected value of Horizon Institute is low.&lt;/p&gt;

&lt;h2 id=&quot;institute-for-ai-policy-and-strategy&quot;&gt;Institute for AI Policy and Strategy&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.iaps.ai/&quot;&gt;Institute for AI Policy and Strategy&lt;/a&gt; does policy research, focused on US AI regulations, compute governance, lab governance, and international governance with China.&lt;/p&gt;

&lt;p&gt;I’m more optimistic about advocacy than policy research, so this org is not one of my top candidates. That said, I like it better than most AI policy research orgs. Some observations from briefly reading some of its research:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The proposals are non-vacuous and moderately concrete. For example, from its &lt;a href=&quot;https://ukdayone.org/briefings/assuring-growth-making-the-uk-a-global-leader-in-ai-assurance-technology&quot;&gt;recommendations to the UK government&lt;/a&gt;: “Investing £50m in ‘pull mechanisms’ (pay-outs contingent on achieving specific technological goals, such as prizes, AMCs, and milestone payments).” I don’t know how much that helps with x-risk, but it’s concrete.&lt;/li&gt;
  &lt;li&gt;Almost all of its work focuses on sub-extinction risks. Some of this looks potentially useful for x-risk, for example &lt;a href=&quot;https://www.iaps.ai/research/coordinated-disclosure&quot;&gt;establishing reporting requirements&lt;/a&gt;, or &lt;a href=&quot;https://www.iaps.ai/research/are-consumer-gpus-a-problem-for-us-export-controls&quot;&gt;recognizing the risks associated with exporting GPUs&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.iaps.ai/research/responsible-scaling&quot;&gt;One report&lt;/a&gt; fairly criticizes some shortcomings of Anthropic’s Responsible Scaling Policy.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Institute for AI Policy and Strategy has received just under $4 million from Open Philanthropy (&lt;a href=&quot;https://www.openphilanthropy.org/grants/institute-for-ai-policy-strategy-general-support/&quot;&gt;1&lt;/a&gt;, &lt;a href=&quot;https://www.openphilanthropy.org/grants/institute-for-ai-policy-and-strategy-general-support-april-2024/&quot;&gt;2&lt;/a&gt;), and is &lt;a href=&quot;https://manifund.org/projects/ai-policy-work--iaps&quot;&gt;seeking additional funding&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;lightcone-infrastructure&quot;&gt;Lightcone Infrastructure&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.lightconeinfrastructure.com/&quot;&gt;Lightcone&lt;/a&gt; runs &lt;a href=&quot;https://www.lesswrong.com/about&quot;&gt;LessWrong&lt;/a&gt; and an office that Lightcone calls “&lt;a href=&quot;https://en.wikipedia.org/wiki/Bell_Labs&quot;&gt;Bell Labs&lt;/a&gt; for longtermism”.&lt;/p&gt;

&lt;p&gt;Lightcone has a detailed case for impact &lt;a href=&quot;https://manifund.org/projects/lightcone-infrastructure&quot;&gt;on Manifund&lt;/a&gt;. In short, Lightcone maintains LessWrong, and LessWrong is upstream of a large quantity of AI safety work.&lt;/p&gt;

&lt;p&gt;I believe Lightcone has high expected value and it can make good use of marginal donations.&lt;/p&gt;

&lt;p&gt;By maintaining LessWrong, Lightcone somewhat improves many AI safety efforts (plus efforts on other beneficial projects that don’t relate to AI safety). If I were very uncertain about what sort of work was best, I might donate to Lightcone as a way to provide diffuse benefits across many areas. But since I believe (a specific sort of) policy work has much higher EV than AI safety research, it makes more sense to fund that policy work directly.&lt;/p&gt;

&lt;p&gt;An illustration with some made-up numbers: Suppose that&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;There are 10 categories of AI safety work.&lt;/li&gt;
  &lt;li&gt;Lightcone makes each of them 20% better.&lt;/li&gt;
  &lt;li&gt;The average AI safety work produces 1 utility point.&lt;/li&gt;
  &lt;li&gt;Well-directed AI policy produces 5 utility points.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Then a donation to Lightcone is worth 2 utility points, and my favorite AI policy orgs are worth 5 points. So a donation to Lightcone is better than the average AI safety org, but not as good as good policy orgs.&lt;/p&gt;
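&lt;p&gt;A minimal sketch of that arithmetic in Python, using the same made-up numbers:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Made-up numbers from the illustration above.
categories = 10          # categories of AI safety work
avg_value = 1            # utility points from average AI safety work, per category
policy_value = 5         # utility points from well-directed AI policy
lightcone_boost = 0.20   # Lightcone makes each category 20% better

# A donation to Lightcone buys a 20% improvement spread across all 10 categories.
lightcone_ev = categories * avg_value * lightcone_boost  # 2.0 utility points
policy_ev = policy_value                                 # 5 utility points

print(lightcone_ev, policy_ev)  # 2.0 5
&lt;/code&gt;&lt;/pre&gt;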

&lt;h2 id=&quot;machine-intelligence-research-institute-miri&quot;&gt;Machine Intelligence Research Institute (MIRI)&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://intelligence.org/&quot;&gt;MIRI&lt;/a&gt; used to do exclusively technical research. In 2024, it &lt;a href=&quot;https://intelligence.org/2024/01/04/miri-2024-mission-and-strategy-update/&quot;&gt;pivoted&lt;/a&gt; to focus on policy advocacy—specifically, advocating for &lt;a href=&quot;https://intelligence.org/2024/05/29/miri-2024-communications-strategy/&quot;&gt;shutting down frontier AI development&lt;/a&gt;. MIRI changed its mind at around the same time I &lt;a href=&quot;#ai-safety-technical-research-vs-policy&quot;&gt;changed my mind&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Some observations:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;MIRI gets considerable credit for being the first to recognize the AI alignment problem.&lt;/li&gt;
  &lt;li&gt;I have a high opinion of the general competence of MIRI employees.&lt;/li&gt;
  &lt;li&gt;Historically, I have agreed with MIRI’s criticisms of most technical alignment approaches, which suggests they have good reasoning processes. (With the caveat that I don’t really understand technical alignment research.)&lt;/li&gt;
  &lt;li&gt;Eliezer Yudkowsky’s &lt;a href=&quot;https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/&quot;&gt;TIME article&lt;/a&gt; publicly argued for AI pause and brought some attention to the issue (both positive and negative). My vibe sense says the article was valuable but who knows.&lt;/li&gt;
  &lt;li&gt;Eliezer personally has a strong track record of influencing (some subset of) people with &lt;a href=&quot;https://www.lesswrong.com/tag/original-sequences&quot;&gt;the LessWrong sequences&lt;/a&gt; and &lt;a href=&quot;https://hpmor.com/&quot;&gt;Harry Potter and the Methods of Rationality&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;I know that MIRI is serious about existential risk and isn’t going to compromise its values.&lt;/li&gt;
  &lt;li&gt;Eliezer believes animals are not moral patients, which is kind of insane but probably not directly relevant. (&lt;a href=&quot;https://slatestarcodex.com/2019/02/26/rule-genius-in-not-out/&quot;&gt;Rule thinkers in, not out.&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;MIRI (or at least Eliezer) says P(doom) &amp;gt; 95%.&lt;sup id=&quot;fnref:64&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:64&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;45&lt;/a&gt;&lt;/sup&gt; Some people say this is crazy high and it makes MIRI want to do dumb stuff like shutting down AI. I do think 95% is too high, but I think most people are kind of crazy about probability—they treat probabilities less than 50% as essentially 0%. Like if your P(doom) is 40%, you should be doing the same thing that MIRI is doing.&lt;sup id=&quot;fnref:65&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:65&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;46&lt;/a&gt;&lt;/sup&gt; (I sketch this with toy numbers after this list.) You should not be trying to develop AI as fast as possible while funding a little safety research on the side.&lt;/li&gt;
&lt;/ul&gt;
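
&lt;p&gt;Here’s the toy calculation. The numbers are made up (in the same spirit as the Lightcone illustration above), and the assumption that a pause halves P(doom) is purely illustrative, but the qualitative point holds across a wide range of inputs: a large extinction probability swamps a modest delay cost.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Toy numbers: illustrative assumptions, not real estimates.
p_doom = 0.40        # your probability that racing ahead ends in extinction
v_future = 100       # value of reaching a good long-term future
delay_cost = 5       # value lost to a multi-decade pause
p_doom_paused = 0.20 # assume (illustratively) that the pause halves P(doom)

ev_race = (1 - p_doom) * v_future                         # 60.0
ev_pause = (1 - p_doom_paused) * (v_future - delay_cost)  # 76.0

print(ev_race, ev_pause)  # pausing wins even at P(doom) = 40%
&lt;/code&gt;&lt;/pre&gt;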

&lt;p&gt;MIRI’s new communications strategy has produced few results so far. We know that MIRI is &lt;a href=&quot;https://intelligence.org/2024/05/29/miri-2024-communications-strategy/&quot;&gt;working on&lt;/a&gt; a new website that explains the case for x-risk; a book; and an online reference. It remains to be seen how useful these will be. They don’t seem like &lt;em&gt;obviously&lt;/em&gt; good ideas to me,&lt;sup id=&quot;fnref:68&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:68&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;47&lt;/a&gt;&lt;/sup&gt; but I expect MIRI will correct course if a strategy isn’t working.&lt;/p&gt;

&lt;p&gt;Until recently, MIRI was not seeking funding because it received some large cryptocurrency donations in 2021ish. Now it’s started fundraising again to pay for its new policy work.&lt;/p&gt;

&lt;p&gt;I consider MIRI a top candidate. It only recently pivoted to advocacy so there’s not much to retrospectively evaluate, but I expect its work to be impactful.&lt;/p&gt;

&lt;h2 id=&quot;manifund&quot;&gt;Manifund&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://manifund.org/&quot;&gt;Manifund&lt;/a&gt; does not do anything directly related to AI policy. It’s a fundraising platform. But I’m including it in this list because I’m impressed by how it’s changed the funding landscape.&lt;/p&gt;

&lt;p&gt;Many orgs have written fundraising pitches on Manifund, and some of these pitches are &lt;em&gt;way&lt;/em&gt; higher quality than what I’m used to. I’m not sure why—maybe Manifund’s prompt questions draw out good answers.&lt;/p&gt;

&lt;p&gt;For example, originally I was skeptical that donations to Lightcone Infrastructure could be competitive with top charities, but its &lt;a href=&quot;https://manifund.org/projects/lightcone-infrastructure&quot;&gt;Manifund page&lt;/a&gt; changed my mind. I donated $200 just as a reward for the excellent writeup.&lt;/p&gt;

&lt;p&gt;Many of the orgs on my list (especially the smaller ones) wrote detailed pitches on Manifund that helped me decide where to donate. Manifund deserves part of the credit for that.&lt;/p&gt;

&lt;p&gt;Manifund is free to use, but it sometimes asks large donors to give a percentage of their donations to cover its operating costs. Manifund didn’t ask me to do that, so I didn’t.&lt;/p&gt;

&lt;h2 id=&quot;model-evaluation-and-threat-research-metr&quot;&gt;Model Evaluation and Threat Research (METR)&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://metr.org&quot;&gt;METR&lt;/a&gt; evaluates large AI models to look for potentially dangerous capabilities. Its most obvious theory of change—where it finds a scary result and then the AI company pauses development—mainly depends on (1) AI companies giving access to METR (which they often &lt;a href=&quot;https://www.lesswrong.com/posts/yHFhWmu3DmvXZ5Fsm/clarifying-metr-s-auditing-role&quot;&gt;don’t&lt;/a&gt;) and (2) AI companies ceasing model development when METR establishes harmful capabilities (which they probably won’t—if there’s any ambiguity, they will likely choose the interpretation that lets them keep making more money).&lt;/p&gt;

&lt;p&gt;There’s an indirect but more promising theory of change where METR demonstrates a template for capability evaluation that policy-makers then rely on to impose safety regulations. To that end, METR has engaged with NIST’s AI risk management framework (&lt;a href=&quot;https://downloads.regulations.gov/NIST-2024-0001-0075/attachment_2.pdf&quot;&gt;pdf&lt;/a&gt;). This approach has some merit, but it’s not where I would put money on the margin, because:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;I &lt;a href=&quot;#slow-nuanced-regulation-vs-fast-coarse-regulation&quot;&gt;don’t think we should wait&lt;/a&gt; to figure out a solid evaluation framework before writing regulations.&lt;/li&gt;
  &lt;li&gt;Evaluations are helpful if we want to conditionally pause AI in the future, but not relevant if we want to unconditionally pause AI right now, and I believe we should do the latter.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;palisade-research&quot;&gt;Palisade Research&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://palisaderesearch.org/&quot;&gt;Palisade&lt;/a&gt; builds demonstrations of the offensive capabilities of AI systems, with the goal of illustrating risks to policy-makers.&lt;/p&gt;

&lt;p&gt;Some thoughts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Demonstrating capabilities is probably a useful persuasion strategy.&lt;/li&gt;
  &lt;li&gt;Palisade has done some good work, like &lt;a href=&quot;https://arxiv.org/abs/2311.00117&quot;&gt;removing safety fine-tuning&lt;/a&gt; from Meta’s LLM.&lt;/li&gt;
  &lt;li&gt;I know some of the Palisade employees and I believe they’re competent.&lt;/li&gt;
  &lt;li&gt;Historically, Palisade has focused on building out tech demos. I’m not sure how useful this is for x-risk, since you can’t demonstrate existentially threatening capabilities until it’s too late. Hopefully, Palisade’s audience can extrapolate from the demos to see that extinction is a serious concern.&lt;/li&gt;
  &lt;li&gt;Soon, Palisade plans to shift from primarily building demos to primarily using those demos to persuade policy-makers.&lt;/li&gt;
  &lt;li&gt;Palisade has a smallish team and has reasonable room to expand.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Palisade has not been actively fundraising, but I believe it can put funding to good use—it has limited runway and wants to hire more people.&lt;/p&gt;

&lt;p&gt;I think the work on building tech demos has rapidly diminishing utility, but Palisade is &lt;a href=&quot;https://palisaderesearch.org/hiring&quot;&gt;hiring&lt;/a&gt; for more policy-oriented roles, so I believe that’s mostly where marginal funding will go.&lt;/p&gt;

&lt;h2 id=&quot;pauseai-global&quot;&gt;PauseAI Global&lt;/h2&gt;

&lt;p&gt;(PauseAI Global and PauseAI US share the same mission and used to be part of the same org, so most of my comments on PauseAI Global also apply to PauseAI US.)&lt;/p&gt;

&lt;p&gt;From &lt;a href=&quot;https://manifund.org/projects/pauseai-local-communities---volunteer-stipends&quot;&gt;Manifund&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;a href=&quot;https://pauseai.info&quot;&gt;PauseAI&lt;/a&gt; is a grassroots community of volunteers which aim to inform the public and politicians about the risks from superhuman AI and urge them to work towards an international treaty that prevents the most dangerous AI systems from being developed.&lt;/p&gt;

  &lt;p&gt;PauseAI is largely organised through local communities which take actions to spread awareness such as letter writing workshops, peaceful protests, flyering and giving presentations.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Historically, I’ve been skeptical of public protests. I think people mainly protest because it’s fun and it makes them feel like they’re contributing, not because it actually helps.&lt;sup id=&quot;fnref:23&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:23&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;48&lt;/a&gt;&lt;/sup&gt; But PauseAI has been appropriately thoughtful (&lt;a href=&quot;https://forum.effectivealtruism.org/posts/Y4SaFM5LfsZzbnymu/the-case-for-ai-safety-advocacy-to-the-public&quot;&gt;1&lt;/a&gt;, &lt;a href=&quot;https://pauseai.info/feasibility&quot;&gt;2&lt;/a&gt;) about whether and when protests work, and it makes a reasonable case that protesting can be effective.&lt;/p&gt;

&lt;p&gt;(See also the &lt;a href=&quot;https://www.socialchangelab.org/_files/ugd/503ba4_052959e2ee8d4924934b7efe3916981e.pdf&quot;&gt;Protest Outcomes&lt;/a&gt; report by Social Change Lab. The evidence for the effectiveness of protests is a bit stronger than I expected.&lt;sup id=&quot;fnref:66&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:66&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;49&lt;/a&gt;&lt;/sup&gt;)&lt;/p&gt;

&lt;p&gt;I’m skeptical of the evidence because I don’t trust sociology research (it has approximately the worst replication record of any field). But I like PauseAI because:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Approximately zero percent of AI dollars go to AI safety, and approximately zero percent of AI safety dollars go to public advocacy, so public advocacy is doubly neglected.&lt;/li&gt;
  &lt;li&gt;Polls suggest that there’s widespread public support for pausing AI, and PauseAI has a good shot at converting that public support into policy change.&lt;/li&gt;
  &lt;li&gt;The people running PauseAI seem to have a good idea of what they’re doing, and it’s apparent that they are seriously concerned about existential risk (for most AI policy orgs, I can’t tell whether they care).&lt;/li&gt;
  &lt;li&gt;My impression is that the PauseAI founders went through a similar reasoning process as &lt;a href=&quot;#cause-prioritization&quot;&gt;I did&lt;/a&gt;, and concluded that public advocacy was the most promising approach.&lt;/li&gt;
  &lt;li&gt;I’ve listened to interviews and read articles by leaders from a number of AI policy orgs, and I like the vibes of the PauseAI leaders the best. Many people working in AI safety have &lt;a href=&quot;https://www.econlib.org/archives/2016/01/the_invisible_t.html&quot;&gt;missing moods&lt;/a&gt;, but the PauseAI people do not. I don’t put too much weight on vibes, but they still get nonzero weight.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Broadly speaking, I’m a little more optimistic about advocacy toward policy-makers than advocacy toward the public, simply because it’s more targeted. But PauseAI is still a top candidate because its approach is exceptionally neglected.&lt;/p&gt;

&lt;p&gt;PauseAI Global has no full-time employees; it focuses on supporting volunteers who run protests.&lt;/p&gt;

&lt;h2 id=&quot;pauseai-us&quot;&gt;PauseAI US&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://manifund.org/projects/pauseai-us-2025-through-q2&quot;&gt;PauseAI US&lt;/a&gt; organizes protests to advocate for pausing AI development.&lt;/p&gt;

&lt;p&gt;Unlike PauseAI Global, which has no full-time employees, PauseAI US has a small full-time staff who run protests and political lobbying efforts. I like PauseAI US a little better than PauseAI Global because most major AI companies are headquartered in the US, so I expect a US-based org to have more potential for impact.&lt;/p&gt;

&lt;p&gt;PauseAI US also does grassroots lobbying (e.g., organizing volunteers to write letters to Congress) and direct lobbying (talking to policy-makers).&lt;/p&gt;

&lt;p&gt;Grassroots lobbying makes sense as a neglected intervention. Direct lobbying isn’t quite as neglected but it’s still one of my favorite interventions. PauseAI US only has a single lobbyist right now, Felix De Simone. He’s more junior than the lobbyists at some other policy orgs, but based on what I know of his &lt;a href=&quot;https://forum.effectivealtruism.org/posts/aYxuFeCcqRvaszHPb/ama-pauseai-us-needs-money-ask-founder-exec-dir-holly-elmore?commentId=HXkktn8NsdrEsxiPW&quot;&gt;background&lt;/a&gt;, I expect him to do a competent job.&lt;sup id=&quot;fnref:72&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:72&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;50&lt;/a&gt;&lt;/sup&gt; PauseAI US is performing well on obvious surface-level metrics like “number of meetings with Congressional offices per person per month”.&lt;/p&gt;

&lt;h2 id=&quot;sentinel-rapid-emergency-response-team&quot;&gt;Sentinel rapid emergency response team&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://sentinel-team.org/&quot;&gt;Sentinel&lt;/a&gt; monitors world events for potential precursors to catastrophes. It publishes a &lt;a href=&quot;https://sentinel-team.org/#latest&quot;&gt;weekly newsletter&lt;/a&gt; with events of interest (such as “Iran launched a ballistic missile attack on Israel” or “Two people in California have been infected with bird flu”).&lt;/p&gt;

&lt;p&gt;Sentinel’s mission is to alert relevant parties so that looming catastrophes can be averted before they happen.&lt;/p&gt;

&lt;p&gt;You can read more information on Sentinel’s &lt;a href=&quot;https://manifund.org/projects/fund-sentinel-for-q4-2024&quot;&gt;Manifund page&lt;/a&gt; (short) and &lt;a href=&quot;https://docs.google.com/document/d/18GWF0pVy5X7M_0e3l49Ze82aqHvVurVpsLm63htmc64/&quot;&gt;fundraising memo&lt;/a&gt; (long).&lt;/p&gt;

&lt;p&gt;Some thoughts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;I believe almost nobody would do a good job of running Sentinel because it’s hard to identify early warning signals of catastrophes. But Sentinel is run by members of &lt;a href=&quot;https://samotsvety.org/&quot;&gt;Samotsvety Forecasting&lt;/a&gt;, who I expect to be unusually good at this.&lt;/li&gt;
  &lt;li&gt;The value of Sentinel depends on who’s paying attention to its reports. I don’t know who’s paying attention to its reports.&lt;/li&gt;
  &lt;li&gt;Sentinel isn’t immediately relevant to AI policy, but it could be extremely valuable in certain situations. Namely, it could provide early warning if AI x-risk rapidly increases due to some series of events.&lt;/li&gt;
  &lt;li&gt;AI x-risk aside, I still think Sentinel has high EV because it could significantly reduce catastrophic risk on a small budget. Without having investigated those cause areas, Sentinel is tentatively my #1 donation pick for reducing nuclear and biological x-risks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sentinel currently has four team members working part-time. With additional funding, its members could work full-time, and it could hire more people to do more comprehensive monitoring.&lt;/p&gt;

&lt;h2 id=&quot;simon-institute-for-longterm-governance&quot;&gt;Simon Institute for Longterm Governance&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.simoninstitute.ch/&quot;&gt;Simon Institute&lt;/a&gt; supports policies to improve coordination, reduce global catastrophic risks, and embed consideration for future generations. It specifically focuses on influencing United Nations policy. (See the &lt;a href=&quot;https://forum.effectivealtruism.org/posts/aqwyGuJkZbnpjt3TR/update-on-the-simon-institute-year-one&quot;&gt;Year One Update&lt;/a&gt; from 2022.)&lt;/p&gt;

&lt;p&gt;Most of the org’s work does not appear very relevant to x-risk. For example:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;It &lt;a href=&quot;https://www.simoninstitute.ch/blog/post/the-windfall-trust-workshop-exploring-potential-pathways-for-benefit-sharing-redistributing-ai-profits/&quot;&gt;co-hosted a workshop&lt;/a&gt; on how AI companies might redistribute their profits (via the &lt;a href=&quot;https://futureoflife.org/project/the-windfall-trust/&quot;&gt;Windfall Trust&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;It &lt;a href=&quot;https://www.simoninstitute.ch/blog/post/nov-25-30-preparedness-bwc-meeting-of-states-parties/&quot;&gt;co-developed a table-top exercise&lt;/a&gt; on pandemic preparedness.&lt;/li&gt;
  &lt;li&gt;It &lt;a href=&quot;https://www.simoninstitute.ch/blog/post/response-to-revision-1-of-the-global-digital-compact-implications-for-ai-governance/&quot;&gt;proposed some minor changes&lt;/a&gt; to a UN &lt;a href=&quot;https://www.un.org/techenvoy/global-digital-compact&quot;&gt;document&lt;/a&gt; on AI governance.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This work seems reasonably good, but not as high-impact as work that directly targets x-risk reduction.&lt;/p&gt;

&lt;h2 id=&quot;stop-ai&quot;&gt;Stop AI&lt;/h2&gt;

&lt;p&gt;Like PauseAI, &lt;a href=&quot;https://www.stopai.info/&quot;&gt;Stop AI&lt;/a&gt; protests the development of superintelligent AI. Unlike PauseAI, Stop AI uses disruptive tactics like blocking entrances to OpenAI offices and blocking traffic.&lt;/p&gt;

&lt;p&gt;This is a higher-variance strategy. I find it plausible that Stop AI’s tactics are especially effective, but also plausible that they will backfire and decrease public support. So in the absence of some degree of supporting evidence, I’m inclined not to support Stop AI.&lt;/p&gt;

&lt;p&gt;Stop AI’s &lt;a href=&quot;https://docs.google.com/document/d/1IgTaTMTZuY3kRLZ5JDFhwqQLATWVVGLOHVulbBY5O6g/&quot;&gt;proposal&lt;/a&gt; seems overreaching (it wants to &lt;em&gt;permanently&lt;/em&gt; ban AGI development), and its supporting arguments are weak.&lt;/p&gt;

&lt;p&gt;From listening to an &lt;a href=&quot;https://lironshapira.substack.com/p/getting-arrested-for-barricading&quot;&gt;interview&lt;/a&gt;, I get the impression that the Stop AI founders aren’t appropriately outcome-oriented and don’t have a well-formulated theory of change. In the interview, they would offer a justification for a particular action; then, when the interviewer pointed out that the justification didn’t explain their behavior, they would switch to a different explanation. An example (paraphrased):&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Stop AI: We blocked an entrance to OpenAI’s office to make it harder for employees to build AGI. We feel that this is necessary to stop OpenAI from killing everyone.&lt;/p&gt;

  &lt;p&gt;Interviewer: Then why did you block traffic, since that affects innocent bystanders, not the people building AGI?&lt;/p&gt;

  &lt;p&gt;Stop AI: We need to block traffic to raise awareness.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This pattern occurred a few times. They have reasonable concerns about the dangers of AI, but they don’t seem to have a good justification for why disruptive protests are the best way to handle those concerns.&lt;/p&gt;

&lt;p&gt;(I can see an argument for blocking entrances to AI company offices, but I think the argument for blocking traffic is much weaker.)&lt;/p&gt;

&lt;p&gt;In short, Stop AI is spiritually similar to PauseAI but with worse reasoning, worse public materials, and worse tactics.&lt;/p&gt;

&lt;h1 id=&quot;where-im-donating&quot;&gt;Where I’m donating&lt;/h1&gt;

&lt;p&gt;My top candidates:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;AI Safety and Governance Fund&lt;/li&gt;
  &lt;li&gt;PauseAI US&lt;/li&gt;
  &lt;li&gt;Center for AI Policy&lt;/li&gt;
  &lt;li&gt;Palisade&lt;/li&gt;
  &lt;li&gt;MIRI&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A classification of every other org I reviewed:&lt;sup id=&quot;fnref:76&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:76&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;51&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Good but not funding-constrained:&lt;/strong&gt; Center for AI Safety, Future of Life Institute&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Would fund if I had more money:&lt;/strong&gt; Control AI, Existential Risk Observatory, Lightcone Infrastructure, PauseAI Global, Sentinel&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Would fund if I had a lot more money, but might fund orgs in other cause areas first:&lt;/strong&gt;&lt;sup id=&quot;fnref:32&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:32&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;52&lt;/a&gt;&lt;/sup&gt; AI Policy Institute, CEEALAR, Center for Human-Compatible AI, Manifund&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Might fund if I had a lot more money:&lt;/strong&gt; AI Standards Lab, Centre for the Governance of AI, Centre for Long-Term Policy, CivAI, Institute for AI Policy and Strategy, METR, Simon Institute for Longterm Governance&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Would not fund:&lt;/strong&gt; Center for Long-Term Resilience, Center for Security and Emerging Technology, Future Society, Horizon Institute for Public Service, Stop AI&lt;/p&gt;

&lt;h2 id=&quot;prioritization-within-my-top-five&quot;&gt;Prioritization within my top five&lt;/h2&gt;

&lt;p&gt;Here’s why I ordered my top five the way I did.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;#1: AI Safety and Governance Fund&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is my top choice because:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;It could greatly improve the value of future communications efforts.&lt;/li&gt;
  &lt;li&gt;It’s cheap, so even a modest effect would make it cost-effective.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It would drop down the list quickly if it received more funding, but right now it’s #1.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;#2: PauseAI US&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I would expect advocacy toward policy-makers to be more impactful than public advocacy if they had similar levels of funding (credence: 60%). But pause protests are extremely neglected, so I believe they’re the most promising strategy on the margin. And PauseAI US is my favorite org doing protests because it operates in the United States and it appears appropriately competent and thoughtful.&lt;/p&gt;

&lt;p&gt;Protests are especially unpopular among institutional funders, which makes them more promising for individual donors like me.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;#3: Center for AI Policy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is one of only three orgs (along with Palisade and PauseAI US) that meet four criteria:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;works to persuade policy-makers&lt;/li&gt;
  &lt;li&gt;focuses on AI x-risk over other less-important AI safety concerns&lt;sup id=&quot;fnref:69&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:69&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;53&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;focuses on United States policy&lt;sup id=&quot;fnref:70&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:70&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;54&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;is funding-constrained&lt;sup id=&quot;fnref:71&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:71&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;55&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I’m nearly indifferent between Center for AI Policy and Palisade. I slightly prefer the former because (1) its employees have more experience in politics and (2) its mission/messaging seems less palatable to institutional funders so I expect it to have a harder time raising money.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;#4: Palisade&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Palisade meets the same four criteria as Center for AI Policy. As a little twist, Palisade also builds tech demos with the purpose of demonstrating the dangers of AI to policy-makers. Those demos might help or they might not be worth the effort—both seem equally likely to me—so this twist doesn’t change my expectation of Palisade’s cost-effectiveness. I only slightly favor Center for AI Policy for the two reasons mentioned previously.&lt;/p&gt;

&lt;p&gt;I personally know people at Palisade, which I think biases me in its favor, and I might put Palisade at #3 if I wasn’t putting in mental effort to resist that bias.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;#5: MIRI&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;MIRI plans to target a general audience, &lt;del&gt;not policy-makers&lt;/del&gt; (update 2024-11-20: see correction below). That means they can reach more people but it’s also lower leverage. My guess is that targeting a general audience is worse on balance.&lt;/p&gt;

&lt;p&gt;I put PauseAI US higher than the two lobbying orgs because it has such a small budget. MIRI’s strategies, like PauseAI US’s, are also neglected, but considerably less so.&lt;sup id=&quot;fnref:74&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:74&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;56&lt;/a&gt;&lt;/sup&gt; I expect policy-maker outreach to be more effective than MIRI’s approach (credence: 60%).&lt;/p&gt;

&lt;p&gt;Lest I give the wrong impression, MIRI is still my #5 candidate out of 28 charities. I put it in the top five because I have a high opinion of MIRI leadership—I expect them to have reasonable prioritization and effective execution.&lt;/p&gt;

&lt;p&gt;CORRECTION: MIRI’s technical governance team does research to inform policy, and MIRI has spoken to policy-makers in the US government. This bumps up my evaluation of the org but I’m keeping it at #5 because working with policy-makers is only one part of MIRI’s overall activities.&lt;/p&gt;

&lt;h2 id=&quot;where-im-donating-this-is-the-section-in-which-i-actually-say-where-im-donating&quot;&gt;Where I’m donating (this is the section in which I actually say where I’m donating)&lt;/h2&gt;

&lt;p&gt;I agree with the standard argument that small donors should give all their money to their #1 favorite charity. That’s how I’ve done it &lt;a href=&quot;https://mdickens.me/donations/&quot;&gt;in the past&lt;/a&gt;, but this year I’m splitting my donations a little bit. I plan on donating:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;$5,000 to AI Safety and Governance Fund&lt;/li&gt;
  &lt;li&gt;$5,000 to PauseAI Global&lt;sup id=&quot;fnref:60&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:60&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;57&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;$30,000 to PauseAI US&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here’s why I’m splitting my donations:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;AI Safety and Governance Fund is small, and I don’t want to represent too big a portion of its budget.&lt;/li&gt;
  &lt;li&gt;I donated to PauseAI Global before writing this post, and my prioritization changed somewhat after writing it.&lt;sup id=&quot;fnref:60:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:60&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;57&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;That leaves PauseAI US as my top candidate, so the rest of my donations will go there.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I already donated $5,000 to PauseAI Global, but I haven’t made the other donations yet, so commenters have a chance to convince me to change my mind.&lt;/p&gt;

&lt;p&gt;If you wish to persuade me privately (or otherwise discuss in private), you can email me at &lt;a href=&quot;mailto:donations@mdickens.me&quot;&gt;donations@mdickens.me&lt;/a&gt; or &lt;a href=&quot;https://forum.effectivealtruism.org/users/michaeldickens&quot;&gt;message me&lt;/a&gt; on the EA Forum.&lt;/p&gt;

&lt;h1 id=&quot;changelog&quot;&gt;Changelog&lt;/h1&gt;

&lt;ul&gt;
  &lt;li&gt;2024-11-20: Provide a correction for some incorrect information about MIRI.&lt;/li&gt;
  &lt;li&gt;2025-04-25: Change some bolded sections into headings so they can be linked to.&lt;/li&gt;
&lt;/ul&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;At least the highly effective kinds of animal welfare. Things like animal shelters get a lot of funding but they’re not highly effective. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:13&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I believe humans are the most important species, but only because we will shape the future, not because we matter innately more.&lt;/p&gt;

      &lt;p&gt;To be precise, I believe any individual human’s welfare probably innately matters more than an individual animal of any other species. But there are so many more animals than humans that animals matter much more in aggregate. &lt;a href=&quot;#fnref:13&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:36&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I did change my mind in one relevant way—I used to think AI policy advocacy was very unlikely to work, and now I think it has a reasonable chance of working. More on this &lt;a href=&quot;#ai-safety-technical-research-vs-policy&quot;&gt;later&lt;/a&gt;. &lt;a href=&quot;#fnref:36&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:54&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;So maybe I was being reasonable before and now I’m over-weighting AI risk because I’m worried about getting killed by AI? If I’m being irrational right now, how would I know? &lt;a href=&quot;#fnref:54&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:25&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I studied computer science in university and I took three AI classes: Intro to AI, Intro to Machine Learning, and Convolutional Neural Networks for Natural Language Processing. My grades were below average but not terrible. &lt;a href=&quot;#fnref:25&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:47&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Some alignment researchers think we’re not that far away from solving AI alignment. I won’t go into detail because I don’t think I can do a great job of explaining my views. An informed alignment researcher could probably write &amp;gt;100,000 words detailing the progress in various subfields and extrapolating it to predict how close we are to solving alignment—something like &lt;a href=&quot;https://www.alignmentforum.org/posts/zaaGsFBeDTpCsYHef/shallow-review-of-live-agendas-in-alignment-and-safety&quot;&gt;this&lt;/a&gt; or &lt;a href=&quot;https://www.alignmentforum.org/posts/QBAjndPuFbhEXKcCr/my-understanding-of-what-everyone-in-technical-alignment-is&quot;&gt;this&lt;/a&gt;, but with more analysis and prediction—and some other informed alignment researcher could do the same thing and come up with a totally different answer.&lt;/p&gt;

      &lt;p&gt;Feel free to change the numbers on my &lt;a href=&quot;https://squigglehub.org/models/mdickens/ai-research-vs-policy&quot;&gt;quantitative model&lt;/a&gt; and write a comment about what answer you got. &lt;a href=&quot;#fnref:47&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:15&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Like if “just use reinforcement learning to teach the AI to be ethical” turns out to work. (Which I doubt, but some people seem to think it will work so idk.) &lt;a href=&quot;#fnref:15&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I don’t remember exactly what I used to believe. Maybe something like, “AI policy advocacy could be a good idea someday once AI looks more imminent and there’s more political will, but it’s not a good idea right now because people will think you’re crazy.” &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:16&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;“Replace Gavin Newsom with a governor with &lt;a href=&quot;https://www.astralcodexten.com/p/sb-1047-our-side-of-the-story&quot;&gt;more integrity&lt;/a&gt;” might be an effective intervention, but probably not cost-effective—there’s already too much money in state elections. &lt;a href=&quot;#fnref:16&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:38&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I realized midway through writing this post that I had made this major cause prioritization decision without even making up some numbers and slapping them into a Monte Carlo simulation, which was very out of character for me. &lt;a href=&quot;#fnref:38&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:63&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;More accurately, I believe we live in one of two worlds:&lt;/p&gt;

      &lt;ol&gt;
        &lt;li&gt;Prosaic alignment works, in which case we will probably not lack funding for AI alignment, and my marginal donation has a small chance of making a difference.&lt;/li&gt;
        &lt;li&gt;Alignment is hard, in which case we will probably not have nearly enough funding (assuming no regulation), and my marginal donation has a small chance of making a difference.&lt;/li&gt;
      &lt;/ol&gt;

      &lt;p&gt;And really there’s a continuum between those, so there’s a small chance that AI alignment is at just the right level of difficulty for marginal donations to make a difference. &lt;a href=&quot;#fnref:63&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:20&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;There’s a lot of spending on so-called safety research that’s really fake safetywashing research. I wouldn’t count that as part of my estimate. &lt;a href=&quot;#fnref:20&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:39&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;For example, with the default inputs, the cost to solve alignment is set to a log-normal distribution with 25th/75th percentiles at $1 billion and $1 trillion. If you tighten the distribution to $10 billion to $1 trillion, the marginal value of spending on alignment research drops to ~0. &lt;a href=&quot;#fnref:39&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
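
      &lt;p&gt;For concreteness, here is a sketch (in Python rather than Squiggle) of how those two percentiles pin down the lognormal. The $1 billion / $1 trillion figures are the model’s stated defaults; the rest is standard lognormal algebra:&lt;/p&gt;

      &lt;pre&gt;&lt;code&gt;import math
from statistics import NormalDist

# ln(X) is Normal(mu, sigma); for a normal, the 25th/75th percentiles
# sit at mu -/+ z*sigma, where z = inv_cdf(0.75) (~0.6745).
p25, p75 = 1e9, 1e12   # the stated defaults: $1 billion / $1 trillion
z = NormalDist().inv_cdf(0.75)
mu = (math.log(p25) + math.log(p75)) / 2
sigma = (math.log(p75) - math.log(p25)) / (2 * z)
print(round(mu, 2), round(sigma, 2))   # ~24.18 log-dollars, ~5.12
&lt;/code&gt;&lt;/pre&gt;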
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I’ve met individual researchers who said this and I don’t think they were lying, but I think their beliefs were motivated by a desire to build SOTA models because SOTA models are cool. &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I’m slightly concerned that I shouldn’t pick on Anthropic because they’re the least unethical of the big AI companies (as far as I can tell). But I think when you’re building technology that endangers the lives of every sentient being who lives and who ever will live, you should be held to an extremely high standard of honesty and communication, and Anthropic falls &lt;em&gt;embarrassingly, horrifyingly short&lt;/em&gt; of that standard. As they say, reality does not grade on a curve. &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:41&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I would actually go further than that—I think the &lt;em&gt;best&lt;/em&gt; types of alignment research don’t require SOTA models. But that’s more debatable, and it’s not required for my argument. &lt;a href=&quot;#fnref:41&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:9&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Suppose conservatively that the lightcone will be usable for another billion years, and that we need to delay superintelligent AI by 100 years to make it safe. The volume of the lightcone is proportional to time cubed. Therefore, assuming a constant rate of expansion, delaying 100 years means we can only access 99.99997% of the lightcone instead of 100%. Even at an incredibly optimistic P(doom) (say, 0.001%), accelerating AI isn’t worth it on a naive longtermist view. &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
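
      &lt;p&gt;A quick check of that arithmetic (a minimal sketch; the billion-year horizon and the 100-year delay are the assumptions stated above, not established facts):&lt;/p&gt;

      &lt;pre&gt;&lt;code&gt;# Reachable fraction of the lightcone after a 100-year delay, assuming
# constant expansion so that reachable volume scales with time cubed.
T = 1_000_000_000   # years the lightcone stays usable (assumption above)
d = 100             # years of delay (assumption above)

fraction = ((T - d) / T) ** 3
print(f'reachable: {fraction:.7%}')   # ~99.99997%
print(f'lost: {1 - fraction:.1e}')    # ~3.0e-07, far below a 0.001% P(doom)
&lt;/code&gt;&lt;/pre&gt;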
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I’m making an ad hominem argument on purpose. Altman’s arguments seem bad to me, maybe I’m missing something because he understands AI better than I do, but in fact it looks like the better explanation isn’t that I’m missing something, but that Altman is genuinely making bad arguments because his reasoning is motivated—and he’s a known liar so I’m perfectly happy to infer that he’s lying about this issue too. &lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:75&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Perhaps I’m using a circular definition of “respectable” but I don’t consider someone respectable (on this particular issue) if they estimate P(doom) at &amp;lt;1%. &lt;a href=&quot;#fnref:75&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:10&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I put roughly 80% credence on each of five arguments. If each argument is independent, that means I’m probably wrong about at least one of them (0.8&lt;sup&gt;5&lt;/sup&gt; is only about 0.33, i.e., a one-in-three chance that all five hold), but I don’t think they’re independent.&lt;/p&gt;

      &lt;p&gt;And some of the arguments are more decisive than others. For example, if I’m wrong about the opportunity cost argument (#4), then I should switch sides. But if I’m wrong about the hardware overhang argument (#1) and overhang is indeed a serious concern, that doesn’t necessarily mean we shouldn’t slow AI development, it just means a slowdown improves safety in one way and harms safety in another way, and it’s not immediately clear which choice is safer. &lt;a href=&quot;#fnref:10&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:42&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;At least for some reasons people cite for not wanting to pause, they should agree with me on this. There are still some counter-arguments, like “we can’t delay AI because that delays the glorious transhumanist future”, but I consider those to be the weakest arguments. &lt;a href=&quot;#fnref:42&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:12&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;You could argue that pseudoephedrine is over-regulated, and in fact I would agree. But I don’t think those regulations are a particularly big problem, either. &lt;a href=&quot;#fnref:12&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:52&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Companies make more money from more powerful models, and powerful models are more dangerous. Power and safety directly trade off against each other until you can figure out how to build powerful models that aren’t dangerous—which means you need to solve alignment first. &lt;a href=&quot;#fnref:52&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:40&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;In some sense, if we get strong regulations, the companies win, because all the companies’ employees and shareholders don’t get killed by unfriendly AI. But they’ll be unhappy in the short term because they irrationally prioritize profit over not getting killed.&lt;/p&gt;

      &lt;p&gt;I don’t understand what’s going on here psychologically—according to the expressed beliefs of people like Dario Amodei and Shane Legg, they’re massively endangering their own lives in exchange for profit. It’s not even that they disagree with me about key facts, they’re just doing things that make no sense according to their own (expressed) beliefs. &lt;a href=&quot;#fnref:40&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:11&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I’m inclined to say it doesn’t matter at all, but some smart alignment researchers think it does, and they know more than me. Changing my mind wouldn’t materially change my argument here.&lt;/p&gt;

      &lt;p&gt;Paul Christiano, who ~invented RLHF, seems to believe that it is not a real solution to alignment, but improvements on the method might lead to a solution. (He wrote something like this in a comment I read a few days ago that now I can’t find.)&lt;/p&gt;

      &lt;p&gt;On the other hand, RLHF makes the AI look more aligned even if it isn’t, and this might hurt by misleading people into thinking it’s aligned, so they proceed with expanding capabilities when really they shouldn’t.&lt;/p&gt;

      &lt;p&gt;RLHF also makes LLMs less likely to say PR-damaging things. Without RLHF, AI companies might develop LLMs more cautiously out of fear of PR incidents. &lt;a href=&quot;#fnref:11&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:14&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;This doesn’t make sense as a crux because an RSP also creates a hardware overhang if it triggers, but I’ve already talked about why I dislike the “hardware overhang” argument in general. &lt;a href=&quot;#fnref:14&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:19&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;“[thing I don’t understand] must be simple, right?” -famous last words &lt;a href=&quot;#fnref:19&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:21&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;At least that’s what I thought when I first wrote this sentence, before I had looked into any AI policy orgs. After having looked into them, I found it pretty easy to strike some orgs off the list. I don’t know what conversations orgs are having with policy-makers and how productive those conversations are, but I can read their public reports, and I can tell when their public reports aren’t good. And if their reports aren’t good, they probably don’t do a good job of influencing policy-makers either. &lt;a href=&quot;#fnref:21&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:57&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;ul&gt;
        &lt;li&gt;The &lt;a href=&quot;https://funds.effectivealtruism.org/funds/far-future&quot;&gt;Long-Term Future Fund&lt;/a&gt; (LTFF) has given very little money to AI policy.&lt;/li&gt;
        &lt;li&gt;The &lt;a href=&quot;https://www.airiskfund.com/&quot;&gt;AI Risk Mitigation Fund&lt;/a&gt; is a spinoff of LTFF that focuses exclusively on AI safety. As of this writing, it hasn’t made any grants yet, but I assume it will behave similarly to LTFF.&lt;/li&gt;
        &lt;li&gt;Longview Philanthropy’s &lt;a href=&quot;https://www.longview.org/fund/emerging-challenges-fund/&quot;&gt;Emerging Challenges Fund&lt;/a&gt; and the &lt;a href=&quot;https://survivalandflourishing.fund/&quot;&gt;Survival and Flourishing Fund&lt;/a&gt; have given some grants on AI policy, but mostly on other cause areas.&lt;/li&gt;
        &lt;li&gt;Manifund has a &lt;a href=&quot;https://manifund.org/about/regranting&quot;&gt;regranting program&lt;/a&gt;, but 3 out of 6 regranters are current or former employees at AI companies, which makes me disinclined to trust their judgment; and their grants so far mostly focus on alignment research, not policy.&lt;/li&gt;
        &lt;li&gt;&lt;a href=&quot;https://forum.effectivealtruism.org/users/larks&quot;&gt;Larks&lt;/a&gt; used to write reviews of AI safety orgs, but they haven’t done it in a while—and they primarily reviewed alignment research, not policy.&lt;/li&gt;
        &lt;li&gt;Nuño Sempere did some &lt;a href=&quot;https://forum.effectivealtruism.org/posts/xmmqDdGqNZq5RELer/shallow-evaluations-of-longtermist-organizations&quot;&gt;shallow investigations&lt;/a&gt; three years ago, but they’re out of date.&lt;/li&gt;
      &lt;/ul&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:57&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:56&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;As of this writing (2024-11-02), the recommendations are:&lt;/p&gt;

      &lt;ol&gt;
        &lt;li&gt;Horizon Institute for Public Service&lt;/li&gt;
        &lt;li&gt;Institute for Law and AI&lt;/li&gt;
        &lt;li&gt;Effective Institutions Project’s work on AI governance&lt;/li&gt;
        &lt;li&gt;FAR AI&lt;/li&gt;
        &lt;li&gt;Centre for Long-Term Resilience&lt;/li&gt;
        &lt;li&gt;Center for Security and Emerging Technology&lt;/li&gt;
        &lt;li&gt;Center for Human-Compatible AI&lt;/li&gt;
      &lt;/ol&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:56&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:58&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I don’t know exactly what’s going on with the difference between Founders Pledge recs and my top donation candidates. It looks to me like Founders Pledge puts too much stock in the “build influence to use later” theory of change, and it cares too much about orgs’ legible status / reputation. &lt;a href=&quot;#fnref:58&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:29&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Some of these orgs don’t exactly work on policy, but do do work that plausibly helps policy. There are some orgs fitting that description that I didn’t review (e.g., AI Impacts and Epoch AI). I had to make judgment calls on reviewing plausibly-relevant orgs vs. saving time, and a different reviewer might have made different calls.&lt;/p&gt;

      &lt;p&gt;In fact, I think I would be more likely to donate to (e.g.) AI Impacts than (e.g.) METR, so why did I write about METR but not AI Impacts? Mainly because I had already put some thought into METR and I figured I might as well write them down, but I haven’t put much thought into AI Impacts. &lt;a href=&quot;#fnref:29&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:49&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Some Open Philanthropy employees stand to make money if AI companies do well; and Holden Karnofsky (who no longer works at Open Philanthropy, but used to run it) has &lt;a href=&quot;https://forum.effectivealtruism.org/posts/Pfayu5Bf2apKreueD/a-playbook-for-ai-risk-reduction-focused-on-misaligned-ai&quot;&gt;expressed&lt;/a&gt; that he expects us to avert x-risk by an AI company internally solving alignment. &lt;a href=&quot;#fnref:49&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:27&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;It’s nice that AI safety is so much better funded than it used to be, but for my own sake, I kind of miss the days when there were only like five x-risk orgs. &lt;a href=&quot;#fnref:27&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:26&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The names made it difficult for me to edit this post. Many times I would be re-reading a sentence where I referenced some org, and I wouldn’t remember which org it was. “Centre for the Governance of AI? Wait, is that the one that runs polls? Or the one that did SB 1047? Er, no, it’s the one that spun out of the Future of Humanity Institute.”&lt;/p&gt;

      &lt;p&gt;Shout-out to Lightcone, Palisade, and Sentinel for having memorable names. &lt;a href=&quot;#fnref:26&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:28&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;But maybe that’s a silly thing to worry about. Compare: “I only invest in companies that don’t have managers. Managers’ salaries just take away money from the employees who do the real work.” &lt;a href=&quot;#fnref:28&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:61&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I’m not convinced that it’s good practice, but at least some people believe it is. &lt;a href=&quot;#fnref:61&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:43&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Also, I’m not sure how much consideration to give this, but I have a vague sense that sharing criticism with the orgs being criticized would hurt my epistemics. Like, maybe if I talk to them, I will become overly predisposed toward politeness and end up deleting accurate criticisms that I should’ve left in. &lt;a href=&quot;#fnref:43&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:73&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The word “funge” is not in the dictionary; I would define it as “causing a &lt;a href=&quot;https://en.wikipedia.org/wiki/Fungibility&quot;&gt;fungible&lt;/a&gt; good [in this case, money] to be used for a different purpose.” That is, causing SFF to give some of its money to a different nonprofit. &lt;a href=&quot;#fnref:73&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:22&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;To be honest it’s not that funny given the stakes, but I try to find a little humor where I can. &lt;a href=&quot;#fnref:22&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:67&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The linked source isn’t why I believe this claim; I believe it based on things I’ve heard in personal communications. &lt;a href=&quot;#fnref:67&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:46&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Examples:&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;Andrea brought up the classic argument that AI becomes really dangerous once it’s self-improving. But, he said, it’s not clear what exactly counts as self-improving. It’s already something like self-improving because many ML engineers use LLMs to help them with work tasks. Andrea proposed that the really dangerous time starts once AI is about as competent as remote workers, because that’s when you can massively accelerate the rate of progress. I don’t have a strong opinion on whether that’s true, but it made me think.&lt;/li&gt;
        &lt;li&gt;Andrea said it’s a big problem that we don’t have a “science of intelligence”. We don’t really know what it means for AIs to be smart, all we have is a hodgepodge of benchmarks. We can’t properly evaluate AI capabilities unless we have a much better understanding of what intelligence is.&lt;/li&gt;
      &lt;/ul&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:46&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:45&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;To be clear: sometimes, when people say “… but I don’t know if X”, that’s a polite way of saying “I believe not-X.” In this case, that’s not what I mean—what I mean is that I don’t know. &lt;a href=&quot;#fnref:45&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:31&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The report is about defending against cyber-attacks, not about executing cyber-attacks. But it’s also about how to increase AI capabilities. (And an AI that’s smarter about defending cyber-attacks might also be better at executing them.) I can see a good argument that this work is net negative but there’s an argument the other way, too. &lt;a href=&quot;#fnref:31&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:64&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Several sources claim Eliezer’s P(doom) &amp;gt; 95% but their source is a &lt;a href=&quot;https://www.fastcompany.com/90994526/pdoom-explained-how-to-calculate-your-score-on-ai-apocalypse-metric&quot;&gt;news article&lt;/a&gt; and the news article doesn’t cite a source. I could not find any direct quote. &lt;a href=&quot;#fnref:64&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:65&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;At least you should have the same goals, if not the same tactics. &lt;a href=&quot;#fnref:65&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:68&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I would not have thought “write Harry Potter fan fiction” was a good strategy, but I turned out to be wrong on that one. &lt;a href=&quot;#fnref:68&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:23&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Although to be fair, that’s also why most AI “safety” researchers do capabilities research. &lt;a href=&quot;#fnref:23&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:66&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;For example, various studies have looked at natural experiments where protests do or do not occur based on whether it rains, and they find that protesters’ positions are slightly more popular when it does not rain. The effect shows up repeatedly across multiple studies of different movements. &lt;a href=&quot;#fnref:66&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:72&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I don’t have meaningful insight into whether any particular person would be good at lobbying. I think I can identify that most people would be bad at it, so the best I can do is fail to find any reasons to expect someone to be bad. I don’t see any reasons to expect Felix to be bad, except that he’s junior but that’s a weak reason. &lt;a href=&quot;#fnref:72&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:76&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Outside of the top five, I didn’t think about these classifications very hard. &lt;a href=&quot;#fnref:76&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:32&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;For example, I would probably fund &lt;a href=&quot;https://gfi.org/&quot;&gt;Good Food Institute&lt;/a&gt; ahead of most of these. &lt;a href=&quot;#fnref:32&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:69&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;That screens off most of the policy orgs on my list. &lt;a href=&quot;#fnref:69&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:70&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;That screens off Control AI and Existential Risk Observatory. &lt;a href=&quot;#fnref:70&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:71&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;That screens off Center for AI Safety and Future of Life Institute. &lt;a href=&quot;#fnref:71&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:74&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Only MIRI is pursuing the sorts of strategies that MIRI is pursuing, but it has &amp;gt;100x more money than PauseAI. &lt;a href=&quot;#fnref:74&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:60&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;When I made the donation to PauseAI Global, I was under the impression that PauseAI Global and PauseAI US were one organization. That was true at one point, but they had split by the time I made this donation. If I had known that, I would have donated to PauseAI US instead. But I’m not bothered by it because I still think donations to PauseAI Global have high expected value.&lt;/p&gt;

      &lt;p&gt;Also, when I donated the money, I wasn’t planning on writing a whole post. It wasn’t until later that I decided to do a proper investigation and write what you’re currently reading. That’s why I didn’t wait before donating. &lt;a href=&quot;#fnref:60&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:60:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Subjects in Psych Studies Are More Rational Than Psychologists</title>
				<pubDate>Tue, 15 Oct 2024 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2024/10/15/subjects_are_more_rational_than_psychologists/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/10/15/subjects_are_more_rational_than_psychologists/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Psychologists have done experiments that supposedly show how people behave irrationally. But in some of those experiments, people &lt;em&gt;do&lt;/em&gt; behave rationally, and it’s the psychologists’ expectations that are irrational.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#ultimatum-game&quot; id=&quot;markdown-toc-ultimatum-game&quot;&gt;Ultimatum game&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#hyperbolic-discounting&quot; id=&quot;markdown-toc-hyperbolic-discounting&quot;&gt;Hyperbolic discounting&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#dunning-kruger-effect&quot; id=&quot;markdown-toc-dunning-kruger-effect&quot;&gt;Dunning-Kruger effect&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#non-examples-asch-and-milgram&quot; id=&quot;markdown-toc-non-examples-asch-and-milgram&quot;&gt;Non-examples: Asch and Milgram&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#themes&quot; id=&quot;markdown-toc-themes&quot;&gt;Themes&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;ultimatum-game&quot;&gt;Ultimatum game&lt;/h2&gt;

&lt;p&gt;The &lt;a href=&quot;https://en.wikipedia.org/wiki/Ultimatum_game&quot;&gt;ultimatum game&lt;/a&gt; works like this:&lt;/p&gt;

&lt;p&gt;There are two participants, call them Alice and Bob. Alice receives $100. Alice then chooses some amount of that money to offer to Bob. Bob can accept the offer and they both walk away with their money, or he can reject the offer in which case they both get nothing.&lt;/p&gt;

&lt;p&gt;According to standard theory, Alice should offer Bob $1 and keep $99 for herself, and Bob should accept the offer because a dollar is better than no dollars. In practice, most “Alices” offer something like a 50/50 split, and most “Bobs” reject the offer if given something like an uneven split. Psychologists&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; are confused about this supposedly irrational behavior.&lt;/p&gt;

&lt;p&gt;But in fact, Alice and Bob both behave rationally. Bob rejects unfair offers, which sometimes leaves him with less money. But Alice &lt;em&gt;expects&lt;/em&gt; Bob to reject unfair offers, so she offers Bob a 50/50 split.&lt;/p&gt;

&lt;p&gt;Bob follows a general strategy of rejecting unfair offers, because he believes or intuits that this strategy will ensure he mostly receives fair offers. And Alice knows Bob will probably reject an unfair offer because that’s what most people do. So Bob has (acausally) induced Alice to give him $50 instead of $1. In the world where Bob rejects unfair offers, Bob &lt;a href=&quot;https://www.lesswrong.com/posts/4ARtkT3EYox3THYjF/rationality-is-systematized-winning&quot;&gt;wins&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In other words, most people don’t use &lt;a href=&quot;https://www.lesswrong.com/tag/causal-decision-theory&quot;&gt;causal decision theory&lt;/a&gt;.&lt;/p&gt;
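
&lt;p&gt;To make this concrete, here’s a toy model in Python (my own sketch; the best-response assumption and dollar amounts are illustrative, not from any study). The point is that Alice’s offer depends on the rejection policy she expects Bob to follow, so the policy of rejecting unfair offers wins even though any individual rejection loses money:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Toy ultimatum game. Assumption: Alice knows which policy Bob follows
# and best-responds by offering the least that policy will accept.

def respond(offer, threshold):
    # Bob rejects any offer below his fairness threshold.
    return offer if offer &amp;gt;= threshold else 0

def alice_offer(threshold):
    # Alice keeps the most she can: she offers the smallest amount
    # the policy accepts (at least $1, per the textbook story).
    return max(threshold, 1)

# Causal view: holding a $1 offer fixed, rejecting loses money.
print(respond(1, threshold=0))    # 1
print(respond(1, threshold=50))   # 0

# Policy view: the offer depends on the policy, so the Bob who
# credibly rejects unfair offers is the Bob who receives fair offers.
for t in (0, 50):
    print(t, respond(alice_offer(t), t))   # 0 gets $1; 50 gets $50
&lt;/code&gt;&lt;/pre&gt;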

&lt;h2 id=&quot;hyperbolic-discounting&quot;&gt;Hyperbolic discounting&lt;/h2&gt;

&lt;p&gt;The standard model of rational behavior assumes exponential discounting: future goods decrease in value at a fixed rate (say, 10% per year). But experiments show that many people use &lt;a href=&quot;https://en.wikipedia.org/wiki/Hyperbolic_discounting&quot;&gt;hyperbolic discounting&lt;/a&gt;, which means the value of future goods first falls off rapidly, and then slowly.&lt;/p&gt;

&lt;p&gt;The classic experiment:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Ask people if they would prefer $100 now or $120 a month from now. Most people prefer the $100 now.&lt;/li&gt;
  &lt;li&gt;Ask people if they would prefer $100 in 12 months or $120 in 13 months. Most people prefer the $120 in 13 months.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Psychology textbooks say this is irrational: the monthly discount rate must be either more than 20% (in which case they should take the money sooner in both cases) or less than 20% (in which case they should take the money later). But in real life, people’s behavior makes sense.&lt;/p&gt;

&lt;p&gt;If you say you’re going to give me money right now, you’re probably going to do it. If you say you’re going to give me $120 a month from now, I don’t know what’s going to happen. Maybe you’ll forget, maybe your study will run out of funding, I don’t know. So I’d rather have the money now.&lt;/p&gt;

&lt;p&gt;If you can give me money in 12 months, you can most likely also give me money in 13 months, so I’m not too concerned about waiting an extra month in that case.&lt;/p&gt;

&lt;p&gt;In mathematical terms, the probability that you’ll give me the money decreases hyperbolically with time, not exponentially, so I ought to use hyperbolic discounting.&lt;/p&gt;

&lt;p&gt;(Plus there is another, more technical, reason to use hyperbolic discounting: if the “true” discount function is exponential but you don’t know what exact discount rate to use, then your &lt;em&gt;expected&lt;/em&gt; discount rate starts out high and decreases over time, which produces a hyperbolic discount function.)&lt;/p&gt;
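
&lt;p&gt;A quick numerical check of that parenthetical (my own sketch; the exponential prior over discount rates is an assumption, chosen because it makes the math come out exactly, but the qualitative result holds for other priors too):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

rng = np.random.default_rng(0)
r0 = 0.10                                    # mean discount rate (assumed)
r = rng.exponential(scale=r0, size=200_000)  # uncertain true rate

for t in [1, 5, 10, 30]:
    expected = np.mean(np.exp(-r * t))  # E[exp(-r t)] over uncertainty in r
    hyperbolic = 1 / (1 + r0 * t)       # exact answer for this prior
    print(t, round(expected, 4), round(hyperbolic, 4))

# The two columns agree: averaging exponential discount curves over an
# uncertain rate yields a hyperbolic curve, which falls off rapidly at
# first and then slowly, matching the behavior in the experiment.
&lt;/code&gt;&lt;/pre&gt;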

&lt;h2 id=&quot;dunning-kruger-effect&quot;&gt;Dunning-Kruger effect&lt;/h2&gt;

&lt;p&gt;The Dunning-Kruger effect is often misrepresented as showing that ignorant people think they’re knowledgeable, and knowledgeable people think they’re ignorant. To my knowledge, no study has ever shown that.&lt;/p&gt;

&lt;p&gt;The original Kruger &amp;amp; Dunning (1999)&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; paper, and some follow-up studies, produced graphs that looked something like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Dunning-Kruger.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;(This graph is an illustration, not based on actual data.)&lt;/p&gt;

&lt;p&gt;Notice two things about this graph:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The “perceived ability” line is a bit too high.&lt;/li&gt;
  &lt;li&gt;The slope is too flat—everyone estimates their ability as closer to average than it really is.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;#1 is a real (albeit small) bias—people systematically overestimate their abilities by a little bit. But the flattened slope of perceived ability is rational, for two reasons.&lt;/p&gt;

&lt;p&gt;Reason 1. If your test has some degree of noise, then you expect the top scorers to have gotten lucky and the bottom scorers to have gotten unlucky. The top scorers aren’t quite as good as the test makes them look, and the bottom scorers aren’t quite as &lt;em&gt;bad&lt;/em&gt; as the test makes them look. So the &lt;em&gt;true&lt;/em&gt; skill curve is flatter than the &lt;em&gt;measured&lt;/em&gt; skill curve. For more on this, see &lt;a href=&quot;https://gwern.net/doc/iq/2020-gignac.pdf&quot;&gt;Gignac &amp;amp; Zajenkowski (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;Reason 2. People do not have perfect knowledge of their own abilities. If you know nothing else, you should expect to perform about average. If you know a little bit about your own skill level, you should expect to perform a little below or a little above average. Even if you’re in the top 10% or bottom 10% of skill, it should take a lot of evidence to convince you of that, so it’s perfectly rational for you to estimate your own skill as closer to average than it really is.&lt;/p&gt;

&lt;p&gt;Both these reasons say essentially the same thing: if you have a noisy measure of ability, your belief conditional on that measure should be closer to average than what the measure itself shows. This produces a flattened curve like what we see in the Dunning-Kruger effect, and it’s not a bias but a perfectly rational way of reasoning about imperfect evidence.&lt;/p&gt;
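
&lt;p&gt;Both reasons are easy to reproduce in a simulation (my own sketch; the normal distributions and unit noise levels are assumptions chosen for simplicity):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

rng = np.random.default_rng(0)
n = 100_000
skill = rng.normal(0, 1, n)             # true ability
test = skill + rng.normal(0, 1, n)      # noisy test score (Reason 1)
signal = skill + rng.normal(0, 1, n)    # noisy self-knowledge (Reason 2)

# With a standard normal prior and unit signal noise, the rational
# (posterior-mean) self-estimate shrinks the signal halfway to average.
perceived = 0.5 * signal

quartile = np.digitize(test, np.quantile(test, [0.25, 0.5, 0.75]))
for q in range(4):
    m = quartile == q
    print(q + 1, round(test[m].mean(), 2),
          round(skill[m].mean(), 2), round(perceived[m].mean(), 2))

# Within each test-score quartile, mean true skill is closer to average
# than the test suggests, and mean perceived skill is flatter still:
# the classic flattened curve, with no irrationality in the model.
# (This reproduces the flat slope only, not the small upward bias.)
&lt;/code&gt;&lt;/pre&gt;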

&lt;h2 id=&quot;non-examples-asch-and-milgram&quot;&gt;Non-examples: Asch and Milgram&lt;/h2&gt;

&lt;p&gt;Originally I wanted to include the &lt;a href=&quot;https://en.wikipedia.org/wiki/Asch_conformity_experiments&quot;&gt;Asch conformity experiments&lt;/a&gt; and the &lt;a href=&quot;https://en.wikipedia.org/wiki/Milgram_experiment&quot;&gt;Milgram experiments&lt;/a&gt; on this list—I thought participants behaved rationally. But after doing some more research, I think the rational-participant theory can’t fully explain the observations.&lt;/p&gt;

&lt;p&gt;(If you’re not familiar with the Asch or Milgram experiments, click the links in the previous paragraph for explanations. They’re a bit too complicated to explain in this post.)&lt;/p&gt;

&lt;p&gt;For Asch, I originally wanted to argue that it’s rational to update your beliefs when other people disagree with you. But the experiments found that people are more willing to disagree with the group when they can give their answers in secret. If people were rationally updating on others’ beliefs, then they’d agree with the group in secret as well as in public.&lt;/p&gt;

&lt;p&gt;I still think it’s rational for subjects to update their beliefs toward those of the other participants, but the experiments suggest that people update more than they should.&lt;/p&gt;

&lt;p&gt;For Milgram, I wanted to argue that participants who continue to administer electric shocks have rationally (and correctly!) deduced that nothing bad is happening. This is consistent with people’s behaviors, but it’s not consistent with their stated beliefs (almost all participants report believing that it was real) or their observed emotional state (many of them were visibly sweating/trembling/having nervous fits).&lt;/p&gt;

&lt;h2 id=&quot;themes&quot;&gt;Themes&lt;/h2&gt;

&lt;p&gt;What can we learn from how psychologists misinterpret these experiments?&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;People consider more types of evidence than psychologists think they do.&lt;/li&gt;
  &lt;li&gt;People intuitively use non-causal decision theories.&lt;/li&gt;
&lt;/ul&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Or at least psychology textbooks, it’s possible that many actual psychologists are smarter about this. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Kruger, J., &amp;amp; Dunning, D. (1999). Unskilled and unaware of it: how difficulties in recognizing one’s own incompetence lead to inflated self-assessments. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Gignac, G. E., &amp;amp; Zajenkowski, M. (2020). &lt;a href=&quot;https://gwern.net/doc/iq/2020-gignac.pdf&quot;&gt;The Dunning-Kruger effect is (mostly) a statistical artefact: Valid approaches to testing the hypothesis with individual differences data.&lt;/a&gt; &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>My Submission for Worst Argument In The World</title>
				<pubDate>Sat, 12 Oct 2024 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2024/10/12/worst_argument_in_the_world/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/10/12/worst_argument_in_the_world/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Scott Alexander once &lt;a href=&quot;https://www.lesswrong.com/posts/yCWPkLi8wJvewPbEp/the-noncentral-fallacy-the-worst-argument-in-the-world&quot;&gt;wrote&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;David Stove once &lt;a href=&quot;https://web.maths.unsw.edu.au/~jim/worst.html&quot;&gt;ran a contest&lt;/a&gt; to find the Worst Argument In The World, but he awarded the prize to his own entry, and one that shored up his politics to boot. It hardly seems like an objective process.&lt;/p&gt;

  &lt;p&gt;If he can unilaterally declare a Worst Argument, then so can I.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If those guys can unilaterally declare a Worst Argument, then so can I. I declare the Worst Argument In The World to be this:&lt;/p&gt;

&lt;p&gt;“A long time ago, not-A, and also, not-B. Now, A and B. Therefore, A caused B.”&lt;/p&gt;

&lt;p&gt;Example: In 1820, pirates were everywhere. Now you hardly ever see pirates, and global temperatures are rising. Therefore, the lack of pirates caused global warming.&lt;/p&gt;

&lt;p&gt;(This particular argument was originally made as a &lt;a href=&quot;https://www.spaghettimonster.org/pages/about/open-letter/&quot;&gt;joke&lt;/a&gt;, but I will give some real examples later.)&lt;/p&gt;

&lt;p&gt;Naming fallacies is hard. Maybe we could call this the “two distant points in time fallacy”. For now I’ll just call it the Worst Argument.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;The Worst Argument is a special case of the &lt;a href=&quot;https://en.wikipedia.org/wiki/Post_hoc_ergo_propter_hoc&quot;&gt;post hoc ergo propter hoc&lt;/a&gt; fallacy: “A happened before B, therefore A caused B.”&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; I find this special case to be particularly bad. &lt;em&gt;Post hoc ergo propter hoc&lt;/em&gt; can make a bit of sense sometimes: maybe B happens immediately after A, maybe A and B repeatedly appear together. That doesn’t definitively establish causality, but if you have data showing that B always comes right after A, I want to hear about it.&lt;/p&gt;

&lt;p&gt;The thing about the Worst Argument is that it provides almost no evidence of anything. You can’t just look at two distant points in time, check one independent variable, and assume that the independent variable explains the dependent variable. So many other things could have happened! It’s an incredible leap to say that the rise in global temperatures &lt;em&gt;must&lt;/em&gt; be caused by the decline in pirates when about one zillion other things changed between 1820 and today.&lt;/p&gt;

&lt;p&gt;The Worst Argument is the worst version of &lt;em&gt;post hoc ergo propter hoc&lt;/em&gt; because:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;It only looks at two data points.&lt;/li&gt;
  &lt;li&gt;The two data points are far apart in time.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you’re going to base your reasoning on &lt;em&gt;post hoc ergo propter hoc&lt;/em&gt;, at least give me a bunch of data points! Or at least make them be close together in time!&lt;/p&gt;

&lt;p&gt;The Worst Argument is so obviously bad that surely no one would ever think to invoke it, and if invoked, surely no one would take it seriously, right? And yet people make arguments of this form all the time. And I’ve been persuaded by arguments like this! The fact that this fallacy keeps fooling people (including me) makes it my choice for Worst Argument In The World.&lt;/p&gt;

&lt;p&gt;Let me give some real-world examples. These statements aren’t all wrong, but the arguments are all bad. (It’s possible to make a fallacious argument in support of a true conclusion.&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;)&lt;/p&gt;

&lt;h2 id=&quot;school-uniforms-prevent-crime&quot;&gt;School uniforms prevent crime&lt;/h2&gt;

&lt;p&gt;When I was in 9th grade, I spent a lot of time on debate.org, a (now-defunct) website that hosted online written debates. I &lt;a href=&quot;https://web.archive.org/web/20100410100917/http://www.debate.org/debates/School-uniforms-ought-to-be-worn-in-primary-and-secondary-schools./1/&quot;&gt;participated in a debate&lt;/a&gt; on the proposition “School uniforms ought to be worn in primary and secondary schools.” I took the negative. My opponent was one of the most renowned debaters on the site, with a record of 75 wins to 7 losses. I was a little nervous to go up against him and I wanted to do my best.&lt;/p&gt;

&lt;p&gt;The debate lasted for three rounds. After the first two rounds, I felt like I was losing, and I spent hours poring over my opponent’s arguments to figure out how to write my final rebuttal. At some point, I had an epiphany: I realized that his central argument was flimsy, but his rhetoric was so good that I hadn’t noticed.&lt;/p&gt;

&lt;p&gt;His argument was essentially this:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;In 1993, the Long Beach school system did not require uniforms, and it had high rates of school crime.&lt;/li&gt;
  &lt;li&gt;Six years later, it required school uniforms, and it had dramatically lower rates of crime.&lt;/li&gt;
  &lt;li&gt;Therefore, school uniforms must have caused the reduction in crime.&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Nothing else pertinent could have possibly happened in those 6 years, right? They couldn’t have instituted new policies for handling troublesome students, or changed their rules for reporting crimes, or anything like that. It could &lt;em&gt;only&lt;/em&gt; have been the school uniforms.&lt;/p&gt;

&lt;p&gt;In my final rebuttal, I focused on emphasizing the badness of this argument. It was enough to persuade the readers, and I am happy to say that my opponent now has 8 losses.&lt;/p&gt;

&lt;h2 id=&quot;franklin-d-roosevelt-solved-the-great-depression&quot;&gt;Franklin D. Roosevelt solved the Great Depression&lt;/h2&gt;

&lt;p&gt;An argument that I believed for a long time:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The Great Depression started in 1929.&lt;/li&gt;
  &lt;li&gt;FDR was elected in 1932.&lt;/li&gt;
  &lt;li&gt;The Great Depression ended in 1939, while FDR was in office.&lt;/li&gt;
  &lt;li&gt;Therefore, FDR solved the Great Depression.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Plenty of other things happened between 1932 and 1939. An economy can shift quite a lot in seven years, for many reasons other than who’s president. A typical economic cycle only lasts something like 8 years, and the typical recession only lasts one or two years. After that long you’d expect the economy to look very different, no matter who the president is.&lt;/p&gt;

&lt;p&gt;You might say, hey, aren’t there better arguments about how specifically FDR’s policies could have improved the economy? Yeah, there are, but that’s not why I used to believe FDR had great economic policies. I believed it because I fell for the Worst Argument In The World.&lt;/p&gt;

&lt;p&gt;(Folks on the other side of the political spectrum like to say Reagan solved the economic problems of the 1970s and early 80s, and they make the exact same argument. I never believed that one for some reason…)&lt;/p&gt;

&lt;h2 id=&quot;labor-unions-and-strikes-were-largely-responsible-for-improved-working-conditions-in-the-19th-and-early-20th-centuries&quot;&gt;Labor unions and strikes were largely responsible for improved working conditions in the 19th and early 20th centuries&lt;/h2&gt;

&lt;p&gt;In high school, I read Howard Zinn’s &lt;em&gt;A People’s History of the United States&lt;/em&gt; and it convinced me of the importance of collective action. The book’s argument went like this:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Laborers had to work in really bad conditions.&lt;/li&gt;
  &lt;li&gt;They organized unions and held strikes.&lt;/li&gt;
  &lt;li&gt;Decades later, conditions were still bad but not quite as bad.&lt;/li&gt;
  &lt;li&gt;Therefore, strikes and labor unions were the cause of the improvement in conditions.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The book repeatedly makes this argument through a series of historical anecdotes.&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; As I recall, it never says the conclusion explicitly, but it’s strongly implied.&lt;/p&gt;

&lt;p&gt;At the time, I found the book’s anecdotes convincing. It wasn’t until years later that I realized how weak its case was.&lt;/p&gt;

&lt;p&gt;In fact, the book’s argument is even weaker than the Worst Argument In The World. It doesn’t say working conditions were bad, then strikes happened, then many years later conditions were good. It says working conditions were bad, then strikes happened, then conditions were still bad. Which is positive evidence that the strikes &lt;em&gt;didn’t&lt;/em&gt; help. Kind of amazing that I read this and came away with the opposite conclusion.&lt;/p&gt;

&lt;h2 id=&quot;seed-oils-cause-obesity&quot;&gt;Seed oils cause obesity&lt;/h2&gt;

&lt;p&gt;There are some arguments about seed oils that don’t rely on this fallacy, but the most popular argument I see goes like this:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;A century ago, obesity was rare, and also people didn’t eat a lot of seed oils.&lt;/li&gt;
  &lt;li&gt;Today, many more people are obese, and also people eat a lot of seed oils.&lt;/li&gt;
  &lt;li&gt;Therefore, obesity must be caused by seed oils.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I’m almost baffled that anyone finds this convincing, except that I myself was convinced by the Worst Argument In The World in at least two instances as documented above.&lt;/p&gt;

&lt;p&gt;Some people get more sophisticated and they draw a line of obesity going up and a line of something-kinda-like-seed-oil consumption (like vegetable oil consumption) going up and they say, look, the lines both go up! That’s not quite as bad as only looking at two data points, but it’s still pretty bad. (You can also draw a line of pirate populations going down and a line of global temperature going up.) It would be better to go a little further and show that the two lines track each other well. That wouldn’t establish causation but at least it establishes correlation.&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;h2 id=&quot;martin-luther-king-greatly-improved-civil-rights-for-african-americans&quot;&gt;Martin Luther King greatly improved civil rights for African-Americans&lt;/h2&gt;

&lt;p&gt;I’m sure there are good arguments for this statement out there somewhere, but I always hear people (implicitly) say things like:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;In the 1950s, segregation was legal and African-Americans didn’t have equal rights.&lt;/li&gt;
  &lt;li&gt;Then, Martin Luther King spearheaded civil rights protests. Also, a bunch of other stuff happened.&lt;/li&gt;
  &lt;li&gt;By the 1970s, America was a much better place for black people.&lt;/li&gt;
  &lt;li&gt;Therefore, the improvements in civil rights must have been largely caused by MLK.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;what-makes-it-the-worst-argument-in-the-world&quot;&gt;What makes it the Worst Argument In The World?&lt;/h2&gt;

&lt;p&gt;Scott Alexander’s &lt;a href=&quot;https://www.lesswrong.com/posts/yCWPkLi8wJvewPbEp/the-noncentral-fallacy-the-worst-argument-in-the-world&quot;&gt;Worst Argument In The World&lt;/a&gt; is a dirty debate tactic—you say something technically true that subtly invokes a hard-to-spot fallacy. My submission for Worst Argument is almost the opposite—you say something fallacious, and it’s pretty obviously fallacious if you think about it, but we fall for it anyway. The Worst Argument In The World reminds me of the &lt;a href=&quot;https://www.youtube.com/watch?v=KB_lTKZm1Ts&quot;&gt;basketball awareness test&lt;/a&gt;: &lt;span class=&quot;spoiler&quot;&gt;Like the moonwalking bear, it somehow sneaks into my brain without getting spotted.&lt;/span&gt;&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Which itself is a special case of the &lt;a href=&quot;https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation&quot;&gt;correlation implies causation&lt;/a&gt; fallacy. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;According to &lt;a href=&quot;https://en.wikipedia.org/wiki/Sturgeon%27s_law&quot;&gt;Sturgeon’s law&lt;/a&gt;, 90% of all arguments are crap, including arguments for true claims. &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I didn’t notice this at the time but now that I’m re-reading the debate, I’m reasonably sure the crime numbers are just fraudulent. They claim a 93% reduction in sex offenses. No intervention ever reduces crime by 93%.&lt;/p&gt;

      &lt;p&gt;I’m not sure claiming fraud would have been a good debate tactic, though, and the argument is still terrible even if the numbers are real. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I’m largely going based off memory, I haven’t read the book since high school. Based on skimming an online summary, it looks like the book made forms of this argument in at least chapters 10, 11, 13, 15, and 19. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I looked on a few seed-oil-hypothesis advocacy sites and I couldn’t find any proper statistics. This is not related to my main point but it was bugging me so I decided to do the statistics myself.&lt;/p&gt;

      &lt;p&gt;To my knowledge, we don’t have proper historical data on seed oil consumption. The best I could do was compare &lt;a href=&quot;https://www.ers.usda.gov/data-products/food-availability-per-capita-data-system/&quot;&gt;salad + cooking oil consumption&lt;/a&gt;&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; and &lt;a href=&quot;https://ourworldindata.org/grapher/share-of-adults-defined-as-obese?tab=chart&amp;amp;country=~USA&quot;&gt;obesity&lt;/a&gt;. Both go up over time, but they don’t track particularly well—for example, oil consumption spiked up around 1999, whereas obesity increased smoothly.&lt;/p&gt;

      &lt;p&gt;Salad + cooking oil consumption correlates with obesity at r = 0.93, but calendar year correlates with obesity at r = 0.994. If calendar year predicts obesity better than your independent variable, then your independent variable probably isn’t very good. (See &lt;a href=&quot;https://github.com/michaeldickens/public-scripts/blob/master/cooking-oil-obesity.xlsx&quot;&gt;here&lt;/a&gt; for my calculations.)&lt;/p&gt;
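
      &lt;p&gt;If you want to run this check on other trend-vs-trend claims, here’s a minimal sketch (the function is mine; it assumes you’ve loaded the three series yourself, e.g. from the spreadsheet above):&lt;/p&gt;

      &lt;pre&gt;&lt;code&gt;import numpy as np

def beats_calendar_year(years, predictor, outcome):
    # Does the proposed cause track the outcome better than a bare
    # linear time trend does? If not, the correlation is weak evidence.
    r_pred = np.corrcoef(predictor, outcome)[0, 1]
    r_year = np.corrcoef(years, outcome)[0, 1]
    print(round(r_pred, 3), round(r_year, 3))
    return abs(r_pred) &amp;gt; abs(r_year)
&lt;/code&gt;&lt;/pre&gt;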

      &lt;p&gt;(Exercise for the reader: What is the correlation between obesity and pirate sightings?) &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Lee, J. H., Duster, M., Roberts, T., &amp;amp; Devinsky, O. (2022). &lt;a href=&quot;https://doi.org/10.3389%2Ffnut.2021.748847&quot;&gt;United States Dietary Trends Since 1800: Lack of Association Between Saturated Fatty Acid Consumption and Non-communicable Diseases.&lt;/a&gt; doi: &lt;a href=&quot;https://doi.org/10.3389/fnut.2021.748847&quot;&gt;10.3389/fnut.2021.748847&lt;/a&gt; &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Outlive: A Critical Review</title>
				<pubDate>Thu, 26 Sep 2024 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2024/09/26/outlive_a_critical_review/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/09/26/outlive_a_critical_review/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;Last updated 2025-07-04.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://peterattiamd.com/outlive/&quot;&gt;Outlive: The Science &amp;amp; Art of Longevity&lt;/a&gt; by Peter Attia (with Bill Gifford&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;) gives Attia’s prescription on how to live longer and stay healthy into old age. In this post, I critically review some of the book’s scientific claims that stood out to me.&lt;/p&gt;

&lt;p&gt;This is not a comprehensive review. I didn’t review assertions that I was pretty sure were true (ex: &lt;a href=&quot;https://en.wikipedia.org/wiki/VO2_max&quot;&gt;VO2 max&lt;/a&gt; improves longevity), or that were hard for me to evaluate (ex: the mechanics of how LDL cholesterol functions in the body), or that I didn’t care about (ex: sleep deprivation impairs one’s ability to identify facial expressions).&lt;/p&gt;

&lt;p&gt;First, some general notes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;I have no expertise on any of the subjects in this post. I evaluated claims by doing shallow readings of relevant scientific literature, especially meta-analyses.&lt;/li&gt;
  &lt;li&gt;There is a spectrum between two ways of being wrong, from “pop science book pushes a flashy attention-grabbing thesis with little regard for truth” to “careful truth-seeking author isn’t infallible”. &lt;em&gt;Outlive&lt;/em&gt; makes it 75% of the way to the latter.&lt;/li&gt;
  &lt;li&gt;If I wrote a book that covered this many entirely different scientific fields, I would get a lot more things wrong than &lt;em&gt;Outlive&lt;/em&gt; did. (I probably get a lot of things wrong in this post.)&lt;/li&gt;
  &lt;li&gt;When making my assessments, I give numeric credences and also use terms such as “true” and “likely true”. The numbers give my all-things-considered subjective credences, and the qualitative terms give my interpretation of the strength of the empirical evidence. For example, if the scientific evidence suggests that a claim is 75% likely and I understand the evidence well, then I rate the claim as “likely true”. If I only read the abstract of a single meta-analysis, and the abstract unequivocally supports the claim but I’m only 75% sure that the meta-analysis can be trusted, then I rate it as “true”. Both claims receive a 75% credence.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now let’s have a look at some claims from &lt;em&gt;Outlive&lt;/em&gt;, broken down into four categories: disease, exercise, nutrition, and sleep.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#disease&quot; id=&quot;markdown-toc-disease&quot;&gt;Disease&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#people-with-metabolically-healthy-obesity-do-not-have-elevated-mortality-risk&quot; id=&quot;markdown-toc-people-with-metabolically-healthy-obesity-do-not-have-elevated-mortality-risk&quot;&gt;People with metabolically healthy obesity do not have elevated mortality risk&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#amyloid-beta-is-implicated-in-alzheimers-disease&quot; id=&quot;markdown-toc-amyloid-beta-is-implicated-in-alzheimers-disease&quot;&gt;Amyloid beta is implicated in Alzheimer’s disease&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#hdl-cholesterol-on-its-own-doesnt-prevent-heart-disease&quot; id=&quot;markdown-toc-hdl-cholesterol-on-its-own-doesnt-prevent-heart-disease&quot;&gt;HDL cholesterol on its own doesn’t prevent heart disease&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#exercise&quot; id=&quot;markdown-toc-exercise&quot;&gt;Exercise&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#vo2max-is-the-best-predictor-of-longevity&quot; id=&quot;markdown-toc-vo2max-is-the-best-predictor-of-longevity&quot;&gt;VO2max is the best predictor of longevity&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#you-should-train-vo2max-by-doing-hiit-at-the-maximum-sustainable-pace&quot; id=&quot;markdown-toc-you-should-train-vo2max-by-doing-hiit-at-the-maximum-sustainable-pace&quot;&gt;You should train VO2max by doing HIIT at the maximum sustainable pace&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#you-should-do-3-hoursweek-of-zone-2-training-and-one-or-two-sessionsweek-of-hiit&quot; id=&quot;markdown-toc-you-should-do-3-hoursweek-of-zone-2-training-and-one-or-two-sessionsweek-of-hiit&quot;&gt;You should do 3+ hours/week of zone 2 training and one or two sessions/week of HIIT&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#stability-is-as-important-as-cardiovascular-fitness-and-strength&quot; id=&quot;markdown-toc-stability-is-as-important-as-cardiovascular-fitness-and-strength&quot;&gt;Stability is as important as cardiovascular fitness and strength&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#nutrition&quot; id=&quot;markdown-toc-nutrition&quot;&gt;Nutrition&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#rhesus-monkey-studies-suggest-that-calorie-restriction-improves-longevity-but-only-if-you-eat-a-fairly-unhealthy-diet&quot; id=&quot;markdown-toc-rhesus-monkey-studies-suggest-that-calorie-restriction-improves-longevity-but-only-if-you-eat-a-fairly-unhealthy-diet&quot;&gt;Rhesus monkey studies suggest that calorie restriction improves longevity but only if you eat a fairly unhealthy diet&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#the-data-are-unclear-on-whether-reducing-saturated-fat-intake-is-beneficial&quot; id=&quot;markdown-toc-the-data-are-unclear-on-whether-reducing-saturated-fat-intake-is-beneficial&quot;&gt;The data are unclear on whether reducing saturated fat intake is beneficial&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#people-should-take-omega-3-supplements&quot; id=&quot;markdown-toc-people-should-take-omega-3-supplements&quot;&gt;People should take omega-3 supplements&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#sleep&quot; id=&quot;markdown-toc-sleep&quot;&gt;Sleep&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#every-animal-sleeps&quot; id=&quot;markdown-toc-every-animal-sleeps&quot;&gt;Every animal sleeps&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#we-need-to-sleep-75-to-85-hours-a-night&quot; id=&quot;markdown-toc-we-need-to-sleep-75-to-85-hours-a-night&quot;&gt;We need to sleep 7.5 to 8.5 hours a night&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#basketball-players-who-were-told-to-sleep-for-10-hours-a-night-had-better-shooting-accuracy&quot; id=&quot;markdown-toc-basketball-players-who-were-told-to-sleep-for-10-hours-a-night-had-better-shooting-accuracy&quot;&gt;Basketball players who were told to sleep for 10 hours a night had better shooting accuracy&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#lack-of-sleep-increases-obesity-and-diabetes-risk&quot; id=&quot;markdown-toc-lack-of-sleep-increases-obesity-and-diabetes-risk&quot;&gt;Lack of sleep increases obesity and diabetes risk&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#a-study-using-mendelian-randomization-found-that-sleeping-6-hours-a-night-increased-risk-of-a-heart-attack&quot; id=&quot;markdown-toc-a-study-using-mendelian-randomization-found-that-sleeping-6-hours-a-night-increased-risk-of-a-heart-attack&quot;&gt;A study using Mendelian randomization found that sleeping &amp;lt;6 hours a night increased risk of a heart attack&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#lack-of-sleep-causes-alzheimers-disease&quot; id=&quot;markdown-toc-lack-of-sleep-causes-alzheimers-disease&quot;&gt;Lack of sleep causes Alzheimer’s disease&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#bonus&quot; id=&quot;markdown-toc-bonus&quot;&gt;Bonus&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#dunning-kruger-effect&quot; id=&quot;markdown-toc-dunning-kruger-effect&quot;&gt;Dunning-Kruger effect&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#changelog&quot; id=&quot;markdown-toc-changelog&quot;&gt;Changelog&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;disease&quot;&gt;Disease&lt;/h1&gt;

&lt;h2 id=&quot;people-with-metabolically-healthy-obesity-do-not-have-elevated-mortality-risk&quot;&gt;People with metabolically healthy obesity do not have elevated mortality risk&lt;/h2&gt;

&lt;p&gt;A person is defined as having &lt;a href=&quot;https://en.wikipedia.org/wiki/Metabolic_syndrome&quot;&gt;metabolic syndrome&lt;/a&gt; if they show at least three out of five symptoms:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;abdominal obesity (i.e. large waist circumference)&lt;/li&gt;
  &lt;li&gt;high blood pressure&lt;/li&gt;
  &lt;li&gt;high blood sugar&lt;/li&gt;
  &lt;li&gt;high serum triglycerides&lt;/li&gt;
  &lt;li&gt;low HDL cholesterol&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;People with obesity but no metabolic syndrome are said to have &lt;strong&gt;metabolically healthy obesity&lt;/strong&gt; (MHO).&lt;/p&gt;

&lt;p&gt;Here’s what &lt;em&gt;Outlive&lt;/em&gt; has to say about MHO:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;A large meta-analysis of studies with mean follow-up time of 11.5 years showed that people [with metabolic syndrome] have more than triple the risk of all-cause mortality and/or cardiovascular events than metabolically healthy normal-weight individuals. Meanwhile, the metabolically healthy but obese subjects in these studies were not at significantly increased risk. (page 95)&lt;sup id=&quot;fnref:12&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;My assessment:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The statement “MHO subjects in these studies were not at significantly increased mortality risk” is technically correct: &lt;strong&gt;true&lt;/strong&gt; (credence: 90%&lt;sup id=&quot;fnref:59&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:59&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;).&lt;/li&gt;
  &lt;li&gt;Obesity has no negative health effects for metabolically healthy people: &lt;strong&gt;false&lt;/strong&gt; (credence: 5%).&lt;/li&gt;
  &lt;li&gt;Metabolically healthy obesity does not increase mortality risk: &lt;strong&gt;highly unlikely&lt;/strong&gt; (credence: 10%).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Outlive&lt;/em&gt; does not exactly say that MHO carries no elevated health risk, but some readers may come away with that impression, so I want to clarify that obesity is still bad for you even if you’re metabolically healthy.&lt;/p&gt;

&lt;p&gt;The book cites &lt;a href=&quot;https://doi.org/10.1016/j.cmet.2017.07.008&quot;&gt;Stefan et al. (2017)&lt;/a&gt;&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; which in turn cites &lt;a href=&quot;/materials/kramer2013.pdf&quot;&gt;Kramer et al. (2013)&lt;/a&gt;&lt;sup id=&quot;fnref:39&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:39&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;In a pooled analysis of 8 studies, metabolically healthy obese persons had a similar risk for all-cause mortality or CV [cardiovascular] events compared with the metabolically healthy normal-weight individuals (RR, 1.19; [95%] CI, 0.98 to 1.38). […] However, after we restricted analysis only to studies with at least 10 years of follow-up, the metabolically healthy obese group indeed had increased mortality and CV risk compared with the metabolically healthy normal-weight group (RR, 1.24; CI, 1.02 to 1.55; I^2 = 33.6%)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Stefan et al. (2017) also found that MHO subjects had higher rates of chronic disease than metabolically healthy normal-weight subjects, see &lt;a href=&quot;/assets/images/Stefan-2017-Figure-2.jpg&quot;&gt;Figure 2&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Stefan-2017-Figure-2.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;“MHO does not significantly increase risk” is one way you could describe this evidence. But it’s not the description I’d use:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;A &lt;a href=&quot;https://en.wikipedia.org/wiki/Relative_risk&quot;&gt;relative risk&lt;/a&gt; (RR) of 1.19 sounds pretty bad. This finding is (just barely) not statistically significant, but it still has a likelihood ratio of about 5:1 compared to RR = 1&lt;sup id=&quot;fnref:40&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:40&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;—that is, we are 5x more likely to see this result if MHO elevates mortality risk by 1.19x than if it doesn’t elevate risk at all. (I sketch a version of this calculation just after this list.)&lt;/li&gt;
  &lt;li&gt;MHO subjects had worse health across the board—higher rates of fatty liver disease, higher insulin resistance, lower cardiorespiratory fitness, more arterial plaque buildup—while also having non-significantly higher mortality. Which is a more reasonable interpretation: All those health problems somehow don’t translate into increased mortality? Or the health problems &lt;em&gt;do&lt;/em&gt; increase mortality, and the observed RR = 1.19 is real even though it’s not statistically significant? (And the true RR probably isn’t &lt;em&gt;exactly&lt;/em&gt; 1.19, but it’s somewhere in that vicinity.)&lt;/li&gt;
&lt;/ul&gt;
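
&lt;p&gt;Here is a rough way to reproduce that likelihood ratio (my own reconstruction; the footnote may compute it differently). Treat the log relative risk as normally distributed, back the standard error out of the 95% CI, and compare the likelihood of the observed estimate under “true RR equals the estimate” versus “true RR equals 1”:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

lo, hi = np.log(0.98), np.log(1.38)  # 95% CI from Kramer et al. (2013)
est = (lo + hi) / 2                  # log-RR point estimate
se = (hi - lo) / (2 * 1.96)          # standard error on the log scale

def normal_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

# Likelihood of the observed estimate if the true effect equals the
# estimate, versus if the true RR is exactly 1 (log-RR of zero).
lr = normal_pdf(est, est, se) / normal_pdf(est, 0.0, se)
print(round(lr, 1))   # about 4.5, the same ballpark as the 5:1 above
&lt;/code&gt;&lt;/pre&gt;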

&lt;p&gt;Many lament how often researchers treat p &amp;lt; 0.05 as “definitely real”, but it bugs me just as much when they treat p &amp;gt; 0.05 as “definitely no effect”.&lt;/p&gt;

&lt;p&gt;Ben Carpenter at Red Pen Reviews &lt;a href=&quot;https://www.redpenreviews.org/reviews/everything-fat-loss/&quot;&gt;writes&lt;/a&gt;:&lt;sup id=&quot;fnref:42&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:42&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Roughly &lt;a href=&quot;https://doi.org/10.1097%2FMD.0000000000008838&quot;&gt;50% of people&lt;/a&gt;&lt;sup id=&quot;fnref:41&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:41&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt; with metabolically healthy obesity will develop at least one metabolic abnormality within 3–10 years, which is double the risk of normal-weight individuals. Further, metabolically healthy obesity is still associated with an increased risk of &lt;a href=&quot;https://doi.org/10.1177/2047487315623884&quot;&gt;adverse cardiovascular events&lt;/a&gt;&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt;, &lt;a href=&quot;https://doi.org/10.1016/j.atherosclerosis.2017.03.035&quot;&gt;subclinical atherosclerosis&lt;/a&gt;&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt; (plaque buildup within the arterial walls), &lt;a href=&quot;https://doi.org/10.1038/ajg.2016.178&quot;&gt;nonalcoholic fatty liver disease&lt;/a&gt;&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt;, &lt;a href=&quot;https://doi.org/10.1002/oby.22134&quot;&gt;kidney function decline&lt;/a&gt;&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt;, and &lt;a href=&quot;https://doi.org/10.1111/obr.12157&quot;&gt;type II diabetes&lt;/a&gt;&lt;sup id=&quot;fnref:8&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt;. So, considering this evidence, being metabolically healthy with obesity is probably short-lived and still has health risks.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;amyloid-beta-is-implicated-in-alzheimers-disease&quot;&gt;Amyloid beta is implicated in Alzheimer’s disease&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;[&lt;a href=&quot;https://en.wikipedia.org/wiki/Amyloid_beta&quot;&gt;Amyloid beta&lt;/a&gt;] is clearly bad stuff. (page 181)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;My assessment: &lt;strong&gt;true&lt;/strong&gt; (credence: 90%).&lt;/p&gt;

&lt;p&gt;This claim stood out in my memory because a great deal of research on amyloid beta recently &lt;a href=&quot;https://www.science.org/content/article/potential-fabrication-research-images-threatens-key-theory-alzheimers-disease&quot;&gt;turned out to be fraudulent&lt;/a&gt;. But upon re-reading the relevant section of &lt;em&gt;Outlive&lt;/em&gt;, I found that none of the book’s claims relied on the fraudulent research, and in fact the book cites the fraud investigation itself:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;[Some scientists’] doubts seemed to be validated in July of 2022, when &lt;em&gt;Science&lt;/em&gt; published an article calling into question a widely cited 2006 study that had given new impetus to the amyloid theory, at a time when it had already seemed to be weakening.&lt;sup id=&quot;fnref:60&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:60&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;14&lt;/a&gt;&lt;/sup&gt; The 2006 study had pinpointed a particular subtype of amyloid that it claimed directly caused neurodegeneration. That in turn inspired numerous investigations into the subtype. But according to the &lt;em&gt;Science&lt;/em&gt; article, key images in that study had been falsified. (page 183–184)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The claims made in &lt;em&gt;Outlive&lt;/em&gt; about the apparent negative effects of amyloid beta come from studies that predate the 2006 fraud, and those older claims appear to hold up. The book accurately describes the current state of the field as far as I understand it: amyloid beta is associated with at least some cases of Alzheimer’s, but research into amyloid beta has so far failed to uncover any useful treatments.&lt;/p&gt;

&lt;h2 id=&quot;hdl-cholesterol-on-its-own-doesnt-prevent-heart-disease&quot;&gt;HDL cholesterol on its own doesn’t prevent heart disease&lt;/h2&gt;

&lt;p&gt;I wrote this claim in my notes on my first read-through, but upon re-reading, &lt;em&gt;Outlive&lt;/em&gt; never actually said this. I’m including this claim anyway so that readers know I have poor reading comprehension and can’t be trusted.&lt;sup id=&quot;fnref:107&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:107&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;15&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Here’s what the book actually said:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Risk does seem to decline as HDL-C rises to around the 80th percentile. But simply raising HDL cholesterol concentrations by brute force, with specialized drugs, has not been shown to reduce cardiovascular risk at all. (page 123)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;My assessment:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;HDL cholesterol on its own doesn’t prevent heart disease: &lt;strong&gt;false&lt;/strong&gt;, but &lt;em&gt;Outlive&lt;/em&gt; never said that—in fact, it said the opposite.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;exercise&quot;&gt;Exercise&lt;/h1&gt;

&lt;p&gt;The book (rightly) focuses more on exercise than on nutrition or sleep. &lt;del&gt;From what I can tell, it is the most scientifically accurate section of the book.&lt;/del&gt;&lt;/p&gt;

&lt;p&gt;Update 2025-05-05: I’ve come to learn more about exercise and longevity, and I now believe some of &lt;em&gt;Outlive&lt;/em&gt;’s claims on exercise are incorrect or overstated. The next three sections below explain my updated beliefs. They largely cover the same ground as my recent post,  &lt;a href=&quot;https://mdickens.me/2025/02/03/I_was_probably_wrong_about_HIIT_and_VO2max/&quot;&gt;I was probably wrong about HIIT and VO2max&lt;/a&gt;, but with better evidence and citations (citing meta-analyses instead of tweets).&lt;/p&gt;

&lt;h2 id=&quot;vo2max-is-the-best-predictor-of-longevity&quot;&gt;VO2max is the best predictor of longevity&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/VO2_max&quot;&gt;VO2max&lt;/a&gt; is the maximum amount of oxygen your body is capable of consuming. VO2max is commonly used as a measure of aerobic fitness.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Outlive&lt;/em&gt; never directly says VO2max is the best predictor of longevity, but it sure implies it. The book spends eight pages talking about the association between VO2max and longevity, and about how important it is.&lt;/p&gt;

&lt;p&gt;My assessment:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Increasing VO2max increases longevity: &lt;strong&gt;true&lt;/strong&gt; (credence: 97%).&lt;/li&gt;
  &lt;li&gt;The causal relationship between VO2max and longevity is as strong as &lt;em&gt;Outlive&lt;/em&gt; implies: &lt;strong&gt;false&lt;/strong&gt; (credence: 10%).&lt;/li&gt;
  &lt;li&gt;VO2max is the best proxy for physical fitness: &lt;strong&gt;false&lt;/strong&gt; (credence: 20%) – performance on fitness tests (e.g. maximum pace on an incline treadmill) is easier to measure and probably more accurate.&lt;sup id=&quot;fnref:116&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:116&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;16&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I have two objections to how &lt;em&gt;Outlive&lt;/em&gt; characterizes VO2max:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The book cites observational studies, not RCTs.&lt;/li&gt;
  &lt;li&gt;VO2max is a proxy for physical fitness, not physical fitness itself. And there are better proxies.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Correlation is not causation. I am confident that training to increase VO2max does, in fact, increase longevity. But I believe observational studies overstate the magnitude of the effect.&lt;/p&gt;

&lt;p&gt;I’ve looked at both observational studies and RCTs on the relationship between exercise and health, and my sense is that observational studies overstate the causal effect by about 2x. (It’s on my to-do list to write a post about why I believe this.) I don’t know about VO2max in particular, but I expect that only about half of its observed association with longevity is causal.&lt;/p&gt;

&lt;p&gt;However, even after cutting the effect in half, exercise is still the best general-purpose&lt;sup id=&quot;fnref:111&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:111&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;17&lt;/a&gt;&lt;/sup&gt; intervention for longevity.&lt;/p&gt;

&lt;p&gt;As for my second objection: VO2max is only a proxy for physical fitness. Here I will quote an &lt;a href=&quot;https://thegrowtheq.com/longevity-and-vo2max-does-it-matter/&quot;&gt;article&lt;/a&gt; by elite running coach Steve Magness, because I can’t explain it any better than he did:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;In practical terms, Vo2max is like knowing the size of a car’s engine, which is really important if we want to know about performance. But if we want to know whether that car has a chance to win the Daytona 500, engine size alone won’t tell us. We also need to know about the size of the fuel tank, about its fuel economy, about how long its tires will hold up, and about all the other small components that translate the power of engine to the speed of the car. It’s the same in humans. Vo2max is one of many components that, taken together, tell us about our holistic aerobic or cardiorespiratory fitness.&lt;/p&gt;

  &lt;p&gt;[…]&lt;/p&gt;

  &lt;p&gt;Vo2max matters. But it’s just one component of many that make up both performance and aerobic fitness. And that’s important because if we return to the original claims that Vo2max is the key indicator of longevity, we’ll find that the majority of the studies cited did NOT even use Vo2max as the main variable. They used performance! In the majority of research, peak speed and incline during the [Vo2max] test was the main correlate to longevity.&lt;/p&gt;

  &lt;p&gt;The large &lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/S0735109722052603#bib36&quot;&gt;study&lt;/a&gt;&lt;sup id=&quot;fnref:112&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:112&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;18&lt;/a&gt;&lt;/sup&gt; on 750,000 veterans that found a 4-fold higher mortality risk for low versus high fitness used peak speed and incline, not Vo2max. Same with the &lt;a href=&quot;https://jamanetwork.com/journals/jamanetworkopen/article-abstract/2707428&quot;&gt;research&lt;/a&gt;&lt;sup id=&quot;fnref:113&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:113&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;19&lt;/a&gt;&lt;/sup&gt; on 120,000 individuals finding a 5x difference in the risk of early death.&lt;/p&gt;

  &lt;p&gt;[…] And as we can see in this &lt;a href=&quot;https://www.sciencedirect.com/science/article/abs/pii/S0033062017300439&quot;&gt;meta-analysis&lt;/a&gt;&lt;sup id=&quot;fnref:114&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:114&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;20&lt;/a&gt;&lt;/sup&gt; looking at mortality and fitness, all but a handful of the included studies used an estimate based on speed or time.&lt;/p&gt;

  &lt;p&gt;You get the point.&lt;/p&gt;

  &lt;p&gt;And this is good news! It means you don’t need to go to a lab and measure your Vo2max. You don’t even need to worry about Vo2max itself (or your watche’s [sic] horrible estimation of it). All you need to do is focus on overall aerobic fitness. Which can easily be measured, compared, and improved in a number of ways that are less expensive and more accessible than Vo2max.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So, in short: most studies that measure “VO2max” are actually measuring performance on a fitness test. And this is good news, because it means if your performance is improving—if your mile time or your 5K time is getting faster—then you’re making real progress.&lt;/p&gt;

&lt;h2 id=&quot;you-should-train-vo2max-by-doing-hiit-at-the-maximum-sustainable-pace&quot;&gt;You should train VO2max by doing HIIT at the maximum sustainable pace&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Outlive&lt;/em&gt; recommends doing high-intensity interval training (&lt;a href=&quot;https://en.wikipedia.org/wiki/High-intensity_interval_training&quot;&gt;HIIT&lt;/a&gt;) to improve VO2max:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The tried-and-true formula for [VO2max training] is to go four minutes at the maximum pace you can sustain for this amount of time—not an all-out sprint, but still a very hard effort. Then ride or jog four minutes easy, which should be enough time for your heart rate to come back down to below about one hundred beats per minute. Repeat this four to six times and cool down. (page 249)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;My assessment:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The best exercise routine includes HIIT: &lt;strong&gt;likely true&lt;/strong&gt; (credence: 85%).&lt;/li&gt;
  &lt;li&gt;Longer intervals are better than shorter intervals: &lt;strong&gt;likely true&lt;/strong&gt; (credence: 85%).&lt;/li&gt;
  &lt;li&gt;HIIT should be done at the maximum sustainable pace: &lt;strong&gt;likely false&lt;/strong&gt; (credence: 25%).&lt;/li&gt;
  &lt;li&gt;HIIT is the best way to improve VO2max: &lt;strong&gt;likely true&lt;/strong&gt; (credence: 70%).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Four-minute intervals are reasonable: a meta-analysis by &lt;a href=&quot;/website/materials/wen2019.pdf&quot;&gt;Wen et al. (2019)&lt;/a&gt;&lt;sup id=&quot;fnref:120&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:120&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;21&lt;/a&gt;&lt;/sup&gt; found that ≥2-minute intervals produced bigger performance benefits than &amp;lt;2 minutes (although all interval durations improved VO2max).&lt;/p&gt;

&lt;p&gt;But I have two concerns:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Training at the maximum sustainable pace is a recipe for burnout. You should train at a pace that’s difficult, but not maximum effort.&lt;/li&gt;
  &lt;li&gt;There is mixed evidence on whether HIIT is the best way to improve VO2max. Low-intensity training might be just as good.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For the first concern:&lt;/p&gt;

&lt;p&gt;Taken literally, the quoted prescription is impossible to follow. If you do your first interval at the maximum sustainable pace, then your second interval cannot possibly be done at the same pace because you will be fatigued from the first interval.&lt;/p&gt;

&lt;p&gt;Perhaps Attia meant to say that you should go at the maximum pace you can sustain &lt;em&gt;across all intervals&lt;/em&gt;. That’s physically possible, but it still seems like a bad idea.&lt;/p&gt;

&lt;p&gt;I wanted to cite RCT evidence comparing HIIT at different intensities while controlling for duration (e.g., 4-minute intervals at maximum pace vs. 4-minute intervals at 90% of maximum pace). But I couldn’t find any. Instead, the evidence on this question comes from studies of how athletes train.&lt;/p&gt;

&lt;p&gt;Elite athletes typically do HIIT at roughly 90% of VO2max (&lt;a href=&quot;/materials/seiler2010&quot;&gt;Seiler (2010)&lt;/a&gt;&lt;sup id=&quot;fnref:121&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:121&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;22&lt;/a&gt;&lt;/sup&gt;). This is &lt;em&gt;considerably slower&lt;/em&gt; than the maximum sustainable pace (&lt;a href=&quot;/materials/billat2001.pdf&quot;&gt;Billat (2001)&lt;/a&gt;&lt;sup id=&quot;fnref:126&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:126&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;23&lt;/a&gt;&lt;/sup&gt;). For example, &lt;a href=&quot;https://doi.org/10.3109/13813459508996126&quot;&gt;Billat et al. (1995)&lt;/a&gt;&lt;sup id=&quot;fnref:122&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:122&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;24&lt;/a&gt;&lt;/sup&gt; found that a sample of elite long-distance runners could maintain 90% of VO2max for an average of 16.55 minutes.&lt;sup id=&quot;fnref:124&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:124&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;25&lt;/a&gt;&lt;/sup&gt; If you can maintain a certain pace for 16 minutes straight, then you can certainly run at that pace for four 4-minute intervals with rests in between.&lt;/p&gt;

&lt;p&gt;For the second concern: is HIIT really better than low-intensity training (LIT)&lt;sup id=&quot;fnref:130&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:130&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;26&lt;/a&gt;&lt;/sup&gt; for improving VO2max? Probably yes, but it’s unclear.&lt;/p&gt;

&lt;p&gt;The RCT evidence on this is mixed. A meta-analysis by &lt;a href=&quot;/website/materials/wen2019.pdf&quot;&gt;Wen et al. (2019)&lt;/a&gt;&lt;sup id=&quot;fnref:120:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:120&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;21&lt;/a&gt;&lt;/sup&gt; found that HIIT worked better than LIT for increasing VO2max. However, a review of meta-analyses by &lt;a href=&quot;https://doi.org/10.1155/2022/9310710&quot;&gt;Crowley et al. (2022)&lt;/a&gt;&lt;sup id=&quot;fnref:127&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:127&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;27&lt;/a&gt;&lt;/sup&gt; found inconsistent evidence.&lt;sup id=&quot;fnref:128&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:128&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;28&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;I didn’t look at most of the meta-analyses reviewed by Crowley et al. I did look at &lt;a href=&quot;https://doi.org/10.1007/s40279-013-0115-0&quot;&gt;Gist et al. (2013)&lt;/a&gt;, which at a glance appears to be the most rigorous “contrarian” meta-analysis. It found that sprint interval training did not increase VO2max more than endurance training did. However, this is roughly consistent with &lt;a href=&quot;/website/materials/wen2019.pdf&quot;&gt;Wen et al. (2019)&lt;/a&gt;&lt;sup id=&quot;fnref:120:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:120&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;21&lt;/a&gt;&lt;/sup&gt;, which found that longer intervals (2+ minutes) worked better than sprint intervals.&lt;/p&gt;

&lt;p&gt;RCTs may overstate the benefits of HIIT relative to LIT. A meta-analysis by &lt;a href=&quot;https://doi.org/10.1007/s40279-024-02120-2&quot;&gt;Mølmen et al. (2024)&lt;/a&gt;&lt;sup id=&quot;fnref:118&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:118&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;29&lt;/a&gt;&lt;/sup&gt; found that HIIT rapidly improved participants’ fitness, whereas the benefits of LIT took longer to show up. Therefore, a 5-week study will show a clear advantage to HIIT even if LIT might be better in the long run:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/Molmen-2024.jpeg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;(ET = endurance training; HIT = high-intensity training; SIT = sprint interval training)&lt;/p&gt;

&lt;p&gt;That being said, there is &lt;em&gt;some&lt;/em&gt; evidence that HIIT provides benefits on top of LIT, so if your goal is to optimize longevity, then I believe it makes sense to do both.&lt;/p&gt;

&lt;h2 id=&quot;you-should-do-3-hoursweek-of-zone-2-training-and-one-or-two-sessionsweek-of-hiit&quot;&gt;You should do 3+ hours/week of zone 2 training and one or two sessions/week of HIIT&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;[I]t seems that about three hours per week of zone 2, or four 45-minute sessions, is the minimum required for most people to derive a benefit and make improvements, once you get over the initial hump of trying it for the first time. (page 243)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;Even if we are not out to set world records, the way we train VO2max is pretty similar to the way elite athletes do it: by supplementing our zone 2 work with one or two VO2max workouts per week. (page 249)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;My assessment:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;This is a good exercise routine: &lt;strong&gt;true&lt;/strong&gt; (credence: 95%).&lt;/li&gt;
  &lt;li&gt;This routine is uniquely better than any other: &lt;strong&gt;likely false&lt;/strong&gt; (credence: 25%).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;RCT evidence doesn’t tell us much about the optimal exercise routine. As with HIIT vs. LIT, the best evidence comes from looking at how top athletes train.&lt;/p&gt;

&lt;p&gt;Elite athletes typically do 80% of their training at low/moderate intensity (a.k.a. &lt;a href=&quot;https://trainright.com/zone-2-training-to-improve-aerobic-endurance-and-fat-burning/&quot;&gt;zone 2&lt;/a&gt;), and 20% at high intensity (&lt;a href=&quot;/materials/seiler2010&quot;&gt;Seiler (2010)&lt;/a&gt;&lt;sup id=&quot;fnref:121:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:121&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;22&lt;/a&gt;&lt;/sup&gt;; &lt;a href=&quot;https://doi.org/10.3389/fphys.2015.00295&quot;&gt;Stöggl &amp;amp; Sperlich (2015)&lt;/a&gt;&lt;sup id=&quot;fnref:123&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:123&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;30&lt;/a&gt;&lt;/sup&gt;). For elite athletes, doing more than 20% of cardio sessions at high intensity can induce overtraining (&lt;a href=&quot;/materials/seiler2010&quot;&gt;Seiler (2010)&lt;/a&gt;&lt;sup id=&quot;fnref:121:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:121&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;22&lt;/a&gt;&lt;/sup&gt;).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Outlive&lt;/em&gt; recommends training at just two intensities: low and high. But many elite athletes train at three or more different durations/intensities (&lt;a href=&quot;https://doi.org/10.3389/fphys.2015.00295&quot;&gt;Stöggl &amp;amp; Sperlich (2015)&lt;/a&gt;&lt;sup id=&quot;fnref:123:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:123&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;30&lt;/a&gt;&lt;/sup&gt;). Different training modalities cause your body to adapt in different ways, so it makes sense to mix things up. You might do something like 80% zone 2 training, 15% long-duration (~4-minute) interval training, and 5% sprint (&amp;lt; 1 minute) interval training. But just two intensities is probably sufficient to get the vast majority of longevity benefits.&lt;/p&gt;
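
&lt;p&gt;To make that mix concrete, here’s some back-of-the-envelope arithmetic (my own illustration, assuming a hypothetical 5-hour training week and the example 80/15/5 split above):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Splitting a hypothetical 5-hour (300-minute) training week
# across the example 80/15/5 intensity mix:
weekly_minutes = 300
zone2 = 0.80 * weekly_minutes           # 240 min of low-intensity cardio
long_intervals = 0.15 * weekly_minutes  # 45 min of ~4-minute intervals
sprints = 0.05 * weekly_minutes         # 15 min of sub-1-minute sprints
print(zone2, long_intervals, sprints)   # 240.0 45.0 15.0
&lt;/code&gt;&lt;/pre&gt;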

&lt;p&gt;I am not confident that ordinary folks should follow the same 80/20 rule as elite athletes. Professionals train 20–30 hours per week, including 4+ hours of high-intensity training. If I do 5 hours a week of cardio instead of 20, do I really need to restrict my HIIT to only one hour? Maybe I could handle 2 or 3 hours (or even 5 hours) without overtraining. But in the absence of direct evidence, I’ll follow the 80/20 rule.&lt;/p&gt;

&lt;p&gt;For the record, my exercise routine is:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Low-intensity cardio 3x/week for 45 to 90 minutes (depending on how I’m feeling);&lt;/li&gt;
  &lt;li&gt;HIIT once every 1–2 weeks (usually four 4-minute intervals, and occasionally a different variation);&lt;/li&gt;
  &lt;li&gt;Resistance training 3–4x/week.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I think this is less than the optimal amount of HIIT, but I don’t like doing HIIT.&lt;/p&gt;

&lt;h2 id=&quot;stability-is-as-important-as-cardiovascular-fitness-and-strength&quot;&gt;Stability is as important as cardiovascular fitness and strength&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Outlive&lt;/em&gt; claims that exercise is the best way to increase longevity, and that the ideal exercise program includes four components:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;long-duration, low-intensity cardio (“zone 2” training)&lt;/li&gt;
  &lt;li&gt;high-intensity interval training (VO2max training)&lt;/li&gt;
  &lt;li&gt;strength training&lt;/li&gt;
  &lt;li&gt;stability training&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;An abundance of evidence shows that the first three types of exercise are excellent for health and longevity. The fourth is not so well-supported.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;[Stability] is the foundation on which our twin pillars of cardiovascular fitness and strength must rest. (page 265)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;My assessment: I don’t know how to evaluate this. It seems to border on &lt;a href=&quot;https://en.wikipedia.org/wiki/Not_even_wrong&quot;&gt;not even wrong&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The book defines stability as “the subconscious ability to harness, decelerate, or stop force”, while admitting that this isn’t a great definition. I don’t know how to empirically test stability by this definition. You can formally define cardiovascular fitness as VO2max (or resting heart rate, etc.), and then show that improving your chosen metric improves health and reduces mortality risk. But I don’t know how to operationalize “stability”.&lt;/p&gt;

&lt;p&gt;(On one reading, this definition of stability sounds nearly identical to “strength”, because your ability to decelerate/stop force is pretty much entirely determined by your ability to generate force.)&lt;/p&gt;

&lt;p&gt;The chapter on stability made only two-ish testable claims that I could identify. The first:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;DNS [dynamic neuromuscular stabilization] originated with a group of Czech neurologists who were working with young children with cerebral palsy in a hospital in Prague in the 1960s. They noticed that because of their illness, these kids did &lt;em&gt;not&lt;/em&gt; go through the normal infant stages of rolling, crawling, and so forth. Thus they had movement problems throughout their lives. But when the children with cerebral palsy were put through a “training” program consisting of a certain sequence of movements, replicating the usual stages of learning to crawl, sit up, and eventually stand, their symptoms improved and they were better able to control their motions as they matured. The researchers realized that as we grow up, most healthy humans actually go through an opposite process—we lose these natural, healthy, almost ingrained movements. (page 270)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;(The book cites &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3578435/&quot;&gt;Frank et al. (2013)&lt;/a&gt;&lt;sup id=&quot;fnref:9&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;31&lt;/a&gt;&lt;/sup&gt;.)&lt;/p&gt;

&lt;p&gt;TLDR: Some research found that DNS helped children with cerebral palsy, so it might also help adults prevent injuries. The assertion that it would help adults is presented without evidence, and to my knowledge, no evidence exists.&lt;/p&gt;

&lt;p&gt;Relatedly, &lt;em&gt;Outlive&lt;/em&gt; cites an &lt;a href=&quot;https://peterattiamd.com/michaelrintala/&quot;&gt;interview&lt;/a&gt; Peter Attia did with “leading American practitioner of DNS” Michael Rintala, D.C. (a.k.a. Doctor of Chiropractic). &lt;a href=&quot;https://en.wikipedia.org/wiki/Chiropractic&quot;&gt;Chiropractic&lt;/a&gt; is pseudoscience and chiropractors are fake doctors, so without having looked into this much, I’m pretty skeptical of DNS. If Rintala practices bogus medicine in one arena, that’s evidence that his DNS research is also bogus.&lt;/p&gt;

&lt;p&gt;I don’t know that this first claim is false, but it’s essentially pulled out of thin air so I have no reason to believe that it’s true.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Outlive&lt;/em&gt; makes a second semi-testable claim: it implies (but does not explicitly state) that squatting with perfect form has lower injury risk than squatting asymmetrically. This claim sounds intuitively plausible but I could not find any supporting evidence (and there’s some evidence that “bad” squat form isn’t necessarily bad&lt;sup id=&quot;fnref:33&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:33&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;32&lt;/a&gt;&lt;/sup&gt;).&lt;/p&gt;

&lt;p&gt;And squatting asymmetrically is probably better than not squatting at all (credence: 90%) because (1) studies find robust health benefits to strength training and (2) probably most of the subjects in those studies don’t have particularly good form.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Outlive&lt;/em&gt; didn’t do a good job of defining “stability training”, so I’ll do it myself. Let’s say the purpose of stability training is to reduce the risk of falling. In that case, is stability training useful?&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;A meta-analysis by &lt;a href=&quot;https://doi.org/10.1071/NB10056&quot;&gt;Sherrington et al. (2011)&lt;/a&gt;&lt;sup id=&quot;fnref:88&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:88&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;33&lt;/a&gt;&lt;/sup&gt; found that essentially any kind of exercise reduced fall risk, with balance training having a 22% larger effect than an “average” exercise program, and long-duration exercise (&amp;gt;50 hour trial duration) having a 23% larger effect than “average”. So “stability training” does appear to work, and high-dose exercise programs that included balance training reduced fall risk by 38%.&lt;/li&gt;
  &lt;li&gt;A meta-analysis by &lt;a href=&quot;https://doi.org/10.1001/jamainternmed.2018.5406&quot;&gt;de Souto Barreto et al. (2019)&lt;/a&gt;&lt;sup id=&quot;fnref:89&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:89&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;34&lt;/a&gt;&lt;/sup&gt; found that exercise significantly reduced falls (RR = 0.88) and injurious falls (RR = 0.74), and non-significantly reduced fractures (RR = 0.84, 95% CI 0.71–1.00). It compared exercise programs by type: (1) aerobic, (2) strength, (3) other (tai chi/dance), (4) multicomponent (aerobic + strength + balance). The meta-analysis did not find any significant differences between the four types. (And it wasn’t a p = 0.06 situation either—most of the p-values were greater than 0.8.)&lt;/li&gt;
  &lt;li&gt;There’s some contrary evidence on balance training. A meta-analysis by &lt;a href=&quot;https://link.springer.com/article/10.1007/s40279-016-0515-z&quot;&gt;Kümmel et al. (2016)&lt;/a&gt;&lt;sup id=&quot;fnref:110&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:110&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;35&lt;/a&gt;&lt;/sup&gt; found that balance training on a particular task improves performance on that task, but does not transfer even to very similar tasks. This gives reason to doubt that balance training reduces injury risk.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So exercise appears to help with stability. And balance exercises might work better than other types of exercise, but they might not work at all.&lt;/p&gt;

&lt;p&gt;My assessment of some claims that &lt;em&gt;Outlive&lt;/em&gt; didn’t make, but sort of implied:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Stability is useful: &lt;strong&gt;likely true&lt;/strong&gt; (credence: 90%).&lt;/li&gt;
  &lt;li&gt;Exercise can improve stability: &lt;strong&gt;likely true&lt;/strong&gt; (credence: 80%).&lt;/li&gt;
  &lt;li&gt;Exercising to improve stability matters as much as exercising to reduce cardiovascular disease/diabetes/cancer: &lt;strong&gt;almost certainly false&lt;/strong&gt; (credence: 5%).&lt;/li&gt;
  &lt;li&gt;Most people should do additional stability training on top of the cardio and strength training that they should already be doing: &lt;strong&gt;likely false&lt;/strong&gt; (credence: 15%).&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;nutrition&quot;&gt;Nutrition&lt;/h1&gt;

&lt;h2 id=&quot;rhesus-monkey-studies-suggest-that-calorie-restriction-improves-longevity-but-only-if-you-eat-a-fairly-unhealthy-diet&quot;&gt;Rhesus monkey studies suggest that calorie restriction improves longevity but only if you eat a fairly unhealthy diet&lt;/h2&gt;

&lt;p&gt;I won’t provide a direct quote from &lt;em&gt;Outlive&lt;/em&gt; because it would be too long. To summarize, the book says (pages 312–316):&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;A 2009 University of Wisconsin-Madison (UW) &lt;a href=&quot;https://doi.org/10.1126/science.1173635&quot;&gt;study&lt;/a&gt;&lt;sup id=&quot;fnref:15&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:15&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;36&lt;/a&gt;&lt;/sup&gt; found that rhesus monkeys on a calorie-restricted diet lived longer than the control group.&lt;/li&gt;
  &lt;li&gt;But a similar 2012 &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3832985/&quot;&gt;study&lt;/a&gt;&lt;sup id=&quot;fnref:13&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:13&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;37&lt;/a&gt;&lt;/sup&gt; by the National Institute on Aging (NIA) found that a calorie-restricted diet did &lt;em&gt;not&lt;/em&gt; improve longevity.&lt;/li&gt;
  &lt;li&gt;The biggest difference between the studies was that the UW monkeys ate processed food and the NIA monkeys ate a whole-foods diet formulated by a primate nutritionist.&lt;/li&gt;
  &lt;li&gt;So it looks like calorie restriction improves longevity if you eat mostly processed food, and doesn’t matter much if you eat a healthy diet.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My assessment:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;em&gt;Outlive&lt;/em&gt; accurately summarizes these two studies: &lt;strong&gt;true&lt;/strong&gt; (credence: 90%).&lt;/li&gt;
  &lt;li&gt;Calorie restriction (CR) improves longevity but only if you eat a fairly unhealthy diet: &lt;strong&gt;likely true&lt;/strong&gt; (credence: 70%).&lt;/li&gt;
  &lt;li&gt;The rhesus monkey studies support the above claim: &lt;strong&gt;unclear&lt;/strong&gt; (credence: 50%).&lt;/li&gt;
  &lt;li&gt;The rhesus monkey studies generalize well to humans: &lt;strong&gt;somewhat unlikely&lt;/strong&gt; (credence: 35%).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I found the book’s interpretation to be reasonable and appropriately couched in uncertainty, but I want to write about the studies because they were interesting.&lt;/p&gt;

&lt;p&gt;A 2017 &lt;a href=&quot;https://doi.org/10.1038/ncomms14063&quot;&gt;collaboration&lt;/a&gt;&lt;sup id=&quot;fnref:14&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;38&lt;/a&gt;&lt;/sup&gt; by the authors of the UW and NIA studies reviewed the differences in study designs and outcomes. They agreed with Attia’s interpretation that diet quality was the most likely explanation for the different results, and that the studies jointly suggest that calorie intake is an important predictor of longevity—the monkeys in the NIA control group ate as little as those in the UW calorie restriction (CR) group.&lt;/p&gt;

&lt;p&gt;I read the collaboration and noticed some results that don’t add up:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;In the NIA study, the control group developed more chronic disease than the CR group (&lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5247583/figure/f6/&quot;&gt;Figure 6&lt;/a&gt;), which seemingly contradicts the finding that calorie restriction didn’t help.&lt;/li&gt;
  &lt;li&gt;NIA split monkeys into young and old cohorts based on the age of each monkey when the study started. Within the young cohort, the CR group had less chronic disease (&lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5247583/figure/f6/&quot;&gt;Figure 6&lt;/a&gt;), but had &lt;em&gt;worse&lt;/em&gt; mean and median longevity (&lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5247583/table/t3/?report=objectonly&quot;&gt;Table 2&lt;/a&gt;). Why do these two measurements point in opposite directions?
    &lt;ul&gt;
      &lt;li&gt;CR &lt;em&gt;reduced&lt;/em&gt; lifespan in the young NIA cohort, and the magnitude of the effect was &lt;em&gt;larger&lt;/em&gt; than in the UW study, but it wasn’t statistically significant&lt;sup id=&quot;fnref:49&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:49&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;39&lt;/a&gt;&lt;/sup&gt; so of course the authors ignored it. I’m not saying it’s a real effect, I’m just saying if you get a result (even a non-significant result) that goes in the &lt;em&gt;opposite&lt;/em&gt; direction of what you predicted, then you should take that as a cue that you’re missing something.&lt;sup id=&quot;fnref:48&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:48&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;40&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;The NIA young male cohort saw a decrease in &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5247583/figure/f2/&quot;&gt;bodyweight&lt;/a&gt; and &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5247583/figure/f3/&quot;&gt;body fat&lt;/a&gt; while the other three NIA cohorts saw essentially no change. Presumably, the main mechanism of calorie restriction is that it prevents obesity, so we should see a longevity improvement among NIA young males and not among the other three cohorts. But that’s not what we see. Instead we see essentially no effect in old males/females and a negative effect in young males/females.&lt;/li&gt;
  &lt;li&gt;In the NIA study, why did calorie restriction reduce average bodyweight for males but &lt;em&gt;not&lt;/em&gt; for females (&lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5247583/figure/f2/&quot;&gt;Figure 2&lt;/a&gt;)? The authors took this as evidence of “sexual dimorphism in the relationship between food intake and bodyweight”. That does not sound plausible to me.&lt;/li&gt;
  &lt;li&gt;According to the authors: “In rodents, early onset CR is more effective in extending longevity than adult onset CR. For nonhuman primates it appears that CR, while beneficial when implemented in adulthood, does not improve survival when implemented in juveniles.” It’s suspicious that rodent studies and monkey studies produced opposite results in this respect.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The NIA control group monkeys ate about as much as the UW calorie-restricted monkeys, which the authors take as evidence in favor of the original hypothesis—lower caloric intake improves longevity. Maybe.&lt;sup id=&quot;fnref:51&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:51&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;41&lt;/a&gt;&lt;/sup&gt; Or maybe it was the higher-quality diet (irrespective of caloric intake), or some other difference between the two studies.&lt;sup id=&quot;fnref:50&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:50&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;42&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;If two studies conflict, I’m wary of making my hypothesis more complicated to fit the results. The hypothesis started as “CR improves longevity”, which the UW study supported. But when the NIA study produced contradictory evidence, the hypothesis became “CR improves longevity, unless you’re eating a healthy diet, in which case it doesn’t”.&lt;/p&gt;

&lt;p&gt;The two studies together support this hypothesis, but they don’t distinguish it from other plausible hypotheses (see previous footnote, repeated here&lt;sup id=&quot;fnref:50:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:50&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;42&lt;/a&gt;&lt;/sup&gt;).&lt;/p&gt;

&lt;p&gt;And if you look closely at the NIA results, the hypothesis needs to become even more complicated: something like “CR improves longevity, but if you’re eating a healthy diet then it has no effect for old monkeys and a harmful effect for young monkeys, and the mechanism is presumably that monkeys with healthy diets tend to have lower rates of obesity, except that young males in the NIA study didn’t weigh much less than in UW so for young males the mechanism is some other thing, and also the young vs. old effect is reversed in rodents for some reason.”&lt;/p&gt;

&lt;p&gt;The more complicated a hypothesis, the more supporting evidence it requires.&lt;/p&gt;

&lt;p&gt;More broadly, I’m skeptical that CR studies on lab animals generalize to the real world. Peter Attia agrees:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;CR’s usefulness remains doubtful outside the lab; very lean animals may be more susceptible to death from infection or cold temperatures. […] Furthermore, there is no evidence that extreme CR would truly maximize the longevity function in an organism as complex as we humans, who live in a more variable environment than the animals [studied]. While it seems likely that it would reduce the risk of succumbing to at least some [chronic diseases], it seems equally likely that the uptick in mortality due to infections, trauma, and frailty might offset these gains. (page 81)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Another fact that seems important to me, but that Attia and the UW/NIA authors didn’t discuss: the monkeys never exercised. They were permanently housed in small cages (&lt;a href=&quot;https://doi.org/10.1016/j.neurobiolaging.2004.09.013&quot;&gt;Mattison et al. (2005)&lt;/a&gt;&lt;sup id=&quot;fnref:16&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:16&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;43&lt;/a&gt;&lt;/sup&gt;). Some of the UW monkeys did participate in a different &lt;a href=&quot;https://doi.org/10.1016/j.exger.2013.08.002&quot;&gt;study&lt;/a&gt;&lt;sup id=&quot;fnref:17&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:17&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;44&lt;/a&gt;&lt;/sup&gt; measuring their physical activity, which moved them to a metabolic chamber, but the metabolic chamber was about the same size as the cages—i.e., too small for meaningful exercise.&lt;/p&gt;

&lt;p&gt;Physical activity appears to largely or fully cancel out the harms of higher-calorie diets.&lt;sup id=&quot;fnref:35&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:35&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;45&lt;/a&gt;&lt;/sup&gt; Even if calorie restriction works for sedentary people, it’s less likely to improve health for folks who exercise regularly.&lt;/p&gt;

&lt;p&gt;Returning to the original hypothesis—“CR improves longevity but only if you eat a fairly unhealthy diet”—these studies provide only a small amount of evidence for the hypothesis. That said, the hypothesis sounds correct to me:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;People with unhealthy diets tend to overeat, so eating less would probably improve their health.&lt;/li&gt;
  &lt;li&gt;People who get most of their calories from healthy sources (whole grains, nuts, etc.) are much less likely to overeat, so there’s no point in calorie restriction.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;the-data-are-unclear-on-whether-reducing-saturated-fat-intake-is-beneficial&quot;&gt;The data are unclear on whether reducing saturated fat intake is beneficial&lt;/h2&gt;

&lt;p&gt;(Note: SFA = saturated fatty acid, MUFA = monounsaturated fatty acid, PUFA = polyunsaturated fatty acid)&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;A more recent publication by the Cochrane Collaboration, published in 2020 as a 287-page treatise titled &lt;em&gt;Reduction in Saturated Fat Intake for Cardiovascular Disease&lt;/em&gt;&lt;sup id=&quot;fnref:62&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:62&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;46&lt;/a&gt;&lt;/sup&gt;, looked at fifteen RCTs in over fifty-six thousand patients and found, among other things, that “reducing dietary saturated fat reduced the risk of combined cardiovascular events by 17%.” Interesting. But the same review also found “little or no effect of reducing saturated fat on all-cause mortality or cardiovascular mortality.”&lt;/p&gt;

  &lt;p&gt;[…]&lt;/p&gt;

  &lt;p&gt;The data are very unclear on this question, at least at the population level. […] [A]ny hope of using broad insights from evidence-based medicine is bound to fail when it comes to nutrition, because such population-level data cannot provide much value at the individual level when the effect sizes are so small, as they clearly are here. (pages 337–338)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And a bonus quote:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;If, after reading this chapter, you’re upset because you don’t quite agree with some detail I’ve covered—be it the ratio of MUFA to PUFA to SFA, or the exact bioavailability of soy protein, the role of seed oils and lectins, or the ideal target for average blood glucose levels […], I have one final piece of advice. Stop overthinking nutrition so much. Put the book down. Go outside and exercise. (page 346)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Well, joke’s on you: I already exercised today, and now I’m back to over-analyzing saturated fat.&lt;/p&gt;

&lt;p&gt;My assessment:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Saturated fat is unhealthy in expectation: &lt;strong&gt;likely true&lt;/strong&gt; (credence: 85%).&lt;/li&gt;
  &lt;li&gt;It’s a good idea for most people to reduce their SFA intake: &lt;strong&gt;possible&lt;/strong&gt; (credence: 50%).&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;47&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;It’s a good idea for people with high cholesterol to reduce their SFA intake: &lt;strong&gt;likely true&lt;/strong&gt; (credence: 70%).&lt;/li&gt;
  &lt;li&gt;The data are unclear: &lt;strong&gt;unclear&lt;/strong&gt;. (Yes, it’s unclear whether the data are unclear. It depends on how much clarity you want.)&lt;/li&gt;
  &lt;li&gt;The amount of SFA in your diet doesn’t matter all that much: &lt;strong&gt;possible&lt;/strong&gt; (credence: 50%).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Attia’s position on saturated fat stuck out to me because the mainstream view says saturated fat is unhealthy. After spending much longer on this than I’d originally planned, I’ve come to the conclusion that the mainstream advice is basically reasonable, and Attia’s position is also basically reasonable. There’s some evidence that reducing SFA is beneficial and there’s little evidence to the contrary, but (a) the evidence is only moderately strong at best, and (b) there’s a lot of variation in how SFA affects people, so you might not need to worry about it unless you have high cholesterol.&lt;/p&gt;

&lt;p&gt;Observational studies have found mixed results, with the more reliable studies generally finding moderate associations between saturated fat and heart disease. For example, &lt;a href=&quot;https://doi.org/10.1136/bmj.i5796&quot;&gt;Zong et al. (2016)&lt;/a&gt;&lt;sup id=&quot;fnref:64&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:64&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;48&lt;/a&gt;&lt;/sup&gt; found that shifting 1% of daily calories from saturated fat to polyunsaturated fat was associated with an 8% reduction in coronary heart disease (p &amp;lt; 0.001). But observational studies don’t establish causality, so let’s look at randomized controlled trials (RCTs).&lt;/p&gt;

&lt;p&gt;A 2020 &lt;a href=&quot;https://doi.org/10.1002/14651858.cd011737.pub3&quot;&gt;Cochrane review&lt;/a&gt;&lt;sup id=&quot;fnref:62:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:62&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;46&lt;/a&gt;&lt;/sup&gt; of RCTs found that replacing dietary saturated fat significantly reduced cardiovascular events and LDL cholesterol levels, and non-significantly reduced all-cause mortality and cardiovascular mortality. (Some other meta-analyses of RCTs have been done,&lt;sup id=&quot;fnref:65&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:65&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;49&lt;/a&gt;&lt;/sup&gt; but the Cochrane review likely has the strongest methodology.)&lt;/p&gt;

&lt;p&gt;It’s not entirely clear how to interpret the results of the Cochrane review. Out of eight &lt;a href=&quot;https://www.cochranelibrary.com/cdsr/doi/10.1002/14651858.CD011737.pub3/full#CD011737-sec-0008&quot;&gt;primary outcome variables&lt;/a&gt; (including all-cause mortality, cardiovascular mortality, etc.), reducing SFA statistically significantly improved only one variable. But it showed a positive effect for all eight. This weakly suggests that there’s a real effect and the RCTs were underpowered for most of the measures. If we treated the eight measures as independent, this would constitute strong evidence of a real effect, but the measures are mostly correlated with each other.&lt;/p&gt;
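
&lt;p&gt;To make that last point concrete, here’s a back-of-the-envelope sign test (my own illustration, not a calculation from the review): if reducing SFA truly did nothing and the eight outcomes were independent, each outcome would have roughly a 50/50 chance of pointing in the beneficial direction.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Probability that all eight outcomes point the beneficial way
# under the null hypothesis, *if* they were independent (they aren't):
p_all_eight_beneficial = 0.5 ** 8
print(p_all_eight_beneficial)  # 0.00390625, about 1 in 256
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Because the outcomes are correlated, the real probability of seeing eight-for-eight by chance is much higher than 1 in 256, which is why I read this as only weak evidence.&lt;/p&gt;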

&lt;p&gt;The Cochrane review looked at dozens of other outcome variables. Most importantly, replacing saturated fat significantly reduced LDL cholesterol. (A &lt;a href=&quot;https://iris.who.int/bitstream/handle/10665/246104/9789241565349-eng.pdf&quot;&gt;WHO (2016)&lt;/a&gt;&lt;sup id=&quot;fnref:68&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:68&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;50&lt;/a&gt;&lt;/sup&gt; meta-analysis of 84 RCTs agreed with this result.) Cholesterol-lowering drugs have been shown to lower all-cause mortality (see &lt;a href=&quot;https://doi.org/10.1001/jama.2018.2525&quot;&gt;Navarese et al. (2018)&lt;/a&gt;&lt;sup id=&quot;fnref:67&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:67&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;51&lt;/a&gt;&lt;/sup&gt; and &lt;a href=&quot;https://doi.org/10.1097/fjc.0000000000001345&quot;&gt;Ennezat et al. (2022)&lt;/a&gt;&lt;sup id=&quot;fnref:66&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:66&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;52&lt;/a&gt;&lt;/sup&gt;), so it stands to reason—although we only have weak direct evidence—that if reducing SFA improves LDL cholesterol, then it should improve all-cause mortality. Somewhat contradicting this, an RCT by &lt;a href=&quot;https://doi.org/10.1093/ajcn/nqz035&quot;&gt;Bergeron et al. (2019)&lt;/a&gt;&lt;sup id=&quot;fnref:71&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:71&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;53&lt;/a&gt;&lt;/sup&gt; found that SFA caused the body to produce mainly larger LDL particles, which are less harmful than small particles.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.ahajournals.org/doi/full/10.1161/CIR.0000000000000510&quot;&gt;Sacks et al. (2017)&lt;/a&gt;&lt;sup id=&quot;fnref:90&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:90&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;54&lt;/a&gt;&lt;/sup&gt; performed a review with stricter inclusion criteria, looking only at RCTs that:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;controlled subjects’ dietary intake;&lt;/li&gt;
  &lt;li&gt;lasted at least 2 years;&lt;/li&gt;
  &lt;li&gt;proved adherence by measuring biomarkers like cholesterol;&lt;/li&gt;
  &lt;li&gt;did not replace saturated fats with trans fats in the intervention group.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Only four trials met these criteria. Three of the four were included in the Cochrane review; the fourth, the &lt;a href=&quot;https://doi.org/10.1016/s0140-6736(72)92208-8&quot;&gt;Finnish mental hospital study&lt;/a&gt;&lt;sup id=&quot;fnref:97&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:97&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;55&lt;/a&gt;&lt;/sup&gt;, was excluded for using a cluster-randomized design instead of full randomization. All four of these studies supported the hypothesis that reducing SFA improves cardiovascular health, and they had a weighted average relative risk of 0.71 (95% CI 0.62–0.81, see &lt;a href=&quot;https://www.ahajournals.org/cms/10.1161/CIR.0000000000000510/asset/c991ecf0-a90b-4dbc-b9ff-27ad55c8b3a6/assets/graphic/e1fig02.jpeg&quot;&gt;Figure 2&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Two of the studies in the Cochrane review showed an &lt;em&gt;increase&lt;/em&gt; in cardiovascular events when reducing SFA:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The &lt;a href=&quot;https://doi.org/10.1136/bmj.e8707&quot;&gt;Sydney Diet Heart study&lt;/a&gt;&lt;sup id=&quot;fnref:98&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:98&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;56&lt;/a&gt;&lt;/sup&gt; had some study participants replace saturated fat with trans-fat-heavy margarine, which likely explains the increase in bad outcomes.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://doi.org/10.1136/bmj.1.5449.1531&quot;&gt;The Rose study&lt;/a&gt;&lt;sup id=&quot;fnref:99&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:99&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;57&lt;/a&gt;&lt;/sup&gt; did not have any glaring problems like Sydney Diet Heart, but it lasted for less than 2 years and only had a total of 54 participants. I take it as valid but weak contradictory evidence.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It looks reasonably likely, but not conclusive, that saturated fat is unhealthy. How unhealthy?&lt;/p&gt;

&lt;p&gt;The Cochrane review suggests that an intervention to reduce dietary SFA should prevent one cardiovascular event per 290 person-years and one death per 2300 person-years for the sort of people who participated in these trials (i.e., people with elevated baseline risk of cardiovascular disease).&lt;sup id=&quot;fnref:69&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:69&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;58&lt;/a&gt;&lt;/sup&gt; Compare to exercise, which is associated with a reduction of about 1 death per 300 person-years in individuals with chronic diseases.&lt;sup id=&quot;fnref:74&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:74&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;59&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
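
&lt;p&gt;Putting those two person-year figures side by side (simple arithmetic on the numbers above, nothing more):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Deaths prevented per 1,000 person-years, using the figures above:
sfa_effect = 1000 / 2300       # ~0.43 (reducing saturated fat)
exercise_effect = 1000 / 300   # ~3.33 (exercise, chronic-disease patients)
print(exercise_effect / sfa_effect)  # ~7.7
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;On these estimates, and in their respective study populations, exercise is associated with roughly eight times as large a mortality reduction as reducing saturated fat, which fits nicely with Attia’s advice to stop overthinking nutrition and go exercise.&lt;/p&gt;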

&lt;p&gt;It’s not clear how to estimate the improvement in mortality for the general population. Participants in these trials died at approximately the normal rate (compare to &lt;a href=&quot;https://www.cdc.gov/nchs/fastats/deaths.htm&quot;&gt;US CDC mortality statistics&lt;/a&gt;), which suggests the effect should be similar, but it makes sense in theory that dietary interventions should have larger effects on unhealthy populations.&lt;/p&gt;

&lt;h2 id=&quot;people-should-take-omega-3-supplements&quot;&gt;People should take omega-3 supplements&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;There is some evidence that supplementation with the omega-3 fatty acid DHA, found in fish oil, may help maintain brain health[.] (page 200)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;[U]nless they are eating a lot of fatty fish, filling their coffers with marine omega-3 [fatty acids], [my patients] almost always need to take EPA and DHA supplements in capsule or oil form. (page 339)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;My assessment:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Omega-3 fatty acids improve brain health: &lt;strong&gt;likely true&lt;/strong&gt; (credence: 75%).&lt;/li&gt;
  &lt;li&gt;Omega-3s improve health in general: &lt;strong&gt;likely true&lt;/strong&gt; (credence: 80%).&lt;/li&gt;
  &lt;li&gt;It’s a good idea for most people to take omega-3 supplements: &lt;strong&gt;somewhat likely&lt;/strong&gt; (credence: 65%).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A meta-analysis of RCTs by &lt;a href=&quot;https://doi.org/10.7759/cureus.30091&quot;&gt;Dighriri et al. (2022)&lt;/a&gt;&lt;sup id=&quot;fnref:52&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:52&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;60&lt;/a&gt;&lt;/sup&gt; found that “[c]onsumption of omega-3 improved learning, memory ability, cognitive well-being, and blood flow in the brain.”&lt;/p&gt;

&lt;p&gt;A &lt;a href=&quot;https://doi.org/10.1002/14651858.cd003177.pub5&quot;&gt;Cochrane review&lt;/a&gt;&lt;sup id=&quot;fnref:53&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:53&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;61&lt;/a&gt;&lt;/sup&gt; of RCTs found no statistically significant effect of omega-3 consumption on all-cause mortality, cardiovascular events, stroke, or arrhythmia, and a weakly statistically significant effect on cardiovascular mortality, coronary heart disease mortality, and coronary heart disease events. The non-significant effects were all positive, except for a small negative effect on stroke.&lt;sup id=&quot;fnref:54&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:54&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;62&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;So omega-3s probably improve brain health, and they might have a small effect on heart health but it’s unclear.&lt;/p&gt;

&lt;p&gt;As far as we can tell, there are no significant downsides to dietary omega-3s, so they easily pass a cost-benefit analysis as long as you don’t mind eating omega-3-rich foods.&lt;/p&gt;

&lt;p&gt;The cost-benefit analysis for omega-3 supplementation is a bit murkier because supplements sometimes contain contaminants.&lt;sup id=&quot;fnref:56&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:56&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;63&lt;/a&gt;&lt;/sup&gt;&lt;sup id=&quot;fnref:55&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:55&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;64&lt;/a&gt;&lt;/sup&gt; &lt;a href=&quot;https://doi.org/10.1016/j.jfca.2016.09.008&quot;&gt;Raab et al. (2016)&lt;/a&gt;&lt;sup id=&quot;fnref:57&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:57&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;65&lt;/a&gt;&lt;/sup&gt; tested 67 supplements and found that all had safe levels of heavy metals, but did not test for mercury. &lt;a href=&quot;https://doi.org/10.1533/9780857098863.4.389&quot;&gt;Winwood (2013)&lt;/a&gt;&lt;sup id=&quot;fnref:58&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:58&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;66&lt;/a&gt;&lt;/sup&gt; claims that algae oil typically comes from algae grown in tanks and thus can’t be contaminated by heavy metals in the ocean.&lt;/p&gt;

&lt;p&gt;I personally take a daily algae oil supplement. I take algae oil instead of fish oil because I’m vegan, but the potentially reduced risk of contaminants is a nice bonus.&lt;/p&gt;

&lt;h1 id=&quot;sleep&quot;&gt;Sleep&lt;/h1&gt;

&lt;p&gt;I read this chapter more skeptically than the others because it quoted Matthew Walker near the beginning. This raised some alarm bells because Walker wrote a &lt;a href=&quot;https://guzey.com/books/why-we-sleep/&quot;&gt;bad book&lt;/a&gt; about sleep and has been caught &lt;a href=&quot;https://statmodeling.stat.columbia.edu/2019/12/27/why-we-sleep-data-manipulation-a-smoking-gun/&quot;&gt;manipulating data&lt;/a&gt;.&lt;sup id=&quot;fnref:31&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:31&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;67&lt;/a&gt;&lt;/sup&gt; &lt;em&gt;Outlive&lt;/em&gt; cited a few of Walker’s papers to support certain claims, but none of those claims seemed particularly important so I didn’t review them.&lt;sup id=&quot;fnref:29&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:29&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;68&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;h2 id=&quot;every-animal-sleeps&quot;&gt;Every animal sleeps&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;Every animal engages in some form of sleep; scientists have found no exceptions, so far. (page 353)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;My assessment:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Every animal sleeps: &lt;strong&gt;somewhat unlikely&lt;/strong&gt; (credence: 35%).&lt;/li&gt;
  &lt;li&gt;It’s reasonable to assert that every animal sleeps: &lt;strong&gt;false&lt;/strong&gt; (credence: 20%).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In support of this claim, &lt;em&gt;Outlive&lt;/em&gt; cites &lt;a href=&quot;https://doi.org/10.1371/journal.pbio.0060216&quot;&gt;Cirelli &amp;amp; Tononi (2008)&lt;/a&gt;&lt;sup id=&quot;fnref:18&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:18&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;69&lt;/a&gt;&lt;/sup&gt;, which does not take a strong stance on whether all animals sleep:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Only a small number of species—mostly mammals and birds—have been evaluated in detail with respect to sleep. Most studies found signs of sleep, both behavioral (quiescence and hyporesponsivity) and electrophysiological (e.g., the slow waves of non-rapid eye movement (NREM) sleep). Scientists have been hesitant to attribute sleep to reptiles, amphibians, fish, and especially invertebrates, preferring the noncommittal term “rest” in the absence of electrophysiological signs resembling those of mammals and birds.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Cirelli &amp;amp; Tononi (2008) references some examples of animals that have been claimed not to sleep (particularly bullfrogs), but says the evidence is weak.&lt;/p&gt;

&lt;p&gt;I don’t think Cirelli &amp;amp; Tononi supports Attia’s claim; it would be more accurate to say “no animal has been proven not to sleep”.&lt;/p&gt;

&lt;p&gt;But other sources would disagree with this. For example:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;It now appears that many species reduce sleep for long periods of time under normal conditions and that others do not sleep at all, in the way sleep is conventionally defined.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;From Kushida, C. (2013). Encyclopedia of Sleep, Volume 1, page 38 (h/t &lt;a href=&quot;https://guzey.com/books/why-we-sleep/&quot;&gt;Alexey Guzey&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Attia’s claim “scientists have found no exceptions” is sort of true in the sense that we haven’t found any &lt;em&gt;definitive&lt;/em&gt; exceptions, but the claim “every animal engages in some form of sleep” isn’t well-established either.&lt;/p&gt;

&lt;h2 id=&quot;we-need-to-sleep-75-to-85-hours-a-night&quot;&gt;We need to sleep 7.5 to 8.5 hours a night&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;[M]any, many studies have confirmed what your mother told you: We need to sleep about seven and a half to eight and a half hours a night. (page 354)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;My assessment: &lt;strong&gt;false&lt;/strong&gt; (credence: 10%).&lt;/p&gt;

&lt;p&gt;(I could not find a citation for the quoted assertion.)&lt;/p&gt;

&lt;p&gt;The most authoritative source on this question appears to be the &lt;a href=&quot;https://doi.org/10.1016/j.sleh.2014.12.010&quot;&gt;National Sleep Foundation panel&lt;/a&gt;&lt;sup id=&quot;fnref:19&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:19&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;70&lt;/a&gt;&lt;/sup&gt;, where sleep scientists were surveyed on their beliefs. The median panelist believed that 7 to 9 hours a night is “appropriate” for adults age 25–64, and 6 to 10 hours “may be appropriate for some people” in the same age range.&lt;sup id=&quot;fnref:20&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:20&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;71&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;The range given by &lt;em&gt;Outlive&lt;/em&gt; (7.5 to 8.5 hours) is excessively narrow—according to sleep scientists, many people can/should sleep more or less than that.&lt;/p&gt;

&lt;p&gt;My subjective uncertainty on this question mostly comes from the fact that I haven’t read any studies or even any meta-analyses and I’m only 90% confident that the National Sleep Foundation panelists know what they’re talking about.&lt;/p&gt;

&lt;h2 id=&quot;basketball-players-who-were-told-to-sleep-for-10-hours-a-night-had-better-shooting-accuracy&quot;&gt;Basketball players who were told to sleep for 10 hours a night had better shooting accuracy&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;In one study, Stanford basketball players were encouraged to strive for ten hours of sleep per day, with or without naps, and to abstain from alcohol or caffeine. After five weeks, their shooting accuracy had improved by 9 percent, and their sprint times had also gotten faster. (page 354)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I didn’t look into this study and I’m not giving a credence because I don’t really care about this particular claim. I bring it up because it contradicts the preceding claim that people should sleep 7.5 to 8.5 hours a night.&lt;/p&gt;

&lt;h2 id=&quot;lack-of-sleep-increases-obesity-and-diabetes-risk&quot;&gt;Lack of sleep increases obesity and diabetes risk&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;Even in the short term, sleep deprivation can cause profound insulin resistance. […] Multiple large meta-analyses of sleep studies have revealed a close relationship between sleep duration and risk of type 2 diabetes and the metabolic syndrome. (page 356)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;My assessment:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Sleep deprivation increases the risk of insulin resistance: &lt;strong&gt;highly likely&lt;/strong&gt; (credence: 90%).&lt;/li&gt;
  &lt;li&gt;Observational studies find relationships between short sleep duration and obesity/diabetes/metabolic syndrome: &lt;strong&gt;true&lt;/strong&gt; (credence: 95%).&lt;/li&gt;
  &lt;li&gt;Lack of sleep increases obesity and diabetes risk: &lt;strong&gt;highly likely&lt;/strong&gt; (credence: 85%).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I only briefly investigated this but it passes a basic sanity check. I glanced at the papers cited by &lt;em&gt;Outlive&lt;/em&gt; and they appear to support the quoted text. In addition, RCTs suggest that sleep restriction causes subjects to eat more and increases insulin resistance (a precursor to diabetes)—see meta-analysis by &lt;a href=&quot;https://doi.org/10.1016/j.metabol.2018.02.010&quot;&gt;Reutrakul &amp;amp; Van Cauter (2018)&lt;/a&gt;&lt;sup id=&quot;fnref:32&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:32&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;72&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;h2 id=&quot;a-study-using-mendelian-randomization-found-that-sleeping-6-hours-a-night-increased-risk-of-a-heart-attack&quot;&gt;A study using Mendelian randomization found that sleeping &amp;lt;6 hours a night increased risk of a heart attack&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;[O]ne particularly interesting study compared observational and Mendelian randomization&lt;sup id=&quot;fnref:27&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:27&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;73&lt;/a&gt;&lt;/sup&gt; data in people with previously identified genetic variants that either increase or decrease their lifelong exposure to longer or shorter sleep duration. The MR data confirmed the observational findings, that sleeping less than six hours a night was associated with about a 20 percent higher risk of a heart attack. (page 359)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;My assessment:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;A study found that sleeping &amp;lt;6 hours a night increases heart attack risk: &lt;strong&gt;true&lt;/strong&gt; (credence: 98%).&lt;/li&gt;
  &lt;li&gt;Sleeping &amp;lt;6 hours a night increases heart disease risk for most people: &lt;strong&gt;likely true&lt;/strong&gt; (credence: 80%).&lt;/li&gt;
  &lt;li&gt;This quote from the book does a good job of representing the state of the evidence: &lt;strong&gt;false&lt;/strong&gt; (credence: 10%).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In support of this quote, Attia cites &lt;a href=&quot;https://doi.org/10.1038/s41467-019-08917-4&quot;&gt;Dashti et al. (2019)&lt;/a&gt;&lt;sup id=&quot;fnref:21&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:21&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;74&lt;/a&gt;&lt;/sup&gt;, which does not appear to say anything about heart attacks. I believe he meant to cite &lt;a href=&quot;https://doi.org/10.1016/j.jacc.2019.07.022&quot;&gt;Daghlas et al. (2019)&lt;/a&gt;&lt;sup id=&quot;fnref:22&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:22&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;75&lt;/a&gt;&lt;/sup&gt; (on which Dashti is a co-author). Daghlas et al. (2019) supports Attia’s claim.&lt;/p&gt;

&lt;p&gt;However, other Mendelian randomization studies have gotten different results. &lt;a href=&quot;https://doi.org/10.1186/s12944-020-01257-z&quot;&gt;Zhuang et al. (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:23&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:23&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;76&lt;/a&gt;&lt;/sup&gt; found no significant relationship between sleep duration and coronary heart disease; &lt;a href=&quot;https://doi.org/10.1002%2Fehf2.14016&quot;&gt;Yang et al. (2022)&lt;/a&gt;&lt;sup id=&quot;fnref:24&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:24&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;77&lt;/a&gt;&lt;/sup&gt; found a statistically significant but extremely weak (“probably not clinically relevant”) association&lt;sup id=&quot;fnref:26&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:26&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;78&lt;/a&gt;&lt;/sup&gt;; while &lt;a href=&quot;https://doi.org/10.1016/j.sleep.2019.08.014&quot;&gt;Liao et al. (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:25&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:25&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;79&lt;/a&gt;&lt;/sup&gt; broadly agreed with Daghlas et al. (2019).&lt;/p&gt;

&lt;p&gt;Out of four Mendelian randomization studies, two identified strong links between short sleep and heart disease/heart attack risk, and two suggested little to no effect. So while &lt;em&gt;Outlive&lt;/em&gt; does accurately describe the results of a study, it misrepresents the evidence by ignoring the studies that contradict its thesis.&lt;/p&gt;

&lt;p&gt;(I did not look into the quality of these studies; I just read their conclusions. It’s possible that the null-result studies are flawed in some way.)&lt;/p&gt;

&lt;p&gt;The Mendelian randomization studies provide only weak to moderate evidence on their own, but combined with other evidence (such as the link to obesity discussed in the previous section), they make it reasonably likely that short sleep duration does indeed increase the risk of heart problems.&lt;/p&gt;

&lt;h2 id=&quot;lack-of-sleep-causes-alzheimers-disease&quot;&gt;Lack of sleep causes Alzheimer’s disease&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;Subsequent research … has pointed to chronic bad sleep as a powerful potential cause of Alzheimer’s disease and dementia. Sleep, it turns out, is as crucial to maintaining brain health as it is to brain function.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;My assessment:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Lack of sleep causes Alzheimer’s disease: &lt;strong&gt;possibly true&lt;/strong&gt; (credence: 60%).&lt;/li&gt;
  &lt;li&gt;The first sentence of the book quote is reasonable: &lt;strong&gt;true&lt;/strong&gt; (credence: 90%).&lt;/li&gt;
  &lt;li&gt;The second sentence of the book quote is reasonable: &lt;strong&gt;false&lt;/strong&gt; (credence: 20%).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Some research has indeed found a link between bad sleep and Alzheimer’s, but it’s difficult to establish causality—see review article by &lt;a href=&quot;https://doi.org/10.3390%2Fijms21031168&quot;&gt;Lloret et al. (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:46&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:46&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;80&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;The first quoted sentence from &lt;em&gt;Outlive&lt;/em&gt; aligns with Lloret et al.’s summary of the literature. The second sentence converts a speculative hypothesis into a certainty.&lt;/p&gt;

&lt;h1 id=&quot;bonus&quot;&gt;Bonus&lt;/h1&gt;

&lt;h2 id=&quot;dunning-kruger-effect&quot;&gt;Dunning-Kruger effect&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;Looking back, I now realize that I was too far on the left of the Dunning-Kruger curve, caricatured below in figure 14—my maximal confidence and relatively minimal knowledge having propelled me quite close to the summit of “Mount Stupid.” (page 293)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The book’s Figure 14 reproduces this image from Wikimedia Commons:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://upload.wikimedia.org/wikipedia/commons/4/46/Dunning%E2%80%93Kruger_Effect_01.svg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;My assessment:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;This graph accurately represents the Dunning-Kruger effect: &lt;strong&gt;false&lt;/strong&gt; (credence: &amp;lt;1%).&lt;/li&gt;
  &lt;li&gt;The existence of a “Mount Stupid” is supported by the scientific evidence: &lt;strong&gt;false&lt;/strong&gt; (credence: 2%).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’m willing to forgive this mistake because it doesn’t have anything to do with longevity, but it still bugs me.&lt;/p&gt;

&lt;p&gt;If you search “Dunning Kruger” on Google Images, you will see a bunch of graphs that look like that, but no study on the Dunning-Kruger effect has ever produced empirical data with that shape.&lt;/p&gt;

&lt;p&gt;Empirical results actually look something like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://upload.wikimedia.org/wikipedia/commons/4/43/Dunning%E2%80%93Kruger_Effect2.svg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;(Presumably, Attia found Figure 14 from the Wikipedia page on the Dunning-Kruger effect. I can’t really blame him for getting something wrong if he just pulled it from Wikipedia. And to Wikipedia’s credit, it has removed the incorrect image and replaced it with the correct one above.)&lt;/p&gt;

&lt;h1 id=&quot;changelog&quot;&gt;Changelog&lt;/h1&gt;

&lt;ul&gt;
  &lt;li&gt;2024-10-23: I added a reference to a meta-analysis of RCTs on balance training specificity. This provides stronger evidence for my previously weakly-held position that balance training isn’t useful. I updated my credences accordingly.&lt;/li&gt;
  &lt;li&gt;2025-05-05: I added three new sections on exercise:
    &lt;ol&gt;
      &lt;li&gt;&lt;a href=&quot;#vo2max-is-the-best-predictor-of-longevity&quot;&gt;VO2max is the best predictor of longevity&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#you-should-train-vo2max-by-doing-hiit-at-the-maximum-sustainable-pace&quot;&gt;You should train VO2max by doing HIIT at the maximum sustainable pace&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#you-should-do-3-hoursweek-of-zone-2-training-and-one-or-two-sessionsweek-of-hiit&quot;&gt;You should do &amp;gt;3 hours/week of zone 2 training and one or two sessions/week of HIIT&lt;/a&gt;&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;2025-07-03: Previously I was inconsistent with how I reported credences for statements I thought were probably false—sometimes I reported my probability that the statement is true, and sometimes my probability that it’s false. I rewrote them so that all credences are given in terms of my probability that the statement is true.&lt;/li&gt;
  &lt;li&gt;2025-07-04: Made some minor changes in response to feedback.&lt;/li&gt;
&lt;/ul&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The book is co-authored by Bill Gifford. It’s written from Attia’s point of view and some materials (such as &lt;a href=&quot;https://peterattiamd.com/outlive/&quot;&gt;Peter Attia’s website&lt;/a&gt;) approximately treat Attia as the sole author, so in my review I will credit the book’s claims to Attia and not to Gifford. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:12&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Page numbers are from the 2023 Kindle edition of &lt;em&gt;Outlive&lt;/em&gt;, ISBN 9780593236598, ebook ISBN 9780593236604. &lt;a href=&quot;#fnref:12&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:59&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The meta-analysis definitely did find this result, but there’s some wiggle room around what “technically correct” means (because the meta-analysis found different results for different subgroups—I will discuss this shortly). So I’m only 90% confident. &lt;a href=&quot;#fnref:59&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Stefan, N., Schick, F., &amp;amp; Häring, H. U. (2017). &lt;a href=&quot;https://doi.org/10.1016/j.cmet.2017.07.008&quot;&gt;Causes, Characteristics, and Consequences of Metabolically Unhealthy Normal Weight in Humans.&lt;/a&gt; &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:39&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Kramer, C. K., Zinman, B., &amp;amp; Retnakaran, R. (2013). &lt;a href=&quot;/materials/kramer2013.pdf&quot;&gt;Are metabolically healthy overweight and obesity benign conditions? A systematic review and meta-analysis.&lt;/a&gt; &lt;a href=&quot;#fnref:39&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:40&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The standard method for significance-testing a relative risk is to assume that its logarithm follows a normal distribution. I did that and got an odds ratio of 7.27. But the sample mean of 1.19 is pretty far from the geometric mean of the 95% CI (1.16), and much closer to its arithmetic mean (1.18), so I redid the test with the assumption that the RR follows a normal distribution. The second method produces an odds ratio of 5.66, which I rounded down to 5 to be conservative.&lt;/p&gt;
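
      &lt;p&gt;To make the two tests concrete, here is a minimal sketch. The CI bounds below are hypothetical (back-solved so that their geometric mean is 1.16 and their arithmetic mean is 1.18, matching the figures quoted above), and I am assuming the reported “odds” are the odds (1 − p)/p implied by each two-sided p-value. The printed numbers are therefore illustrative, not a reproduction of the 7.27 and 5.66 figures.&lt;/p&gt;

      &lt;pre&gt;&lt;code&gt;from math import log
from scipy.stats import norm

# Hypothetical 95% CI bounds (geometric mean 1.16, arithmetic mean 1.18);
# not the actual interval from the meta-analysis.
rr, lo, hi = 1.19, 0.96, 1.40

# Test 1: assume log(RR) is normally distributed.
se_log = (log(hi) - log(lo)) / (2 * 1.96)
p_log = 2 * norm.sf(abs(log(rr)) / se_log)

# Test 2: assume RR itself is normally distributed.
se_lin = (hi - lo) / (2 * 1.96)
p_lin = 2 * norm.sf(abs(rr - 1) / se_lin)

# Convert each two-sided p-value into odds against the null.
for label, p in (('log-normal', p_log), ('normal', p_lin)):
    print(f'{label}: p = {p:.3f}, odds = {(1 - p) / p:.1f} : 1')
&lt;/code&gt;&lt;/pre&gt;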

      &lt;p&gt;It’s bad practice to run two different significance tests, but I think it’s okay in this case because I preferred the test that weakened my argument rather than strengthening it. &lt;a href=&quot;#fnref:40&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:42&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The original article used PubMed links, which I replaced with DOI links and added full citations in footnotes. The quote is otherwise unchanged. &lt;a href=&quot;#fnref:42&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:41&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Lin, H., Zhang, L., Zheng, R., &amp;amp; Zheng, Y. (2017). &lt;a href=&quot;https://doi.org/10.1097/md.0000000000008838&quot;&gt;The prevalence, metabolic risk and effects of lifestyle intervention for metabolically healthy obesity.&lt;/a&gt; &lt;a href=&quot;#fnref:41&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Eckel, N., Meidtner, K., Kalle-Uhlmann, T., Stefan, N., &amp;amp; Schulze, M. B. (2016). &lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/26701871/&quot;&gt;Metabolically healthy obesity and cardiovascular events: a systematic review and meta-analysis.&lt;/a&gt; &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Kim, T. J., Shin, H. Y., Chang, Y., Kang, M., Jee, J., Choi, Y. H., Ahn, H. S., Ahn, S. H., Son, H. J., &amp;amp; Ryu, S. (2017). &lt;a href=&quot;https://doi.org/10.1016/j.atherosclerosis.2017.03.035&quot;&gt;Metabolically healthy obesity and the risk for subclinical atherosclerosis.&lt;/a&gt; &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Chang, Y., Jung, H. S., Cho, J., Zhang, Y., Yun, K. E., Lazo, M., Pastor-Barriuso, R., Ahn, J., Kim, C. W., Rampal, S., Cainzos-Achirica, M., Zhao, D., Chung, E. C., Shin, H., Guallar, E., &amp;amp; Ryu, S. (2016). &lt;a href=&quot;https://doi.org/10.1038/ajg.2016.178&quot;&gt;Metabolically Healthy Obesity and the Development of Nonalcoholic Fatty Liver Disease.&lt;/a&gt; &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Chang, A. R., Surapaneni, A., Kirchner, H. L., Young, A., Kramer, H. J., Carey, D. J., Appel, L. J., &amp;amp; Grams, M. E. (2018). &lt;a href=&quot;https://doi.org/10.1002/oby.22134&quot;&gt;Metabolically Healthy Obesity and Risk of Kidney Function Decline.&lt;/a&gt; &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Bell, J. A., Kivimaki, M., &amp;amp; Hamer, M. (2014). &lt;a href=&quot;https://doi.org/10.1111/obr.12157&quot;&gt;Metabolically healthy obesity and risk of incident type 2 diabetes: a meta‐analysis of prospective cohort studies.&lt;/a&gt; &lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:60&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;When Attia says the amyloid theory was weakening, he’s referring to the once-popular hypothesis that amyloid-beta is the sole cause of Alzheimer’s. That hypothesis now appears to be false, but amyloid-beta still looks relevant to Alzheimer’s somehow (it’s not quite clear how). &lt;a href=&quot;#fnref:60&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:107&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Actually, I have very good reading comprehension in a relative sense—in the 98th or 99th percentile according to standardized tests. But 98th percentile reading comprehension still isn’t good enough to consistently understand the things you read, apparently. &lt;a href=&quot;#fnref:107&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:116&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I’m not confident in the claim that they’re more accurate. I am not aware of any research directly comparing the predictive power of VO2max itself vs. performance tests. In theory, I would expect performance tests to be better predictors because they’re directly measuring your body’s physical capabilities. &lt;a href=&quot;#fnref:116&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:111&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Some special-purpose interventions might work better. For example, if you’re a heavy smoker, quitting smoking might have a bigger effect than starting exercise. &lt;a href=&quot;#fnref:111&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:112&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Kokkinos, P., Faselis, C., Samuel, I. B. H., Pittaras, A., Doumas, M., Murphy, R., Heimall, M. S. et al. (2022). &lt;a href=&quot;https://doi.org/10.1016/j.jacc.2022.05.031&quot;&gt;Cardiorespiratory Fitness and Mortality Risk Across the Spectra of Age, Race, and Sex.&lt;/a&gt; &lt;a href=&quot;#fnref:112&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:113&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Mandsager, K., Harb, S., Cremer, P., Phelan, D., Nissen, S. E., &amp;amp; Jaber, W. (2018). &lt;a href=&quot;https://doi.org/10.1001/jamanetworkopen.2018.3605&quot;&gt;Association of Cardiorespiratory Fitness With Long-term Mortality Among Adults Undergoing Exercise Treadmill Testing.&lt;/a&gt; &lt;a href=&quot;#fnref:113&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:114&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Harber, M. P., Kaminsky, L. A., Arena, R., Blair, S. N., Franklin, B. A., Myers, J., &amp;amp; Ross, R. (2017). &lt;a href=&quot;https://doi.org/10.1016/j.pcad.2017.03.001&quot;&gt;Impact of Cardiorespiratory Fitness on All-Cause and Disease-Specific Mortality: Advances Since 2009.&lt;/a&gt; &lt;a href=&quot;#fnref:114&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:120&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Wen, D., Utesch, T., Wu, J., Robertson, S., Liu, J., Hu, G., &amp;amp; Chen, H. (2019). &lt;a href=&quot;https://doi.org/10.1016/j.jsams.2019.01.013&quot;&gt;Effects of different protocols of high intensity interval training for VO2max improvements in adults: A meta-analysis of randomised controlled trials.&lt;/a&gt; &lt;a href=&quot;#fnref:120&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:120:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:120:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:121&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Seiler, S. (2010). &lt;a href=&quot;https://doi.org/10.1123/ijspp.5.3.276&quot;&gt;What is Best Practice for Training Intensity and Duration Distribution in Endurance Athletes?.&lt;/a&gt; &lt;a href=&quot;#fnref:121&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:121:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:121:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:126&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Billat, L. V. (2001). &lt;a href=&quot;https://doi.org/10.2165/00007256-200131010-00002&quot;&gt;Interval Training for Performance: A Scientific and Empirical Practice.&lt;/a&gt; &lt;a href=&quot;#fnref:126&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:122&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Billat, V., Renoux, J. C., Pinoteau, J., Petit, B., &amp;amp; Koralsztein, J. P. (1995). &lt;a href=&quot;https://doi.org/10.3109/13813459508996126&quot;&gt;Times to exhaustion at 90, 100 and 105% of velocity at V̇O&lt;sub&gt;2&lt;/sub&gt;max (Maximal aerobic speed) and critical speed in elite long-distance runners.&lt;/a&gt; &lt;a href=&quot;#fnref:122&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:124&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;/materials/blondel2001.pdf&quot;&gt;Blondel et al. (2001)&lt;/a&gt;&lt;sup id=&quot;fnref:125&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:125&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;81&lt;/a&gt;&lt;/sup&gt; found that a sample of physically active (but not elite) students could sustain 90% of VO2max for an average of 13.98 minutes. &lt;a href=&quot;#fnref:124&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:130&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Different people use terminology in different ways. I am using LIT to refer to what Attia calls zone 2. Some studies call it moderate-intensity continuous training (MICT). Colloquially, it refers to an exercise intensity that you can sustain for a long time. Technically, it refers to exercise at or below the &lt;a href=&quot;https://en.wikipedia.org/wiki/Lactate_threshold&quot;&gt;lactate threshold&lt;/a&gt;. &lt;a href=&quot;#fnref:130&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:127&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Crowley, E., Powell, C., Carson, B. P., &amp;amp; Davies, R. W. (2022). &lt;a href=&quot;https://doi.org/10.1155/2022/9310710&quot;&gt;The Effect of Exercise Training Intensity on VO2max in Healthy Adults: An Overview of Systematic Reviews and Meta-Analyses.&lt;/a&gt; &lt;a href=&quot;#fnref:127&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:128&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;It may be possible to resolve this inconsistency by digging deeper into the literature. Some relevant questions:&lt;/p&gt;

      &lt;ol&gt;
        &lt;li&gt;Wen et al. (2019)&lt;sup id=&quot;fnref:120:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:120&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;21&lt;/a&gt;&lt;/sup&gt; found that long-duration HIIT worked better than short-duration. For the studies that find no benefit to HIIT over LIT, are they only looking at short-duration HIIT?&lt;/li&gt;
        &lt;li&gt;Some RCTs match volume between groups. So if the HIIT group spends a total of (say) 16 minutes at high intensity, then the LIT group exercises for 16 minutes total. That’s not how people actually exercise. Do meta-analyses understate the benefits of LIT because they include volume-matched studies?&lt;/li&gt;
        &lt;li&gt;How do the returns to HIIT vs. LIT differ for novice vs. experienced athletes?&lt;/li&gt;
        &lt;li&gt;What happens when you combine HIIT with LIT?&lt;/li&gt;
      &lt;/ol&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:128&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:118&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Mølmen, K. S., Almquist, N. W., &amp;amp; Skattebo, Ø. (2024). &lt;a href=&quot;https://doi.org/10.1007/s40279-024-02120-2&quot;&gt;Effects of Exercise Training on Mitochondrial and Capillary Growth in Human Skeletal Muscle: A Systematic Review and Meta-Regression.&lt;/a&gt; &lt;a href=&quot;#fnref:118&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:123&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Stöggl, T. L., &amp;amp; Sperlich, B. (2015). &lt;a href=&quot;https://doi.org/10.3389/fphys.2015.00295&quot;&gt;The training intensity distribution among well-trained and elite endurance athletes.&lt;/a&gt; &lt;a href=&quot;#fnref:123&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:123:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:9&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Frank, C., Kobesova, A., &amp;amp; Kolar, P. (2013). &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3578435/&quot;&gt;Dynamic neuromuscular stabilization &amp;amp; sports rehabilitation.&lt;/a&gt; &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:33&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I’m thinking in particular of &lt;a href=&quot;https://doi.org/10.1519/JSC.0000000000004655&quot;&gt;Chiu (2023)&lt;/a&gt;&lt;sup id=&quot;fnref:34&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:34&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;82&lt;/a&gt;&lt;/sup&gt; which investigated knee valgus, a squat technique that was generally regarded as bad form, and found that it may be better than “correct” form. h/t &lt;a href=&quot;https://www.youtube.com/watch?v=UOWQUNZRVtU&quot;&gt;Menno Henselmans&lt;/a&gt;. &lt;a href=&quot;#fnref:33&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:88&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Sherrington, C., Tiedemann, A., Fairhall, N., Close, J. C. T., &amp;amp; Lord, S. R. (2011). &lt;a href=&quot;https://doi.org/10.1071/NB10056&quot;&gt;Exercise to prevent falls in older adults: an updated meta-analysis and best practice recommendations.&lt;/a&gt; &lt;a href=&quot;#fnref:88&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:89&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;de Souto Barreto, P., Rolland, Y., Vellas, B., &amp;amp; Maltais, M. (2019). &lt;a href=&quot;https://doi.org/10.1001/jamainternmed.2018.5406&quot;&gt;Association of Long-term Exercise Training With Risk of Falls, Fractures, Hospitalizations, and Mortality in Older Adults.&lt;/a&gt; &lt;a href=&quot;#fnref:89&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:110&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Kümmel, J., Kramer, A., Giboin, L. S., &amp;amp; Gruber, M. (2016). &lt;a href=&quot;https://doi.org/10.1007/s40279-016-0515-z&quot;&gt;Specificity of Balance Training in Healthy Individuals: A Systematic Review and Meta-Analysis.&lt;/a&gt; &lt;a href=&quot;#fnref:110&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:15&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Colman, R. J., Anderson, R. M., Johnson, S. C., Kastman, E. K., Kosmatka, K. J., Beasley, T. M., Allison, D. B., Cruzen, C., Simmons, H. A., Kemnitz, J. W., &amp;amp; Weindruch, R. (2009). &lt;a href=&quot;https://doi.org/10.1126/science.1173635&quot;&gt;Caloric Restriction Delays Disease Onset and Mortality in Rhesus Monkeys.&lt;/a&gt; &lt;a href=&quot;#fnref:15&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:13&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Mattison, J. A., Roth, G. S., Beasley, T. M., Tilmont, E. M., Handy, A. M., Herbert, R. L., Longo, D. L., Allison, D. B., Young, J. E., Bryant, M., Barnard, D., Ward, W. F., Qi, W., Ingram, D. K., &amp;amp; de Cabo, R. (2012). &lt;a href=&quot;https://doi.org/10.1038/nature11432&quot;&gt;Impact of caloric restriction on health and survival in rhesus monkeys from the NIA study.&lt;/a&gt; &lt;a href=&quot;#fnref:13&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:14&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Mattison, J. A., Colman, R. J., Beasley, T. M., Allison, D. B., Kemnitz, J. W., Roth, G. S., Ingram, D. K., Weindruch, R., de Cabo, R., &amp;amp; Anderson, R. M. (2017). &lt;a href=&quot;https://doi.org/10.1038/ncomms14063&quot;&gt;Caloric restriction improves health and survival of rhesus monkeys.&lt;/a&gt; &lt;a href=&quot;#fnref:14&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:49&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Actually, the NIA young female cohort did see a statistically significant reduction in lifespan (p = 0.04), but it becomes non-significant if you do a Bonferroni correction, which divides the 0.05 significance threshold by the number of cohorts tested and therefore pushes the cutoff below 0.04. &lt;a href=&quot;#fnref:49&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:48&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Mean and 95% confidence intervals for change in longevity from various study cohorts:&lt;/p&gt;

      &lt;table&gt;
        &lt;thead&gt;
          &lt;tr&gt;
            &lt;th&gt;Cohort&lt;/th&gt;
            &lt;th&gt;Mean&lt;/th&gt;
            &lt;th&gt;95% CI&lt;/th&gt;
          &lt;/tr&gt;
        &lt;/thead&gt;
        &lt;tbody&gt;
          &lt;tr&gt;
            &lt;td&gt;UW male&lt;/td&gt;
            &lt;td&gt;1.58&lt;/td&gt;
            &lt;td&gt;(-1.56, 4.72)&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
            &lt;td&gt;UW female&lt;/td&gt;
            &lt;td&gt;2.22&lt;/td&gt;
            &lt;td&gt;(-2.74, 7.18)&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
            &lt;td&gt;NIA young male&lt;/td&gt;
            &lt;td&gt;-2.29&lt;/td&gt;
            &lt;td&gt;(-7.05, 2.47)&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
            &lt;td&gt;NIA young female&lt;/td&gt;
            &lt;td&gt;-4.79&lt;/td&gt;
            &lt;td&gt;(-8.88, -0.70)&lt;/td&gt;
          &lt;/tr&gt;
        &lt;/tbody&gt;
      &lt;/table&gt;

      &lt;p&gt;The CI for NIA young male contains the mean for UW male but the reverse is not true, and neither female cohort’s CI contains the mean of the other. &lt;a href=&quot;#fnref:48&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
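
      &lt;p&gt;A minimal sketch that checks those containment claims directly against the table’s numbers:&lt;/p&gt;

      &lt;pre&gt;&lt;code&gt;# Point estimate and 95% CI for each cohort, copied from the table above.
cohorts = {
    'UW male':          (1.58, -1.56, 4.72),
    'UW female':        (2.22, -2.74, 7.18),
    'NIA young male':   (-2.29, -7.05, 2.47),
    'NIA young female': (-4.79, -8.88, -0.70),
}

def ci_contains_mean(a, b):
    # True if cohort a's 95% CI contains cohort b's point estimate.
    _, lo, hi = cohorts[a]
    return lo &amp;lt;= cohorts[b][0] &amp;lt;= hi

for a, b in [('NIA young male', 'UW male'), ('UW male', 'NIA young male'),
             ('UW female', 'NIA young female'), ('NIA young female', 'UW female')]:
    print(a, 'CI contains', b, 'mean:', ci_contains_mean(a, b))
# Prints True for the first pairing and False for the other three.
&lt;/code&gt;&lt;/pre&gt;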
    &lt;/li&gt;
    &lt;li id=&quot;fn:51&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;If true, we would expect to find that calorie restriction worked for the NIA young male cohort because they ate about as much as their UW counterparts. But it didn’t work (in fact it shortened the average lifespan). &lt;a href=&quot;#fnref:51&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:50&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I can think of several other hypotheses.&lt;/p&gt;

      &lt;p&gt;The UW control group ate &lt;em&gt;ad libitum&lt;/em&gt;, which is fancy academic language for “as much as they want”. The NIA control group didn’t eat &lt;em&gt;ad libitum&lt;/em&gt;. Instead, the researchers used previous data to determine how much monkeys tend to eat when fed &lt;em&gt;ad libitum&lt;/em&gt; (controlling for age and bodyweight) and then fed subjects exactly that amount.&lt;/p&gt;

      &lt;p&gt;This brings to mind another hypothesis: Maybe some portion (let’s say 1/3) of monkeys tend to overeat, which causes health problems, and calorie restriction mainly benefits the 1/3 who overeat. If the NIA control monkeys &lt;em&gt;all&lt;/em&gt; received diets based on how much the &lt;em&gt;average&lt;/em&gt; monkey eats, that prevents the most gluttonous 1/3 from overeating, so additional calorie restriction doesn’t produce meaningful benefits.&lt;/p&gt;

      &lt;p&gt;You could test this hypothesis using the UW data by dividing monkeys in the calorie restriction cohort into “high-calorie” and “low-calorie” groups based on how much they ate &lt;em&gt;ad libitum&lt;/em&gt; (controlling for age and bodyweight) and seeing if the high-calorie group had a bigger improvement in longevity than the low-calorie group. The groups would have small sample sizes, so the result probably wouldn’t be statistically significant.&lt;/p&gt;

      &lt;p&gt;Some weak supporting evidence: in the UW study, the median longevity improvement was bigger than the mean improvement.&lt;/p&gt;

      &lt;p&gt;A fourth hypothesis: calorie restriction has a U-shaped effect on longevity, where a little calorie restriction helps, but excess calorie restriction increases mortality. (This is clearly true in the limit—100% calorie restriction certainly isn’t healthy.)&lt;/p&gt;

      &lt;p&gt;The studies weakly contradict this hypothesis. &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5247583/figure/f4/&quot;&gt;Figure 4&lt;/a&gt; shows that NIA young males in the control group ate about as much as UW males, while in the other three control group pairings (NIA young female + UW female, NIA old male + UW male, NIA old female + UW female), the NIA group ate less. This predicts that calorie restriction should improve longevity in the NIA young male cohort but have a smaller or negative effect in the other three cohorts. But that’s not what the NIA study found. Instead, the young male and young female cohorts both saw decreased longevity from calorie restriction, and both old cohorts saw approximately no effect.&lt;/p&gt;

      &lt;p&gt;A fifth hypothesis: the studies have fundamental methodological issues that render the results invalid. The studies weakly support this hypothesis given how many peculiar and seemingly-contradictory findings I was able to identify.&lt;/p&gt;

      &lt;p&gt;I don’t know what those methodological issues might be. They could be things like:&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;the different cohorts were managed by different researchers who used inconsistent procedures&lt;/li&gt;
        &lt;li&gt;the cohorts had relevantly different genetic lineages&lt;/li&gt;
        &lt;li&gt;the researchers fabricated data (probably not, but you never know)&lt;/li&gt;
        &lt;li&gt;there was a mold infestation next to the control group’s cages&lt;/li&gt;
      &lt;/ul&gt;

      &lt;p&gt;A sixth hypothesis: calorie restriction works, but only if you live in Wisconsin. &lt;a href=&quot;#fnref:50&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:50:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:16&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Mattison, J. A., Black, A., Huck, J., Moscrip, T., Handy, A., Tilmont, E., Roth, G. S., Lane, M. A., &amp;amp; Ingram, D. K. (2005). &lt;a href=&quot;https://doi.org/10.1016/j.neurobiolaging.2004.09.013&quot;&gt;Age-related decline in caloric intake and motivation for food in rhesus monkeys.&lt;/a&gt; &lt;a href=&quot;#fnref:16&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:17&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Yamada, Y., Colman, R. J., Kemnitz, J. W., Baum, S. T., Anderson, R. M., Weindruch, R., &amp;amp; Schoeller, D. A. (2013). &lt;a href=&quot;https://doi.org/10.1016/j.exger.2013.08.002&quot;&gt;Long-term calorie restriction decreases metabolic cost of movement and prevents decrease of physical activity during aging in rhesus monkeys.&lt;/a&gt; &lt;a href=&quot;#fnref:17&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:35&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://doi.org/10.1161/CIRCRESAHA.115.306883&quot;&gt;Ortega et al. (2016)&lt;/a&gt;&lt;sup id=&quot;fnref:37&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:37&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;83&lt;/a&gt;&lt;/sup&gt; summarizes the relevant literature. Several observational studies have found that overweight but physically fit individuals have little to no increase in mortality rates relative to normal-weight fit individuals (the largest study&lt;sup id=&quot;fnref:36&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:36&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;84&lt;/a&gt;&lt;/sup&gt; found a statistically significant but small effect (RR = 1.1); three other studies found no significant effect). By comparison, unfit people had 2–3x higher mortality than fit individuals. &lt;a href=&quot;https://www.ahajournals.org/cms/10.1161/CIRCRESAHA.115.306883/asset/b702242c-ad02-4b6c-95b7-b60f578a8c73/assets/graphic/1752fig02.jpeg&quot;&gt;Figure 2&lt;/a&gt; (reproduced below) summarizes the results from the four studies.&lt;/p&gt;

      &lt;p&gt;&lt;img src=&quot;/assets/images/Ortega-2016-Figure-2.jpeg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

      &lt;p&gt;This finding from Ortega et al. (2016) is actually stronger than necessary for our purposes. It would be sufficient to say that exercise cancels out the harm of high-calorie diets by burning off the excess calories. But this shows that exercise (mostly) cancels out the harm &lt;em&gt;even for people who don’t lose weight&lt;/em&gt;. &lt;a href=&quot;#fnref:35&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:62&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Hooper, L., Martin, N., Jimoh, O. F., Kirk, C., Foster, E., &amp;amp; Abdelhamid, A. S. (2020). &lt;a href=&quot;https://doi.org/10.1002/14651858.cd011737.pub3&quot;&gt;Reduction in saturated fat intake for cardiovascular disease.&lt;/a&gt; &lt;a href=&quot;#fnref:62&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:62:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Arguably, my credence for this latter claim should be higher than for the former claim, because reducing saturated fat has some chance of improving health and essentially no chance of harming health. But reducing saturated fat also has some costs (it makes your diet harder to follow).&lt;/p&gt;

      &lt;p&gt;In other words, on a cost-benefit analysis aimed at maximizing health, it’s clearly worth it to eat less saturated fat. But on an all-things-considered cost-benefit analysis, there’s more room for debate. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:64&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Zong, G., Li, Y., Wanders, A. J., Alssema, M., Zock, P. L., Willett, W. C., Hu, F. B. et al. (2016). &lt;a href=&quot;https://doi.org/10.1136/bmj.i5796&quot;&gt;Intake of individual saturated fatty acids and risk of coronary heart disease in US men and women: two prospective longitudinal cohort studies.&lt;/a&gt; &lt;a href=&quot;#fnref:64&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:65&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Heileson, J. L. (2019). &lt;a href=&quot;https://doi.org/10.1093/nutrit/nuz091&quot;&gt;Dietary saturated fat and heart disease: a narrative review.&lt;/a&gt; &lt;a href=&quot;#fnref:65&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:68&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Mensink, R. P., &amp;amp; World Health Organization (2016). &lt;a href=&quot;https://iris.who.int/bitstream/handle/10665/246104/9789241565349-eng.pdf&quot;&gt;Effects of saturated fatty acids on serum lipids and lipoproteins: a systematic review and regression analysis.&lt;/a&gt; &lt;a href=&quot;#fnref:68&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:67&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Navarese, E. P., Robinson, J. G., Kowalewski, M., Kolodziejczak, M., Andreotti, F., Bliden, K., Tantry, U. et al. (2018). &lt;a href=&quot;https://doi.org/10.1001/jama.2018.2525&quot;&gt;Association Between Baseline LDL-C Level and Total and Cardiovascular Mortality After LDL-C Lowering.&lt;/a&gt; &lt;a href=&quot;#fnref:67&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:66&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Ennezat, P. V., Guerbaai, R. A., Maréchaux, S., Le Jemtel, T. H., &amp;amp; François, P. (2022). &lt;a href=&quot;https://doi.org/10.1097/fjc.0000000000001345&quot;&gt;Extent of LDL-cholesterol Reduction and All-cause and Cardiovascular Mortality Benefit: a Systematic Review and Meta-analysis.&lt;/a&gt; &lt;a href=&quot;#fnref:66&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:71&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Bergeron, N., Chiu, S., Williams, P. T., M King, S., &amp;amp; Krauss, R. M. (2019). &lt;a href=&quot;https://doi.org/10.1093/ajcn/nqz035&quot;&gt;Effects of red meat, white meat, and nonmeat protein sources on atherogenic lipoprotein measures in the context of low compared with high saturated fat intake: a randomized controlled trial.&lt;/a&gt; &lt;a href=&quot;#fnref:71&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:90&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Sacks, F. M., Lichtenstein, A. H., Wu, J. H. Y., Appel, L. J., Creager, M. A., Kris-Etherton, P. M., Miller, M. et al. (2017). &lt;a href=&quot;https://doi.org/10.1161/cir.0000000000000510&quot;&gt;Dietary Fats and Cardiovascular Disease: A Presidential Advisory From the American Heart Association.&lt;/a&gt; &lt;a href=&quot;#fnref:90&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:97&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Miettinen, M., Karvonen, M., Turpeinen, O., Elosuo, R., &amp;amp; Paavilainen, E. (1972). &lt;a href=&quot;https://doi.org/10.1016/s0140-6736(72)92208-8&quot;&gt;Effect of cholesterol-lowering diet on mortality from coronary heart-disease and other causes.&lt;/a&gt; &lt;a href=&quot;#fnref:97&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:98&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Ramsden, C. E., Zamora, D., Leelarthaepin, B., Majchrzak-Hong, S. F., Faurot, K. R., Suchindran, C. M., Ringel, A. et al. (2013). &lt;a href=&quot;https://doi.org/10.1136/bmj.e8707&quot;&gt;Use of dietary linoleic acid for secondary prevention of coronary heart disease and death: evaluation of recovered data from the Sydney Diet Heart Study and updated meta-analysis.&lt;/a&gt; &lt;a href=&quot;#fnref:98&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:99&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Rose, G. A., Thomson, W. B., &amp;amp; Williams, R. T. (1965). &lt;a href=&quot;https://doi.org/10.1136/bmj.1.5449.1531&quot;&gt;Corn Oil in Treatment of Ischaemic Heart Disease.&lt;/a&gt; &lt;a href=&quot;#fnref:99&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:69&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The Cochrane review &lt;a href=&quot;https://www.cochranelibrary.com/cdsr/doi/10.1002/14651858.CD011737.pub3/full#CD011737-sec-0008&quot;&gt;summary of findings&lt;/a&gt; reports that the interventions prevented 2 deaths per 1000 participants and lasted an average of 56 months (about 4.7 years), which equates to one death prevented per roughly 2300 person-years. &lt;a href=&quot;#fnref:69&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:74&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;See &lt;a href=&quot;https://doi.org/10.1001/jamainternmed.2015.0533&quot;&gt;Arem et al. (2015)&lt;/a&gt;&lt;sup id=&quot;fnref:75&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:75&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;85&lt;/a&gt;&lt;/sup&gt; and &lt;a href=&quot;https://doi.org/10.1161/circulationaha.121.058162&quot;&gt;Lee et al. (2022)&lt;/a&gt;&lt;sup id=&quot;fnref:73&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:73&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;86&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

      &lt;p&gt;These were pooled analyses of cohort studies, not RCTs. I could not quickly find reliable RCT numbers.&lt;/p&gt;

      &lt;p&gt;Neither of these studies reported deaths prevented per person-year. I calculated the numbers using provided relative risks multiplied by number of deaths per person, divided by follow-up time. Number of years per death prevented varied based on exercise duration and intensity. I found that exercise prevented approximately one death per 300 person-years when defining “exercise” as 7.5+ MET-hours/week for Arem et al. (2015) and 150–224 minutes of moderate physical activity per week for Lee et al. (2022) (these two definitions are roughly equivalent).&lt;/p&gt;
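
      &lt;p&gt;Here is a minimal sketch of that back-of-the-envelope calculation as I understand it (multiplying the baseline death rate by one minus the relative risk); every input below is a placeholder for illustration, not a figure from Arem et al. (2015) or Lee et al. (2022):&lt;/p&gt;

      &lt;pre&gt;&lt;code&gt;# Placeholder cohort summary, for illustration only.
deaths_per_person = 0.10   # fraction of the comparison cohort that died during follow-up
followup_years = 10.0      # mean follow-up per participant
rr = 0.70                  # relative risk of death for exercisers

# Baseline mortality rate, in deaths per person-year.
baseline_rate = deaths_per_person / followup_years

# Deaths prevented per person-year of exercise, and its reciprocal.
prevented_rate = baseline_rate * (1 - rr)
print(f'one death prevented per {1 / prevented_rate:.0f} person-years')
# With these placeholder numbers: one death per about 333 person-years.
&lt;/code&gt;&lt;/pre&gt;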

      &lt;p&gt;There are a number of meta-analyses of RCTs, for example:&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;&lt;a href=&quot;https://doi.org/10.1016/j.ahj.2011.07.017&quot;&gt;Lawler et al. (2011)&lt;/a&gt;&lt;sup id=&quot;fnref:78&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:78&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;87&lt;/a&gt;&lt;/sup&gt; found exercise had a relative risk (RR) of 0.74 on all-cause mortality for individuals who had experienced heart attacks.&lt;/li&gt;
        &lt;li&gt;&lt;a href=&quot;https://doi.org/10.1177/1534735420917462&quot;&gt;Morishita et al. (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:79&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:79&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;88&lt;/a&gt;&lt;/sup&gt; found RR = 0.76 on cancer mortality for cancer patients.&lt;/li&gt;
      &lt;/ul&gt;

      &lt;p&gt;However, neither of these meta-analyses provided per-group mortality numbers or mean intervention length, so I can’t determine the number of years per death prevented without reading through every individual study. Based on the RRs, my guess is that these meta-analyses would give roughly similar numbers to the pooled analyses above.&lt;/p&gt;

      &lt;p&gt;To my knowledge, the most comprehensive meta-analysis is &lt;a href=&quot;https://doi.org/10.1186/s12889-020-09855-3&quot;&gt;Posadzki et al. (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:81&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:81&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;89&lt;/a&gt;&lt;/sup&gt;, which reviewed 150 different Cochrane reviews and found an RR of 0.87 for all-cause mortality. But it provides even less information about the participants so I have no way of interpreting this number. &lt;a href=&quot;#fnref:74&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:52&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Dighriri, I. M., Alsubaie, A. M., Hakami, F. M., Hamithi, D. M., Alshekh, M. M., Khobrani, F. A., Dalak, F. E. et al. (2022). &lt;a href=&quot;https://doi.org/10.7759/cureus.30091&quot;&gt;Effects of Omega-3 Polyunsaturated Fatty Acids on Brain Functions: A Systematic Review.&lt;/a&gt; &lt;a href=&quot;#fnref:52&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:53&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Abdelhamid, A. S., Brown, T. J., Brainard, J. S., Biswas, P., Thorpe, G. C., Moore, H. J., Deane, K. H. et al. (2020). &lt;a href=&quot;https://doi.org/10.1002/14651858.cd003177.pub5&quot;&gt;Omega-3 fatty acids for the primary and secondary prevention of cardiovascular disease.&lt;/a&gt; &lt;a href=&quot;#fnref:53&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:54&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;When talking about the &lt;a href=&quot;#rhesus-monkey-studies-suggest-that-calorie-restriction-improves-longevity-but-only-if-you-eat-a-fairly-unhealthy-diet&quot;&gt;calorie restriction studies&lt;/a&gt;, I said that a negative effect is a red flag even if it’s non-significant. In this case I’m not too concerned because:&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;The effect on stroke was highly non-significant (p = 0.82). Compare to the NIA calorie restriction study which had p = 0.35 and p = 0.02 for young males and young females respectively.&lt;/li&gt;
        &lt;li&gt;RCTs show omega-3s improve short-term brain function.&lt;/li&gt;
      &lt;/ul&gt;

      &lt;p&gt;I think the most reasonable interpretation is that there’s no effect on stroke. &lt;a href=&quot;#fnref:54&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:56&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Jacobs, M. N., Covaci, A., Gheorghe, A., &amp;amp; Schepens, P. (2004). &lt;a href=&quot;https://doi.org/10.1021/jf035310q&quot;&gt;Time Trend Investigation of PCBs, PBDEs, and Organochlorine Pesticides in Selected &lt;i&gt;n&lt;/i&gt;−3 Polyunsaturated Fatty Acid Rich Dietary Fish Oil and Vegetable Oil Supplements; Nutritional Relevance for Human Essential &lt;i&gt;n&lt;/i&gt;−3 Fatty Acid Requirements.&lt;/a&gt; &lt;a href=&quot;#fnref:56&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:55&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Fernandes, A. R., Rose, M., White, S., Mortimer, D. N., &amp;amp; Gem, M. (2006). &lt;a href=&quot;https://doi.org/10.1080/02652030600660827&quot;&gt;Dioxins and polychlorinated biphenyls (PCBs) in fish oil dietary supplements: Occurrence and human exposure in the UK.&lt;/a&gt; &lt;a href=&quot;#fnref:55&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:57&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Raab, A., Stiboller, M., Gajdosechova, Z., Nelson, J., &amp;amp; Feldmann, J. (2016). &lt;a href=&quot;https://doi.org/10.1016/j.jfca.2016.09.008&quot;&gt;Element content and daily intake from dietary supplements (nutraceuticals) based on algae, garlic, yeast fish and krill oils—Should consumers be worried?.&lt;/a&gt; &lt;a href=&quot;#fnref:57&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:58&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Winwood, R. J. (2013). &lt;a href=&quot;https://doi.org/10.1533/9780857098863.4.389&quot;&gt;Algal oil as a source of omega-3 fatty acids.&lt;/a&gt; &lt;a href=&quot;#fnref:58&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:31&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;See also Walker’s &lt;a href=&quot;https://sleepdiplomat.wordpress.com/2019/12/19/why-we-sleep-responses-to-questions-from-readers/#sleep_injury&quot;&gt;response&lt;/a&gt; on why he presented the data the way he did. &lt;a href=&quot;#fnref:31&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:29&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;For example, the book cites &lt;a href=&quot;https://doi.org/10.1523/JNEUROSCI.5254-14.2015&quot;&gt;Goldstein-Piekarski et al. (2015)&lt;/a&gt;&lt;sup id=&quot;fnref:30&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:30&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;90&lt;/a&gt;&lt;/sup&gt; (Walker being a co-author) for the passage “When we are deprived of REM, studies have found, we have a more difficult time reading others’ facial expressions.” I was skeptical of this statement even before knowing Walker co-authored the paper because it has the vibe of the sort of fun quirky result that doesn’t survive the replication crisis. But I don’t particularly care about this claim (it’s not actionable in any way) so I didn’t bother to investigate it.&lt;/p&gt;

      &lt;p&gt;(Also, this is a nitpick but the quoted passage says “studies have found” while only citing a single study.) &lt;a href=&quot;#fnref:29&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:18&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Cirelli, C., &amp;amp; Tononi, G. (2008). &lt;a href=&quot;https://doi.org/10.1371/journal.pbio.0060216&quot;&gt;Is Sleep Essential?&lt;/a&gt; &lt;a href=&quot;#fnref:18&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:19&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Hirshkowitz, M., Whiton, K., Albert, S. M., Alessi, C., Bruni, O., DonCarlos, L., Hazen, N., Herman, J., Katz, E. S., Kheirandish-Gozal, L., Neubauer, D. N., O’Donnell, A. E., Ohayon, M., Peever, J., Rawding, R., Sachdeva, R. C., Setters, B., Vitiello, M. V., Ware, J. C., &amp;amp; Adams Hillard, P. J. (2015). &lt;a href=&quot;https://doi.org/10.1016/j.sleh.2014.12.010&quot;&gt;National Sleep Foundation’s sleep time duration recommendations: methodology and results summary.&lt;/a&gt; &lt;a href=&quot;#fnref:19&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:20&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;They gave different recommended sleep durations for different age ranges. This table reproduces all the recommendations for individuals age 6 and up (since I assume nobody under 6 is reading this):&lt;/p&gt;

      &lt;table&gt;
        &lt;thead&gt;
          &lt;tr&gt;
            &lt;th&gt;Age&lt;/th&gt;
            &lt;th&gt;Recommended (hours)&lt;/th&gt;
            &lt;th&gt;May be appropriate (hours)&lt;/th&gt;
          &lt;/tr&gt;
        &lt;/thead&gt;
        &lt;tbody&gt;
          &lt;tr&gt;
            &lt;td&gt;6–13 y&lt;/td&gt;
            &lt;td&gt;9 to 11&lt;/td&gt;
            &lt;td&gt;7 to 12&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
            &lt;td&gt;14–17 y&lt;/td&gt;
            &lt;td&gt;8 to 10&lt;/td&gt;
            &lt;td&gt;7 to 11&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
            &lt;td&gt;18–25 y&lt;/td&gt;
            &lt;td&gt;7 to 9&lt;/td&gt;
            &lt;td&gt;6 to 11&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
            &lt;td&gt;26–64 y&lt;/td&gt;
            &lt;td&gt;7 to 9&lt;/td&gt;
            &lt;td&gt;6 to 10&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
            &lt;td&gt;&amp;gt;64 y&lt;/td&gt;
            &lt;td&gt;7 to 8&lt;/td&gt;
            &lt;td&gt;5 to 9&lt;/td&gt;
          &lt;/tr&gt;
        &lt;/tbody&gt;
      &lt;/table&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:20&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:32&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Reutrakul, S., &amp;amp; Van Cauter, E. (2018). &lt;a href=&quot;https://doi.org/10.1016/j.metabol.2018.02.010&quot;&gt;Sleep influences on obesity, insulin resistance, and risk of type 2 diabetes.&lt;/a&gt; &lt;a href=&quot;#fnref:32&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:27&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Mendelian_randomization&quot;&gt;Mendelian randomization&lt;/a&gt; is a technique for conducting a natural experiment. Instead of looking at heart attack risk as a function of sleep duration, you look at heart attack risk as a function of &lt;em&gt;genes&lt;/em&gt; that determine sleep duration. The idea is that some confounding environmental variable might cause both shortened sleep and increased heart attack risk, but it can’t change subjects’ genes, so any observed relationship between &lt;em&gt;genetic&lt;/em&gt; sleep duration and heart attack risk is probably causal.&lt;/p&gt;

      &lt;p&gt;I don’t have a strong opinion on how useful this technique is. &lt;a href=&quot;#fnref:27&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
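
      &lt;p&gt;As a toy illustration (a minimal sketch with made-up numbers, not taken from any of the cited papers): the simplest MR estimator is the Wald ratio, which divides the gene–outcome association by the gene–exposure association.&lt;/p&gt;

      &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# Toy Wald-ratio estimate. All numbers are hypothetical, for illustration only.
beta_zx = -0.15  # effect of a variant on sleep duration (hours per allele)
beta_zy = 0.03   # effect of the same variant on heart attack risk (log-odds)
causal_effect = beta_zy / beta_zx  # log-odds of heart attack per hour of sleep
print(causal_effect)  # -0.2 in this made-up example: more sleep, lower risk
&lt;/code&gt;&lt;/pre&gt;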
    &lt;/li&gt;
    &lt;li id=&quot;fn:21&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Dashti, H. S., Jones, S. E., Wood, A. R., Lane, J. M., van Hees, V. T., Wang, H., Rhodes, J. A. et al. (2019). &lt;a href=&quot;https://doi.org/10.1038/s41467-019-08917-4&quot;&gt;Genome-wide association study identifies genetic loci for self-reported habitual sleep duration supported by accelerometer-derived estimates.&lt;/a&gt; &lt;a href=&quot;#fnref:21&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:22&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Daghlas, I., Dashti, H. S., Lane, J., Aragam, K. G., Rutter, M. K., Saxena, R., &amp;amp; Vetter, C. (2019). &lt;a href=&quot;https://doi.org/10.1016/j.jacc.2019.07.022&quot;&gt;Sleep Duration and Myocardial Infarction.&lt;/a&gt; &lt;a href=&quot;#fnref:22&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:23&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Zhuang, Z., Gao, M., Yang, R., Li, N., Liu, Z., Cao, W., &amp;amp; Huang, T. (2020). &lt;a href=&quot;https://doi.org/10.1186/s12944-020-01257-z&quot;&gt;Association of physical activity, sedentary behaviours and sleep duration with cardiovascular diseases and lipid profiles: a Mendelian randomization analysis.&lt;/a&gt; &lt;a href=&quot;#fnref:23&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:24&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Yang, Y., Fan, J., Shi, X., Wang, Y., Yang, C., Lian, J., Wang, N. et al. (2022). &lt;a href=&quot;https://doi.org/10.1002/ehf2.14016&quot;&gt;Causal associations between sleep traits and four cardiac diseases: a Mendelian randomization study.&lt;/a&gt; &lt;a href=&quot;#fnref:24&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:26&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;This study found about a 1% increased risk from short sleep, contrasted with Daghlas et al. (2019) which found about a 20% increased risk. &lt;a href=&quot;#fnref:26&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:25&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Liao, L. z., Li, W. d., Liu, Y., Li, J. p., Zhuang, X. d., &amp;amp; Liao, X. x. (2020). &lt;a href=&quot;https://doi.org/10.1016/j.sleep.2019.08.014&quot;&gt;Causal assessment of sleep on coronary heart disease.&lt;/a&gt; &lt;a href=&quot;#fnref:25&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:46&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Lloret, M. A., Cervera-Ferri, A., Nepomuceno, M., Monllor, P., Esteve, D., &amp;amp; Lloret, A. (2020). &lt;a href=&quot;https://doi.org/10.3390/ijms21031168&quot;&gt;Is Sleep Disruption a Cause or Consequence of Alzheimer’s Disease? Reviewing Its Possible Role as a Biomarker.&lt;/a&gt; &lt;a href=&quot;#fnref:46&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:125&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Blondel, N., Berthoin, S., Billat, V., &amp;amp; Lensel, G. (2001). &lt;a href=&quot;http://dx.doi.org/10.1055/s-2001-11357&quot;&gt;Relationship between run times to exhaustion at 90, 100, 120, and 140% of vV O2max and velocity expressed relatively to critical velocity and maximal velocity.&lt;/a&gt; &lt;a href=&quot;#fnref:125&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:34&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Chiu, L. Z. (2023). &lt;a href=&quot;https://doi.org/10.1519/jsc.0000000000004655&quot;&gt;“Knees Out” or “Knees In”? Volitional Lateral vs. Medial Hip Rotation During Barbell Squats.&lt;/a&gt; &lt;a href=&quot;#fnref:34&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:37&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Ortega, F. B., Lavie, C. J., &amp;amp; Blair, S. N. (2016). &lt;a href=&quot;https://doi.org/10.1161/circresaha.115.306883&quot;&gt;Obesity and Cardiovascular Disease.&lt;/a&gt; &lt;a href=&quot;#fnref:37&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:36&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Wei, M., Kampert, J. B., Barlow, C. E., Nichaman, M. Z., Gibbons, L. W., Paffenbarger Jr, R. S., &amp;amp; Blair, S. N (1999). &lt;a href=&quot;https://doi.org/10.1001/jama.282.16.1547&quot;&gt;Relationship between low cardiorespiratory fitness and mortality in normal-weight, overweight, and obese men.&lt;/a&gt; &lt;a href=&quot;#fnref:36&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:75&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Arem, H., Moore, S. C., Patel, A., Hartge, P., Berrington de Gonzalez, A., Visvanathan, K., Campbell, P. T. et al. (2015). &lt;a href=&quot;https://doi.org/10.1001/jamainternmed.2015.0533&quot;&gt;Leisure Time Physical Activity and Mortality.&lt;/a&gt; &lt;a href=&quot;#fnref:75&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:73&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Lee, D. H., Rezende, L. F. M., Joh, H. K., Keum, N., Ferrari, G., Rey-Lopez, J. P., Rimm, E. B. et al. (2022). &lt;a href=&quot;https://doi.org/10.1161/circulationaha.121.058162&quot;&gt;Long-Term Leisure-Time Physical Activity Intensity and All-Cause and Cause-Specific Mortality: A Prospective Cohort of US Adults.&lt;/a&gt; &lt;a href=&quot;#fnref:73&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:78&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Lawler, P. R., Filion, K. B., &amp;amp; Eisenberg, M. J. (2011). &lt;a href=&quot;https://doi.org/10.1016/j.ahj.2011.07.017&quot;&gt;Efficacy of exercise-based cardiac rehabilitation post–myocardial infarction: A systematic review and meta-analysis of randomized controlled trials.&lt;/a&gt; &lt;a href=&quot;#fnref:78&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:79&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Morishita, S., Hamaue, Y., Fukushima, T., Tanaka, T., Fu, J. B., &amp;amp; Nakano, J. (2020). &lt;a href=&quot;https://doi.org/10.1177/1534735420917462&quot;&gt;Effect of Exercise on Mortality and Recurrence in Patients With Cancer: A Systematic Review and Meta-Analysis.&lt;/a&gt; &lt;a href=&quot;#fnref:79&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:81&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Posadzki, P., Pieper, D., Bajpai, R., Makaruk, H., Könsgen, N., Neuhaus, A. L., &amp;amp; Semwal, M. (2020). &lt;a href=&quot;https://doi.org/10.1186/s12889-020-09855-3&quot;&gt;Exercise/physical activity and health outcomes: an overview of Cochrane systematic reviews.&lt;/a&gt; &lt;a href=&quot;#fnref:81&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:30&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Goldstein-Piekarski, A. N., Greer, S. M., Saletin, J. M., &amp;amp; Walker, M. P. (2015). &lt;a href=&quot;https://doi.org/10.1523/jneurosci.5254-14.2015&quot;&gt;Sleep Deprivation Impairs the Human Central and Peripheral Nervous System Discrimination of Social Threat.&lt;/a&gt; &lt;a href=&quot;#fnref:30&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>I Have Whatever the Opposite of a Placebo Effect Is</title>
				<pubDate>Mon, 02 Sep 2024 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2024/09/02/I_have_the_opposite_of_placebo/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/09/02/I_have_the_opposite_of_placebo/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Two personal stories:&lt;/p&gt;

&lt;h3 id=&quot;a-story-about-caffeine&quot;&gt;A story about caffeine&lt;/h3&gt;

&lt;p&gt;When I first started working a full-time job, I started tracking my daily (subjective) productivity along with a number of variables that I thought might be relevant, like whether I exercised that morning or whether I took caffeine. I couldn’t perceive any differences in productivity based on any of the variables.&lt;/p&gt;

&lt;p&gt;After collecting about a year of data, I ran a regression. I found that most variables had no noticeable effect, but caffeine had a &lt;em&gt;huge&lt;/em&gt; effect—it increased my subjective productivity by about 20 percentage points, or an extra ~1.5 productive hours per day. Somehow I never noticed this enormous effect. Whatever the opposite of a placebo effect is, that’s what I had: caffeine had a large effect, but I thought it had no effect.&lt;/p&gt;
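
&lt;p&gt;For anyone who wants to try this, the analysis is just an ordinary least squares regression of daily productivity on the tracked variables. A minimal sketch (the file and column names here are placeholders, not my actual setup):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# Regress daily (subjective) productivity on the tracked variables.
import pandas as pd
import statsmodels.formula.api as smf

log = pd.read_csv(&quot;productivity_log.csv&quot;)  # one row per day
result = smf.ols(&quot;productivity ~ caffeine + exercise + hours_slept&quot;, data=log).fit()
print(result.summary())  # check the coefficient and p-value on caffeine
&lt;/code&gt;&lt;/pre&gt;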

&lt;h3 id=&quot;a-story-about-sleep&quot;&gt;A story about sleep&lt;/h3&gt;

&lt;p&gt;People always say that exercise helps them sleep better. I thought it didn’t work for me. When I do cardio, even like two hours of cardio, I don’t feel more tired in the evening and I don’t fall asleep (noticeably) faster.&lt;/p&gt;

&lt;p&gt;Yesterday, I decided to test this. I wrote a script to predict how long I slept based on how many calories my phone says I burned. The idea is that if I sleep less, that probably means I didn’t need as much sleep because the sleep I got was higher quality. (I almost always wake up naturally without an alarm.)&lt;/p&gt;

&lt;p&gt;Well, turns out exercise &lt;em&gt;does&lt;/em&gt; help. For every 500 calories burned (which is about what I burn during a normal cardio session), I sleep 25 minutes less. Once again, exercise had a huge effect, and I thought it didn’t do anything.&lt;/p&gt;
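
&lt;p&gt;The script boils down to a one-variable linear regression. A minimal sketch (with made-up sample data, not my real logs):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# Regress nightly sleep duration (minutes) on calories burned that day.
import numpy as np

calories  = np.array([200, 450, 700, 300, 900, 150, 600])  # made-up data
sleep_min = np.array([480, 465, 450, 475, 440, 485, 455])

slope, intercept = np.polyfit(calories, sleep_min, 1)
print(slope * 500)  # minutes of sleep per 500 kcal burned (about -30 here)
&lt;/code&gt;&lt;/pre&gt;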

&lt;p&gt;I guess I’m not very observant.&lt;/p&gt;

                </description>
			</item>
		
			<item>
				<title>Protein Quality (DIAAS) Calculator</title>
				<pubDate>Thu, 29 Aug 2024 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2024/08/29/DIAAS_calculator/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/08/29/DIAAS_calculator/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;Update 2025-01-17: I discovered another protein quality calculator that’s much more comprehensive than mine: &lt;a href=&quot;https://www.diaas-calculator.com/&quot;&gt;https://www.diaas-calculator.com/&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;You may know that complete proteins are good because they contain every essential amino acid. But you might not know that that’s not the full story.&lt;/p&gt;

&lt;p&gt;Take wheat. Wheat is a complete protein—it contains all nine essential amino acids. But it has a problem. Wheat only contains 27mg of lysine (an essential amino acid) per gram of protein, whereas the Food and Agriculture Organization recommends 48mg of lysine per gram. To make full use of a gram of protein, your body needs to get those 48mg. It doesn’t matter that wheat has lots of other essential amino acids. Once your body uses up all the lysine, it can’t make good use of the other amino acids in wheat protein.&lt;/p&gt;

&lt;p&gt;You can evaluate the protein quality of a food using the &lt;a href=&quot;https://en.wikipedia.org/wiki/Digestible_Indispensable_Amino_Acid_Score&quot;&gt;Digestible Indispensable Amino Acid Score (DIAAS)&lt;/a&gt;. This score determines the quality of a source of protein based on which essential amino acid will run out first, adjusted for digestibility. A score of 100 means the protein has plenty of every essential amino acid.&lt;/p&gt;

&lt;p&gt;Sometimes you can improve the protein quality of your food by mixing different ingredients. Wheat has a DIAAS of 57 because it only has 57% as much lysine per gram as your body needs. Peas have a score of 82 because they don’t have enough methionine + cysteine. But peas have 131% of the lysine requirement, and wheat has 149% of methionine + cysteine, so mix them together and they cover for each other’s weaknesses. A 50/50 mixture of wheat and pea protein has a DIAAS of 94.&lt;/p&gt;
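
&lt;p&gt;To make the arithmetic concrete: each amino acid’s score in a blend is the weighted average of the ingredients’ scores, and the DIAAS is the lowest of those (the limiting amino acid). A minimal sketch using the scaled wheat and pea profiles from the table later in this post:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# DIAAS of a blend: weighted average per amino acid, then take the minimum.
AA = [&quot;His&quot;, &quot;Ile&quot;, &quot;Leu&quot;, &quot;Lys&quot;, &quot;Met+Cys&quot;, &quot;Phe+Tyr&quot;, &quot;Thr&quot;, &quot;Trp&quot;, &quot;Val&quot;]
PROFILES = {  # scaled scores (100 = reference requirement), from the table below
    &quot;wheat&quot;: [148, 97, 94, 57, 149, 138, 97, 164, 99],
    &quot;pea&quot;:   [124, 108, 94, 131, 82, 147, 117, 99, 89],
}

def diaas(mix):
    &quot;&quot;&quot;mix maps each ingredient to its fraction of total protein.&quot;&quot;&quot;
    blend = [sum(frac * PROFILES[name][i] for name, frac in mix.items())
             for i in range(len(AA))]
    return min(blend), AA[blend.index(min(blend))]

print(diaas({&quot;wheat&quot;: 0.5, &quot;pea&quot;: 0.5}))  # (94.0, 'Leu'): DIAAS 94, as above
&lt;/code&gt;&lt;/pre&gt;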

&lt;p&gt;With this calculator, you can determine the DIAAS for mixtures of different protein sources.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;script src=&quot;/scripts/DIAAS.js&quot; defer=&quot;&quot;&gt;&lt;/script&gt;

&lt;form name=&quot;DIAAS&quot;&gt;
    &lt;table style=&quot;width: 200px; margin-left: auto; margin-right: auto; table-layout: fixed&quot;&gt;
    &lt;tr&gt;
        &lt;th&gt;Ingredient&lt;/th&gt;
        &lt;th&gt;Content (%)&lt;/th&gt;
    &lt;/tr&gt;

    
        &lt;tr&gt;
            &lt;td style=&quot;text-align:left&quot;&gt;Soy&lt;/td&gt;
            &lt;td&gt;&lt;input type=&quot;number&quot; id=&quot;SoyProtein&quot; value=&quot;0&quot; min=&quot;0&quot; max=&quot;100&quot; step=&quot;1&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
    
        &lt;tr&gt;
            &lt;td style=&quot;text-align:left&quot;&gt;Wheat&lt;/td&gt;
            &lt;td&gt;&lt;input type=&quot;number&quot; id=&quot;WheatProtein&quot; value=&quot;0&quot; min=&quot;0&quot; max=&quot;100&quot; step=&quot;1&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
    
        &lt;tr&gt;
            &lt;td style=&quot;text-align:left&quot;&gt;Pea&lt;/td&gt;
            &lt;td&gt;&lt;input type=&quot;number&quot; id=&quot;PeaProtein&quot; value=&quot;0&quot; min=&quot;0&quot; max=&quot;100&quot; step=&quot;1&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
    
        &lt;tr&gt;
            &lt;td style=&quot;text-align:left&quot;&gt;Fava bean&lt;/td&gt;
            &lt;td&gt;&lt;input type=&quot;number&quot; id=&quot;Fava beanProtein&quot; value=&quot;0&quot; min=&quot;0&quot; max=&quot;100&quot; step=&quot;1&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
    
        &lt;tr&gt;
            &lt;td style=&quot;text-align:left&quot;&gt;Hemp&lt;/td&gt;
            &lt;td&gt;&lt;input type=&quot;number&quot; id=&quot;HempProtein&quot; value=&quot;0&quot; min=&quot;0&quot; max=&quot;100&quot; step=&quot;1&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
    
        &lt;tr&gt;
            &lt;td style=&quot;text-align:left&quot;&gt;Rice&lt;/td&gt;
            &lt;td&gt;&lt;input type=&quot;number&quot; id=&quot;RiceProtein&quot; value=&quot;0&quot; min=&quot;0&quot; max=&quot;100&quot; step=&quot;1&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
    
        &lt;tr&gt;
            &lt;td style=&quot;text-align:left&quot;&gt;Potato&lt;/td&gt;
            &lt;td&gt;&lt;input type=&quot;number&quot; id=&quot;PotatoProtein&quot; value=&quot;0&quot; min=&quot;0&quot; max=&quot;100&quot; step=&quot;1&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
    
        &lt;tr&gt;
            &lt;td style=&quot;text-align:left&quot;&gt;Oat&lt;/td&gt;
            &lt;td&gt;&lt;input type=&quot;number&quot; id=&quot;OatProtein&quot; value=&quot;0&quot; min=&quot;0&quot; max=&quot;100&quot; step=&quot;1&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
    
        &lt;tr&gt;
            &lt;td style=&quot;text-align:left&quot;&gt;Corn&lt;/td&gt;
            &lt;td&gt;&lt;input type=&quot;number&quot; id=&quot;CornProtein&quot; value=&quot;0&quot; min=&quot;0&quot; max=&quot;100&quot; step=&quot;1&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
    
        &lt;tr&gt;
            &lt;td style=&quot;text-align:left&quot;&gt;Rapeseed&lt;/td&gt;
            &lt;td&gt;&lt;input type=&quot;number&quot; id=&quot;RapeseedProtein&quot; value=&quot;0&quot; min=&quot;0&quot; max=&quot;100&quot; step=&quot;1&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
    
        &lt;tr&gt;
            &lt;td style=&quot;text-align:left&quot;&gt;Lupin&lt;/td&gt;
            &lt;td&gt;&lt;input type=&quot;number&quot; id=&quot;LupinProtein&quot; value=&quot;0&quot; min=&quot;0&quot; max=&quot;100&quot; step=&quot;1&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
    
        &lt;tr&gt;
            &lt;td style=&quot;text-align:left&quot;&gt;Canola&lt;/td&gt;
            &lt;td&gt;&lt;input type=&quot;number&quot; id=&quot;CanolaProtein&quot; value=&quot;0&quot; min=&quot;0&quot; max=&quot;100&quot; step=&quot;1&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
    
        &lt;tr&gt;
            &lt;td style=&quot;text-align:left&quot;&gt;Whey&lt;/td&gt;
            &lt;td&gt;&lt;input type=&quot;number&quot; id=&quot;WheyProtein&quot; value=&quot;0&quot; min=&quot;0&quot; max=&quot;100&quot; step=&quot;1&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
    
    &lt;/table&gt;

    &lt;p&gt;&lt;input type=&quot;button&quot; class=&quot;button&quot; name=&quot;button&quot; value=&quot;Calculate&quot; onclick=&quot;fillDIAAS()&quot; /&gt;&lt;/p&gt;

    &lt;table width=&quot;400px&quot; style=&quot;border-collapse:collapse; margin-left:auto; margin-right:auto; table-layout:fixed&quot;&gt;
        &lt;tr&gt;
            &lt;td&gt;DIAAS&lt;/td&gt;
            &lt;td&gt;&lt;strong&gt;&lt;div id=&quot;DIAAS&quot;&gt;&lt;/div&gt;&lt;/strong&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
            &lt;td&gt;Limiting amino acid&lt;/td&gt;
            &lt;td&gt;&lt;strong&gt;&lt;div id=&quot;limitingAminoAcid&quot;&gt;&lt;/div&gt;&lt;/strong&gt;&lt;/td&gt;
        &lt;/tr&gt;
    &lt;/table&gt;

    Full amino acid profile: (100 = reference requirement)
    &lt;table&gt;
        &lt;tr&gt;
            &lt;td&gt;Histidine&lt;/td&gt;&lt;td&gt;Isoleucine&lt;/td&gt;&lt;td&gt;Leucine&lt;/td&gt;&lt;td&gt;Lysine&lt;/td&gt;&lt;td&gt;Met + Cys&lt;/td&gt;&lt;td&gt;Phe + Tyr&lt;/td&gt;&lt;td&gt;Threonine&lt;/td&gt;&lt;td&gt;Tryptophan&lt;/td&gt;&lt;td&gt;Valine&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
            &lt;td&gt;&lt;div id=&quot;Histidine&quot;&gt;&lt;/div&gt;&lt;/td&gt;
            &lt;td&gt;&lt;div id=&quot;Isoleucine&quot;&gt;&lt;/div&gt;&lt;/td&gt;
            &lt;td&gt;&lt;div id=&quot;Leucine&quot;&gt;&lt;/div&gt;&lt;/td&gt;
            &lt;td&gt;&lt;div id=&quot;Lysine&quot;&gt;&lt;/div&gt;&lt;/td&gt;
            &lt;td&gt;&lt;div id=&quot;MethionineCysteine&quot;&gt;&lt;/div&gt;&lt;/td&gt;
            &lt;td&gt;&lt;div id=&quot;PhenylalanineTyrosine&quot;&gt;&lt;/div&gt;&lt;/td&gt;
            &lt;td&gt;&lt;div id=&quot;Threonine&quot;&gt;&lt;/div&gt;&lt;/td&gt;
            &lt;td&gt;&lt;div id=&quot;Tryptophan&quot;&gt;&lt;/div&gt;&lt;/td&gt;
            &lt;td&gt;&lt;div id=&quot;Valine&quot;&gt;&lt;/div&gt;&lt;/td&gt;
        &lt;/tr&gt;
    &lt;/table&gt;
&lt;/form&gt;

&lt;h2 id=&quot;good-combinations-of-plant-proteins&quot;&gt;Good combinations of plant proteins&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;Soy protein alone has a DIAAS of 102.&lt;/li&gt;
  &lt;li&gt;22% pea + 36% fava bean + 42% hemp has a DIAAS of 96.&lt;/li&gt;
  &lt;li&gt;50% wheat + 50% pea has a DIAAS of 94.&lt;/li&gt;
  &lt;li&gt;28g of wheat protein plus a 500mg &lt;a href=&quot;https://www.amazon.com/NOW-L-Lysine-500-100-Tablets/dp/B000MGOWOC/&quot;&gt;lysine pill&lt;/a&gt; has a DIAAS of 94.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;table-of-amino-acid-profiles&quot;&gt;Table of amino acid profiles&lt;/h2&gt;

&lt;p&gt;Amino acid values are scaled to the reference values for adult amino acid requirements such that a score of 100 matches the reference value.&lt;/p&gt;

&lt;p&gt;Calculated using the amino acid values from &lt;a href=&quot;https://doi.org/10.1002/fsn3.1809&quot;&gt;Herreman et al. (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; and the reference values from &lt;a href=&quot;https://www.fao.org/ag/humannutrition/35978-02317b979a686a57aa4593304ffc17f06.pdf&quot;&gt;FAO Expert Consultation (2011)&lt;/a&gt;&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Protein source&lt;/th&gt;
      &lt;th&gt;DIAAS&lt;/th&gt;
      &lt;th&gt;Histidine&lt;/th&gt;
      &lt;th&gt;Isoleucine&lt;/th&gt;
      &lt;th&gt;Leucine&lt;/th&gt;
      &lt;th&gt;Lysine&lt;/th&gt;
      &lt;th&gt;Met + Cys&lt;/th&gt;
      &lt;th&gt;Phe + Tyr&lt;/th&gt;
      &lt;th&gt;Threonine&lt;/th&gt;
      &lt;th&gt;Tryptophan&lt;/th&gt;
      &lt;th&gt;Valine&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Soy&lt;/td&gt;
      &lt;td&gt;102&lt;/td&gt;
      &lt;td&gt;149&lt;/td&gt;
      &lt;td&gt;132&lt;/td&gt;
      &lt;td&gt;110&lt;/td&gt;
      &lt;td&gt;114&lt;/td&gt;
      &lt;td&gt;107&lt;/td&gt;
      &lt;td&gt;186&lt;/td&gt;
      &lt;td&gt;130&lt;/td&gt;
      &lt;td&gt;170&lt;/td&gt;
      &lt;td&gt;102&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Wheat&lt;/td&gt;
      &lt;td&gt;57&lt;/td&gt;
      &lt;td&gt;148&lt;/td&gt;
      &lt;td&gt;97&lt;/td&gt;
      &lt;td&gt;94&lt;/td&gt;
      &lt;td&gt;57&lt;/td&gt;
      &lt;td&gt;149&lt;/td&gt;
      &lt;td&gt;138&lt;/td&gt;
      &lt;td&gt;97&lt;/td&gt;
      &lt;td&gt;164&lt;/td&gt;
      &lt;td&gt;99&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Pea&lt;/td&gt;
      &lt;td&gt;82&lt;/td&gt;
      &lt;td&gt;124&lt;/td&gt;
      &lt;td&gt;108&lt;/td&gt;
      &lt;td&gt;94&lt;/td&gt;
      &lt;td&gt;131&lt;/td&gt;
      &lt;td&gt;82&lt;/td&gt;
      &lt;td&gt;147&lt;/td&gt;
      &lt;td&gt;117&lt;/td&gt;
      &lt;td&gt;99&lt;/td&gt;
      &lt;td&gt;89&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Fava bean&lt;/td&gt;
      &lt;td&gt;65&lt;/td&gt;
      &lt;td&gt;135&lt;/td&gt;
      &lt;td&gt;113&lt;/td&gt;
      &lt;td&gt;103&lt;/td&gt;
      &lt;td&gt;113&lt;/td&gt;
      &lt;td&gt;65&lt;/td&gt;
      &lt;td&gt;151&lt;/td&gt;
      &lt;td&gt;113&lt;/td&gt;
      &lt;td&gt;88&lt;/td&gt;
      &lt;td&gt;89&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Hemp&lt;/td&gt;
      &lt;td&gt;64&lt;/td&gt;
      &lt;td&gt;155&lt;/td&gt;
      &lt;td&gt;113&lt;/td&gt;
      &lt;td&gt;92&lt;/td&gt;
      &lt;td&gt;64&lt;/td&gt;
      &lt;td&gt;142&lt;/td&gt;
      &lt;td&gt;166&lt;/td&gt;
      &lt;td&gt;108&lt;/td&gt;
      &lt;td&gt;129&lt;/td&gt;
      &lt;td&gt;106&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Corn&lt;/td&gt;
      &lt;td&gt;43&lt;/td&gt;
      &lt;td&gt;138&lt;/td&gt;
      &lt;td&gt;96&lt;/td&gt;
      &lt;td&gt;175&lt;/td&gt;
      &lt;td&gt;43&lt;/td&gt;
      &lt;td&gt;148&lt;/td&gt;
      &lt;td&gt;178&lt;/td&gt;
      &lt;td&gt;107&lt;/td&gt;
      &lt;td&gt;67&lt;/td&gt;
      &lt;td&gt;97&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Rice&lt;/td&gt;
      &lt;td&gt;56&lt;/td&gt;
      &lt;td&gt;116&lt;/td&gt;
      &lt;td&gt;95&lt;/td&gt;
      &lt;td&gt;87&lt;/td&gt;
      &lt;td&gt;56&lt;/td&gt;
      &lt;td&gt;122&lt;/td&gt;
      &lt;td&gt;151&lt;/td&gt;
      &lt;td&gt;93&lt;/td&gt;
      &lt;td&gt;147&lt;/td&gt;
      &lt;td&gt;102&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Potato&lt;/td&gt;
      &lt;td&gt;125&lt;/td&gt;
      &lt;td&gt;125&lt;/td&gt;
      &lt;td&gt;166&lt;/td&gt;
      &lt;td&gt;155&lt;/td&gt;
      &lt;td&gt;145&lt;/td&gt;
      &lt;td&gt;135&lt;/td&gt;
      &lt;td&gt;266&lt;/td&gt;
      &lt;td&gt;205&lt;/td&gt;
      &lt;td&gt;165&lt;/td&gt;
      &lt;td&gt;148&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Oat&lt;/td&gt;
      &lt;td&gt;68&lt;/td&gt;
      &lt;td&gt;114&lt;/td&gt;
      &lt;td&gt;107&lt;/td&gt;
      &lt;td&gt;102&lt;/td&gt;
      &lt;td&gt;68&lt;/td&gt;
      &lt;td&gt;177&lt;/td&gt;
      &lt;td&gt;171&lt;/td&gt;
      &lt;td&gt;105&lt;/td&gt;
      &lt;td&gt;142&lt;/td&gt;
      &lt;td&gt;110&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Rapeseed&lt;/td&gt;
      &lt;td&gt;80&lt;/td&gt;
      &lt;td&gt;134&lt;/td&gt;
      &lt;td&gt;96&lt;/td&gt;
      &lt;td&gt;84&lt;/td&gt;
      &lt;td&gt;80&lt;/td&gt;
      &lt;td&gt;147&lt;/td&gt;
      &lt;td&gt;117&lt;/td&gt;
      &lt;td&gt;120&lt;/td&gt;
      &lt;td&gt;137&lt;/td&gt;
      &lt;td&gt;99&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Lupin&lt;/td&gt;
      &lt;td&gt;80&lt;/td&gt;
      &lt;td&gt;151&lt;/td&gt;
      &lt;td&gt;111&lt;/td&gt;
      &lt;td&gt;96&lt;/td&gt;
      &lt;td&gt;89&lt;/td&gt;
      &lt;td&gt;80&lt;/td&gt;
      &lt;td&gt;153&lt;/td&gt;
      &lt;td&gt;120&lt;/td&gt;
      &lt;td&gt;93&lt;/td&gt;
      &lt;td&gt;84&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Canola&lt;/td&gt;
      &lt;td&gt;85&lt;/td&gt;
      &lt;td&gt;131&lt;/td&gt;
      &lt;td&gt;99&lt;/td&gt;
      &lt;td&gt;85&lt;/td&gt;
      &lt;td&gt;86&lt;/td&gt;
      &lt;td&gt;142&lt;/td&gt;
      &lt;td&gt;123&lt;/td&gt;
      &lt;td&gt;120&lt;/td&gt;
      &lt;td&gt;144&lt;/td&gt;
      &lt;td&gt;94&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Whey&lt;/td&gt;
      &lt;td&gt;106&lt;/td&gt;
      &lt;td&gt;106&lt;/td&gt;
      &lt;td&gt;177&lt;/td&gt;
      &lt;td&gt;149&lt;/td&gt;
      &lt;td&gt;156&lt;/td&gt;
      &lt;td&gt;155&lt;/td&gt;
      &lt;td&gt;128&lt;/td&gt;
      &lt;td&gt;216&lt;/td&gt;
      &lt;td&gt;232&lt;/td&gt;
      &lt;td&gt;125&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h1 id=&quot;references&quot;&gt;References&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Herreman, L., Nommensen, P., Pennings, B., &amp;amp; Laus, M. C. (2020). &lt;a href=&quot;https://doi.org/10.1002/fsn3.1809&quot;&gt;Comprehensive overview of the quality of plant- and animal-sourced proteins based on the digestible indispensable amino acid score.&lt;/a&gt; &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;FAO Expert Consultation (2011). &lt;a href=&quot;https://www.fao.org/ag/humannutrition/35978-02317b979a686a57aa4593304ffc17f06.pdf&quot;&gt;Dietary protein quality evaluation in human nutrition.&lt;/a&gt; &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Just Because a Number Is a Rounding Error Doesn't Mean It's Not Important</title>
				<pubDate>Fri, 02 Aug 2024 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2024/08/02/rounding_error_can_be_important/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/08/02/rounding_error_can_be_important/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Sometimes, people call a number a “rounding error” as if to say it doesn’t matter. But a rounding error can still be very important!&lt;/p&gt;

&lt;p&gt;Say I’m tracking my weight. If I’ve put on 0.1 pounds since yesterday, that’s a rounding error—my weight fluctuates by 3 pounds on a day-to-day basis, so 0.1 pounds means nothing. But if I continue gaining 0.1 pounds per day, I’ll be obese after 18 months, and by the time I’m 70 I’ll be the fattest person who ever lived.&lt;/p&gt;

&lt;p&gt;Or if the stock market moves 1% in a day, that’s a rounding error. If it moves up 1% every day for a year, every individual day of which is a rounding error, it will be up 3700%, which would be the craziest thing that’s ever happened in the history of the global economy.&lt;/p&gt;

&lt;p&gt;This happens whenever the standard deviation is much larger than the mean. A large standard deviation means a “real” change gets obscured by random movement. But over enough iterations, the random movements even out and the real changes persist. For example, the stock market has an average daily return of 0.02% and a standard deviation of 0.8%. The standard deviation is 40x larger than the mean, so a real trend in prices gets totally washed out by noise. The market’s daily average return is a rounding error, but it’s still important.&lt;/p&gt;
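
&lt;p&gt;A quick simulation makes the point, using the return figures quoted above (a minimal sketch, not a market model):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# A tiny daily edge compounds even when daily noise is 40x bigger.
import numpy as np

rng = np.random.default_rng(0)
mean, sd = 0.0002, 0.008        # 0.02% mean daily return, 0.8% standard deviation
daily = rng.normal(mean, sd, 252 * 30)  # ~30 years of trading days
print(np.prod(1 + daily))       # multiplies to several-fold growth over 30 years
print(mean / sd)                # daily signal-to-noise ratio: 0.025
&lt;/code&gt;&lt;/pre&gt;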

                </description>
			</item>
		
			<item>
				<title>A 401(k) Sometimes Isn't Worth It</title>
				<pubDate>Wed, 24 Jul 2024 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2024/07/24/401k_fees/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/07/24/401k_fees/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;You don’t always save money by putting your investments into a 401(k).&lt;/p&gt;

&lt;p&gt;When you invest money inside a 401(k), you don’t have to pay taxes on any returns earned by your investments while the money stays in the account. But you also have to pay a fee to your 401(k) provider.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;If you buy and hold index funds in a taxable account, you don’t have to pay any capital gains tax on price increases until you sell.&lt;/li&gt;
  &lt;li&gt;In a 401(k), the annual fee adds up every year and may eventually exceed the tax savings.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the taxes cap out at the capital gains tax rate (15% or 20% depending on your tax bracket),&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; whereas the expenses of a 401(k) continue to accumulate.&lt;/p&gt;

&lt;p&gt;However, in a taxable account, you do still have to pay taxes on dividends (and bond payouts) every year, and those taxes might cost you more than the 401(k) fees.&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Below is a calculator to determine how many years before the 401(k) fees exceed the tax savings, if ever.&lt;/p&gt;
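
&lt;p&gt;For intuition, here is a minimal sketch of the comparison the calculator makes. It is a simplification, not the calculator’s actual code: it assumes equal income tax rates today and in retirement (so the traditional 401(k) behaves like a tax-free account with an annual fee), taxes dividends each year at the capital gains rate, and ignores employer matching.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# Find the first year the after-tax taxable account beats the after-fee 401(k).
def break_even_year(market_return=0.08, div_yield=0.02, fee=0.005,
                    cg_tax=0.15, max_years=200):
    k401 = taxable = basis = 1.0
    for year in range(1, max_years + 1):
        k401 *= 1 + market_return - fee
        dividends = taxable * div_yield * (1 - cg_tax)  # taxed annually
        taxable += taxable * (market_return - div_yield) + dividends
        basis += dividends             # reinvested dividends raise the cost basis
        after_sale = taxable - cg_tax * (taxable - basis)
        if after_sale &amp;gt; k401:
            return year
    return None  # the 401(k) stays ahead over the whole horizon

print(break_even_year())  # roughly 70 years under this simplified model
&lt;/code&gt;&lt;/pre&gt;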

&lt;script src=&quot;/scripts/401k.js&quot; defer=&quot;&quot;&gt;&lt;/script&gt;

&lt;form name=&quot;401k&quot;&gt;
    &lt;table&gt;
        &lt;tr&gt;
            &lt;td style=&quot;text-align:right&quot;&gt;employer matching (%)&lt;/td&gt;
            &lt;td&gt;&lt;input style=&quot;width:80px&quot; type=&quot;number&quot; step=&quot;50&quot; id=&quot;matching&quot; value=&quot;0&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
            &lt;td style=&quot;text-align:right&quot;&gt;total investment return including dividends (nominal) (%)&lt;/td&gt;
            &lt;td&gt;&lt;input style=&quot;width:80px&quot; type=&quot;number&quot; step=&quot;0.5&quot; id=&quot;marketReturn&quot; value=&quot;8&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
            &lt;td style=&quot;text-align:right&quot;&gt;dividend yield (%)&lt;/td&gt;
            &lt;td&gt;&lt;input style=&quot;width:80px&quot; type=&quot;number&quot; step=&quot;0.5&quot; id=&quot;divYield&quot; value=&quot;2&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
            &lt;td style=&quot;text-align:right&quot;&gt;401(k) fee (%)&lt;/td&gt;
            &lt;td&gt;&lt;input style=&quot;width:80px&quot; type=&quot;number&quot; step=&quot;0.05&quot; id=&quot;fee&quot; value=&quot;0.5&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
            &lt;td style=&quot;text-align:right&quot;&gt;capital gains tax rate (%)&lt;/td&gt;
            &lt;td&gt;&lt;input style=&quot;width:80px&quot; type=&quot;number&quot; step=&quot;1&quot; id=&quot;cgTax&quot; value=&quot;15&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
            &lt;td style=&quot;text-align:right&quot;&gt;income tax rate today (%)&lt;/td&gt;
            &lt;td&gt;&lt;input style=&quot;width:80px&quot; type=&quot;number&quot; step=&quot;1&quot; id=&quot;incomeTaxToday&quot; value=&quot;24&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
            &lt;td style=&quot;text-align:right&quot;&gt;income tax rate in retirement (%)&lt;/td&gt;
            &lt;td&gt;&lt;input style=&quot;width:80px&quot; type=&quot;number&quot; step=&quot;1&quot; id=&quot;incomeTaxInRetirement&quot; value=&quot;24&quot; /&gt;&lt;/td&gt;
        &lt;/tr&gt;
     &lt;/table&gt;

    &lt;p&gt;&lt;input type=&quot;button&quot; class=&quot;button&quot; name=&quot;button&quot; value=&quot;Calculate&quot; onclick=&quot;htmlBreakEventPoint()&quot; /&gt;&lt;/p&gt;

    &lt;table style=&quot;font-size:1em&quot;&gt;
        &lt;tr&gt;
            &lt;td&gt;A 401(k) falls behind a taxable account after:&lt;/td&gt;
            &lt;td&gt;&lt;strong&gt;&lt;div id=&quot;breakEven&quot;&gt;&lt;/div&gt;&lt;/strong&gt;&lt;/td&gt;
        &lt;/tr&gt;
    &lt;/table&gt;

    &lt;br /&gt;

&lt;/form&gt;

&lt;p&gt;This calculator assumes you buy index funds and hold them forever. If you trade stocks within a taxable account, you have to pay capital gains taxes every time you sell at a gain.&lt;/p&gt;

&lt;p&gt;Something else to consider: If you quit your job, your old employer’s 401(k) provider will let you roll your 401(k) into an IRA. You don’t have to pay any fees on an IRA.&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; So even if the 401(k) fees exceed the tax benefits after (say) 30 years, that’s not a problem if you expect to quit your job after less than 30 years. Realistically, few people stay at one job for so long that the 401(k) fees exceed the tax savings.&lt;/p&gt;

&lt;p&gt;(If you change jobs, usually you can roll your old 401(k) into your new 401(k), but I wouldn’t do that because it means you have to keep paying 401(k) fees. It’s almost always better to roll your old 401(k) into an IRA.)&lt;/p&gt;

&lt;h2 id=&quot;notes&quot;&gt;Notes&lt;/h2&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The capital gains tax will always be less than 15%/20% of your account value (depending on which tax bracket you’re in), but it converges on 15%/20% as the value approaches infinity.&lt;/p&gt;

      &lt;p&gt;Example: If you invest $100 in an index fund and you sell when the price reaches $101, you have to pay 20% of $1 (assuming you’re in the 20% tax bracket), which is only 0.2% of the total value. If you sell when the price reaches $1 million, you have to pay 20% of $999,900, which is 19.998% of the total value. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;H/T &lt;a href=&quot;https://www.benkuhn.net/&quot;&gt;Ben Kuhn&lt;/a&gt; for raising this possibility. I’m sure someone somewhere had considered it before him, but I’ve never seen anyone else bring it up, and standard financial advice ignores it. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Other than ETF/mutual fund fees, but you have to pay those no matter what. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Continuing My Caffeine Self-Experiment</title>
				<pubDate>Mon, 24 Jun 2024 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2024/06/24/continuing_caffeine_self_experiment/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/06/24/continuing_caffeine_self_experiment/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;I did another &lt;a href=&quot;https://mdickens.me/2024/04/11/caffeine_self_experiment/&quot;&gt;caffeine experiment on myself&lt;/a&gt;. This time I tested if I could have caffeine 4 days a week without getting habituated.&lt;/p&gt;

&lt;p&gt;Last time, when I took caffeine 3 days a week, I didn’t get habituated but the results were weird. This time, with the more frequent dose, I still didn’t get habituated, and the results were weird again!&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#introduction&quot; id=&quot;markdown-toc-introduction&quot;&gt;Introduction&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#experimental-procedure&quot; id=&quot;markdown-toc-experimental-procedure&quot;&gt;Experimental procedure&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#results&quot; id=&quot;markdown-toc-results&quot;&gt;Results&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#when-i-take-caffeine-3-days-in-a-row-do-i-habituate-by-the-3rd-day&quot; id=&quot;markdown-toc-when-i-take-caffeine-3-days-in-a-row-do-i-habituate-by-the-3rd-day&quot;&gt;When I take caffeine 3 days in a row, do I habituate by the 3rd day?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;

&lt;p&gt;In April, I &lt;a href=&quot;https://mdickens.me/2024/04/11/caffeine_self_experiment/&quot;&gt;published the results of a self-experiment&lt;/a&gt; on caffeine cycling. I drank coffee 3 days a week for 6 weeks&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; and tested my reaction time to see if I would become habituated to caffeine. If I do become habituated, my reaction time should get worse over the 6 weeks. It didn’t get worse, and in fact it got better, for unclear reasons.&lt;/p&gt;

&lt;p&gt;So my experiment showed that I didn’t become (detectably) habituated to caffeine when taking it 3 days a week. I ran a second experiment to see what happens if I up the dosage frequency to 4 days a week. Do I start to become habituated? Or can I get away with it?&lt;/p&gt;

&lt;p&gt;Turns out, I can get away with it. The results from the 4-day-a-week experiment show no signs of habituation.&lt;/p&gt;

&lt;p&gt;In fact, like &lt;a href=&quot;https://mdickens.me/2024/04/11/caffeine_self_experiment/#experimental-phase&quot;&gt;last time&lt;/a&gt;, my reaction time got (slightly, non-significantly) &lt;em&gt;better&lt;/em&gt; over the course of the experiment.&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;h2 id=&quot;experimental-procedure&quot;&gt;Experimental procedure&lt;/h2&gt;

&lt;p&gt;I followed the procedure of phase 4 as described in my &lt;a href=&quot;https://mdickens.me/2024/03/02/caffeine_tolerance/#appendix-b-pre-registration-for-a-caffeine-self-experiment&quot;&gt;pre-registration&lt;/a&gt;—it’s the same as phase 3, except that I drank coffee 4 days a week instead of 3. Specifically, I had caffeine on Monday, Wednesday, Thursday, and Friday. I ran the experiment for six weeks.&lt;/p&gt;

&lt;p&gt;A quick review of the experimental procedure:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Take caffeine on Mon/Wed/Thu/Fri.&lt;/li&gt;
  &lt;li&gt;Test reaction time without caffeine every morning, and test again an hour after caffeine on caffeine days.&lt;/li&gt;
  &lt;li&gt;Look at the slope of reaction time over 6 weeks (sketched just after this list). If post-caffeine reaction time got worse, that means I became habituated. If no-caffeine reaction time got worse, that means I developed withdrawal symptoms.&lt;/li&gt;
&lt;/ol&gt;
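
&lt;p&gt;A minimal sketch of the step-3 slope test (a reconstruction with made-up numbers; the real scripts and data are linked at the end of this post):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# Regress reaction time on day number; a significantly positive slope on
# post-caffeine tests would indicate habituation.
import numpy as np
from scipy import stats

day   = np.array([0, 2, 3, 4, 7, 9, 10, 11])                # caffeine days
rt_ms = np.array([275, 281, 273, 278, 270, 276, 268, 272])  # made-up data

fit = stats.linregress(day, rt_ms)
print(fit.slope, fit.pvalue)
&lt;/code&gt;&lt;/pre&gt;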

&lt;p&gt;I had caffeine three days in a row Wed/Thu/Fri so that I could test a secondary hypothesis: does caffeine become less effective by the third day? Pre-existing research suggests that habituation starts to appear as early as day 3. I might see a small habituation by Friday which then dissipates over the weekend.&lt;/p&gt;

&lt;h2 id=&quot;results&quot;&gt;Results&lt;/h2&gt;

&lt;p&gt;My reaction time slightly improved over the course of the six weeks.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-experimental2-regression.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Controlling for sleep quality (as measured by number of hours in bed) did not change the slope at all (to two significant figures).&lt;/p&gt;

&lt;p&gt;My reaction time also improved during the &lt;a href=&quot;https://mdickens.me/2024/04/11/caffeine_self_experiment/#experimental-phase&quot;&gt;previous experimental phase&lt;/a&gt;. But it didn’t improve over both phases combined. Looking at both experimental periods together (including the abstinence period in between), the slope is nearly flat (caffeine slope 0.01, no-caffeine slope –0.02).&lt;/p&gt;

&lt;p&gt;As &lt;a href=&quot;https://mdickens.me/2024/04/11/caffeine_self_experiment/#what-explains-these-results&quot;&gt;before&lt;/a&gt;, I don’t know how to explain why my reaction time improved within each experimental phase. My best guess is there’s some random-ish process that produces long-run trends in reaction time. Perhaps it’s the result of variations in sleep quality, but a type of sleep quality that time-in-bed can’t measure.&lt;/p&gt;

&lt;p&gt;But it looks like I didn’t get habituated when taking caffeine 4 days a week—or, at least, not to a detectable degree. So I’m going to keep taking caffeine 4 days a week.&lt;/p&gt;

&lt;h2 id=&quot;when-i-take-caffeine-3-days-in-a-row-do-i-habituate-by-the-3rd-day&quot;&gt;When I take caffeine 3 days in a row, do I habituate by the 3rd day?&lt;/h2&gt;

&lt;p&gt;The evidence suggests that I don’t, but the evidence is weak.&lt;/p&gt;

&lt;p&gt;I don’t have great data because I only collected five data points (it was supposed to be six, one for each week of the experiment, but I got sick on week five, which messed up my caffeine schedule&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;).&lt;/p&gt;

&lt;p&gt;On average, my reaction time was 278.8 ms on the first day and 273.0 ms on the third day (so reaction time got better, not worse). The difference had standard error 8.4 ms (p = 0.5). This weakly suggests that I don’t start getting habituated yet by the 3rd day, but my test was underpowered. (I’d only expect reaction time to get worse by maybe 3–5 ms, and the standard error was 8.4 ms. That’s an odds ratio of 1.15:1 between 0 ms and 4 ms of habituation.)&lt;/p&gt;

&lt;p&gt;I’d like to compare this to the pre-existing literature, but to my knowledge, no studies have ever administered daily caffeine to non-habituated users and measured the daily habituation curve. &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6343867/&quot;&gt;Lara et al. (2019)&lt;/a&gt;&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; comes the closest: it didn’t measure performance every day, but it did have participants take caffeine for 20 days and measured athletic performance on day 1 and day 4. It found a slight decrease in performance between days 1 and 4, with the exact number varying from 1% to 3% depending on the metric used. Some research on rats (see the studies cited &lt;a href=&quot;/2024/03/02/caffeine_tolerance/#experimental-evidence-on-intermittent-dosing&quot;&gt;here&lt;/a&gt;) found that performance slightly decreased from day 1 to day 3, but rat metabolism runs faster than humans’ so on priors I’d expect humans to become habituated more slowly.&lt;/p&gt;

&lt;p&gt;In conclusion, I don’t really know anything, but I’m gonna keep taking caffeine 4 days a week.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Source code and data for this experiment are available &lt;a href=&quot;https://github.com/michaeldickens/public-scripts/tree/master/caffeine&quot;&gt;on GitHub&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;p&gt;I bring this up because isn’t it weird that adding one extra week changed the p-value for no-caffeine tests from &amp;lt; 0.001 to 0.15?&lt;/p&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;In the &lt;a href=&quot;https://mdickens.me/2024/04/11/caffeine_self_experiment/&quot;&gt;original post&lt;/a&gt;, I included four weeks of data, but I continued the experiment for two weeks beyond what I had originally planned. The extra two weeks caused the regression lines to flatten out, but did not change their direction. This supports my hypothesis that the downward-sloping regressions were the result of some sort of anomaly. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I looked at the results at the end of week 5 because I thought I might need to end the experiment early (turns out I didn’t&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;), and I saw a much stronger downward slope: p = 0.029 for slope on post-caffeine tests, and p &amp;lt; 0.001 for slope on no-caffeine tests. But I performed badly enough in week six that the slope largely flattened out. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I still took caffeine 4 days that week, so the overall experiment didn’t get messed up. But I changed which days I took caffeine, so I didn’t get 3 days in a row. &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Beatriz Lara, Carlos Ruiz-Moreno, Juan Jose Salinero &amp;amp; Juan Del Coso (2019). &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6343867/&quot;&gt;Time course of tolerance to the performance benefits of caffeine.&lt;/a&gt; &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I had a problem with my GPU driver that increased the latency on my monitor, which was going to mess up the results. But I fixed the problem after a couple days so I just skipped those days, and I didn’t have to skip any of the important days. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Some Things I've Changed My Mind On</title>
				<pubDate>Thu, 23 May 2024 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2024/05/23/some_things_ive_changed_my_mind_on/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/05/23/some_things_ive_changed_my_mind_on/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Here are some things I’ve changed my mind about. Most of the changes are recent (because I can remember recent stuff more easily) but some of them happened 5+ years ago.&lt;/p&gt;

&lt;p&gt;I’m a little nervous about writing this because a few of my old beliefs were really dumb. But I don’t think it would be fair to include only my smart beliefs.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#effective-altruism&quot; id=&quot;markdown-toc-effective-altruism&quot;&gt;Effective altruism&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#finance&quot; id=&quot;markdown-toc-finance&quot;&gt;Finance&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#health-and-fitness&quot; id=&quot;markdown-toc-health-and-fitness&quot;&gt;Health and fitness&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#miscellaneous&quot; id=&quot;markdown-toc-miscellaneous&quot;&gt;Miscellaneous&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#what-patterns-emerge&quot; id=&quot;markdown-toc-what-patterns-emerge&quot;&gt;What patterns emerge?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;effective-altruism&quot;&gt;Effective altruism&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; In my original 2021 version of &lt;a href=&quot;https://mdickens.me/2021/04/05/comparison_of_DAF_providers/&quot;&gt;A Comparison of Donor-Advised Fund Providers&lt;/a&gt;, I recommended Schwab Charitable as the best DAF provider for most people.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; When I reviewed the post this year, I noticed Schwab’s default fund fees are too high, so it was a bad idea to recommend them. I don’t recall exactly what I thought about the default fund fees when I first wrote the article; perhaps I noticed the high fees and thought it didn’t matter because people can switch to cheaper funds. If I did think that, then that was a mistake because a large proportion of people will stick with the default option without looking at it, and if they do that with Schwab, they’ll get ripped off.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; After writing &lt;a href=&quot;https://mdickens.me/2020/11/23/uncorrelated_investing/&quot;&gt;Uncorrelated Investments for Altruists&lt;/a&gt;, I thought that the marginal donor’s philanthropic investment portfolio should aim for near zero correlation to equities.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; In the process of writing &lt;a href=&quot;https://mdickens.me/2020/12/14/asset_allocation_for_altruists_with_constraints/&quot;&gt;Asset Allocation and Leverage for Altruists with Constraints&lt;/a&gt;, I wrote &lt;a href=&quot;https://github.com/michaeldickens/public-scripts/blob/master/mvo.py&quot;&gt;code&lt;/a&gt; to do portfolio optimization and ran it under various assumptions. I found results that contradicted my previous belief. Now I believe that the optimal marginal investment portfolio should still have some correlation to equities because getting the extra expected return is worth accepting some positive correlation.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; In 2015 I donated to Raising for Effective Giving and &lt;a href=&quot;https://mdickens.me/2015/09/15/my_cause_selection/&quot;&gt;argued&lt;/a&gt; for why they were my favorite donation target.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; After 2015, their fundraising model didn’t keep working as well as I expected.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; &lt;a href=&quot;https://mdickens.me/2021/07/21/metaculus_learning_value/&quot;&gt;Metaculus Questions Suggest Money Will Do More Good in the Future&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; After I published that post, some commenters argued for a different interpretation of the Metaculus questions.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;finance&quot;&gt;Finance&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; SBF and Alameda have skill at beating the market.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; I’m sure you know what changed my mind. There’s some chance that they did actually have skill and they blew up due to bad luck (them committing fraud was bad behavior, not bad luck, but as I understand it, they blew up because they lost a bunch of money, and they might have gotten away with the fraud if they’d made money). But I now believe it’s more likely that the risks they took were not calculated and they didn’t have much skill. (Clearly SBF had a lot of skill at fundraising, but that’s not the same thing as trading skill.)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; In 2021 and earlier, I estimated future market returns using Research Affiliates’ &lt;a href=&quot;https://interactive.researchaffiliates.com/asset-allocation&quot;&gt;model&lt;/a&gt; (e.g. &lt;a href=&quot;https://mdickens.me/2020/01/06/how_much_leverage_should_altruists_use/#return-expectations&quot;&gt;here&lt;/a&gt;), which assumes market valuations mean-revert after 10 years.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; I read AQR’s capital market assumptions (see e.g. &lt;a href=&quot;https://www.aqr.com/Insights/Research/Alternative-Thinking/2024-Capital-Market-Assumptions-for-Major-Asset-Classes&quot;&gt;their 2024 publication&lt;/a&gt;) where they argue that there’s no strong reason to expect valuations to mean revert. Now I prefer the AQR model which uses the traditional “yield + growth” approach with no consideration for valuation. I still believe valuations ought to mean revert, but it could easily take more than 10 years, and they might only revert somewhat, so I think it’s reasonable to take an average of the AQR and Research Affiliates projections.&lt;/p&gt;

    &lt;p&gt;Putting less consideration on mean reversion makes equity market return projections cluster closer together. I would still order expected equity returns as emerging markets &amp;gt; developed markets ex-US &amp;gt; US, but I do not expect the differences to be as big as I used to. I used to quote something like a 6% real return for emerging markets and 0% for the US market. Now I expect more like 5% for emerging markets and 2% for the US.&lt;/p&gt;
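    &lt;p&gt;To make the two approaches concrete, here is a back-of-the-envelope version of each (the inputs are illustrative placeholders, not AQR’s or Research Affiliates’ actual estimates):&lt;/p&gt;

    &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# Two stylized equity return models; all inputs are made-up illustrations.
dividend_yield = 0.02            # assumed current yield
real_growth = 0.015              # assumed real earnings growth
current_cape, fair_cape = 30.0, 20.0
horizon = 10                     # years over which valuations revert

# AQR-style "yield + growth": ignore valuation changes entirely
aqr_style = dividend_yield + real_growth

# Research Affiliates-style: add annualized valuation mean reversion
reversion = (fair_cape / current_cape) ** (1 / horizon) - 1
ra_style = dividend_yield + real_growth + reversion

# My compromise: average the two projections
blended = (aqr_style + ra_style) / 2
print(aqr_style, ra_style, blended)  # 0.035, about -0.005, about 0.015
&lt;/code&gt;&lt;/pre&gt;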

    &lt;p&gt;I wrote more about my updated expectations &lt;a href=&quot;https://mdickens.me/2022/04/01/how_I_estimate_future_investment_returns/&quot;&gt;here&lt;/a&gt;. I don’t pay too much attention to return projections, so changing my mind on this didn’t change my investment strategy.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; From a purely financial perspective (ignoring personal taste etc.), renting a house is always a better decision than buying.&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; Around 2020, I read the argument that owning a house works as a hedge against future housing expenditures, which means buying is better than renting in many cases.&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

    &lt;p&gt;I still believe most discourse on renting vs. buying is confused. For example:&lt;/p&gt;

    &lt;ol&gt;
      &lt;li&gt;It doesn’t make any sense to directly compare monthly rent to monthly mortgage payments because with a mortgage, you’re accumulating equity in an asset.&lt;/li&gt;
      &lt;li&gt;Maybe people understand point 1 and believe a mortgage is always better because renting is “throwing money away”. That’s also wrong because you have to account for the time value of money. Dumping most of your net worth into a house has a huge opportunity cost.&lt;/li&gt;
      &lt;li&gt;When you account for opportunity costs, you have to consider the risk of a mortgage vs. the risk of the counterfactual investment (e.g., an index fund).&lt;/li&gt;
    &lt;/ol&gt;

    &lt;p&gt;I looked through a dozen online “rent vs. buy” calculators; only three of them properly accounted for equity and opportunity costs (&lt;a href=&quot;https://www.financialmentor.com/calculator/rent-vs-buy-calculator&quot;&gt;Financial Mentor&lt;/a&gt;, &lt;a href=&quot;https://www.fool.com/calculators/should-i-buy-or-rent-lets-crunch-the-numbers.aspx&quot;&gt;Motley Fool&lt;/a&gt;, and &lt;a href=&quot;https://www.nytimes.com/interactive/2024/upshot/buy-rent-calculator.html&quot;&gt;New York Times&lt;/a&gt; (paywalled)), and none of them accounted for risk. (The three good calculators use different methods but they’re basically interchangeable.)&lt;/p&gt;
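    &lt;p&gt;To illustrate what a fair comparison has to include, here is a toy model (the function and every input below are made-up illustrations, and the mortgage is interest-only to keep it short). It credits the buyer with home equity, and it credits the renter with index-fund returns on the down payment plus any cash-flow savings, which is the opportunity cost most calculators miss. It still ignores risk, taxes, amortization, and transaction costs:&lt;/p&gt;

    &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# Toy rent-vs-buy comparison; all inputs are illustrative assumptions.
def compare(years=10, price=500_000, down=100_000, annual_rent=24_000,
            mortgage_rate=0.06, home_growth=0.03, stock_return=0.07,
            upkeep_rate=0.02):
    loan = price - down
    # Owner's annual cash outflow: interest (interest-only loan) plus upkeep
    owner_cash_out = loan * mortgage_rate + price * upkeep_rate
    # Renter invests the down payment, plus each year's difference between
    # what owning would have cost and what rent actually costs
    renter_wealth = down
    for _ in range(years):
        renter_wealth = renter_wealth * (1 + stock_return) + (owner_cash_out - annual_rent)
    # Buyer's wealth is home equity: appreciated price minus the loan balance
    home_equity = price * (1 + home_growth) ** years - loan
    return round(renter_wealth), round(home_equity)

print(compare())  # (334880, 271958) with these made-up inputs
&lt;/code&gt;&lt;/pre&gt;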
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; In 2016 when I was looking for a full-time job at a startup, I evaluated equity compensation at face value.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; I should have considered risk. Equity compensation is risky (2–4x riskier than an index fund), which makes it look much worse.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind again:&lt;/strong&gt; Later, I realized there are some other factors that make equity compensation look better. In 2021 I did a more in-depth analysis &lt;a href=&quot;https://mdickens.me/2021/11/12/ea_work_at_startups/&quot;&gt;here&lt;/a&gt;, and my current opinion is that equity in a good startup is worth more than its face value.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; In 2013 I read some research on investing strategies like &lt;a href=&quot;https://en.wikipedia.org/wiki/Magic_formula_investing&quot;&gt;Greenblatt’s magic formula&lt;/a&gt; and thought they sounded like a great idea.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; Actually, I still believe Greenblatt-esque strategies are a great idea (at least for some people in some contexts). But I believe I over-updated on the evidence I had at the time; I just got lucky that I was looking at weak evidence for true claims.&lt;/p&gt;

    &lt;p&gt;I originally became convinced that Greenblatt’s magic formula worked when I read Abbey &amp;amp; Larkin (2012), “Can simple one and two-factor investing strategies capture the value premium?” The paper looked at US stocks over a 30-year period. Now, I would want to see more evidence than that. I like the five criteria given by Berkin &amp;amp; Swedroe’s &lt;a href=&quot;https://www.amazon.com/Your-Complete-Guide-Factor-Based-Investing/dp/0692783652&quot;&gt;Your Complete Guide to Factor-Based Investing&lt;/a&gt;: to take a market anomaly seriously, it must be (1) &lt;strong&gt;persistent&lt;/strong&gt; across time, (2) &lt;strong&gt;pervasive&lt;/strong&gt; across markets, (3) &lt;strong&gt;robust&lt;/strong&gt; to different formulations, (4) &lt;strong&gt;investable&lt;/strong&gt;, and (5) have a risk-based or behavioral &lt;strong&gt;explanation&lt;/strong&gt;. The evidence from Abbey &amp;amp; Larkin (2012) established robustness and half-established persistence (30 years is decently long so I give it half credit), but didn’t address the other three and a half criteria.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;health-and-fitness&quot;&gt;Health and fitness&lt;/h2&gt;

&lt;p&gt;(I’ve been thinking a lot about health and fitness lately.)&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; In 2011 I quit coffee cold turkey after reading &lt;em&gt;You Are Not So Smart&lt;/em&gt;’s article &lt;a href=&quot;https://youarenotsosmart.com/2010/02/22/coffee/&quot;&gt;Coffee&lt;/a&gt;. I believed that caffeine had no effect on a daily user except to reverse withdrawal symptoms.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; In &lt;a href=&quot;https://mdickens.me/2024/03/29/does_caffeine_stop_working/&quot;&gt;Does Caffeine Stop Working?&lt;/a&gt;, I investigated more deeply and now I believe that caffeine retains something like half its initial benefit.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; I can trust the research on stuff like caffeine.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; I read some caffeine studies and found that most of them were pretty bad. And not bad in the obvious way of having small sample sizes (which is honestly fine as long as you’re aware of it—weak evidence is still evidence). They were bad in the sense of “your study’s methodology is not capable even in principle of providing evidence for or against your hypothesis”. The &lt;em&gt;majority&lt;/em&gt; of studies were like that (around 75% of them, if I remember correctly).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; After reading &lt;a href=&quot;https://www.amazon.com/Starting-Strength-Mark-Rippetoe-ebook/dp/B006XJR5ZA/&quot;&gt;Starting Strength&lt;/a&gt; in 2014,&lt;sup id=&quot;fnref:13&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:13&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; I thought Starting Strength-style training was the best in every situation, and bodybuilding-style isolation movements were dumb.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; After a few months, I realized I had been on the “hill of novice overconfidence” and actually isolation exercises are fine. Later I re-read &lt;em&gt;Starting Strength&lt;/em&gt; and realized it never even said you shouldn’t do isolation training. It made the more nuanced claims that (1) isolation training is not ideal for developing strength and (2) compound barbell movements are better for beginners.&lt;/p&gt;

    &lt;p&gt;After learning more about the scientific literature and the diversity in how elite athletes train, now I tend to believe differences in training mostly don’t matter. You can get good results with any method as long as you lift heavy weights, increase the weight over time, and get sufficient food and rest.&lt;sup id=&quot;fnref:14&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; High-intensity interval training is the best kind of cardio.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; I started reading some experts on exercise science, and they say low-intensity cardio is just as good most of the time, and the ideal routine consists of something like 80% easy cardio and 20% hard cardio (and the easy cardio should be really easy&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;).&lt;/p&gt;

    &lt;p&gt;And thank God for that because I kinda hate doing moderate/hard cardio. I’ve started being way more consistent about aerobic exercise—I go for brisk hilly walks 3 times a week and haven’t missed a day in months.&lt;sup id=&quot;fnref:18&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:18&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; And my resting heart rate has gone down from 70–75 bpm a few years ago to 55–58 bpm today, so I guess it’s working. I’ve also noticed I can do high-rep squats and deadlifts without getting winded. I remember in 2016 I deadlifted 225 for 10 reps and I basically died (my heart rate hit 190 bpm). A few weeks ago I deadlifted 315 for 11 (note: my 1-rep max has barely changed since 2016) and I felt fine (heart rate 144 bpm).&lt;/p&gt;

    &lt;p&gt;(I’ve read hardly any original research on cardio, but as I understand, the older research did show that HIIT was better than low-intensity cardio, and newer research changed that—see the 6-part series &lt;a href=&quot;https://x.com/Ekkekakis/status/1689692611018129408&quot;&gt;Extraordinary Claims in the Literature on High-Intensity Interval Training&lt;/a&gt; by Ekkekakis et al. (2023).&lt;sup id=&quot;fnref:17&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:17&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt; So my beliefs basically tracked the research findings, although I was getting all my info second-hand.)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; At different times, I believed (1) we don’t really know anything about nutrition; (2) food choice doesn’t matter as long as you don’t over-eat.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; I read &lt;a href=&quot;https://nutritionfacts.org/book/how-not-to-die/&quot;&gt;How Not To Die&lt;/a&gt; in 2017, which referenced a large quantity of nutrition research that contradicted my previous beliefs. I now believe the book was wrong about some things (which I will discuss in the next line item), but it was more correct than my pre-2017 self. Later on, I read a wider variety of evidence-based nutrition advice.&lt;/p&gt;

    &lt;p&gt;My current position is that we are really pretty sure about some things in nutrition, and some foods are unhealthy even if you don’t over-eat. I believe conventional nutrition advice in educated circles is basically correct: trans fat, saturated fat, and added sugar are bad; processed food is generally bad; whole plant foods (especially fruits and veggies) are good.&lt;/p&gt;

    &lt;p&gt;(I still don’t have a great sense of the distinction between foods that make it easy to overeat and foods that are unhealthy at any bodyweight. Like I know that sugar in small quantities isn’t bad for healthy-weight individuals, but is that because the badness is too minor to detect, or because there’s some threshold below which sugar causes zero harm whatsoever?)&lt;/p&gt;

    &lt;p&gt;I updated my beliefs by following my “web of trust”: my layman friend trusts this dietitian; my other layman friend trusts this medical doctor who agrees with the first dietitian about most things; I trust Scott Alexander, and he likes this one &lt;a href=&quot;https://www.stephanguyenet.com/&quot;&gt;researcher&lt;/a&gt;, who endorsed this &lt;a href=&quot;https://www.redpenreviews.org/reviews/eat-drink-and-be-healthy/&quot;&gt;book&lt;/a&gt;; these two &lt;a href=&quot;https://www.barbellmedicine.com/&quot;&gt;guys&lt;/a&gt; know a lot about strength training and I like their epistemology,&lt;sup id=&quot;fnref:10&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt; so they probably also know a thing or two about nutrition, and they agree with the guy Scott liked; etc.&lt;/p&gt;

    &lt;p&gt;According to my web of trust, the best book on nutrition is &lt;a href=&quot;https://mdickens.me/2024/05/23/notes_on_eat_drink_and_be_healthy/&quot;&gt;Eat, Drink, and Be Healthy&lt;/a&gt;, which as far as I know is the only book that makes a 100% earnest effort to represent the state of nutrition science.&lt;/p&gt;

    &lt;p&gt;I still haven’t made an effort to interpret the primary literature on nutrition science. All the really big studies are observational, and there’s an art to controlling for confounders, and I believe it would take a lot of work for me to understand how they do it. I trust that at least some researchers have a good conception of how to disentangle causality (Willett &amp;amp; Skerrett, authors of &lt;em&gt;Eat, Drink, and Be Healthy&lt;/em&gt;, do a good job of explaining why they believe observational studies establish causality in certain cases.&lt;sup id=&quot;fnref:9&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt;)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; After reading &lt;a href=&quot;https://nutritionfacts.org/book/how-not-to-die/&quot;&gt;How Not To Die&lt;/a&gt; in 2017, I believed processed unsaturated fats (such as olive oil) were unhealthy, and unsaturated fats should be consumed as whole foods (e.g. by eating nuts).&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; I &lt;a href=&quot;https://mdickens.me/2024/05/23/notes_on_eat_drink_and_be_healthy/&quot;&gt;read&lt;/a&gt; &lt;em&gt;Eat, Drink, and Be Healthy&lt;/em&gt;, which said olive oil is healthy, and it presented some evidence that looked reasonable to me. There’s a plausible mechanism for oil being healthy (it helps the body produce HDL and HDL sucks loose cholesterol out of the arteries), and there are some empirical studies where oil (esp. olive oil) was associated with better health outcomes, including at least one RCT.&lt;/p&gt;

    &lt;p&gt;The argument against olive oil in &lt;em&gt;How Not to Die&lt;/em&gt; is that it’s processed to remove some of the nutrition of the olive, which makes it less healthy. That’s not wrong—raw olives are probably healthier than olive oil—but realistically I’m not gonna replace olive oil with eating handfuls of raw olives,&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt; and the empirical evidence suggests that adding olive oil to a diet makes it healthier.&lt;/p&gt;

    &lt;p&gt;&lt;em&gt;How Not to Die&lt;/em&gt; recommends replacing olive oil with nuts. Some RCT evidence does suggest that nuts are healthier than olive oil, but nuts often don’t work as a substitute for oil (you can’t pan-fry food in a bed of nuts).&lt;/p&gt;

    &lt;p&gt;&lt;em&gt;How Not to Die&lt;/em&gt; is biased toward veganism, which I knew before I read it so I didn’t update much on the stuff about how all animal products are unhealthy, although I was vegan anyway so it didn’t affect my behavior. I largely trusted the book on non-animal subjects because it cited a lot of research and seemed well-reasoned. Based on my current understanding of mainstream positions among nutrition scientists, almost all of the non-animal stuff (and most of the animal stuff) in the book is mainstream, but it over-emphasizes the badness of processed foods in general. The mainstream position among nutrition scientists is that you should avoid “processed foods” as a general rule, but plenty of specific processed foods are fine, like olive oil or protein powder.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; The ideal BMI is around 18–20 (on the low end of the “healthy” range of 18.5 to 25).&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; My old belief wasn’t based on direct evidence. I just had a prior that official recommendations are gonna be too generous, for example recommending less exercise than is optimal because they don’t think people will actually do the optimal amount of exercise, or recommending a “healthy” BMI that’s actually a bit too generous because they think people will give up if they’re told to aim for a BMI of 20. I updated my belief after reading a &lt;a href=&quot;https://www.thelancet.com/journals/landia/article/PIIS2213-8587(18)30288-2/fulltext&quot;&gt;large study&lt;/a&gt;&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt; on BMI and all-cause mortality. As of writing the first draft of this post, I weakly believed the ideal was on the middle-high side of the “healthy” range (so around 22–23), but I wrote in my first draft, “I want to investigate this more.”&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind again:&lt;/strong&gt; I &lt;a href=&quot;https://mdickens.me/2024/05/05/healthiest_BMI/&quot;&gt;investigated more&lt;/a&gt;. (I had to take a diversion from writing the post you’re currently reading to write a different post about BMI.) Now I believe the ideal BMI is 20–22, for reasons I explain in the &lt;a href=&quot;https://mdickens.me/2024/05/05/healthiest_BMI/&quot;&gt;linked post&lt;/a&gt;. Lower than 20 is fine, maybe even better, if you have adequate lean mass. 22–23 appears to carry (slightly) greater health risks.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; I believed that I had seen an RCT that found that sunscreen didn’t work, and I had written that down in a note.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; I looked up my note and saw that, in actuality, my note said that sunscreen &lt;em&gt;did&lt;/em&gt; work. I somehow flipped the sign of the outcome in my memory. I don’t understand how that happened.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;miscellaneous&quot;&gt;Miscellaneous&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; RCTs are high-quality evidence.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; I learned about the replication crisis, and (later) I read some actual RCTs. I now believe the median RCT is pretty badly designed and it shouldn’t change your beliefs much unless you’ve actually read it and understood its methodology. And, perhaps as a corollary, the median scientist isn’t very smart. (This doesn’t necessarily follow because there are reasons why smart scientists might publish dumb papers.)&lt;/p&gt;

    &lt;p&gt;(Doctors and professors appear to have average IQs around 115–125—see &lt;a href=&quot;https://users.ssc.wisc.edu/~hauser/merit_01_081502_complete.pdf&quot;&gt;Meritocracy, Cognitive Ability, and the Sources of Occupational Success&lt;/a&gt;—which is a full standard deviation below the IQs of most of my friends,&lt;sup id=&quot;fnref:20&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:20&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt; and probably most of the people reading this. So maybe it’s fair after all to say the median scientist isn’t very smart. But you could still argue that a lifetime of expertise matters more than 15 IQ points.)&lt;/p&gt;

    &lt;p&gt;I see one area where scientists routinely mis-interpret their own evidence. They often struggle to understand the difference between&lt;/p&gt;

    &lt;ol&gt;
      &lt;li&gt;Our study found a large (in absolute terms) but non-significant effect because our study was underpowered.&lt;/li&gt;
      &lt;li&gt;Our study robustly established no effect: the standard error in our data was small enough that any meaningful effect would show up, and it didn’t.&lt;/li&gt;
    &lt;/ol&gt;

    &lt;p&gt;You especially see this fallacy in areas where small effect sizes still matter. For example, a 0.1% decrease in mortality risk matters a lot, but it’s very hard to detect with a study, and study authors often incorrectly conclude that the effect doesn’t exist when they fail to find it.&lt;/p&gt;
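    &lt;p&gt;One way to check which case you’re in is to compute the smallest effect a study could plausibly have detected. Here is a rough sketch using the standard normal approximation; the function name, base rate, and sample sizes are made-up illustrations. If the minimum detectable effect is much larger than the effect size you care about, a null result tells you almost nothing.&lt;/p&gt;

    &lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# Minimum detectable absolute difference in event rates between two
# equal-sized arms (two-sided test, normal approximation).
from scipy.stats import norm

def min_detectable_diff(n_per_arm, base_rate=0.05, alpha=0.05, power=0.8):
    se = (2 * base_rate * (1 - base_rate) / n_per_arm) ** 0.5
    return (norm.ppf(1 - alpha / 2) + norm.ppf(power)) * se

# A 0.1% (0.001) mortality difference matters, but detecting it takes huge samples:
for n in (1_000, 100_000, 10_000_000):
    print(n, round(min_detectable_diff(n), 5))  # 0.02731, 0.00273, 0.00027
&lt;/code&gt;&lt;/pre&gt;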

    &lt;p&gt;(David J. Balan says &lt;a href=&quot;https://www.overcomingbias.com/p/doctor-there-arhtml&quot;&gt;there are two kinds of “no evidence”&lt;/a&gt;. I’d say there are three kinds: (1) we haven’t looked for evidence; (2) we looked for evidence in a way that &lt;a href=&quot;https://en.wikipedia.org/wiki/Streetlight_effect&quot;&gt;wasn’t gonna find any evidence&lt;/a&gt;, and we didn’t find any evidence; (3) we looked for evidence in a way that would have found it if it existed, and we still didn’t find it.)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; RCTs on strength training are basically useless. I read &lt;a href=&quot;https://www.amazon.com/Practical-Programming-Strength-Training-Rippetoe/dp/0982522754&quot;&gt;Practical Programming for Strength Training&lt;/a&gt;, which argued that strength coaches know better than researchers because RCTs are deeply flawed: they test a group of untrained individuals over 12 weeks or less, and that sort of training context doesn’t generalize to a more-experienced individual who follows a program for a year or longer.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; This one’s interesting because it’s the opposite of the previous line item.&lt;/p&gt;

    &lt;p&gt;I started listening to some more research-driven experts like &lt;a href=&quot;https://www.barbellmedicine.com/&quot;&gt;Barbell Medicine&lt;/a&gt; and &lt;a href=&quot;https://www.strongerbyscience.com/&quot;&gt;Stronger by Science&lt;/a&gt; and hearing their perspective on scientific studies. While it’s true that many (most?) sports science RCTs don’t generalize, there are plenty of studies that correct for the criticisms made by &lt;em&gt;Practical Programming for Strength Training&lt;/em&gt;.&lt;/p&gt;

    &lt;p&gt;Why the difference between “RCTs are bad” in the previous line item and “RCTs are good actually” now? As best I can figure, this is the deal:&lt;/p&gt;

    &lt;ol&gt;
      &lt;li&gt;RCTs are bad if you blindly accept all of them.&lt;/li&gt;
      &lt;li&gt;RCTs are good if you know how to read a study and understand where it does and does not generalize.&lt;/li&gt;
    &lt;/ol&gt;

    &lt;p&gt;The science popularizers I pay attention to know how to distinguish between bad and good studies, and how to synthesize commonalities that repeatedly appear in many studies.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; Before 2009, I disagreed with affirmative action because I thought it was anti-meritocratic.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; In 2009, I saw a debate in which the pro-affirmative action side won. I did not actually read the debate but I thought the pro side seemed credible so I changed my position.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind again:&lt;/strong&gt; I now disagree with affirmative action. The main argument that changed my mind was that affirmative action has been around for a generation so if it was going to work, we would certainly see the benefits by now, and we don’t.&lt;/p&gt;

    &lt;p&gt;To be a little more specific, there are (broadly speaking) two theories for why affirmative action ought to work:&lt;/p&gt;

    &lt;ol&gt;
      &lt;li&gt;Gatekeepers (e.g. hiring managers) unfairly discriminate against minorities, and affirmative action cancels this out.&lt;/li&gt;
      &lt;li&gt;Minorities underperform in some areas because they haven’t been given sufficient opportunities. Affirmative action gives them those opportunities so that they can get better.&lt;/li&gt;
    &lt;/ol&gt;

    &lt;p&gt;The first theory is pretty easy to test: if minorities outperform after being accepted, then they were being discriminated against. Empirically, we see the opposite—for example, racial minorities at universities have on average lower GPAs than whites. The exception is Asians, who do actually outperform, which suggests they’re &lt;a href=&quot;https://en.wikipedia.org/wiki/Students_for_Fair_Admissions_v._Harvard&quot;&gt;being discriminated against&lt;/a&gt;. But we don’t need affirmative action to fix anti-Asian discrimination; in fact, it’s &lt;em&gt;caused&lt;/em&gt; by affirmative action.&lt;/p&gt;

    &lt;p&gt;You can test the second theory in a similar way: after being accepted, do minorities on average improve their performance? Empirically, they don’t. You can wiggle out of this by saying they still face hurdles even after being accepted to (e.g.) university. But even the children of under-represented minorities who went to elite colleges still underperform (on average). If affirmative action doesn’t improve outcomes even for the children of the beneficiaries, it’s probably never going to work.&lt;/p&gt;

    &lt;p&gt;That doesn’t prove that minorities &lt;em&gt;don’t&lt;/em&gt; face hardships that hamper their performance, it just proves that affirmative action doesn’t do anything to rectify those hardships.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; Regulation is basically good; free markets often hurt people.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; I now believe there’s way too much regulation and free markets are almost always good. I wouldn’t go so far as to say &lt;em&gt;all&lt;/em&gt; regulations are bad, but I think the developed world would be much better off if lawmakers simply deleted 75% of existing regulations. Nor would I say free markets are &lt;em&gt;always&lt;/em&gt; good, but I’d guess that the ratio of economic problems caused by market restrictions to problems caused by overly free markets is about 20:1.&lt;/p&gt;

    &lt;p&gt;I can’t pinpoint a specific period where I changed my mind but it happened somewhere between the beginning and the end of college. The main things that changed my mind were (1) getting better at applying basic economic reasoning (I took econ in 11th grade but I didn’t start applying it to life until later) and (2) reading economist polls and updating toward economists’ beliefs.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; Maybe if I force myself to go to enough social events, I will learn to enjoy them.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; I went to a whole bunch of social events that I didn’t want to go to and I never learned to enjoy them. I now believe it’s better to simply not go to social events that I don’t want to go to.&lt;/p&gt;

    &lt;p&gt;(People often tell me things like, “You should come, you’ll start enjoying it once you get there!” These people are badly failing to model the fact that my brain does not work the same as their brain.)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; In 8th grade Spanish class, the teacher had an accent I was unfamiliar with. I thought he was mis-pronouncing certain Spanish words.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; I learned that his pronunciations were a regional accent.&lt;/p&gt;

    &lt;p&gt;I don’t expect 8th graders to have particularly good reasoning abilities but this mistake feels especially severe. Surely I could have figured out that I, a kid who had taken one year of Spanish, did not know more about Spanish pronunciation than this guy who was a native Spanish speaker and taught Spanish.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; Nuclear power is too dangerous, mainly because we can’t safely manage radioactive waste.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; In 2010, I heard about Kahan et al.’s &lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1549444&quot;&gt;Cultural Cognition of Scientific Consensus&lt;/a&gt;&lt;sup id=&quot;fnref:15&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:15&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt;. The paper studied how people form beliefs on two issues where many people disagree with the scientific consensus—climate change and the disposal of nuclear wastes. Kahan et al. cited a &lt;a href=&quot;https://www.nrc.gov/docs/ML0413/ML041330436.pdf&quot;&gt;consensus report&lt;/a&gt;&lt;sup id=&quot;fnref:16&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:16&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;14&lt;/a&gt;&lt;/sup&gt; where scientists agreed that radioactive waste can be disposed of safely. This was the first time I came in contact with the notion that the scientific consensus supports nuclear power. I knew very little about nuclear waste disposal (and still know little), but I changed my belief to align with the scientific consensus.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;My old belief:&lt;/strong&gt; Creationists are uniquely bad at reasoning.&lt;/p&gt;

    &lt;p&gt;&lt;strong&gt;What changed my mind:&lt;/strong&gt; I know very smart and educated people who believe things that are about as obviously-wrong as creationism (such as the labor theory of value, or that infants are “blank slates”,&lt;sup id=&quot;fnref:12&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;15&lt;/a&gt;&lt;/sup&gt; or nuclear waste can’t be disposed of safely—wait, that last one was me).&lt;/p&gt;

    &lt;p&gt;I now believe it’s basically impossible to avoid believing dumb things. I still think creationism is extremely wrong, but I have a lot of sympathy for creationists, and I don’t think they’re particularly worse at reasoning than anyone else. I believe you should &lt;a href=&quot;https://slatestarcodex.com/2019/02/26/rule-genius-in-not-out/&quot;&gt;rule thinkers in, not out&lt;/a&gt;.&lt;/p&gt;

    &lt;p&gt;The thing is, if you live in a bubble, you might not ever hear the arguments for why the labor theory of value is wrong, or why blank slatism is wrong. Or you might hear arguments but not good ones, perhaps because knowledgeable people &lt;a href=&quot;https://www.astralcodexten.com/p/contra-kavanaugh-on-fideism&quot;&gt;would rather make fun of you than try to persuade you&lt;/a&gt;. So I find it totally understandable that people hold on to these beliefs. And I think of creationism the same way—some people have the misfortune to live in bubbles where everyone around them believes in creationism and they never hear good counter-arguments.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h1 id=&quot;what-patterns-emerge&quot;&gt;What patterns emerge?&lt;/h1&gt;

&lt;p&gt;Did I systematically change my mind in certain ways? Can I predict how I might change my mind in the future?&lt;/p&gt;

&lt;p&gt;I can see four broad patterns:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Initially, I overconfidently believed the first credible-sounding thing I heard. Later, I moderated my beliefs when I learned about other perspectives.&lt;/li&gt;
  &lt;li&gt;I investigated an area where not much is known, so I had to figure things out on my own. My initial conclusion was wrong, and I changed my mind by investigating more deeply.&lt;/li&gt;
  &lt;li&gt;I used to believe individual scientific studies, and now I don’t give them much credibility.&lt;/li&gt;
  &lt;li&gt;I pay more attention to the scientific consensus.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;On #4, it’s not so much that I ever wanted to disagree with the scientific consensus, but that it’s hard to know what scientists believe. On economics, nutrition, and nuclear power, I used to disagree with the consensus, but only because I didn’t know what the consensus &lt;em&gt;was&lt;/em&gt;.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;My argument went like this:&lt;/p&gt;

      &lt;p&gt;From a financial perspective, buying a house is isomorphic to buying a rental property and renting it out, while also paying rent to live at a second, identical house. A rental property is not a good investment because it’s highly non-diversified. If you shouldn’t buy a house as a rental property, then you shouldn’t buy a house to live in, either.&lt;/p&gt;

      &lt;p&gt;(People can pretend houses aren’t risky because the price doesn’t update on a minute-to-minute basis, but a single house is a much riskier investment than an index fund.) &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;There are some caveats to this:&lt;/p&gt;

      &lt;ol&gt;
        &lt;li&gt;Buying a house only works as a hedge if you plan on living there forever. If you ever plan on moving, you’re subject to fluctuations in the price of your current house relative to that of your future house.&lt;/li&gt;
        &lt;li&gt;It only works if you get a competitive mortgage interest rate and if the value of the hedge balances out the opportunity cost of pouring a bunch of money at once into a house.&lt;/li&gt;
        &lt;li&gt;Houses have maintenance expenditures and the like, although this should be priced into rent.&lt;/li&gt;
      &lt;/ol&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:13&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I read Starting Strength based on a recommendation in a &lt;a href=&quot;https://www.lesswrong.com/posts/iTzvJ7kKK2TYJhYHB/solved-problems-repository?commentId=nuonKebTWiTnoWJMC&quot;&gt;LessWrong comment&lt;/a&gt;: “The set of my friends who are strong is exactly the set of my friends who do / have done Starting Strength or a close variant.” &lt;a href=&quot;#fnref:13&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:14&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I like what Mike Israetel said in an &lt;a href=&quot;https://www.youtube.com/watch?v=hO0F9L_Iuuo&amp;amp;t=286s&quot;&gt;interview&lt;/a&gt; (paraphrased): “If you can point out a dude to me and tell [based on how he looks] that he trains with sets of 20–30 and point out another dude and say that dude trains with sets of 5–8, I’d be super impressed, because I can’t.” &lt;a href=&quot;#fnref:14&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The standard rule of thumb is the “talk test”: if you can carry out a conversation with a little bit of difficulty, it’s the right intensity. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:18&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;This sentence was true when I originally wrote it. Now I’m revising and I feel the need to note that I missed a day last week because I was sick. &lt;a href=&quot;#fnref:18&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:17&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Ekkekakis et al. (2022–2023). Extraordinary Claims in the Literature of High-Intensity Interval Training.&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;&lt;a href=&quot;https://doi.org/10.1007/s40279-023-01880-7&quot;&gt;I. Bonafide Scientific Revolution or a Looming Crisis of Replication and Credibility?&lt;/a&gt;&lt;/li&gt;
        &lt;li&gt;&lt;a href=&quot;https://doi.org/10.1123/kr.2022-0003&quot;&gt;II. Are The Extraordinary Claims Supported by Extraordinary Evidence?&lt;/a&gt;&lt;/li&gt;
        &lt;li&gt;&lt;a href=&quot;https://doi.org/10.1016/j.psychsport.2023.102399&quot;&gt;III. Critical analysis of four foundational arguments from an interdisciplinary lens.&lt;/a&gt;&lt;/li&gt;
        &lt;li&gt;&lt;a href=&quot;https://doi.org/10.1016/j.psychsport.2022.102295&quot;&gt;IV. Is HIIT associated with higher long-term exercise adherence?&lt;/a&gt;&lt;/li&gt;
        &lt;li&gt;&lt;a href=&quot;https://doi.org/10.1123/jsep.2022-0027&quot;&gt;A Methodological Checklist of Studies for Pleasure and Enjoyment Responses to High-Intensity Interval Training: Part I. Participants and Measures.&lt;/a&gt;&lt;/li&gt;
        &lt;li&gt;&lt;a href=&quot;https://doi.org/10.1123/jsep.2022-0029&quot;&gt;A Methodological Checklist of Studies for Pleasure and Enjoyment Responses to High-Intensity Interval Training: Part II. Intensity, Timing of Assessments, Data Modeling, and Interpretation.&lt;/a&gt;&lt;/li&gt;
      &lt;/ul&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:17&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:10&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I generally like the epistemology of the Barbell Medicine guys, but we do disagree sometimes. I’ve noticed a common pattern in how we disagree.&lt;/p&gt;

      &lt;p&gt;Say there’s a question about whether A is good, and the state of the evidence is that&lt;/p&gt;

      &lt;ol&gt;
        &lt;li&gt;There’s some theoretical reason to expect A to be good&lt;/li&gt;
        &lt;li&gt;Some limited empirical research has failed to find that A is good&lt;/li&gt;
      &lt;/ol&gt;

      &lt;p&gt;In that case, they believe A is not good, and I believe A is good.&lt;/p&gt;

      &lt;p&gt;For example, they say you shouldn’t take a multivitamin because RCTs generally haven’t found benefits. I say you &lt;em&gt;should&lt;/em&gt; take a multivitamin because they’re cheap and we know vitamins are important in principle.&lt;/p&gt;

      &lt;p&gt;I believe scientists and doctors tend to overweight weakly negative empirical findings relative to theory. I agree with what Scott Alexander wrote in &lt;a href=&quot;https://slatestarcodex.com/2020/04/14/a-failure-but-not-of-prediction/&quot;&gt;A Failure, But Not Of Prediction&lt;/a&gt;: authority figures said (in April 2020, when he wrote the post) that masks don’t prevent the spread of disease because there’s no supporting RCT evidence. But Scott believes masks are worth using because there’s good theoretical reason to expect them to work.&lt;/p&gt;

      &lt;p&gt;&lt;img src=&quot;https://slatestarcodex.com/blog_images/goofus.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

      &lt;p&gt;The Barbell Medicine guys understand perfectly well that sometimes theory matters more than empirical evidence, it’s just that I tend to favor theory a little bit more than they do. &lt;a href=&quot;#fnref:10&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:9&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;For example, I was concerned that a lot of associations between diet and health were confounded by socioeconomic class (rich people eat more veggies) or by conscientiousness (conscientious people eat more veggies plus they’re less inclined to over-eat). But Willett &amp;amp; Skerrett give evidence that some nutritional findings can’t be explained by class or conscientiousness.&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;On socioeconomic class: Greeks today and Chinese people in the 1990s were healthier than Americans even though they were poorer.&lt;/li&gt;
        &lt;li&gt;On conscientiousness: Americans who conscientiously followed the 1990s USDA guidelines had worse health outcomes (because the guidelines were dumb).&lt;/li&gt;
      &lt;/ul&gt;

      &lt;p&gt;(Also: Isn’t it kind of insane that people who followed the USDA guidelines had worse health outcomes? In general, every diet works, whether it’s low-carb, low-fat, paleo, or whatever, because diets force you to pay more attention to what you eat and restrict your caloric intake. It’s perversely impressive that the USDA managed to come up with a diet (possibly the only diet ever?) that actually makes health worse.) &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Where do you even buy raw olives? I’ve only ever seen jarred or canned olives soaked in brine. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Bhaskaran, K., dos-Santos-Silva, I., Leon, D. A., Douglas, I. J., &amp;amp; Smeeth, L. (2018). &lt;a href=&quot;https://www.thelancet.com/journals/landia/article/PIIS2213-8587(18)30288-2/fulltext&quot;&gt;Association of BMI with overall and cause-specific mortality: a population-based cohort study of 3-6 million adults in the UK.&lt;/a&gt; &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:20&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Earlier in this post, I wrote about how most caffeine studies had bad methodology. As a test, I described the methodology of one study to a friend and asked them what they thought about it, and they immediately pointed out the same flaw that I had noticed. So clearly it’s not just me. &lt;a href=&quot;#fnref:20&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:15&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Kahan, D. M., Jenkins‐Smith, H., &amp;amp; Braman, D. (2011). &lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1549444&quot;&gt;Cultural cognition of scientific consensus.&lt;/a&gt;&lt;/p&gt;

      &lt;p&gt;(Note: The cited version was published in 2011 but the paper was originally posted online in 2010.) &lt;a href=&quot;#fnref:15&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:16&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;National Research Council. Board on Radioactive Waste Management. (1990). &lt;a href=&quot;https://www.nrc.gov/docs/ML0413/ML041330436.pdf&quot;&gt;Rethinking high-level radioactive waste disposal: A position statement of the Board on Radioactive Waste Management, Commission on Geosciences, Environment, and Resources, National Research Council.&lt;/a&gt; &lt;a href=&quot;#fnref:16&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:12&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;It would derail the post too much for me to explain why I believe these beliefs are on par with creationism, but I can throw out some links:&lt;/p&gt;

      &lt;ol&gt;
        &lt;li&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Criticisms_of_the_labour_theory_of_value&quot;&gt;Criticisms of the labor theory of value&lt;/a&gt; on Wikipedia&lt;/li&gt;
        &lt;li&gt;&lt;a href=&quot;https://stevenpinker.com/files/pinker/files/the_blank_slate_general_psychologist.pdf&quot;&gt;The Blank Slate&lt;/a&gt;, an article by Steven Pinker that summarizes his book, titled (surprise!) &lt;em&gt;The Blank Slate&lt;/em&gt;&lt;/li&gt;
      &lt;/ol&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:12&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Notes on Eat, Drink, and Be Healthy</title>
				<pubDate>Thu, 23 May 2024 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2024/05/23/notes_on_eat_drink_and_be_healthy/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/05/23/notes_on_eat_drink_and_be_healthy/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;I recently read &lt;a href=&quot;https://www.amazon.com/Eat-Drink-Be-Healthy-Harvard/dp/1501164775&quot;&gt;Eat, Drink, and Be Healthy: The Harvard Medical School Guide to Healthy Eating&lt;/a&gt;. As I understand, it’s the book that does the best job of representing the mainstream scientific perspective on nutrition for a lay audience. Here are the notes I took.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Last updated 2024-09-02.&lt;/em&gt;&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;I wrote down info that I found personally useful, which often doesn’t translate to other people. For example, if the book gave a nutrition fact that I already confidently believed, I didn’t write it down, but someone else might have benefited from reading that fact.&lt;/p&gt;

&lt;p&gt;Unless otherwise specified, these notes represent my interpretation of the author’s perspective and first-person pronouns represent the author, not me. Any sentence preceded by “me:” is my perspective.&lt;/p&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#chapters-13-healthy-eating-matters&quot; id=&quot;markdown-toc-chapters-13-healthy-eating-matters&quot;&gt;Chapters 1–3: Healthy Eating Matters&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#chapter-4-healthy-weight&quot; id=&quot;markdown-toc-chapter-4-healthy-weight&quot;&gt;Chapter 4: Healthy Weight&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#chapter-5-straight-talk-about-fat&quot; id=&quot;markdown-toc-chapter-5-straight-talk-about-fat&quot;&gt;Chapter 5: Straight Talk About Fat&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#chapter-6-carbohydrates-for-better-and-worse&quot; id=&quot;markdown-toc-chapter-6-carbohydrates-for-better-and-worse&quot;&gt;Chapter 6: Carbohydrates for Better and Worse&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#chapter-7-choose-healthier-sources-of-protein&quot; id=&quot;markdown-toc-chapter-7-choose-healthier-sources-of-protein&quot;&gt;Chapter 7: Choose Healthier Sources of Protein&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#chapter-8-eat-plenty-of-fruits-and-vegetables&quot; id=&quot;markdown-toc-chapter-8-eat-plenty-of-fruits-and-vegetables&quot;&gt;Chapter 8: Eat Plenty of Fruits and Vegetables&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#chapter-9-you-are-what-you-drink&quot; id=&quot;markdown-toc-chapter-9-you-are-what-you-drink&quot;&gt;Chapter 9: You Are What You Drink&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#chapters-1011-vitamins-and-minerals&quot; id=&quot;markdown-toc-chapters-1011-vitamins-and-minerals&quot;&gt;Chapters 10–11: Vitamins and Minerals&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes-from-red-pen-reviews&quot; id=&quot;markdown-toc-notes-from-red-pen-reviews&quot;&gt;Notes from Red Pen Reviews&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#footnotes&quot; id=&quot;markdown-toc-footnotes&quot;&gt;Footnotes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;chapters-13-healthy-eating-matters&quot;&gt;Chapters 1–3: Healthy Eating Matters&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Soybean oil and canola oil are healthy&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;The association between “healthy” foods and good health can’t be explained by socioeconomic status or conscientiousness because Greeks are healthier but poorer, and people who followed USDA guidelines did worse than people who followed “good” diets
    &lt;ul&gt;
      &lt;li&gt;me: This comment gave the author more credibility in my mind. I can’t easily evaluate the author’s empirical claims, but I can evaluate logic, and I thought this was a strong logical argument&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;American diets have gotten healthier over the last 20 years. (me: so why are people getting fatter?)&lt;/li&gt;
  &lt;li&gt;RCTs are expensive. A nutrition and breast cancer study cost $2 billion and got inconclusive results. We mainly do cohort studies, which follow a group of people for a time; they have less room for bias than retrospective self-reports (asking people with a disease about their eating habits)&lt;/li&gt;
  &lt;li&gt;Nurses’ Health Study is good because nurses are more diligent and accurate about reporting what they eat&lt;/li&gt;
  &lt;li&gt;Japanese migrants in America have American levels of heart disease, which shows it’s not genetic&lt;/li&gt;
  &lt;li&gt;Moderate alcohol does actually prevent heart disease. Experiments show it raises HDL&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;chapter-4-healthy-weight&quot;&gt;Chapter 4: Healthy Weight&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;me: I examined BMI in more detail in &lt;a href=&quot;https://mdickens.me/2024/05/05/healthiest_BMI/&quot;&gt;What’s the Healthiest BMI?&lt;/a&gt;. &lt;em&gt;Eat, Drink, and Be Healthy&lt;/em&gt; primarily cites the same two meta-analyses that I cited (and the book’s primary author also co-authored one of the meta-analyses (along with 61 other authors lol))&lt;/li&gt;
  &lt;li&gt;Higher BMI is worse even within the healthy weight range, see Nurses’ Health Study
    &lt;ul&gt;
      &lt;li&gt;me: the reported data looks at CVD and diabetes and stuff but not at respiratory disease&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Guidelines say BMI should be 18.5 to 25; that’s the cohort with the lowest death rate after controlling for smoking etc. But really health starts getting worse at BMI &amp;gt; 22&lt;/li&gt;
  &lt;li&gt;BMI = weight (kg) / height (m)^2, or weight (lbs) / height (in)^2 * 703 (implemented in code just after this list)&lt;/li&gt;
  &lt;li&gt;BMI under 18.5 is bad if you’re sick but it’s fine if you’re just thin
    &lt;ul&gt;
      &lt;li&gt;me: for my perspective on this, see &lt;a href=&quot;https://mdickens.me/2024/05/22/healthiest_body_composition/&quot;&gt;What’s the Healthiest Body Composition?&lt;/a&gt;, especially the bits about Lee et al. (2018). The claim appears to be true, but there are approximately zero people with healthy lean mass and a BMI under 18.5&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;
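&lt;p&gt;me: Since the book quotes the BMI formula, here it is implemented directly (the example inputs are arbitrary):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-python&quot;&gt;# The two BMI formulas quoted above.
def bmi_metric(weight_kg, height_m):
    return weight_kg / height_m ** 2

def bmi_imperial(weight_lb, height_in):
    return weight_lb / height_in ** 2 * 703

print(round(bmi_metric(70, 1.75), 1))   # 22.9
print(round(bmi_imperial(154, 69), 1))  # 22.7
&lt;/code&gt;&lt;/pre&gt;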

&lt;h3 id=&quot;chapter-5-straight-talk-about-fat&quot;&gt;Chapter 5: Straight Talk About Fat&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Belly fat is worse than hip/butt fat. Unclear why, maybe because it affects hormones more&lt;/li&gt;
  &lt;li&gt;Controlling weight is the most important component of diet. Luckily, a healthy diet is a subset of diets that make it hard to overeat (me: fad diets can be good for controlling weight but aren’t optimal for health). Ex: fiber is satiating and also good for digestive health&lt;/li&gt;
  &lt;li&gt;Insulin makes your body convert calories to fat and reluctant to burn fat; people with high insulin will be hungry even if they have plentiful fat stores. That’s a reason to prefer foods with low glycemic index&lt;/li&gt;
  &lt;li&gt;Proteins/fats stay in the stomach for longer which reduces hunger&lt;/li&gt;
  &lt;li&gt;45% of a database of people who lost weight didn’t use a Diet; they “did it themselves”&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Unsaturated fat lowers LDL (“bad”) cholesterol and raises HDL (“good”) cholesterol (compared to carbs); saturated raises both; trans raises LDL&lt;/li&gt;
  &lt;li&gt;Fat is not water soluble, so it is transmitted through bloodstream by protein packages (lipoproteins)&lt;/li&gt;
  &lt;li&gt;Saturated fats have varying effects. Coconut oil raises HDL more than beef or butter fat, but its effect on LDL still makes it net unhealthy&lt;/li&gt;
  &lt;li&gt;LDL accumulates in blood vessels—it’s low density so it’s more likely to get stuck. HDL removes LDL from blood vessels&lt;/li&gt;
  &lt;li&gt;Omega-6 doesn’t increase inflammation, it reduces it. The 3:6 ratio didn’t matter in the Nurses’ Health Study&lt;/li&gt;
  &lt;li&gt;Eggs are fine. Low in saturated fat&lt;/li&gt;
  &lt;li&gt;Nuts beat olive oil in RCTs&lt;/li&gt;
  &lt;li&gt;Red meat is bad for colon cancer, which could be b/c of fat content or b/c of chemicals generated by cooking at high temperatures. But biggest risk factor for colon cancer is being overweight&lt;/li&gt;
  &lt;li&gt;ALA associated with prostate cancer, but likely b/c ALA-rich fats used to be partially hydrogenated most of the time. Walnuts are high in ALA and aren’t associated with prostate cancer. Future studies will see if the ALA-cancer connection persists now that trans fats are mostly gone from diets. But you probably shouldn’t worry about ALA&lt;/li&gt;
  &lt;li&gt;Dietary fat has no detectable association with cancer. Saturated and trans fats are bad for heart disease so that’s what we should focus on&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;chapter-6-carbohydrates-for-better-and-worse&quot;&gt;Chapter 6: Carbohydrates for Better and Worse&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Grains are optional&lt;/li&gt;
  &lt;li&gt;Carbs looked healthy in old China studies but don’t look healthy in the west. Likely due to (1) Chinese were more physically active and (2) they ate whole or lightly refined grains. The badness of carbs mainly comes when you don’t burn them off (?)&lt;/li&gt;
  &lt;li&gt;Simple vs complex carb isn’t that important. Glycemic index (GI) and refined vs whole are more important. Glycemic index, not carb complexity, determines blood sugar&lt;/li&gt;
  &lt;li&gt;Potatoes and corn are fast digesting. Corn is technically whole grain but it’s been bred to be fast digesting so it’s more like white rice&lt;/li&gt;
  &lt;li&gt;Fast digesting carbs spike blood sugar then spike insulin so your body quickly absorbs the blood sugar, now you have low blood sugar which triggers hunger&lt;/li&gt;
  &lt;li&gt;Table sugar and corn syrup have the same impact on blood sugar and metabolism&lt;/li&gt;
  &lt;li&gt;Insulin resistance means cells resist the signal telling them to absorb sugar, so blood sugar stays elevated for longer. Insulin-producing cells in the pancreas get overworked and stop working&lt;/li&gt;
  &lt;li&gt;Contributors to diabetes: obesity; sedentariness, because muscles are good at consuming glucose; low polyunsaturated fatty acid (PUFA) and high saturated fat; genetics&lt;/li&gt;
  &lt;li&gt;Finely ground whole wheat is high GI but it’s still healthy due to fiber and nutrients&lt;/li&gt;
  &lt;li&gt;Low-GI foods help prevent diabetes&lt;/li&gt;
  &lt;li&gt;High-fiber cereal helps prevent diabetes&lt;/li&gt;
  &lt;li&gt;Early studies showed fiber reduced gut cancer, but later studies e.g. Nurses’ Health showed no effect&lt;/li&gt;
  &lt;li&gt;Whole grains have 1:10 or 1:5 fiber:carb ratio&lt;/li&gt;
  &lt;li&gt;Added fiber like cellulose isn’t as good as whole fiber b/c it’s missing micronutrients and it doesn’t encapsulate the carbs to slow digestion (book calls added fibers “fake fiber”)&lt;/li&gt;
  &lt;li&gt;Eat whole grain cereal for breakfast. Ex: Wheaties, Grape Nuts, Kashi, Shredded Wheat, Wheat Chex&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;chapter-7-choose-healthier-sources-of-protein&quot;&gt;Chapter 7: Choose Healthier Sources of Protein&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;BCAAs turn up IGF-1 which helps muscles grow but also cancer&lt;/li&gt;
  &lt;li&gt;Processed meat causes cancer; red meat probably causes cancer. Mainly colorectal&lt;/li&gt;
  &lt;li&gt;Country-level data shows plant proteins associated with less cancer than animal proteins, but country data is highly confounded. It’s probably the stuff that comes with the protein that matters, not the pure protein&lt;/li&gt;
  &lt;li&gt;ALA reduces clotting&lt;/li&gt;
  &lt;li&gt;Soy effect on heart disease is overstated. What studies actually show is that heart disease is reduced when you replace red meat with soy&lt;/li&gt;
  &lt;li&gt;Soy contains phytoestrogens, which may prevent breast cancer. Phytoestrogens act like estrogen in some places and block it in others. They appear to block estrogen in cancer cells, where estrogen would otherwise stimulate growth&lt;/li&gt;
  &lt;li&gt;Some fish contain mercury but many species (e.g. salmon) don’t have concerning amounts. Don’t worry about it unless you’re a child or pregnant. Fish oil supplements have less mercury than fish but don’t show the same health benefits&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;chapter-8-eat-plenty-of-fruits-and-vegetables&quot;&gt;Chapter 8: Eat Plenty of Fruits and Vegetables&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;The effects of the vast majority of plant chemicals have yet to be determined. We don’t even know most of the chemicals that are in plants&lt;/li&gt;
  &lt;li&gt;Fruit and veg, especially greens and citrus (including juice), lower blood pressure and reduce stroke. Folic acid supplements work too. Fruit and veg shown to reduce blood pressure in RCT&lt;/li&gt;
  &lt;li&gt;Cataracts and macular degeneration may be caused by free radicals&lt;/li&gt;
  &lt;li&gt;1/3 of cancer is explained by diet. Possibly an overestimate&lt;/li&gt;
  &lt;li&gt;One study found that fruit in adolescence does more to prevent cancer than fruit in middle age&lt;/li&gt;
  &lt;li&gt;Berries are S tier fruit, especially blueberries&lt;/li&gt;
  &lt;li&gt;Fiber sticks to cholesterol and you poop it out&lt;/li&gt;
  &lt;li&gt;Newest studies show fiber doesn’t help with colon cancer. But it does still stabilize blood sugar which reduces diabetes, lowers triglycerides, and improves gut microbiome&lt;/li&gt;
  &lt;li&gt;It looks like many diseases like CVD and cancer are driven by deficiencies in some phytonutrients, but we don’t know which. Folate is probably one&lt;/li&gt;
  &lt;li&gt;Eat a variety of colors. Get one serving a day each of: dark leafy greens, yellow/orange fruit/veg, red fruit/veg, legumes, citrus&lt;/li&gt;
  &lt;li&gt;Cooked tomatoes are better than raw b/c your body has a hard time absorbing lycopene from raw tomatoes&lt;/li&gt;
  &lt;li&gt;Some veggies have chemicals that are bad if you eat too much, but it’s hard to eat too much&lt;/li&gt;
  &lt;li&gt;Juice and smoothies are bad because (1) easy to over-eat and (2) juice makes sugar absorb faster (me: they didn’t say if smoothies do that)
    &lt;ul&gt;
      &lt;li&gt;me: I spent a little time online trying to determine if smoothies are healthy or not (ignoring calorie content), my main concern being whether blending destroys fiber/nutrients. I found zero relevant papers on Google Scholar. I found various people, including dietitians (e.g. &lt;a href=&quot;https://www.hopkinsmedicine.org/health/wellness-and-prevention/how-to-make-a-healthy-smoothie&quot;&gt;on Hopkins Medicine&lt;/a&gt;) asserting without evidence that blending doesn’t destroy nutrients. So I assume smoothies are healthy but I don’t really know&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;chapter-9-you-are-what-you-drink&quot;&gt;Chapter 9: You Are What You Drink&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Milk in adolescence increases height, which increases the risk of hip fractures&lt;/li&gt;
  &lt;li&gt;Coffee and tea reduce gallstones, possibly b/c they stimulate gall bladder activity&lt;/li&gt;
  &lt;li&gt;Coffee, even decaf, reduces diabetes. Possibly due to antioxidants&lt;/li&gt;
  &lt;li&gt;Coffee reduces Parkinson’s&lt;/li&gt;
  &lt;li&gt;Tea has flavonoids which may reduce CVD&lt;/li&gt;
  &lt;li&gt;Alcohol raises HDL. CVD benefit at 1-2 drinks per day for men. Can be any kind of alcohol. Benefits of wine specifically are unproven&lt;/li&gt;
  &lt;li&gt;Alcohol raises breast cancer risk in women even at half a drink per day. Folate counteracts this&lt;/li&gt;
  &lt;li&gt;Alcohol is net harmful for young men due to low CVD risk, and net beneficial for older men. Unclear for women, probably net positive unless you have family history of breast cancer&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;chapters-1011-vitamins-and-minerals&quot;&gt;Chapters 10–11: Vitamins and Minerals&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Hard to tell how much calcium people need because the outer shells of bones can absorb calcium quickly without increasing inner bone density&lt;/li&gt;
  &lt;li&gt;Insufficient vitamins can contribute to CVD and cancer even without deficiency disease&lt;/li&gt;
  &lt;li&gt;Folic acid started being added to grains to prevent birth defects, and it incidentally reduced CVD and cancer too&lt;/li&gt;
  &lt;li&gt;Vitamin A helps regulate cell division and thus prevent cancer&lt;/li&gt;
  &lt;li&gt;Beta carotene isn’t vitamin A but your body turns it into vitamin A&lt;/li&gt;
  &lt;li&gt;Excess pre-formed vitamin A blocks vitamin D. Better to get vitamin A from beta carotene than from pre-formed vitamin A (retinol)&lt;/li&gt;
  &lt;li&gt;Free radicals have unpaired electrons, so they can steal electrons from DNA or cholesterol or other important things in your body&lt;/li&gt;
  &lt;li&gt;Early studies showed benefits, but recent large RCTs don’t find benefits from antioxidant supplements&lt;/li&gt;
  &lt;li&gt;James Watson suggests antioxidant supplements are bad because free radicals kill cancer cells&lt;/li&gt;
  &lt;li&gt;Antioxidants are good if you get them from food&lt;/li&gt;
  &lt;li&gt;Excess iron generates free radicals. Some evidence suggests excess iron causes heart disease and cancer, but jury is still out&lt;/li&gt;
  &lt;li&gt;Body does a good job of passing on unneeded iron when it comes from plants but not when it comes from meat (me: what about supplements?)&lt;/li&gt;
  &lt;li&gt;I recommend taking a multivitamin that doesn’t contain iron&lt;/li&gt;
  &lt;li&gt;Sodium RDA is 2300mg but most people need less than 1000mg&lt;/li&gt;
  &lt;li&gt;American Heart Association recommends a max of 1500mg sodium&lt;/li&gt;
  &lt;li&gt;Selenium probably doesn’t matter&lt;/li&gt;
  &lt;li&gt;Take a multivitamin that contains the nutrients people tend to miss: beta carotene, B6, B12, folic acid, D, E, iron, zinc (me: what about the earlier advice not to supplement iron?). No more than 2000 IU pre-formed vitamin A&lt;/li&gt;
  &lt;li&gt;Basic One multivitamin is good. Menstruating women get the version with iron, others don’t need the iron
    &lt;ul&gt;
      &lt;li&gt;me: Compared to normal multivitamins, &lt;a href=&quot;https://coopercomplete.com/product/basic-one-multivitamin-iron-free/&quot;&gt;Basic One&lt;/a&gt; has considerably more D3, E, and B6, and &lt;em&gt;way&lt;/em&gt; more B12; added selenium, chromium, and lycopene (an antioxidant found in many red fruits/veggies); and its vitamin A comes exclusively in the form of beta carotene (no pre-formed vitamin A)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;notes-from-red-pen-reviews&quot;&gt;Notes from Red Pen Reviews&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;This section was added on 2024-05-31.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.redpenreviews.org/&quot;&gt;Red Pen Reviews&lt;/a&gt;, a website that reviews the scientific accuracy of books on nutrition, &lt;a href=&quot;https://www.redpenreviews.org/reviews/eat-drink-and-be-healthy/&quot;&gt;reviewed Eat, Drink, and Be Healthy&lt;/a&gt; and gave it the highest score of any book it’s reviewed.&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; It did, however, have some minor quibbles with the book. It disputed one claim from the book, found that the book overstated the strength of evidence in two of its references, and sort-of disputed the book’s position on multivitamins.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Claim from the book:&lt;/strong&gt; “Protein sources from plants and lean meats such as chicken or fish are likely more beneficial than protein from red and processed meat. Protein from soy, however, is less well-understood.”&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;There’s mixed evidence on whether plants and lean meats are better for cardiovascular disease than red meat. It seems broadly true but there’s some conflicting evidence.&lt;/li&gt;
  &lt;li&gt;The skepticism about soy is based on old studies, and more recent studies find that soy is beneficial.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Reference from the book:&lt;/strong&gt; “In an analysis my colleagues and I did among more than 43,000 men, intake of total protein was minimally associated with heart disease risk, while intake of protein from meat was associated with higher risk.”&lt;/p&gt;

&lt;p&gt;Among all participants, the correlation between protein source and heart disease risk was non-significant. The correlation only became significant when the analysis was restricted to “healthy” patients.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reference from the book:&lt;/strong&gt; “Dark leafy green vegetables contain two pigments, lutein and zeaxanthin, that accumulate in the eye. These two, along with phytochemicals called carotenoids, can snuff out free radicals before they can harm the eye’s sensitive tissues.”&lt;/p&gt;

&lt;p&gt;The cited study established that lutein and zeaxanthin are good for eye health, but the study did not examine mechanisms. The proposed mechanism is plausible, but not supported by the cited study.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Claim from the book:&lt;/strong&gt; “Take a daily multivitamin.”&lt;/p&gt;

&lt;p&gt;Some, such as the president of the Australian Medical Association, say multivitamins are a waste. Some good studies showed multivitamins had no beneficial effects. But other good studies showed multivitamins did have beneficial effects. So the claim that you should take a multivitamin is controversial.&lt;/p&gt;

&lt;p&gt;My personal take: multivitamins are cheap and easy, so they clearly pass a cost-benefit analysis, even if there’s a good chance that they don’t work.&lt;/p&gt;

&lt;h1 id=&quot;footnotes&quot;&gt;Footnotes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I wrote this down because I had recently learned about the seed oil theory of obesity. Now, after having looked into it further, I’m reasonably confident that the seed oil theory (1) is fringe among nutrition scientists, (2) contradicts known biological mechanisms, and (3) contradicts most empirical findings from RCTs and cohort studies on the relationships between food and health. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I come from the resistance training world where the preferred method of losing weight is to just eat less. I thought maybe we were a bunch of weirdos but this shows that the “eat less” method is pretty common among the general population. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;That’s why I read the book in the first place. &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>What's the Healthiest Body Composition?</title>
				<pubDate>Wed, 22 May 2024 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2024/05/22/healthiest_body_composition/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/05/22/healthiest_body_composition/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;a href=&quot;https://mdickens.me/2024/05/05/healthiest_BMI/&quot;&gt;Last time&lt;/a&gt;, I found that the healthiest BMI range for all-cause mortality is 20–22. But BMI doesn’t tell the whole story. Most obviously, it doesn’t account for body fat vs. lean mass. All else equal, you’d rather have more muscle&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; and less fat.&lt;/p&gt;

&lt;p&gt;So what’s the healthiest combination of lean mass + fat mass?&lt;/p&gt;

&lt;p&gt;I’m not going to answer that question because I can’t. Instead, I will explain why I can’t, and then give a rough guess at the answer.&lt;/p&gt;

&lt;p&gt;Scientists have been measuring and collecting data on BMI for decades. You can find plenty of giant BMI studies with three million participants in various countries.&lt;/p&gt;

&lt;p&gt;We have much sparser data on body fat. Scientists didn’t start collecting data on body fat until the last few decades. And body fat is harder to measure—we have various methods for estimating body fat, but they’re all more complicated than calculating BMI.&lt;/p&gt;

&lt;p&gt;I managed to scrounge together some studies on body fat and mortality. &lt;strong&gt;My best guess: the average woman should aim for a BMI of 21 with 20% body fat, and the average man a BMI of 21 with 10% body fat.&lt;/strong&gt; (Subject to individual variation due to genetics and whatnot.)&lt;/p&gt;

&lt;p&gt;Trans men should probably target the same body fat % as cis men, and likewise for trans women and cis women, because hormone therapy alters body fat distribution (&lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7061235/&quot;&gt;Spanos et al. (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:24&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:24&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;).&lt;/p&gt;

&lt;p&gt;The evidence weakly suggests that there is no lower bound on healthy fat mass, and no upper bound on healthy lean mass. We have so little mortality data on extremely lean + muscular people that we can’t say how healthy they are.&lt;/p&gt;

&lt;p&gt;A more in-depth analysis would look at a variety of health indicators (blood pressure, HDL cholesterol, etc.) and use that to predict mortality. I didn’t do that, I just looked at mortality data.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#research-on-body-composition-and-mortality&quot; id=&quot;markdown-toc-research-on-body-composition-and-mortality&quot;&gt;Research on body composition and mortality&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#research-on-waist-circumference&quot; id=&quot;markdown-toc-research-on-waist-circumference&quot;&gt;Research on waist circumference&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#research-on-the-relationship-between-bmi-and-body-fat&quot; id=&quot;markdown-toc-research-on-the-relationship-between-bmi-and-body-fat&quot;&gt;Research on the relationship between BMI and body fat&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#good-research-that-nonetheless-didnt-answer-my-question&quot; id=&quot;markdown-toc-good-research-that-nonetheless-didnt-answer-my-question&quot;&gt;Good research that nonetheless didn’t answer my question&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#is-there-an-upper-bound-on-healthy-lean-mass&quot; id=&quot;markdown-toc-is-there-an-upper-bound-on-healthy-lean-mass&quot;&gt;Is there an upper bound on healthy lean mass?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#summary-of-findings&quot; id=&quot;markdown-toc-summary-of-findings&quot;&gt;Summary of findings&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I looked at all the relevant research papers with at least 100 citations on Google Scholar. Here’s what I found.&lt;/p&gt;

&lt;h2 id=&quot;research-on-body-composition-and-mortality&quot;&gt;Research on body composition and mortality&lt;/h2&gt;

&lt;p&gt;The best data comes from the big Danish follow-up study “Diet, Cancer and Health” which followed 50,000 Danish adults aged 50 to 64 from 1993 to 1997. Two different research papers analyzed the data from this study: &lt;a href=&quot;https://onlinelibrary.wiley.com/doi/full/10.1038/oby.2004.131&quot;&gt;Bigaard et al. (2004)&lt;/a&gt;&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; and &lt;a href=&quot;https://www.nature.com/articles/0802976&quot;&gt;Bigaard et al. (2005)&lt;/a&gt;&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;. These papers break down BMI into body fat mass index (BFMI) and fat-free mass index (FFMI) (measured using &lt;a href=&quot;https://en.wikipedia.org/wiki/Bioelectrical_impedance_analysis&quot;&gt;bioelectrical impedance&lt;/a&gt;). Just like how BMI equals weight/height&lt;sup&gt;2&lt;/sup&gt;, BFMI equals bodyfat/height&lt;sup&gt;2&lt;/sup&gt;, and FFMI equals fat-free mass (a.k.a. lean mass) / height&lt;sup&gt;2&lt;/sup&gt; (such that BFMI + FFMI = BMI).&lt;/p&gt;
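
&lt;p&gt;To make these definitions concrete, here’s a minimal Python sketch (the 70 kg / 1.76 m / 20% body fat example is hypothetical, not taken from the study):&lt;/p&gt;

&lt;blockquote&gt;
  &lt;pre&gt;&lt;code&gt;def bmi_parts(weight_kg, height_m, body_fat_frac):
    # BMI = weight / height^2; BFMI and FFMI split it into fat and lean components
    bmi = weight_kg / height_m ** 2
    bfmi = weight_kg * body_fat_frac / height_m ** 2  # body-fat mass index
    ffmi = bmi - bfmi                                 # fat-free mass index
    return bmi, bfmi, ffmi

# Hypothetical example: 70 kg at 1.76 m with 20% body fat
bmi, bfmi, ffmi = bmi_parts(70, 1.76, 0.20)
print(round(bmi, 1), round(bfmi, 1), round(ffmi, 1))  # 22.6 4.5 18.1
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;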

&lt;p&gt;Bigaard et al. (2004) found that mortality monotonically increases with BFMI above 5–6, and gets slightly worse below 5–6. Mortality monotonically decreases with FFMI up to 17 for women and 19 for men, at which point it starts increasing again. See &lt;a href=&quot;https://onlinelibrary.wiley.com/cms/asset/81cd76fe-6201-4eee-b9fa-e4c5d2807cd4/oby_1042_f1.gif&quot;&gt;Figure 1&lt;/a&gt; (dark line represents men, light line represents women).&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://onlinelibrary.wiley.com/cms/asset/81cd76fe-6201-4eee-b9fa-e4c5d2807cd4/oby_1042_f1.gif&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;(BFMI and FFMI were controlled for each other, which fixes the problem that people with more lean mass usually have more fat mass.)&lt;/p&gt;

&lt;p&gt;If we combine the ideal BFMI and FFMI from the Danish study, we get:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;pre&gt;&lt;code&gt;men:   5 BFMI + 19 FFMI = 24 BMI with 21% body fat
women: 7 BFMI + 17 FFMI = 24 BMI with 29% body fat
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
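
&lt;p&gt;The body fat percentages follow directly from the indices: height squared cancels out of the ratio, so body fat % is just BFMI divided by BMI. A quick check:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;pre&gt;&lt;code&gt;# Body fat % = BFMI / (BFMI + FFMI), since height^2 cancels out
print(round(5 / (5 + 19) * 100))  # 21 (men)
print(round(7 / (7 + 17) * 100))  # 29 (women)
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;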

&lt;p&gt;But this probably isn’t right. The &lt;a href=&quot;https://mdickens.me/2024/05/05/healthiest_BMI/#the-big-bmi-studies&quot;&gt;best meta-analyses&lt;/a&gt; on BMI and mortality control for three big confounders: smoking, health conditions, and study follow-up length. All these confounders make low BMIs look unhealthier than they really are. The Bigaard et al. papers controlled for smoking, but not for health conditions (although participants were healthier than average due to selection effects&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;); plus the Danish study only lasted 5 years, and 5-year studies show a bias toward high BMIs.&lt;/p&gt;

&lt;p&gt;In the BMI studies, controlling for these additional confounders reduces the apparent healthiest BMI by 3 or 4 points. And the Danish study found that mortality was minimized at 25 BMI,&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; whereas in reality, &lt;a href=&quot;https://mdickens.me/2024/05/05/healthiest_BMI/&quot;&gt;it’s probably minimized at 20–22 BMI&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Undiagnosed health conditions probably decrease both lean mass and fat mass. That means low fat mass and low lean mass are both healthier than observational studies make them look.&lt;/p&gt;

&lt;p&gt;If we naively subtract 4 points from the body-fat mass index,&lt;sup id=&quot;fnref:36&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:36&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt; we get a new estimate of the healthiest body composition for men and women:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;pre&gt;&lt;code&gt;men:   1 BFMI + 19 FFMI = 20 BMI with 5% body fat
women: 3 BFMI + 17 FFMI = 20 BMI with 15% body fat
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;

&lt;p&gt;These numbers are extraordinarily low. But remember that I came up with these numbers by extrapolating what might happen if we controlled for confounders, so the numbers don’t come from actual data and could be pretty far off. My guess is they’re too low—maybe selection bias in the Diet, Cancer and Health Study had a similar effect to controlling for health, and I’ve over-corrected.&lt;/p&gt;

&lt;p&gt;I don’t know about the exact numbers, but I expect that low body fat is good.&lt;/p&gt;

&lt;p&gt;The data showed a trend of increasing mortality above 19 FFMI for men and 17 FFMI for women, but the trend wasn’t strong, so it’s possible that higher FFMIs are healthier (more on this &lt;a href=&quot;#is-there-an-upper-bound-on-healthy-lean-mass&quot;&gt;later&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Other research has shown that ideal BMI increases with age, and the Danish study only included adults aged 50 to 64, so younger people should probably target lower body fat, and older people should aim higher. (It’s possible that younger people actually want to have less lean mass, not less body fat, but that doesn’t sound likely to me.)&lt;/p&gt;

&lt;p&gt;Ideal BMI varies by ethnicity. But that’s probably because different ethnicities distribute body fat differently—e.g., South Asians carry more body fat at a given BMI, which makes BMIs on the higher end look unhealthier for South Asians. I don’t know of any cross-ethnic studies on body fat and mortality, but my best guess is that ideal body fat doesn’t vary much by ethnicity.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.bmj.com/content/362/bmj.k2575.full&quot;&gt;Lee et al. (2018)&lt;/a&gt;&lt;sup id=&quot;fnref:9&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt;, the second-most useful paper, looked at (predominantly white) males from the long-term Health Professionals Follow-Up Study and found that, controlling for confounders, mortality for men was minimized in the first quintile of fat mass and the third quintile of lean body mass (see &lt;a href=&quot;https://www.bmj.com/highwire/markup/980335/expansion?width=1000&amp;amp;height=500&amp;amp;iframe=true&amp;amp;postprocessors=highwire_figures%2Chighwire_math&quot;&gt;Table 2&lt;/a&gt;). That is, men should aim for low fat mass and average lean mass.&lt;/p&gt;

&lt;p&gt;(Lean mass and mortality had only a weak association except in the first quintile: low lean mass increased risk, but lean mass didn’t clearly affect mortality beyond that.)&lt;/p&gt;

&lt;p&gt;Lee et al. (2018) found that male mortality was minimized at 56 kg of lean mass and 5–21 kg of fat mass.&lt;sup id=&quot;fnref:37&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:37&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt; The median of the 1st quintile had a fat mass of 15 kg, which corresponds to 21% body fat.&lt;sup id=&quot;fnref:15&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:15&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;Lee et al. found that mortality was minimized at a BMI of 24 or so, which suggests that it hasn’t fully adjusted for confounders. Adjusting fat mass index by 3 points to produce a BMI of 21 (assuming a height of 176 cm, which corresponds to a BMI of 23&lt;sup id=&quot;fnref:38&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:38&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt; for someone with 56 kg lean mass + 15 kg fat mass) predicts an ideal male body composition of&lt;/p&gt;

&lt;blockquote&gt;
  &lt;pre&gt;&lt;code&gt;2 BFMI + 18 FFMI = 20 BMI with 9% body fat
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
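
&lt;p&gt;As a sanity check on that arithmetic, reusing the assumed 176 cm height along with Lee et al.’s 56 kg lean mass and 15 kg fat mass, subtracting 3 points of fat mass index reproduces the numbers above (and shows why the body fat comes out just under 10%):&lt;/p&gt;

&lt;blockquote&gt;
  &lt;pre&gt;&lt;code&gt;height_m = 1.76               # assumed height from the text
lean_kg, fat_kg = 56, 15      # ideal lean mass and 1st-quintile median fat mass
ffmi = lean_kg / height_m ** 2     # about 18.1
bfmi = fat_kg / height_m ** 2 - 3  # about 1.8 after the 3-point adjustment
bmi = ffmi + bfmi                  # about 19.9, i.e. roughly 20
print(round(bfmi / bmi * 100))     # 9 (% body fat)
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;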

&lt;p&gt;Interestingly, Lee et al. found that if you exclude men with excessively low lean mass, then the mortality-minimizing BMI range is 18.5–20.4, not 20–22 as I &lt;a href=&quot;https://mdickens.me/2024/05/05/healthiest_BMI/&quot;&gt;previously reported&lt;/a&gt;. (The meta-analyses I looked at in my last post didn’t include data on body fat.) And the effect was surprisingly strong—men in the 18.5–20.4 group had a 15% lower mortality rate than the 20.5–22.4 group (see &lt;a href=&quot;https://www.bmj.com/highwire/markup/980333/expansion?width=1000&amp;amp;height=500&amp;amp;iframe=true&amp;amp;postprocessors=highwire_figures%2Chighwire_math&quot;&gt;Table 3&lt;/a&gt;). This suggests two things:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Having a BMI on the low end (18.5–20.4) has large health benefits and large downsides relative to a mid-range BMI (20.5–22.4).&lt;/li&gt;
  &lt;li&gt;The downsides pretty much exclusively come from low lean mass, not low fat mass. If you have low-end BMI with sufficiently high lean mass, you get all the upside and ~none of the downside.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;(Note: After excluding people with low lean mass, there were zero men left with BMIs below 18.5, out of a sample of 38,000. You might shoot for a BMI even lower than 18.5, but it looks pretty much impossible.&lt;sup id=&quot;fnref:16&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:16&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt;)&lt;/p&gt;

&lt;h2 id=&quot;research-on-waist-circumference&quot;&gt;Research on waist circumference&lt;/h2&gt;

&lt;p&gt;A few studies looked at mortality and waist circumference. Waist circumference is often used as a proxy for body fat, and arguably it’s a &lt;em&gt;better&lt;/em&gt; metric than body fat % because visceral fat (abdominal fat that’s distributed around the organs) carries greater health risks than subcutaneous fat (distributed under the skin).&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.ahajournals.org/doi/full/10.1161/CIRCULATIONAHA.107.739714&quot;&gt;Zhang et al. (2008)&lt;/a&gt;&lt;sup id=&quot;fnref:14&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt; looked at American women in the Nurses’ Health Study. Participants had the lowest mortality in the 1st quintile of waist circumference—less than 28 inches or 71 cm, see Table 3. The study also looked at waist:hip ratio and found that the 1st quintile (&amp;lt;0.73) minimized mortality risk.&lt;/p&gt;

&lt;p&gt;Zhang et al. reported a &lt;a href=&quot;https://en.wikipedia.org/wiki/Hazard_ratio&quot;&gt;hazard ratio&lt;/a&gt; of 1.01 for the 28–29 inch waist group for all participants—that is, 28–29 inches was only slightly less healthy than &amp;lt;28 inches. But among never-smokers, the 28–29 inch group had a hazard ratio of 1.31. Reading between the lines, that implies that the healthiest waist circumference for non-smoking women is considerably less than 28 inches (71 cm).&lt;sup id=&quot;fnref:39&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:39&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;14&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/S1279770723021930&quot;&gt;Hu et al. (2018)&lt;/a&gt;&lt;sup id=&quot;fnref:11&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;15&lt;/a&gt;&lt;/sup&gt; found that among middle-aged and elderly Chinese individuals, mortality was minimized at a waist circumference of 83–88 cm for men and 79–83 cm for women. This study used a relatively short follow-up (8.5 years), which means the true healthiest waist circumference is probably lower.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://academic.oup.com/aje/article/152/3/264/73227&quot;&gt;Baik et al. (2000)&lt;/a&gt;&lt;sup id=&quot;fnref:13&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:13&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;16&lt;/a&gt;&lt;/sup&gt; found that American men had the lowest mortality rate for waist circumferences in the 3rd quintile (36.3–37.9 inches or 92.2–96.3 cm), and waist:hip ratios in the 2nd quintile (0.90–0.91), see &lt;a href=&quot;https://academic.oup.com/view-large/555349&quot;&gt;Table 4&lt;/a&gt;. The authors write that the higher mortality for lean men is probably due to confounding—respiratory disease causes men to lose weight and die sooner.&lt;/p&gt;

&lt;h2 id=&quot;research-on-the-relationship-between-bmi-and-body-fat&quot;&gt;Research on the relationship between BMI and body fat&lt;/h2&gt;

&lt;p&gt;Some studies provide formulas to convert BMI to body fat. We can use those formulas to estimate ideal body fat from ideal BMI. This method doesn’t really work because the healthiest BMI substantially changes if you know someone’s fat mass / lean mass. But I’m going to use the method anyway and see what happens.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Data from the US National Health and Nutrition Examination Survey in &lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/S0002916523239659&quot;&gt;Flegal et al. (2009)&lt;/a&gt;&lt;sup id=&quot;fnref:23&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:23&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;17&lt;/a&gt;&lt;/sup&gt; suggest a correspondence between ideal BMI and body fat % (see &lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/S0002916523239659#t3&quot;&gt;Table 3&lt;/a&gt; and &lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/S0002916523239659#t4&quot;&gt;Table 4&lt;/a&gt;):&lt;/p&gt;

    &lt;table&gt;
      &lt;thead&gt;
        &lt;tr&gt;
          &lt;th&gt;demographic&lt;/th&gt;
          &lt;th&gt;body fat&lt;/th&gt;
        &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
        &lt;tr&gt;
          &lt;td&gt;women under 40&lt;/td&gt;
          &lt;td&gt;25–30%&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
          &lt;td&gt;women over 60&lt;/td&gt;
          &lt;td&gt;30–35%&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
          &lt;td&gt;men under 40&lt;/td&gt;
          &lt;td&gt;15–20%&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
          &lt;td&gt;men over 60&lt;/td&gt;
          &lt;td&gt;20–25%&lt;/td&gt;
        &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;https://www.sciencedirect.com/science/article/abs/pii/S026156141000004X&quot;&gt;Meeuwsen et al. (2010)&lt;/a&gt;&lt;sup id=&quot;fnref:19&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:19&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;18&lt;/a&gt;&lt;/sup&gt; measured body fat in UK adults and came up with a linear formula to predict body fat percentage:&lt;/p&gt;

    &lt;blockquote&gt;
      &lt;pre&gt;&lt;code&gt;women: BF% =  -1.63 + 1.129 * BMI + 0.140 * age
men:   BF% = -13.51 + 1.129 * BMI + 0.140 * age
&lt;/code&gt;&lt;/pre&gt;
    &lt;/blockquote&gt;

    &lt;p&gt;This formula predicts that, assuming you want a BMI of 21 at 30 years old, the healthiest body fat is 26% for women and 14% for men. At age 60, that rises to 30% for women and 19% for men.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3183503/&quot;&gt;Mills et al. (2007)&lt;/a&gt;&lt;sup id=&quot;fnref:18&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:18&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;19&lt;/a&gt;&lt;/sup&gt; found a BMI/body-fat correlation that suggests an ideal male body fat of 4% to 9%. But it also says a white male with a BMI of 18 has –1% body fat so I suspect it’s pretty inaccurate on the low end.&lt;sup id=&quot;fnref:40&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:40&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;20&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3766672/&quot;&gt;Ranasinghe et al. (2013)&lt;/a&gt;&lt;sup id=&quot;fnref:21&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:21&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;21&lt;/a&gt;&lt;/sup&gt; derived formulas to estimate body fat percentage in Sri Lankan adults:&lt;/p&gt;

    &lt;blockquote&gt;
      &lt;pre&gt;&lt;code&gt;women: BF% =  3.819 + 0.918 * BMI + 0.153 * age
men:   BF% = -9.662 + 1.114 * BMI + 0.139 * age
&lt;/code&gt;&lt;/pre&gt;
    &lt;/blockquote&gt;

    &lt;p&gt;These formulas predict an ideal body fat of 28% for women and 18% for men at age 30. (Both regression formulas are implemented in the sketch after this list.)&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
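
&lt;p&gt;Here’s a minimal sketch of the two regression formulas above, with coefficients taken straight from the papers (the BMI of 21 and the ages are the assumptions used in the text):&lt;/p&gt;

&lt;blockquote&gt;
  &lt;pre&gt;&lt;code&gt;def bf_pct_meeuwsen(bmi, age, male):
    # Meeuwsen et al. (2010), UK adults
    intercept = -13.51 if male else -1.63
    return intercept + 1.129 * bmi + 0.140 * age

def bf_pct_ranasinghe(bmi, age, male):
    # Ranasinghe et al. (2013), Sri Lankan adults
    if male:
        return -9.662 + 1.114 * bmi + 0.139 * age
    return 3.819 + 0.918 * bmi + 0.153 * age

# At a target BMI of 21 and age 30:
print(round(bf_pct_meeuwsen(21, 30, male=False)))    # 26 (women)
print(round(bf_pct_meeuwsen(21, 30, male=True)))     # 14 (men)
print(round(bf_pct_ranasinghe(21, 30, male=False)))  # 28 (women)
print(round(bf_pct_ranasinghe(21, 30, male=True)))   # 18 (men)
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;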

&lt;h2 id=&quot;good-research-that-nonetheless-didnt-answer-my-question&quot;&gt;Good research that nonetheless didn’t answer my question&lt;/h2&gt;

&lt;p&gt;Some other studies looked at the association between body fat/waist circumference and mortality, but only focused on the effects of overweightness/obesity, which doesn’t tell us the healthiest body composition.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.nature.com/articles/0802976&quot;&gt;Bigaard et al. (2005)&lt;/a&gt;&lt;sup id=&quot;fnref:2:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;, using the Danish data set discussed previously, found that waist circumference predicted mortality even when controlling for BMI. It also found that, if you control for waist circumference, mortality monotonically decreases with both body fat mass index and fat-free mass index. (I wish I knew what to do with this information—I don’t know how to increase body fat without also increasing waist circumference.&lt;sup id=&quot;fnref:41&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:41&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;22&lt;/a&gt;&lt;/sup&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.jacc.org/doi/abs/10.1016/j.jacc.2013.06.027&quot;&gt;Britton et al. (2013)&lt;/a&gt;&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;23&lt;/a&gt;&lt;/sup&gt; found that excess body fat was associated with elevated mortality risk, but did not provide granularity on the low end of body fat.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.mayoclinicproceedings.org/article/S0025-6196(13)01040-9/abstract&quot;&gt;Cerhan et al. (2013)&lt;/a&gt;&lt;sup id=&quot;fnref:12&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;24&lt;/a&gt;&lt;/sup&gt;, a pooled analysis of 11 cohort studies with a total of 650,000 white participants, found that mortality monotonically increased with waist circumference above 90 cm for men and 70 cm for women, but it did not look at waist circumferences below those cutoffs.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://onlinelibrary.wiley.com/doi/full/10.1002/oby.22423&quot;&gt;Chen et al. (2019)&lt;/a&gt;&lt;sup id=&quot;fnref:10&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;25&lt;/a&gt;&lt;/sup&gt; found greater mortality for waist circumferences of 90+ cm for Chinese men and 80+ cm for Chinese women, but did not provide granularity beyond that.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://jamanetwork.com/journals/jamainternalmedicine/article-abstract/775594&quot;&gt;Jacobs et al. (2010)&lt;/a&gt;&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;26&lt;/a&gt;&lt;/sup&gt; found that among 100,000 (mostly white) Americans over a 10-year period, lower waist circumference was associated with reduced mortality, but it did not present data on waist circumferences smaller than 90 cm for men or 75 cm for women.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://dom-pubs.onlinelibrary.wiley.com/doi/abs/10.1111/dom.13050&quot;&gt;Lee et al. (2018)&lt;/a&gt;&lt;sup id=&quot;fnref:8&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;27&lt;/a&gt;&lt;/sup&gt; found that body fat predicted mortality more strongly than BMI, and visceral fat more strongly still. It found that people in the bottom third by fat mass had the lowest mortality, as did the people in the bottom third by visceral-fat-to-subcutaneous-fat ratio.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.nature.com/articles/0801787&quot;&gt;Visscher et al. (2001)&lt;/a&gt;&lt;sup id=&quot;fnref:17&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:17&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;28&lt;/a&gt;&lt;/sup&gt; found that male waist circumferences over 94 cm were associated with increased mortality, and failed to find a trend among women.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;is-there-an-upper-bound-on-healthy-lean-mass&quot;&gt;Is there an upper bound on healthy lean mass?&lt;/h2&gt;

&lt;p&gt;We know high BMIs are unhealthy, and we know that the main harms to health come from excess body fat. Does excess lean mass increase mortality risk? Or is it better to be as big and lean as possible?&lt;/p&gt;

&lt;p&gt;The Danish Diet, Cancer and Health Study found that, after controlling for body fat, mortality risk decreases with lean mass up to a fat-free mass index (FFMI) of 17 for women and 19 for men, and starts increasing again above that point. The increasing trend above 19/17 is not statistically significant: after adjusting for both body-fat mass index (BFMI) and smoking, the upper portion of the FFMI curve has a slope of 1.03 with a 95% confidence interval of [0.93, 1.13] (see Table 4 from &lt;a href=&quot;https://onlinelibrary.wiley.com/doi/full/10.1038/oby.2004.131&quot;&gt;Bigaard et al. (2004)&lt;/a&gt;&lt;sup id=&quot;fnref:3:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;).&lt;/p&gt;

&lt;p&gt;This suggests that there’s no harm to having a FFMI above 17 for women or 19 for men, but there’s no benefit, either.&lt;/p&gt;

&lt;p&gt;Alternatively, we can look at the relationship between resistance training and mortality, because resistance training is closely associated with lean mass. A 2022 meta-analysis by &lt;a href=&quot;https://bjsm.bmj.com/content/56/13/755&quot;&gt;Momma et al.&lt;/a&gt;&lt;sup id=&quot;fnref:25&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:25&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;29&lt;/a&gt;&lt;/sup&gt; found that muscle-strengthening activity was associated with reduced mortality risk up to about 60 minutes per week; above that, additional resistance training was associated with increasing mortality, and at about 140 minutes per week mortality risk matched that of 0 minutes. The meta-analysis did not control for BMI, which I suspected could explain the increase in mortality, but one large cohort study (&lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7417019/&quot;&gt;Patel et al. (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:27&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:27&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;30&lt;/a&gt;&lt;/sup&gt;) did control for BMI (among other factors) and also found that resistance training was associated with increased mortality above 60 minutes or so.&lt;/p&gt;

&lt;p&gt;I’ve seen some people dismiss this finding on the basis that resistance training is well-known to improve many health indicators like blood pressure and bone density. That’s true, but from what I’ve seen, the research primarily looks at low-dose resistance training, so we can’t say based on that research that increases in training volume monotonically improve health at high doses.&lt;/p&gt;

&lt;p&gt;Why might muscle-strengthening activity increase mortality risk? I have seen two proposed mechanisms:&lt;/p&gt;

&lt;p&gt;First, &lt;a href=&quot;https://bjsm.bmj.com/content/47/6/393.full&quot;&gt;Miyachi (2013)&lt;/a&gt;&lt;sup id=&quot;fnref:28&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:28&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;31&lt;/a&gt;&lt;/sup&gt; found that resistance training increases arterial stiffness, which &lt;a href=&quot;https://en.wikipedia.org/wiki/Arterial_stiffness&quot;&gt;increases heart disease risk&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Second, &lt;a href=&quot;https://www.strongerbyscience.com/research-spotlight-lifting-longevity/&quot;&gt;Nuckols (2022)&lt;/a&gt;&lt;sup id=&quot;fnref:29&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:29&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;32&lt;/a&gt;&lt;/sup&gt; writes:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The studies in [Momma et al. (2022)] mostly used older subjects. It’s entirely possible that the optimal dose of resistance training for older adults is a lot lower than the optimal dose of resistance training for younger adults. For example, oxidative stress and generalized inflammation likely contribute to biological aging, and older adults have higher levels of oxidative stress and generalized inflammation. Resistance training causes oxidative stress and inflammation in a dose-dependent manner, but this is generally a good thing – those stressors are triggers for training-induced adaptations, and they also trigger your body to ramp up endogenous antioxidant production so that you can better handle future stressors (resulting in net reductions in inflammation and oxidative stress at rest). However, excessive training doses can induce too much oxidative stress and inflammation, setting the stage for a variety of deleterious outcomes (which we tend to collectively refer to as “overtraining”. It’s entirely possible – likely, even – that the threshold between productive training-induced stress and unproductive training-induced stress is considerably lower in older adults than younger adults.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;summary-of-findings&quot;&gt;Summary of findings&lt;/h2&gt;

&lt;ol&gt;
  &lt;li&gt;None of the studies fully controlled for confounders, so we have to make a judgment call as to what the results would look like if we &lt;em&gt;did&lt;/em&gt; fully control for them.&lt;/li&gt;
  &lt;li&gt;Two big studies—the Danish Diet, Cancer and Health Study and the American Health Professionals Follow-Up Study—provide data on the association between mortality and lean mass / fat mass. If we extrapolate to what results we might see if we controlled for all confounders, these studies predict that men minimize mortality risk at 5–10% body fat and women minimize mortality risk at 15–20% body fat, as long as they have adequate lean mass.&lt;/li&gt;
  &lt;li&gt;Some research suggests that people in the lowest quintile of waist circumference have the lowest mortality risk, but we don’t know exactly what waist circumference minimizes mortality risk.&lt;/li&gt;
  &lt;li&gt;A great deal of research shows that on the higher end, higher body fat / bigger waist circumference is associated with increased mortality risk.&lt;/li&gt;
  &lt;li&gt;Higher lean body mass decreases mortality risk up to a fat-free mass index of 19 for men and 17 for women. Excess lean body mass above that point might increase mortality risk, but it’s not clear.&lt;/li&gt;
&lt;/ol&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Technically we want to look at lean mass, not muscle mass. Lean mass refers to &lt;em&gt;any&lt;/em&gt; body mass that isn’t fat, which includes muscles, bones, organs, etc. But the main way to gain lean mass is by building muscle. Resistance training does also increase bone mass, but much more slowly than muscle mass. I don’t know of any way to increase your organ mass, and even if you could, I don’t know that you would want to. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:24&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Spanos, C., Bretherton, I., Zajac, J. D., &amp;amp; Cheung, A. S. (2020). &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7061235/&quot;&gt;Effects of gender-affirming hormone therapy on insulin resistance and body composition in transgender individuals: a systematic review.&lt;/a&gt; See &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7061235/table/T1/?report=objectonly&quot;&gt;Table 1&lt;/a&gt;. &lt;a href=&quot;#fnref:24&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Bigaard, J., Frederiksen, K., Tjønneland, A., Thomsen, B. L., Overvad, K., Heitmann, B. L., &amp;amp; Sørensen, T. I. (2004). &lt;a href=&quot;https://onlinelibrary.wiley.com/doi/full/10.1038/oby.2004.131&quot;&gt;Body fat and fat-free mass and all-cause mortality.&lt;/a&gt; &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:3:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Bigaard, J., Frederiksen, K., Tjønneland, A., Thomsen, B. L., Overvad, K., Heitmann, B. L., &amp;amp; Sørensen, T. I. A. (2005). &lt;a href=&quot;https://www.nature.com/articles/0802976&quot;&gt;Waist circumference and body composition in relation to all-cause mortality in middle-aged men and women.&lt;/a&gt; &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:2:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Study participants died at only about half the rate of the general population, see Bigaard et al. (2004)&lt;sup id=&quot;fnref:3:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;. &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Looking purely at BMI, the study found that mortality was minimized at 25. Looking at the combined healthiest BFMI and FFMI, mortality was minimized at 24 for men and 23 for women. I believe the discrepancy comes from the fact that people with low body fat often have insufficient lean mass. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:36&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Instead of subtracting 4 points from BFMI, I could subtract 2 points from BFMI and 2 points from FFMI. That seems worse to me because I would expect that few people with a FFMI of 19/17 have health conditions that reduce their lean mass—if they did, their FFMI would be considerably lower. So the observation that an FFMI of 19/17 minimizes mortality shouldn’t be confounded by health conditions. &lt;a href=&quot;#fnref:36&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:9&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Lee, D. H., Keum, N., Hu, F. B., Orav, E. J., Rimm, E. B., Willett, W. C., &amp;amp; Giovannucci, E. L. (2018). &lt;a href=&quot;https://www.bmj.com/content/362/bmj.k2575.full&quot;&gt;Predicted lean body mass, fat mass, and all cause and cause specific mortality in men: prospective US cohort study.&lt;/a&gt; &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:37&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;According to their “model 2”, which controls for various factors but does not control for health conditions. &lt;a href=&quot;#fnref:37&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:15&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Figure 1 shows the lowest mortality at a fat mass of 21 kg, which corresponds to a surprisingly-high 27% body fat. I believe this is an artifact of the model used to generate Figure 1. Figure 1 fits the data to a cubic spline model which smooths out the bumpiness in the quintiles. The 1st quintile had the lowest mortality rate but its 95% CI overlapped with the 2nd and 3rd quintiles, so the cubic spline model smoothed these out and ended up predicting that mortality is minimized in the 3rd quintile. Thanks to co-author Edward Giovannucci for this explanation. &lt;a href=&quot;#fnref:15&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:38&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I use 23 instead of 24 because my analysis of Bigaard et al. (2004) found that splitting out lean mass and fat mass reduced apparent healthiest BMI by 1 point. &lt;a href=&quot;#fnref:38&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:16&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;For me personally, if I somehow dropped my body fat to 0% without losing any lean mass, I’d still have a BMI of 20. &lt;a href=&quot;#fnref:16&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:14&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Zhang, C., Rexrode, K. M., Van Dam, R. M., Li, T. Y., &amp;amp; Hu, F. B. (2008). &lt;a href=&quot;https://www.ahajournals.org/doi/full/10.1161/CIRCULATIONAHA.107.739714&quot;&gt;Abdominal obesity and the risk of all-cause, cardiovascular, and cancer mortality: sixteen years of follow-up in US women.&lt;/a&gt; &lt;a href=&quot;#fnref:14&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:39&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;My reasoning: If the 28–29 group looks only slightly less healthy than the &amp;lt;28 group for all participants (including smokers), and if the mortality curve follows a U shape, then the nadir of the curve should be just below 28 inches. If the 28–29 group looks substantially less healthy, then the nadir must be considerably lower than 28 inches, because that puts most of the curve near the nadir below 28 inches. &lt;a href=&quot;#fnref:39&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:11&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Hu, H., Wang, J., Han, X., Li, Y., Wang, F., Yuan, J., Miao, X., Yang, H., &amp;amp; He, M. (2018). &lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/S1279770723021930&quot;&gt;BMI, waist circumference and all-cause mortality in a middle-aged and elderly Chinese population.&lt;/a&gt; &lt;a href=&quot;#fnref:11&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:13&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Baik, I., Ascherio, A., Rimm, E. B., Giovannucci, E., Spiegelman, D., Stampfer, M. J., &amp;amp; Willett, W. C. (2000). &lt;a href=&quot;https://academic.oup.com/aje/article/152/3/264/73227&quot;&gt;Adiposity and mortality in men.&lt;/a&gt; &lt;a href=&quot;#fnref:13&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:23&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Flegal, K. M., Shepherd, J. A., Looker, A. C., Graubard, B. I., Borrud, L. G., Ogden, C. L., &amp;amp; Schenker, N. (2009). &lt;a href=&quot;https://www.sciencedirect.com/science/article/pii/S0002916523239659&quot;&gt;Comparisons of percentage body fat, body mass index, waist circumference, and waist-stature ratio in adults.&lt;/a&gt; &lt;a href=&quot;#fnref:23&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:19&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Meeuwsen, S., Horgan, G. W., &amp;amp; Elia, M. (2010). &lt;a href=&quot;https://www.sciencedirect.com/science/article/abs/pii/S026156141000004X&quot;&gt;The relationship between BMI and percent body fat, measured by bioelectrical impedance, in a large adult sample is curvilinear and influenced by age and sex.&lt;/a&gt; &lt;a href=&quot;#fnref:19&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:18&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Mills, T. C., Gallagher, D., Wang, J., &amp;amp; Heshka, S. (2007). &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3183503/&quot;&gt;Modelling the relationship between body fat and the BMI.&lt;/a&gt; &lt;a href=&quot;#fnref:18&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:40&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Then again, I would need to have –9% body fat to get my BMI down to 18 without losing lean mass, so maybe they’re on to something. &lt;a href=&quot;#fnref:40&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:21&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Ranasinghe, C., Gamage, P., Katulanda, P., Andraweera, N., Thilakarathne, S., &amp;amp; Tharanga, P. (2013). &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3766672/&quot;&gt;Relationship between body mass index (BMI) and body fat percentage, estimated by bioelectrical impedance, in a group of Sri Lankan adults: a cross sectional study.&lt;/a&gt; &lt;a href=&quot;#fnref:21&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:41&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I’ve seen claims that low-intensity exercise disproportionately burns visceral fat (&lt;a href=&quot;https://doi.org/10.14814/phy2.15853&quot;&gt;Brobakken et al. (2023)&lt;/a&gt;&lt;sup id=&quot;fnref:42&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:42&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;33&lt;/a&gt;&lt;/sup&gt;), and that cortisol (the “stress hormone”) disproportionately &lt;em&gt;adds&lt;/em&gt; visceral fat. But I think even if you exercise a lot and minimize stress, you still don’t want to have too much body fat. &lt;a href=&quot;#fnref:41&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Britton, K. A., Massaro, J. M., Murabito, J. M., Kreger, B. E., Hoffmann, U., &amp;amp; Fox, C. S. (2013). &lt;a href=&quot;https://www.jacc.org/doi/abs/10.1016/j.jacc.2013.06.027&quot;&gt;Body fat distribution, incident cardiovascular disease, cancer, and all-cause mortality.&lt;/a&gt; &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:12&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Cerhan, J. R., Moore, S. C., Jacobs, E. J., Kitahara, C. M., Rosenberg, P. S., Adami, H. O., Ebbert, J. O., English, D. R., Gapstur, S. M., Giles, G. G., &amp;amp; Horn-Ross, P. L. (2014). &lt;a href=&quot;https://www.sciencedirect.com/science/article/abs/pii/S0025619613010409&quot;&gt;A pooled analysis of waist circumference and mortality in 650,000 adults.&lt;/a&gt; &lt;a href=&quot;#fnref:12&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:10&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Chen, Y., Yang, Y., Jiang, H., Liang, X., Wang, Y., &amp;amp; Lu, W. (2019). &lt;a href=&quot;https://onlinelibrary.wiley.com/doi/full/10.1002/oby.22423&quot;&gt;Associations of BMI and waist circumference with all-cause mortality: a 22-Year cohort study.&lt;/a&gt; &lt;a href=&quot;#fnref:10&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Jacobs, E. J., Newton, C. C., Wang, Y., Patel, A. V., McCullough, M. L., Campbell, P. T., Thun, M. J., &amp;amp; Gapstur, S. M. (2010). &lt;a href=&quot;https://jamanetwork.com/journals/jamainternalmedicine/article-abstract/775594&quot;&gt;Waist circumference and all-cause mortality in a large US cohort.&lt;/a&gt; &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Lee, S. W., Son, J. Y., Kim, J. M., Hwang, S. S., Han, J. S., &amp;amp; Heo, N. J. (2018). &lt;a href=&quot;https://dom-pubs.onlinelibrary.wiley.com/doi/abs/10.1111/dom.13050&quot;&gt;Body fat distribution is more predictive of all-cause mortality than overall adiposity.&lt;/a&gt; &lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:17&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Visscher, T. L. S., Seidell, J. C., Molarius, A., van der Kuip, D., Hofman, A., &amp;amp; Witteman, J. C. M. (2001). &lt;a href=&quot;https://www.nature.com/articles/0801787&quot;&gt;A comparison of body mass index, waist–hip ratio and waist circumference as predictors of all-cause mortality among the elderly: the Rotterdam study.&lt;/a&gt; &lt;a href=&quot;#fnref:17&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:25&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Momma, H., Kawakami, R., Honda, T., &amp;amp; Sawada, S. S. (2022). &lt;a href=&quot;https://bjsm.bmj.com/content/56/13/755&quot;&gt;Muscle-strengthening activities are associated with lower risk and mortality in major non-communicable diseases: a systematic review and meta-analysis of cohort studies.&lt;/a&gt; &lt;a href=&quot;#fnref:25&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:27&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Patel, A. V., Hodge, J. M., Rees-Punia, E., Teras, L. R., Campbell, P. T., &amp;amp; Gapstur, S. M. (2020). &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7417019/&quot;&gt;Relationship Between Muscle-Strengthening Activity and Cause-Specific Mortality in a Large US Cohort.&lt;/a&gt; &lt;a href=&quot;#fnref:27&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:28&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Miyachi, M. (2013). &lt;a href=&quot;https://bjsm.bmj.com/content/47/6/393.full&quot;&gt;Effects of resistance training on arterial stiffness: a meta-analysis.&lt;/a&gt; &lt;a href=&quot;#fnref:28&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:29&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Nuckols, G. (2022). &lt;a href=&quot;https://www.strongerbyscience.com/research-spotlight-lifting-longevity/&quot;&gt;What is the optimal dose of resistance training for longevity?&lt;/a&gt; &lt;a href=&quot;#fnref:29&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:42&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Brobakken, M. F., Krogsæter, I., Helgerud, J., Wang, E., &amp;amp; Hoff, J. (2023). &lt;a href=&quot;https://doi.org/10.14814/phy2.15853&quot;&gt;Abdominal aerobic endurance exercise reveals spot reduction exists: A randomized controlled trial.&lt;/a&gt; &lt;a href=&quot;#fnref:42&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>What's the Healthiest BMI?</title>
				<pubDate>Sun, 05 May 2024 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2024/05/05/healthiest_BMI/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/05/05/healthiest_BMI/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;strong&gt;TLDR: 20 to 22.&lt;/strong&gt;&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;&lt;em&gt;&lt;a href=&quot;https://mdickens.me/confidence_tags/&quot;&gt;Confidence&lt;/a&gt;: Somewhat likely.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Last updated 2024-05-18.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Health organizations such as the American Heart Association recommend a body-mass index (BMI) of 18.5 to 25. But that’s a wide range. Surely we can say something more specific, right? I don’t want to know what’s &lt;em&gt;acceptable&lt;/em&gt;, I want to know what’s &lt;em&gt;optimal&lt;/em&gt;. What’s the exact best BMI for health?&lt;sup id=&quot;fnref:32&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:32&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;I couldn’t find an answer with a simple web search, which means now I have to write a post about it.&lt;sup id=&quot;fnref:31&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:31&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#the-big-bmi-studies&quot; id=&quot;markdown-toc-the-big-bmi-studies&quot;&gt;The big BMI studies&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#do-studies-still-overestimate-ideal-bmi&quot; id=&quot;markdown-toc-do-studies-still-overestimate-ideal-bmi&quot;&gt;Do studies still overestimate ideal BMI?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#details-age-sex-ethnicity-and-cause-specific-mortality&quot; id=&quot;markdown-toc-details-age-sex-ethnicity-and-cause-specific-mortality&quot;&gt;Details: Age, sex, ethnicity, and cause-specific mortality&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#exercise&quot; id=&quot;markdown-toc-exercise&quot;&gt;Exercise&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#a-final-bit-of-evidence&quot; id=&quot;markdown-toc-a-final-bit-of-evidence&quot;&gt;A final bit of evidence&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#changelog&quot; id=&quot;markdown-toc-changelog&quot;&gt;Changelog&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;the-big-bmi-studies&quot;&gt;The big BMI studies&lt;/h2&gt;

&lt;p&gt;I read&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; two big meta-analyses on BMI and mortality:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Global BMI Mortality Collaboration (2016). &lt;a href=&quot;https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(16)30175-1/fulltext&quot;&gt;Body-mass index and all-cause mortality: individual-participant-data meta-analysis of 239 prospective studies in four continents.&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Aune, Dagfinn; Sen, Abhijit; Prasad, Manya; Norat, Teresa; Janszky, Imre; Tonstad, Serena; Romundstad, Pål; Vatten, Lars J (2016). &lt;a href=&quot;https://www.bmj.com/content/353/bmj.i2156&quot;&gt;BMI and all cause mortality: systematic review and non-linear dose-response meta-analysis of 230 cohort studies with 3.74 million deaths among 30.3 million participants.&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We have absolute mountains of data on the association between BMI and mortality. In spite of that, it’s hard to say what BMI minimizes mortality risk because it depends on how you interpret the data.&lt;/p&gt;

&lt;p&gt;The raw mortality data says the healthiest BMI is 25—right on the cusp of the “overweight” range.&lt;/p&gt;

&lt;p&gt;But that’s wrong because confounding variables make low BMI look worse than it really is. Smokers tend to weigh less, and various health problems cause people to lose weight.&lt;/p&gt;

&lt;p&gt;If you adjust for smoking and health, and you also limit to studies with ≥15-year follow-ups to avoid confounding by undiagnosed illnesses,&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; &lt;strong&gt;the ideal BMI is 20–22&lt;/strong&gt;. Both meta-analyses agree on this conclusion (see &lt;a href=&quot;https://www.thelancet.com/action/showFullTableHTML?isHtml=true&amp;amp;tableId=tbl3&amp;amp;pii=S0140-6736%2816%2930175-1&quot;&gt;Table 3&lt;/a&gt; and &lt;a href=&quot;https://www.thelancet.com/cms/10.1016/S0140-6736(16)30175-1/attachment/e2a32d35-aeae-445a-8feb-fe6d9c03da38/mmc1.pdf&quot;&gt;Appendix eTable 8&lt;/a&gt; in the first paper, and see the &lt;a href=&quot;https://www.bmj.com/content/353/bmj.i2156&quot;&gt;abstract&lt;/a&gt; in the second paper).&lt;/p&gt;

&lt;p&gt;Here is a table of &lt;a href=&quot;https://en.wikipedia.org/wiki/Hazard_ratio&quot;&gt;hazard ratios&lt;/a&gt; (i.e., mortality rates relative to the 22.5–25 BMI group), taken from the first meta-analysis:&lt;/p&gt;

&lt;table&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;BMI&lt;/td&gt;
      &lt;td&gt;18.5–20&lt;/td&gt;
      &lt;td&gt;20–22.5&lt;/td&gt;
      &lt;td&gt;22.5–25&lt;/td&gt;
      &lt;td&gt;25–27.5&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;hazard ratio&lt;/td&gt;
      &lt;td&gt;1.13&lt;/td&gt;
      &lt;td&gt;1.00&lt;/td&gt;
      &lt;td&gt;1.00&lt;/td&gt;
      &lt;td&gt;1.07&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;(Berrington de Gonzalez et al. (2010)&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;, a smaller (but still large) meta-analysis of 1.5 million white individuals, provides mortality data to a granularity of 1 point of BMI in its &lt;a href=&quot;https://www.nejm.org/doi/suppl/10.1056/NEJMoa1000367/suppl_file/nejmoa1000367_appendix.pdf&quot;&gt;Supplemental Appendix&lt;/a&gt;, Figure 1. It finds a uniform mortality rate between 20 and 25.)&lt;/p&gt;

&lt;p&gt;Mortality appears to increase slowly above 25 and more quickly below 20. If we interpolate the shape of the mortality curve, it looks like the exact ideal BMI is just under 23.&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; But I think that’s too high, as I will explain in the next section.&lt;/p&gt;

&lt;h2 id=&quot;do-studies-still-overestimate-ideal-bmi&quot;&gt;Do studies still overestimate ideal BMI?&lt;/h2&gt;

&lt;p&gt;I have reason to suspect that these meta-analyses somewhat overestimate the ideal BMI even after controlling for smoking, health, and study follow-up time. The Global BMI Mortality Collaboration’s &lt;a href=&quot;https://www.thelancet.com/cms/10.1016/S0140-6736(16)30175-1/attachment/e2a32d35-aeae-445a-8feb-fe6d9c03da38/mmc1.pdf&quot;&gt;Supplemental Appendix&lt;/a&gt; has a nice chart showing what happens when you successively adjust for confounders:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/BMI-HR.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The Y axis shows mortality rate. Notice how the un-adjusted mortality rate for the 20–22.5 group starts out considerably higher than for the 22.5–25 group, and (almost) every time you adjust for a confounder, its mortality rate gets lower. By the end, the mortality rates for 20–22.5 and for 22.5–25 are nearly identical.&lt;/p&gt;

&lt;p&gt;(More generally, as you adjust for more things, all BMI groups on the low end look better and all groups on the high end look worse.)&lt;/p&gt;

&lt;p&gt;It seems likely that this meta-analysis didn’t adjust for every possible confounder. Whenever we make an adjustment, the lower BMIs look better. So presumably, if we adjusted for the as-yet-undiscovered confounders, the ideal BMI would look lower than what these meta-analyses report.&lt;/p&gt;

&lt;p&gt;And indeed, Aune et al. (2016) found that BMIs in the 20–22.5 range looked healthier than 22.5–25 when restricted to studies with a ≥20-year follow-up (see &lt;a href=&quot;https://www.bmj.com/content/bmj/suppl/2016/05/04/bmj.i2156.DC1/aund030215.ww2.pdf&quot;&gt;Appendix 2&lt;/a&gt;, Table E). Global BMI Mortality Collaboration (2016) tried restricting to &amp;gt;5-year follow-ups (in the green and black points shown in eFigure 6 above), but it appears that 5 years is not enough to fully eliminate confounding by undiagnosed diseases.&lt;/p&gt;

&lt;p&gt;Aune et al. (2016)’s Table E reported hazard ratios by BMI with ≥20-year follow-up:&lt;/p&gt;

&lt;table&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;BMI&lt;/td&gt;
      &lt;td&gt;17.5&lt;/td&gt;
      &lt;td&gt;20&lt;/td&gt;
      &lt;td&gt;22&lt;/td&gt;
      &lt;td&gt;23&lt;/td&gt;
      &lt;td&gt;24&lt;/td&gt;
      &lt;td&gt;25&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;hazard ratio&lt;/td&gt;
      &lt;td&gt;1.06&lt;/td&gt;
      &lt;td&gt;0.99&lt;/td&gt;
      &lt;td&gt;0.99&lt;/td&gt;
      &lt;td&gt;1.00&lt;/td&gt;
      &lt;td&gt;1.02&lt;/td&gt;
      &lt;td&gt;1.06&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The &lt;a href=&quot;https://www.nejm.org/doi/full/10.1056/NEJMoa1000367&quot;&gt;NCI Cohort Consortium meta-analysis&lt;/a&gt;&lt;sup id=&quot;fnref:3:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; found that both the 18.5–20 and 20–22.5 groups have lower mortality than 22.5–25 if you only look at studies with ≥15-year follow-ups—both the low-BMI groups had hazard ratios of 0.92 (see &lt;a href=&quot;https://www.nejm.org/cms/10.1056/NEJMoa1000367/asset/525efabe-18be-42e0-aa2f-b527c165687e/assets/images/large/nejmoa1000367_t2.jpg&quot;&gt;Table 2&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Before I said the ideal BMI was around 23. If we extrapolate the pattern of adjustments from Global BMI Mortality Collaboration’s eFigure 6, or look at Aune et al. (2016)’s 20-year follow-up data, it looks like the ideal BMI is more like 21. (But really, we could just as well say that the ideal BMI is anywhere from 20 to 22 because mortality doesn’t detectably vary within that range.)&lt;/p&gt;

&lt;h2 id=&quot;details-age-sex-ethnicity-and-cause-specific-mortality&quot;&gt;Details: Age, sex, ethnicity, and cause-specific mortality&lt;/h2&gt;

&lt;p&gt;The Global BMI Mortality Collaboration study broke down mortality by age, sex, and cause of death (see &lt;a href=&quot;https://www.thelancet.com/cms/10.1016/S0140-6736(16)30175-1/attachment/e2a32d35-aeae-445a-8feb-fe6d9c03da38/mmc1.pdf&quot;&gt;Supplemental Appendix&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Ideal BMI (before adjusting for confounders) goes up with age:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Age&lt;/th&gt;
      &lt;th&gt;Ideal BMI&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;35–49&lt;/td&gt;
      &lt;td&gt;20–22.5&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;50–69&lt;/td&gt;
      &lt;td&gt;22.5–25&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;70–89&lt;/td&gt;
      &lt;td&gt;25–27.5&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Ideal BMI does not vary by sex, but males are more sensitive to BMI than females. That is, females can tolerate greater deviations from the healthy BMI range without as much increase in mortality risk.&lt;/p&gt;

&lt;p&gt;As far as I could find, there is no research whatsoever on the relationship between BMI and mortality for transgender people. Presumably, trans individuals should also target a BMI in the 20 to 22 range, but I can’t empirically confirm that.&lt;/p&gt;

&lt;p&gt;There’s mixed evidence on BMI and ethnicity:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;In 2004, the World Health Organization &lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/14726171/&quot;&gt;suggested&lt;/a&gt;&lt;sup id=&quot;fnref:28&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:28&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt; that most Asian sub-populations should use lower BMI cutoffs to define overweightness and obesity. A &lt;a href=&quot;https://www.researchgate.net/profile/Vanisha-Nambiar/publication/367462971_Consensus_Statement_for_Diagnosis_of_Obesity_Abdominal_Obesity_and_the_Metabolic_Syndrome_for_Asian_Indians_and_Recommendations_for_Physical_Activity_Medical_and_Surgical_Management/links/63d35ba6c97bd76a823c820e/Consensus-Statement-for-Diagnosis-of-Obesity-Abdominal-Obesity-and-the-Metabolic-Syndrome-for-Asian-Indians-and-Recommendations-for-Physical-Activity-Medical-and-Surgical-Management.pdf&quot;&gt;consensus statement on Asian Indians&lt;/a&gt;&lt;sup id=&quot;fnref:29&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:29&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt; (the ethnicity that appears most sensitive to high BMI) proposed defining “normal” BMI for Indians as 18–23, overweight as &amp;gt;23, and obese as &amp;gt;25.&lt;/li&gt;
  &lt;li&gt;A large UK study&lt;sup id=&quot;fnref:26&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:26&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt; found that different ethnicities had different mortality rates on the low and high ends of BMI, but not in the middle, i.e., the healthiest BMI did not vary by ethnicity. (See Table S2.8 and Figure S2.7 in the &lt;a href=&quot;https://www.thelancet.com/cms/10.1016/S2213-8587(18)30288-2/attachment/5f0b20fe-7932-4454-bdab-ee66e51cdf4e/mmc1.pdf&quot;&gt;Supplemental Appendix&lt;/a&gt;.)&lt;sup id=&quot;fnref:30&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:30&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt; This is consistent with the consensus statements—the cutoffs for overweight/obese should probably vary by ethnicity—but it also suggests that &lt;em&gt;ideal&lt;/em&gt; BMI does not vary much.&lt;/li&gt;
  &lt;li&gt;There is theoretical reason to expect ideal BMI to depend on ethnicity. Namely, different ethnicities tend to distribute body fat differently, with Asians carrying more body fat at a given BMI than the average human, and blacks carrying slightly less. (But diabetes risk appears to increase slightly more rapidly with BMI for blacks than for whites, which contradicts this.&lt;sup id=&quot;fnref:27&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:27&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We simply don’t have enough evidence to draw strong conclusions about how ethnicity affects ideal BMI. As best I can tell, all ethnicities should target a BMI in the 20–22 range.&lt;/p&gt;

&lt;p&gt;Ideal BMI varies by cause of death. This table (adapted from Global BMI Mortality Collaboration’s &lt;a href=&quot;https://www.thelancet.com/cms/10.1016/S0140-6736(16)30175-1/attachment/e2a32d35-aeae-445a-8feb-fe6d9c03da38/mmc1.pdf&quot;&gt;Supplemental Appendix&lt;/a&gt;, eTable 15) shows the hazard ratios for four BMI ranges broken down by some of the most common causes of death.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;18.5–20&lt;/th&gt;
      &lt;th&gt;20–22.5&lt;/th&gt;
      &lt;th&gt;22.5–25&lt;/th&gt;
      &lt;th&gt;25–27.5&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;coronary heart disease&lt;/td&gt;
      &lt;td&gt;0.95&lt;/td&gt;
      &lt;td&gt;0.89&lt;/td&gt;
      &lt;td&gt;1.00&lt;/td&gt;
      &lt;td&gt;1.18&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;other cardiovascular disease&lt;/td&gt;
      &lt;td&gt;1.14&lt;/td&gt;
      &lt;td&gt;0.98&lt;/td&gt;
      &lt;td&gt;1.00&lt;/td&gt;
      &lt;td&gt;1.11&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;stroke&lt;/td&gt;
      &lt;td&gt;1.15&lt;/td&gt;
      &lt;td&gt;1.01&lt;/td&gt;
      &lt;td&gt;1.00&lt;/td&gt;
      &lt;td&gt;1.05&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;respiratory disease&lt;/td&gt;
      &lt;td&gt;1.73&lt;/td&gt;
      &lt;td&gt;1.22&lt;/td&gt;
      &lt;td&gt;1.00&lt;/td&gt;
      &lt;td&gt;1.00&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;cancer&lt;/td&gt;
      &lt;td&gt;1.01&lt;/td&gt;
      &lt;td&gt;0.96&lt;/td&gt;
      &lt;td&gt;1.00&lt;/td&gt;
      &lt;td&gt;1.05&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The second meta-analysis, &lt;a href=&quot;https://www.bmj.com/content/353/bmj.i2156&quot;&gt;Aune et al. (2016)&lt;/a&gt;, claimed “weight loss can precede the diagnosis of some neurological and respiratory diseases by as much as 10-15 years.” The effect discussed in the previous section, where longer follow-ups make lower BMIs look better, appears primarily driven by neurological and respiratory disease.&lt;/p&gt;

&lt;h2 id=&quot;exercise&quot;&gt;Exercise&lt;/h2&gt;

&lt;p&gt;Exercise has two major confounding effects:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Much of the harm of a high BMI comes from living a sedentary lifestyle. Controlling for cardiorespiratory fitness makes BMI look less important, especially at high BMIs.&lt;sup id=&quot;fnref:19&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:19&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;Resistance training increases muscle mass, which raises BMI but not in a harmful way. If you control for resistance training, that should lower the apparent ideal BMI for low-muscle individuals while raising it for high-muscle individuals.&lt;sup id=&quot;fnref:11&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;(I thought exercise might have a third effect: exercise disproportionately reduces respiratory disease, which makes low BMI look healthier. This turned out to be true, but adjusting for it only reduces the hazard ratio for the 20–22.5 group by 1%.&lt;sup id=&quot;fnref:20&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:20&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;14&lt;/a&gt;&lt;/sup&gt;)&lt;/p&gt;

&lt;p&gt;In a meta-analysis, &lt;a href=&quot;http://dx.doi.org/10.1016/j.pcad.2013.09.002&quot;&gt;Barry et al. (2013)&lt;/a&gt;&lt;sup id=&quot;fnref:19:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:19&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt; found that physical activity substantially reduces the negative health effects of overweightness and obesity, but did not look at how physical activity affects individuals with a BMI of less than 25.&lt;/p&gt;

&lt;p&gt;Is exercise more helpful for overweight than for underweight individuals? (If so, that could raise the ideal BMI for physically active people.) Or is it equally good on both ends of the BMI spectrum? (In which case the ideal BMI wouldn’t change.)&lt;/p&gt;

&lt;p&gt;I found three studies which suggest that it’s the latter: exercise expands the healthy BMI range without obviously changing the ideal BMI.&lt;sup id=&quot;fnref:21&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:21&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;15&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;While we’re on the subject of exercise, I should mention that exercise matters more for health than BMI, especially if you already fall within the “normal” BMI range. Exercise &lt;strong&gt;halves&lt;/strong&gt; your mortality rate&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;16&lt;/a&gt;&lt;/sup&gt; (for the collective “you”; individual results may vary). Moving your BMI from 25 to 22 only decreases your mortality rate by around 5%.&lt;/p&gt;

&lt;p&gt;(Reducing BMI from 25 to 22 has an effect size of 0.5. Going from sedentary to physically active (as defined by the &lt;a href=&quot;https://www.cdc.gov/physicalactivity/basics/adults/index.htm&quot;&gt;physical activity guidelines&lt;/a&gt;&lt;sup id=&quot;fnref:8&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;17&lt;/a&gt;&lt;/sup&gt;) has an effect size of 17.)&lt;/p&gt;

&lt;h2 id=&quot;a-final-bit-of-evidence&quot;&gt;A final bit of evidence&lt;/h2&gt;

&lt;p&gt;In &lt;a href=&quot;https://www.amazon.com/Eat-Drink-Be-Healthy-Harvard/dp/1501164775&quot;&gt;Eat, Drink, and Be Healthy&lt;/a&gt;, Walter Willett—who co-authored the Global BMI Mortality Collaboration meta-analysis—had this to say about the ideal BMI.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The 2015–2020 &lt;em&gt;Dietary Guidelines for Americans&lt;/em&gt; sets healthy weights as those corresponding to BMIs between 18.5 and 25. […] Panel members agreed that the risk of heart disease, diabetes, and high blood pressure begins to climb at a BMI of 22 or so. But they didn’t feel justified choosing such a low number as the cutoff between healthy and unhealthy weights, because doing so would have labeled a large majority of the U.S. population as overweight. […] [B]ut many people with a BMI of 23 to 25 are not at their healthiest weight.&lt;/p&gt;

  &lt;p&gt;[…]&lt;/p&gt;

  &lt;p&gt;What about BMIs under 18.5, which the government’s tables say isn’t healthy? This can, indeed, signal an unhealthy weight, especially if an individual has been losing weight or has an eating disorder. But people who have maintained a low BMI while eating healthfully and being active are usually just fine and have no reason to increase their weight.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;changelog&quot;&gt;Changelog&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;2024-05-18: Add a comment on BMI for transgender people, and fix a numeric error.&lt;/li&gt;
&lt;/ul&gt;


&lt;h2 id=&quot;notes&quot;&gt;Notes&lt;/h2&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:32&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;It makes sense why the American Heart Association doesn’t care about knowing the exact best BMI: why worry about people with slightly sub-optimal BMIs when a third of Americans have BMIs over 30? From a population-level perspective, getting everyone with a BMI of 24 down to 22 would reduce mortality by approximately zero percent.&lt;/p&gt;

      &lt;p&gt;But my BMI is 24 and I would like to know if I should lower it. &lt;a href=&quot;#fnref:32&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:31&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;In the process of writing this post, I went through this cycle five times:&lt;/p&gt;

      &lt;ol&gt;
        &lt;li&gt;My research is done! I will do a final read-through to tighten up the prose.&lt;/li&gt;
        &lt;li&gt;(in the middle of reading) Hmmm, this part makes me wonder about something. I will investigate further.&lt;/li&gt;
        &lt;li&gt;(5 hours later) Ok, I think I’ve figured it out. Let me write up these new findings.&lt;/li&gt;
        &lt;li&gt;(post is now 500 words longer) My research is done! I will do a final read-through to tighten up the prose.&lt;/li&gt;
      &lt;/ol&gt;

      &lt;p&gt;At no point in my many revisions did the TLDR change, so maybe it was all a waste of time, but at least now I have 14 citations instead of three, which makes me look more scholarly. &lt;a href=&quot;#fnref:31&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;By which I mean I read the introduction and conclusion and looked at the tables and figures. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Honestly I don’t fully understand the reasoning behind this, but both meta-analyses claim that short-term studies are confounded by undiagnosed diseases that decrease BMI and increase mortality.&lt;/p&gt;

      &lt;p&gt;I think the idea is that long-term studies ignore anyone who dies in the first few years, and anyone who dies after that probably didn’t have an undiagnosed illness at the beginning of the study or else they would have died sooner. Many studies use 5-year delays to combat this, but apparently many illnesses go undiagnosed for more than 5 years, and you need at least 15 years to eliminate confounding. In fact, Aune et al. (2016) found that even a 15-year follow-up is noticeably more confounded than a 20-year follow-up—see &lt;a href=&quot;https://www.bmj.com/content/bmj/suppl/2016/05/04/bmj.i2156.DC1/aund030215.ww2.pdf&quot;&gt;Appendix 2&lt;/a&gt;, Table E. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Berrington de Gonzalez A, Hartge P, Cerhan JR, Flint AJ, Hannan L, MacInnis RJ, Moore SC, Tobias GS, Anton-Culver H, Freeman LB, Beeson WL, Clipp SL, English DR, Folsom AR, Freedman DM, Giles G, Hakansson N, Henderson KD, Hoffman-Bolton J, Hoppin JA, Koenig KL, Lee IM, Linet MS, Park Y, Pocobelli G, Schatzkin A, Sesso HD, Weiderpass E, Willcox BJ, Wolk A, Zeleniuch-Jacquotte A, Willett WC, Thun MJ. &lt;a href=&quot;https://www.nejm.org/doi/full/10.1056/NEJMoa1000367&quot;&gt;Body-Mass Index and Mortality among 1.46 Million White Adults.&lt;/a&gt; &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:3:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I took the four groups from 18.5 to 27.5 (I included the 25–27.5 group because it has similar mortality risk to 18.5–20) and fitted a quadratic curve to them. The curve had a minimum at 23.1.&lt;/p&gt;
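
      &lt;p&gt;Here is a minimal sketch of that quadratic fit (my reconstruction, not the original script; I’m assuming each BMI group sits at its midpoint):&lt;/p&gt;

      &lt;pre&gt;&lt;code&gt;import numpy as np

# Midpoints of the four BMI groups from 18.5 to 27.5 (my assumption for
# where each group sits) and their all-cause hazard ratios from the table.
bmi = np.array([19.25, 21.25, 23.75, 26.25])
hr = np.array([1.13, 1.00, 1.00, 1.07])

a, b, c = np.polyfit(bmi, hr, 2)  # least-squares fit: hr = a*bmi^2 + b*bmi + c
nadir = -b / (2 * a)              # vertex of the parabola
print(round(nadir, 1))            # 23.1
&lt;/code&gt;&lt;/pre&gt;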

      &lt;p&gt;Originally I tried fitting a quadratic curve to eight BMI groups (from 15 BMI up to 40), but the curve wasn’t a good fit. When I fit the eight groups to a quartic curve, it fit well and had a minimum at 22.7, although a fourth-degree polynomial might have too many degrees of freedom.&lt;/p&gt;

      &lt;p&gt;&lt;img src=&quot;/assets/images/BMI-mortality-quartic-fit.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

      &lt;p&gt;(The quadratic curve is the little blue one, and the quartic curve is the big orange one.) &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:28&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;WHO Expert Consultation. (2004). &lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/14726171/&quot;&gt;Appropriate body-mass index for Asian populations and its implications for policy and intervention strategies.&lt;/a&gt; &lt;a href=&quot;#fnref:28&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:29&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Misra A, Chowbey P, Makkar BM, Vikram NK, Wasir JS, Chadha D, Joshi SR, Sadikot S, Gupta R, Gulati S, Munjal YP. (2009). &lt;a href=&quot;https://www.researchgate.net/profile/Vanisha-Nambiar/publication/367462971_Consensus_Statement_for_Diagnosis_of_Obesity_Abdominal_Obesity_and_the_Metabolic_Syndrome_for_Asian_Indians_and_Recommendations_for_Physical_Activity_Medical_and_Surgical_Management/links/63d35ba6c97bd76a823c820e/Consensus-Statement-for-Diagnosis-of-Obesity-Abdominal-Obesity-and-the-Metabolic-Syndrome-for-Asian-Indians-and-Recommendations-for-Physical-Activity-Medical-and-Surgical-Management.pdf&quot;&gt;Consensus Statement for Diagnosis of Obesity, Abdominal Obesity and the Metabolic Syndrome for Asian Indians and Recommendations for Physical Activity, Medical and Surgical Management.&lt;/a&gt; &lt;a href=&quot;#fnref:29&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:26&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Bhaskaran K, dos-Santos-Silva I, Leon DA, Douglas IJ, Smeeth L. (2018). &lt;a href=&quot;https://www.thelancet.com/journals/landia/article/PIIS2213-8587(18)30288-2/fulltext&quot;&gt;Association of BMI with overall and cause-specific mortality: a population-based cohort study of 3·6 million adults in the UK.&lt;/a&gt; &lt;a href=&quot;#fnref:26&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:30&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;This study had wide confidence intervals on its mortality rates for non-white ethnicities because most subjects were white. And the study only covered UK residents, which could affect the results due to local diet or selection effects from immigration. But it was the only study I could find that looked at all-cause mortality by ethnicity. I found good studies that looked at diabetes morbidity, but no others on all-cause mortality. &lt;a href=&quot;#fnref:30&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:27&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Shai I, Jiang R, Manson JE, Stampfer MJ, Willett WC, Colditz GA, Hu FB. (2006). &lt;a href=&quot;https://dash.harvard.edu/bitstream/handle/1/41293007/28918%201585.full.pdf&quot;&gt;Ethnicity, Obesity, and Risk of Type 2 Diabetes in Women.&lt;/a&gt; &lt;a href=&quot;#fnref:27&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:19&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Barry VW, Baruth M, Beets MW, Durstine JL, Liu J, Blair SN. (2013). &lt;a href=&quot;https://g-se.com/uploads/blog_adjuntos/fitness_vs._fatness_on_all_cause_mortality_a_meta_analysis.pdf&quot;&gt;Fitness vs. fatness on all-cause mortality: a meta-analysis.&lt;/a&gt; &lt;a href=&quot;#fnref:19&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:19:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:11&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Abramowitz MK, Hall CB, Amodu A, Sharma D, Androga L, Hawkins M. (2018). &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5894968/&quot;&gt;Muscle mass, BMI, and mortality among adults in the United States: A population-based cohort study.&lt;/a&gt; &lt;a href=&quot;#fnref:11&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:20&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;https://www.bmj.com/content/370/bmj.m2031&quot;&gt;Zhao et al. (2020)&lt;/a&gt;&lt;sup id=&quot;fnref:6:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;16&lt;/a&gt;&lt;/sup&gt;, a cohort study on exercise and mortality,&lt;sup id=&quot;fnref:10&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;18&lt;/a&gt;&lt;/sup&gt; reports in &lt;a href=&quot;https://www.bmj.com/highwire/markup/1030297/expansion&quot;&gt;Table 2&lt;/a&gt; that exercise has a hazard ratio of 0.48 on all-cause mortality (meaning people who meet the physical activity guidelines&lt;sup id=&quot;fnref:8:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:8&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;17&lt;/a&gt;&lt;/sup&gt; are only 48% as likely to die as people who don’t). A table of relevant hazard ratios for individuals who meet the exercise guidelines:&lt;/p&gt;

      &lt;table&gt;
        &lt;thead&gt;
          &lt;tr&gt;
            &lt;th&gt; &lt;/th&gt;
            &lt;th&gt;All-cause mortality&lt;/th&gt;
            &lt;th&gt;Lower respiratory disease&lt;/th&gt;
            &lt;th&gt;Flu/pneumonia&lt;/th&gt;
          &lt;/tr&gt;
        &lt;/thead&gt;
        &lt;tbody&gt;
          &lt;tr&gt;
            &lt;td&gt;hazard ratio&lt;/td&gt;
            &lt;td&gt;0.48&lt;/td&gt;
            &lt;td&gt;0.21&lt;/td&gt;
            &lt;td&gt;0.36&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
            &lt;td&gt;hazard ratio, controlling for BMI and other lifestyle factors&lt;/td&gt;
            &lt;td&gt;0.58&lt;/td&gt;
            &lt;td&gt;0.29&lt;/td&gt;
            &lt;td&gt;0.45&lt;/td&gt;
          &lt;/tr&gt;
        &lt;/tbody&gt;
      &lt;/table&gt;

      &lt;p&gt;Whether we control for BMI or not, exercise has a bigger effect on respiratory disease than on all-cause mortality. To calculate how much this matters, we need to know what proportion of people die of respiratory disease.&lt;/p&gt;

      &lt;p&gt;&lt;a href=&quot;https://ourworldindata.org/grapher/annual-number-of-deaths-by-cause?country=~North+America+%28WB%29&quot;&gt;Our World in Data&lt;/a&gt; says that, for North America and Europe, respiratory illnesses account for around 10% of deaths, and they consist of around 2/3 upper respiratory and 1/3 lower respiratory disease. Using the weighted average of the exercise hazard ratios from Zhao et al. (2020) gives an overall hazard ratio for respiratory disease of 0.40.&lt;/p&gt;

      &lt;p&gt;I calculated exercise and BMI hazard ratios for all causes excluding respiratory disease as the solution to &lt;code&gt;all-cause HR = 0.1 * respiratory HR + 0.9 * non-respiratory HR&lt;/code&gt;.&lt;/p&gt;

      &lt;p&gt;Then I adjusted the BMI mortality rates for respiratory disease and for non-respiratory mortality using the respective exercise hazard ratios and re-combined them into an all-cause HR for physically active individuals.&lt;/p&gt;

      &lt;p&gt;I used the exercise HRs controlling for BMI rather than the uncontrolled HRs, on the theory that exercise probably lowers BMI on average, and we want to know the health effect of exercise irrespective of how it changes BMI.&lt;/p&gt;

      &lt;p&gt;It would be more accurate to fully break out mortality by cause instead of just breaking it into respiratory and non-respiratory, but I can’t really do that because Global BMI Mortality Collaboration (2016) and Zhao et al. (2020) don’t classify causes in the same way.&lt;/p&gt;
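
      &lt;p&gt;Here is a minimal sketch of the whole adjustment (my reconstruction of the steps above, restricted to the four BMI groups shown in the eTable 15 excerpt; every number comes from this footnote or the tables):&lt;/p&gt;

      &lt;pre&gt;&lt;code&gt;# BMI hazard ratios from eTable 15: all causes and respiratory disease.
groups      = ['18.5-20', '20-22.5', '22.5-25', '25-27.5']
all_cause   = [1.13, 1.00, 1.00, 1.07]
respiratory = [1.73, 1.22, 1.00, 1.00]

resp_share = 0.10  # ~10% of deaths are respiratory (Our World in Data)
# Exercise HRs controlling for BMI: 2/3 weight on flu/pneumonia (0.45),
# standing in for upper respiratory, and 1/3 on lower respiratory (0.29).
ex_resp = 2/3 * 0.45 + 1/3 * 0.29  # ~0.40
ex_all = 0.58
# Solve: all-cause HR = 0.1 * respiratory HR + 0.9 * non-respiratory HR
ex_nonresp = (ex_all - resp_share * ex_resp) / (1 - resp_share)

# Split each BMI group's mortality into respiratory and non-respiratory,
# apply the matching exercise HR to each part, and recombine.
adjusted = []
for ac, resp in zip(all_cause, respiratory):
    nonresp = (ac - resp_share * resp) / (1 - resp_share)
    adjusted.append(resp_share * resp * ex_resp
                    + (1 - resp_share) * nonresp * ex_nonresp)

ref = adjusted[groups.index('22.5-25')]  # normalize this group to HR 1
for group, hr in zip(groups, adjusted):
    print(group, round(hr / ref, 2))     # 1.11, 0.99, 1.0, 1.07
&lt;/code&gt;&lt;/pre&gt;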

      &lt;p&gt;This table presents the hazard ratios (HRs) of different BMIs for people who meet the physical activity guidelines, accounting for respiratory disease (normalized so the 22.5–25 group has an HR of 1).&lt;/p&gt;

      &lt;table&gt;
        &lt;thead&gt;
          &lt;tr&gt;
            &lt;th&gt; &lt;/th&gt;
            &lt;th&gt;15–18.5&lt;/th&gt;
            &lt;th&gt;18.5–20&lt;/th&gt;
            &lt;th&gt;20–22.5&lt;/th&gt;
            &lt;th&gt;22.5–25&lt;/th&gt;
            &lt;th&gt;25–27.5&lt;/th&gt;
          &lt;/tr&gt;
        &lt;/thead&gt;
        &lt;tbody&gt;
          &lt;tr&gt;
            &lt;td&gt;All-cause hazard ratio&lt;/td&gt;
            &lt;td&gt;1.51&lt;/td&gt;
            &lt;td&gt;1.13&lt;/td&gt;
            &lt;td&gt;1.00&lt;/td&gt;
            &lt;td&gt;1.00&lt;/td&gt;
            &lt;td&gt;1.07&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
            &lt;td&gt;All-cause HR with exercise&lt;/td&gt;
            &lt;td&gt;1.45&lt;/td&gt;
            &lt;td&gt;1.11&lt;/td&gt;
            &lt;td&gt;0.99&lt;/td&gt;
            &lt;td&gt;1.00&lt;/td&gt;
            &lt;td&gt;1.07&lt;/td&gt;
          &lt;/tr&gt;
        &lt;/tbody&gt;
      &lt;/table&gt;

      &lt;p&gt;Accounting for the effect of exercise on respiratory disease did make lower BMIs look slightly better, but it made hardly any difference. &lt;a href=&quot;#fnref:20&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:21&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;ol&gt;
        &lt;li&gt;Berrington de Gonzalez et al. (2010)&lt;sup id=&quot;fnref:3:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt; (in its &lt;a href=&quot;https://www.nejm.org/doi/suppl/10.1056/NEJMoa1000367/suppl_file/nejmoa1000367_appendix.pdf&quot;&gt;Supplemental Appendix&lt;/a&gt;, Table 7)&lt;sup id=&quot;fnref:25&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:25&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;19&lt;/a&gt;&lt;/sup&gt; found that physical activity reduced hazard ratios for both low and high BMIs. It also slightly decreased the 20–22.5 BMI hazard ratio from 1.02 to 0.99, which could mean exercise slightly decreases the ideal BMI, but the change in hazard ratio was highly non-significant (p &amp;gt; 0.6) and non-monotonic (the moderate-activity group with 20–22.5 BMI had a hazard ratio of 0.97, lower than either the low-activity or high-activity group).&lt;/li&gt;
        &lt;li&gt;Garfinkel et al. (1988)&lt;sup id=&quot;fnref:22&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:22&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;20&lt;/a&gt;&lt;/sup&gt; wrote that exercise had a greater effect on mortality for both underweight and overweight individuals than for normal-weight people, but the paper did not provide numbers.&lt;/li&gt;
        &lt;li&gt;Lee &amp;amp; Kim (2020)&lt;sup id=&quot;fnref:23&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:23&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;21&lt;/a&gt;&lt;/sup&gt; looked specifically at underweight older adults and found that exercise had a hazard ratio of 0.68. Zhao et al. (2020)&lt;sup id=&quot;fnref:6:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;16&lt;/a&gt;&lt;/sup&gt; found an exercise hazard ratio of 0.48 for the general population, but Zhao used a stricter definition of physical activity, so these numbers aren’t directly comparable.&lt;/li&gt;
      &lt;/ol&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:21&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Zhao M, Veeranki S P, Magnussen C G, Xi B. (2020). &lt;a href=&quot;https://www.bmj.com/content/370/bmj.m2031&quot;&gt;Recommended physical activity and all cause and cause specific mortality in US adults: prospective cohort study.&lt;/a&gt; &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:6:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:6:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:8&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;To meet the &lt;a href=&quot;https://www.cdc.gov/physicalactivity/basics/adults/index.htm&quot;&gt;physical activity guidelines&lt;/a&gt;, you must do both&lt;/p&gt;

      &lt;ol&gt;
        &lt;li&gt;At least one of:
          &lt;ul&gt;
            &lt;li&gt;150 minutes of moderate physical activity per week&lt;/li&gt;
            &lt;li&gt;75 minutes of vigorous physical activity per week&lt;/li&gt;
          &lt;/ul&gt;
        &lt;/li&gt;
        &lt;li&gt;Muscle strengthening activity at least two days per week&lt;/li&gt;
      &lt;/ol&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:8&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:8:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:10&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I cited this cohort study instead of a meta-analysis because I couldn’t find any meta-analyses that looked at exercise and cause-specific mortality, specifically including respiratory disease. The &lt;a href=&quot;https://journals.lww.com/acsm-msse/fulltext/2019/06000/physical_activity,_all_cause_and_cardiovascular.22.aspx&quot;&gt;2018 Physical Activity Guidelines Advisory Committee report&lt;/a&gt; (the most comprehensive meta-analysis I could find; it’s actually a meta-meta-analysis) looked at all-cause mortality and heart disease, but not respiratory disease. &lt;a href=&quot;#fnref:10&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:25&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Berrington de Gonzalez et al. (2010) only included white individuals. The relationship between BMI and mortality varies a little with race, but I have no reason to believe that race changes the effect of physical activity. &lt;a href=&quot;#fnref:25&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:22&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Garfinkel L, Stellman SD. (1988). &lt;a href=&quot;https://acsjournals.onlinelibrary.wiley.com/doi/abs/10.1002/1097-0142(19881015)62:1+%3C1844::AID-CNCR2820621328%3E3.0.CO;2-O&quot;&gt;Mortality by relative weight and exercise.&lt;/a&gt; &lt;a href=&quot;#fnref:22&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:23&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Lee I, Kim B. (2020). &lt;a href=&quot;https://www.ksep-es.org/journal/view.php?number=841&quot;&gt;Association between Estimated Cardiorespiratory Fitness and All-cause Mortality in Underweight Older Adults.&lt;/a&gt; &lt;a href=&quot;#fnref:23&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Caffeine Cycling Self-Experiment</title>
				<pubDate>Thu, 11 Apr 2024 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2024/04/11/caffeine_self_experiment/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/04/11/caffeine_self_experiment/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;Last updated 2024-07-26 to clarify wording.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;a href=&quot;https://mdickens.me/confidence_tags/&quot;&gt;Confidence&lt;/a&gt;: Likely.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I conducted an experiment on myself to see if I would develop a tolerance to caffeine from taking it three days a week. The results suggest that I didn’t. Caffeine had just as big an effect at the end of my four-week trial as it did at the beginning.&lt;/p&gt;

&lt;p&gt;This outcome is statistically significant (p = 0.016), but the data show a weird pattern: caffeine’s effectiveness went &lt;em&gt;up&lt;/em&gt; over time instead of staying flat. I don’t know how to explain that, which makes me suspicious of the experiment’s findings.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;contents&quot;&gt;Contents&lt;/h2&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#contents&quot; id=&quot;markdown-toc-contents&quot;&gt;Contents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#experimental-procedure&quot; id=&quot;markdown-toc-experimental-procedure&quot;&gt;Experimental procedure&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#calibration-phase&quot; id=&quot;markdown-toc-calibration-phase&quot;&gt;Calibration phase&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#abstinence-phase&quot; id=&quot;markdown-toc-abstinence-phase&quot;&gt;Abstinence phase&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#experimental-phase&quot; id=&quot;markdown-toc-experimental-phase&quot;&gt;Experimental phase&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#what-explains-these-results&quot; id=&quot;markdown-toc-what-explains-these-results&quot;&gt;What explains these results?&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#an-offer-to-readers&quot; id=&quot;markdown-toc-an-offer-to-readers&quot;&gt;An offer to readers&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#notes&quot; id=&quot;markdown-toc-notes&quot;&gt;Notes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;experimental-procedure&quot;&gt;Experimental procedure&lt;/h2&gt;

&lt;p&gt;(I described this procedure in a &lt;a href=&quot;/2024/03/02/caffeine_tolerance/#appendix-b-pre-registration-for-a-caffeine-self-experiment&quot;&gt;pre-registration&lt;/a&gt; on a previous post.)&lt;/p&gt;

&lt;p&gt;I test my reaction time by taking the &lt;a href=&quot;https://humanbenchmark.com/tests/reactiontime&quot;&gt;humanbenchmark.com test&lt;/a&gt; twice in a row. One test consists of 5 reaction events, for a total of 10 reaction events; I take my reaction time at that moment to be the average of the 10. I take the test using the same computer, monitor, and mouse so that latency is consistent.&lt;/p&gt;

&lt;p&gt;I measure the effect of caffeine using a reaction time test because (a) caffeine is known to improve reaction time, (b) reaction time is easy to test,&lt;sup id=&quot;fnref:12&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; (c) it’s unlikely to improve with practice so it makes for a good consistent test variable, and (d) it’s hard to placebo-effect myself into improving my reaction time (which is important because I am not blinding myself&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;).&lt;/p&gt;

&lt;p&gt;I conduct three phases as specified below:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 1. Calibration phase.&lt;/strong&gt; Continue drinking coffee three days a week as I have been for the past several years: two cups of coffee (~24 ounces) with three scoops of grounds,&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; always the same brand,&lt;sup id=&quot;fnref:7&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:7&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt; drunk on Monday, Wednesday, and Friday morning.&lt;/p&gt;

&lt;p&gt;Take a reaction time test twice a day (following the schedule described in phase 3 below). Continue for four weeks. Plot a regression of my reaction time across the four weeks. The purpose of the calibration phase is to ensure that my reaction time does not improve from practicing every day—the regression line should be flat.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 2. Abstinence phase.&lt;/strong&gt; Abstain from caffeine for one week (9 days total, in between the last Friday of the calibration phase and the first Monday of the test phase). Test reaction time every day and measure the slope of reaction times across the 9 days. If I was habituated to caffeine in phase 1 then my reaction time should improve over the course of phase 2 as my tolerance wears off.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 3. Experimental phase.&lt;/strong&gt; Resume drinking coffee three days a week and continue for four weeks. Take a reaction time test twice a day, at (say) 8am and then 10am—the exact time doesn’t matter, but take the first test before having coffee and the second 30+ minutes after coffee. (Or on days when I don’t have coffee, take the first test after I wake up and the second test an hour or two later.)&lt;/p&gt;

&lt;h2 id=&quot;calibration-phase&quot;&gt;Calibration phase&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;I wrote the first draft of this section after completing the calibration phase, and I wrote the first draft of &lt;a href=&quot;#abstinence-phase&quot;&gt;Abstinence phase&lt;/a&gt; after completing the abstinence phase. So when I wrote them, I didn’t know the full results of the experiment yet.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I ran a four-week calibration phase to check some assumptions of the experiment:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Caffeine should improve reaction time.&lt;/li&gt;
  &lt;li&gt;If my post-caffeine test outperforms my pre-caffeine test, it should be because of the caffeine, not because my reaction time gets better later in the day.&lt;/li&gt;
  &lt;li&gt;Practicing shouldn’t improve my reaction time.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The calibration phase confirmed the first two assumptions:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The post-caffeine tests outperformed the pre-caffeine tests by an average of 13 ms (change in reaction time = –13 ms, p = 0.025).&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
  &lt;li&gt;On no-caffeine days, the second test did not outperform the first test (difference 0.4 ms, p = 0.9).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The third assumption was sort-of-confirmed: my reaction time did not improve over the course of the calibration phase. In fact, it got &lt;em&gt;worse&lt;/em&gt; at a rate of 0.87 ms/day (p = 0.014) for caffeine tests and 1.04 ms/day (p = 0.006) for no-caffeine tests.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-calibration-regression.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(Remember that higher reaction times are worse.)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Why did my reaction times get worse? It’s not because I was getting habituated to caffeine. I had already been taking caffeine 3 days a week for years, so I would have been fully habituated long before starting the calibration phase.&lt;/p&gt;

&lt;p&gt;Could it be because I started sleeping worse? That’s part of the reason. I regressed reaction time (on no-caffeine tests) against time spent in bed the previous night. Over the full experiment (not just the calibration phase&lt;sup id=&quot;fnref:11&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt;), each additional hour of sleep improved my reaction time by 4.9 ms&lt;sup id=&quot;fnref:10&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt; (p &amp;lt; 0.0002, r&lt;sup&gt;2&lt;/sup&gt; = 0.24). Controlling for time-in-bed flattened the slope of reaction time across non-caffeine tests from 1.04 ms/day to 0.77 ms/day. But that still leaves almost 3/4 of the slope unexplained.&lt;sup id=&quot;fnref:9&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-calibration-regression-controlled.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
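
&lt;p&gt;If you want to run this kind of control on your own data, the regression is a few lines of Python. Below is a minimal sketch using statsmodels; the file name and column names (&lt;code&gt;reaction_times.csv&lt;/code&gt;, &lt;code&gt;day&lt;/code&gt;, &lt;code&gt;reaction_ms&lt;/code&gt;, &lt;code&gt;hours_in_bed&lt;/code&gt;) are hypothetical stand-ins, not my actual data format.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: one row per no-caffeine test, with the day number
# and the previous night's time in bed.
df = pd.read_csv("reaction_times.csv")

# Regress reaction time on day alone, then on day plus time in bed.
# Comparing the two coefficients on day shows how much of the drift
# is explained by sleep (here, 1.04 ms/day fell to 0.77 ms/day).
raw = smf.ols("reaction_ms ~ day", data=df).fit()
controlled = smf.ols("reaction_ms ~ day + hours_in_bed", data=df).fit()
print(raw.params["day"], controlled.params["day"])
&lt;/code&gt;&lt;/pre&gt;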

&lt;p&gt;My best explanation: as the test became part of my routine, I subconsciously started taking it less seriously and started having a harder time staying focused. On most trials I get reaction times around 250–270 ms, but occasionally I lose focus and end up taking 330 ms or longer to react. As I recall, that didn’t happen at all during the first week or two of the calibration phase; it only started happening later.&lt;/p&gt;

&lt;p&gt;My reaction time can’t continue getting worse forever. But this does raise a concern about the results from the experimental phase: if my performance gets worse during the experimental phase, it might be because I’m getting habituated to caffeine, or it might be a continuation of the trend that happened during the calibration phase.&lt;/p&gt;

&lt;h2 id=&quot;abstinence-phase&quot;&gt;Abstinence phase&lt;/h2&gt;

&lt;p&gt;I abstained from caffeine for 9 days. If I had previously been habituated to caffeine, you’d expect my reaction time to improve over the course of the phase as my caffeine withdrawal subsides. Specifically, if caffeine improves reaction time by 13 ms, you’d expect my reaction time to get better by 13 ms over the course of the 9 days (= 1.44 ms/day). Instead, my reaction time got &lt;em&gt;worse&lt;/em&gt; at a rate of 0.77 ms/day. This is not significantly different from 0 (p = 0.4), but it &lt;em&gt;is&lt;/em&gt; significantly different from –1.44 ms/day (p = 0.04).&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-abstinence-regression.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This plot shows the likelihood function for caffeine retention as indicated by the slope of reaction time over the abstinence phase:&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-abstinence-likelihood.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The maximum-likelihood estimate is 1.53—that is, caffeine becomes 53% &lt;em&gt;more&lt;/em&gt; effective after my body adapts to it. If my reaction time got worse during abstinence, that implies caffeine tolerance was making my reaction time better. I’m pretty sure that’s wrong—my reaction time must have gotten worse for some other reason.&lt;/p&gt;

&lt;p&gt;Controlling for time-in-bed flattens the slope to nearly 0:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-abstinence-regression-controlled.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;experimental-phase&quot;&gt;Experimental phase&lt;/h2&gt;

&lt;p&gt;For the experimental phase, I resumed taking caffeine 3 days a week.&lt;/p&gt;

&lt;p&gt;Over the course of the four-week phase, I did not become habituated to caffeine. In fact, I became &lt;em&gt;sensitized&lt;/em&gt;—caffeine got more effective, not less. My post-caffeine reaction time changed at a rate of –0.39 ms/day (p = 0.016) (remember, a negative number means faster reaction time). My reaction time without caffeine also improved, though to a lesser extent (slope = –0.23 ms/day, p = 0.4); the difference in slopes was not statistically significant (p = 0.5). So either I did not develop a caffeine tolerance, or any caffeine tolerance I developed was outweighed by some force working in the opposite direction.&lt;/p&gt;

&lt;p&gt;In the plot below, “nocaf” indicates reaction times without having taken caffeine first, and “caf” gives reaction times tested approximately 30 minutes after taking caffeine.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-experimental-regression.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;According to these regression lines, reaction time improved by a total of 5.7 ms with caffeine and 9.8 ms without caffeine over the four weeks.&lt;/p&gt;

&lt;p&gt;You will notice a very low point on day 0. That happened because I accidentally reacted too early on one of the trials, but by pure coincidence I reacted at just the right moment to score a ~20ms reaction time.&lt;sup id=&quot;fnref:13&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:13&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt; If I run a regression starting one day later to exclude this anomaly, the slopes for caffeine tests and no-caffeine tests look comparable (caffeine slope = –0.47 ms/day, p = 0.014; no-caffeine slope = –0.54 ms/day, p = 0.027).&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-experimental-regression-skip-day-0.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This plot shows the likelihood function of caffeine retention according to the slope of performance on caffeine tests (excluding day 0):&lt;sup id=&quot;fnref:3:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-experimental-likelihood.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The likelihood function has a mean and a maximum of 1.86, which says caffeine becomes 86% more effective after my body adjusts to it. Only 0.6% of the likelihood mass lies below retention = 1 (i.e., retention = 1 has a p-value of 2 * 0.006 = 0.012). So the evidence strongly suggests that caffeine gets &lt;em&gt;more&lt;/em&gt; effective over time, not less.&lt;/p&gt;

&lt;p&gt;I don’t believe this result. It’s more likely that some confounding factor caused my reaction time to improve, but I can’t think of what that confounding factor might be.&lt;/p&gt;

&lt;h2 id=&quot;what-explains-these-results&quot;&gt;What explains these results?&lt;/h2&gt;

&lt;p&gt;Could my reaction time have improved because I was getting more sleep? If I control for time spent in bed the previous night, the slope of reaction times vs. days does flatten from –0.54 to –0.40, but this only explains about 1/4 of the slope.&lt;sup id=&quot;fnref:14&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-experimental-regression-controlled.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Maybe my performance improved due to the cumulative effect of sleeping well for many nights in a row? But I spent less time in bed during the experimental phase (average 8.73 hours) than during the calibration phase (8.99 hours), so if anything I should have gotten worse, not better.&lt;/p&gt;

&lt;p&gt;Could this be a genuine result? Could caffeine actually become more effective when I take it for longer? Three &lt;a href=&quot;/2024/03/02/caffeine_tolerance/#experimental-evidence-on-intermittent-dosing&quot;&gt;experiments on rats&lt;/a&gt;&lt;sup id=&quot;fnref:4&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt; found something similar: rats who took caffeine daily developed a tolerance, but rats who took caffeine on alternating days became sensitized (its effect got larger). (Plus one study&lt;sup id=&quot;fnref:5&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:5&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt; found neither tolerance nor sensitization.)&lt;/p&gt;

&lt;p&gt;This hints that caffeine sensitization is a real thing. But the results from the rat experiments don’t look the same as my results. They found that rats’ performance on caffeine days increased over the course of the experiments while performance on placebo days stayed flat. In contrast, my own performance improved both on caffeine days and on “placebo” days (I didn’t take a placebo, I just took nothing).&lt;/p&gt;

&lt;p&gt;But perhaps caffeine sensitization works differently in humans than in rats. If a habituated caffeine user experiences withdrawal symptoms, then maybe a sensitized user experiences “anti-withdrawal”, making them perform better even when they don’t take caffeine. Maybe my brain thinks, “I don’t know what’s going on, the caffeine levels inside me keep fluctuating, I’d better delete some neurotransmitter receptors just to be safe,” and this ends up making me more alert with or without caffeine. But why didn’t it happen that way in the rat studies?&lt;/p&gt;

&lt;p&gt;Earlier, when I talked about the calibration phase, I hypothesized that my performance got worse because I subconsciously stopped taking the tests as seriously. Could the opposite have happened in the experimental phase?&lt;/p&gt;

&lt;p&gt;I don’t think so. I noticed my performance getting worse when I looked at the results just after finishing the calibration phase. So I might have mentally resolved to focus harder. But if so, you’d expect my performance to jump up and stay persistently high, or perhaps to jump up and then decline again, but not to start low and then steadily improve.&lt;/p&gt;

&lt;p&gt;I thought the results might have something to do with my computer’s latency, but my experiment already controlled most of the parameters that might change the latency (I always tested on the same computer with the same hardware in a browser with a single tab open and with no other applications open except Emacs and Terminal). It occurred to me that perhaps whether my second monitor was on or off might affect the latency, but I tested this and saw no difference.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;The results of my experiment suggest that I did not become habituated to caffeine. I can’t figure out what they &lt;em&gt;do&lt;/em&gt; suggest, but at least I can say that I probably don’t develop a tolerance from taking caffeine 3 days a week.&lt;/p&gt;

&lt;h2 id=&quot;an-offer-to-readers&quot;&gt;An offer to readers&lt;/h2&gt;

&lt;p&gt;If you conduct a caffeine experiment on yourself with similar methodology to mine, you can send your data to web at mdickens dot me and I’ll analyze it and make some graphs for you. At a minimum, each data point should include the date, your reaction time, and whether you had caffeine.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Source code and data for this experiment are available &lt;a href=&quot;https://github.com/michaeldickens/public-scripts/tree/master/caffeine&quot;&gt;on GitHub&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:12&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;And, I didn’t know this before running the experiment, but it turns out that it’s easy to get statistically significant results with reaction time. My reaction time had a day-to-day standard deviation of only 11 ms, so I can detect pretty small effect sizes with just a few days of samples. &lt;a href=&quot;#fnref:12&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;It’s possible to conduct a self-blinded caffeine experiment as follows:&lt;/p&gt;

      &lt;ol&gt;
        &lt;li&gt;Label caffeine pills and placebo pills as pill A and pill B in a random order.&lt;/li&gt;
        &lt;li&gt;On Monday/Wednesday/Friday, take pill A. On Tuesday/Thursday/Saturday, take pill B. (Skip Sunday.)&lt;/li&gt;
        &lt;li&gt;Each week, re-randomize the ordering of pill A and pill B so you can’t figure out which one is which.&lt;/li&gt;
      &lt;/ol&gt;
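
      &lt;p&gt;A minimal sketch of the weekly re-randomization step, assuming someone (or a file you promise not to open) holds the answer key until the experiment is over:&lt;/p&gt;

      &lt;pre&gt;&lt;code&gt;import random

# Each week, randomly decide which label gets the caffeine pills.
labels = ["A", "B"]
random.shuffle(labels)
caffeine_label, placebo_label = labels

# Record the assignment somewhere you will not see it until the end,
# e.g. a file a friend keeps for you.
with open("answer_key.txt", "a") as f:
    f.write("caffeine=" + caffeine_label + " placebo=" + placebo_label + "\n")
&lt;/code&gt;&lt;/pre&gt;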

      &lt;p&gt;I didn’t want to do that for two reasons:&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;I already suspect that caffeine makes me feel much better while lifting weights, so I don’t want to spend potentially several weeks lifting weights without caffeine.&lt;/li&gt;
        &lt;li&gt;I prefer to drink coffee rather than take pills, and you can’t really blind coffee because decaf tastes different.&lt;/li&gt;
      &lt;/ul&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;A standard serving is two scoops per six ounces, which would require me to use eight scoops, but I don’t like my coffee that strong. If you take higher doses of caffeine, you’ll probably get habituated faster. (I have no evidence that that’s true, but it sounds right to me.) &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:7&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Signature Select Classic Roast, because it’s the cheapest and it tastes as good as the best brands I’ve tried. &lt;a href=&quot;#fnref:7&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;For this calculation I compared post-caffeine and pre-caffeine tests on the same day, ignoring the test results for days where I didn’t take caffeine. If instead I compare post-caffeine tests vs. all no-caffeine tests (including on days when I don’t take caffeine), the difference between averages is 8 ms. However, the difference in performance without caffeine on caffeine days vs. no-caffeine days is not statistically significant (difference = 3 ms, p = 0.5). I performed slightly worse on caffeine days, which is the opposite of what I’d predict—subjectively, I feel more energetic on caffeine days even when I haven’t taken caffeine yet. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:11&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;A regression over just the calibration phase gives a slope of –5.50 ms/hour (p = 0.01337 (nice)). &lt;a href=&quot;#fnref:11&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:10&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;That means a dose of coffee is worth 2 hours of sleep in terms of its immediate effect on reaction time. &lt;a href=&quot;#fnref:10&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:9&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I tried regressing reaction time against time-in-bed the previous two nights, but the second night did not add any predictive power.&lt;/p&gt;

      &lt;p&gt;I also looked at time spent asleep according to my &lt;a href=&quot;https://www.sleepcycle.com/&quot;&gt;sleep tracking app&lt;/a&gt;, which I suspected wouldn’t work as well because I’ve noticed it’s pretty bad at identifying when I’m asleep. And indeed, regressing reaction time against time “asleep” gave a similar slope as regressing against time-in-bed, but with a worse p-value. &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I converted slope into retention as follows:&lt;/p&gt;

      &lt;ol&gt;
        &lt;li&gt;multiply slope by the number of days in the phase to get the total reaction time change&lt;/li&gt;
        &lt;li&gt;divide by the baseline benefit of caffeine (13 ms) to get the degree of habituation (0 = no habituation, 1 = full habituation, -1 = reverse habituation i.e. caffeine got more effective)&lt;/li&gt;
        &lt;li&gt;subtract from 1 to get retention (retention is basically the inverse of habituation)&lt;/li&gt;
      &lt;/ol&gt;
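
      &lt;p&gt;In code, the conversion looks roughly like this (a simplified sketch, not the exact code from the repo linked above; in particular, the sign handling for the abstinence phase is my own gloss, since there an &lt;em&gt;improvement&lt;/em&gt; is what indicates prior habituation):&lt;/p&gt;

      &lt;pre&gt;&lt;code&gt;BASELINE_BENEFIT_MS = 13.0  # caffeine's measured benefit to reaction time

def retention_from_slope(slope_ms_per_day, num_days, abstinence=False):
    total_change_ms = slope_ms_per_day * num_days        # step 1
    if abstinence:
        # An improvement (negative change) during abstinence indicates
        # prior habituation, so flip the sign.
        total_change_ms = -total_change_ms
    habituation = total_change_ms / BASELINE_BENEFIT_MS  # step 2
    return 1 - habituation                               # step 3

print(retention_from_slope(0.77, 9, abstinence=True))  # 1.53 (abstinence phase)
print(retention_from_slope(-0.47, 28))  # ~2.0 (the likelihood-based estimate is 1.86)
&lt;/code&gt;&lt;/pre&gt;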

      &lt;p&gt;Unlike in my caffeine &lt;a href=&quot;/2024/03/29/does_caffeine_stop_working/&quot;&gt;literature review&lt;/a&gt;, I treated the baseline benefit as a fixed parameter instead of a distribution because that makes the math easier (but the real reason is that I wrote this part of the code before I wrote the code for the literature review). &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:3:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:13&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Perhaps I should have re-run the trial, but I was following a strict rule not to re-run trials under any circumstances, to make sure I had no wiggle room to bias the results. &lt;a href=&quot;#fnref:13&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:14&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The displayed graph shows no-caffeine trials. Controlling for sleep on caffeine trials has a similar effect, flattening the slope from –0.47 to –0.41. &lt;a href=&quot;#fnref:14&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;C. J. Meliska, R. E. Landrum &amp;amp; T. A. Landrum (1990). &lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/2320659/&quot;&gt;Tolerance and sensitization to chronic and subchronic oral caffeine: effects on wheelrunning in rats.&lt;/a&gt;&lt;/p&gt;

      &lt;p&gt;Omar Cauli, Annalisa Pinna, Valentina Valentini &amp;amp; Micaela Morelli (2003). &lt;a href=&quot;https://www.nature.com/articles/1300240&quot;&gt;Subchronic Caffeine Exposure Induces Sensitization to Caffeine and Cross-Sensitization to Amphetamine Ipsilateral Turning Behavior Independent from Dopamine Release.&lt;/a&gt;&lt;/p&gt;

      &lt;p&gt;N. Simola, E. Tronci, A. Pinna &amp;amp; M. Morelli (2006). &lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/16874467/&quot;&gt;Subchronic-intermittent caffeine amplifies the motor effects of amphetamine in rats.&lt;/a&gt; &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:5&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Omar Cauli &amp;amp; Micaela Morelli (2002). &lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/12122482/&quot;&gt;Subchronic caffeine administration sensitizes rats to the motor-activating effects of dopamine D(1) and D(2) receptor agonists.&lt;/a&gt; &lt;a href=&quot;#fnref:5&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>How Well Did Scott Alexander's List of Social Science Findings Hold Up?</title>
				<pubDate>Mon, 08 Apr 2024 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2024/04/08/did_social_science_findings_hold_up/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/04/08/did_social_science_findings_hold_up/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;In 2012, Scott Alexander &lt;a href=&quot;https://web.archive.org/web/20131230022325/http://squid314.livejournal.com/322213.html&quot;&gt;defended social sciences&lt;/a&gt; against the claim that they can’t figure anything out. He gave a long list of well-established findings across a variety of social science disciplines.&lt;/p&gt;

&lt;p&gt;12 years later, how well did that list hold up?&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;I evaluated the list off the top of my head without doing any research,&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; so please don’t take this too seriously.&lt;/p&gt;

&lt;p&gt;The text before the colon on each numbered item is Scott’s words; everything else is my words.&lt;/p&gt;

&lt;h3 id=&quot;anthropology&quot;&gt;Anthropology&lt;/h3&gt;

&lt;ol&gt;
  &lt;li&gt;Humankind evolved in Africa, gradually settled the Old World, and crossed the Bering land bridge to America around 20,000 years ago: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Languages form large families like Indo-European that can be used to trace the history and development of different peoples: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;People have an almost-miraculous language instinct that can for example turn a pidgin into a creole in the second generation: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;There are various human universals, but people tend to overestimate how universal their own culture’s norms are: &lt;strong&gt;I don’t know&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;I don’t know how to interpret this claim. It’s well-established that there are various human universals and also that cultural norms vary a lot, but I have no idea whether it’s true that people tend to overestimate how universal their own culture’s norms are.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Any biological mental differences between groups are less important than previously believed and quickly overwhelmed by within-group differences: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Verdict: 4/5 or 5/5. Anthropology is doing well.&lt;/p&gt;

&lt;h3 id=&quot;economics&quot;&gt;Economics&lt;/h3&gt;

&lt;ol&gt;
  &lt;li&gt;Prices in the marketplace are determined by supply and demand: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Capitalism leads to faster economic growth than the alternatives: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Unless you have very strange priorities, free trade is a good idea: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;The gold standard is a bad idea: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Rent control decreases the quality and quantity of available housing: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Minimum wages increase unemployment: &lt;strong&gt;still good&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;There are some observational studies contradicting this and there’s debate as to whether minimum wage is currently high enough that raising it would cause noticeable unemployment but the basic principle of this claim is still true. (I found Bryan Caplan’s &lt;a href=&quot;https://www.econlib.org/archives/2013/03/the_vice_of_sel.html&quot;&gt;The Myopic Empiricism of the Minimum Wage&lt;/a&gt; persuasive.&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;) And anyway, the counter-arguments people would raise today are the same ones they would’ve raised in 2012.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Cutting taxes will not increase government revenue in normal conditions: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Verdict: 7/7. Economics is doing well.&lt;/p&gt;

&lt;h3 id=&quot;psychology&quot;&gt;Psychology&lt;/h3&gt;

&lt;ol&gt;
  &lt;li&gt;Personality is about 50% biologically determined: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;People’s self-image is constructed on the spot and varies widely depending on the situation: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;People exhibit various cognitive biases that deviate systematically from rational thought: &lt;strong&gt;partially replicated&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;Some cognitive biases have replicated (base rate neglect, status quo bias, loss aversion, hindsight bias), others haven’t (priming, nudge theory, implicit bias). From scanning through Wikipedia’s &lt;a href=&quot;https://en.wikipedia.org/wiki/List_of_cognitive_biases&quot;&gt;list of cognitive biases&lt;/a&gt;, it looks like 70–90% have replicated. I will give this one half credit for a partial replication.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;IQ correlates to all kinds of important life outcomes: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Many mental disorders correspond to disordered brain chemistry and can be partly treated chemically: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;People lack privileged access to their own mental processes: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Cognitive behavioral therapy does better than placebo in treating mental disorders; Freudian therapy does not: &lt;strong&gt;still good (?)&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;IIRC some RCTs have found good benefits to the modern version of Freudian therapy but I don’t think it’s well-established.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Babies are not a blank slate but have various built-in behavioral patterns; they develop new mental abilities in an orderly fashion: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Animals react to reward and punishment in extremely predictable, almost mathematical ways: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Strong relationships and driving purpose are very important to happiness; material goods less so after a certain point: &lt;strong&gt;unclear&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;As I understand it, the happiness research does consistently support this claim, but it’s not clear that we are doing a good job of measuring happiness.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Verdict: 8.5/10 or 9.5/10. Psychology is doing fairly well. I found this one surprising considering how much stuff in psychology has failed to replicate, but Scott did a good job of identifying the claims that would hold up (a much better job than my college psychology textbook did).&lt;/p&gt;

&lt;h3 id=&quot;sociology&quot;&gt;Sociology&lt;/h3&gt;

&lt;ol&gt;
  &lt;li&gt;People are racist as heck: &lt;strong&gt;failed to replicate&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;I believe Scott is talking about Implicit Association Tests, which do consistently show implicit associations but don’t reliably predict behavior. Some other things like resume name bias failed to replicate. There’s some version of this hypothesis (the “all bad racial outcomes are caused by racism” hypothesis) that’s basically unfalsifiable and therefore hasn’t been disproven, but that’s not a point in its favor.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;No, really, they’re really racist: &lt;strong&gt;failed to replicate&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Even the ones who don’t think they are: &lt;strong&gt;failed to replicate&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Even the ones who swear up and down that they’re not racist and donate to the NAACP: &lt;strong&gt;failed to replicate&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;There are major disparities in the income levels and social status of various ethnic groups: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Discrimination explains a lot of this, sometimes in surprising ways: &lt;strong&gt;was never established in the first place&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;I don’t entirely understand what Scott meant by this but as far as I know, it was never established and the evidence almost entirely contradicts it.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Most social problems are closely correlated with one another. Scandinavia has the fewest social problems of developed countries, and the US has the most: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Poor people and uneducated people tend to suffer more social problems and commit more crimes: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Religious people tend to be happier and better-adjusted than others: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Social class is a big deal, even in societies that make a big pretense of being classless: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Verdict: 5/10. Sociology, are you ok? (To be fair, this list is a bit skewed because Scott spent four bullet points reiterating the same claim that failed to replicate.)&lt;/p&gt;

&lt;h3 id=&quot;epidemiology&quot;&gt;Epidemiology&lt;/h3&gt;

&lt;ol&gt;
  &lt;li&gt;Smoking causes cancer and many other problems: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Alcohol causes liver disease and many other problems: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Bad diets (in some sense of the word) cause heart disease, Type II diabetes, and many other problems: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Two zillion other correlations between risk factors and diseases: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Exercise is really good for you: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Vaccines are extremely effective at controlling infectious diseases: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;And they don’t cause autism: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Stress causes or exacerbates many diseases: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Poor people suffer from more diseases, even in ways that are not directly linked to them not being able to afford medical care: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Many diseases seem to be part genetic and part “other factors”, including mental diseases: &lt;strong&gt;still good&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Lifestyle changes can decrease your chance of getting mental diseases: &lt;strong&gt;still good(ish)&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;As I understand, this is broadly supported but there’s some contradictory evidence on e.g. whether exercise helps with depression.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Multivitamins don’t work: &lt;strong&gt;mixed evidence&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;There are some RCTs on both sides (for a quick summary, see &lt;a href=&quot;https://www.redpenreviews.org/reviews/eat-drink-and-be-healthy/&quot;&gt;here&lt;/a&gt; under the heading “Most unusual claim”). Multivitamins are cheap so IMO they pass a &lt;a href=&quot;https://slatestarcodex.com/2020/04/14/a-failure-but-not-of-prediction/&quot;&gt;cost-benefit analysis&lt;/a&gt;. I am giving zero credit on this one because, while it might be true, it’s not &lt;em&gt;definitely&lt;/em&gt; true, and this was presented as a list of definitely-true findings.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Low-dose aspirin (probably) helps prevent cancer: &lt;strong&gt;still good&lt;/strong&gt;
    &lt;ul&gt;
      &lt;li&gt;The evidence on this is weak but Scott did say “probably” so I’ll give this one full credit.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Verdict: 11/13 or 12/13. Epidemiology is doing pretty well.&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Except for a few of the claims in epidemiology which I didn’t know anything about, so I did about 15 seconds of research. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Prior to reading Caplan’s article, my position on minimum wage was basically, “Demand curves are almost always downward sloping and the empirical evidence on minimum wage isn’t good enough to overcome this strong prior.” Which is the first argument Caplan makes, but he also makes some other good arguments that I hadn’t thought of. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Explicit Bayesian Reasoning: Don't Give Up So Easily</title>
				<pubDate>Wed, 03 Apr 2024 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2024/04/03/explicit_bayesian_reasoning_dont_give_up/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/04/03/explicit_bayesian_reasoning_dont_give_up/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;Recently, Saar Wilf, creator of &lt;a href=&quot;https://www.rootclaim.com/&quot;&gt;Rootclaim&lt;/a&gt;, had a high-profile debate against Peter Miller on whether COVID originated from a lab. Peter won and Saar lost.&lt;/p&gt;

&lt;p&gt;Rootclaim’s mission is to “overcome the flaws of human reasoning with our probabilistic inference methodology.” Rootclaim assigns odds to each piece of evidence and performs Bayesian updates to get a posterior probability. When Saar lost the lab leak debate, some people considered this a defeat not just for the lab leak hypothesis, but for Rootclaim’s whole approach.&lt;/p&gt;

&lt;p&gt;In Scott Alexander’s coverage of the debate, he &lt;a href=&quot;https://www.astralcodexten.com/p/practically-a-book-review-rootclaim&quot;&gt;wrote&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;While everyone else tries “pop Bayesianism” and “Bayes-inspired toolboxes”, Rootclaim asks: what if you just directly apply Bayes to the world’s hardest problems? There’s something pure about that, in a way nobody else is trying.&lt;/p&gt;

  &lt;p&gt;Unfortunately, the reason nobody else is trying this is because it doesn’t work. There’s too much evidence, and it’s too hard to figure out how to quantify it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Don’t give up so easily! We as a society have spent approximately 0% of our collective decision-making resources on explicit Bayesian reasoning. Just because Rootclaim used Bayesian methods and then lost a debate doesn’t mean those methods will never work. That would be like saying, “randomized controlled trials were a great idea, but they &lt;a href=&quot;https://slatestarcodex.com/2014/04/28/the-control-group-is-out-of-control/&quot;&gt;keep finding that ESP exists&lt;/a&gt;. Oh well, I guess we should give up on RCTs and just form beliefs using common sense.”&lt;/p&gt;

&lt;p&gt;(And it’s not even like the problems with RCTs were easy to fix. &lt;a href=&quot;https://slatestarcodex.com/2014/04/28/the-control-group-is-out-of-control/&quot;&gt;Scott wrote&lt;/a&gt; about 10 known problems with RCTs and 10 ways to fix them, and then wrote about an RCT that fixed all 10&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; of those problems and &lt;em&gt;still&lt;/em&gt; found that ESP exists. If we’re going to give RCTs more than 10 tries, we should extend the same courtesy to Bayesian reasoning.)&lt;/p&gt;

&lt;p&gt;I’m optimistic that we can make explicit Bayesian analysis work better. And I can already think of ways to improve on two problems with it.&lt;/p&gt;

&lt;!-- more --&gt;

&lt;p&gt;&lt;strong&gt;First problem:&lt;/strong&gt; If you multiply a long list of probabilities as if they’re independent when they’re not, you get an extreme result.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quick fix:&lt;/strong&gt; Reduce the magnitudes of the odds updates based on how much evidence you already have. The more individual factors you have, the more a new factor can be explained in terms of existing factors.&lt;/p&gt;

&lt;p&gt;For example, you could scale down the log-odds of your second observation by 1/2, your third observation by 1/3, your fourth observation by 1/4, etc. This roughly captures the intuitions that&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;if you have a lot of evidence already, a new observation is probably mostly predicted by the existing evidence&lt;/li&gt;
  &lt;li&gt;if you have infinitely many pieces of evidence, that should give you an infinitely large odds update&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This approach means you don’t need to spend any time thinking about how correlated your inputs are.&lt;/p&gt;

&lt;p&gt;If you have lines of evidence A, B, C, etc., the formula for joint log-odds becomes&lt;/p&gt;

&lt;p&gt;\begin{align}
\log(A) + \frac{1}{2} \log(B) + \frac{1}{3} \log(C) + …
\end{align}&lt;/p&gt;

&lt;p&gt;And therefore your joint odds would be&lt;/p&gt;

&lt;p&gt;\begin{align}
A \cdot B^{1/2} \cdot C^{1/3} \cdot …
\end{align}&lt;/p&gt;

&lt;p&gt;I don’t have a rigorous justification for this formula&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; and it has some obvious problems (for example, if you change the order of your inputs, the answer changes). But it has some advantages over treating every piece of evidence as independent.&lt;/p&gt;
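
&lt;p&gt;To make the rule concrete, here’s a minimal sketch (the function name and the numbers are just illustrative):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import math

def discounted_joint_odds(odds_ratios):
    # Scale the i-th observation's log-odds by 1/i (1-indexed).
    # Note that the result depends on the order of the inputs.
    log_odds = sum(math.log(o) / i for i, o in enumerate(odds_ratios, start=1))
    return math.exp(log_odds)

# Three pieces of evidence, each at 10:1 odds. Treated as independent
# they would multiply to 1000:1; discounted, they combine to
# 10 * 10**(1/2) * 10**(1/3), or about 68:1.
print(discounted_joint_odds([10, 10, 10]))  # ~68.1
&lt;/code&gt;&lt;/pre&gt;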

&lt;p&gt;As a proof of concept, I &lt;a href=&quot;https://docs.google.com/spreadsheets/d/1ASFyXUF6u6move_QMFW6yd-3O1mA9TiS/&quot;&gt;created a modified version&lt;/a&gt; of Scott Alexander’s &lt;a href=&quot;https://docs.google.com/spreadsheets/d/1Tm9ajpJudn-gjsZ1QsMoeYzY94JWEzzS/&quot;&gt;lab leak debate calculator&lt;/a&gt; that updates less on correlated evidence. My version assumes two lines of evidence are correlated if (1) they’re under the same heading and (2) they point in the same direction. This change reduces the standard deviation of people’s answers from 7.4 orders of magnitude to 4.4. (Or, if you exclude Peter Miller’s extremely-overconfident answer, it reduces the standard deviation from 2.1 OOM to 1.8.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Second problem:&lt;/strong&gt; Overconfident probabilities like “1 in 10,000 chance that COVID would first appear in a wet market conditional on lab leak”.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quick fix:&lt;/strong&gt; Give every piece of evidence a “reliability score”. Maybe the evidence looks like it suggests 10,000:1 odds but you haven’t thought about it that hard. You read the number in some population survey but maybe the survey mis-calculated, maybe it used bad data collection methods, maybe you misread the number of zeros and it actually said 1 in 1000.&lt;/p&gt;

&lt;p&gt;As a simple approach, you could give every piece of evidence a reliability score from 1 (low reliability) to 4 (high reliability). Discount evidence by raising it to the power of &lt;code&gt;1 / (5 - reliability_score)&lt;/code&gt;. So 10,000:1 evidence with a reliability score of 3 gets reduced to 10,000&lt;sup&gt;1/2&lt;/sup&gt; = 100:1, and evidence with a score of 1 gets reduced to 10,000&lt;sup&gt;1/4&lt;/sup&gt; = 10:1.&lt;/p&gt;
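
&lt;p&gt;In code, the discount is a one-liner (a sketch; the function name is mine):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;def discount_by_reliability(odds, score):
    # score runs from 1 (low reliability) to 4 (high reliability).
    return odds ** (1 / (5 - score))

print(discount_by_reliability(10_000, 4))  # 10000.0 (full weight)
print(discount_by_reliability(10_000, 3))  # 100.0
print(discount_by_reliability(10_000, 1))  # 10.0
&lt;/code&gt;&lt;/pre&gt;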

&lt;p&gt;Is that the best way to handle the problem of overconfident odds updates? Probably not. But it’s really easy and it took me three seconds to come up with.&lt;/p&gt;

&lt;p&gt;(If you think carefully enough about your odds, you don’t need a reliability score. But the score is a convenient way to encode a concept like “I did some calculations and got 10,000:1 odds but I haven’t carefully checked the calculations.”)&lt;/p&gt;

&lt;p&gt;Quoting Scott again,&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;In the end, I think Saar has two options:&lt;/p&gt;

  &lt;ol&gt;
    &lt;li&gt;
      &lt;p&gt;Abandon the Rootclaim methodology, and go back to normal boring impure reasoning like the rest of us, where you vaguely gesture at Bayesian math but certainly don’t try anything as extreme as actually using it.&lt;/p&gt;
    &lt;/li&gt;
    &lt;li&gt;
      &lt;p&gt;Claim that he, Saar, through his years of experience testing Rootclaim, has some kind of special metis at using it, and everyone else is screwing up.&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/blockquote&gt;

&lt;p&gt;(I get the sense Scott is joking, but I’ve heard other people say things like this.)&lt;/p&gt;

&lt;p&gt;I propose a third option: Examine the flaws in explicit Bayesian reasoning and look for ways to fix them.&lt;/p&gt;

&lt;p&gt;Or a fourth option: Do explicit Bayesian reasoning, don’t take the result literally but implicitly update your beliefs based on the result.&lt;/p&gt;

&lt;p&gt;Or a fifth option: Figure out how to fix RCTs, and then do something similar for Bayesian reasoning. (Did we figure out how to fix RCTs yet?)&lt;/p&gt;

&lt;p&gt;Or a sixth option: Keep doing Bayesian reasoning, and meanwhile keep trying to fix its flaws (like we are sort-of doing for RCTs).&lt;/p&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Actually only 8 out of 10 but the basic point still stands. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;You could slightly-more-rigorously justify this formula by saying&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;The variance in evidence B is 50% explained by evidence A.&lt;/li&gt;
        &lt;li&gt;The variance in evidence C is 50% explained by A and 50% explained by B.&lt;/li&gt;
        &lt;li&gt;But the parts of C that are explained by A and B heavily overlap, so less than 75% of C is explained by A plus B.&lt;/li&gt;
      &lt;/ul&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
			<item>
				<title>Does Caffeine Stop Working?</title>
				<pubDate>Fri, 29 Mar 2024 00:00:00 -0700</pubDate>
				<link>http://mdickens.me/2024/03/29/does_caffeine_stop_working/</link>
				<guid isPermaLink="true">http://mdickens.me/2024/03/29/does_caffeine_stop_working/</guid>
                <description>
                  
                  
                  
                  &lt;p&gt;&lt;em&gt;Last updated 2024-09-02.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;a href=&quot;https://mdickens.me/confidence_tags/&quot;&gt;Confidence&lt;/a&gt;: Likely.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;If you take caffeine every day, does it stop working? If it keeps working, how much of its effect does it retain?&lt;/p&gt;

&lt;p&gt;There are many studies on this question, but most of them have severe methodological limitations. I read all the good studies (on humans) I could find. Here’s my interpretation of the literature:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Caffeine almost certainly loses some but not all of its effect when you take it every day.&lt;/li&gt;
  &lt;li&gt;In expectation, caffeine retains 1/2 of its benefit, but this figure has a wide credence interval.&lt;/li&gt;
  &lt;li&gt;The studies on cognitive benefits all have some methodological issues so they might not generalize.&lt;/li&gt;
  &lt;li&gt;There are two studies on exercise benefits with strong methodology, but they have small sample sizes.&lt;/li&gt;
&lt;/ul&gt;

&lt;!-- more --&gt;

&lt;h2 id=&quot;clarifying-terminology&quot;&gt;Clarifying terminology&lt;/h2&gt;

&lt;p&gt;The scientific literature talks about the “caffeine withdrawal hypothesis.” People use this term to describe two very different hypotheses:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Caffeine has no benefits for &lt;em&gt;anyone&lt;/em&gt;. It reverses withdrawal symptoms for habituated users, but it doesn’t do anything for non-users. (Call this the “caffeine-is-useless hypothesis.”)&lt;/li&gt;
  &lt;li&gt;Caffeine initially has benefits for non-users, but if you use caffeine habitually, your body adjusts to the point where you need caffeine just to get back up to baseline. (Call this the “caffeine habituation hypothesis.”)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;According to the caffeine-is-useless hypothesis, on the first day you take caffeine, you experience no benefits, and then you start feeling withdrawal symptoms after you’ve been taking caffeine for a few days:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-useless-hypothesis.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;According to the caffeine habituation hypothesis, caffeine has initial benefits, but you start developing a tolerance:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-habituation-hypothesis.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;(Graphs inspired by Gavin Leech’s &lt;a href=&quot;https://www.gleech.org/stims&quot;&gt;article&lt;/a&gt; on caffeine, except I put way less effort into mine.)&lt;/p&gt;

&lt;p&gt;Most studies on caffeine withdrawal only look at the caffeine-is-useless hypothesis. The study results pretty much universally reject this hypothesis so I consider it falsified&lt;sup id=&quot;fnref:16&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:16&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. I’m much more interested in the caffeine habituation hypothesis, so that’s the one I will be discussing.&lt;/p&gt;

&lt;p&gt;Research papers almost never explicitly discuss the caffeine habituation hypothesis, but some of them provide enough data to test it. In the next two sections, I will review some studies and figure out what their data tell us about the caffeine habituation hypothesis.&lt;/p&gt;

&lt;h2 id=&quot;cognition-studies&quot;&gt;Cognition studies&lt;/h2&gt;

&lt;p&gt;I found two good studies on the exercise benefits of caffeine, and four good(ish) studies on the cognitive benefits. Let’s start with the cognition studies: &lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/23108937/&quot;&gt;Rogers et al. (2013)&lt;/a&gt;&lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;, &lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/17514640/&quot;&gt;Hewlett &amp;amp; Smith (2007)&lt;/a&gt;&lt;sup id=&quot;fnref:6&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;, &lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/15678363/&quot;&gt;Haskell et al. (2005)&lt;/a&gt;&lt;sup id=&quot;fnref:9&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;, and &lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/16910172/&quot;&gt;Smith et al. (2006)&lt;/a&gt;&lt;sup id=&quot;fnref:10&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;These studies all used similar methodology: they divided participants into high caffeine users vs. low/non-users. Then they randomly administered either caffeine or placebo and tested participants’ performance on various cognitive tests.&lt;/p&gt;

&lt;p&gt;The studies each have four groups of participants:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;LoCaf: low- or non-caffeine users assigned to take caffeine&lt;/li&gt;
  &lt;li&gt;LoPla: low- or non-caffeine users assigned to take placebo&lt;/li&gt;
  &lt;li&gt;HiCaf: high-caffeine users assigned to take caffeine&lt;/li&gt;
  &lt;li&gt;HiPla: high-caffeine users assigned to take placebo&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To test the caffeine habituation hypothesis, compare the performance of high-caffeine users after taking caffeine (HiCaf) versus low-caffeine users after taking placebo (LoPla). If high users develop complete tolerance then these two groups should perform the same: when a habitual user takes caffeine, it brings their performance back up to baseline, but has no benefits beyond that.&lt;/p&gt;

&lt;p&gt;(This methodology isn’t perfect because low and high caffeine users might differ in ways that could bias the results.&lt;sup id=&quot;fnref:13&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:13&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;6&lt;/a&gt;&lt;/sup&gt; (For more discussion of this possibility, see &lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/23108937/&quot;&gt;Rogers et al. (2013)&lt;/a&gt;&lt;sup id=&quot;fnref:2:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;.) It would be better to randomize participants to take either caffeine or placebo for several weeks.)&lt;/p&gt;

&lt;p&gt;Then calculate:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;baseline benefit&lt;/strong&gt; = LoCaf – LoPla = benefit to a naive user&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;habituated benefit&lt;/strong&gt; = HiCaf – LoPla = benefit to a habituated user&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;retention&lt;/strong&gt; = habituated benefit / baseline benefit = (HiCaf – LoPla) / (LoCaf – LoPla) = what proportion of caffeine’s benefits are retained by a habituated user&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Retention ranges from 0 (caffeine loses all its effect) to 1 (caffeine retains all its effect).&lt;/p&gt;
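
&lt;p&gt;As a minimal sketch, with made-up group means (higher score = better performance):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;def retention(lo_caf, lo_pla, hi_caf):
    baseline_benefit = lo_caf - lo_pla    # benefit to a naive user
    habituated_benefit = hi_caf - lo_pla  # benefit to a habituated user
    return habituated_benefit / baseline_benefit

# Hypothetical mean scores on some cognitive test:
print(retention(lo_caf=105.0, lo_pla=100.0, hi_caf=102.5))  # 0.5
&lt;/code&gt;&lt;/pre&gt;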

&lt;p&gt;I evaluated the effectiveness of caffeine by computing approximate &lt;a href=&quot;https://en.wikipedia.org/wiki/Likelihood_function&quot;&gt;likelihood functions&lt;/a&gt; of retention from each study’s measurements.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(Disclaimer: I only just now learned how likelihood functions work so that I could write this article. Be wary of mistakes.)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The likelihood function L(X) answers the question, “If caffeine’s true retention is X, what is the probability that we would measure the retention that we did in fact measure?” If a particular retention has a high likelihood of generating the results we got, that makes it more likely that that’s the true retention.&lt;/p&gt;

&lt;p&gt;(For a longer explanation of likelihood functions, see &lt;a href=&quot;https://arbital.greaterwrong.com/p/likelihood_not_pvalue_faq&quot;&gt;Report Likelihoods, Not P-Values&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;Plotting likelihood functions of the four main metrics&lt;sup id=&quot;fnref:18&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:18&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;7&lt;/a&gt;&lt;/sup&gt; from Rogers et al. (2013) (which had a larger sample size than the other three studies combined) along with the average likelihood&lt;sup id=&quot;fnref:17&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:17&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;8&lt;/a&gt;&lt;/sup&gt; of those metrics:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-likelihood-Rogers.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;(The dots on each curve show the mean likelihoods. You can interpret the mean as the value the evidence tends to point toward.&lt;sup id=&quot;fnref:15&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:15&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;9&lt;/a&gt;&lt;/sup&gt;)&lt;/p&gt;

&lt;p&gt;This graph says the experiment points toward caffeine retaining around half its effect (0.56 to be precise). And it says we’d be somewhat unlikely to see these experimental results if retention = 0, but not unlikely to see them if retention = 1.&lt;/p&gt;

&lt;p&gt;Here’s how I computed these approximate likelihood functions:&lt;/p&gt;

&lt;p&gt;Model the baseline benefit (LoCaf - LoPla) and habituated benefit (HiCaf - LoPla) as t-distributions.&lt;sup id=&quot;fnref:14&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:14&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;10&lt;/a&gt;&lt;/sup&gt; To compute the likelihood function for retention (habituated benefit / baseline benefit), we need to know the shape of a ratio of t-distributions.&lt;/p&gt;

&lt;p&gt;The distribution of a ratio of t-distributions has no closed form. So I approximated it using a formula I pulled off &lt;a href=&quot;https://en.wikipedia.org/wiki/Ratio_distribution#Uncorrelated_noncentral_normal_ratio&quot;&gt;Wikipedia&lt;/a&gt; for the ratio of two independent normal distributions (astute readers will notice that the distributions in question are neither independent nor normal). I guess you could call this the Simon-Ftorek approximation because Wikipedia cites Simon and Ftorek (2022)&lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;11&lt;/a&gt;&lt;/sup&gt;. For more on whether this approximation is any good, see &lt;a href=&quot;#appendix-a-approximating-the-retention-ratio&quot;&gt;Appendix A&lt;/a&gt;.&lt;/p&gt;
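
&lt;p&gt;(If you’d rather not trust the approximation, you can simulate the ratio directly. The sketch below is not what I did for this article, just a sanity check you could run; the parameters are placeholders.)&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

rng = np.random.default_rng(0)

def simulate_retention(hab_mean, hab_se, base_mean, base_se, df=30, n=100_000):
    # Model each benefit as a shifted, scaled t-distribution and take
    # Monte Carlo draws of their ratio.
    habituated = hab_mean + hab_se * rng.standard_t(df, n)
    baseline = base_mean + base_se * rng.standard_t(df, n)
    return habituated / baseline

# Placeholder means and standard errors:
samples = simulate_retention(hab_mean=4.0, hab_se=2.0, base_mean=8.0, base_se=2.0)
print(np.median(samples))  # around 0.5 for these placeholder numbers
&lt;/code&gt;&lt;/pre&gt;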

&lt;p&gt;I used the Simon-Ftorek approximation to compute the likelihood functions for the most important metrics in each study (see &lt;a href=&quot;#appendix-b-list-of-all-metrics-used&quot;&gt;Appendix B&lt;/a&gt; for a list of all the metrics I used). Then I computed a likelihood function for each study as the (geometric) average of all the likelihood functions for individual metrics in that study. Normally, you’re supposed to compute a joint likelihood as the product of your likelihood functions. But that assumes each function provides independent evidence, and I figured if you have the same study with the same participants, the different metrics all basically represent the same evidence. So I averaged them instead of multiplying them.&lt;/p&gt;
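
&lt;p&gt;In code, the two combination rules look like this (a sketch, assuming every likelihood curve is evaluated on the same grid of candidate retention values):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

# Each input: a 2-D array with one row per metric (or per study) and
# one column per candidate retention value.

def combine_within_study(likelihood_curves):
    # Geometric average: metrics within one study are treated as
    # largely redundant evidence rather than independent.
    return np.exp(np.mean(np.log(likelihood_curves), axis=0))

def combine_across_studies(likelihood_curves):
    # Product: separate studies are treated as independent evidence.
    return np.exp(np.sum(np.log(likelihood_curves), axis=0))
&lt;/code&gt;&lt;/pre&gt;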

&lt;p&gt;If we take the joint likelihoods from the four cognition studies and combine them into one big joint likelihood, we get this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-likelihood-joint-cognition.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;exercise-studies&quot;&gt;Exercise studies&lt;/h2&gt;

&lt;p&gt;I found two good studies on how habitual caffeine affects exercise: &lt;a href=&quot;https://doi.org/10.1080/02640414.2016.1241421&quot;&gt;Beaumont et al. (2017)&lt;/a&gt;&lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt; and &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6343867/&quot;&gt;Lara et al. (2019)&lt;/a&gt;&lt;sup id=&quot;fnref:11&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt;. These two studies had participants abstain from caffeine for a month. Then they gave participants either caffeine or placebo for several weeks, testing their exercise performance at the beginning and end of the study.&lt;/p&gt;

&lt;p&gt;The two methodologies differed somewhat, so I calculated retention differently for each.&lt;/p&gt;

&lt;p&gt;In Beaumont et al. (2017)&lt;sup id=&quot;fnref:3:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt;, the participants took two pre-tests, one with caffeine and one with placebo. Then after taking caffeine every day for 28 days, they took a dose of caffeine and a final performance test. I calculated&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;baseline benefit = caffeine pre-test – placebo pre-test&lt;/li&gt;
  &lt;li&gt;habituated benefit = caffeine post-test – placebo pre-test&lt;/li&gt;
  &lt;li&gt;retention = habituated benefit / baseline benefit&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Lara et al. (2019)&lt;sup id=&quot;fnref:11:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt; tested participants 3 times a week for 20 days. To take advantage of all the extra data points, I plotted a linear regression over the performance for the caffeine group minus the placebo group&lt;sup id=&quot;fnref:12&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:12&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;14&lt;/a&gt;&lt;/sup&gt;. Then I calculated&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;baseline benefit = projected effect size on day 0 (= intercept of the regression)&lt;/li&gt;
  &lt;li&gt;habituated benefit = projected effect size on day 20&lt;/li&gt;
  &lt;li&gt;retention = habituated benefit / baseline benefit&lt;/li&gt;
&lt;/ul&gt;
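
&lt;p&gt;In code, that regression step looks roughly like the following (the day and effect-size numbers are placeholders, not Lara et al.’s data):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

# Placeholder data: test days and (caffeine group - placebo group) effects.
days = np.array([1, 3, 5, 8, 10, 12, 15, 17, 19])
effect = np.array([0.9, 0.85, 0.7, 0.6, 0.55, 0.5, 0.45, 0.4, 0.35])

slope, intercept = np.polyfit(days, effect, deg=1)
baseline_benefit = intercept                 # projected effect on day 0
habituated_benefit = intercept + slope * 20  # projected effect on day 20
retention = habituated_benefit / baseline_benefit
&lt;/code&gt;&lt;/pre&gt;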

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-likelihood-joint-exercise.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;It looks like caffeine retains a little under 1/2 its benefit for exercise.&lt;/p&gt;

&lt;h2 id=&quot;some-problems-with-my-approach&quot;&gt;Some problems with my approach&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;As I mentioned before, I couldn’t calculate exact likelihood functions; I could only approximate them.&lt;/li&gt;
  &lt;li&gt;I assumed caffeine has the same effect on all cognitive tests and on all exercise tests. But caffeine probably helps more with some tasks than others. It probably improves reaction time more than memory; it probably helps more with sustained moderate exercise (e.g., a &lt;a href=&quot;https://en.wikipedia.org/wiki/VO2_max#Measurement_and_calculation&quot;&gt;VO2 max test&lt;/a&gt;) than with short intense efforts (e.g., a &lt;a href=&quot;https://en.wikipedia.org/wiki/Wingate_test&quot;&gt;Wingate test&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;Limiting the domain to [0, 1] makes the likelihood means tend toward 0.5 because it chops off the (often large) parts of the distribution below 0 and above 1. If you think retention can go above 1 but not below 0, you’ll get a higher mean. Conversely, if you think it can go below 0 but not above 1, you’ll get a lower mean. (Symmetrically expanding the domain to [-1, 2] doesn’t change the answers much.)&lt;/li&gt;
&lt;/ul&gt;
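
&lt;p&gt;To make the last point concrete, here’s a quick check (using a toy normal, not any study’s numbers) of how restricting the domain moves the mean:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np
from scipy.integrate import trapezoid
from scipy.stats import norm

# Mean of a normal density renormalized over a restricted domain.
def truncated_mean(grid, mu=0.8, sigma=0.6):
    density = norm.pdf(grid, mu, sigma)
    density = density / trapezoid(density, grid)
    return trapezoid(grid * density, grid)

print(truncated_mean(np.linspace(0, 1, 2001)))   # pulled toward 0.5
print(truncated_mean(np.linspace(-1, 2, 2001)))  # closer to the untruncated mean of 0.8
&lt;/code&gt;&lt;/pre&gt;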

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;I plotted likelihood functions for every study I reviewed and combined them into one big joint likelihood:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-likelihood-joint-all-metrics.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;What about a posterior probability?&lt;/p&gt;

&lt;p&gt;If you use a uniform prior for caffeine retention, the posterior probability distribution simply equals the likelihood function. According to the studies I looked at, a habituated caffeine user retains an expected 49% of the cognitive benefit and 44% of the exercise benefit, or 48% if we combine the cognition and exercise studies.&lt;/p&gt;

&lt;p&gt;My prior has more probability mass near 0 than 1. Human bodies &lt;a href=&quot;https://slatestarcodex.com/2019/08/19/maybe-your-zoloft-stopped-working-because-a-liver-fluke-tried-to-turn-your-nth-great-grandmother-into-a-zombie/&quot;&gt;want to maintain homeostasis&lt;/a&gt;, so there’s some theoretical reason to expect your body to adjust until caffeine stops working entirely. (And, empirically, taking caffeine does cause your brain to grow more neurotransmitter receptors,&lt;sup id=&quot;fnref:19&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:19&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;15&lt;/a&gt;&lt;/sup&gt; although it’s not clear how this corresponds to cognitive function.) Changing the prior moves my posterior expected retention from 48% to 43% or so.&lt;sup id=&quot;fnref:20&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:20&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;16&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
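
&lt;p&gt;The update itself is easy to reproduce on a grid. A sketch, using the beta(1, 1.5) prior from the footnote and a stand-in curve for the joint likelihood:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np
from scipy.integrate import trapezoid
from scipy.stats import beta

grid = np.linspace(0, 1, 501)
# Stand-in for the joint likelihood computed from the studies above.
likelihood = np.exp(-((grid - 0.48) / 0.35) ** 2)

prior = beta.pdf(grid, 1, 1.5)  # low-information, slight tilt toward 0
posterior = prior * likelihood
posterior = posterior / trapezoid(posterior, grid)

posterior_mean = trapezoid(grid * posterior, grid)
# With a uniform prior (beta(1, 1)), this equals the mean likelihood.
&lt;/code&gt;&lt;/pre&gt;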

&lt;p&gt;As shown in &lt;a href=&quot;#appendix-a-approximating-the-retention-ratio&quot;&gt;Appendix A&lt;/a&gt;, my approximation for the likelihood function understates the mean. Plus my review excluded any metrics where caffeine had too small a baseline benefit, which biases the retention to look smaller. Due to these factors, the true implied retention is higher than 43%. Let’s call it 50% to make it a nice round number.&lt;/p&gt;

&lt;p&gt;So we can reasonably say, albeit with a high degree of uncertainty, that caffeine retains something like half its benefit for a habitual user.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Source code for my calculations can be found &lt;a href=&quot;https://github.com/michaeldickens/public-scripts/tree/master/caffeine&quot;&gt;on GitHub&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2 id=&quot;appendix-a-approximating-the-retention-ratio&quot;&gt;Appendix A: Approximating the retention ratio&lt;/h2&gt;

&lt;p&gt;As a sanity check, I generated 10,000 random samples following the same distribution as the groups in one of the caffeine studies (Rogers et al. (2013)&lt;sup id=&quot;fnref:2:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;), where each sample represents a set of true parameter values that could correspond to the observed values for the four groups (LoCaf, LoPla, HiCaf, HiPla). Then I calculated the distribution of true retention and the Simon-Ftorek approximation of the likelihood function (normalized to integrate to 1) and plotted them together:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-MC-vs-SF-wide.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;A zoomed-in plot showing just the values from 0 to 1:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-MC-vs-SF-narrow.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;(The upper plot already chops off a big tail. The largest retention value in the sample was over 500, i.e., a long-term caffeine user appears to get a 500x larger benefit per dose than a naive user. That can happen with a ratio of random variables when the denominator ends up close to 0.)&lt;/p&gt;

&lt;p&gt;From these plots we can see that the Simon-Ftorek approximation slightly overstates the width of the distribution.&lt;/p&gt;

&lt;p&gt;If we restrict the Monte Carlo sample to just the results with a baseline benefit of at least one standard error (as I did when selecting metrics to look at), we get this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-MC-vs-SF-1SE.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;
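
&lt;p&gt;In outline, the Monte Carlo check works like this (the group means and standard errors below are placeholders, not Rogers et al.’s values):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Sample plausible true group means given observed (mean, SE) pairs.
# Placeholder numbers; the real ones come from the study.
locaf = rng.normal(0.5, 0.15, n)
lopla = rng.normal(0.0, 0.15, n)
hicaf = rng.normal(0.3, 0.15, n)

baseline = locaf - lopla    # naive-user benefit
habituated = hicaf - lopla  # habitual-user benefit (shares the LoPla draw)
retention = habituated / baseline  # heavy tails when baseline is near 0

# Mirror the metric-selection rule: keep only draws where the
# baseline benefit is at least one standard error.
se_baseline = np.hypot(0.15, 0.15)
kept = retention[baseline &gt;= se_baseline]
&lt;/code&gt;&lt;/pre&gt;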

&lt;p&gt;I also tried two other approximations, but Simon-Ftorek seemed best. The other approximations I tried:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Approximate using a Cauchy distribution. A Cauchy distribution perfectly represents the ratio of two central normal distributions, so I figured it might be an okay approximation for the ratio of non-central normal distributions.&lt;/li&gt;
  &lt;li&gt;Numerically compute the ratio of two distributions represented as histograms. This still requires assuming the distributions are independent.&lt;/li&gt;
&lt;/ul&gt;
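
&lt;p&gt;The histogram approach is just the change-of-variables integral for a ratio, computed numerically: for independent X and Y, the density of Z = X/Y is the integral of |y| * f_X(z*y) * f_Y(y) over y. A rough sketch, with made-up t-distribution parameters:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np
from scipy.integrate import trapezoid
from scipy.stats import t

# Density of Z = X/Y for independent X and Y, evaluated on a grid:
# f_Z(z) = integral of |y| * f_X(z*y) * f_Y(y) dy
def ratio_pdf(zs, num_dist, den_dist):
    ys = np.linspace(-5, 5, 4001)
    fy = den_dist.pdf(ys)
    return np.array([trapezoid(np.abs(ys) * num_dist.pdf(z * ys) * fy, ys)
                     for z in zs])

zs = np.linspace(-1, 2, 301)
# Made-up t-distributed effect estimates (df, loc, scale):
pdf = ratio_pdf(zs, t(20, loc=0.4, scale=0.2), t(20, loc=1.0, scale=0.2))
&lt;/code&gt;&lt;/pre&gt;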

&lt;p&gt;This plot shows the three different approximations over a single metric from Beaumont et al. (2017)&lt;sup id=&quot;fnref:3:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt;:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/caf-likelihood-approximations.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;appendix-b-list-of-all-metrics-used&quot;&gt;Appendix B: List of all metrics used&lt;/h2&gt;

&lt;p&gt;I did not use every metric from every study. I used the following criteria to select metrics:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Only use performance metrics, not subjective ratings or physiological measurements. (That means no sleepiness ratings, no heart rate readings, etc.)&lt;/li&gt;
  &lt;li&gt;Prefer metrics that the study authors emphasized.&lt;/li&gt;
  &lt;li&gt;Don’t use any metrics where the baseline benefit was small (&amp;lt;1 standard error) or negative.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I used the following metrics from each study:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Rogers et al. (2013)&lt;sup id=&quot;fnref:2:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;: simple reaction time, choice reaction time, recognition memory, tapping speed&lt;/li&gt;
  &lt;li&gt;Hewlett &amp;amp; Smith (2007)&lt;sup id=&quot;fnref:6:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:6&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;: focused attention speed, simple reaction time, verbal reasoning % correct&lt;/li&gt;
  &lt;li&gt;Haskell et al. (2005)&lt;sup id=&quot;fnref:9:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:9&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;: simple reaction time, digit vigilance reaction time, Rapid Visual Information Processing false alarms, spatial memory (sensitivity index), numeric memory reaction time&lt;/li&gt;
  &lt;li&gt;Smith et al. (2006)&lt;sup id=&quot;fnref:10:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:10&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;: focused attention speed, categoric search reaction time, simple reaction time, vigilance hits&lt;/li&gt;
  &lt;li&gt;Beaumont et al. (2017)&lt;sup id=&quot;fnref:3:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;12&lt;/a&gt;&lt;/sup&gt;: total energy output (kJ), substrate oxidation&lt;/li&gt;
  &lt;li&gt;Lara et al. (2019)&lt;sup id=&quot;fnref:11:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:11&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;13&lt;/a&gt;&lt;/sup&gt;: peak power, VO2 max, Wingate test peak power, Wingate test mean power&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Smith et al. and Lara et al. did not provide numeric standard errors (and Lara et al. did not provide means), but they did provide plots, so I estimated the numeric values by counting pixels in GIMP.&lt;/p&gt;

&lt;p&gt;Note: Ignoring metrics with a small or negative baseline benefit could bias the results toward making caffeine habituation look worse. This process selects for observations where the baseline benefit was large by chance, making the habituation benefit look smaller by comparison.&lt;/p&gt;

&lt;h2 id=&quot;changelog&quot;&gt;Changelog&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;2024-09-02: I previously described 44% (the mean exercise benefit to caffeine) as “about 1/3”. This makes it sound like the difference between the observed cognitive vs. exercise benefits is about 1/6 (17 percentage points) when in fact it’s only 4 percentage points. I changed the description to “a little under 1/2”.&lt;/li&gt;
  &lt;li&gt;2025-06-06: Fixed typos.&lt;/li&gt;
&lt;/ul&gt;


&lt;h1 id=&quot;notes&quot;&gt;Notes&lt;/h1&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:16&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;And I didn’t find it plausible to begin with. If caffeine doesn’t make non-users more alert, why would people start taking caffeine in the first place? &lt;a href=&quot;#fnref:16&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Rogers, P. J., Heatherley, S. V., Mullings, E. L., &amp;amp; Smith, J. E. (2013). &lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/23108937/&quot;&gt;Faster but not smarter: effects of caffeine and caffeine withdrawal on alertness and performance.&lt;/a&gt; &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:2:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:2:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:2:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;4&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:6&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Hewlett, P., &amp;amp; Smith, A. (2007). &lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/17514640/&quot;&gt;Effects of repeated doses of caffeine on performance and alertness: new data and secondary analyses.&lt;/a&gt; &lt;a href=&quot;#fnref:6&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:6:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:9&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Haskell, C. F., Kennedy, D. O., Wesnes, K. A., &amp;amp; Scholey, A. B. (2005). &lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/15678363/&quot;&gt;Cognitive and mood improvements of caffeine in habitual consumers and habitual non-consumers of caffeine.&lt;/a&gt; &lt;a href=&quot;#fnref:9&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:9:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:10&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Smith, A., Sutherland, D., &amp;amp; Christopher, G. (2006). &lt;a href=&quot;https://pubmed.ncbi.nlm.nih.gov/16910172/&quot;&gt;Effects of caffeine in overnight-withdrawn consumers and non-consumers.&lt;/a&gt; &lt;a href=&quot;#fnref:10&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:10:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:13&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Some ways self-selection could bias the results:&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;People who are more naturally alert might not take caffeine because they don’t feel like they need it. This would make the habituated benefit look smaller. (That is, the LoPla group might perform better than the caffeine group because they’re naturally more alert, not because caffeine isn’t benefiting the caffeine group.)&lt;/li&gt;
        &lt;li&gt;People who don’t get much benefit from caffeine might not take it, making the baseline benefit look smaller (and thus retention look larger).&lt;/li&gt;
        &lt;li&gt;People who react strongly to caffeine might not take it (because it makes them jittery/anxious), making the baseline benefit look larger.&lt;/li&gt;
      &lt;/ul&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:13&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:18&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;The four metrics are: (1) simple reaction time (SRT); (2) choice reaction time (CRT); (3) recognition memory; (4) tapping speed. &lt;a href=&quot;#fnref:18&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:17&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Normally, you’d calculate joint likelihood as the product of the likelihood functions. But that only works if the functions are independent. Since these four functions are all measuring the same group of people during the same experiment, I averaged them instead of multiplying them. The resulting joint likelihood understates the strength of the evidence, but I’m more concerned about overstating evidence than understating it. &lt;a href=&quot;#fnref:17&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:15&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Alternatively, the mean likelihood equals the posterior expected value when using a uniform prior.&lt;/p&gt;

      &lt;p&gt;Statistical analyses commonly report the &lt;a href=&quot;https://en.wikipedia.org/wiki/Maximum_likelihood&quot;&gt;maximum likelihood&lt;/a&gt; but not the mean likelihood. I believe people ought to use the mean likelihood instead.&lt;/p&gt;

      &lt;p&gt;For a symmetric distribution, the mean likelihood equals the maximum likelihood. But the distinction matters for caffeine retention because its likelihood function is skewed. If you compress the likelihood into a single value, that value should tilt toward whichever tail is fatter. The mean accounts for the fatness of the tails; the mode (maximum) does not.&lt;/p&gt;

      &lt;p&gt;For more on this, see McLeod, A. I., &amp;amp; Quenneville, B. (1999). &lt;a href=&quot;https://arxiv.org/pdf/1611.00884.pdf&quot;&gt;Mean likelihood estimators.&lt;/a&gt; &lt;a href=&quot;#fnref:15&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:14&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Really we should model them as &lt;a href=&quot;https://en.wikipedia.org/wiki/Behrens%E2%80%93Fisher_distribution&quot;&gt;Behrens-Fisher distributions&lt;/a&gt;, but a t-distribution is close enough. &lt;a href=&quot;#fnref:14&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Simon, F. E., &amp;amp; Ftorek, J. (2022). A new method for approximating the distribution of the ratio of two independent normal random variables. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Beaumont, R., James, L. J., &amp;amp; Davison, G. (2017). &lt;a href=&quot;https://doi.org/10.1080/02640414.2016.1241421&quot;&gt;Chronic ingestion of a low dose of caffeine induces tolerance to the performance benefits of caffeine.&lt;/a&gt; &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:3:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:3:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:3:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;4&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:11&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Lara, Beatriz; Ruiz-Moreno, Carlos; Salinero, Juan José; Del Coso, Juan; Sandbakk, Øyvind  (2019). &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6343867/&quot;&gt;Time course of tolerance to the performance benefits of caffeine.&lt;/a&gt; &lt;a href=&quot;#fnref:11&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:11:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; &lt;a href=&quot;#fnref:11:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;sup&gt;3&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:12&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I say that as if there were two different groups, but really this was a &lt;a href=&quot;https://en.wikipedia.org/wiki/Crossover_study&quot;&gt;crossover study&lt;/a&gt;. The experimenters ran two experiments in succession on the same group of people. They randomly assigned subjects to the caffeine or placebo groups on the first experiment, and then switched them for the second experiment. Effect sizes were determined by comparing each person’s performance with caffeine on day N to their performance with placebo on day N in the other trial.&lt;/p&gt;

      &lt;p&gt;This study had by far the best methodology of any of them; too bad it only got 11 participants. (I’m not surprised, considering that (a) they restricted their sample to physically active participants and (b) the participants had to repeatedly perform grueling exercises.) &lt;a href=&quot;#fnref:12&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:19&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Shi, D., Nikodijevic, O., Jacobson, K. A., &amp;amp; Daly, J. W. (1993). &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3437321/&quot;&gt;Chronic caffeine alters the density of adenosine, adrenergic, cholinergic, GABA, and serotonin receptors and calcium channels in mouse brain.&lt;/a&gt;&lt;/p&gt;

      &lt;p&gt;It would be more accurate to say that caffeine causes rats to grow more receptors, but I assume human brains work the same way.&lt;/p&gt;

      &lt;p&gt;Perhaps someday we will figure out how to augment rat brains to understand writing and then a rat will read this article. If that’s the case, greetings future rat folk, I hope you find this article helpful even though I only cited studies on human subjects.&lt;/p&gt;

      &lt;p&gt;(Future cybernetically-augmented rat: “Why do news articles always present scientific findings as if they apply to rats when they’ve only done the studies on humans?”) &lt;a href=&quot;#fnref:19&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:20&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;I calculated this by setting my prior to a &lt;a href=&quot;https://en.wikipedia.org/wiki/Beta_distribution&quot;&gt;beta distribution&lt;/a&gt; with alpha=1 and beta=1.5, which represents a low-information prior with a slight tilt toward 0. &lt;a href=&quot;#fnref:20&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;

                </description>
			</item>
		
	</channel>
</rss>
