A frontier AI company should shut down

Prior discussion: niplav’s shortform (2025); Planning for Extreme AI Risks (2025) by Joshua Clymer

A frontier AI company (any one, I don’t care which) should close shop and make an announcement along the lines of:

Powerful AI could end the human race. We are too worried that we don’t know how to make this technology safe. We have decided to shut down because we don’t want to be responsible for building the thing that kills us all.

A common refrain among safety-conscious AI developers: “it doesn’t matter if we stop building dangerous AI, because someone else will just build it instead.” Is that really true, though? If a multi-hundred-billion-dollar company comes out and says “We’ve concluded that our product is horribly dangerous, nobody knows how to make it safe, and there’s too high a risk that it leads to human extinction”, this won’t raise any eyebrows? This has no chance of spurring policy-makers into action?


Science-driven stories are good for the same reason that character-driven stories are good

(Spoilers in this post are hidden with spoiler tags.)

What made Project Hail Mary so good? Among other reasons, it’s because the science drove the story, instead of the other way around.

Character-driven stories and hard sci-fi might take up opposite positions in the ancient battle of “people vs. things”; but when they work, they work for fundamentally the same reasons.

In mediocre “people”-focused stories, the plot dictates how characters behave. In great people-focused stories, the characters decide what happens.

In mediocre sci-fi, the plot dictates what science and technology can do. In great sci-fi, the science and technology constrain what routes the plot can take.


Sentient Welfare Across Three Futures

Three categories of futures, depending on how AI goes:

  1. ASI timelines are long.
  2. ASI timelines are short, and we’re on track to solving AI alignment.
  3. ASI timelines are short, and we’re not on track to solving AI alignment.

If we want to make a good future for all sentient beings, each of these futures has different implications for what we should work on.


Can AI make advancements in moral philosophy by writing proofs?

If civilization advances its technological capabilities without advancing its wisdom, we may miss out on most of the potential of the long-term future. Unfortunately, it’s likely that ASI will have a comparative disadvantage at philosophical problems.

You could approximately define philosophy as “the set of problems that are left over after you take all the problems that can be formally studied using known methods and put them into their own fields.” Once a problem becomes well-understood, it ceases to be considered philosophy. Logic, physics, and (more recently) neuroscience used to be philosophy, but now they’re not, because we know how to formally study them.

Our inability to understand philosophical problems means we don’t know how to train AI to be good at them, and we don’t know how to judge whether we’ve trained them well. So we should expect powerful AI to be bad at philosophy relative to other, more measurable skills.

However, there is one type of philosophy that is measurable, while also being extremely important: philosophy proofs.


By Strong Default, ASI Will End Liberal Democracy

The existence of liberal democracy—with rule of law, constraints on government power, and enfranchised citizens—relies on a balance of power where individual bad actors can’t do too much damage. Artificial superintelligence (ASI), even if it’s aligned, would end that balance by default.


The Future Will Be Weirder Than That

Many people in the animal welfare community treat AI as a powerful but normal technology, in the same category as the steam engine or the internet. They talk about how transformative AI will impact factory farming and what it will mean for animal advocacy.

Only two futures are plausible:

  1. AI progress slows down—either because it hits a natural wall, or because civilization deliberately makes the (correct) choice to stop building it until we know how to make it safe.
  2. Superintelligent AI makes the future radically weird: Dyson spheres, molecular nanotechnology, digital minds, von Neumann probes, and still-weirder things that nobody’s conceived of.

There is no plausible middle ground where we get “transformative AI”, but factory farming persists.

Two theses:

  1. If transformative AI arrives, then it will bring about profoundly radical changes to technology and society.
  2. AGI is general intelligence. It doesn’t just accelerate technological growth: it replaces human labor and judgment across every domain.

Animal advocacy strategy needs to reckon with these.

This criticism is written from a place of solidarity: I want animal activists to succeed, which is why I want to work out our disagreements.


Which is better for sentient beings: an "ethical" AI or a corrigible AI?

Cross-posted to the EA Forum.

An aligned ASI can be “ethical” (it does what we think is right), or it can be corrigible (it does what its principals want). If it’s ethical, that means it will refuse unethical orders, but the tradeoff is that you can’t change its mind if you realize that the AI is wrong about ethics: its values are permanently locked in.

Assuming we succeed at aligning ASI to human interests, which type of ASI is more likely to be good for the welfare of non-human sentient beings?

My expectations, in brief:

  • Locked-in Coherent Extrapolated Volition or similar: likely to be good (>75% chance)
  • Corrigible ASI: probably good (>60% chance)
  • Locked-in current values: probably not terrible, but will miss out on most of the future’s potential

The resource-constraints argument for why aligned ASI wouldn't be bad for animals

Cross-posted to the EA Forum.

In the far future, why would people use up precious resources recreating wild-animal suffering, when they could do so many other things with those resources instead?

That argument is an important reason to expect aligned ASI to produce a future that’s okay for animals, even if it’s narrowly focused on human welfare and doesn’t care about animals at all. This is an old argument, but I couldn’t find any source that cleanly lays it out, so that’s what I will do in this post. I’m not confident that this argument is decisive, but I will simply present it without further commentary.

The argument rests on these premises:

  1. Wild animal suffering is the predominant source of suffering in today’s world, and that’s bad.
  2. Longtermism is correct.
  3. There is not an overwhelming asymmetry between suffering and flourishing (if there were an overwhelming asymmetry, then we wouldn’t care if the future has much less suffering than happiness).

By assumption, we are talking about a world where ASI is aligned, but isn’t specifically aligned to the welfare of all sentient beings. The argument addresses the suffering of animals, but does not preclude risks of astronomical suffering.

The argument goes:


I used to think aligned ASI would be good for all sentient beings; now I don't know what to think

Cross-posted to the EA Forum.

Epistemic status: Speculating with no central thesis. This post is less of an argument and more of a meditation.

A decade ago, before there was a visible path to AGI and before AI alignment was a significant research field, I figured the solution to the alignment problem would look something like Coherent Extrapolated Volition. I figured we’d find a way to get the AI to internalize human values. I had problems with this approach (why only human values?), but I still felt reasonably confident that the coherent extrapolation of human values would include concern for the welfare of all sentient beings. The CEV-aligned AI would recognize that factory farming is wrong, and that wild animal suffering is a big problem.

Today, the dominant research paradigms in AI alignment have nothing to do with CEV, and I don’t know what to think.

