Will Welfareans Get to Experience the Future?

Epistemic status: This entire essay rests on two controversial premises (linear aggregation and antispeciesism) that I believe are quite robust, but I will not be able to convince anyone that they’re true, so I’m not even going to try.

If welfare is important, and if the value of welfare scales something-like-linearly, and if there is nothing morally special about the human species1, then these two things are probably also true:

  1. The best possible universe isn’t filled with humans or human-like beings. It’s filled with some other type of being that’s much happier than humans, or has much richer experiences than humans, or otherwise experiences much more positive welfare than humans, whatever “welfare” turns out to mean. Let’s call these beings Welfareans.
  2. A universe filled with Welfareans is much better than a universe filled with humanoids.

(Historically, people referred to these beings as “hedonium”. I dislike that term because hedonium sounds like a thing. It doesn’t sound like something that matters. It’s supposed to be the opposite of that—it’s supposed to be the most profoundly innately valuable sentient being. So I think it’s better to describe the beings as Welfareans. I suppose we could also call them Hedoneans, but I don’t want to constrain myself to hedonistic utilitarianism.)

Even in the “Good Ending” where we solve AI alignment and governance and coordination problems and we end up with a superintelligent AI that builds a flourishing post-scarcity civilization, will there be Welfareans? In that world, humans will be able to create a flourishing future for themselves; but beings who don’t exist yet won’t be able to give themselves good lives, because they don’t exist.

Continue reading

The Next-Gen LLM Might Pose an Existential Threat

I’m pretty sure that the next generation of LLMs will be safe. But the risk is still high enough to make me uncomfortable.

How sure are we that scaling laws are correct? Researchers have drawn curves predicting how AI capabilities scale with how much compute and data go into training them. If you extrapolate those curves, it looks like the next level of LLMs won’t be wildly more powerful than the current level. But maybe there’s a weird bump in the curve somewhere between GPT-5 and GPT-6 (or between Claude 4.5 and Claude 5), and LLMs suddenly become much more capable in a way that scaling laws didn’t predict. I don’t think we can be more than 99.9% confident that there’s not.
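
To make “extrapolate those curves” concrete, here is a toy version of that kind of extrapolation. The power-law form is the standard shape these curves are usually drawn with, but every number below is made up for illustration; this is not any lab’s actual scaling law.

```python
import numpy as np

# Hypothetical loss-vs-compute points, made up purely to illustrate the method.
compute = np.array([1e21, 1e22, 1e23, 1e24])   # training FLOPs
loss    = np.array([2.8, 2.5, 2.25, 2.05])     # some capability-relevant loss metric

# Fit a power law, loss ~ a * compute**b (b < 0), via linear regression in log-log space.
b, log_a = np.polyfit(np.log(compute), np.log(loss), 1)
a = np.exp(log_a)

# Extrapolate one more order of magnitude of compute ("the next generation").
next_gen_compute = 1e25
print(a * next_gen_compute**b)  # a smooth prediction; a sudden "bump" would break this fit
```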

How sure are we that current-gen LLMs aren’t sandbagging (that is, deliberately hiding their true skill level)? I think they’re still dumb enough that their sandbagging can be caught, and indeed they have been caught sandbagging on some tests. I don’t think LLMs are hiding their true capabilities in general, and our understanding of AI capabilities is probably pretty accurate. But I don’t think we can be more than 99.9% confident about that.

How sure are we that the extrapolated capability level of the next-gen LLM isn’t enough to take over the world? It probably isn’t, but we don’t really know what level of capability is required for something like that. I don’t think we can be more than 99.9% confident.

Perhaps we can be >99.99% confident that the extrapolated capability level of the next-gen LLM still falls short of the smartest human. But an LLM has certain advantages over humans: it can work faster (at least on many kinds of tasks), it can copy itself, and it can operate computers in a way that humans can’t.

Alternatively, GPT-6/Claude 5 might not be able to take over the world itself, but it might be smart enough to recursively self-improve, and that self-improvement might happen too quickly for us to do anything about it.

How sure are we that we aren’t wrong about something else? I thought of three ways we could be disastrously wrong:

  1. We could be wrong about scaling laws;
  2. We could be wrong that LLMs aren’t sandbagging;
  3. We could be wrong about what capabilities are required for AI to take over.

But we could be wrong about some entirely different thing that I didn’t even think of. I’m not more than 99.9% confident that my list is comprehensive.

On the whole, I don’t think we can say there’s less than a 0.4% chance that the next-gen LLM forces us down a path that inevitably ends in everyone dying.
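
One rough way to get a number in that ballpark: treat each of the ~99.9% confidence levels above as an independent ~0.1% chance of being badly wrong and combine them. The post doesn’t spell this calculation out, so the sketch below is just an illustration of the arithmetic, not a real risk model.

```python
import math

# The four ~0.1% "we might be badly wrong" probabilities from above.
# Treating them as independent is an assumption made for this sketch.
p_wrong = {
    "scaling laws break between generations": 0.001,
    "current LLMs are sandbagging": 0.001,
    "next-gen capability suffices for takeover or self-improvement": 0.001,
    "some failure mode not on the list": 0.001,
}

# Probability that at least one of these assumptions fails.
p_any_wrong = 1 - math.prod(1 - p for p in p_wrong.values())
print(f"{p_any_wrong:.4f}")  # ~0.0040, i.e. roughly the 0.4% figure above
```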


Mechanisms Rule Hypotheses Out, But Not In

If there is no plausible mechanism by which a scientific hypothesis could be true, then it’s almost certainly false.

But if there is a plausible mechanism for a hypothesis, then that only provides weak evidence that it’s true.

An example of the former:

Astrology teaches that the positions of planets in the sky when you’re born can affect your life trajectory. If that were true, it would contradict well-established facts in physics and astronomy. Nobody has ever observed a physical mechanism by which astrology could be true.

An example of the latter:

A 2023 study found an association between autism and diet soda consumption during pregnancy. The authors’ proposed mechanism is that aspartame (an artificial sweetener found in diet soda) metabolizes into aspartic acid, which has been shown to cause neurological problems in mice. Even though there is a proposed mechanism, I don’t really care, and I’m pretty sure diet soda doesn’t cause autism. (For a more thorough take on the diet soda/autism thing, I will refer you to Grug, who is much smarter than me.)

Why?

Continue reading

How Much Does It Cost to Offset an LLM Subscription?

Is moral offsetting a good idea? Is it ethical to spend money on something harmful, and then donate to a charity that works to counteract those harms?

I’m not going to answer that question. Instead I’m going to ask a different one: if you use an LLM, how much do you have to donate to AI safety to offset the harm?

I can’t give a definitive answer, of course. But I can make an educated guess, and my educated guess is that for every $1 spent on an LLM subscription, you need to donate $0.87 to AI safety charities.
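
The estimate itself is behind the fold, but the general shape of an offsetting calculation is just a ratio: expected harm per dollar spent on the product, divided by expected harm averted per dollar donated. The sketch below only shows that structure; the function and placeholder numbers are hypothetical and are not how the $0.87 figure was derived.

```python
def offset_donation_per_dollar(harm_per_dollar_spent: float,
                               harm_averted_per_dollar_donated: float) -> float:
    """Dollars to donate per dollar spent, under a simple linear offsetting model.

    Both arguments are in the same (arbitrary) units of expected harm. The model
    and the example numbers below are hypothetical, not the post's actual estimate.
    """
    return harm_per_dollar_spent / harm_averted_per_dollar_donated

# Purely illustrative placeholder numbers (not where the $0.87 comes from):
print(offset_donation_per_dollar(harm_per_dollar_spent=2.0,
                                 harm_averted_per_dollar_donated=5.0))  # 0.4
```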

Continue reading

I Made an Emacs Extension That Displays Magic: the Gathering Card Tooltips

This post is about the niche intersection of Emacs and Magic: the Gathering.

I considered not writing this because I figured, surely if you multiply the proportion of people who play Magic by the proportion of people who use Emacs, you get a very small number. But then I thought, those two variables are probably not independent. And the intersection of Magic players × Emacs users × people who read my blog might actually be greater than zero. So if you’re out there, this post is for you.
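
As a toy illustration of why the independence assumption matters, here is that back-of-the-envelope calculation with completely made-up base rates:

```python
# Hypothetical base rates -- not real statistics.
p_magic = 1 / 200    # fraction of people who play Magic: the Gathering
p_emacs = 1 / 1000   # fraction of people who use Emacs

p_overlap_independent = p_magic * p_emacs   # assumes the two are independent
correlation_factor = 20                     # made up: Emacs users 20x likelier to play Magic
p_overlap_correlated = p_emacs * (p_magic * correlation_factor)

print(p_overlap_independent)  # 5e-06
print(p_overlap_correlated)   # 0.0001, i.e. 20x larger
```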

Do you like how MTG websites like magic.gg and mtg.wiki let you mouse over a card name to see a picture of the card? Well, I wrote an Emacs extension that replicates that functionality.

Continue reading

AI Safety Landscape and Strategic Gaps

I wrote a report giving a high-level review of what work people are doing in AI safety. The report specifically focused on two areas: AI policy/advocacy and non-human welfare (including animals and digital minds).

You can read the report below. I was commissioned to write it by Rethink Priorities, but the beliefs expressed are my own.

Continue reading

Healthy Cooking Tips from a Lazy Person


The problem with most “lazy cooking” advice is that it’s not lazy enough. Today I bring you some truly lazy ways of eating healthy.

This is the advice that I would’ve liked to hear when I was a lazy teenager. I’m still lazy, but I’m better at making food now. (I’m not going to say I’m better at cooking, because the way I make most food could only very generously be described as “cooking”.)

All my lazy meals are vegan because I’m vegan, but if anything, that works to my advantage because the easiest animal foods still take more work than the easiest plant foods. (You can eat raw vegetables but you can’t eat raw chicken.1)


Continue reading

Is It So Much to Ask for a Nice Reliable Aggregated X-Risk Forecast?

On most questions about the future, I don’t hold a strong view. I read the aggregate prediction of forecasters on Metaculus or Manifold Markets and then I pretty much believe whatever it says.

Various attempts have been made to forecast existential risk. I would like to be able to form views based on those forecasts—especially on non-AI x-risks, because I barely know anything about synthetic biology or nuclear winter or catastrophic climate change. Unfortunately, none of the aggregate forecasts look reliable.

Continue reading

Annual Subscription Discounts Usually Aren't Worth It

It’s common for monthly subscription services to offer a discount if you pay annually instead. That might be a bad deal.

Example: Suppose a monthly subscription costs $10/month and a one-year subscription gives you a 10% discount, which works out to $9/month. Say you expect to maintain the subscription for about three years before canceling.

A one-year subscription will save you about $36 ($1 per month for 36 months), but you can also expect to waste $54: when you decide to stop using it, you will still have (on average) six months of subscription left ($54 = $9/month for 6 months). So you end up spending $18 more than you would have with the monthly plan.

If you get a one-year subscription that you expect to keep for three years, then in expectation you pay for six months you never use: an extra 1/6 on top of the 36 months you actually use, or 1/7 of the 42 months you pay for. The annual plan is only worth it if its discount outweighs that waste, which works out to a discount greater than 1/7 (1/6 if you want a rounder, slightly conservative threshold).

If you expect to use the service for five years, you need to get at least a 10% discount to justify switching to an annual subscription.

In general, you need to use the subscription for at least N years to justify a discount of 1/(2N). (The exact breakeven discount for N years of expected use is 1/(2N + 1); using 1/(2N) just builds in a small margin.)
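
For concreteness, here is a minimal sketch of the model above, using the post’s assumption that an annual plan wastes half a year on average when you quit. It reproduces the $18 loss from the example and the breakeven discounts.

```python
def expected_cost(monthly_price: float, years_used: float,
                  annual_discount: float = 0.0, annual: bool = False) -> float:
    """Expected total spend: an annual plan pays for six unused months on average."""
    months_used = 12 * years_used
    if annual:
        return (months_used + 6) * monthly_price * (1 - annual_discount)
    return months_used * monthly_price

# The example above: $10/month, 10% annual discount, three expected years of use.
monthly_total = expected_cost(10, 3)                                    # $360
annual_total = expected_cost(10, 3, annual_discount=0.10, annual=True)  # $378
print(round(annual_total - monthly_total, 2))  # 18.0: the extra $18 from the example

def breakeven_discount(years_used: float) -> float:
    """Smallest annual discount that beats paying monthly: 6/(12N + 6) = 1/(2N + 1)."""
    return 6 / (12 * years_used + 6)

print(breakeven_discount(3))  # ~0.143, vs. the conservative 1/6 ~ 0.167 rule of thumb
print(breakeven_discount(5))  # ~0.091, vs. 1/10
```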

How do you guess how long you’ll keep using the service? According to the Lindy effect, you should expect that you will maintain a subscription for as long again as you’ve already had it for. Therefore, if you can get a 10% discount with an annual plan and you’ve already had the subscription for more than five years, you should go ahead and buy the annual plan.
