How valuable are weak AI safety regulations?

Image credit: Jebulon

To prevent superintelligent AI from killing everyone, I would like there to be a strong international agreement banning the development of ASI until it can be proven safe. But that sort of agreement requires a lot of political buy-in and coordination. In the meantime, it may be easier to get light-touch AI safety regulations passed. To what extent do weak regulations decrease extinction risk?

In this post:

  • Part I discusses routes by which weak regulations can reduce extinction risk.
  • Part II considers some downsides of weak regulations.
  • Part III reviews specific categories of weak regulation and how they might reduce risk.

We Need Breadth-First AI Safety Plans

Depth-first plans lay out a path from here to aligned superintelligent AI. We need those kinds of plans. But depth-first plans depend on many assumptions: “We will make AI safe by doing step 1, then step 2, then step 3.” Step 1 only works under condition A, step 2 requires condition B, step 3 requires condition C. If A or B or C is false, the whole plan fails (and there’s a good chance we all die).
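To make the fragility concrete, here is a minimal sketch of how the odds of a depth-first plan shrink as conditions pile up. It assumes the conditions are independent, which real plans may not satisfy, and the probabilities are hypothetical placeholders rather than estimates from this post. The function name `plan_success_probability` is made up for illustration.

```python
from math import prod


def plan_success_probability(condition_probs):
    """Chance that every condition holds, ASSUMING the conditions are independent."""
    return prod(condition_probs)


# Hypothetical probabilities for conditions A, B, and C (illustrative only).
p_success = plan_success_probability([0.8, 0.8, 0.8])
print(f"{p_success:.3f}")  # prints 0.512
```

Even when each condition is fairly likely on its own, a plan that needs all three succeeds only about half the time under these made-up numbers.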

Consider Google’s safety plan from April 2025. To my knowledge, this is the best among the frontier AI companies’ plans.

Google’s plan depends on a series of conditions:


I sleep less when I exercise more

They say exercise improves sleep quality. Is that true for me?

To test this hypothesis, I took my daily calorie expenditures from the Apple Health app and correlated them with that night’s sleep time. I also included caffeine intake as a potential confounding variable.

The hypothesis: when I exercise more, I’ll get better rest that night, and therefore wake up earlier.

The results:

name         coef      t-stat    p-value
intercept    9.0134    65.072    0.0000
calories    -1.6844    -6.967    0.0000
caffeine     0.4157     9.404    0.0000
R²           0.2409

I sleep 10 minutes less for every additional 100 calories of exercise. Exercise plus caffeine explained 24% of the variance in my sleep time; exercise alone explained 6.6%.
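As a quick check of the arithmetic, the sketch below converts the calories coefficient into minutes of sleep per 100 calories. It assumes sleep time is measured in hours and calories in thousands. Those units are not stated in the results; they are inferred because they reproduce the 10-minute figure. The function name is hypothetical.

```python
COEF_CALORIES = -1.6844  # calories coefficient from the regression results


def minutes_per_100_calories(coef, calories_per_unit=1000.0):
    """Convert a coefficient in hours per `calories_per_unit` calories
    into minutes per 100 calories. The units are an assumption."""
    hours_per_calorie = coef / calories_per_unit
    return hours_per_calorie * 100 * 60


print(round(minutes_per_100_calories(COEF_CALORIES), 1))  # prints -10.1
```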

The trend shows up whether or not I have caffeine:

Data are binned into increments of 100 calories. Any bins with fewer than 5 data points are not displayed. Vertical lines show the 95% confidence intervals for each bin.
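The binning described in the caption could be sketched roughly as follows. This is not the code used for the chart: the function name `binned_means` is hypothetical, and the interval uses a normal approximation (1.96 standard errors), which may differ from the method behind the plotted confidence intervals. To split by caffeine, one would call it once per group.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def binned_means(calories, sleep_hours, bin_width=100, min_count=5):
    """Group nights into calorie bins and summarize sleep time per bin.

    Returns {bin_start: (mean_sleep, ci_half_width)}. Bins with fewer than
    `min_count` nights are dropped. The half-width is a normal-approximation
    95% interval (an assumption, not necessarily the chart's method).
    `min_count` should be at least 2, since stdev needs two or more points.
    """
    bins = defaultdict(list)
    for cal, hours in zip(calories, sleep_hours):
        bin_start = int(cal // bin_width) * bin_width
        bins[bin_start].append(hours)

    summary = {}
    for bin_start in sorted(bins):
        values = bins[bin_start]
        if len(values) < min_count:
            continue  # too few nights to display
        half_width = 1.96 * stdev(values) / sqrt(len(values))
        summary[bin_start] = (mean(values), half_width)
    return summary
```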


Donation Timing Under Uncertainty About AI Timelines

A few years back, I got a big pile of money from working at a tech startup. I put a lot of that money into a donor-advised fund (DAF). Since I now make hardly any money, that DAF might represent the majority of my lifetime donations. How much of my DAF should I donate per year?

In particular, how much should I donate in light of short AI timelines?

I created a simple model to answer this question.


I was wrong: concentrated factor portfolios don't have alpha

Previously, I wrote about how investors can simulate leverage via concentrated stock selection. That’s still true as far as I can tell. However, I also wrote something that I now believe to be false: concentrated equal-weighted factor portfolios have alpha on top of value-weighted factor portfolios. The numbers I found before were not wrong per se. However:

  • The alpha came primarily from small-cap and micro-cap stocks. That alpha may not be feasible to capture, or it may be defeated by trading costs; and historical estimates of micro-cap returns are biased upward because closing prices do not accurately represent the average investor’s trade price (Blume & Stambaugh, 1983).

    When I constructed hypothetical factor portfolios that had high concentration but screened out small-caps, the results did simulate leverage—they had higher returns and volatility than diversified factor portfolios—but alphas were not consistently positive.

  • In the United States (where the data goes back the furthest), the alpha only shows up over the full data series (1927–2025). When restricting to 1964 onward, the alphas are close to zero.
  • Concentrated value and momentum had positive alpha; but when I tested two new factors, profitability and investment, they each had negative alpha.

Pausing AI Is the Best Answer to Post-Alignment Problems

Even if we solve the AI alignment problem, we still face post-alignment problems, which are all the other existential problems that AI may bring.

People have identified various imposing problems that we may need to solve before developing ASI. An incomplete list of topics: misuse; animal-inclusive AI; AI welfare; S-risks from conflict; gradual disempowerment; permanent mass unemployment; risks from malevolent actors/AI-enabled coups/gradual concentration of power; moral error.

If we figure out how to resolve one of these problems, we still have to deal with all the others. If even one problem remains unsolved, the future could be catastrophically bad. That fact diminishes the promise of working on problems individually.

A global moratorium on superintelligence buys us more time to work on alignment as well as all of the post-alignment problems. Pausing AI is in the common interest of many causes.


Cost-effectiveness model for AI alignment-to-animals vs. alignment-in-general

Cross-posted to the EA Forum.

Last September, I wrote:

  1. There’s a (say) 80% chance that an aligned(-to-humans) AI will be good for animals, but that still leaves a 20% chance of a bad outcome.
  2. AI-for-animals receives much less than 20% as much funding as AI safety.
  3. Cost-effectiveness maybe scales with the inverse of the amount invested. Therefore, AI-for-animals interventions are more cost-effective on the margin than AI safety.
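The three-step argument above can be sketched numerically. This is a toy illustration, not the SquiggleHub model: it assumes returns are logarithmic in funding, so that marginal value is problem scale divided by funding, and the funding figures are hypothetical placeholders. The 0.2 scale reflects the 20% chance from step 1, relative to a scale of 1.0 for alignment in general.

```python
def marginal_value(problem_scale, funding):
    """Marginal value of an extra dollar when total value is
    problem_scale * ln(funding), i.e. cost-effectiveness scales as 1 / funding."""
    return problem_scale / funding


# Hypothetical inputs (not from the post): AI-for-animals gets 1% of the
# funding that AI safety gets, well under the 20% threshold in step 2.
safety = marginal_value(problem_scale=1.0, funding=100.0)
animals = marginal_value(problem_scale=0.2, funding=1.0)

print(animals / safety)  # roughly 20 under these made-up inputs
```

Under these assumptions, AI-for-animals comes out ahead on the margin whenever its funding share is below 20%, which is the condition stated in step 2.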

Today, I’m fleshing out this argument with a cost-effectiveness model. The model estimates how much it costs to make progress on AI alignment—the general problem of getting ASI to achieve any goal without subsequently killing everyone—compared to how much it costs to make progress on aligning AI to animal welfare specifically.

The model is on SquiggleHub: https://squigglehub.org/models/AI-for-animals/alignment-to-animals-EV-simple


Which types of AI alignment research are most likely to be good for all sentient beings?

Cross-posted to the EA Forum.

AI alignment is typically defined as the task of aligning artificial superintelligence to human preferences. But non-human animals, future digital minds, and maybe other sorts of beings also have moral worth; ASI ought to care for their interests, too.

In broad strokes, if we place all alignment techniques on a spectrum between

getting AI to do things that their users expressly want in the immediate term

and

embedding in AI the generalized notion of respecting beings’ preferences

then things more like the latter are better for non-humans, and things more like the former are worse.

In this post, I review 12 categories of AI safety research based on how likely they are to be good for non-human welfare.

