Part of a series on quantitative models for cause selection.

Introduction

In the past I’ve written qualitatively about what sorts of interventions likely have the best far-future effects. But qualitative analysis is maybe not the best way to decide this sort of thing, so let’s build some quantitative models.

I have constructed a model of various interventions and put them in a spreadsheet. This essay describes how I came up with the formulas to estimate the value of each intervention and makes a rough attempt at estimating the inputs to the formulas. For each input, I give either a mean and σ[1] or an 80% confidence interval (which can be converted into a mean and σ). Then I combine them to get a mean and σ for the estimated value of the intervention.

This essay acts as a supplement to my explanation of my quantitative model. The other post explains how the model works; this one goes into the nitty-gritty details of why I set up the inputs the way I did.

Note: All the confidence intervals here are rough first attempts and don’t represent my current best estimates; my main goal is to explain how I developed the presented series of models. I use dozens of different confidence intervals in this essay, so for the sake of time I have not revised them as I changed them. To see my up-to-date estimates, see my final model. I’m happy to hear things you think I should change, and I’ll edit my final model to incorporate feedback. And if you want to change the numbers, you can download the spreadsheet and mess around with it. This describes how to use the spreadsheet.

Common Ground: Value of the Far Future

All these estimates require estimating the value of the far future, so let’s start by doing that. This requires going into substantial depth, so I did this in a previous post.

Intervention Estimates

AI Safety

The model here is based on the framework created by the Global Priorities Project.

Inputs:

  • (a) P(AI-related extinction)
  • (b) size of AI safety community when AGI is developed
  • (c) multiplicative effect of adding a researcher
  • (d) bad scenarios averted by doubling research
  • (e) QALYs in the far future
  • (f) cost per researcher

The value of one new AI safety researcher is

a * (1 / b) * c * d * e

The expected value per dollar spent is

a * (1 / b) * c * d * e * (1 / f)

Estimates

On the FHI global catastrophic risk survey, survey respondents gave a median estimate of 5% chance that AI causes human extinction. The survey didn’t publish the raw data except as a plot, but from eyeballing the plot on page 3 it looks like the responses were (0, 1, 1, 5, 5, 5, 5, 10, 10, 15, 20, 50). If we leave out the 0, this list has a geometric mean of 6.54 and a geometric standard deviation of 3.06. That is, if we take the logarithm of all the values, compute their mean and standard deviation, and then exponentiate the results, we get a mean of 6.54% and a multiplicative spread of 3.06 (a geometric standard deviation is a ratio, not a percentage). (I’m using geometric mean/standard deviation here because the probabilities vary logarithmically.) I believe people are usually too optimistic, so let’s make the mean an even 10%. (The probability might be substantially higher or lower, but the final outcome doesn’t change much unless we update this probability by an order of magnitude or more.)
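To make the geometric mean and standard deviation concrete, here is a minimal sketch of the calculation. It assumes the eyeballed responses listed above and uses the population standard deviation (dividing by n), which is my assumption; it happens to reproduce the quoted figures.

```python
import math

# Responses eyeballed from the survey plot, in percent, with the 0 left out
# (the logarithm of 0 is undefined).
responses = [1, 1, 5, 5, 5, 5, 10, 10, 15, 20, 50]

logs = [math.log(x) for x in responses]
mean_log = sum(logs) / len(logs)

# Population standard deviation of the logs (divide by n). This choice is an
# assumption; dividing by n - 1 would give a slightly larger spread.
sd_log = math.sqrt(sum((v - mean_log) ** 2 for v in logs) / len(logs))

geo_mean = math.exp(mean_log)  # in percent
geo_sd = math.exp(sd_log)      # a multiplicative factor, not a percentage

print(f"geometric mean = {geo_mean:.2f}%, geometric SD = x{geo_sd:.2f}")
```

Reading the result: roughly two thirds of the responses should fall between the geometric mean divided by the geometric SD and the geometric mean multiplied by it.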

If I remember correctly, Eliezer claimed that a few years ago he could count the number of AI safety researchers on two hands. Today I believe there are something like two dozen researchers working on AI safety full time. I expect that this number will continue to grow, but we don’t really have a good way of predicting how quickly. Based on an analysis by MIRI, there are something like 30,000 AI researchers right now. This number might grow, and it’s pretty likely that there will be a lot fewer AI safety researchers than AI researchers in general, so when/if AGI is developed there will probably be fewer than, say, 10,000 AI safety researchers. It also seems pretty likely that there will be at least 200, given that there are about a tenth that many right now and the field is growing a lot. If AGI comes sooner then we will have fewer AI safety researchers, and if it comes later then we will have more; but I believe a reasonable 80% CI is (200, 10,000).

Adding new AI safety researchers might displace future researchers. On the other hand, it might also attract more attention to the field and cause it to grow. It’s hard to quantify this but let’s say the 80% CI for the multiplicative effect of adding a researcher is (0.5, 3).

The proportion of bad scenarios averted by doubling research is probably the hardest quantity here to estimate since we really have no good way to get a handle on it. I’m just going to say my 80% CI is (0.05, 0.5). I asked Daniel Filan about it and he gave an 80% CI of (0.2, 0.95). (This was an off-the-cuff estimate and he wouldn’t necessarily endorse this upon further reflection.) I’ll roughly average these to get an 80% CI of (0.1, 0.7).

We can fairly reasonably assume that all these factors follow a log-normal distribution. Multiplying log-normal variables is equivalent to adding their logarithms, which are normally distributed, and adding normal distributions is pretty simple. After doing this, it might be reasonable to multiply the σ by, say, 1.2-1.5 to adjust for overconfident estimates.
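As a rough sketch of how an 80% CI can be turned into log-normal parameters and how factors multiply, consider the following. The function names are hypothetical, the factors are assumed to be independent, and σ is measured in orders of magnitude (log base 10).

```python
import math
from statistics import NormalDist

# z-score that bounds the central 80% of a normal distribution (about 1.28).
Z80 = NormalDist().inv_cdf(0.9)

def lognormal_from_ci(low, high):
    """Return (mu, sigma) in log10 units for a log-normal with 80% CI (low, high)."""
    lo, hi = math.log10(low), math.log10(high)
    return (lo + hi) / 2, (hi - lo) / (2 * Z80)

def multiply(*factors):
    """Multiply independent log-normals: mus add, sigmas add in quadrature."""
    mu = sum(m for m, _ in factors)
    sigma = math.sqrt(sum(s ** 2 for _, s in factors))
    return mu, sigma

# Example: size of the AI safety community, 80% CI (200, 10,000).
mu_b, sigma_b = lognormal_from_ci(200, 10_000)

# Optional widening for overconfidence; 1.2 is the low end of the range above.
sigma_b_wide = 1.2 * sigma_b

print(f"mu = {mu_b:.2f}, sigma = {sigma_b:.2f}, widened sigma = {sigma_b_wide:.2f}")
```

Dividing by a factor works the same way except that its mu is subtracted; its sigma still adds in quadrature.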

So when we combine all the factors except for (e) and (f), the resulting distribution has a mean of 2.3x10^-5 with a σ of 0.94 orders of magnitude[1]. I made a lot of wild guesses to get this result, and I encourage you to give your own inputs and see if you get something substantially different.
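The combination of inputs (a) through (d) can be sketched as follows. This is my reconstruction rather than the spreadsheet itself: it assumes independent log-normal factors, treats the 10% figure as the center of (a) in log space with the survey's geometric SD of 3.06, and reads "mean" as 10 raised to the mean of the log. The helper name is hypothetical.

```python
import math
from statistics import NormalDist

Z80 = NormalDist().inv_cdf(0.9)  # z-score for an 80% interval, about 1.28

def from_ci(low, high):
    """(mu, sigma) in log10 units for a log-normal with 80% CI (low, high)."""
    lo, hi = math.log10(low), math.log10(high)
    return (lo + hi) / 2, (hi - lo) / (2 * Z80)

a = (math.log10(0.10), math.log10(3.06))  # P(AI-related extinction)
b = from_ci(200, 10_000)                  # size of the AI safety community
c = from_ci(0.5, 3)                       # multiplicative effect of a researcher
d = from_ci(0.1, 0.7)                     # bad scenarios averted by doubling research

# a * (1 / b) * c * d: dividing by b subtracts its mu;
# all four sigmas combine in quadrature.
mu = a[0] - b[0] + c[0] + d[0]
sigma = math.sqrt(a[1] ** 2 + b[1] ** 2 + c[1] ** 2 + d[1] ** 2)
center = 10 ** mu

print(f"center = {center:.1e}, sigma = {sigma:.2f} orders of magnitude")
```

Under these assumptions the sketch lands on about 2.3x10^-5 with a σ of about 0.94, which suggests the reading of the inputs is at least close to what the spreadsheet does.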

Alternative Model

Let’s develop another model and see what kind of result we get. How different the result is should tell us something about the variance of these estimates.

  • (a) P(AI-related extinction)
  • (b) hours of research necessary to avert all non-trivial catastrophe scenarios
  • (c) hours of research per researcher-year
  • (d) QALYs in the far future
  • (e) cost per researcher

Then the expected value per dollar spent on a new AI safety researcher is

a * (1 / b) * c * d * (1 / e)

We’ve already estimated (a) and (d), and we can just say (c) is 2000 hours. So let’s try to estimate (b).

Probably the best way to do this is to look for similar-sized problems and see how long they took. We don’t really know how big a problem AI safety is, but it’s plausibly somewhat but not much smaller than all the work that’s gone into AI research so far.

There are something like 30,000 AI researchers today, and that number has probably increased exponentially over time. Maybe if we assume there have been 30,000 researchers working for the past 10 years, that’s about the same as the number of AI researchers that there have been since the invention of computers. That’s 300,000 person-years, which at 2000 hours per year is 6x10^8 person-hours of work. So let’s say the 80% CI on the number of hours necessary to solve the AI control problem is (10^6, 10^9).

Combining these factors except for (d) and (e) gives a distribution with mean 6.3x10^-6 and σ 1.27 orders of magnitude. This is close enough to the first model’s result (2.3x10^-5 with σ 0.94) that it increases my confidence in the model a bit.
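Here is a minimal sketch of the alternative model's combination of (a), (b), and (c). As before, it is a reconstruction under stated assumptions: independent log-normal factors, (a) centered on 10% with a geometric SD of 3.06, and (c) treated as a fixed 2000 hours with no spread.

```python
import math
from statistics import NormalDist

Z80 = NormalDist().inv_cdf(0.9)  # z-score for an 80% interval, about 1.28

def from_ci(low, high):
    """(mu, sigma) in log10 units for a log-normal with 80% CI (low, high)."""
    lo, hi = math.log10(low), math.log10(high)
    return (lo + hi) / 2, (hi - lo) / (2 * Z80)

a = (math.log10(0.10), math.log10(3.06))  # P(AI-related extinction)
b = from_ci(1e6, 1e9)                     # hours of research needed
c_hours = 2000                            # hours per researcher-year (point estimate)

# a * (1 / b) * c: a point estimate shifts mu but adds no variance.
mu = a[0] - b[0] + math.log10(c_hours)
sigma = math.sqrt(a[1] ** 2 + b[1] ** 2)
center = 10 ** mu

print(f"center = {center:.1e}, sigma = {sigma:.2f} orders of magnitude")
```

Most of the spread here comes from (b): a three-order-of-magnitude CI on hours needed dominates the uncertainty in (a).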

Now let’s add in the last two factors, QALYs in the far future and cost per researcher. Let’s say our 80% CI on the cost per researcher is (70,000, 150,000). This gives us a final 80% CI of (5x10^37, 3x10^41).

This alternative model isn’t ideal: it tells us how much it costs for further AI safety efforts, but what we really care about is whether our marginal donations will further efforts by enough to prevent unfriendly AI. This is harder to estimate.

Animal Advocacy

Getting more people to care about animals has potential effects on factory farming, wild animal suffering, and the treatment of sentient computer programs[2]. We can estimate these three separately using the same formula.

Inputs:

  • (a) P($THING exists in the far future)
  • (b) increase in probability that society will care about $THING if we end factory farming via advocacy
  • (c) amount of $THING in the far future
  • (d) QALYs per $THING animal-year prevented
  • (e) amount of factory farming today
  • (f) cost of saving one factory-farmed animal

Where $THING is one of (factory farming, wild animal suffering, sentient simulations).

The value of saving one factory-farmed animal is

a * b * c * d * (1 / e)

(This assumes that the probability of caring about $THING scales proportionally with the amount of factory farming prevented. This might not be true–perhaps eliminating the last little bit of factory farming matters much more or less than saving the same number of factory farmed animals today–but it’s a reasonable assumption.)

The expected value per dollar spent is

a * b * c * d * (1 / e) * (1 / f)

Note: This model only looks at advocacy to reduce factory farming. There’s one organization, Animal Ethics, that does advocacy specifically for wild animals. This kind of work is still in its early stages and the results are a lot harder to quantify, so I’m not going to do anything with this for now.

Estimates for Factory Farming

In the far-future hedonium condition, there’s definitely no factory farming.

If we develop the technology to travel between solar systems, it seems pretty likely that we’ll also have the technology to synthesize meat that’s cheaper than raising animals. There’s maybe a 70% chance that we’ll develop good synthetic meat, and a 50% chance that we eliminate factory farming for some other reason even if we don’t. That gives us P(factory farming exists in the far future) = 0.3 x 0.5 = 0.15, or roughly 0.2.

If we end factory farming by convincing people that factory farming is bad, that pretty much necessitates that people care about factory farming, although the change in values might not carry through to the far future. So let’s say P(society will care about factory farming if we end factory farming via advocacy) is 0.5.

Given that factory farming exists in the far future, there will probably be somewhere in (10^10, 10^13) factory-farmed animals per solar system (that’s 1-10 animals per human). Combining this with the numbers from the biological condition, that means the far future has an estimated 2x10^33 to 5x10^37 factory-farmed animals.

The total number of factory farmed animals today is about 10^10, and my 80% CI on the QALYs per factory-farmed animal prevented is (0.2, 4) (being 0.2 if chickens are substantially less sentient than humans).

Based on the numbers published by ACE, let’s say $1 prevents between 5 and 50 years of factory farm life.

In the biological condition, the 80% CI of the expected value per dollar spent alleviating factory farming is (8x10^20, 2x10^25). Discounting this by 50% (because it’s in the biological condition only) makes it (4x10^20, 1x10^25).

Estimates for Wild Animal Suffering

In the far-future hedonium condition, there are no wild animals. The remainder of this section just considers the biological condition.

Filling lots of planets with wild animals doesn’t do much to further human interests so it’s likely we won’t do this. At the same time, lots of people value nature. Even if we do create wild animals, we might care enough to make their lives net good (we can probably do this if we’re capable of terraforming planets). Let’s say P(wild animal suffering exists in the far future) is 0.4.

If we eliminate factory farming, this probably increases people’s concern for wild animals by making people less speciesist. The connection is not that strong because people generally feel like they have special obligations not to hurt animals but that doesn’t mean they have obligations to help them. Let’s put P(people care about wild animal suffering if we end factory farming) at 0.1. (This really means that the difference between the factory-farming-exists condition and does-not-exist condition is 0.1.)

How much wild animal suffering will there be, given that wild animals exist? I’m going to use Brian Tomasik’s estimates here, and discount bugs by a factor of 10 to 100 (I’d give maybe a 10% chance that they’re sentient, and if they are, they matter 1/10 as much as more complex animals). Then there are currently 10^16 to 10^19 adjusted sentient animals on Earth (bugs still dominate even after adjusting for expected sentience). So let’s say each solar system has 10^16 to 10^20 wild animals. Multiplying this by the number of stars and length of the far future gives an 80% CI of (3x10^39, 4x10^44).

For simplicity, let’s say wild animal lives are -1 QALYs per year. (This is another input where people’s intuitions will differ wildly, and you should change this on the spreadsheet if you disagree with it.) The amount of factory farming is still the same as before, and so is the cost, so our expected value per dollar spent (after dividing by 2) is (7x10^26, 1x10^32).

Estimates for Sentient Simulations

In the computronium condition, there’s the possibility that we would create simulations that suffer a lot, but it doesn’t seem all that likely, and if we did, we’d still want to spend most resources on ourselves. Let’s say 10% chance of creating suffering simulations, and if we do, they make up 1 to 10% of the minds. If we combine this with the previous estimate for the number of happy lives in the far future, this gives an 80% CI of (2x10^45, 4x10^48).

If we say that eliminating factory farming increases the probability that people care about sentient simulations by 1%, that gives us an expected value per dollar spent of (1x10^32, 3x10^35).

In the biological condition, let’s say there’s a 50% chance of making suffering simulations.

Simulations are a lot more efficient than wild animals, but we’d want to devote resources to lots of things other than creating simulations, so let’s say there are 10% as many suffering simulations as wild animals. That means the 80% CI is (3x10^38, 4x10^43).

All the other numbers are the same as before. When we combine everything we get an expected value per dollar of (8x10^25, 1x10^31).

Averaging these two conditions gives (2x10^31, 4x10^35).

AI-Targeted Values Spreading

In Charities I Would Like to See, I raised the idea of targeting values spreading at people who seem particularly influential on the far future. So let’s look at the idea of spreading concern for non-human minds to AI researchers.

Inputs:

  • (a) P(friendly AI gets built)
  • (b) P(FAI is good for animals by default)
  • (c) P(animals exist in the far future)
  • (d) value difference between animal-friendly and -unfriendly far future
  • (e) number of AI researchers with significant influence on a friendly AI at the time it’s built
  • (f) values propagation factor
  • (g) cost of convincing one AI researcher to care about animals

The expected value per dollar spent is

a * (1 - b) * c * d * (1 / e) * f * (1 / g)

Let’s say there’s a 30% chance that friendly AI gets built (it’s a hard problem and we might go extinct some other way). There’s maybe an 80% chance that a friendly AI is good for animals by default, and a 10% chance that animals exist in the far future given that we have friendly AI. This number is lower than the unconditional probability that animals exist because I expect an AI would not want to “waste” lots of space on making animals everywhere.

For the value difference (d), let’s take the number from the estimate for sentient simulations of (2x10^45, 4x10^48).

As before, take the size of the AI safety community as (200, 10,000).

Let the values propagation factor be (1, 3). It doesn’t seem likely that this would vary much.

Based on this, the value of convincing one AI researcher to care about animals is (3x10^39, 2x10^43). I have no idea how much this would cost since as far as I know no one’s ever tried to spend money to do this. The opportunity costs of talent might matter more than expenditures. But I think a good CI here is (10^3, 10^6). I don’t think it would ever cost $1 million to convince one AI researcher, but there might be a low probability of success (e.g. we spend $100,000 for a 10% chance of success).

This gives a final 80% CI of (3x10^34, 2x10^39).
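To illustrate how the formula a * (1 - b) * c * d * (1 / e) * f * (1 / g) turns into an interval, here is a hedged sketch using the inputs stated above. It assumes independent log-normal factors, treats the three probabilities as fixed point estimates, and uses hypothetical helper names. It should be read as an illustration of the formula rather than a reproduction of the spreadsheet.

```python
import math
from statistics import NormalDist

Z80 = NormalDist().inv_cdf(0.9)  # z-score for an 80% interval, about 1.28

def from_ci(low, high):
    """(mu, sigma) in log10 units for a log-normal with 80% CI (low, high)."""
    lo, hi = math.log10(low), math.log10(high)
    return (lo + hi) / 2, (hi - lo) / (2 * Z80)

def point(value):
    """A fixed input with no spread."""
    return math.log10(value), 0.0

# (factor, sign): sign -1 means the formula divides by that factor.
terms = [
    (point(0.3), 1),             # a: P(friendly AI gets built)
    (point(1 - 0.8), 1),         # 1 - b: FAI is not good for animals by default
    (point(0.1), 1),             # c: P(animals exist in the far future)
    (from_ci(2e45, 4e48), 1),    # d: value difference
    (from_ci(200, 10_000), -1),  # e: number of influential researchers
    (from_ci(1, 3), 1),          # f: values propagation factor
    (from_ci(1e3, 1e6), -1),     # g: cost of convincing one researcher
]

mu = sum(sign * m for (m, _s), sign in terms)
sigma = math.sqrt(sum(s ** 2 for (_m, s), _sign in terms))
low_exp, high_exp = mu - Z80 * sigma, mu + Z80 * sigma

print(f"80% CI: (10^{low_exp:.1f}, 10^{high_exp:.1f})")
```

Run as written, this gives an interval in the same order of magnitude as the one quoted above but not an exact match, so the spreadsheet evidently differs from this sketch in some detail not described here.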

Discussion

Have I fully incorporated my qualitative analyses?

In previous essays I’ve raised a bunch of points that affected my understanding of the value of existential risk reduction and values spreading:

  1. Future humans might spread wild animal suffering.
  2. Future humans might care about all sentient beings.
  3. If we want to spread values, it’s non-obvious which values to spread.
  4. Future humans might fill the universe with human-like beings.
  5. Future humans might create suffering simulations.
  6. Debiasing people could lead to correct values.
  7. Value shifts might not carry through to the far future.
  8. The far future might be filled with hedonium.
  9. Values spreading has better feedback loops.

I’ve heard some people argue that quantitative models are often inferior to qualitative approaches because quantitative models don’t account for all your intuitions or perspectives. But that means if you can build a quantitative model that does account for all your perspectives, it ought to be better than a more intuitive approach.

Here’s how my model incorporates each of these:

  1. Explicit calculation.
  2. Explicit calculation.
  3. I just look at two types of values spreading (animal advocacy in general, and targeted animal advocacy to AI safety researchers).
  4. Explicit calculation.
  5. Explicit calculation.
  6. This is included in my estimates of the probability that factory farming/wild animal suffering/etc. will exist in the far future.
  7. This is included in my estimate of the probability that we end animal suffering in the future given that we eliminate factory farming.
  8. Explicit calculation.
  9. This is included in my estimates of the probability of success for different interventions (e.g. the probability of building a friendly AI).

Notes

  1. Here σ is not the same thing as standard deviation; it’s the standard deviation of the log base 10 of the distribution.

  2. It could also have effects on sentient aliens, other animals in captivity, and possibly other groups as well. For now I’ll leave these out because I believe they don’t have a large enough effect to justify the added model complexity.