Expected Value Estimates You Can (Maybe) Take Literally
Part of a series on quantitative models for cause selection.
Alternate title: Excessive Pessimism About Far Future Causes
In my post on cause selection, I wrote that I was roughly indifferent between $1 to MIRI, $5 to The Humane League (THL), and $10 to AMF. I based my estimate for THL on the evidence and cost-effectiveness estimates for veg ads and leafleting. Our best estimates suggested that these are conservatively 10 times as cost-effective as malaria nets, but the evidence was fairly weak. Based on intuition, I decided to adjust this 10x difference down to 2x, but I didn’t have a strong justification for the choice.
Corporate outreach has a lower burden of proof (the causal chain is much clearer) and estimates suggest that it may be ten times more effective than ACE top charities’ aggregate activities^{1}. So does that mean I should be indifferent between $5 to ACE top charities and $0.50 to corporate campaigns? Or perhaps even less, because the evidence for corporate campaigns is stronger? But I wouldn’t expect this 10x difference to make corporate campaigns look better than AI safety, so I can’t say both that corporate campaigns are ten times better than ACE top charities and also that AI safety is only five times better. My previous model, in which I took expected value estimates and adjusted them based on my intuition, was clearly inadequate. How do I resolve this? In general, how can we quantify the value of robust, moderately cost-effective interventions against non-robust but (ostensibly) highly cost-effective interventions?
To answer that question, we have to get more abstract.
Theory
Naively, it seems that if one non-robust estimate A looks ten times better than another non-robust estimate B, you should count it as ten times better in expectation if both estimates are similarly non-robust. But I suspect this is false, because overestimates scale up nonlinearly.
Example: suppose you do some naive cost-effectiveness calculations and find that intervention A costs $10/QALY^{2} and intervention B costs $1 per 1000 QALYs. Your calculations were pretty naive so you shouldn’t take them too seriously. If you just discount all your calculations by a factor of 10, intervention A moves from $10/QALY to a more reasonable $100/QALY (about the same as AMF^{3}). But if you discount intervention B by a factor of 10, now it’s still at $1 per 100 QALYs. Does that mean you should still treat intervention B as ten thousand times better than intervention A? Probably not.
Instead, you should have a prior probability distribution over cost-effectiveness for an intervention and then update on the evidence provided by your cost-effectiveness calculation. The more robust your calculation, the stronger the update. But more extreme cost-effectiveness results will have lower prior probabilities, so your update does not scale linearly with the size of the cost-effectiveness estimate. How exactly it scales depends on your probability distribution, which is probably something like lognormal. Holden Karnofsky discusses this idea here; in a future essay, I’ll go into detail on how to choose a prior distribution.
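To see what "does not scale linearly" can look like, here is a minimal sketch. It assumes a lognormal prior with median 1 and a lognormal estimate, each with a σ of one order of magnitude (illustrative values, not figures from this post), and does the standard normal-normal update in log space. The function name `posterior_median` is made up for this example.

```python
import math

def posterior_median(estimate, est_sigma, prior_sigma=1.0, prior_median=1.0):
    """Normal-normal update on log10 values; sigmas are in orders of magnitude.

    Returns 10 ** (posterior mean of the log10 value), i.e. the posterior median.
    All default parameter values are illustrative assumptions.
    """
    w_prior = 1.0 / prior_sigma ** 2   # precision of the prior
    w_est = 1.0 / est_sigma ** 2       # precision of the estimate
    mean_log = (w_prior * math.log10(prior_median)
                + w_est * math.log10(estimate)) / (w_prior + w_est)
    return 10 ** mean_log

# Raw estimates spanning three orders of magnitude, all equally non-robust.
for raw in (10, 100, 1_000, 10_000):
    print(f"raw estimate {raw:>6}x  ->  posterior {posterior_median(raw, 1.0):6.1f}x")
```

Under these assumptions, a 1000-fold increase in the raw estimate (from 10x to 10,000x) becomes only about a 32-fold increase in the posterior.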
Maybe you say we shouldn’t care that much about explicit cost-effectiveness estimates, even with discounting. But you have to have a utility function that assigns utilities to charities, so you need some way to translate cost-effectiveness into utility.
Wait, why exactly do you have to have a utility function? This might not be obvious. But suppose you’re given a choice: either $1000 goes to AMF or $5000 goes to GiveDirectly. This choice has significant real-world effects, so you’d really like to be able to decide which is better. It’s not enough to just say that you prefer AMF to GiveDirectly; you have to say by how much you prefer it.
This is not merely an academic question. In 2015, I donated to REG, which raises money for effective charities. The bulk of the funds goes to GiveWell top charities; if I believe that some of the other charities it raises money for are much more cost-effective, it might be better to donate directly to those instead. I created a model that estimated the relative value of donations to every organization REG raised money for. After doing this, I found that REG had a fundratio greater than 1:1, which made it look better than my favorite direct (non-fundraising) organization.
To build this model, I had to assign weights to all of the charities REG raised money for. The weights had to come from somewhere. That means the better we can figure out how to translate cost-effectiveness estimates into actual expected utility, the better we’re able to estimate the value of this sort of fundraising.
Kinda Arbitrary Quantitative Model
Note: I believe a lot of the numbers here are off for various reasons so don’t take them too seriously. I’ll refine them more in a future post.
Inputs to cost-effectiveness estimates tend to vary logarithmically, so it’s reasonable to assume that estimates follow a lognormal distribution. If I perform a cost-effectiveness calculation that shows the result is X, then there’s a reasonable chance I will later update the answer to 10X or (1/10)X, and a smaller but non-negligible chance I will update to 100X or (1/100)X. More robust calculations have lower variances.
Dario Amodei created a mathematical model for how to incorporate expected value estimates into a prior distribution. He used a lognormal distribution for the prior. I believe a Pareto distribution may make more sense—the distribution of interventions looks qualitatively similar to other phenomena that a Pareto distribution describes well—but for now I’ll use a lognormal distribution. (I’ll discuss this choice more in a future essay.)
Let’s use some actual numbers here. Suppose half of the interventions in our reference class perform better than GiveDirectly. (This assumption is kind of arbitrary and depends on how we define “our reference class.”) Scale our metric of utility such that GiveDirectly has cost-effectiveness/utility of 1. Let’s give our prior probability distribution a σ of one order of magnitude^{4}. That means half of all interventions perform worse than GiveDirectly, 57% are less than twice as good, 73% are less than 10 times as good, 89% are less than 100 times as good, and 99% are less than 10,000 times as good^{5}. Extrapolating from GiveWell’s comparisons of AMF and GiveDirectly, we can assume that 1 utility in this model is the equivalent of about 1 QALY per $1000.
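The percentiles above can be checked with a few lines of Python. This sketch assumes the prior is lognormal with median 1 (GiveDirectly) and uses the conversion from the notes, 1 × 10 / e ≈ 3.68, as the σ in natural-log units. That reading is an assumption, chosen because it reproduces the quoted figures. The function name `prior_cdf` is made up for this example.

```python
import math

# Assumption: sigma in natural-log units, following the note's "1 * 10 / e".
SIGMA_LN = 1 * 10 / math.e

def prior_cdf(multiple, sigma=SIGMA_LN, median=1.0):
    """P(intervention is less than `multiple` times as good as GiveDirectly)
    under a lognormal prior with the given median and log-space sigma."""
    z = (math.log(multiple) - math.log(median)) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for multiple in (1, 2, 10, 100, 10_000):
    print(f"P(< {multiple:>6}x GiveDirectly) = {prior_cdf(multiple):.0%}")
```

With these assumptions the loop prints 50%, 57%, 73%, 89%, and 99%.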
Okay, what kinds of results does this produce? Let’s compare GiveDirectly, AMF, and corporate cage-free campaigns.
According to this spreadsheet, AMF is 11 times as cost-effective as GiveDirectly. This is a pretty robust estimate, so let’s say it has a standard deviation σ of 0.5 orders of magnitude. GiveDirectly is even more robust, so let’s say its σ is 0.25 orders of magnitude.
On corporate campaigns, Open Phil estimates that about $2.5 million of spending spared 60 million hens of cage confinement per year for perhaps five years, resulting in 120 hen-years spared per dollar spent. Including spending on passing Prop 2, this estimate drops to 20 hen-years in cages spared per dollar.
If we assume the transition from battery cages to cage-free is worth 1 hen QALY per year and hen consciousness is worth half as much as human consciousness and also assume cage-free campaigns improve 40 hen-years per dollar as a middle ground between Open Phil’s two numbers (weighted toward the low end), then we can conclude that corporate campaigns are worth 20 human-equivalent QALYs per $1 spent. AMF costs about $100 per QALY, which makes corporate campaigns 2000 times better than AMF and therefore 22,000 times better than GiveDirectly. Open Phil believes that The Humane League’s campaigns are unusually effective and may spare more like 300 hens per dollar spent, which would make THL’s campaigns 165,000 times better than GiveDirectly. I’ll use the lower figure to be conservative. These numbers would be smaller if you heavily discounted the value of chickens relative to humans, but I don’t believe you could reasonably discount chickens by more than about an order of magnitude beyond what I’ve done.
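The chain of multiplications behind the 22,000x figure is easy to lose track of, so here it is spelled out. Every input is one of the assumptions stated above; the variable names are made up for this sketch.

```python
# Inputs taken from the assumptions stated in the text.
HEN_YEARS_PER_DOLLAR = 40      # middle ground between Open Phil's 120 and 20
HEN_QALYS_PER_HEN_YEAR = 1.0   # value of moving a hen from battery cage to cage-free
HEN_TO_HUMAN_WEIGHT = 0.5      # hen consciousness valued at half of human
AMF_DOLLARS_PER_QALY = 100     # rough AMF benchmark
AMF_VS_GIVEDIRECTLY = 11       # AMF is ~11x GiveDirectly

# Human-equivalent QALYs bought per dollar of corporate campaigning.
campaign_qalys_per_dollar = (
    HEN_YEARS_PER_DOLLAR * HEN_QALYS_PER_HEN_YEAR * HEN_TO_HUMAN_WEIGHT
)

# AMF buys 1/100 QALY per dollar, so the ratio is QALYs per dollar times 100.
campaign_vs_amf = campaign_qalys_per_dollar * AMF_DOLLARS_PER_QALY
campaign_vs_givedirectly = campaign_vs_amf * AMF_VS_GIVEDIRECTLY

print(f"QALYs per dollar:    {campaign_qalys_per_dollar:,.0f}")
print(f"times better than AMF:          {campaign_vs_amf:,.0f}")
print(f"times better than GiveDirectly: {campaign_vs_givedirectly:,.0f}")
```

Changing `HEN_TO_HUMAN_WEIGHT` is the quickest way to see how much the headline number depends on the moral weight given to chickens.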
I created a Guesstimate sheet to get an idea of the variation in possible estimates. The σ here probably hovers around 1 order of magnitude.
Now that we have some numbers, let’s compare these different interventions.
Let’s define our scale such that GiveDirectly has an expected utility of 1, meaning 1 utility corresponds to 1 QALY per $1000 spent. AMF, with estimated utility 11, has posterior expected utility 6.8. And corporate campaigns, with estimated utility of 22,000 but high variance, have expected utility 148. This is still higher than AMF but nowhere close to 2000 times higher. If we’re too confident about corporate campaigns here and the σ is really 1.5 orders of magnitude, then the posterior expected value drops to 21.7.
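For anyone who wants to reproduce these posteriors, here is a minimal sketch. It assumes the update is a precision-weighted average of log10 values (prior median 1, prior σ of 1 order of magnitude) and reports 10 raised to the posterior log-mean. Strictly speaking that is the posterior median rather than the arithmetic mean of the lognormal, but it is the quantity that matches the figures quoted above. The function name is made up for this example.

```python
import math

def posterior_median(estimate, est_sigma, prior_sigma=1.0, prior_median=1.0):
    """Precision-weighted average of log10(prior median) and log10(estimate).

    Sigmas are in orders of magnitude (log10 units). Returns 10 ** posterior
    log-mean, which is the posterior median of the lognormal.
    """
    w_prior = 1.0 / prior_sigma ** 2
    w_est = 1.0 / est_sigma ** 2
    mean_log = (w_prior * math.log10(prior_median)
                + w_est * math.log10(estimate)) / (w_prior + w_est)
    return 10 ** mean_log

amf = posterior_median(11, est_sigma=0.5)
campaigns = posterior_median(22_000, est_sigma=1.0)
campaigns_less_robust = posterior_median(22_000, est_sigma=1.5)

print(f"AMF:                           {amf:.1f}")
print(f"corporate campaigns (σ = 1):   {campaigns:.0f}")
print(f"corporate campaigns (σ = 1.5): {campaigns_less_robust:.1f}")
```

The estimate's σ controls how far the posterior is pulled back toward the prior: the less robust the estimate, the smaller its weight in the average.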
How does this change as we vary σ for our prior? When we change it from 1 to 0.5, our expected utilities for AMF and corporate campaigns drop to 3.3 and 7.4, respectively, so the difference is now much smaller. If we use 0.5 for our prior and 1.5 for our corporate outreach estimate, its posterior drops to 2.7, a bit below our posterior for AMF. But I believe these values are fairly implausible, and the original estimate is more accurate. (A σ of 1 for cage-free campaigns might even be too high since the causal chain of impact is clear.)
If you apply a factor of 10 discount to the moral value of chickens, cage-free campaigns have a posterior of 45, still much better than AMF’s 6.8.
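A small sweep makes the sensitivity to the prior easier to see. As before, this is a sketch under the assumption that the update is a precision-weighted average in log10 space with prior median 1; the scenario labels and function name are made up for this example.

```python
import math

def posterior_median(estimate, est_sigma, prior_sigma, prior_median=1.0):
    """10 ** (precision-weighted average of the prior and estimate log10 values)."""
    w_prior = 1.0 / prior_sigma ** 2
    w_est = 1.0 / est_sigma ** 2
    mean_log = (w_prior * math.log10(prior_median)
                + w_est * math.log10(estimate)) / (w_prior + w_est)
    return 10 ** mean_log

# (raw estimate relative to GiveDirectly, sigma of the estimate)
SCENARIOS = {
    "AMF": (11, 0.5),
    "campaigns, sigma 1": (22_000, 1.0),
    "campaigns, sigma 1.5": (22_000, 1.5),
}

for prior_sigma in (1.0, 0.5):
    print(f"prior sigma = {prior_sigma}")
    for label, (estimate, est_sigma) in SCENARIOS.items():
        value = posterior_median(estimate, est_sigma, prior_sigma)
        print(f"  {label:<22} {value:7.1f}")
```

Halving the prior σ quadruples the prior's weight, which is why the corporate campaign posterior falls so much faster than AMF's.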
Without putting too much credence in this model, I think this suggests that The Humane League’s corporate campaigns are a substantially better bet than GiveWell top charities even after accounting for the fact that they’re less robust.
AI Safety
(This section contains a bit of a digression and isn’t really relevant unless you’re interested in AI safety.)
Let’s look at AI safety work, which has even higher estimated effectiveness and variance than corporate campaigns. Working off the framework created by the Global Priorities Project, we can estimate the mean and variance of AI safety work. Let’s use the following 80% confidence intervals:^{6}
Probability of AI-related extinction: (0.03, 0.3)
Size of AI safety community when AGI developed: (200, 10,000)
Effect of adding a researcher: (0.5, 3)
Bad scenarios averted by doubling research: (0.1, 0.7)
QALYs in the far future: (4x10^45, 6x10^51)
Cost per researcher: ($70,000, $150,000)
GiveDirectly cost per QALY: $1000
(I had a more complex method I used to estimate the QALYs in the far future–I’ll detail it in a future post. In short, it draws on Bostrom’s Astronomical Waste paper to make an educated guess about how many sentient beings would likely exist if we used the universe’s resources efficiently.)
Once we combine all these terms we get an interval of (5x10^38, 2x10^45). Then when we update our prior with this estimate we get a posterior expected value of about 342,000 QALYs per $1000.
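The text does not spell out how the terms combine, so the following is one reading rather than the exact calculation: treat each interval as the 80% interval of a lognormal, multiply the "benefit" terms, divide by community size and cost per researcher, and scale by GiveDirectly's $1000 per QALY. In log space the midpoints add (with sign) and the half-widths add in quadrature. The structure of `FACTORS`, the prior (median 1, σ of 1 order of magnitude), and all names are assumptions of this sketch.

```python
import math

Z80 = 1.2816  # z-score bounding the central 80% of a normal distribution

# (label, low, high, sign): sign +1 multiplies the estimate, -1 divides it.
FACTORS = [
    ("probability of AI-related extinction", 0.03, 0.3, +1),
    ("size of AI safety community", 200, 10_000, -1),
    ("effect of adding a researcher", 0.5, 3, +1),
    ("bad scenarios averted by doubling research", 0.1, 0.7, +1),
    ("QALYs in the far future", 4e45, 6e51, +1),
    ("cost per researcher ($)", 70_000, 150_000, -1),
]
GIVEDIRECTLY_DOLLARS_PER_QALY = 1000

def combine(factors, scale):
    """Return (log10 midpoint, log10 half-width) of the product of the factors."""
    mean_log = math.log10(scale)
    half_width_sq = 0.0
    for _label, low, high, sign in factors:
        lo, hi = math.log10(low), math.log10(high)
        mean_log += sign * (lo + hi) / 2       # midpoints add (or subtract)
        half_width_sq += ((hi - lo) / 2) ** 2  # widths add in quadrature
    return mean_log, math.sqrt(half_width_sq)

mean_log, half_width = combine(FACTORS, GIVEDIRECTLY_DOLLARS_PER_QALY)
low = 10 ** (mean_log - half_width)
high = 10 ** (mean_log + half_width)

# Update a lognormal prior (median 1, sigma 1 order of magnitude) on the estimate.
est_sigma = half_width / Z80
prior_sigma = 1.0
w_prior, w_est = 1 / prior_sigma ** 2, 1 / est_sigma ** 2
posterior = 10 ** (w_est * mean_log / (w_prior + w_est))

print(f"80% interval: ({low:.1e}, {high:.1e})")
print(f"posterior:    {posterior:,.0f} QALYs per $1000")
```

Under these assumptions the interval comes out very close to the one quoted above, and the posterior lands in the low hundreds of thousands, the same ballpark as the 342,000 figure. The remaining gap presumably reflects rounding or details of the original calculation that this sketch does not capture.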
This result is highly dependent on the size of the intervals. If you make the interval for the estimate of QALYs in the far future an order of magnitude bigger, the posterior expected value drops substantially—and if you make the interval smaller, the posterior increases significantly.
The result also highly depends on the prior, which is maybe the most difficult part of this model to get right. This area would benefit from further research.
On Using Quantitative Models
Quantitative models are dumb. But not using quantitative models is dumber.
Quantitative models are finicky and tend to produce a wide range of results depending on how you build them. But it does fundamentally matter how much good an intervention does, and we need to be able to prioritize interventions. Quantitative models may be the best way to accomplish this.
Well-constructed models allow us to formalize intuitions about how to weigh the evidence for different interventions. I have some sense that high cost-effectiveness estimates trade off against robustness: The Humane League (THL) might look more effective than AMF according to everyone’s estimates, but the evidence in its favor is a lot less robust. I personally have an intuition that THL is the better bet; some other people believe AMF is better. But I don’t have any good reason to trust my intuition here. A Bayesian expected value model could be wrong, but why should I expect it to be more wrong than my intuition?
We humans have the unfortunate problem that we trust our intuition and judge most everything against it. If my model disagrees with my intuition, I might throw out the model even if it’s well-constructed. But is that reasonable? Certainly intuitions are correct in many cases. On the other hand, they’re wrong a lot, too. There are quite a few cognitive biases that play a large role in cause selection, such as scope insensitivity and the absurdity heuristic.
Intuitions and quantitative models both have problems, but they can complement each other. Maybe you have good intuition about how to trade off donations between AMF and GiveDirectly, but your intuition on the value of more outlandish organizations is less useful–your intuition isn’t well trained for them so you shouldn’t expect it to be reliable. This is where models come in. You can build a quantitative model that fits your intuitions for the easy cases and then use it to help you make judgments in the challenging cases. I believe people ought to use quantitative models like this more often and take them more seriously.
The last time I made a large donation, I put a lot of thought into how to trade off cost-effectiveness estimates against robust evidence. I didn’t use a solid quantitative model, though, and now believe that was a mistake. I originally said I was indifferent between $1 to THL and $2 to AMF, based on my intuition about how much it mattered that the evidence for THL wasn’t that strong. But if I use the Bayesian model described in the previous section, I get that $1 to THL should be worth more like $10 to AMF. Maybe my model has issues, but it’s the strongest thing I have to go on. And I can make my model stronger by spending more time trying to approximate the σ on different interventions.
When our intuitions differ, it can be hard to productively argue about that. But suppose we use a quantitative model; then we can have a more directed discussion about what the value of σ should be. It’s still not going to be perfectly clear, but we can at least point to facts: “I think your σ is wrong because intervention X is similar to intervention Y, but your posteriors for X and Y are really different” or “You’re underestimating the variation of the inputs so your σ should be higher.”
It probably makes more sense to compute a cost-effectiveness estimate, make your best guess about the variance, and plug these into a Bayesian model than to try to estimate a posterior based on your intuition. Human brains are notoriously bad at intuiting Bayesian probability.
How I’m Changing My Mind
My previous tradeoffs between animal welfare charities and AMF were not large enough. To justify the claim that THL is only twice as good as AMF, you’d have to either massively discount the value of nonhuman animals or assume that our estimates for THL’s cost-effectiveness have a much higher variance than I believe is reasonable.
I’m less certain about far-future considerations, such as on AI risk or other existential risks. It’s harder to tell whether the quantitative model is producing useful results because it’s challenging to have good intuitions about what results should look like. But then, that’s exactly when a quantitative model is most useful: when intuitions and informal estimates don’t work well anymore. Still, I plan on making my model more rigorous before I incorporate it into any serious decision-making.
I updated my model of REG’s cost-effectiveness, including its update on money moved. It actually looks better than it did before because it’s raised more money for MIRI and other non-global-poverty organizations.
I should use a Bayesian model to systematically discount REG’s fundratio. If I give the prior a mean of 3 and σ 0.75^{7}, and take REG’s estimated fundratio at 10 and σ at 0.5, that gives it a posterior estimated fundratio of 7:1. If I discount the weighted fundratio by 7/10, that gives a final posterior weighted fundratio of 1.6:1.
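The 7:1 figure can be checked the same way as the charity posteriors. This sketch assumes the prior "mean of 3" is the lognormal's median and that both σ values are in orders of magnitude (log10 units); those readings, and the function name, are assumptions of this example.

```python
import math

def posterior_median(estimate, est_sigma, prior_median, prior_sigma):
    """10 ** (precision-weighted average of the prior and estimate log10 values)."""
    w_prior = 1.0 / prior_sigma ** 2
    w_est = 1.0 / est_sigma ** 2
    mean_log = (w_prior * math.log10(prior_median)
                + w_est * math.log10(estimate)) / (w_prior + w_est)
    return 10 ** mean_log

reg_fundratio = posterior_median(
    estimate=10, est_sigma=0.5,        # REG's estimated fundratio and its sigma
    prior_median=3, prior_sigma=0.75,  # prior over fundratios
)
discount = reg_fundratio / 10          # the factor applied to the weighted fundratio

print(f"posterior fundratio: {reg_fundratio:.1f}:1")
print(f"implied discount:    {discount:.2f}")
```

With these inputs the posterior comes out just under 7:1, so the discount applied to the weighted fundratio is roughly 7/10.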
(I don’t currently have a stance on whether I should donate to REG; I have a long list of things I want to learn before making my next big donation.)
The quantitative model I discuss in this piece is pretty limited, and should be considered a rough draft. I’m currently developing a stronger model that I will actually use for making donation decisions, and plan on writing future posts describing the details of the improved model. Feedback and suggestions are appreciated, and will help me develop a better model that I hope other people can use.
Discussion Questions
 Would you use a quantitative model like this as a major input into your decision-making?
 For people who care a lot about having robust estimates but don’t trust Bayesian models like the one I used, why don’t you trust them? Why do you believe your decisions are better without them, or what can your reasoning add to a model that a model can’t account for?
 For people who reason about this sort of thing based on intuition, have I misrepresented your position? Do you believe your reasoning process is more reliable than I’m saying it is? Why?
Notes

1. ACE and Open Phil use different methodologies for their estimates, which explains a lot of the difference here. ACE’s own estimate of the effectiveness of cage-free campaigns puts it at a lower expected value than online ads. I believe most of this comes from the fact that ACE puts a much lower number on the subjective quality-of-life difference between caged and cage-free hens. These estimates were published in 2015 and may change in the future.
2. I use QALYs as a proxy for whatever it is we actually care about, since it’s a reasonably good metric that lots of people use.
3. I don’t actually expect that AMF does as much good as this statistic suggests because of population ethics issues, but it’s convenient to use AMF’s “~$3000 per life saved” as a benchmark.
4. This equates to a σ of 1 * 10 / e = 3.68.
5. I don’t think these are plausible numbers if you look at the set of all possible actions, but if we assume that our reference class does not include all the obviously low-impact interventions (e.g. giving out free hugs, or going swimming, or any other arbitrary activity), I think this distribution is reasonable.
6. I’m using 80% confidence intervals here rather than standard deviations (i.e. 68% confidence intervals) because I’ve taken lots of calibration tests and I know I’m well-calibrated for 80% confidence intervals, but I haven’t tested myself on 68% CI’s.
7. Not sure what’s reasonable here. A σ of 0.5 seemed too low and 1 seemed too high.