Caffeine Cycling Self-Experiment

I conducted an experiment on myself to see if caffeine loses any of its potency when I take it three days a week. The results suggest that it doesn’t. Caffeine had just as big an effect at the end of my four-week trial as it did at the beginning.

This outcome is statistically significant (p = 0.016), but the data show a weird pattern: caffeine’s effectiveness went up over time instead of staying flat. I don’t know how to explain that, which makes me suspicious of the experiment’s findings.

Posted on Apr 11, 2024

How well did Scott Alexander's list of social science findings hold up?

In 2012, Scott Alexander defended social sciences against the claim that they can’t figure anything out. He gave a long list of well-established findings across a variety of social science disciplines.

12 years later, how well did that list hold up?

Posted on Apr 08, 2024

Explicit Bayesian Reasoning: Don't Give Up So Easily

Recently, Saar Wilf, creator of Rootclaim, had a high-profile debate against Peter Miller on whether COVID originated from a lab. Peter won and Saar lost.

Rootclaim’s mission is to “overcome the flaws of human reasoning with our probabilistic inference methodology.” Rootclaim assigns odds to each piece of evidence and performs Bayesian updates to get a posterior probability. When Saar lost the lab leak debate, some people considered this a defeat not just for the lab leak hypothesis, but for Rootclaim’s whole approach.
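To make "assign odds to each piece of evidence and update" concrete, here is a minimal sketch of the odds form of Bayes' rule. It is not Rootclaim's actual methodology. It assumes the pieces of evidence are conditionally independent given each hypothesis, and the function name and all the numbers are hypothetical, chosen only for illustration.

```python
from math import prod


def posterior_probability(prior_prob, likelihood_ratios):
    """Update a prior with likelihood ratios using the odds form of Bayes' rule.

    Each likelihood ratio is P(evidence | hypothesis) / P(evidence | alternative).
    Assumption: the pieces of evidence are conditionally independent, so their
    likelihood ratios can simply be multiplied together.
    """
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    # Convert odds back into a probability.
    return posterior_odds / (1 + posterior_odds)


# Hypothetical inputs: a 50% prior and three pieces of evidence, two that
# favor the hypothesis (ratio > 1) and one that counts against it (ratio < 1).
result = posterior_probability(0.5, [4.0, 0.5, 3.0])
print(f"posterior probability: {result:.3f}")
```

The arithmetic is the easy part. The hard part, as the quoted criticism below points out, is deciding which likelihood ratios to assign and whether the independence assumption is even roughly true.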

In Scott Alexander’s coverage of the debate, he wrote:

While everyone else tries “pop Bayesianism” and “Bayes-inspired toolboxes”, Rootclaim asks: what if you just directly apply Bayes to the world’s hardest problems? There’s something pure about that, in a way nobody else is trying.

Unfortunately, the reason nobody else is trying this is because it doesn’t work. There’s too much evidence, and it’s too hard to figure out how to quantify it.

Don’t give up so easily! We as a society have spent approximately 0% of our collective decision-making resources on explicit Bayesian reasoning. Just because Rootclaim used Bayesian methods and then lost a debate doesn’t mean those methods will never work. That would be like saying, “randomized controlled trials were a great idea, but they keep finding that ESP exists. Oh well, I guess we should give up on RCTs and just form beliefs using common sense.”

(And it’s not even like the problems with RCTs were easy to fix. Scott wrote about 10 known problems with RCTs and 10 ways to fix them, and then wrote about an RCT that fixed all 10¹ of those problems and still found that ESP exists. If we’re going to give RCTs more than 10 tries, we should extend the same courtesy to Bayesian reasoning.)

I’m optimistic that we can make explicit Bayesian analysis work better. And I can already think of ways to improve on two problems with it.

Posted on Apr 03, 2024

Does Caffeine Stop Working?

If you take caffeine every day, does it stop working? If it keeps working, how much of its effect does it retain?

There are many studies on this question, but most of them have severe methodological limitations. I read all the good studies (on humans) I could find. Here’s my interpretation of the literature:

Caffeine almost certainly loses some but not all of its effect when you take it every day.
In expectation, caffeine retains 1/2 of its benefit, but this figure has a wide credence interval.
The studies on cognitive benefits all have some methodological issues so they might not generalize.
There are two studies on exercise benefits with strong methodology, but they have small sample sizes.

Posted on Mar 29, 2024

Avoiding Caffeine Tolerance

Summary

Caffeine improves cognition¹ and exercise performance². But if you take caffeine every day, over time it becomes less effective.

What if instead of taking caffeine every day, you only take it intermittently—say, once every 3 days? How often can most people take caffeine without developing a tolerance?

The scientific literature on this question is sparse. Here’s what I found:

Experiments on rats found that rats who took caffeine every other day did not develop a tolerance. There are no experiments on humans. There are no experiments that use other intermittent dosing frequencies (such as once every 3 days).
Internet forum users report that they can take caffeine on average once every 3 days without developing a tolerance. But there’s a lot of variation between individuals.

This post will cover:

The motivation for intermittent dosing
A review of the experimental research on the effect of taking caffeine intermittently (TLDR: there’s almost no experimental research)
A review of self-reports from the online nootropics community
Intermittent dosing vs. taking caffeine every day

Posted on Mar 02, 2024

Utilitarianism Isn't About Doing Bad Things for the Greater Good. It's About Doing the Most Good

In the eyes of popular culture (and in the eyes of many philosophy professors), the essence of utilitarianism is “it’s okay to do bad things for the greater good.” In my mind, that’s not the essence of utilitarianism. The essence is, “doing more good is better than doing less good.”

Utilitarianism is about doing the most good. You don’t do the most good by fretting over weird edge cases where you can harm someone to help other people. You do the most good by picking up massive free wins like donating to effective charities where money does 100x more good than it would if you spent it on yourself.

(Richard Y. Chappell might call this beneficentrism: “the view that promoting the general welfare is deeply important, and should be amongst one’s central life projects.” You can be a beneficentrist without being a utilitarian, but if you’re a utilitarian, you have to be a beneficentrist, and as a utilitarian, being a beneficentrist is much more important than being a “do bad things for the greater good”-ist.)

Posted on Aug 15, 2023

The United States Is Weird

The United States is exceptional along many dimensions. Sometimes, people pick two of these dimensions and try to argue that one causes the other. And that’s probably true sometimes. But “the USA is #1 in the world on dimension X, and #1 on dimension Y” isn’t much evidence that X causes Y.

The United States has:

the 3rd largest population, and the largest population of any first-world country (by far)
the 3rd or 4th largest land area (depending on how you measure USA’s and China’s land)
the highest GDP of any country
the highest median income, the 8th highest GDP per capita (PPP), and the highest GDP per capita of any large country (the top 7 countries combined have a lower population than California)
unusually high income inequality for a developed country
the highest healthcare expenditure per capita
the highest gun ownership per capita (with double the gun ownership of the #2 country)
an unusually high homicide rate for a developed country
the most Nobel Prize winners (by a huge margin) and most Fields Medalists (narrowly beating France)
the highest obesity rate of any large country, and the 11th highest overall
the most top universities (whatever that means) by a wide margin
unusually low life expectancy for a developed country
an unusually high fertility rate for a developed country
the 2nd most exports (after China) and the most imports (China is #2)
the most military expenditures (by a factor of 3) and the 2nd most nuclear weapons

(Those were just the examples I could come up with in an hour of research.)

(I also looked at a few stats where I thought the USA might be exceptional, but it turned out not to be: IQ, educational attainment, infant mortality¹, and net immigration.)

A lot of these facts are clearly intertwined—the fact that the US has the highest GDP is related to the facts that it has the highest military expenditures, the highest healthcare expenditures, and the 2nd highest exports. (But they’re not fully intertwined, because the US still has high military and healthcare expenditures relative to GDP.)

For other facts, you can come up with narratives as to why they’re related—maybe the high obesity rate causes the low life expectancy, maybe high gun ownership causes the (relatively) high homicide rate. But maybe not. The United States is weird and I don’t have a great handle on why it’s weird (and, as far as I know, nobody else does either). Until someone comes up with a Grand Theory of National Weirdness, I’m reluctant to pick two ways in which the USA is weird and claim one causes the other.

Notes

Some sources say the USA has high infant mortality. I didn’t look into this much, but the CIA World Factbook claims that the USA defines infant mortality more broadly than most countries, and if you adjust for this, infant mortality looks similar to most developed countries. ↩

Posted on Aug 15, 2023

Should Patient Philanthropists Invest Differently?

TLDR: No.

Confidence: Somewhat likely.

Summary

Some philanthropists discount the future much less than normal people. For philanthropists with low discount rates, does this change how they invest their money? Can they do anything to take advantage of other investors’ high time discounting?

We can answer this question in two different ways.

Should low-discount philanthropists invest differently in theory? No. [More]

Should low-discount philanthropists invest differently in practice? The real world differs from the standard theoretical approach in a few ways. These differences suggest that low-discount philanthropists should favor risky and illiquid investments slightly more than high-discount investors do. But the difference is too small to matter in practice. [More]

Posted on Sep 14, 2022

Two Types of Scientific Misconceptions You Can Easily Disprove

Last updated 2023-03-22 to add another example.

There are two opposite types of scientific claims that are easy to prove wrong: claims that never could have been proven in the first place, and claims that directly contradict your perception.

Type 1: The unknowable misconception

A heuristic for identifying scientific misconceptions: If this were true, how would we know?

Example: “People swallow 8 spiders per year during sleep.” If you were a scientist and you wanted to know how often people swallow spiders, how would you figure it out? You’d have to do a sleep study for thousands of person-nights where you film people sleeping using a night vision camera that’s so high-quality that it can pick up something as small as a spider (which, as far as I know, doesn’t exist) and then pore over the tens of thousands of hours of footage by hand to look for spiders (because this factoid originated in a time when computer software wasn’t sophisticated enough to do it for you) and track the locations of the spiders and count how often they crawl into people’s mouths without coming back out. This is all theoretically possible but it would be insanely expensive and who would be crazy enough to do it?

Example: “The average man thinks about sex once every 7 seconds.” People can’t even introspect on their own thoughts on a continuous basis, so how would scientists do it? This one seems simply impossible to prove, regardless of how big your budget is or how crazy you are.

Example: “Only 7% of communication is verbal, and 93% is nonverbal.” What does that even mean? How would a scientific study quantify all the information that two people transmit during a conversation and measure its informational complexity, and then conclude that 93% is nonverbal? You can kind of measure the information in words by compressing the text, but there’s no known way to accurately measure the information in nonverbal communication.
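As a rough illustration of "measuring the information in words by compressing the text," the sketch below uses Python's standard `zlib` module. The compressed size is only a crude upper bound on information content, not a true measure, and the helper names and sample sentence are hypothetical. There is no analogous function for nonverbal cues, which is the point.

```python
import zlib


def raw_bits(text: str) -> int:
    """Size of the text in bits when stored as plain UTF-8."""
    return 8 * len(text.encode("utf-8"))


def compressed_bits(text: str) -> int:
    """Size of the text in bits after zlib compression at the highest level.

    This is a crude upper bound on the text's information content: a better
    compressor (or a better model of English) would produce a smaller number.
    """
    return 8 * len(zlib.compress(text.encode("utf-8"), 9))


# Hypothetical sample: highly repetitive text carries little information
# per character, so it should compress to far fewer bits than its raw size.
repetitive = "I am fine, thanks. " * 100

print("raw bits:       ", raw_bits(repetitive))
print("compressed bits:", compressed_bits(repetitive))
```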

(This factoid does come from an actual study, but what the study actually showed¹ was that, among a sample of 37 university psychology students doing two different simple communication tasks, when verbal and nonverbal cues conflicted, they preferred the nonverbal cue 93% of the time.)

Type 2: The perception-contradicting misconception

You can disprove some common misconceptions using only your direct perception.

Example: “Certain parts of the tongue can only detect certain tastes.” You can easily disprove this by placing food on different spots on your tongue.

Example: “You need two eyes to have depth perception.” Close one eye. Notice how you still have depth perception. Or look at a photograph, which was taken from a fixed perspective. Notice how you can still detect depth in the photograph.

You need two points of reference to get “true” depth perception, but the human visual cortex can almost always infer depth based on how a scene looks. It’s possible to trick your depth perception, like this:

But in almost all real-world situations, you can correctly perceive depth with only one eye.

Example: “You (only) have five senses.” You can prove that you have at least two more senses: balance (equilibrioception) and the position of your limbs relative to each other (proprioception).

You can prove you have a sense of balance by closing your eyes and walking without falling over. You’re not using any of the standard five senses, but you can still stay upright.

You can prove you have proprioception by closing your eyes, flailing your arms around until they’re in random positions, and then bringing your hands together until your fingertips touch. You couldn’t do this if you didn’t have a proprioceptive sense.

(Some scientists say we have more than seven senses, but the other ones are harder to prove.)

So far I’ve given examples of factoids you can trivially disprove in five seconds. There are more misconceptions you can disprove if you’re willing to do a tiny bit of work. Example: “Women have one more rib than men.” Find a friend of the opposite gender and count your ribs!

Philip Yaffe (2011). The 7% Rule: Fact, Fiction, or Misunderstanding? ↩

Posted on Sep 07, 2022

Most Theories Can't Explain Why Game of Thrones Went Downhill

I’ve heard people repeat a few theories for why Game of Thrones started so well and ended so badly. Most of these theories don’t make sense.

Theory 1: David & Dan are good at adapting books, but they didn’t know what to do when they ran out of book.

Seasons 1 through 4, which adapted the first three books of A Song of Ice and Fire, are up there with the greatest television shows ever made. Season 5, which adapted the fourth book and part of the fifth book, was mediocre. If this theory were true, season 5 should have been on par with the earlier seasons, but it wasn’t.

Furthermore, season 6 was better than season 5, even though season 5 was still based on the books, and season 6 wasn’t.

Even more, David & Dan wrote some excellent original content in the earlier seasons, such as the extended arc with Arya and the Hound in season 4 (see this scene, which wasn’t in the books).

Some people say, well, they know how to write short scenes, but they don’t know how to write story arcs. Then how do you explain the famously terrible dialogue in the later seasons?

Theory 2: David & Dan were always bad showrunners.

I hear this one a lot. There’s some evidence for this theory—prior to Game of Thrones, David was best known for writing X-Men Origins: Wolverine (a famously bad movie), and Dan had no prior writing credits. But if they’re bad showrunners, why were the first four seasons so good? I can buy that bad showrunners might accidentally create a pretty good show, but I don’t see how they could accidentally create one of the best shows of all time.

Theory 3: David & Dan lost interest and started phoning it in.

This explanation makes more sense because it can explain the nearly-monotonic decline in quality. But it still can’t explain why season 6 was better than season 5. And the timing doesn’t entirely work out—people usually say this about seasons 7 and 8, but season 5 was clearly worse than the previous four seasons, and it seems less plausible that they’d lose interest that early on.

Theory 4: Good writing emerges through a mysterious process that no one really understands.

This is my favorite theory. Many occasionally-great writers can’t consistently replicate their success, writers can’t tell which of their works will become popular, and nobody fully understands what makes great writing. That’s why, for example, Jane Austen thought Pride and Prejudice was her worst book, even though it’s what she’s most remembered for. Or why The Matrix is my favorite movie of all time, even though I like ~~zero (0)~~ one (1)¹ other Wachowski movie. (The Wachowskis are another example of artists who occasionally produce brilliant works and most of the time don’t, and it’s not clear why.)

Or why people used to talk about good art coming from a muse—you didn’t write that brilliant story, you just wrote down the words that your muse gave you, which is just a poetic way of saying you have no idea how you came up with it.

This is kind of a non-explanation: “the reason Game of Thrones was inconsistently good is because lots of things are inconsistently good and we don’t know why.” But at least it turns a localized mystery into a much bigger mystery about the general nature of creativity.

Edit 2023-09-04: Originally I wrote zero, but I just remembered that the Wachowskis co-wrote V for Vendetta, which I enjoyed. This is an irrelevant minor detail but I am committed to factual accuracy even when it doesn’t matter. ↩

Posted on Sep 05, 2022