Contra "Time Series Momentum: Is It There?"

Summary

Time series momentum (TSMOM) is an investment strategy that involves buying assets whose prices are trending upward and shorting assets whose prices are trending downward. In 2012, Moskowitz, Ooi & Pedersen published Time Series Momentum[1]. They analyzed a simple version of the strategy that buys assets with positive 12-month returns and shorts assets with negative 12-month returns. They found that the strategy had statistically significant outperformance in equity indexes, currencies, commodities, and bond futures from 1985 to 2009.
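
As a concrete illustration, here is a toy version of the rule in pandas. This is my own sketch, not MOP’s exact implementation; in particular, they also scale each position by the asset’s ex-ante volatility, which I omit here.

```python
import numpy as np
import pandas as pd

def tsmom_returns(monthly_prices: pd.DataFrame) -> pd.Series:
    """Toy TSMOM backtest. `monthly_prices` has one column per
    asset and one row per month-end price."""
    trailing_12m = monthly_prices.pct_change(12)  # past 12-month return
    positions = np.sign(trailing_12m).shift(1)    # +1 long, -1 short; lagged so trades use only past data
    next_month = monthly_prices.pct_change()      # realized one-month returns
    return (positions * next_month).mean(axis=1)  # equal-weight across assets
```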

However, others have raised doubts. Huang, Li, Wang & Zhou (henceforth HLWZ) criticized the strategy in Time Series Momentum: Is It There? (2020)[2], concluding that the evidence for TSMOM is not statistically reliable.

Some of their criticisms have merit, but TSMOM remains an appealing strategy.

The abstract of Time Series Momentum: Is It There? reads:

Time series momentum (TSMOM[3]) refers to the predictability of the past 12-month return on the next one-month return and is the focus of several recent influential studies. This paper shows that asset-by-asset time series regressions reveal little evidence of TSMOM, both in- and out-of-sample. While the t-statistic in a pooled regression appears large, it is not statistically reliable as it is less than the critical values of parametric and nonparametric bootstraps. From an investment perspective, the TSMOM strategy is profitable, but its performance is virtually the same as that of a similar strategy that is based on historical sample mean and does not require predictability. Overall, the evidence on TSMOM is weak, particularly for the large cross section of assets.

To rephrase, HLWZ make two central arguments:

  1. Moskowitz, Ooi & Pedersen (2012)[1] did a pooled regression and found a statistically significant correlation, but this methodology is flawed: it finds a strong correlation even when time series momentum cannot predict future prices. (See the regression sketch after this list.)
  2. TSMOM performed similarly to a strategy that simply buys assets with positive long-run historical returns and shorts assets with negative historical returns. The authors call this strategy Time Series History or TSH.
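
To make the first argument concrete, the regression at issue has roughly this shape (my paraphrase of the setup described in the abstract; MOP’s actual specification also scales returns by ex-ante volatility):

$$ r^{i}_{t} = \alpha + \beta \, r^{i}_{t-12,t} + \varepsilon^{i}_{t} $$

where $r^{i}_{t}$ is asset $i$’s return in month $t$ and $r^{i}_{t-12,t}$ is its trailing 12-month return. Asset-by-asset regressions estimate a separate $\beta$ per asset; a pooled regression stacks all asset-months into one sample and estimates a single $\beta$. HLWZ’s bootstrap results indicate that the pooled t-statistic can exceed conventional critical values even when there is no predictability.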

My responses to the two arguments:

  1. I agree that a pooled regression is flawed. But a statistically significant correlation on a pooled regression is not what convinced me that TSMOM works. [More]
  2. TSMOM and TSH indeed have similar(ish) historical returns. However:
    1. TSMOM’s positive performance cannot be explained by TSH alone. [More]
    2. TSMOM provided better diversification to an equity portfolio. [More]
    3. TSMOM has large unexplained returns when regressed on a Fama-French factor model (see the sketch after this list). [More]
    4. TSMOM still performed well on a much larger sample going back a century. [More]
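
On point 3, the standard way to quantify “unexplained returns” is the intercept (alpha) of a factor regression. Here is a minimal sketch, assuming monthly TSMOM excess returns and the Fama-French factors are already loaded as pandas objects; the function and variable names are mine, not from the post.

```python
import statsmodels.api as sm

def factor_alpha(strategy_excess_returns, ff_factors):
    """Regress monthly strategy excess returns on Fama-French
    factors (e.g. Mkt-RF, SMB, HML). The intercept estimates
    the average return the factor model cannot explain."""
    X = sm.add_constant(ff_factors)
    fit = sm.OLS(strategy_excess_returns, X, missing="drop").fit()
    return fit.params["const"], fit.tvalues["const"]
```

A large, statistically significant intercept is what “large unexplained returns” means here.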

TSMOM looks strong in the historical data. TSMOM probably survives fees and trading costs, but the evidence for that is less clear. [More]

Continue reading

If AI alignment is only as hard as building the steam engine, then we likely still die

You may have seen the graph from Chris Olah illustrating a range of views on the difficulty of aligning superintelligent AI.

Evan Hubinger, an alignment team lead at Anthropic, says:

If the only thing that we have to do to solve alignment is train away easily detectable behavioral issues…then we are very much in the trivial/steam engine world. We could still fail, even in that world—and it’d be particularly embarrassing to fail that way; we should definitely make sure we don’t—but I think we’re very much up to that challenge and I don’t expect us to fail there.

I disagree; if governments and AI developers don’t start taking extinction risk more seriously, then we are not up to the challenge.

Continue reading

Alignment Bootstrapping Is Dangerous

AI companies want to bootstrap weakly superhuman AI to align superintelligent AI. I don’t expect them to succeed. I could give various arguments for why alignment bootstrapping is hard and why AI companies are ignoring the hard parts of the problem, but you don’t need to understand any details to know that it’s a bad plan.

When AI companies say they will bootstrap alignment, they are admitting defeat on solving the alignment problem, and saying that instead they will rely on AI to solve it for them. So they face a problem of unknown difficulty, one difficult enough that they don’t think they can solve it themselves. And to remedy this, they will use a technique never before used in history: counting on slightly superhuman AI to do the bulk of the work.

If they mess up and this plan doesn’t work, then superintelligent AI kills everyone.

And they think this is an acceptable plan, and it is acceptable for them to build up to human-level AI or beyond on the basis of this plan.

What?

Continue reading

We won't solve non-alignment problems by doing research

Introduction

Even if we solve the AI alignment problem, we still face non-alignment problems, which are all the other existential problems[1] that AI may bring.

People have written research agendas on various imposing problems that we are nowhere close to solving, and that we may need to solve before developing ASI. An incomplete list of topics: misuse; animal-inclusive AI; AI welfare; S-risks from conflict; gradual disempowerment; permanent mass unemployment; risks from malevolent actors; moral error.

The standard answer to these problems, the one that most research agendas take for granted, is “do research”. Specifically, do research in the conventional way where you create a research agenda, explore some research questions, and fund other people to work on those questions.

If transformative AI arrives within the next decade, then we won’t solve non-alignment problems by doing research on how to solve them.

Continue reading

Do Disruptive or Violent Protests Work?

Previously, I reviewed the five strongest studies on protest outcomes and concluded that peaceful protests probably work (credence: 90%).

But what about disruptive or violent protests?

Peaceful protests use nonviolent, non-disruptive tactics such as picketing and marches.

Disruptive protests use nonviolent, in-your-face tactics such as civil disobedience, sit-ins, and blocking roads.

Violent protests use violence.

There isn’t much evidence on the other two categories of protest. My best guesses are:

  • Violent protests probably don’t work. (credence: 80%)
  • Violent protests may reduce support for a cause, but it’s unclear. (credence: 40%)
  • For disruptive protests, it’s hard to say whether they have a positive or negative impact on balance. I’m about evenly split on whether a randomly-chosen disruptive protest is net helpful, neutral, or harmful.
  • A typical disruptive protest doesn’t work as well as a typical peaceful protest. (credence: 80%)
  • Peaceful protests are a better idea than disruptive protests. (credence: 90%)
Continue reading

Epistemic Spot Check: Expected Value of Donating to Alex Bores's Congressional Campaign

Political advocacy is an important lever for reducing existential risk. One way to make political change happen is to support candidates for Congress.

In October, Eric Neyman wrote Consider donating to Alex Bores, author of the RAISE Act. He created a cost-effectiveness analysis to estimate how donations to Bores’s campaign change his probability of winning the election. It’s excellent that he did that—it’s exactly the sort of thing that we need people to be doing.
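
Eric’s model presumably has more structure than this, but the basic shape of any such estimate is a marginal-impact calculation:

$$ \text{value per dollar} \approx \frac{\Delta P(\text{win})}{\Delta \$} \times V(\text{win}) $$

where the first factor is the change in win probability per marginal dollar donated and $V(\text{win})$ is the value assigned to Bores winning.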

We also need more people to check other people’s cost-effectiveness estimates. To that end, in this post I will check Eric’s work.

I’m not going to talk about who Alex Bores is, why you might want to donate to his campaign, or who might not want to donate. For that, see Eric’s post.

Continue reading

Writing Your Representatives: A Cost-Effective and Neglected Intervention

Is it a good use of time to call or write your representatives to advocate for issues you care about? I did some research, and my current (weakly-to-moderately-held) belief is that messaging campaigns are very cost-effective.

In this post:

Continue reading

Do Small Protests Work?

TLDR: The available evidence is weak. It looks like small protests may be effective at garnering support among the general public. Policy-makers appear to be more sensitive to protest size, and it’s not clear whether small protests have a positive or negative effect on policy-makers’ perception of a cause.

Previously, I reviewed evidence from natural experiments and concluded that protests work (credence: 90%).

My biggest outstanding concern is that all the protests I reviewed were nationwide, whereas the causes I care most about (AI safety, animal welfare) can only put together small protests. Based on the evidence, I’m pretty confident that large protests work. But what about small ones?

I can see arguments in both directions.

On the one hand, people are scope insensitive. I’m pretty sure that a 20,000-person protest is much less than twice as impactful as a 10,000-person protest. And this principle may extend down to protests that only include 10–20 people.

On the other hand, a large protest and a small protest may send different messages. People might see a small protest and think, “Why aren’t there more people here? This cause must not be very important.” So even if large protests work, it’s conceivable that small protests could backfire.

What does the scientific literature say about which of those ideas is correct?

Continue reading

My Third Caffeine Self-Experiment

Last year I did a caffeine cycling self-experiment and determined that I don’t get habituated to caffeine when I drink coffee three days a week. I did a follow-up experiment where I upgraded to four days a week and found that I still don’t get habituated.

For my current weekly routine, I have caffeine on Monday, Wednesday, Friday, and Saturday. Subjectively, I often feel low-energy on Saturdays. Is that because the caffeine I took on Friday is having an aftereffect that makes me more tired on Saturday?

When I ran my second experiment, I took caffeine on four days a week, including the three-day stretch of Wednesday through Friday. I found that my performance on a reaction time test was comparable between Wednesday and Friday. If my reaction time stayed the same after taking caffeine three days in a row, that’s evidence that I didn’t develop a tolerance over the course of those three days.

But if three days isn’t long enough for me to develop a tolerance, why is it that lately I feel tired on Saturdays, after taking caffeine for only two days in a row? Was the result from my last experiment incorrect?

So I decided to do another experiment to get more data.

This time I did a new six-week self-experiment where I kept my current routine, but I tested my reaction time every day. I wanted to test two hypotheses:

  1. Is my post-caffeine reaction time worse on Saturday than on Mon/Wed/Fri?
  2. Is my reaction time worse on the morning after a caffeine day than on the morning after a caffeine-free day?

The first hypothesis tests whether I become habituated to caffeine, and the second hypothesis tests whether I experience withdrawal symptoms the following morning.
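
The excerpt doesn’t show the statistics, but here is a minimal sketch of how the first hypothesis could be tested, assuming reaction-time scores are grouped by day type; the names are illustrative, not from my actual analysis code.

```python
from scipy import stats

def detectable_difference(saturday_scores, mwf_scores, alpha=0.05):
    """Welch's t-test comparing post-caffeine reaction times on
    Saturdays against Mon/Wed/Fri caffeine days. A p-value above
    `alpha` means no detectable difference at this sample size."""
    t_stat, p_value = stats.ttest_ind(saturday_scores, mwf_scores, equal_var=False)
    return p_value, p_value < alpha
```

The second hypothesis would use the same comparison, with mornings after caffeine days versus mornings after caffeine-free days.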

The answers I got were:

  1. No, there’s no detectable difference.
  2. No, there’s no detectable difference.

Therefore, in defiance of my subjective experience—but in agreement with my earlier experimental results—I do not become detectably habituated to caffeine on the second day.

However, it’s possible that caffeine habituation affects my fatigue even though it doesn’t affect my reaction time. So it’s hard to say for sure what’s going on without running more tests (which I may do at some point).

Continue reading
