In “Can Investors Simulate Leverage via Concentrated Stock Selection?”, I used the Ken French Data Library to show that concentrated value or momentum factor portfolios historically had both higher returns and higher risk than diversified value or momentum portfolios. Thanks to a newly released dataset, we can replicate this finding across 137 factors.
Jensen, Kelly & Pedersen (2021), “Is There a Replication Crisis in Finance?”, attempted to replicate 153 factors that supposedly predict stock performance. More importantly for our purposes, the authors publicly released their source code and data.
Previously, using the Ken French Data Library, I found that equal-weighted factor portfolios performed better than value-weighted portfolios. This performance could not be explained by factor exposure—equal-weighted portfolios had “alpha” on top of the value-weighted factor. (I use scare quotes because it’s not true alpha. It’s still a form of factor exposure, but it’s factor exposure that a value-weighted factor construction fails to capture.) Can we replicate this finding using the Jensen, Kelly & Pedersen dataset?
I filtered the dataset down to just those factors that had data for US stocks starting in 1960. The full dataset goes back to 1926, but many factors are impossible to calculate that far back because the necessary data doesn’t exist. I started in 1960 because we have enough data by that point to calculate the returns of almost all the factors (137 out of 153).
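The filtering step can be sketched roughly as follows. This is only an illustration, not the code I actually ran: the row layout and the field names (`factor`, `country`, `year`, `month`, `ret`) are hypothetical and do not necessarily match the column names in the Jensen, Kelly & Pedersen files.

```python
from collections import defaultdict

def factors_available_from(rows, start_year=1960, country="USA"):
    """Keep factors whose data begins on or before January of start_year.

    `rows` is an iterable of dicts with hypothetical keys:
    factor, country, year, month, ret.
    Returns {factor: [(year, month, ret), ...]} trimmed to start_year onward.
    """
    by_factor = defaultdict(list)
    for row in rows:
        if row["country"] == country:
            by_factor[row["factor"]].append((row["year"], row["month"], row["ret"]))

    kept = {}
    for name, obs in by_factor.items():
        obs.sort()  # chronological order: (year, month, ...)
        first_year, first_month, _ = obs[0]
        # Drop factors that cannot be computed as early as start_year.
        if (first_year, first_month) <= (start_year, 1):
            kept[name] = [o for o in obs if o[0] >= start_year]
    return kept

# Made-up rows purely for illustration (not real factor returns).
rows = [
    {"factor": "value", "country": "USA", "year": 1959, "month": 12, "ret": 0.01},
    {"factor": "value", "country": "USA", "year": 1960, "month": 1, "ret": 0.02},
    {"factor": "late_factor", "country": "USA", "year": 1972, "month": 1, "ret": 0.03},
    {"factor": "value", "country": "GBR", "year": 1960, "month": 1, "ret": 0.04},
]
kept = factors_available_from(rows)
```

In this toy input, only `value` survives: `late_factor` starts too late, and the non-US row is ignored.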
The hypothesis is that equal-weighted (EW) factors provide a stronger signal than value-weighted (VW) factors. If a particular VW factor has a positive return, then its EW equivalent should have a higher return. And if a VW factor has a negative return, then the EW version should have a lower return.
I tested this hypothesis in three ways:
- For each of the 137 factors, does EW have a larger absolute value of return than VW, and with the correct sign?
- For each of the 137 factors, does EW have a larger absolute value of risk-adjusted return?
- Does EW provide a stronger signal if we restrict to just those factors where VW has a risk-adjusted return of at least 0.3?
My reasoning for the third test is that factors with small risk-adjusted returns might not have any predictive power; they’re more likely to be noise. If we restrict our dataset to just the value-weighted factors with good performance (taking 0.3 as an arbitrary threshold for “good” [1]), then we should expect their equal-weighted equivalents to have even better performance.
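To make “stronger signal” concrete, here is a minimal sketch of the three tests. It rests on two assumptions that are not spelled out above: that risk-adjusted return means annualized mean return divided by annualized standard deviation (computed from monthly returns), and that “stronger” means EW is further from zero than VW in the direction of VW’s sign. The function names are mine, chosen for illustration.

```python
import math
import statistics

def risk_adjusted_return(monthly_returns):
    """Annualized mean over annualized volatility (an assumed definition)."""
    annual_mean = statistics.mean(monthly_returns) * 12
    annual_vol = statistics.stdev(monthly_returns) * math.sqrt(12)
    return annual_mean / annual_vol

def signal_gain(ew, vw):
    """Direction-adjusted difference between EW and VW.

    Positive when EW has the same sign as VW and a larger absolute value;
    negative when EW is weaker or points the wrong way.
    """
    return math.copysign(1.0, vw) * (ew - vw)

def ew_is_stronger(ew, vw):
    return signal_gain(ew, vw) > 0

def passes_threshold(vw_risk_adjusted, threshold=0.3):
    """Third test: only keep factors whose VW version clears the threshold."""
    return vw_risk_adjusted >= threshold

# Made-up numbers, just to show the sign handling.
checks = [
    ew_is_stronger(0.08, 0.05),    # VW positive, EW higher
    ew_is_stronger(-0.06, -0.02),  # VW negative, EW lower
    ew_is_stronger(0.03, 0.05),    # VW positive, EW weaker
    ew_is_stronger(0.04, -0.02),   # EW has the wrong sign
]
```

The first two tests apply `ew_is_stronger` to raw and risk-adjusted returns respectively; the third applies it only to factors where `passes_threshold` holds for the VW version.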
I calculated the returns (or risk-adjusted returns) for every factor and took the difference between EW and VW. Then I calculated the following summary statistics:
- number of eligible factors
- number of factors for which the EW formulation provided a stronger signal than the VW version
- mean difference between EW and VW factors
- standard error of differences
- t-statistic for the null hypothesis that the mean equals zero
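The summary statistics listed above can be computed along these lines. This is a sketch under one assumption: each difference is oriented so that a positive value means EW gave the stronger signal. The input numbers are invented for illustration and are not results from the dataset.

```python
import math
import statistics

def summarize(differences):
    """Summary statistics for per-factor EW-minus-VW differences.

    Assumes each difference is oriented so that positive means
    "EW gave the stronger signal".
    """
    n = len(differences)
    mean_diff = statistics.mean(differences)
    # Standard error of the mean, using the sample standard deviation.
    std_err = statistics.stdev(differences) / math.sqrt(n)
    return {
        "n_factors": n,
        "n_stronger": sum(1 for d in differences if d > 0),
        "mean_diff": mean_diff,
        "std_err": std_err,
        # t-statistic for the null hypothesis that the mean difference is zero.
        "t_stat": mean_diff / std_err,
    }

# Invented differences for five hypothetical factors.
example = summarize([0.02, 0.01, -0.01, 0.03, 0.00])
```

This treats the factors as independent observations, which is a simplification: many factors are correlated with each other, so the true uncertainty is likely larger than the standard error suggests.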
|Absolute Return||Risk-Adjusted Return||Risk-Adjusted (> 0.3)|
In all three cases, EW provided a stronger signal for a large majority of factors, and all three differences had extremely high t-statistics. The t-statistic was highest when we restricted the sample to only factors with a risk-adjusted return of 0.3 or higher.
This supports my previous finding that, before fees and transaction costs, EW factor portfolios outperform VW portfolios.
Two important caveats:
- The Jensen, Kelly & Pedersen dataset only includes long/short factors. It’s possible that most of the outperformance of equal-weighted factors happens on the short side. This was not the case in the Ken French data, so it stands to reason that it’s not the case for most of these new factors, either. But we can’t say for sure without more data.
- This analysis does not account for fees and transaction costs. Equal-weighted portfolios incur higher costs, and they’re prohibitively expensive for sufficiently large investors (with perhaps $1 billion or more). Small investors can invest in equal-weighted portfolios without paying much in transaction costs. I discussed this in more detail previously.
Disclaimer: I am not an investment advisor and this should not be taken as investment advice. This content is for informational purposes only. Past performance is not a guarantee of future results. Any given portfolio results are hypothetical and do not represent returns achieved by an actual investor.
[1] I chose 0.3 as the threshold because it’s a bit less than the risk-adjusted return of the US stock market minus T-bills. My reasoning is that if a factor performed close to as well as the stock market itself, then it’s a pretty good factor (at least ex-post).