January 17, 2019

Whither Fragility? Dual Momentum GEM

Corey Hoffstein of Newfound Research recently wrote an article called “Fragility Case Study: Dual Momentum GEM.” Corey starts out by saying that my dual momentum approach is the strategy he sees implemented most often among do-it-yourself tactical investors. He then says several investors bemoaned that GEM kept them invested in the stock market during the last quarter of 2018. It signaled them out of the S&P 500 at the beginning of January, after the market was already in a drawdown. This caused them to stop following the GEM signals as given.

Corey’s solution is to advocate the use of multiple lookback periods to reduce the chance of “bad luck.” He showed the performance of seven monthly lookback periods ranging from 6 to 12 months. He then presented a composite of seven GEM models, one for each lookback, instead of the usual single model with a 12-month lookback.
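To make the idea of a composite concrete, here is a minimal Python sketch. It is not Corey's code or the GEM model itself. The function names and the price series are hypothetical, and the trend test is simplified to "trailing return above zero" instead of the full dual momentum rules.

```python
def trailing_return(prices, lookback):
    """Total return over the last `lookback` months of a month-end price list."""
    return prices[-1] / prices[-1 - lookback] - 1


def composite_equity_weight(prices, lookbacks):
    """Fraction of lookback models that are currently 'in' (simplified rule)."""
    # Assumption: a model is 'in' when its trailing return is positive.
    signals = [1 if trailing_return(prices, lb) > 0 else 0 for lb in lookbacks]
    return sum(signals) / len(signals)


# Hypothetical month-end prices: a steady rise followed by a steady decline.
prices = [100, 102, 104, 106, 108, 110, 112, 111, 109, 107, 105, 103, 101]
lookbacks = list(range(6, 13))  # 6 through 12 months

print("12-month model weight:", composite_equity_weight(prices, [12]))
print("composite weight:     ", composite_equity_weight(prices, lookbacks))
```

With this made-up series, only the 12-month model is still invested, so the composite holds one-seventh of its equity position while the single 12-month model holds all of it.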

Corey argued that his approach would reduce specification risk. He says this is important because “performance differences due to model specification are not expected to mean revert and are therefore expected to be random but very permanent return artifacts.”

This may be true over the short-run. You cannot expect poor recent performance to be immediately followed by good performance. (We will ignore the fact that stocks are short-term mean reverting.) But neither can you expect poor performance to follow poor performance. Each monthly return from momentum investing is independent but with a positive expected value. Otherwise, you would not do momentum investing.

Expected Value

Say you flip a coin three times, and it comes up heads every time. You cannot say what the outcome will be over the next three tosses since they are independent. But the law of large numbers says that as you accumulate more coin tosses, your results should converge to the 50/50 expected value of each toss.
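A quick simulation can illustrate this convergence. The sketch below uses Python's standard random module with a fixed seed. The function name and the toss counts are my own choices for illustration.

```python
import random


def running_heads_fraction(num_tosses, p_heads=0.5, seed=0):
    """Return the fraction of heads observed after each simulated toss."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = 0
    fractions = []
    for toss in range(1, num_tosses + 1):
        if rng.random() < p_heads:
            heads += 1
        fractions.append(heads / toss)
    return fractions


fractions = running_heads_fraction(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>7,} tosses: {fractions[n - 1]:.4f} heads")
```

Early fractions can sit well away from 0.5, while the later ones should settle close to it.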

Let us say you want heads to appear and you have a fair coin with a 50/50 chance of heads coming up. You have another coin that has a 60% chance of heads coming up. You would always want to use the second coin. This is true even though your short-term results might seem random. You wouldn’t split your wagers between the two coins. You would choose the one that gives the best expected results. The same is true with investing. If you have an expected edge from a particular strategy, you should favor that strategy.
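The arithmetic behind the two-coin example can be checked in a few lines. This is only a sketch: the number of bets and the even split are hypothetical choices.

```python
def expected_wins(num_bets, p_win):
    """Expected number of wins from independent bets with win probability p_win."""
    return num_bets * p_win


num_bets = 100
fair, biased = 0.5, 0.6

# All 100 bets on the 60% coin versus 50 bets on each coin.
all_biased = expected_wins(num_bets, biased)
split = expected_wins(num_bets // 2, fair) + expected_wins(num_bets // 2, biased)

print("all bets on the 60% coin:", all_biased)
print("bets split evenly:       ", split)
```

Splitting the bets gives up expected wins on every toss placed with the fair coin.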

You might be able to smooth short-term volatility somewhat by using multiple lookback models, but at what opportunity cost? Betting red and black simultaneously in roulette will dramatically reduce your variability, but it is not a smart bet. You need to consider expected value as well as diversification.

Corey’s argument is that all lookbacks are equal and any differences among them are statistical noise. (He bases this on only 10 years of past data.) So you might as well pool them together and exploit the perceived benefits of diversification. But Corey’s choice of 6 to 12-month lookback periods shows selection bias. They are not random choices. These lookbacks have worked well in past momentum studies. They are also highly correlated, which undermines any statistical inference drawn from comparing them.

Novy-Marx (2016) in his “Testing Strategies Based on Multiple Signals” showed that combining the best candidate signals yields the same biases as those obtained using the single best candidate. [1] This means Corey's assessment of correlated and biased composite lookbacks versus a single best lookback period may not be statistically valid.

Of the seven lookback periods Corey used, only the 10-month one would have gotten you out of the S&P 500 before the December loss. The 8- and 9-month lookbacks would have kept you in during December and caused you to miss out on profits in November. The other three lookbacks would have given the same results as the 12-month lookback.

Corey's lookbacks also exhibit a bias in favor of short-term performance. As a result, his approach carries with it drawbacks that GEM now lacks: increased complexity, transaction costs, and whipsaws. To better answer the question of whether a 12-month lookback is desirable, let us look at the evidence.

History of the 12-Month Lookback

A 12-month lookback with U.S. stocks was first presented by Cowles & Jones in 1937. They tabulated the performance of all NYSE stocks from 1920 through 1935. After examining the data, they concluded that stocks that performed better over the past 12 months also outperformed over the following year. The 12-month lookback they identified has held up well in and out of sample, going both forward and backward in time, since 1937.

Greyserman & Kaminski (2014) showed that long/short absolute momentum with a 12-month lookback beat buy-and-hold back to the beginning of stock trading in the 1600s. It did better in all markets back to the year 1223!

I do not see how anyone can look at these results and think trend following momentum with a 12-month lookback is just good luck.

Lookback Period Comparisons

The first rigorous comparison of lookback periods was in Jegadeesh & Titman’s (1993) seminal momentum paper. They compared 3, 6, 9, and 12-month formation (lookback) and holding periods on U.S. stocks from 1965 through 1989.

We see a noticeable improvement in return and t-stats as we go from a 3 to a 12-month lookback period. Not only does a 12-month lookback show the best performance, but the steady improvement as we extend the lookback period from 3 to 12 months also supports its robustness.

Absolute (time series) momentum applied to multiple markets from 1985 through 2009 also showed a steady improvement in t-stats as the lookback period increased from 6 to 12 months.

Source: Moskowitz, Ooi, and Pedersen (2012), “Time Series Momentum”

GEM Results

Let us now look at GEM. Here are results using 3, 6, 9, and 12-month lookback periods and an equally weighted combination of these periods. The GAA benchmark is a global asset allocation of 45% S&P 500, 28% MSCI ACWI ex-U.S. or World ex-U.S., and 27% 5-Year Bonds. These weights represent the proportion of time GEM was in each of these markets since 1950.



                        GEM 12   GEM 9   GEM 6   GEM 3   Composite     GAA
CAGR (%)                  15.5    13.9    14.6    12.7        14.3     9.8
Standard Deviation (%)    11.6    11.4    10.9    11.0        10.2     9.9
Sharpe Ratio              0.95    0.83    0.93    0.76        0.95    0.58
Worst Drawdown (%)       -17.8   -20.7   -21.6   -23.3       -17.7   -41.2
Results are hypothetical, are NOT an indicator of future results, and do NOT represent returns that any investor actually attained. Indexes are unmanaged, do not reflect management or trading fees, and one cannot invest directly in an index. Please see our Disclaimer page for more information.

A 12-month lookback comes closest to that old Wall Street adage, "More money is made by sitting than by trading." Over these 68 years, it outperformed all the shorter lookbacks in CAGR, Sharpe ratio, and worst drawdown. It gave an increase of 120 basis points in annual return over the composite of lookback periods. Both had the same Sharpe ratio and nearly the same worst month-end drawdown. Corey also showed a higher return and equal Sharpe ratio from a 12-month lookback compared to a composite of lookbacks over the short 10-year period he examined.
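To see what 120 basis points a year can mean over 68 years, here is a small compounding sketch. It simply compounds the hypothetical CAGR figures reported above at a constant rate and ignores costs and taxes. The function name is my own.

```python
def growth_multiple(cagr_pct, years):
    """Ending value of 1 unit compounded at a constant annual rate."""
    return (1 + cagr_pct / 100) ** years


years = 68
gem_12 = growth_multiple(15.5, years)     # 12-month lookback CAGR
composite = growth_multiple(14.3, years)  # composite lookback CAGR
ratio = gem_12 / composite

print(f"terminal wealth ratio, GEM 12 vs. composite: {ratio:.2f}")
```

Under these assumptions the ratio comes out at roughly 2, meaning the 12-month model would have ended with about twice the wealth of the composite.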

So why not sacrifice 120 basis points in past annual return and use the composite since the Sharpe ratios and drawdowns are the same and short-term volatility is less? There are several reasons why you may not want to do that.

First are the higher transaction costs from more lookback models. There are 30% more trades for the composite of four lookbacks in my GEM backtest. There are more, not fewer, whipsaw losses using shorter lookbacks. A 12-month lookback is potentially more tax efficient and has no seasonal bias. Equal weighting a number of different lookback periods is a sensible thing to do if you think they may be equal in their expected future performance. I do not think the evidence supports that.

An Alternative

Those wanting to reduce short-term volatility can add a modest allocation to bonds instead of using multiple lookback models. This would be easier to do, less costly, and potentially more tax efficient. Results are also better. Here is the composite lookback model compared to simple GEM with a 10% allocation to 5-year bonds. It is in line with Warren Buffett's investment instructions for his estate: 90% S&P index fund and 10% short-term bonds.
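For readers who want to experiment with such a blend, here is a minimal sketch of how one might combine two return series. It assumes rebalancing back to the target weights every period, which may not match how the results here were calculated. The function name and the sample returns are hypothetical.

```python
def blend_returns(strategy_returns, bond_returns, bond_weight=0.10):
    """Blend two periodic return series, rebalancing to target weights each period."""
    if len(strategy_returns) != len(bond_returns):
        raise ValueError("return series must have the same length")
    return [
        (1 - bond_weight) * s + bond_weight * b
        for s, b in zip(strategy_returns, bond_returns)
    ]


# Hypothetical periodic returns, for illustration only.
strategy = [0.10, -0.20]
bonds = [0.01, 0.02]
blended = blend_returns(strategy, bonds)
print(blended)
```

In the losing period, the bond allocation softens the blended loss relative to the strategy held alone.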


                         Composite   GEM 90/10
CAGR (%)                      14.3        15.2
Standard Deviation (%)        10.2        10.4
Sharpe Ratio                  0.95        0.95
Worst Drawdown (%)           -17.7       -16.0

Outside diversification can reduce the impact of specification risk without harming the expected value of an investment model.

Investment Selection Metrics

Furthermore, Sharpe ratios may not be the best metric for making investment decisions. At the risk of getting too wonkish, let me explain. Sharpe ratios work well if returns are normally distributed or investors have a quadratic utility function.

But investment returns are not normally distributed, and quadratic utility implies increasing absolute risk aversion. This means investors will reduce the amount they have in risky assets as their wealth increases. This is an unrealistic assumption.

Those in the know prefer exponential utility that assumes a more realistic constant absolute risk aversion or logarithmic utility that assumes decreasing absolute risk aversion.
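For those who want to check these risk aversion claims, the standard Arrow-Pratt measure of absolute risk aversion can be worked out for each utility function. The parameter names a and b and the particular functional forms below are textbook conventions I am assuming for illustration.

```latex
\begin{aligned}
A(w) &= -\frac{u''(w)}{u'(w)} \\[4pt]
\text{Quadratic: } u(w) &= w - \tfrac{b}{2}\,w^{2},\quad b > 0,\ w < \tfrac{1}{b}
  &&\Rightarrow\quad A(w) = \frac{b}{1 - b w},\quad A'(w) = \frac{b^{2}}{(1 - b w)^{2}} > 0 \\[4pt]
\text{Exponential: } u(w) &= -e^{-a w},\quad a > 0
  &&\Rightarrow\quad A(w) = a \\[4pt]
\text{Logarithmic: } u(w) &= \ln w
  &&\Rightarrow\quad A(w) = \frac{1}{w},\quad A'(w) = -\frac{1}{w^{2}} < 0
\end{aligned}
```

So absolute risk aversion rises with wealth under quadratic utility, stays constant under exponential utility, and falls under logarithmic utility.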

Choosing an investment based on its compound annual growth rate (CAGR) is consistent with these utility functions. A 12-month lookback model has a higher CAGR than the other lookback models or a composite of those models, so it is the logical choice.
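The link between CAGR and logarithmic utility can be written out directly. The identity below covers only the logarithmic case. Here W_0 and W_T are starting and ending wealth, T is the number of years, and r_t is the return in year t (my notation, not from the studies cited).

```latex
\begin{aligned}
1 + \text{CAGR} &= \left(\frac{W_T}{W_0}\right)^{1/T} \\[4pt]
\ln\left(1 + \text{CAGR}\right) &= \frac{1}{T}\,\ln\frac{W_T}{W_0}
  = \frac{1}{T}\sum_{t=1}^{T} \ln\left(1 + r_t\right)
\end{aligned}
```

Because the logarithm is an increasing function, ranking strategies by CAGR over the same period is the same as ranking them by the log of ending wealth.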

Behavioral reactions to tail risk are not captured well by utility functions. Investors become more risk averse after steep drawdowns. This is why I also look at drawdown distributions. Once I find a tolerable level of expected downside exposure, I look at CAGRs.
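Worst drawdown is straightforward to compute from a series of month-end values. The following is a minimal sketch with a hypothetical function name and made-up values, not the code behind the tables in this post.

```python
def worst_drawdown(values):
    """Largest peak-to-trough decline in a value series, as a negative fraction."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)                # highest value seen so far
        worst = min(worst, value / peak - 1)   # decline from that peak
    return worst


# Hypothetical month-end portfolio values.
values = [100, 120, 90, 110, 130, 104]
print(f"worst drawdown: {worst_drawdown(values):.1%}")
```

Collecting every peak-to-trough decline, not only the worst one, would give the kind of drawdown distribution mentioned above.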

Practicality


It is much easier to use one rather than many lookback models. That is another reason why I chose only one lookback, the best one, for my publicly available GEM model.

No one can say with certainty what the future will be like. Process diversification can be beneficial if it is done selectively. Corey is correct in saying specification risk exists. But I disagree with him about different lookback periods being equally effective.

I use multiple lookbacks myself in the proprietary dual momentum models I license to investment advisors. But I do not indiscriminately combine them. I use different lookback periods when, where, and how it makes sense and the data supports it. I still rely considerably on a 12-month lookback for stock index decisions given the weight of evidence supporting it.

Better Informed Investors


To me, the most interesting idea in Corey’s article was that some investors and advisors overreact to short-term performance. Trend following models will never sell at the top nor buy at the bottom. They do not have to for investors to do well. There will always be noise and tracking error whether you have one or a dozen lookback models.

The real fragility here is with investors who misperceive the normal drawdowns you should expect from momentum investing. If you change or abandon a model when it has losing trades, you are unlikely to succeed at quantitative investing. You need to have realistic expectations. Dual momentum investors need a good understanding of the process and the research supporting it. This can help them keep the big picture in mind.


[1] For a summary discussion of this issue, see David Foulke’s “Backtesting Strategies Based on Multiple Signals – Beware of Overfitting Bias.”


January 1, 2019

Our Most Popular Posts in 2018


Happy New Year! In case you missed them, here are our most popular posts in 2018:


My book had dual momentum results from 1974 through 2013. With the acquisition of additional data, we are now able to show results back to 1950. We also explain why 1950 is a good starting date for looking at global investing.


We show examples of common mispractices in quantitative investing: overfitting of data, indiscriminate data mining, biased perceptions, and paucity of data. Ex-post and ex-ante results are not the same.


These show up regularly and repeatedly on the internet. We discuss stocks versus indices, relative versus absolute momentum, trend following versus diversification, and trade timing issues.


Guest post by Matt Richarson, JD, PhD. Matt looks at simulated safe withdrawal rates for our popular Global Equities Momentum (GEM) model.