June 10, 2017

Real Time Factor Performance

According to S&P DJ Indices, 92% of all actively managed stock funds failed to beat their benchmarks over the past 15 years. This should come as no surprise. Similar results were published more than 20 years ago. This information has caused a move away from active stock selection and toward index funds or systematic approaches.

Money managers have recently moved more in the direction of factor-based and so-called smart beta investing. But as I pointed out in my February blog post, “Factor Zoo or Unicorn Ranch?”, there are some potential issues with this type of investing, not the least of which is the shortfall between actual and theoretical returns.

Theoretical results are in academic papers and all over the internet. Much less information is available on the real-time performance of factor-based investments.

Lack of Real Time Performance Studies

The Loughran and Houge study in 2006 was a rare look at real-time factor performance. In it, the authors showed that there was no significant difference in performance between U.S. value and growth mutual funds from 1965 through 2001. The authors concluded by saying the idea that value generates superior long-run performance is an “illusion”.

This was the only study I could find that examined actual results of popular investment factors. But now there are two other studies. Blitz (2017), in “Are Exchange Traded Funds Harvesting Factor Premiums?”, examined the performance of 415 ETFs with at least 36 months of return history as of December 2015. He looked at factor harvesting over the prior 60 months if that much data existed. He concluded that, in aggregate, all factor exposures turn out to be close to zero. There is no aggregate alpha there.

Last month Arnott, Kalesnik, and Wu (AKW) published an article called “The Incredible Shrinking Factor Return.” AKW examined actual versus theoretical performance of four factors well known to investors, using 5,323 mutual funds from January 1991 through December 2016. These factors are market, value, size, and momentum.

Two Step Approach

To determine their results, AKW ran a two-stage (Fama-MacBeth) regression. In stage 1, they regress mutual fund excess returns against standard factor models to determine each fund’s estimated factor loadings. In stage 2, they regress the excess returns of all funds against those factor loadings to estimate the factor premia the funds actually earned. These are then compared to the theoretical factor returns.
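For readers who want to see the mechanics, here is a minimal sketch of a generic two-pass regression in Python. It is not AKW's code or their exact specification. The function name two_stage_regression, the per-period cross-sectional second stage with an intercept, and the synthetic data are all my assumptions for illustration.

```python
import numpy as np


def two_stage_regression(fund_excess, factors):
    """Two-pass estimate of factor loadings and realized factor premia.

    fund_excess: (T, N) array of fund excess returns.
    factors:     (T, K) array of factor returns.
    Returns (betas, premia) with shapes (N, K) and (K,).
    """
    T, N = fund_excess.shape

    # Stage 1: one time-series regression per fund (all funds solved at once).
    X = np.column_stack([np.ones(T), factors])
    coef, *_ = np.linalg.lstsq(X, fund_excess, rcond=None)
    betas = coef[1:].T  # drop the intercepts, keep the loadings

    # Stage 2: one cross-sectional regression per period of fund returns
    # on the estimated loadings; the slopes are that period's factor payoffs.
    Z = np.column_stack([np.ones(N), betas])
    lam, *_ = np.linalg.lstsq(Z, fund_excess.T, rcond=None)
    premia = lam[1:].mean(axis=1)  # average payoff per factor
    return betas, premia


# Synthetic illustration only (made-up numbers): 300 months, 200 funds, 4 factors.
rng = np.random.default_rng(0)
factors = rng.normal(0.005, 0.03, size=(300, 4))
true_betas = rng.normal(0.5, 0.3, size=(200, 4))
fund_excess = factors @ true_betas.T + rng.normal(0.0, 0.01, size=(300, 200))

betas, premia = two_stage_regression(fund_excess, factors)
print("realized premia per month:   ", premia.round(4))
print("theoretical premia per month:", factors.mean(axis=0).round(4))
```

In this toy setup the "theoretical" premium is simply the average factor return, and the "realized" premium is what the second stage recovers from fund returns. With real funds, fees, trading costs, and noisy loadings all sit between the two.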

This approach incorporates factor covariances to determine factor premia. Comparing actual to theoretical performance can reveal data mining, selection, and survivorship biases. It can also identify the effects of management fees, bid-ask spreads, and transaction costs.

In many academic papers, more than half the profits come from shorting stocks. But shorting may be expensive and sometimes impossible to do. Looking at long-only mutual fund performance removes those unrealistic profits.
  
Performance Shortfalls

Here are AKW’s regression results using 25 years of fund data from January 1991 through December 2016:


We see a 50% shortfall in the performance of the market factor. This is not surprising. For many years, other research has shown this effect. High beta tends to underperform low beta on a risk-adjusted basis going forward in time. 
 
In the AKW regression, the size factor shows a small but insignificant improvement in actual versus theoretical returns. Value is the most commonly used factor. AKW’s regression shows that value fund managers captured only 60% of the value premium since 1991. This complements the recent findings of Kok, Ribando & Sloan (2017). They claim that outside the initial evaluation period of 1963 to 1981, the evidence of a value premium is weak to non-existent.

The largest shortfall AKW discovered is with momentum. The realized momentum return of live portfolios was close to zero, compared to a theoretical return of around 6% per year. Stock momentum alpha has not been positive since 2002, according to AKW. This is consistent with research by Bhattacharya, Li & Sonaer (2016), who find that momentum profits from U.S. stocks have been insignificant since 1999.

AKW say transaction costs play a major role as the source of slippage between theoretical and realized factor returns. In their words, “…higher turnover strategies, such as momentum, have trading costs that may be large enough to wipe out the premium completely if enough money is following the strategy.”

AKW conclude their study by asking: if 10,000 quants all pursue the same factor tilts, how likely is it that these factors will add value?

Skepticism and Pushback

Skepticism toward new information is a good thing. More research and analysis can help advance what we know about the world.

Corey Hoffstein offers a critical response to the AKW study in an article he calls “A Simulation Based Rebuttal to Research Affiliates.” Corey points out we should not overlook style drift as a significant source of error. Return estimates can be inaccurate if managers switch investing styles. Most mutual funds have investment philosophies that stay consistent over time. Growth funds do not usually become value funds, for example. But it can occasionally happen.

In support of style drift, Corey shows 3-year rolling betas versus full period betas for the Vanguard Wellington Fund (VWELX). His data is from January 1994 through July 2016.
  

Corey’s logic is similar to saying one should be suspicious of the 10% average return of the S&P 500 index over the past 50 years because yearly returns have varied from -37% to 38%.  

Research of pension consultants shows that 3-year performance by equity managers is often mean reverting. This may explain some of the difference between full period and 3-year rolling window returns. With 3-year rolling returns, some, and perhaps a lot, of the variation in returns may be random noise.

In addition to the full data set, AKW look at an expanding window of returns that incorporates all the data available up to that point. An expanding window regression converges to the full sample factor betas toward the end of the sample period. When AKW compare expanding window regressions to full period ones, they get comparable results.
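The difference between rolling and expanding windows can be sketched in a few lines of Python. This is only an illustration with made-up data and a single market factor. The helper names (ols_beta, rolling_betas, expanding_betas), the 36-month window, and the simulated returns are my assumptions, not anything taken from AKW or Corey.

```python
import numpy as np


def ols_beta(y, x):
    """Slope of a one-factor regression of y on x (intercept included)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def rolling_betas(y, x, window=36):
    """Beta re-estimated on the most recent `window` observations only."""
    return np.array([ols_beta(y[t - window:t], x[t - window:t])
                     for t in range(window, len(y) + 1)])


def expanding_betas(y, x, min_obs=36):
    """Beta re-estimated on all observations available up to each date."""
    return np.array([ols_beta(y[:t], x[:t])
                     for t in range(min_obs, len(y) + 1)])


# Synthetic illustration only: 270 months, constant true beta of 0.6.
rng = np.random.default_rng(0)
market = rng.normal(0.006, 0.04, 270)
fund = 0.6 * market + rng.normal(0.0, 0.02, 270)

roll = rolling_betas(fund, market)
expand = expanding_betas(fund, market)
full = ols_beta(fund, market)

print("full-period beta:       ", round(full, 3))
print("last expanding beta:    ", round(expand[-1], 3))
print("rolling beta std dev:   ", round(roll.std(), 3))
print("expanding beta std dev: ", round(expand.std(), 3))
```

By construction, the last expanding-window estimate equals the full-period beta, since both use every observation. The rolling estimates move around even though the true beta here never changes, which is the sampling noise mentioned above.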

Corey’s second argument is that you can attribute a portion of the AKW identified shortfall to estimation error. Factor loading estimates are noisy. Estimation error in the independent variables creates a pull toward zero in the beta coefficients. This causes a downward biasing of factor premia estimates in the second stage of AKW’s regression. This is a valid point.
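The pull toward zero is the textbook errors-in-variables effect. If the measurement error in a loading is unrelated to the true loading, the estimated slope shrinks by roughly var(true beta) / (var(true beta) + var(error)). The Python sketch below shows the mechanism only. Every number in it is hypothetical, it relies on normally distributed draws, and it says nothing about how large the bias is in AKW's actual data.

```python
import numpy as np


def slope(y, x):
    """OLS slope of y on x (intercept included)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def simulate_attenuation(n_funds=100_000, premium=0.06,
                         beta_sd=0.5, error_sd=0.5, seed=0):
    """Compare second-stage slopes using true versus noisy loadings.

    All parameter values are hypothetical and chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    true_beta = rng.normal(0.0, beta_sd, n_funds)
    avg_return = premium * true_beta + rng.normal(0.0, 0.02, n_funds)
    noisy_beta = true_beta + rng.normal(0.0, error_sd, n_funds)

    # Textbook errors-in-variables result: the slope shrinks by this ratio.
    reliability = beta_sd**2 / (beta_sd**2 + error_sd**2)
    return slope(avg_return, true_beta), slope(avg_return, noisy_beta), reliability


clean, noisy, reliability = simulate_attenuation()
print("slope with true loadings: ", round(clean, 4))
print("slope with noisy loadings:", round(noisy, 4))
print("predicted shrink factor:  ", reliability)
```

With error variance set equal to the variance of the true loadings, the predicted shrink factor is one half, so the estimated premium comes out at about half its true value. How close real fund data are to that case is exactly the open question.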

But Corey offers no direct evidence of how much estimation bias there is in the AKW regression. Instead, Corey conducts a simulation of 1,000 hypothetical funds using normally distributed betas.

There are good reasons why simulations are seldom used in financial markets research. Simulations may give a false sense of precision. They are dependent on distributional assumptions that are often unrealistic when applied to financial markets. Market returns are not normally distributed with constant volatility. Underlying distributions may be non-stationary and lack independence. In his simulation, Corey assumes that mutual fund returns are normally distributed with constant volatility.

Academic researchers prefer to use as much real data as they can rather than simulated data. The AKW regression uses 25 years of actual mutual fund data, which should reduce the influence of tracking error on AKW's results.

Corey uses only one fund, VWELX, with his simulation to estimate how much downward bias there might be in the AKW regression. He looks at the difference in standard deviation between full period and 3-year rolling estimates of VWELX’s beta coefficients. Corey is unable to show that estimation errors have the same distribution scale as the betas themselves.

Corey concludes there may be significant downward bias in the AKW regression estimates. But Corey does not explain the different degrees of slippage for different factors. Even if you accept Corey's simulation, results from 1994 through 2016 of only one fund may be an outlier. Other funds may conform to AKW's results. In the end, Corey says, "our results do not fully refute AKW’s evidence."

AKW do acknowledge downward bias in betas due to estimation error in the independent variables. They conduct six different robustness tests that help mitigate that error, and the results are consistent with their core findings. Studies here and here also confirm the disappearance of momentum profits since the 1990s.

We Are All Biased

I applaud Corey’s skepticism with regard to unexpected research findings. I also applaud him when he says, “…published research in finance is often like a back test. Rarely do you see any that does not support the firm’s products or existing views.”  We should also keep that in mind regarding critical assessments of new information. As the old saying goes, "Never ask a barber if you need a haircut."