June 10, 2017

Real Time Factor Performance

According to S&P Dow Jones Indices, 92% of all actively managed stock funds failed to beat their benchmarks over the past 15 years. This should come as no surprise. Similar results were published more than 20 years ago. This information has caused a move away from active stock selection and toward index funds or systematic approaches.

Money managers have recently moved more in the direction of factor-based and so-called smart beta investing. But as I pointed out in my February blog post, “Factor Zoo or Unicorn Ranch?”, there are some serious issues with this type of investing, not the least of which is the shortfall between actual and theoretical returns.

Theoretical results are in academic papers and all over the internet. Very little information is available on the real-time performance of factor-based investments.

Lack of Real Time Performance Studies

The Loughran and Houge study in 2006 was a rare look at real-time factor performance. In it, the authors showed that there was no significant difference in performance between U.S. value and growth mutual funds from 1965 through 2001. The authors concluded by saying the idea that value generates superior long-run performance is an “illusion”.

This was the only study I could find that examined actual rather than theoretical results of a popular investment factor. But now there is another study. Last month Arnott, Kalesnik, and Wu (AKW) published an article called “The Incredible Shrinking Factor Return.”

AKW examined actual versus theoretical performance of four factors well-known to investors using 5,323 mutual funds from January 1991 through December 2016. These factors are market, value, size, and momentum.

Two Step Approach

To determine their results, AKW ran a two-stage (Fama-MacBeth) regression. In stage 1, they regressed mutual fund excess returns against standard factor models to determine each fund’s estimated factor loadings. In stage 2, they regressed the excess returns of all funds against the factor loadings to get the factor premium earned by each fund. These are then compared to the theoretical factor returns.
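For readers who want to see the mechanics, here is a minimal sketch of a two-stage regression of this kind. It is not AKW’s code or data. The fund counts, factor premia, volatilities, and the function names stage_one and stage_two are all made up for illustration, and the returns are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, K = 312, 200, 4          # months, funds, factors (hypothetical sizes)

# Synthetic monthly factor returns with made-up premia (not AKW's data).
true_premia = np.array([0.006, 0.003, 0.002, 0.005])
factors = true_premia + 0.04 * rng.standard_normal((T, K))

# Synthetic fund loadings and excess returns: r = beta @ f + noise.
true_betas = rng.normal(loc=[1.0, 0.0, 0.0, 0.0], scale=0.3, size=(N, K))
excess = factors @ true_betas.T + 0.02 * rng.standard_normal((T, N))

def stage_one(excess, factors):
    """Time-series OLS per fund: returns an (N, K) array of estimated loadings."""
    X = np.column_stack([np.ones(len(factors)), factors])
    coef, *_ = np.linalg.lstsq(X, excess, rcond=None)    # shape (K + 1, N)
    return coef[1:].T                                     # drop the intercepts

def stage_two(excess, betas):
    """Cross-sectional OLS per month on the loadings: returns (T, K) premia."""
    X = np.column_stack([np.ones(len(betas)), betas])
    coef, *_ = np.linalg.lstsq(X, excess.T, rcond=None)  # shape (K + 1, T)
    return coef[1:].T

betas_hat = stage_one(excess, factors)
premia_by_month = stage_two(excess, betas_hat)

realized = premia_by_month.mean(axis=0)   # premium earned per unit of loading
theoretical = factors.mean(axis=0)        # average return of the paper factors

print("realized   :", np.round(realized, 4))
print("theoretical:", np.round(theoretical, 4))
```

In this toy setup the funds earn exactly their loadings times the factor returns plus noise, so realized and theoretical premia come out nearly equal. With real fund data, any gap between the two is the shortfall being measured.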

This approach incorporates factor covariances to determine factor premia. Comparing actual to theoretical performance can reveal data mining, selection, and survivorship biases. It can also identify the effects of management fees, bid-ask spreads, and transaction costs.

In many academic papers, more than half the profits come from shorting stocks. But shorting may be expensive and sometimes impossible to do. Looking at long-only mutual fund performance removes those unrealistic profits.

Performance Shortfalls

Here are AKW’s regression results using 25 years of fund data from January 1991 through December 2016:

We see a 50% shortfall in the performance of the market factor. This is not surprising. For many years, other research has shown this effect. High beta tends to underperform low beta on a risk-adjusted basis going forward in time.

In the AKW regression, the size factor shows a small but insignificant improvement in actual versus theoretical returns. Value is the most commonly used factor. AKW’s regression shows that value fund managers captured only 60% of the value premium since 1991. This complements the recent findings of Kok, Ribando & Sloan (2017). They claim that outside the initial evaluation period of 1963 to 1981, the evidence of a value premium is weak to non-existent. Value is suspect now on both a theoretical and actual basis.

The largest shortfall AKW discovered is with momentum. The realized momentum return of live portfolios was close to zero, compared to a theoretical return of around 6% per year. Stock momentum alpha has not been positive since 2002, according to AKW. This is consistent with research by Bhattacharya, Li & Sonaer (2016), who find that momentum profits from U.S. stocks have been insignificant since the late 1990s.

AKW say transaction costs play a major role as the source of slippage between theoretical and realized factor returns. In their words, “…higher turnover strategies, such as momentum, have trading costs that may be large enough to wipe out the premium completely if enough money is following the strategy.”

AKW conclude their study by asking: if 10,000 quants all pursue the same factor tilts, how likely is it that these factors will add value?

Skepticism and Pushback

Skepticism toward new information may be a good thing. More research and analysis can help advance what we know about the world.

Corey Hoffstein offers a critical response to the AKW study in an article he calls “A Simulation Based Rebuttal to Research Affiliates.” Corey points out one should not overlook style drift as a significant source of error. Return estimates can be inaccurate if managers switch investing styles. Most mutual funds have investment philosophies that stay constant over time. Growth funds do not often become value funds, for example. But it can occasionally happen.

In support of style drift, Corey shows 3-year rolling betas versus full period betas for the Vanguard Wellington Fund (VWELX). His data is from January 1994 through July 2016.

Corey’s logic is like saying one should be suspicious of the 10% average return of the S&P 500 index over the past 50 years because yearly returns have varied from -37% to 38%.  

Research of pension consultants shows that 3-year performance by equity managers is mean reverting. This may explain some of the difference between full period and 3-year rolling window returns. With 3-year rolling returns, some, and perhaps a lot, of the variation in returns may be random noise.

In addition to the full data set, AKW look at an expanding window of returns that incorporates all the data available up to that point. An expanding window regression converges to the full sample factor betas toward the end of the sample period. When AKW compare expanding window regressions to full period ones, they get comparable results.
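To make the difference between the two windowing schemes concrete, here is a rough sketch on synthetic data. The single market factor, the true beta of 0.6, the noise levels, and the helper name beta are my own assumptions for illustration. They are not taken from AKW or from Corey’s article.

```python
import numpy as np

rng = np.random.default_rng(2)
T, window = 271, 36    # hypothetical sample length in months, 3-year window

# Synthetic market excess returns and a hypothetical fund with true beta 0.6.
market = 0.005 + 0.04 * rng.standard_normal(T)
fund = 0.6 * market + 0.01 * rng.standard_normal(T)

def beta(x, y):
    """OLS slope of y on x, with an intercept."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

full_beta = beta(market, fund)

# Rolling window: only the most recent 36 months at each date.
rolling = np.array([beta(market[t - window:t], fund[t - window:t])
                    for t in range(window, T + 1)])

# Expanding window: all months available up to each date.
expanding = np.array([beta(market[:t], fund[:t])
                      for t in range(window, T + 1)])

print("full-period beta      :", round(full_beta, 4))
print("last expanding beta   :", round(expanding[-1], 4))
print("rolling beta min / max:", round(rolling.min(), 4), round(rolling.max(), 4))
```

The last expanding-window estimate uses every observation, so it matches the full-period beta by construction. The rolling estimates move around even though the true beta in this toy example never changes.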

Corey’s second argument is that you can attribute a portion of the AKW identified shortfall to estimation error. Factor loading estimates are noisy. Estimation error in the independent variables creates a pull toward zero in the beta coefficients. This causes a downward biasing of factor premia estimates in the second stage of AKW’s regression. This is a valid point.
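The pull toward zero is the classical errors-in-variables result, and a few lines of code can show it. This is only a sketch under assumptions I have chosen myself: one factor, normally distributed loadings, estimation noise of the same scale as the loadings, and a made-up 6% premium. How large the effect is in the actual fund data is a separate question.

```python
import numpy as np

rng = np.random.default_rng(1)
n_funds = 5000
true_premium = 0.06    # hypothetical annual premium per unit of loading
beta_sd = 0.3          # cross-sectional spread of true loadings (assumption)
noise_sd = 0.3         # estimation error of the same scale (assumption)

true_beta = rng.normal(0.0, beta_sd, n_funds)
est_beta = true_beta + rng.normal(0.0, noise_sd, n_funds)

# Fund returns depend on the TRUE loading plus idiosyncratic noise.
fund_ret = true_premium * true_beta + rng.normal(0.0, 0.02, n_funds)

def ols_slope(x, y):
    """Slope of a one-variable OLS regression of y on x."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

slope_true = ols_slope(true_beta, fund_ret)    # regress on true loadings
slope_noisy = ols_slope(est_beta, fund_ret)    # regress on noisy loadings

# Classical attenuation factor: var(beta) / (var(beta) + var(noise)).
shrink = beta_sd**2 / (beta_sd**2 + noise_sd**2)

print("premium using true loadings :", round(slope_true, 4))
print("premium using noisy loadings:", round(slope_noisy, 4))
print("predicted noisy premium     :", round(true_premium * shrink, 4))
```

With noise as large as the spread in true loadings, the attenuation factor is one half, so the estimated premium lands near 3% rather than 6%. The smaller the estimation noise relative to the spread of the loadings, the smaller the bias.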

But Corey offers no direct evidence of how much bias there is in the AKW regression because of this. Instead, Corey conducts a simulation of 1,000 hypothetical funds using normally distributed betas.

There are good reasons why simulations are rarely used in financial markets research. Simulations are dependent on distributional assumptions that are often unrealistic with financial markets. Market returns are not independent, and their underlying distributions may be non-stationary.

In his simulation, Corey assumes that returns are normally distributed, which is not the case for mutual fund returns. Nor does Corey show that estimation errors have the same distribution scale as the betas themselves. If one is going to use simulated data, it would be better to use simulations with other distributions.

Academic researchers prefer to use as much real data as they can rather than simulated data. The AKW regression uses 25 years of actual mutual fund data, which should be enough to minimize the influence of tracking error on AKW's results.

Corey uses only one fund, VWELX, with his simulation to estimate how much downward bias there might be in the AKW regression. He looks at the differences in standard deviation between full period and 3-year rolling estimates of VWELX’s beta coefficients.

Corey concludes there may be significant downward bias in the AKW regression estimates. But Corey does not explain why there are different degrees of slippage for the different factors. Even if you accept his simulation, results from 1994 through 2016 of only one fund may be an outlier. Other funds may conform well to AKW's results. In the end, Corey says, "our results do not fully refute AKW’s evidence."

AKW acknowledge this downward bias in betas due to estimation error in the independent variables. They conduct six different robustness tests that reinforce their results and may help mitigate that error. The results of those tests are consistent with AKW’s core findings.

We Are All Biased

I applaud Corey’s skepticism with regard to unexpected research findings. I also applaud him when he says, “…published research in finance is often like a back test. Rarely do you see any that does not support the firm’s products or existing views.” We see that with many advisors and fund managers, as well as throughout the blogosphere. As the old saying goes, "Never ask a barber if you need a haircut."