According to S&P Dow Jones Indices, 92% of all actively managed stock funds failed to beat their benchmarks over the past 15 years. This should come as no surprise, since similar results were published more than 20 years ago. These findings have driven a move away from active stock selection and toward index funds or systematic approaches.

Corey’s logic is like saying one should be suspicious of the
10% average return of the S&P 500 index over the past 50 years because
yearly returns have varied from -37% to 38%. One would never expect to earn 10% every
year going forward.

Money managers have recently moved more in the direction of factor-based and so-called smart beta investing. But as I pointed out in my February blog post, “Factor Zoo or Unicorn Ranch?”, there are some serious issues with this type of investing, not the least of which is the shortfall between actual and theoretical returns.

Theoretical results are in academic papers and
all over the internet. Very little information is available on the real-time
performance of factor-based investments.

**Lack of Real Time Performance Studies**

The Loughran and Hough study in
2006 was a rare look at real-time factor performance. In it, the authors showed that there was no
significant difference in performance between U.S. value and growth mutual
funds from 1965 through 2001. The authors concluded by saying the idea that
value generates superior long-run performance is an “illusion”.

This was the only study I could find that examined actual rather than theoretical results of a popular investment factor. But now there is another study. Last month Arnott, Kalesnik, and Wu (AKW) published an article called “The Incredible Shrinking Factor Return.”

AKW examined actual versus theoretical performance of four factors well known to investors using 5,323 mutual funds from January 1991 through December 2016. These factors are market, value, size, and momentum.

**Two Step Approach**

To determine their results, AKW ran a two-stage (Fama-MacBeth) regression. In stage 1, they regressed mutual fund returns against the excess return of each factor to estimate each fund’s average factor loadings. In stage 2, they regressed fund returns against those average factor loadings to get the return earned by funds per unit of factor exposure. These realized factor returns were then compared to the theoretical factor returns.

This approach is a good one since it incorporates factor
covariances to determine factor premia.
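
To make the two stages concrete, here is a minimal sketch in Python. It is not AKW's code. It uses a single made-up factor and simulated funds, so factor covariances do not come into play, and every name and number in it (`ols`, `two_stage`, the 200 funds, the 312 months, the noise levels) is an assumption for illustration only.

```python
import random


def ols(x, y):
    """Univariate least-squares fit of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage(fund_returns, factor_returns):
    """One-factor sketch of a two-stage (Fama-MacBeth style) regression.

    fund_returns: one list of periodic returns per fund.
    factor_returns: the factor's return in each period.
    Returns (estimated loadings, average return per unit of loading).
    """
    # Stage 1: time-series regression for each fund gives its factor loading.
    loadings = [ols(factor_returns, series)[1] for series in fund_returns]

    # Stage 2: in each period, regress the cross-section of fund returns
    # on the loadings. The slope is the return per unit of exposure.
    per_period = []
    for t in range(len(factor_returns)):
        cross_section = [series[t] for series in fund_returns]
        per_period.append(ols(loadings, cross_section)[1])
    return loadings, sum(per_period) / len(per_period)


# Hypothetical data: 200 funds, 312 months, one factor. Not real fund data.
random.seed(1)
factor = [random.gauss(0.005, 0.04) for _ in range(312)]
true_loadings = [random.uniform(0.5, 1.5) for _ in range(200)]
funds = [[b * f + random.gauss(0.0, 0.02) for f in factor] for b in true_loadings]

loadings, realized = two_stage(funds, factor)
theoretical = sum(factor) / len(factor)
print(f"theoretical premium per month: {theoretical:.4%}")
print(f"realized premium per month:    {realized:.4%}")
```

In this toy setup the funds capture the factor fully by construction, so the two numbers should land close together. The point of the AKW study is that with live funds they do not.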

Comparing actual to theoretical performance can reveal data mining, selection, and survivorship biases. It can also identify the effects of management fees, bid-ask spreads, and transaction costs.

In many academic papers, more than half the profits come
from shorting stocks. But shorting may be expensive and sometimes impossible to
do. Looking here at long-only mutual fund performance removes those unrealistic
profits.

**Performance Shortfalls**

Here are AKW’s regression results using 25 years of fund data from January 1991 through December 2016:

We see a 50% shortfall in the performance of the market factor. This is not surprising. Other research has shown this effect for many years. High beta tends to underperform low beta on a risk-adjusted basis going forward in time.

In the AKW regression, the size factor shows a small but
insignificant improvement in actual versus theoretical returns. This may be
data noise.

Value is the most commonly used factor. AKW’s regression shows that value fund managers captured only 60% of the value premium since 1991. This complements the recent findings of Kok, Ribando & Sloan (2017). They claim that outside the initial evaluation period of 1963 to 1981, the evidence of a value premium is weak to non-existent. Value is suspect now, on both a theoretical and an actual basis.

The largest shortfall AKW discovered is with momentum. The
realized momentum return of live portfolios was close to zero compared to a
theoretical return of around 6% per year. Stock momentum alpha has not been positive since 2002. AKW say transaction costs play a
major role as the source of slippage between theoretical and realized factor
returns. In their words, “…higher turnover strategies, such as momentum,
have trading costs that may be large enough to wipe out the premium completely
if enough money is following the strategy.”

AKW conclude their study by asking: if 10,000 quants all pursue the same factor tilts, how likely is it that these factors will add value?

**Skepticism and Pushback**

Skepticism toward new information may be a good thing. More research and analysis can help advance what we know about the world.

Corey Hoffstein offers a critical response to the AKW study in an article he calls “A Simulation Based Rebuttal to Research Affiliates.” Corey points out that one should not overlook style drift as a significant source of error. Return estimates can be inaccurate if managers switch investing styles. In support of this, Corey shows 3-year rolling betas versus full period betas for the Vanguard Wellington Fund (VWELX). His data runs from January 1994 through July 2016.

Research of pension consultants shows that 3-year performance by equity managers is mean reverting. This may explain some of the difference between full period and 3-year rolling window returns. With 3-year rolling returns, some, and perhaps a lot, of the variation in returns may be random noise.

In addition to the full data set, AKW looks at an expanding window of returns that incorporates all the data available up to that point. An expanding window regression converges to the full sample factor betas toward the end of the sample period. When AKW compares expanding window regressions to full period ones, they get comparable results.
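
The difference between the three kinds of window is easy to see in a small sketch. The one below uses a simulated fund whose true beta never changes, so any movement in the rolling estimates is sampling noise rather than style drift. The helper names, the 240 months, the 0.6 beta, and the noise levels are all hypothetical and are not taken from AKW or from Corey's article.

```python
import random


def beta(factor, fund):
    """Slope from a univariate least-squares fit of fund returns on factor returns."""
    n = len(factor)
    mean_f, mean_r = sum(factor) / n, sum(fund) / n
    sxx = sum((f - mean_f) ** 2 for f in factor)
    sxy = sum((f - mean_f) * (r - mean_r) for f, r in zip(factor, fund))
    return sxy / sxx


def rolling_betas(factor, fund, window=36):
    """Beta re-estimated on each trailing block of `window` months."""
    return [beta(factor[i - window:i], fund[i - window:i])
            for i in range(window, len(factor) + 1)]


def expanding_betas(factor, fund, min_obs=36):
    """Beta re-estimated on all data available up to each month."""
    return [beta(factor[:i], fund[:i])
            for i in range(min_obs, len(factor) + 1)]


# Hypothetical fund with a constant true beta of 0.6. Not real fund data.
random.seed(7)
months = 240
factor = [random.gauss(0.005, 0.04) for _ in range(months)]
fund = [0.6 * f + random.gauss(0.0, 0.02) for f in factor]

rolling = rolling_betas(factor, fund)
expanding = expanding_betas(factor, fund)
full = beta(factor, fund)

print(f"full-period beta:        {full:.3f}")
print(f"rolling 36-month range:  {min(rolling):.3f} to {max(rolling):.3f}")
print(f"last expanding estimate: {expanding[-1]:.3f}")
```

The last expanding-window estimate uses the whole sample, so it matches the full-period beta exactly. The rolling estimates wander even though nothing about the simulated fund has changed.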

Corey’s second argument is that you can attribute a portion
of the AKW identified shortfall to estimation error. Factor loading estimates
are noisy. Estimation error in the independent variables creates a pull toward
zero in the beta coefficients. This causes a downward biasing of factor premia estimates in
the second stage of AKW’s regression.
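
The pull toward zero is the classical errors-in-variables result. If the loadings are measured with independent noise, the second-stage slope shrinks by roughly the ratio of the variance of the true loadings to the variance of the measured loadings. The sketch below illustrates this with made-up numbers. It is not Corey's simulation or AKW's regression, and the premium, the spread of betas, and the size of the estimation error are assumptions chosen only to make the effect visible.

```python
import random


def slope(x, y):
    """Slope from a univariate least-squares fit of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


random.seed(42)
n_funds = 20000
premium = 0.06   # hypothetical true return per unit of loading
sd_beta = 0.3    # assumed spread of true loadings across funds
sd_error = 0.3   # assumed spread of the estimation error in the loadings

true_betas = [random.gauss(1.0, sd_beta) for _ in range(n_funds)]
# Measured loadings are the true loadings plus independent noise.
noisy_betas = [b + random.gauss(0.0, sd_error) for b in true_betas]
# Fund returns depend on the true loadings, not the measured ones.
returns = [premium * b + random.gauss(0.0, 0.02) for b in true_betas]

clean = slope(true_betas, returns)
attenuated = slope(noisy_betas, returns)
# Classical attenuation: slope shrinks by var(beta) / (var(beta) + var(error)).
predicted = premium * sd_beta**2 / (sd_beta**2 + sd_error**2)

print(f"slope on true loadings:  {clean:.4f}")
print(f"slope on noisy loadings: {attenuated:.4f}")
print(f"predicted by formula:    {predicted:.4f}")
```

How much of this bias applies to the AKW results depends entirely on how large the estimation error is relative to the real spread in fund loadings, which this sketch simply assumes.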

Corey offers no direct evidence of how much bias there is in the AKW regression. Instead, Corey conducts a simulation of 1,000 hypothetical funds using normally distributed betas.

There are some good reasons why simulations are rarely used in financial markets research. Simulations depend on distributional assumptions that are unrealistic for financial markets. Market returns are not independent, and their underlying distributions are non-stationary.

In his simulation, Corey assumes that returns are
normally distributed, which is not the case for mutual fund returns. Nor does Corey show that estimation errors have the same distribution scale as the
betas themselves.

Academic researchers prefer to use as much real data as
they can rather than simulated data. The AKW regression uses 25 years of
actual mutual fund data, which should be enough to minimize the influence of
tracking error on AKW's results.

Corey uses only one fund, VWELX, with his simulation to estimate how much downward bias there might be in the AKW regression. He looks at the differences in standard deviation between full period and 3-year rolling estimates of VWELX’s beta coefficients.

Corey concludes there may be significant downward bias in the AKW regression estimates. But he does not explain why there are different degrees of slippage for the different factors. Even if you accept his simulation, results from 1994 through 2016 for only one fund may just be an outlier. Other funds may conform well to AKW's results. In the end, Corey says, "our results do not fully refute AKW’s evidence".

AKW also mentions this downward bias in betas due to
estimation error in the independent variables. They conduct six different
robustness tests that reinforce their results and help mitigate that error. Those results are consistent with
AKW’s core findings.

**We Are All Biased**

I applaud Corey’s skepticism with regard to unexpected
research findings. I also applaud him when he says, “…published research in
finance is often like a back test. Rarely do you see any that does not support
the firm’s products or existing views.”

We see that with many advisors and fund managers, as well as throughout the blogosphere. But we should keep in mind that this logic works both ways. Those who adhere to alternative approaches are often the ones who challenge new ideas. We should apply some healthy skepticism to both sides of such controversies.