The Illusion of Outperformance
An example of a periodic table of returns
My roots are in STEM, where the first instinct isn’t to agree—it’s to test. In the lab, we build hypotheses, design experiments, and try to prove ourselves wrong. Nothing is taken at face value, because results only matter if they can be reproduced.
After years in investing, I’ve noticed something that feels very different. There’s a persistent tendency for investors and analysts to talk out of both sides of their mouths—acknowledging how difficult it is to beat the market, while at the same time promoting strategies that claim to do exactly that.
It reminds me of the line from Alice in Wonderland: “I give myself very good advice, but I very seldom follow it.”
This piece is an attempt to take that advice seriously—and to test whether the ideas we rely on actually hold up when we try to reproduce them in the real world.
The Baseline: Beating the Market Is Extremely Unlikely
Before getting into factors, it’s important to start with one of the most well-established findings in finance:
Consistently beating the market is extraordinarily difficult.
According to S&P Dow Jones Indices’ SPIVA (S&P Indices Versus Active) reports:
Over 15-year periods, roughly 90% of U.S. large-cap managers underperform the S&P 500
Over 20-year periods, that figure rises to roughly 95%
The longer the time horizon, the worse the odds become
(Source: SPIVA U.S. Scorecards, S&P Dow Jones Indices)
Even more important than underperformance is lack of persistence:
Managers who outperform in one period are statistically unlikely to repeat it
Performance tends to revert toward the mean
This suggests that:
Much of what appears to be skill is indistinguishable from randomness
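A quick arithmetic sketch of that point, assuming (purely for illustration) that each manager’s chance of beating the market in any given year is a coin flip:

```python
# Illustrative only: model manager outperformance as a fair coin flip each year.
# Under that null hypothesis, some managers still look like long-run winners by luck alone.

n_managers = 1000   # hypothetical universe of active managers
years = 10          # consecutive winning years required
p_beat = 0.5        # assumed per-year probability of beating the index

# Probability that one manager beats the market every year in a row
p_streak = p_beat ** years

# Expected number of "perfect streak" managers by pure chance
expected_lucky = n_managers * p_streak

print(f"P(10-year streak by luck): {p_streak:.4%}")
print(f"Expected lucky streaks among {n_managers} managers: {expected_lucky:.2f}")
```

Even with zero skill anywhere in the universe, roughly one manager in a thousand would post a decade-long winning streak, exactly the kind of record that tends to get marketed as skill.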
This isn’t just theory.
Warren Buffett illustrated it in practice with his well-known $1 million bet:
He wagered that a low-cost S&P 500 index fund would outperform a portfolio of hedge funds over 10 years (2008–2017)
The result:
S&P 500: ~7.1% annualized
Hedge funds: ~2.2% annualized
(Berkshire Hathaway Annual Letters)
He won easily.
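The gap compounds dramatically. A minimal sketch using the approximate annualized figures above (the exact numbers in the letters differ slightly):

```python
# Grow one dollar for 10 years at the approximate annualized rates from the bet.
sp500_annual = 0.071   # ~7.1% annualized (approximate)
hedge_annual = 0.022   # ~2.2% annualized (approximate)
years = 10

sp500_growth = (1 + sp500_annual) ** years   # roughly doubles
hedge_growth = (1 + hedge_annual) ** years   # grows about 24%

print(f"S&P 500 index fund: ${sp500_growth:,.2f} per dollar invested")
print(f"Hedge fund basket:  ${hedge_growth:,.2f} per dollar invested")
```

A dollar in the index roughly doubled; a dollar in the hedge-fund basket grew about 24%. Small annual gaps become large terminal gaps.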
The Shift: From “You Can’t Beat the Market” to “Here’s How to Beat It”
Given that backdrop, something interesting happens.
The same intellectual framework that tells us:
“Markets are efficient, and outperformance is unlikely”
also introduces:
“Certain factors—size, value, etc.—have historically outperformed”
This is the foundation of the Fama-French factor models.
1993: Three-factor model (market, size, value)
2015: Expanded to five factors (adding profitability and investment)
(Fama & French, 1993; 2015)
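For reference, the five-factor model describes a stock or portfolio’s excess return with a time-series regression of the form (notation follows the 2015 paper):

```latex
R_{it} - R_{Ft} = a_i + b_i\,(R_{Mt} - R_{Ft}) + s_i\,\mathrm{SMB}_t
                + h_i\,\mathrm{HML}_t + r_i\,\mathrm{RMW}_t
                + c_i\,\mathrm{CMA}_t + e_{it}
```

Here SMB (small minus big), HML (high minus low book-to-market), RMW (robust minus weak profitability), and CMA (conservative minus aggressive investment) are the factor return series; dropping the RMW and CMA terms recovers the 1993 three-factor model.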
But even within that research, things are not as clean as they’re often presented.
In their 2015 paper, Fama and French note that:
“The value factor… becomes redundant” once additional factors are included
(Fama & French, 2015)
And they also acknowledge:
The model struggles to explain the low returns of certain small-cap stocks, particularly those with weak profitability
In other words:
“Small” is not inherently rewarded—and “value” is not always necessary to explain returns
Real-World Evidence: Value and Size Are Not Stable
If these premia were robust and reliable, we should expect to see them show up consistently.
But in practice, they don’t.
For example:
A recent analysis of the Fama-French value factor (HML) shows:
1926–2007: +5.20% annual premium
2008–2025: -0.86% annual premium
(Source: Value Premium study, Journal of Financial and Quantitative Analysis)
That’s not a small deviation—that’s nearly two decades of failure in real time.
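To put the sign flip in compounding terms, here is a rough sketch using only the two annualized figures quoted above (the period length is approximate):

```python
# Rough compounding of the quoted HML annualized premia.
pre_2008_premium = 0.0520    # 1926–2007 annualized value premium (quoted above)
post_2008_premium = -0.0086  # 2008–2025 annualized value premium (quoted above)
post_years = 18              # approximate length of the 2008–2025 window

# Cumulative growth factor over the recent window at each rate
realized = (1 + post_2008_premium) ** post_years       # what actually happened
if_historical = (1 + pre_2008_premium) ** post_years   # what the old rate implied

print(f"Realized cumulative HML premium over ~18 years: {realized - 1:+.1%}")
print(f"At the pre-2008 rate it would have been:        {if_historical - 1:+.1%}")
```

Instead of roughly +149% of cumulative outperformance, the value factor delivered a cumulative shortfall of about 14% over the recent window.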
Similarly, the size premium has faced persistent challenges.
Even proponents acknowledge issues such as:
Weak historical evidence
Instability over time
Concentration in microcaps
Sensitivity to construction methods
Weak international evidence
(AQR: Size Matters, If You Control Your Junk)
If a factor:
depends heavily on construction choices
varies across time periods
and struggles in real-world implementation
then it is not a simple, reliable building block.
A Fair Counterpoint: Where the Evidence Does Support Factors
To be clear, the case for factor investing is not entirely theoretical.
There are real-world examples where strategies built around value, size, or similar tilts have appeared to work—sometimes over meaningful periods of time.
For example, certain value-oriented ETFs launched in the early 2000s outperformed the S&P 500 for extended stretches, particularly in the aftermath of the dot-com crash, when value stocks strongly outperformed growth. Similarly, analyses of long-term data have shown that small-cap value stocks have, at times, delivered higher returns than broad market benchmarks over specific multi-decade windows.
More recently, newer factor funds have attempted to refine these approaches—screening for profitability, avoiding lower-quality “junk” companies, and more precisely targeting the characteristics that academic research suggests should drive returns. Some of these funds have shown promising results in shorter, more recent periods.
Taken together, this evidence suggests something important:
The underlying ideas are not entirely without merit.
There are environments where these strategies work. There are periods where they outperform. And there are implementations that appear to improve on earlier, simpler approaches.
But this is also where the limitations become more visible.
Because these examples share a common feature:
Their success is highly dependent on when, how, and under what conditions they are implemented.
The early outperformance of value strategies, for example, was heavily influenced by starting at a moment when growth stocks were historically overvalued. Later periods tell a very different story. Similarly, small-cap value has outperformed over certain long windows—but has also experienced extended stretches of underperformance, including more than a decade in recent history.
Even proponents of factor investing acknowledge this instability. In some cases, the expected premium has failed to materialize for 15 or even 20 years—long enough to encompass an entire investing horizon.
There is also the issue of implementation. The returns observed in academic research are often stronger than what can be captured in real portfolios, due to costs, constraints, and the difficulty of precisely defining and isolating the intended factors. In practice, realized returns tend to be lower, more variable, and more sensitive to construction choices.
So while it is possible to find examples where factor-based strategies appear to work, it is much harder to find evidence that they work:
consistently
across independent time periods
in a way that investors can reliably capture and stick with
And that distinction matters.
Because it shifts the question from:
“Can this work?”
to:
“Is this reliable enough to build a portfolio around?”
This is where the gap between theory and experience becomes difficult to ignore.
The Chart Problem: When Visualization Becomes Persuasion
Which brings us to the chart shown at the top of this piece.
The “periodic table of returns,” popularized by firms like Callan and J.P. Morgan, is designed to show:
“The uncertainty inherent in all capital markets”
(Callan Institute)
And it does that well.
It shows:
Leadership rotates
No asset class consistently wins
But it is often used for something else.
Instead of:
“Markets are unpredictable, so diversify”
It becomes:
“These asset classes show up at the top often—so overweight them”
That leap is not supported by the chart.
Because the chart:
shows single-year rankings
does not show compounding
does not show drawdowns
does not show investor experience
And yet it creates the impression of consistency where none exists.
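One way to see why top-of-table frequency isn’t evidence of an exploitable edge: even identically distributed, skill-free return streams produce a colorful rotating leaderboard. A minimal simulation, with hypothetical asset labels and arbitrary return assumptions:

```python
import random

random.seed(42)  # reproducible illustration

# Eight hypothetical "asset classes" with IDENTICAL return distributions:
# any patterns in the leaderboard below are pure noise.
assets = [f"Asset {c}" for c in "ABCDEFGH"]
years = 20

leaders = []
for year in range(years):
    # Same mean and volatility for every asset, so there is no real edge anywhere
    returns = {a: random.gauss(0.07, 0.15) for a in assets}
    leaders.append(max(returns, key=returns.get))

distinct = len(set(leaders))
print(f"Distinct 'winning' asset classes over {years} years: {distinct} of {len(assets)}")
```

Leadership rotates, and some labels land at the top more often than others, just as in the real chart, even though by construction nothing here is predictable or worth overweighting.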
I fell for this myself for a time.
I enjoy data. I look for patterns. And like most people, I assumed the pattern I was seeing was something I had uncovered—something real.
But when I tested it using:
actual funds
real time periods
realistic allocations
the pattern didn’t hold.
What looked like a discovered truth started to look like a constructed narrative.
Behavior: Why This Happens (Mayraz, 2011)
There’s a well-documented behavioral explanation for this.
In a study by economist Guy Mayraz, participants were assigned roles:
“Farmers” (benefit from higher wheat prices)
“Bakers” (benefit from lower prices)
They were shown identical data and asked to predict the future.
The result:
“Subjects who benefit from higher prices predict higher prices than those who benefit from lower prices.”
(Mayraz, 2011)
More broadly:
“Agents’ beliefs are influenced by their preferences.”
This matters.
Because in investing, we are strongly incentivized to find:
better strategies
higher returns
something that outperforms
So when data suggests we can outperform—even conditionally—it becomes very easy to:
emphasize supporting evidence
discount conflicting evidence
build conviction around it
Many of these ideas come from highly respected researchers, and rightly so—their work has had a meaningful impact on how markets are understood. But influence shouldn’t place ideas beyond scrutiny—it should invite more of it.
Respect for prior work should include a willingness to test it, challenge it, and verify how well it holds up outside the conditions in which it was developed.
A Different Lens: Reproducibility
This is where my background outside finance shapes how I see this.
In academic laboratory science, reproducibility is everything.
You don’t just read a paper and accept it.
You try to replicate it.
And often, you can’t.
Not because anyone is cheating—but because:
conditions differ
assumptions break
hidden variables exist
Labs have discovered issues from:
contaminated reagents
flawed measurement techniques
even impurities in water supply
Things that only became obvious when results couldn’t be reproduced.
That’s not failure—that’s progress.
It’s how understanding improves.
And it raises a fair question here:
If factor-based outperformance is real, why is it so difficult to reproduce in real portfolios?
I’ve tried, and I’ve watched colleagues try.
Not with abstract datasets—but with:
actual funds
real allocations
real timeframes
And we have not been able to replicate the outcomes that are often implied.
That doesn’t prove they’re wrong.
But it does mean:
They are not robust in the way they are often presented
Where This Leaves Us
None of this means:
factors don’t exist
diversification isn’t valuable
theory is useless
It means something more specific:
We may be overinterpreting fragile relationships and building portfolios that depend on them.
There’s a difference between:
identifying a pattern
and building a portfolio that depends on that pattern showing up reliably
That gap is where problems arise.
This is also where I think we need to be honest about our responsibility as advisors.
If we are going to recommend strategies that meaningfully shape clients’ financial outcomes—especially ones that introduce higher volatility, tracking error, or long periods of potential underperformance—then those strategies should be grounded in something we’ve taken the time to understand and test beyond their original presentation.
In practice, I’m not sure that always happens.
Many of these ideas are compelling, widely accepted, and supported by familiar charts and research. But acceptance and repetition are not the same as verification. And when those ideas are implemented without being meaningfully stress-tested in real portfolios, the risk isn’t just theoretical—it’s borne by clients.
The Trap of Overthinking
There’s also a more personal risk.
The more we learn, the more we analyze, the more frameworks we build—
the easier it becomes to believe we can solve the market.
Find the right combination.
Build the better portfolio.
But at some point:
Careful thinking becomes overthinking.
We move from:
understanding markets
to:
trying to out-engineer them
And markets don’t reward that.
They reward what actually happens—not what should happen.
In trying to solve uncertainty, we can end up building something that depends on certainty.
An Open Challenge
I would like this to be wrong.
If there were a reliable, reproducible way to:
outperform the market
reduce risk
improve client outcomes
I would want to use it.
I would want my clients to benefit from it.
But based on:
academic evidence
real-world results
and attempts at reproduction
I can’t get there.
So I think the burden shifts:
If these strategies are as robust as they are often presented to be, they should be reproducible—in real portfolios, across real timeframes, in a way investors can actually experience.
If they are:
I would genuinely want to see it.
Until then, I think it’s reasonable to proceed with caution—and to focus less on trying to outthink the market, and more on building portfolios that can actually live within it.
Sources
Fama, Eugene & French, Kenneth (1993). Common Risk Factors in the Returns on Stocks and Bonds
Fama, Eugene & French, Kenneth (2015). A Five-Factor Asset Pricing Model
S&P Dow Jones Indices. SPIVA U.S. Scorecards
Mayraz, Guy (2011). Wishful Thinking (LSE)
Berkshire Hathaway Annual Letters (Buffett Bet)
AQR: Size Matters, If You Control Your Junk
Callan Institute: Periodic Table of Investment Returns
Journal of Financial and Quantitative Analysis: Value Premium study