The Portfolio Optimizer for Wealth Management

Direct Indexing returns increase with looser tracking

Summary

With Direct Indexing (DI), instead of buying an ETF that tracks a stock index, we buy the individual stocks. DI can increase after-tax returns through tax loss harvesting.

Our infrastructure allows us to investigate how tightly DI should track the index. If we track too tightly, we cannot harvest enough losses; if too loosely, tracking error will be high.

Background

For more general background on DI, see here.

DI typically works by allowing a stock to be within a range of its target in the index. For example, if AAPL is 4% of the index, we may decide to allow AAPL to be ± 50% of its target, i.e. between 2% and 6%. [Note: this 4% is based on prices at a single point in time, because stock indexes are published in terms of shares, not dollar percentages, which change over time]. Our software supports a more general formula than a simple multiple of the target, but this works well in practice.

What is a good formula to use? If it results in ranges that are...

too narrow, we may forced to limit the amount we harvest when we decide to.
too wide, the portfolio may not behave similarly enough to the stock index, i.e. tracking error may be large.

We used our backtest infrastructure to see how DI would have behaved in the past using ranges of ±0% (tracking the index perfectly), 10%, 30%, 50%, 70%, 90%, and 100%. We also generate web-based charts automatically for each individual backtest. These help investment research, but can be shared with end-clients as well, as they can be customized to their particular situation (tax rates, deposit patterns, etc.)

Scenario parameters

Held by us: An initial $1,000,000 in cash, which we are free to invest.
Deposits: $10,000 every month.
Target allocation: 0.5% cash, 99.5% in a stock index.
Index details: to avoid licensing issues, we created a US large-cap 311-stock index. There are no index reconstitutions (additions, deletions, weight changes, etc.) to keep things simple.
AUM fees: no fees, for simplicity. However, we support customized fee schedules; e.g. 25 bps annually, charged ~2 bps / month.
Tax: all scenarios are for taxable accounts.
Shares rounding: we only use round shares here, although we can support fractional shares and arbitrary rounding and order minimum amounts.
Risk model: we have built our own risk model, and our DI can use it. However, we are not using it in this example, to keep things simple.

Results

As we increase the allowable range for every stock index constituent, we expect that both pre-tax and after-tax returns will improve, and that tracking error will increase. It would be good to confirm this.

Returns before tax

A useful feature we added is the ability to plot several backtests on the same chart. Instead of looking at 7 different charts, one for each range formula, it is much more informative to look at all 7 scenarios on the same chart. Click on this screenshot for an interactive version:

This shows the result we expected.

As we increase the DI range (going from light green towards dark green), pre-tax portfolio values increase.
In every case, DI exceeds the benchmark. This is not guaranteed of course but is very likely.
The trivial '0% pct range' DI underperforms the benchmark slightly. This is due to:
- modeled transaction costs: from our initial and subsequent buy orders.
- some cash drag: the index we created as a benchmark assumes cash dividends are reinvested right away, whereas our DI logic waits to invest until enough cash is available.

The differences are dominated by the scale of the chart. It is a bit easier to see the differences if you zoom in, by clicking and dragging a window over an area of the chart. However, a further improvement is a mechanism we built to plot a set of differences between arbitrary pairs of lines, just by passing certain parameters on the URL to specify how to construct those differences. The chart automatically scales to fit the visible lines only, so if we hide the other lines, the scale becomes smaller, and the differences become easier to see.

In this example, it is more informative to display the difference of the portfolio values minus the benchmark, for every backtested DI range (0%, 10%, 30%, etc.) Click on the screenshot for an interactive version:

Returns after tax

These are a bit different. We cannot compare them to the benchmark, because the benchmark never pays taxes. Therefore, for a cleaner chart, we will compare all after-tax returns against the after-tax returns of the '0% DI' case. Click on the screenshot for an interactive version:

Note: the above liquidation values are calculated using long-term tax rates even for short-term tax lots. This results in a more stable metric, which is more useful for investment research. The idea is that, short of a full liquidation, there is rarely a reason why a client cannot wait for short-term gains to become long-term. However, you can still see the results using true tax here. The difference is more pronounced in the earlier part of the backtest, where a larger percentage of holdings are at a short-term loss or gain, due to the initial deposit being much larger than the subsequent ones.

Tracking error

We can search for daily tracking error (standard deviation of period return diff), weighted by portfolio value, in bps in the data grids for DI. The values are:

0.02 for 0pct
0.14 for 10pct
1.75 for 30pct
1.47 for 50pct
1.90 for 70pct
5.85 for 90pct
3.49 for 100pct

Ex-post tracking error generally increases with a higher range, but backtests are somewhat path-dependent, so there is always some 'noise'.

To convert to (more intuitive) annualized numbers, we can multiply by 16 (approximate square root of number of trading days in a year). Using the same links above, the improvement of 50% vs. 0% is 37 bps pre-tax (1195 - 1158) and 68 bps after-tax (1577 - 1509). 1.47 x 16 results in 23.5 bps of annualized tracking error, which is an acceptable price to pay for the increase in returns.

Tracking error - visual

There is also a more visual (and arguably better) way to do this.

At the end of every backtest, we automatically generate many metrics; there are over 400 as of April 2020. We have built a general mechanism that can plot any arbitrary pair of those metrics against each other. In this case, we would like to look at the tradeoff of tracking error vs. returns, for every range setting (0%, 10%, etc.). Click on the screenshot for an interactive version:

We actually ran backtests for 21 ranges (0%, 5%, 10%, ... 100%). The discussion so far only mentioned 7 ranges (0%, 10%, 30%, 50%, 70%, 90%, 100%), in order to avoid clutter in the charts. However, because this plot only shows a single point per backtest instead of a line, it is easy to fit all 21 results. Backtests with tight tracking (red end of spectrum) have the lowest returns but also lowest tracking error. The inverse is true for backtests with looser tracking (green end of spectrum). The yellow ones are somewhere in between. This is all as expected.

The web page automatically draws a line of best fit, but in this case the relationship seems non-linear; the points seem to be on an upward-sloping curve. Points below the line and farthest from it are preferred; the correspond to high returns (x axis) with low tracking error (y axis). This confirms our initial conclusion that using 50% (the large yellow triangle) is a good choice.

Conclusion

Allowing every stock index constituent to vary between ±50% of its target gives an acceptable trade-off of after-tax returns vs. tracking error in the scenarios we analyzed. Our backtest infrastructure and automated chart generation makes it easy for investment researchers to ascertain this. Moreover, the charts can be useful in a client-facing context; unlike other commercial implementations, we can override the default range of ±50% to match individual client preferences. The charts can help advisors and their clients tune this preference.