Direct Indexing in a jupyter notebook

21 minute read Published: 2024-03-05

My brother-in-law sent me a text asking me to explain how I can own SPY without owning TSLA. I told him it was too complicated to explain over text but promised a response "soon". This post is that response.

Firstly, this is not investment or tax advice. I am not a financial advisor or a tax preparer. I am not responsible for any losses you may incur or if the IRS comes knocking at your door.

Secondly, this approach will have some tracking error! In this example, I am assuming you can't buy factional shares (that's probably a bad idea) and I'm sure there are like a thousand other reasons I don't even know about. If you want to follow the S&P 500 as closely as possible. Leave it to the professionals! Buy VOO or SPY and call it a day. But if you're willing to accept some tracking error, direct indexing is pretty cool. There are several nice advantages:

Full repo is on github.

  1. Download all symbols and their weights in the S&P 500.
import pandas as pd
import requests
import cloudscraper
from io import StringIO
from bs4 import BeautifulSoup

pd.set_option('display.max_rows', 50)

url = "https://www.slickcharts.com/sp500"
scraper = cloudscraper.create_scraper()
soup = BeautifulSoup(scraper.get(url).text, 'html.parser')
sap_table = soup.find('table', {'class':"table"})
sap_list = pd.read_html(StringIO(str(sap_table)), flavor='html5lib')[0]
sap = pd.DataFrame(sap_list)

sap = sap[["Company", "Symbol", "Portfolio%"]]
sap
Company Symbol Portfolio%
0 Microsoft Corp MSFT 7.14%
1 Apple Inc. AAPL 6.08%
2 Nvidia Corp NVDA 4.70%
3 Amazon.com Inc AMZN 3.75%
4 Meta Platforms, Inc. Class A META 2.58%
... ... ... ...
498 Zions Bancorporation N.a. ZION 0.01%
499 V.F. Corporation VFC 0.01%
500 Paramount Global Class B PARA 0.01%
501 Fox Corporation Class B FOX 0.01%
502 News Corporation Class B NWS 0.01%

503 rows × 3 columns

  1. Remove any equities you don't want to hold
banned_tickers = ("TSLA", "FOX", "NWS")
sap = sap[~sap["Symbol"].isin(banned_tickers)].copy()
sap
Company Symbol Portfolio%
0 Microsoft Corp MSFT 7.14%
1 Apple Inc. AAPL 6.08%
2 Nvidia Corp NVDA 4.70%
3 Amazon.com Inc AMZN 3.75%
4 Meta Platforms, Inc. Class A META 2.58%
... ... ... ...
496 Mohawk Industries, Inc. MHK 0.01%
497 Whirlpool Corp. WHR 0.01%
498 Zions Bancorporation N.a. ZION 0.01%
499 V.F. Corporation VFC 0.01%
500 Paramount Global Class B PARA 0.01%

500 rows × 3 columns

  1. Also optionally clip weights
sap["PortfolioFraction"] = sap["Portfolio%"].str.strip("%").astype('float') / 100
sap["PortfolioFraction"] = sap["PortfolioFraction"].clip(lower=0, upper=0.05)
sap
Company Symbol Portfolio% PortfolioFraction
0 Microsoft Corp MSFT 7.14% 0.0500
1 Apple Inc. AAPL 6.08% 0.0500
2 Nvidia Corp NVDA 4.70% 0.0470
3 Amazon.com Inc AMZN 3.75% 0.0375
4 Meta Platforms, Inc. Class A META 2.58% 0.0258
... ... ... ... ...
496 Mohawk Industries, Inc. MHK 0.01% 0.0001
497 Whirlpool Corp. WHR 0.01% 0.0001
498 Zions Bancorporation N.a. ZION 0.01% 0.0001
499 V.F. Corporation VFC 0.01% 0.0001
500 Paramount Global Class B PARA 0.01% 0.0001

500 rows × 4 columns

  1. Reweight now that some of the tickers were removed or clipped
increase_prec = 1 - sap["PortfolioFraction"].sum() + 1
sap["PortfolioFraction"] = sap["PortfolioFraction"] * increase_prec
sap
Company Symbol Portfolio% PortfolioFraction
0 Microsoft Corp MSFT 7.14% 0.052395
1 Apple Inc. AAPL 6.08% 0.052395
2 Nvidia Corp NVDA 4.70% 0.049251
3 Amazon.com Inc AMZN 3.75% 0.039296
4 Meta Platforms, Inc. Class A META 2.58% 0.027036
... ... ... ... ...
496 Mohawk Industries, Inc. MHK 0.01% 0.000105
497 Whirlpool Corp. WHR 0.01% 0.000105
498 Zions Bancorporation N.a. ZION 0.01% 0.000105
499 V.F. Corporation VFC 0.01% 0.000105
500 Paramount Global Class B PARA 0.01% 0.000105

500 rows × 4 columns

  1. Import a dataframe of your current equity holdings & cash. You're a bit on your own here but my brokerage allows me to download a CSV of all of my holding. I'm going to assume you have a simplified version of that.
#Write out some test data
with open('holdings.csv', 'w') as f:
    f.write("""symbol,shares
AMZN,10
NVDA,1
MSFT,1
YUM,100
ZTS,10
GME,100""")
holdings = pd.read_csv('holdings.csv')
holdings
symbol shares
0 AMZN 10
1 NVDA 1
2 MSFT 1
3 YUM 100
4 ZTS 10
5 GME 100
cash_to_invest = 10000
  1. Get last share price of all of holdings and compute our total assests to be invested
import yfinance as yf

def get_current_price(symbol):
    ticker = yf.Ticker(symbol.replace(".","-"))
    todays_data = ticker.history(period='2d')
    return todays_data['Close'].iloc[0]

holdings["last$"] = holdings["symbol"].apply(get_current_price)
holdings["total$"] = holdings["shares"] * holdings["last$"]

total_assets = holdings['total$'].sum() + cash_to_invest
total_assets
print(f"total_assets:", total_assets)
holdings
total_assets: 30298.690364837646
symbol shares last$ total$
0 AMZN 10 177.580002 1775.800018
1 NVDA 1 852.369995 852.369995
2 MSFT 1 414.920013 414.920013
3 YUM 100 138.550003 13855.000305
4 ZTS 10 187.860001 1878.600006
5 GME 100 15.220000 1522.000027

  1. Also get lastest share price from all S&P 500 symbols
sap["last$"] = sap["Symbol"].apply(get_current_price)
sap
Company Symbol Portfolio% PortfolioFraction last$
0 Microsoft Corp MSFT 7.14% 0.052395 414.920013
1 Apple Inc. AAPL 6.08% 0.052395 175.100006
2 Nvidia Corp NVDA 4.70% 0.049251 852.369995
3 Amazon.com Inc AMZN 3.75% 0.039296 177.580002
4 Meta Platforms, Inc. Class A META 2.58% 0.027036 498.190002
... ... ... ... ... ...
496 Mohawk Industries, Inc. MHK 0.01% 0.000105 120.830002
497 Whirlpool Corp. WHR 0.01% 0.000105 106.550003
498 Zions Bancorporation N.a. ZION 0.01% 0.000105 39.169998
499 V.F. Corporation VFC 0.01% 0.000105 15.730000
500 Paramount Global Class B PARA 0.01% 0.000105 10.300000

500 rows × 5 columns

  1. Compute shares we should own to match give our current investable assets
sap["$NeededToMatch"] = sap["PortfolioFraction"] * total_assets
sap["SharesToMatch"] = (sap["$NeededToMatch"] / sap["last$"]).astype('int') # assume no factional shares
sap
Company Symbol Portfolio% PortfolioFraction last$ $NeededToMatch SharesToMatch
0 Microsoft Corp MSFT 7.14% 0.052395 414.920013 1587.499882 3
1 Apple Inc. AAPL 6.08% 0.052395 175.100006 1587.499882 9
2 Nvidia Corp NVDA 4.70% 0.049251 852.369995 1492.249889 1
3 Amazon.com Inc AMZN 3.75% 0.039296 177.580002 1190.624911 6
4 Meta Platforms, Inc. Class A META 2.58% 0.027036 498.190002 819.149939 1
... ... ... ... ... ... ... ...
496 Mohawk Industries, Inc. MHK 0.01% 0.000105 120.830002 3.175000 0
497 Whirlpool Corp. WHR 0.01% 0.000105 106.550003 3.175000 0
498 Zions Bancorporation N.a. ZION 0.01% 0.000105 39.169998 3.175000 0
499 V.F. Corporation VFC 0.01% 0.000105 15.730000 3.175000 0
500 Paramount Global Class B PARA 0.01% 0.000105 10.300000 3.175000 0

500 rows × 7 columns

  1. Join to our current holdings
movesWithHoldings = sap.merge(holdings[["symbol", "shares"]], left_on='Symbol', right_on='symbol', how='outer')
movesWithHoldings["shares"] = movesWithHoldings["shares"].fillna(0)
movesWithHoldings["SharesToMatch"] = movesWithHoldings["SharesToMatch"].fillna(0)
movesWithHoldings.loc[movesWithHoldings["Symbol"].isna(),"Symbol"] = movesWithHoldings[movesWithHoldings["Symbol"].isna()]["symbol"]
movesWithHoldings
Company Symbol Portfolio% PortfolioFraction last$ $NeededToMatch SharesToMatch symbol shares
0 Microsoft Corp MSFT 7.14% 0.052395 414.920013 1587.499882 3.0 MSFT 1.0
1 Apple Inc. AAPL 6.08% 0.052395 175.100006 1587.499882 9.0 NaN 0.0
2 Nvidia Corp NVDA 4.70% 0.049251 852.369995 1492.249889 1.0 NVDA 1.0
3 Amazon.com Inc AMZN 3.75% 0.039296 177.580002 1190.624911 6.0 AMZN 10.0
4 Meta Platforms, Inc. Class A META 2.58% 0.027036 498.190002 819.149939 1.0 NaN 0.0
... ... ... ... ... ... ... ... ... ...
496 Whirlpool Corp. WHR 0.01% 0.000105 106.550003 3.175000 0.0 NaN 0.0
497 Zions Bancorporation N.a. ZION 0.01% 0.000105 39.169998 3.175000 0.0 NaN 0.0
498 V.F. Corporation VFC 0.01% 0.000105 15.730000 3.175000 0.0 NaN 0.0
499 Paramount Global Class B PARA 0.01% 0.000105 10.300000 3.175000 0.0 NaN 0.0
500 NaN GME NaN NaN NaN NaN 0.0 GME 100.0

501 rows × 9 columns

  1. Find the shares we need to match the S&P 500 prop
movesWithHoldings["SharesDifference"] = movesWithHoldings["SharesToMatch"] - movesWithHoldings["shares"]
movesWithHoldings
Company Symbol Portfolio% PortfolioFraction last$ $NeededToMatch SharesToMatch symbol shares SharesDifference
0 Microsoft Corp MSFT 7.14% 0.052395 414.920013 1587.499882 3.0 MSFT 1.0 2.0
1 Apple Inc. AAPL 6.08% 0.052395 175.100006 1587.499882 9.0 NaN 0.0 9.0
2 Nvidia Corp NVDA 4.70% 0.049251 852.369995 1492.249889 1.0 NVDA 1.0 0.0
3 Amazon.com Inc AMZN 3.75% 0.039296 177.580002 1190.624911 6.0 AMZN 10.0 -4.0
4 Meta Platforms, Inc. Class A META 2.58% 0.027036 498.190002 819.149939 1.0 NaN 0.0 1.0
... ... ... ... ... ... ... ... ... ... ...
496 Whirlpool Corp. WHR 0.01% 0.000105 106.550003 3.175000 0.0 NaN 0.0 0.0
497 Zions Bancorporation N.a. ZION 0.01% 0.000105 39.169998 3.175000 0.0 NaN 0.0 0.0
498 V.F. Corporation VFC 0.01% 0.000105 15.730000 3.175000 0.0 NaN 0.0 0.0
499 Paramount Global Class B PARA 0.01% 0.000105 10.300000 3.175000 0.0 NaN 0.0 0.0
500 NaN GME NaN NaN NaN NaN 0.0 GME 100.0 -100.0

501 rows × 10 columns

  1. Find tickers that we need to sell
movesWithHoldings[movesWithHoldings["SharesDifference"] < 0][["Symbol", "SharesDifference"]]
Symbol SharesDifference
3 AMZN -4.0
104 ZTS -10.0
220 YUM -100.0
500 GME -100.0
  1. Find tickers that we need to buy
movesWithHoldings[movesWithHoldings["SharesDifference"] > 0][["Symbol", "SharesDifference"]]
Symbol SharesDifference
0 MSFT 2.0
1 AAPL 9.0
4 META 1.0
5 GOOGL 4.0
6 BRK.B 1.0
7 GOOG 3.0
10 JPM 2.0
12 V 1.0
13 XOM 2.0
15 JNJ 1.0
17 PG 1.0
19 AMD 1.0
20 MRK 1.0
21 ABBV 1.0
24 CVX 1.0
26 WMT 3.0
28 BAC 4.0
29 KO 2.0
34 ABT 1.0
35 DIS 1.0
36 WFC 2.0
37 CSCO 2.0
39 INTC 2.0
41 ORCL 1.0
44 CMCSA 3.0
47 VZ 3.0
49 UBER 1.0
53 PFE 4.0
57 PM 1.0
62 RTX 1.0
68 T 5.0
74 NEE 1.0
82 C 1.0
85 BMY 1.0
89 SCHW 1.0
91 MDLZ 1.0
92 BSX 1.0
111 CSX 1.0
116 MO 1.0
120 SLB 1.0
133 USB 1.0
150 GM 1.0
153 FCX 1.0
169 F 2.0
238 PCG 1.0
242 KMI 1.0
249 KVUE 1.0
345 WBD 1.0
400 VTRS 1.0
423 AMCR 1.0