Cross-posted to EA Forum

Summary

I tried to estimate the total dollars per year that are aligned with EA, broken down by the four cause areas (Global Health, Animal Welfare, Longtermism, Meta). The idea is to direct my personal funds to whichever area I feel is most underweight. Questions:

  1. What do you all think would be an ideal split amongst the cause areas?

  2. Do you disagree in general with the strategy of allocating my personal donations on the basis of where I expect to differ the most from the community regarding #1?

  3. Do you feel that the numbers I'm using are unrepresentative? I will do my best to address limitations below.

import pandas as pd
import seaborn as sns; sns.set()
#Find the amount allocated to each cause area in 2020
cause_areas = pd.Series(['Global Health', 'Animal Welfare', 'Longtermism', 'Meta'], name='Cause Area')
EA_funds = [3861068.57, 1474852.10, 1761781.21, 1999726.27] #exact numbers from their website
givewell = [105000000, 0, 0, 0] #Total $ 'moved', less Open Phil, projected fwd from 2019
open_phil = [54324458, 22780748, 45903684, 17812170] #custom-calculated using their database
ace = [0, 8000000, 0, 0] #Animal Charity Evaluators, less Open Phil, projected fwd from 2019
df = pd.DataFrame({'GiveWell':givewell, 'Open Phil':open_phil, 'EA Funds':EA_funds, 'ACE':ace}, index=cause_areas)
df.rename_axis('Organization', axis='columns', inplace=True)
proportions = df.sum(axis=1)/df.sum().sum()

#Create graph
sns.set_palette('mako_r')
import matplotlib.pyplot as plt
#custom FuncFormatter
def millions(value, tick_number):
    return '$%1.0fM' % (value*1e-6)

###|Make the plot -- omitted in favour of static .png|###
# fig, ax = plt.subplots()
# df.plot(kind='bar', stacked=True, figsize=(10,6), title='EA Dollars by Cause Area (2020)', rot=0, ax=ax);
# ax.yaxis.set_major_formatter(plt.FuncFormatter(millions))

# #label bars
# rects = ax.patches
# labels = [f'{x*100:.1f}%' for x in proportions]
# for rect, label in zip(rects, labels):
#     # height = rect.get_height()
#     height = 0 #above doesn't work as well for stacked rectangles
#     ax.text(rect.get_x() + rect.get_width() / 2, height + 5, label, ha='center', va='bottom')

Data and limitations

Looking at global funding of EA causes in 2020, the best (relatively quick) estimate I was able to produce is about $263M, split 62/18/12/8 across Global Health, Longtermism, Animal Welfare, and Meta, as shown in Figure 1. I will briefly touch on where this data comes from, and some of its limitations:

  • Open Phil data comes straight from their Grants Database. They categorize slightly differently, so I mapped each grant to the appropriate cause area (and omitted the ~36% of grant dollars which didn't belong in any of the four). One obvious caveat is that these grants do not reflect the choices of small individual donors in the EA community. However, Open Phil accounts for more than half of the total funding in this sample, and its cause area breakdown is both fairly predictable and extremely transparent (see Figure 2).

  • The GiveWell estimate comes from their annual Metrics Report (projected forward from the latest published figures, with the overlap with Open Phil removed). This includes donations made directly through the GiveWell entity, as well as through other orgs explicitly acting on their research.

  • EA Funds provides exact intake figures on their website.

  • The Animal Charity Evaluators estimate is based on their Metrics Report and some discussion here.
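
As a quick sanity check on the headline numbers, here is a minimal sketch in plain Python (no pandas) that re-derives the total, the cause area split, and each organization's share from the same per-organization figures used at the top of this post. The variable names are just for illustration.

```python
# Figures copied from the estimates used in this post (2020, USD).
causes = ['Global Health', 'Animal Welfare', 'Longtermism', 'Meta']
funding = {
    'GiveWell':  [105_000_000, 0, 0, 0],
    'Open Phil': [54_324_458, 22_780_748, 45_903_684, 17_812_170],
    'EA Funds':  [3_861_068.57, 1_474_852.10, 1_761_781.21, 1_999_726.27],
    'ACE':       [0, 8_000_000, 0, 0],
}

total = sum(sum(amounts) for amounts in funding.values())

# Share of the total by cause area (summing across organizations).
cause_share = {
    cause: sum(amounts[i] for amounts in funding.values()) / total
    for i, cause in enumerate(causes)
}

# Share of the total by organization (summing across cause areas).
org_share = {org: sum(amounts) / total for org, amounts in funding.items()}

print(f'Total: ${total / 1e6:.0f}M')
for cause, share in cause_share.items():
    print(f'{cause:15s} {share:6.1%}')
for org, share in org_share.items():
    print(f'{org:15s} {share:6.1%}')
```

Rounded, this gives roughly 62% Global Health, 18% Longtermism, 12% Animal Welfare, and 8% Meta, with Open Phil at a bit over half of the total.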

#Open Phil rolling window allocation
# !curl -O https://raw.githubusercontent.com/tmaule/ea_stuff/main/open_phil_grants_db_raw.csv
grants = pd.read_csv('open_phil_grants_db_raw.csv', usecols=['Organization Name', 'Focus Area', 'Amount', 'Date'], index_col='Date', parse_dates=True).dropna()
#Map their terminology to standard EA cause areas
cause_map = {'Global Health & Development':'Global Health', 'Farm Animal Welfare':'Animal Welfare', 'Global Catastrophic Risks':'Longtermism', 
                'Potential Risks from Advanced Artificial Intelligence': 'Longtermism', 'Biosecurity and Pandemic Preparedness':'Longtermism', 'Other areas':'Meta'}
#Label each grant with a cause area; unmapped focus areas get 'X' (not an EA cause area)
grants['Cause Area'] = grants['Focus Area'].map(cause_map).fillna('X')
#Supporting GW belongs in Meta, not Global Health
grants.loc[grants['Organization Name'] == 'GiveWell', 'Cause Area'] = 'Meta'

#drop non-EA grants ('X'), as well as redundant column
grants = grants[grants['Cause Area'] != 'X'].drop('Focus Area', axis=1)

#construct DataFrame of running-total grant dollars to each cause area, recorded monthly
month_idx = pd.date_range('2015-01-01', '2021-02-01', freq='M')
running_totals = pd.DataFrame(index=month_idx)
by_cause = grants.groupby('Cause Area')
for cause, subframe in by_cause:
    running_totals[cause] = [subframe[d > subframe.index].Amount.sum() for d in running_totals.index]

#convert this into a trailing 12mo rolling window using shift()
trailing_dollars = running_totals - running_totals.shift(12)
trailing_dollars.dropna(inplace=True)
#keep 2017 onward and re-order cols to match cause_areas
trailing_dollars = trailing_dollars.loc[trailing_dollars.index.year > 2016, cause_areas.values]

#plot Open Phil trailing dollars
# sns.set_palette('Set2')
# ax = trailing_dollars.plot(title='Open Phil Dollars Allocated by Cause Area (Trailing 12mo)', figsize=(10,5));
# ax.yaxis.set_major_formatter(plt.FuncFormatter(millions));
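
The trailing 12-month figures are built by subtracting running totals (the total now minus the total 12 months earlier) rather than by summing each window directly. A minimal sketch of why the two agree, using made-up monthly numbers and a 3-month window to keep it short (none of this is Open Phil data):

```python
from itertools import accumulate

# Hypothetical monthly grant totals (made-up numbers, for illustration only).
monthly = [5, 0, 12, 7, 3, 9, 0, 4]
window = 3  # the post uses 12; 3 keeps the toy example readable

# Running total at the end of each month.
running = list(accumulate(monthly))

# Trailing-window sum = running total now minus running total `window` months ago.
# The first `window` months have no earlier value to subtract, so they are skipped
# (the pandas version yields NaN there and removes those rows with dropna()).
trailing = [running[i] - running[i - window] for i in range(window, len(running))]

# Direct computation for comparison: add up the last `window` months explicitly.
direct = [sum(monthly[i - window + 1 : i + 1]) for i in range(window, len(monthly))]

print(trailing)
print(direct)
```

Both lists come out identical, which is the property the `shift(12)` subtraction relies on.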

Limitations:

  • Using cause areas as bins is a useful model, but in practice there is much more nuance. My guess is that there is a lot of funding adjacent to the 'Global Health' EA bin (e.g. the Bill & Melinda Gates Foundation), but hardly any adjacent to 'Longtermism'.

  • Each bin itself is an imperfect estimate. It would be great if there were some comprehensive source of movement-level statistics.

  • Targeting a percentage split by cause area is overly simplistic. Ideally there would be some comparison of current, relative opportunity, though a GiveWell-style comparison across bins would be very difficult. Furthermore, if total funding were to increase 10x, I wouldn't expect each bin to scale proportionally (e.g. Meta would probably be a much smaller share).

Thoughts

As an individual donor, I am somewhat opposed to the idea of impact diversification, or to giving to more than one charity for that matter. However, on a macro level I certainly value each of these cause areas, and would expect each to consist of many distinct interventions. So where does that leave me? Should I just pick the one charity that I think is highest EV of all? Should I just split to reflect my uncertainty as to which cause is most effective? I tend to think of each cause area as more or less orthogonal, and very difficult to cross-compare. The way I see it, the best I can do is try to identify the most underfunded space on the margin, and give there.

I am still pretty unsure of how to assign relative value to these four bins, but my tentative opinion is that if I were starting from scratch with $263M, I would want to split something like 40/30/20/10 to Global Health, Longtermism, Animal Welfare, and Meta, respectively. This closely resembles the recent choices made by Open Phil (Figure 3); however, GiveWell adds a lot of weight to Global Health. Based on the current splits estimated in Figure 1, unless convinced otherwise, I will likely do all of my personal giving via the Animal Welfare and Longtermism EA Funds this year.
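
To make the "most underweight" comparison concrete, here is a rough sketch that lines up the estimated 62/18/12/8 split against my tentative 40/30/20/10 target. It reads the estimated split in the order Global Health / Longtermism / Animal Welfare / Meta, which is what the per-organization figures at the top of this post sum to. The rounded percentages and the $263M total are the only inputs, so treat the dollar gaps as ballpark figures.

```python
# Estimated current split, rounded (percent of total).
actual = {'Global Health': 62, 'Longtermism': 18, 'Animal Welfare': 12, 'Meta': 8}
# Tentative ideal split (percent of total).
target = {'Global Health': 40, 'Longtermism': 30, 'Animal Welfare': 20, 'Meta': 10}

total_musd = 263  # estimated total 2020 funding, in millions of USD

# Positive gap = underweight relative to the target; negative = overweight.
gap_pts = {cause: target[cause] - actual[cause] for cause in target}

# Rank cause areas from most underweight to most overweight.
ranked = sorted(gap_pts.items(), key=lambda kv: kv[1], reverse=True)
for cause, pts in ranked:
    gap_musd = pts / 100 * total_musd
    print(f'{cause:15s} {pts:+3d} pts  ({gap_musd:+.0f}M USD)')
```

Under these inputs, Longtermism and Animal Welfare come out as the two most underweight bins, and Global Health as the only overweight one.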

# Plot Open Phil trailing as a proportion
#divide each row by its row total so the four cause areas sum to 1 in every month
trailing_proportion = trailing_dollars.div(trailing_dollars.sum(axis=1), axis=0)
# trailing_proportion.plot(title='Open Phil Proportion Allocated by Cause Area (Trailing 12mo)', figsize=(10,5));