
The Soup Principle

How election forecasting models work (and why they don’t)

G. Elliott Morris | December 3, 2021

1 / 50

2 / 50

3 / 50

Why polls? Why forecasting?

1. Journalism

1a. Attention

1b. Truth

2. Methods training

3. It's fun!

4 / 50

So let's talk about...

1. How polls work

2. How forecasts work

3. And why they fail

5 / 50
6 / 50

The soup principle and the polls

7 / 50

The first polls

Straw polls -> real polls

8 / 50

The first ("scientific") polls

- Conducted face-to-face

- Used demographic quotas for representativeness

  • Race, gender, age, geography, class

- Beat straw polls in accuracy (1936)

  • By shrinking bias from nonresponse

- But fell short of true survey science (1948)

9 / 50

Polls 2.0

- SSRC (Social Science Research Council) says: area sampling

10 / 50

Polls 2.0

- SSRC says: area sampling

- Gallup implements some partisan controls

  • Strata are groups of precincts by 1948 vote choice

- Use rough quotas within geography

- Preserve interviewer bias

11 / 50

Polls 3.0

Technological change -> better methods

12 / 50

Polls 3.0

- True random sampling (for people with phones)

- Response rates above 70 or 80%

- Rarer instances of severe nonresponse bias

- Cheaper to conduct, so news orgs start polling (CBS, NYT)

13 / 50

Technological change -> worse methods?

14 / 50

15 / 50

The soup principle (in theory)

Image credit: Pew Research Center

16 / 50
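The metaphor here is that a spoonful from a well-stirred pot tastes like the whole pot, and it can be checked by simulation. A minimal Python sketch (all numbers hypothetical): the error of a simple random sample depends on the size of the spoonful (the sample), not the size of the pot (the population).

```python
import numpy as np

rng = np.random.default_rng(0)

def poll_error(pop_size, n_sample, p_true=0.52, sims=300):
    """Average absolute error of a simple random sample of n_sample
    voters drawn from a population of pop_size with true support p_true."""
    pop = rng.random(pop_size) < p_true
    errs = [abs(rng.choice(pop, n_sample, replace=False).mean() - pop.mean())
            for _ in range(sims)]
    return float(np.mean(errs))

# Same sample size, pots of very different sizes: nearly identical error
small_pot = poll_error(pop_size=10_000, n_sample=1_000)
big_pot = poll_error(pop_size=100_000, n_sample=1_000)
```

Both errors come out around a point and a half, which is why a 1,000-person poll works about as well for a city as for a country.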

But what if the people you sample don't represent the population?

- People could be very dissimilar by group, meaning small deviations in sample demographics cause big errors (sampling error)

- Or the people who respond to the poll could be systematically different from the people who don't (response error)

- Or your list of potential respondents could be missing people (coverage error)

 

 

*Polls can also go wrong if they have bad question wording, a fourth type of survey error called "measurement error"

17 / 50
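Of these, response error is the most dangerous, because unlike sampling error it does not shrink as the sample grows. A minimal Python simulation with hypothetical numbers: if one side's voters are simply more willing to pick up the phone, the respondents stop looking like the population.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical electorate: 100k voters, about 52% Democratic
N = 100_000
dem = rng.random(N) < 0.52
true_share = dem.mean()

# Response error: suppose Democrats are twice as likely to answer the phone
response_rate = np.where(dem, 0.10, 0.05)
responded = rng.random(N) < response_rate

# The "poll" of respondents badly overstates Democratic support
est = dem[responded].mean()
bias = est - true_share
```

Here the poll of several thousand respondents is off by double digits, and interviewing more people the same way would not help.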

The soup principle (in practice)

18 / 50

Polls today are not soup

 

- Declining response rates + Internet = innovations in polling online, but they don't use random sampling

- And even traditional RDD polls don't yield a true random sample, since response rates are so low and nonresponse is nonrandom

19 / 50

So, to satisfy the soup principle...

Pollsters use statistical algorithms to ensure their samples match the population on different demographic targets

  • Race, age, gender, and region are most common
20 / 50

Option A: Weighting

  • Raking, calibration, propensity scoring, etc.

21 / 50
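Raking (iterative proportional fitting) fits in a few lines. The sketch below uses made-up respondents and targets; real pollsters would reach for a package like R's `survey`, but the core is just repeated rescaling of the weights until every margin matches.

```python
import numpy as np

def rake(weights, sample, targets, iters=50):
    """Iterative proportional fitting ("raking"): repeatedly rescale weights
    so each variable's weighted distribution matches its population target."""
    w = weights.astype(float).copy()
    for _ in range(iters):
        for var, target in targets.items():
            total = w.sum()
            for level, share in target.items():
                mask = sample[var] == level
                w[mask] *= share * total / w[mask].sum()
    return w

# Tiny hypothetical sample: 6 respondents; men and older voters overrepresented
sample = {
    "gender": np.array(["m", "m", "m", "m", "f", "f"]),
    "age":    np.array(["young", "old", "old", "old", "young", "old"]),
}
targets = {
    "gender": {"m": 0.49, "f": 0.51},
    "age":    {"young": 0.40, "old": 0.60},
}
w = rake(np.ones(6), sample, targets)
```

After raking, the weighted share of men is 49% and of young voters 40%, matching the targets even though each variable's adjustment perturbs the other's.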

Option B: Modeling

22 / 50
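The slide does not name a specific modeling method, but a common model-based alternative to weighting is regression plus poststratification: fit a response model on the (possibly unrepresentative) sample, predict every census cell, and average the predictions by known population shares. A minimal sketch on synthetic data (all names and numbers hypothetical):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# Hypothetical respondents: two binary demographics (say, college and urban)
X = rng.integers(0, 2, size=(500, 2))
y = (rng.random(500) < 0.4 + 0.2 * X[:, 0]).astype(int)  # vote depends on col 0

# Fit a response model on the sample
model = LogisticRegression().fit(X, y)

# Poststratify: predict each census cell, then average by population shares
cells = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
pop_share = np.array([0.30, 0.20, 0.25, 0.25])
estimate = float(model.predict_proba(cells)[:, 1] @ pop_share)
```

The estimate depends on the population cell shares, not the sample's composition, which is the whole point of the adjustment.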

But the traditional adjustments aren't enough...

23 / 50

2016: Education weighting

24 / 50

2020: Partisan nonresponse

25 / 50

But the traditional adjustments aren't enough...

  • Race, age, gender, and region are most common
  • Education, interactions, partisanship are harder, but increasingly necessary
26 / 50

The future of polling?

1. More weighting variables

2. More online and off-phone data collection (SMS, mail)

3. Mixed samples

All in the pursuit of getting representative (and politically balanced) samples before the adjustment stage

27 / 50

Big lesson 1:

Violating the soup principle = unrepresentative polls

Big lesson 2:

... so what does it do to election forecasting models?

28 / 50

The soup principle and election forecasts

29 / 50

2020 presidential election forecast

30 / 50

What goes into the model?

1. National economic + political fundamentals

2. Decompose into state-level priors

3. Polls

Uncertainty is propagated throughout the models, incorporated via MCMC sampling in step 3.

31 / 50

National fundamentals?

i) Index of economic growth (1940 - 2016)

  • eight different variables, each scaled as standard deviations from average annual growth

ii) Presidential approval (1948 - 2016)

iii) Polarization (1948 - 2016)

  • measured as the share of swing voters in the electorate, per the ANES, and interacted with economic growth

iv) Whether an incumbent is on the ballot

32 / 50

33 / 50

34 / 50

35 / 50

Modeling the fundamentals

Model formula:

vote ~ incumbent_running:economic_growth:polarization + approval

Training

Model trained on 1948-2016 using elastic net regression with leave-one-out cross-validation

RMSE = 2.6 percentage points on two-party Democratic vote share

36 / 50
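The actual model is fit in R; purely as an illustration of the training scheme described above (elastic net, leave-one-out cross-validation, scored by RMSE on held-out elections), here is a Python sketch on synthetic stand-in data:

```python
import numpy as np
from sklearn.linear_model import ElasticNet, ElasticNetCV
from sklearn.model_selection import LeaveOneOut, cross_val_predict

# Hypothetical stand-in data: one row per election, 1948-2016 (18 cycles)
rng = np.random.default_rng(4)
X = rng.normal(size=(18, 3))  # stand-ins for the fundamentals covariates
y = 50 + X @ np.array([3.0, 1.5, 0.8]) + rng.normal(scale=1.5, size=18)

# Choose the regularization strength by leave-one-out cross-validation
cv_model = ElasticNetCV(l1_ratio=0.5, cv=LeaveOneOut()).fit(X, y)

# LOO predictions at the chosen penalty give an honest out-of-sample RMSE
pred = cross_val_predict(ElasticNet(alpha=cv_model.alpha_, l1_ratio=0.5),
                         X, y, cv=LeaveOneOut())
rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
```

With only ~18 elections to learn from, the regularization and the leave-one-out scoring are doing most of the work of keeping the model honest.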

The model is a federalist

i) Train a model to predict the Democratic share of the vote in a state relative to the national vote, 1948-2016

  • Variables are: lean in the last election, lean two elections ago, home state effects * state size, conditional on the national vote in the state

ii) Use the covariates to make predictions for 2020, conditional on the national fundamentals prediction for every day

iii) Simulate state-level outcomes to extract a mean and standard deviation

  • Propagates uncertainty from both the LOOCV RMSE of the national model and the state-level model
37 / 50
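The simulation step can be sketched directly: draw national outcomes around the fundamentals prediction, add a state's predicted lean on top, and read the mean, spread, and win probability off the draws. The 2.6-point national RMSE comes from the previous slide; the state numbers below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# National two-party Dem share prediction with its LOOCV RMSE (slide 36),
# plus one state's predicted lean relative to the nation (hypothetical)
nat_pred, nat_rmse = 52.0, 2.6
lean_pred, lean_rmse = -3.0, 2.0

# Draw a national outcome, then the state's lean on top of it, so both
# layers of model error propagate into the state-level forecast
nat = rng.normal(nat_pred, nat_rmse, 10_000)
state = nat + rng.normal(lean_pred, lean_rmse, 10_000)

state_mean, state_sd = state.mean(), state.std()
win_prob = (state > 50.0).mean()
```

Note that the state's standard deviation is wider than either input alone, since the two error sources add in quadrature.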

That's the baseline

Now, we add the polls

38 / 50

Just a trend through points...

Can be done with any number of packages for R or other statistical languages

39 / 50


(...but with some fancy extra stuff)

// Latent state-level support: a random walk anchored at election day T
// and filled in backward through time
mu_b[:, T] = cholesky_ss_cov_mu_b_T * raw_mu_b_T + mu_b_prior;
for (i in 1:(T - 1))
  mu_b[:, T - i] = cholesky_ss_cov_mu_b_walk * raw_mu_b[:, T - i] + mu_b[:, T + 1 - i];
national_mu_b_average = transpose(mu_b) * state_weights;

// Non-centered parameterizations of the poll-level bias terms
mu_c = raw_mu_c * sigma_c;        // pollster "house effects"
mu_m = raw_mu_m * sigma_m;        // poll-mode effects
mu_pop = raw_mu_pop * sigma_pop;  // poll-population effects

// AR(1) bias term for polls not adjusted for partisan nonresponse
e_bias[1] = raw_e_bias[1] * sigma_e_bias;
sigma_rho = sqrt(1 - square(rho_e_bias)) * sigma_e_bias;
for (t in 2:T)
  e_bias[t] = mu_e_bias + rho_e_bias * (e_bias[t - 1] - mu_e_bias) + raw_e_bias[t] * sigma_rho;

//*** fill pi_democrat: expected Democratic share for each state poll
for (i in 1:N_state_polls) {
  logit_pi_democrat_state[i] =
    mu_b[state[i], day_state[i]] +                 // latent support that day
    mu_c[poll_state[i]] +                          // house effect
    mu_m[poll_mode_state[i]] +                     // mode effect
    mu_pop[poll_pop_state[i]] +                    // population effect
    unadjusted_state[i] * e_bias[day_state[i]] +   // partisan-nonresponse bias
    raw_measure_noise_state[i] * sigma_measure_noise_state +
    polling_bias[state[i]];                        // residual state poll bias
}
40 / 50

Poll-level model

i. Latent state-level vote shares evolve as a random walk over time

  • Pooling toward the state-level fundamentals more as we are further out from election day

ii. Polls are observations with measurement error that are debiased on the basis of:

  • Pollster firm (so-called "house effects")
  • Poll mode
  • Poll population

iii. Correcting for partisan non-response

  • Whether a pollster weights by party registration or past vote
  • (Incorporated as a residual AR process)
41 / 50
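A toy version of point i, sketched in Python rather than Stan (all parameters hypothetical): latent daily support follows a random walk, polls are noisy observations of it, and the final day is anchored to a fundamentals prior. For Gaussian terms the posterior mean solves a small penalized least-squares system.

```python
import numpy as np

def poll_average(days, polls, prior, tau=0.25, sigma=2.0, prior_sd=4.0):
    """Posterior mean of a daily random-walk poll average (step sd tau),
    with polls as noisy observations (sd sigma) and the final day anchored
    to a fundamentals prior (sd prior_sd). Solves A @ mu = b in closed form."""
    T = max(days) + 1
    A = np.zeros((T, T))
    b = np.zeros(T)
    w_rw = 1.0 / tau**2
    for t in range(T - 1):          # random-walk smoothness penalty
        A[t, t] += w_rw; A[t + 1, t + 1] += w_rw
        A[t, t + 1] -= w_rw; A[t + 1, t] -= w_rw
    for d, yv in zip(days, polls):  # each poll pulls its day toward its value
        A[d, d] += 1.0 / sigma**2
        b[d] += yv / sigma**2
    A[-1, -1] += 1.0 / prior_sd**2  # fundamentals prior anchors the last day
    b[-1] += prior / prior_sd**2
    return np.linalg.solve(A, b)

# Three hypothetical polls over ten days, with a 50% fundamentals prior
trend = poll_average(days=[0, 3, 9], polls=[52.0, 51.0, 53.0], prior=50.0)
```

Shrinking tau pools the trend harder toward a flat line; widening prior_sd lets the polls dominate the fundamentals, which mirrors how the real model shifts weight toward the polls as election day nears.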

Debiased predictions

Notable improvements from controlling for partisan nonresponse and other weighting issues

42 / 50

Debiased predictions

Notable improvements* from controlling for partisan nonresponse and other weighting issues

*In 2016, but not 2020

43 / 50

Back to the soup....

44 / 50

45 / 50

If polls are biased, the aggregation model cannot remove the biases. It can only explore them

Yet this means violation of random sampling also violates the statistical theory underpinning election models!

And it means we must add ways to incorporate the extra uncertainty from bias. (Especially if partisan nonresponse is getting worse.)

46 / 50

Some ideas...

1. Add extra model error for uncertainty. But how?

2. Attempt to debias polls (but that adds uncertainty too)

3. Only use polls we trust? (But how do we measure trust? Not necessarily a robust solution.)

47 / 50

4. Communication: Show forecasts for different scenarios of poll error

Ultimately, we're still figuring out the answer....

48 / 50

49 / 50

Thank you!

Book comes out July 12, 2022



Website: gelliottmorris.com

Twitter: @gelliottmorris

Questions?


These slides were made using the xaringan package for R. They are available online at https://www.gelliottmorris.com/slides/

50 / 50
