
The Soup Principle

How election forecasting models work (and why they don’t)

G. Elliott Morris | December 3, 2021

1 / 50

2 / 50

3 / 50

Why polls? Why forecasting?

1. Journalism

1a. Attention

1b. Truth

2. Methods training

3. It's fun!

4 / 50

So let's talk about...

1. How polls work

2. How forecasts work

3. And why they fail

5 / 50
6 / 50

The soup principle and the polls

7 / 50

The first polls

Straw polls -> real polls

8 / 50

The first ("scientific") polls

- Conducted face-to-face

- Used demographic quotas for representativeness

  • Race, gender, age, geography, class

- Beat straw polls in accuracy (1936)

  • By shrinking bias from nonresponse

- But fell short of true survey science (1948)

9 / 50

Polls 2.0

- SSRC (Social Science Research Council) says: area sampling

10 / 50

Polls 2.0

- SSRC says: area sampling

- Gallup implements some partisan controls

  • Strata are groups of precincts by 1948 vote choice

- Use rough quotas within geography

- Preserve interviewer bias

11 / 50

Polls 3.0

Technological change -> better methods

12 / 50

Polls 3.0

- True random sampling (for people with phones)

- Response rates above 70 or 80%

- Rarer instances of severe nonresponse bias

- Cheaper to conduct, so news orgs start polling (CBS, NYT)

13 / 50

Technological change -> worse methods?

14 / 50

15 / 50

The soup principle (in theory)

Image credit: Pew Research Center

16 / 50
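The metaphor here is that a spoonful from a well-stirred pot tastes like the whole pot, and it can be checked by simulation. A minimal Python sketch (all numbers hypothetical): the error of a simple random sample depends on the size of the spoonful (the sample), not the size of the pot (the population).

```python
import numpy as np

rng = np.random.default_rng(0)

def poll_error(pop_size, n_sample, p_true=0.52, sims=300):
    """Average absolute error of a simple random sample of n_sample
    voters drawn from a population of pop_size with true support p_true."""
    pop = rng.random(pop_size) < p_true
    errs = [abs(rng.choice(pop, n_sample, replace=False).mean() - pop.mean())
            for _ in range(sims)]
    return float(np.mean(errs))

# Same sample size, pots of very different sizes: nearly identical error
small_pot = poll_error(pop_size=10_000, n_sample=1_000)
big_pot = poll_error(pop_size=100_000, n_sample=1_000)
```

Both errors come out around a point and a half, which is why a 1,000-person poll works about as well for a city as for a country.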

But what if the people you sample don't represent the population?

- People could be very dissimilar by group, meaning small deviations in sample demographics cause big errors (sampling error)

- Or the people who respond to the poll could be systematically different from the people who don't (response error)

- Or your list of potential respondents could be missing people (coverage error)

 

 

*Polls can also go wrong if they have bad question wording, a fourth type of survey error called "measurement error"

17 / 50
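Of these, response error is the most dangerous, because unlike sampling error it does not shrink as the sample grows. A minimal Python simulation with hypothetical numbers: if one side's voters are simply more willing to pick up the phone, the respondents stop looking like the population.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical electorate: 100k voters, about 52% Democratic
N = 100_000
dem = rng.random(N) < 0.52
true_share = dem.mean()

# Response error: suppose Democrats are twice as likely to answer the phone
response_rate = np.where(dem, 0.10, 0.05)
responded = rng.random(N) < response_rate

# The "poll" of respondents badly overstates Democratic support
est = dem[responded].mean()
bias = est - true_share
```

Here the poll of several thousand respondents is off by double digits, and interviewing more people the same way would not help.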

The soup principle (in practice)

18 / 50

Polls today are not soup

 

- Declining response rates + Internet = innovations in polling online, but they don't use random sampling

- And even traditional RDD polls don't yield a true random sample, since response rates are so low and nonresponse is nonrandom

19 / 50

So, to satisfy the soup principle...

Pollsters use statistical algorithms to ensure their samples match the population on different demographic targets

  • Race, age, gender, and region are most common
20 / 50

Option A: Weighting

  • Raking, calibration, propensity scoring, etc.

21 / 50
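Raking (iterative proportional fitting) fits in a few lines. The sketch below uses made-up respondents and targets; real pollsters would reach for a package like R's `survey`, but the core is just repeated rescaling of the weights until every margin matches.

```python
import numpy as np

def rake(weights, sample, targets, iters=50):
    """Iterative proportional fitting ("raking"): repeatedly rescale weights
    so each variable's weighted distribution matches its population target."""
    w = weights.astype(float).copy()
    for _ in range(iters):
        for var, target in targets.items():
            total = w.sum()
            for level, share in target.items():
                mask = sample[var] == level
                w[mask] *= share * total / w[mask].sum()
    return w

# Tiny hypothetical sample: 6 respondents; men and older voters overrepresented
sample = {
    "gender": np.array(["m", "m", "m", "m", "f", "f"]),
    "age":    np.array(["young", "old", "old", "old", "young", "old"]),
}
targets = {
    "gender": {"m": 0.49, "f": 0.51},
    "age":    {"young": 0.40, "old": 0.60},
}
w = rake(np.ones(6), sample, targets)
```

After raking, the weighted share of men is 49% and of young voters 40%, matching the targets even though each variable's adjustment perturbs the other's.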

Option B: Modeling

22 / 50
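The slide does not name a specific modeling method, but a common model-based alternative to weighting is regression plus poststratification: fit a response model on the (possibly unrepresentative) sample, predict every census cell, and average the predictions by known population shares. A minimal sketch on synthetic data (all names and numbers hypothetical):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# Hypothetical respondents: two binary demographics (say, college and urban)
X = rng.integers(0, 2, size=(500, 2))
y = (rng.random(500) < 0.4 + 0.2 * X[:, 0]).astype(int)  # vote depends on col 0

# Fit a response model on the sample
model = LogisticRegression().fit(X, y)

# Poststratify: predict each census cell, then average by population shares
cells = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
pop_share = np.array([0.30, 0.20, 0.25, 0.25])
estimate = float(model.predict_proba(cells)[:, 1] @ pop_share)
```

The estimate depends on the population cell shares, not the sample's composition, which is the whole point of the adjustment.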

But the traditional adjustments aren't enough...

23 / 50

2016: Education weighting

24 / 50

2020: Partisan nonresponse

25 / 50

But the traditional adjustments aren't enough...

  • Race, age, gender, and region are most common
  • Education, interactions, partisanship are harder, but increasingly necessary
26 / 50

The future of polling?

1. More weighting variables

2. More online and off-phone data collection (SMS, mail)

3. Mixed samples

All in the pursuit of getting representative (and politically balanced) samples before the adjustment stage

27 / 50

Big lesson 1:

Violating the soup principle = unrepresentative polls

Big lesson 2:

... so what does it do to election forecasting models?

28 / 50

The soup principle and election forecasts

29 / 50

2020 presidential election forecast

30 / 50

What goes into the model?

1. National economic + political fundamentals

2. Decompose into state-level priors

3. Polls

Uncertainty is propagated throughout the models, incorporated via MCMC sampling in step 3.

31 / 50

National fundamentals?

i) Index of economic growth (1940 - 2016)

  • eight different variables, each scaled as standard deviations from average annual growth

ii) Presidential approval (1948 - 2016)

iii) Polarization (1948 - 2016)

  • measured as the share of swing voters in the electorate, per the ANES, and interacted with economic growth

iv) Whether an incumbent is on the ballot

32 / 50

33 / 50

34 / 50

35 / 50

Modeling the fundamentals

Model formula:

vote ~ incumbent_running:economic_growth:polarization + approval

Training

Model trained on 1948-2016 using elastic net regression with leave-one-out cross-validation

RMSE = 2.6 percentage points on two-party Democratic vote share

36 / 50
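The actual model is fit in R; purely as an illustration of the training scheme described above (elastic net, leave-one-out cross-validation, scored by RMSE on held-out elections), here is a Python sketch on synthetic stand-in data:

```python
import numpy as np
from sklearn.linear_model import ElasticNet, ElasticNetCV
from sklearn.model_selection import LeaveOneOut, cross_val_predict

# Hypothetical stand-in data: one row per election, 1948-2016 (18 cycles)
rng = np.random.default_rng(4)
X = rng.normal(size=(18, 3))  # stand-ins for the fundamentals covariates
y = 50 + X @ np.array([3.0, 1.5, 0.8]) + rng.normal(scale=1.5, size=18)

# Choose the regularization strength by leave-one-out cross-validation
cv_model = ElasticNetCV(l1_ratio=0.5, cv=LeaveOneOut()).fit(X, y)

# LOO predictions at the chosen penalty give an honest out-of-sample RMSE
pred = cross_val_predict(ElasticNet(alpha=cv_model.alpha_, l1_ratio=0.5),
                         X, y, cv=LeaveOneOut())
rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
```

With only ~18 elections to learn from, the regularization and the leave-one-out scoring are doing most of the work of keeping the model honest.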

The model is a federalist

i) Train a model to predict the Democratic share of the vote in a state relative to the national vote, 1948-2016

  • Variables are: lean in the last election, lean two elections ago, home state effects * state size, conditional on the national vote in the state

ii) Use the covariates to make predictions for 2020, conditional on the national fundamentals prediction for every day

iii) Simulate state-level outcomes to extract a mean and standard deviation

  • Propagates uncertainty from both the LOOCV RMSE of the national model and the state-level model
37 / 50
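The simulation step can be sketched directly: draw national outcomes around the fundamentals prediction, add a state's predicted lean on top, and read the mean, spread, and win probability off the draws. The 2.6-point national RMSE comes from the previous slide; the state numbers below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

# National two-party Dem share prediction with its LOOCV RMSE (slide 36),
# plus one state's predicted lean relative to the nation (hypothetical)
nat_pred, nat_rmse = 52.0, 2.6
lean_pred, lean_rmse = -3.0, 2.0

# Draw a national outcome, then the state's lean on top of it, so both
# layers of model error propagate into the state-level forecast
nat = rng.normal(nat_pred, nat_rmse, 10_000)
state = nat + rng.normal(lean_pred, lean_rmse, 10_000)

state_mean, state_sd = state.mean(), state.std()
win_prob = (state > 50.0).mean()
```

Note that the state's standard deviation is wider than either input alone, since the two error sources add in quadrature.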

That's the baseline

Now, we add the polls

38 / 50

Just a trend through points...

Can be done with any number of packages for R or other statistical languages

39 / 50


(...but with some fancy extra stuff)

// Latent state-level support: a random walk anchored at election day T
// and filled in backward through time
mu_b[:, T] = cholesky_ss_cov_mu_b_T * raw_mu_b_T + mu_b_prior;
for (i in 1:(T - 1))
  mu_b[:, T - i] = cholesky_ss_cov_mu_b_walk * raw_mu_b[:, T - i] + mu_b[:, T + 1 - i];
national_mu_b_average = transpose(mu_b) * state_weights;

// Non-centered parameterizations of the poll-level bias terms
mu_c = raw_mu_c * sigma_c;        // pollster "house effects"
mu_m = raw_mu_m * sigma_m;        // poll-mode effects
mu_pop = raw_mu_pop * sigma_pop;  // poll-population effects

// AR(1) bias term for polls not adjusted for partisan nonresponse
e_bias[1] = raw_e_bias[1] * sigma_e_bias;
sigma_rho = sqrt(1 - square(rho_e_bias)) * sigma_e_bias;
for (t in 2:T)
  e_bias[t] = mu_e_bias + rho_e_bias * (e_bias[t - 1] - mu_e_bias) + raw_e_bias[t] * sigma_rho;

//*** fill pi_democrat: expected Democratic share for each state poll
for (i in 1:N_state_polls) {
  logit_pi_democrat_state[i] =
    mu_b[state[i], day_state[i]] +                 // latent support that day
    mu_c[poll_state[i]] +                          // house effect
    mu_m[poll_mode_state[i]] +                     // mode effect
    mu_pop[poll_pop_state[i]] +                    // population effect
    unadjusted_state[i] * e_bias[day_state[i]] +   // partisan-nonresponse bias
    raw_measure_noise_state[i] * sigma_measure_noise_state +
    polling_bias[state[i]];                        // residual state poll bias
}
40 / 50

Poll-level model

i. Latent state-level vote shares evolve as a random walk over time

  • Pooling toward the state-level fundamentals more as we are further out from election day

ii. Polls are observations with measurement error that are debiased on the basis of:

  • Pollster firm (so-called "house effects")
  • Poll mode
  • Poll population

iii. Correcting for partisan non-response

  • Whether a pollster weights by party registration or past vote
  • (Incorporated as a residual AR process)
41 / 50
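A toy version of point i, sketched in Python rather than Stan (all parameters hypothetical): latent daily support follows a random walk, polls are noisy observations of it, and the final day is anchored to a fundamentals prior. For Gaussian terms the posterior mean solves a small penalized least-squares system.

```python
import numpy as np

def poll_average(days, polls, prior, tau=0.25, sigma=2.0, prior_sd=4.0):
    """Posterior mean of a daily random-walk poll average (step sd tau),
    with polls as noisy observations (sd sigma) and the final day anchored
    to a fundamentals prior (sd prior_sd). Solves A @ mu = b in closed form."""
    T = max(days) + 1
    A = np.zeros((T, T))
    b = np.zeros(T)
    w_rw = 1.0 / tau**2
    for t in range(T - 1):          # random-walk smoothness penalty
        A[t, t] += w_rw; A[t + 1, t + 1] += w_rw
        A[t, t + 1] -= w_rw; A[t + 1, t] -= w_rw
    for d, yv in zip(days, polls):  # each poll pulls its day toward its value
        A[d, d] += 1.0 / sigma**2
        b[d] += yv / sigma**2
    A[-1, -1] += 1.0 / prior_sd**2  # fundamentals prior anchors the last day
    b[-1] += prior / prior_sd**2
    return np.linalg.solve(A, b)

# Three hypothetical polls over ten days, with a 50% fundamentals prior
trend = poll_average(days=[0, 3, 9], polls=[52.0, 51.0, 53.0], prior=50.0)
```

Shrinking tau pools the trend harder toward a flat line; widening prior_sd lets the polls dominate the fundamentals, which mirrors how the real model shifts weight toward the polls as election day nears.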

Debiased predictions

Notable improvements from controlling for partisan nonresponse and other weighting issues

42 / 50

Debiased predictions

Notable improvements* from controlling for partisan nonresponse and other weighting issues

*In 2016, but not 2020

43 / 50

Back to the soup....

44 / 50

45 / 50

If polls are biased, the aggregation model cannot remove the biases. It can only explore them

Yet this means violation of random sampling also violates the statistical theory underpinning election models!

And it means we must add ways to incorporate the extra uncertainty from bias. (Especially if partisan nonresponse is getting worse.)

46 / 50

Some ideas...

1. Add extra model error for uncertainty. But how?

2. Attempt to debias polls (but that adds uncertainty too)

3. Only use polls we trust? (But how do we measure trust? Not necessarily a robust solution.)

47 / 50

4. Communication: Show forecasts for different scenarios of poll error

Ultimately, we're still figuring out the answer....

48 / 50

49 / 50

Thank you!

Book comes out July 12, 2022



Website: gelliottmorris.com

Twitter: @gelliottmorris

Questions?


These slides were made using the xaringan package for R. They are available online at https://www.gelliottmorris.com/slides/

50 / 50
