
What’s the matter with polling?

From Strength in Numbers: How Polls Work + Why We Need Them

G. Elliott Morris | Oct 18 2022 | Pittsburgh, PA

1 / 61


The "soup principle"

4 / 61

The first polls

5 / 61

"Straw" polls

6 / 61


The first ("scientific") polls

- Conducted face-to-face

9 / 61

The first ("scientific") polls

- Conducted face-to-face

- Used demographic quotas for representativeness

  • Race, gender, age, geography
9 / 61

The first ("scientific") polls

- Conducted face-to-face

- Used demographic quotas for representativeness

  • Race, gender, age, geography

- Beat straw polls in accuracy (1936)

  • By shrinking bias from demographic nonresponse
9 / 61

The first ("scientific") polls

- Conducted face-to-face

- Used demographic quotas for representativeness

  • Race, gender, age, geography

- Beat straw polls in accuracy (1936)

  • By shrinking bias from demographic nonresponse
9 / 61

The first ("scientific") polls

- But fell short of true survey science (1948)

10 / 61

Polls 2.0

- SSRC says: area sampling

- Gallup implements some partisan controls

  • Strata are groups of precincts by 1948 vote choice

- Use rough quotas within geography

- But, preserve interviewer bias

12 / 61

Polls 3.0

Technological change -> better methods

13 / 61

Polls 3.0

- 1970s: true random sampling (for people with phones)

- Response rates above 70-80%

- Rarer instances of severe nonresponse bias

- Cheaper to conduct = many news orgs poll (CBS, NYT)

14 / 61

Source: American Association of Public Opinion Research

15 / 61

The soup principle: satisfied?

Source: Pew Research Center

16 / 61

The soup principle: satisfied?

1. RDD polls are representative (at high response)

2. Availability of many different surveys allows for an extra layer of aggregation to control for choices made by individual researchers

17 / 61

= perfect polls forever,

...right?

18 / 61

Technological change -> worse methods?

Source: Pew Research Center

19 / 61

Polarized voting -> harder sampling

Source: Webster & Abramowitz 2017

20 / 61

But what if the people you sample don't represent the population?

- People could be very dissimilar by group, meaning small deviations in sample demographics cause big errors (sampling error)

- Or the people who respond to the poll could be systematically different from the people who don't (response error)

- Or your list of potential respondents could be missing people (coverage error)

*Polls can also go wrong if they have bad question wording, a fourth type of survey error called "measurement error"

21 / 61
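
A quick worked example of the first point: when groups vote very differently, even a small demographic skew in the sample moves the topline a lot. The group shares and vote rates below are invented purely for illustration.

# If groups vote very differently, small demographic skews shift the topline.
# Suppose the population is 50% group A (70% Dem) and 50% group B (30% Dem):
true_top <- 0.5 * 0.70 + 0.5 * 0.30    # 50% Dem
# A sample that is 55% group A instead of 50% overstates the Dem share:
skewed   <- 0.55 * 0.70 + 0.45 * 0.30  # 52% Dem, a 2-point error
# If the groups were similar (52% vs 48% Dem), the same skew barely matters:
mild     <- 0.55 * 0.52 + 0.45 * 0.48  # 50.2% Dem
c(true_top, skewed, mild)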

The soup principle in theory

Source: Pew Research Center

22 / 61

The soup principle in practice

23 / 61

Polls today...

- Declining response rates + Internet = innovations in polling online, but they don't use random sampling

- Traditional RDD and even RBS polls don't have a true random sample (since response rates are too low)

- And because of nonresponse

24 / 61

So, to satisfy the soup principle...

Pollsters use statistical algorithms to ensure their samples match the population on different demographic targets

  • Race, age, gender, and region are most common

  • Can use weighting (raking) or modeling (MRP), with various tradeoffs

25 / 61
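
A minimal sketch of raking (iterative proportional fitting) in base R, with an invented ten-person sample and invented population targets; production pollsters use many more variables and packages such as survey or anesrake.

# Raking: adjust weights so the sample's weighted margins match population
# targets, cycling through one variable at a time until convergence.
rake <- function(data, targets, max_iter = 50, tol = 1e-6) {
  w <- rep(1, nrow(data))
  for (iter in 1:max_iter) {
    w_old <- w
    for (v in names(targets)) {
      # current weighted share of each category of variable v
      share <- tapply(w, data[[v]], sum) / sum(w)
      # multiply each weight by (target share / current share)
      w <- w * targets[[v]][as.character(data[[v]])] / share[as.character(data[[v]])]
    }
    if (max(abs(w - w_old)) < tol) break
  }
  w / mean(w)  # normalize to mean 1
}

# Example: a sample skewing female and old, raked to 50/50 sex and 60/40 age
sample_df <- data.frame(
  sex = c("F", "F", "F", "M", "M", "F", "F", "M", "F", "M"),
  age = c("65+", "65+", "18-64", "18-64", "65+", "65+", "18-64", "65+", "65+", "18-64")
)
targets <- list(
  sex = c(F = 0.5, M = 0.5),
  age = c("18-64" = 0.6, "65+" = 0.4)
)
w <- rake(sample_df, targets)
tapply(w, sample_df$sex, sum) / sum(w)  # weighted shares now ~0.5 / 0.5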

These adjustments make polls pretty good!

26 / 61

But they aren't representative, per the theory of sampling

...and in close races, the adjustments aren't enough:

27 / 61

Two examples:

28 / 61

2016: Education weighting

29 / 61

2020: Partisan nonresponse

30 / 61

2020: Partisan nonresponse

  • Problem reaching Trump voters overall

  • And within demographic groups

  • Something you cannot fix with weighting

    • Pollsters can adjust for past vote, but the electorate changes, and certain types of voters may not respond to surveys

31 / 61

So what are we left with?

32 / 61

So what are we left with?

1. Traditional polls that oscillate wildly due to intensive weighting

2. New "model-based" methods which trade lower variance for higher (potential) bias

3. Lower response rates increase chance of big misses across firms

33 / 61

Polls (and soup?) in 2022

A few ways forward:

34 / 61

Making polls work again

1. More weighting variables (NYT)

2. More online and off-phone data collection (SMS, mail)

3. Mixed samples (private pollsters)

In the pursuit of getting representative (and politically balanced) samples before and after the adjustment stage

35 / 61

In the pursuit of getting representative (and politically balanced) samples before and after the adjustment stage

To satisfy the soup principle

36 / 61

What about aggregation?

Forecasters have a few tricks up their sleeves:

37 / 61

How forecasts work

38 / 61

What goes into the model?

1. National economic + political fundamentals

2. Decompose into state-level priors

3. Add the (average of) polls

39 / 61

1. National fundamentals?

i) Index of economic growth (1940-2016)

  • eight different variables, scaled to measure standard deviations from average annual growth

ii) Presidential approval (1948-2016)

iii) Polarization (1948-2016)

  • measured as the share of swing voters in the electorate, per the ANES, and interacted with economic growth

iv) Whether an incumbent is on the ballot

40 / 61
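
A minimal sketch of how a growth index like (i) might be assembled, assuming a few quarterly growth series; the column names, values, and equal-weight average are illustrative stand-ins, not the model's actual inputs.

# Standardize several growth measures and average them into one index.
econ <- data.frame(
  gdp_growth    = c(2.1, 3.5, -0.4, 1.8),
  income_growth = c(1.9, 2.8,  0.2, 1.5),
  jobs_growth   = c(1.2, 2.0, -1.0, 0.9)
)
# scale() subtracts each column's mean and divides by its standard deviation,
# putting every variable in "standard deviations from average growth"
z <- scale(econ)
econ_index <- rowMeans(z)
econ_index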


2. The model is a federalist

i) Train a model to predict the Democratic share of the vote in a state relative to the national vote, 1948-2016

  • Variables are: lean in the last election, lean two elections ago, home state effects * state size, conditional on the national vote in the state

ii) Use the covariates to make predictions for 2020, conditional on the national fundamentals prediction for every day

iii) Simulate state-level outcomes to extract a mean and standard deviation

  • Propagates uncertainty both from the LOOCV RMSE of the national model and the state-level model

44 / 61
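
A minimal sketch of steps (i)-(iii) in R, with a toy data frame standing in for the 1948-2016 training data; the formula, the plain rnorm() draws, and the RMSE values are simplifying assumptions, not the actual model.

# Toy history: each row is a state-year; lean = state Dem share minus national
state_history <- data.frame(
  lean         = c( 5.2, -3.1,  0.8,  7.0, -6.2),
  lean_last    = c( 4.8, -2.5,  1.1,  6.1, -5.0),
  lean_two_ago = c( 4.0, -2.0,  0.5,  5.5, -4.4)
)
fit <- lm(lean ~ lean_last + lean_two_ago, data = state_history)

# Predict a new state's lean, then simulate outcomes, propagating error from
# both the national model and the state model (all values below are assumed)
new_state     <- data.frame(lean_last = 3.0, lean_two_ago = 2.5)
pred_lean     <- predict(fit, new_state)
national_pred <- 51.0  # national Dem share from the fundamentals model
national_rmse <- 2.0   # LOOCV RMSE of the national model
state_rmse    <- 3.0   # residual error of the state model

sims <- national_pred + pred_lean +
  rnorm(10000, 0, national_rmse) + rnorm(10000, 0, state_rmse)
c(mean = mean(sims), sd = sd(sims), p_win = mean(sims > 50))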

That's the baseline

Now, we add the polls

45 / 61

3. Add the (average of) polls

  • Just a trend through points...
  • Can do with any number of packages for R or other statistical languages

46 / 61
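
For instance, a trend through simulated poll points with base R's loess(); the smoother and its span are stand-ins for whatever a given aggregator actually uses.

# Simulated polls: true support drifts upward; each poll adds sampling noise
set.seed(42)
days  <- seq(2, 120, by = 2)  # one poll every other day
polls <- 48 + 0.02 * days + rnorm(length(days), 0, 1.5)

# A local-regression trend line through the poll points
trend        <- loess(polls ~ days, span = 0.5)
poll_average <- predict(trend)  # fitted trend at the poll dates

plot(days, polls, pch = 16, col = "grey50",
     xlab = "Day of campaign", ylab = "% support")
lines(days, poll_average, lwd = 2)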

3. Add the (average of) polls

(...but with some fancy extra stuff)

// Latent state-level support mu_b evolves as a backward random walk from
// election day (T) toward today, anchored by the fundamentals prior
mu_b[:, T] = cholesky_ss_cov_mu_b_T * raw_mu_b_T + mu_b_prior;
for (i in 1:(T - 1))
  mu_b[:, T - i] = cholesky_ss_cov_mu_b_walk * raw_mu_b[:, T - i] + mu_b[:, T + 1 - i];
national_mu_b_average = transpose(mu_b) * state_weights;

// Poll-level bias terms: house effects, mode effects, population effects
mu_c = raw_mu_c * sigma_c;
mu_m = raw_mu_m * sigma_m;
mu_pop = raw_mu_pop * sigma_pop;

// AR(1) process for residual partisan nonresponse bias, applied to polls
// that do not weight by party registration or past vote
e_bias[1] = raw_e_bias[1] * sigma_e_bias;
sigma_rho = sqrt(1 - square(rho_e_bias)) * sigma_e_bias;
for (t in 2:T)
  e_bias[t] = mu_e_bias + rho_e_bias * (e_bias[t - 1] - mu_e_bias) + raw_e_bias[t] * sigma_rho;

//*** fill pi_democrat
for (i in 1:N_state_polls) {
  // A poll's expected Democratic share (on the logit scale) = latent state
  // support + pollster, mode, and population effects + nonresponse bias
  // (for unadjusted polls) + measurement noise + state polling bias
  logit_pi_democrat_state[i] =
    mu_b[state[i], day_state[i]] +
    mu_c[poll_state[i]] +
    mu_m[poll_mode_state[i]] +
    mu_pop[poll_pop_state[i]] +
    unadjusted_state[i] * e_bias[day_state[i]] +
    raw_measure_noise_state[i] * sigma_measure_noise_state +
    polling_bias[state[i]];
}
47 / 61

3. Add the (average of) polls

i. Latent state-level vote shares evolve as a random walk over time

  • "Walks" toward the state-level fundamentals more as we are further out from election day

ii. Polls are observations with measurement error that are debiased on the basis of:

  • Pollster firm (so-called "house effects")
  • Poll mode
  • Poll population
  • Bias in previous elections

iii. Correcting for partisan non-response

  • Whether a pollster weights by party registration or past vote
  • Adjusts for biases that remain AFTER removing the other biases

48 / 61

3. Add the (average of) polls

Notable improvements from correcting partisan non-response (and other?) issues

49 / 61

In 2016...

... But not 2020

50 / 61

One more lesson:

1. Traditional polls that oscillate wildly due to intensive weighting

2. New "model-based" methods which trade lower variance for higher (potential) bias

3. Lower response rates increase chance of big misses across firms

4. Aggregation is not a magic bullet

51 / 61

4. Aggregation is not a magic bullet

What may be more useful than forecasting...

52 / 61

Conditional forecasting!

53 / 61

Conditional forecasting:

1. Debias polls

2. Rerun simulations

54 / 61
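
A minimal sketch of the two steps, assuming a vector of poll-based simulations like the one above; the bias scenarios are hypothetical illustrations, not predictions.

# Conditional forecast: ask "what if the polls are off by X again?"
# Shift the simulated Dem vote shares by an assumed polling bias and
# recompute the win probability under each scenario.
set.seed(1)
sims <- rnorm(10000, mean = 52, sd = 3)  # poll-based simulations (assumed)

biases <- c(0, -2, -4)  # hypothetical pro-Dem polling bias, in points
sapply(biases, function(b) mean((sims + b) > 50))
# win probability if polls are unbiased, 2 points off, or 4 points off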

2. Rerun simulations

55 / 61

2. Rerun simulations

Advantage: leaves readers with a much clearer picture of the possible election outcomes if past patterns of bias aren't predictive of future bias (as in 2016 and 2020)

56 / 61

Further questions:

57 / 61

What if that doesn't work?

2022 is a critical test: do surveys get better, stay the same, or get worse?

What if the DGP remains biased?

What if the quality of the average poll continues to fall?

58 / 61

Can we trust polls to be precise in close elections?

If not, what are they good for?

59 / 61

How Polls Work and Why We Need Them

60 / 61

Thank you!

STRENGTH IN NUMBERS is now available.



Website: gelliottmorris.com

Twitter: @gelliottmorris

Questions?


These slides were made using the xaringan package for R. They are available online at https://www.gelliottmorris.com/slides/

61 / 61
