The Choice is Yours: Comparing Alternative Likely Voter Models within Probability and Non-Probability Samples

By Robert Benford, Randall K. Thomas, Jennifer Agiesta, and Emily Swanson

Likely voter models often improve election predictions for both voter turnout and vote choice. Successful modeling typically combines several measures to estimate registered voters, voter turnout, and vote outcome. A number of factors have made likely voter modeling more difficult, including the broader use of early voting and changes in sampling and data collection mode. In October 2013 the AP-GfK Poll moved from dual-frame RDD telephone surveys to an online protocol using KnowledgePanel, the largest U.S. probability-based online panel, enabling rapid and detailed national polling. Though KnowledgePanel can be used for national projections, a key interest is the prediction of voting outcomes by state, where KnowledgePanel sample sizes can fall short. As such, GfK and The Associated Press (AP) have examined how larger, demographically balanced non-probability (opt-in) samples could supplement probability-based (KnowledgePanel) samples through a calibration methodology. To study this, we selected two states with diverse populations: one in the Midwest that often favors Democrats (Illinois) and one in the South that often favors Republicans (Georgia). Each state had both Senatorial and Gubernatorial races on the ballot. In each state, two parallel surveys with about 800 KnowledgePanel and 1,600 opt-in respondents were administered immediately prior to the elections. Respondents in each sample were randomly assigned to one of two alternative sets of likely voter items: either the AP-GfK Poll's standard likely voter set or an alternate set driven mainly by stated intention to vote. We report estimates of registered voters, turnout, and election results by likely voter model, how that model can be optimized, and comparisons of estimates separately from KnowledgePanel and opt-in samples. While both models predicted well, the revised model used fewer variables. In addition, calibrating opt-in samples to probability samples can improve the utility of opt-in polling samples.

Introduction

As with the entire market and survey research industry, polling faces challenges that continue to erode the fully probabilistic, high-response-rate methods that have historically produced quality estimates with calculable precision. Probability-based samples of all varieties carry unknown levels of imperfection due to coverage error and non-response error, with response rates now often in the low teens or even single digits. Attempts to overcome these sources of potential error come at high cost and require extensive effort, and they rarely eradicate the errors. Online samples are one cost-effective alternative, and the cost-quality tradeoff is a main reason survey and market researchers experiment with them. Online samples come in two basic varieties: opt-in and probability-based. Opt-in samples can further be divided into community-based (mostly panels) and intercept (mostly river) approaches, with a great deal of variation among sample providers. GfK uses both types of sample depending on a project's budget and fitness for use. When an opt-in sample is selected as the best match for a survey, GfK uses routing technology provided by Fulcrum to manage a large number of opt-in providers, on the theory that more is better and robustness can overcome many issues.
GfK's KnowledgePanel is a probability-based online sample with over 50,000 members and is primarily used when fitness for use demands it. However, probability-based samples are by nature more expensive to recruit, empanel, and maintain, leading to frames of higher cost and more modest national size. At times, combinations of these sample types are indicated because low-incidence populations or other constraints, such as geography, make it infeasible to use one type of sample on its own. GfK's national polling for the Associated Press is conducted using KnowledgePanel. The AP's survey standards allow publication of online polls only if they are conducted using panels recruited through probability-based methods. Regardless of online sample source, surveys that attempt to represent narrower geographic areas, such as statewide surveys, can be limited by the amount of sample available for analysis. For example, a client such as the AP might want to conduct a political survey in a specific state to assess the horse race in a statewide election or to tell the story of political issues in that state. Often, subgroup analyses by party, sex, or race are important, and thus sample sizes must be larger than a single online source can provide. KnowledgePanel covers every state in the U.S., but its proportional coverage leaves some states with less-than-desirable case counts for surveys of smaller geographic areas. Opt-in samples can be a cost-effective way to supplement KnowledgePanel, particularly when they align well through weighting or calibration techniques. This leads to the question of quality by sample source and of how the two can work together to produce quality survey estimates.

Methodology

To address these questions, GfK, in coordination with the Associated Press, carried out two surveys, one in Georgia and one in Illinois, using essentially the entire KnowledgePanel sample in each state. These samples were supplemented with opt-in panel sources via the Fulcrum router, managed by GfK. To mimic the population in each state, an interlocking quota design matched opt-in respondents to the state distribution of sex by age (18-29, 30-49, 50-64, 65+) by race (Black/African American, all other) by educational attainment (some college or less, college graduate or higher), a 32-cell design.1 These demographics were also included for the KnowledgePanel respondents in each state. In addition, opt-in respondents were asked five early-adopter questions that are already asked of, and available for, the KnowledgePanel sample. Three weights were computed: one adjusting only the KnowledgePanel sample, one adjusting only the opt-in sample, and one calibrating the opt-in sample to KnowledgePanel via the demographics and early-adopter questions.

This research also assessed two different likely voter models in each state. Cases from each sample source were randomly assigned either to the standard likely voter model or to a more direct stated-intention-to-vote method. The stated intention model is based on registered voters and includes those who already voted or say they will definitely vote, along with those who say they probably will vote and who say they always or nearly always vote in elections; it requires three survey questions. The standard model is also based on registered voters and uses a complex set of definitions that includes past vote frequency, past voting behavior, whether the respondent has already voted, likelihood to vote, interest in news about the election, and knowing where to vote. It requires eight survey questions and uses four different patterns of answers to define a likely voter, and it is very similar to what others in the polling sector use. Within each sample type, cases were randomly assigned to one likely voter model.
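To make the stated intention screen concrete, below is a minimal sketch of that three-question classification. The field names and response codes are our own hypothetical shorthand, not the actual AP-GfK item wordings, and the more elaborate eight-question standard model is not reproduced here.

# Sketch of the stated-intention likely voter screen described above
# (hypothetical field names and response codes).
def is_likely_voter_stated_intent(resp: dict) -> bool:
    """Registered voters who already voted or will definitely vote, plus those
    who will probably vote and always or nearly always vote in elections."""
    if not resp.get("registered", False):
        return False
    intent = resp.get("vote_intent")        # "already_voted", "definitely", "probably", ...
    frequency = resp.get("vote_frequency")  # "always", "nearly_always", "some_of_the_time", ...
    if intent in ("already_voted", "definitely"):
        return True
    return intent == "probably" and frequency in ("always", "nearly_always")

# Example: registered, will probably vote, nearly always votes -> classified as likely.
print(is_likely_voter_stated_intent(
    {"registered": True, "vote_intent": "probably", "vote_frequency": "nearly_always"}))  # True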
Sample sizes for each model and sample source by state are shown in Table 1.

1 State benchmarks are from the American Community Survey three-year averages, 2011-2013.

To control for consistency between these two models, and specific to this research, the weighting described above was completed within model. Prior to analysis, weighted data were compared to assess the outcome of the random assignment and to ensure that important covariates of election outcomes, such as party identification, were equitable. In Illinois, the demographically weighted outcomes were not equitable between models on party identification. It is not GfK's or the AP's standard practice to include party identification in weighting, given the known variability of this variable. To make the models equitable, the initial weighted estimate of party identification was used as an additional weighting variable within each model.

Table 1: Sample Sizes by State, Sample Source, and Likely Voter Model

        KnowledgePanel        Opt-In
        Standard   Intent     Standard   Intent
GA      333        321        800        759
IL      494        523        875        877

Results

For analytic purposes there are two states, each with two models, each comprising three types of sample: KnowledgePanel only, opt-in only, and the two combined through calibration. Statistical significance is determined at the 95% confidence level using a t-test of proportions and effective sample sizes to account for variability due to weights. It should also be noted that while estimates are tested against parameters, it is also meaningful to assess the absolute differences in estimates by sample type and model. Throughout the findings there are essentially thirty-two estimates across the two types of sample; each is discussed and then summarized.
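As an illustration of the testing approach just described, the sketch below tests a weighted estimate against a known election parameter. The Kish formula for effective sample size and the normal approximation to the t-test are assumptions on our part, since the paper does not spell out its exact computation, and the weights shown are purely illustrative.

# Test a weighted survey proportion against a known parameter at the 95% level,
# using the Kish effective sample size to account for variability due to weights.
import math

def kish_effective_n(weights):
    """Effective sample size: (sum of weights)^2 / sum of squared weights."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

def test_against_parameter(p_hat, p0, weights, alpha=0.05):
    """Two-sided test of weighted estimate p_hat against the known parameter p0."""
    n_eff = kish_effective_n(weights)
    se = math.sqrt(p0 * (1 - p0) / n_eff)
    z = (p_hat - p0) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided, normal approximation
    return z, p_value, p_value < alpha

# Example: Georgia opt-in registration estimate (81.8%) vs. the 77.0% parameter
# from Table 2, with illustrative weights for roughly 1,550 opt-in respondents.
weights = [0.6, 1.0, 1.4] * 517
z, p, significant = test_against_parameter(0.818, 0.770, weights)
print(f"z = {z:.2f}, p = {p:.4f}, significant at 95%: {significant}")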

Registered Voters

In most U.S. states, including both Georgia and Illinois, one must be registered in order to vote, which makes voter registration the root of most likely voter models. Table 2 shows estimates for each sample type. KnowledgePanel estimates of registered voters across models and states are always within statistical tolerance. Opt-in estimates are the most distant from the actual percentages of registered voters. Interestingly, in Illinois, the calibrated estimate is closer to the actual registration rate than KnowledgePanel.

Table 2: Registered to Vote (%)

              Actual   KnowledgePanel   Opt-in    Calibrated
GA Reg Voter  77.0     77.0             81.8**    80.2**
IL Reg Voter  83.3     81.3             85.5**    84.0

** Significantly different from the parameter at the 95% confidence level.

Turnout

The essence of any likely voter model is to predict the population that will actually cast votes on Election Day. Turnout, operationalized here as the share of registered voters modeled as likely to vote, is nearly always overstated by likely voter models. All estimates are statistically significantly higher than actual turnout among registered voters (Table 3). This could be because those who participate in political surveys are more likely to be interested in politics to begin with, because of overstatement of vote intention, or because of some combination of the two. KnowledgePanel was closest to actual turnout in three of four cases, followed by the calibrated estimates, then opt-in. In one case, the Georgia standard model, the calibrated and opt-in samples were closer than KnowledgePanel. Overall, the standard model overstated turnout less than the stated intention model, owing to the additional detail it uses to winnow down the likely voter pool. However, a closer turnout estimate does not guarantee that the right mix of voters who turn out is predicted.

Table 3: Turnout (%)

                         Standard Model                        Stated Intent Model
              Actual   KnowledgePanel  Opt-in   Calibrated   KnowledgePanel  Opt-in   Calibrated
GA Turnout    50.0     68.1**          64.3**   64.3**       73.9**          78.1**   74.9**
IL Turnout    49.2     64.0**          69.2**   66.7**       68.9**          77.2**   74.7**

** Significantly different from the parameter at the 95% confidence level.

Election Results

With the exception of the Illinois Governor results in the standard model, KnowledgePanel was always directionally correct in estimating the elections tested in the surveys and never significantly different from the actual results for each candidate (Table 4). The calibrated results performed similarly but missed the Illinois Governor's race in both models. The opt-in sample missed the Illinois Governor's race in both models and also missed the Georgia Senate race in the standard likely voter model.

Table 4: Election Results (%)

                         Standard Model                        Stated Intent Model
              Actual   KnowledgePanel  Opt-in   Calibrated   KnowledgePanel  Opt-in   Calibrated
IL Senate
  Durbin      53.5     52.2            56.4     55.4         53.5            53.9     54.6
  Oberweis    42.7     45.6            38.5     39.8         42.6            39.2     38.8
IL Governor
  Quinn       46.3     48.7            49.7     49.4         44.1            48.7     47.6
  Rauner      50.3     48.1            44.5**   45.8**       50.2            45.9**   47.0
GA Senate
  Nunn        45.2     45.2            46.5     45.5         42.2            44.3     43.8
  Perdue      52.9     52.0            45.5**   47.2         49.7            51.0     50.2
GA Governor
  Carter      44.9     38.9            44.4     42.3         41.5            42.9     41.9
  Deal        52.8     54.4            47.8     49.8         49.8            50.9     50.1

* Democratic candidate always shown first; third-party candidates not shown.
** Significantly different from the parameter at the 95% confidence level.

Table 5 shows the predicted margin of victory, calculated as the percentage for the Democratic candidate minus the percentage for the Republican candidate; a positive number is a margin in favor of the Democrat and a negative number a margin in favor of the Republican. This margin is often critical to calling a race or predicting a winner based on survey estimates. Again, with the exception of the Illinois Governor's race in the standard likely voter model, which would have been deemed too close to call, surveys drawn from KnowledgePanel would likely have resulted in directionally correct race calls. The calibrated sample was wrong in both models for the Illinois Governor's race, and in the Georgia Senate race it was too close to call under the standard model but directionally correct. Opt-in estimates were wrong in both models for the Illinois Governor's race and wrong for the Georgia Senate race in the standard model.

Table 5: Dem-Rep Margin (percentage points)

                         Standard Model                        Stated Intent Model
              Actual   KnowledgePanel  Opt-in   Calibrated   KnowledgePanel  Opt-in   Calibrated
IL Senate     10.8     6.6             17.9**   15.6**       10.9            14.7**   15.7**
IL Governor   -4.0     0.6             5.2**    3.6**        -6.1            2.8**    0.5**
GA Senate     -7.7     -6.8            1.0**    -1.6**       -7.5            -6.6     -6.4
GA Governor   -7.9     -15.5**         -3.4     -7.4**       -8.3            -8.0     -8.2

** Significantly different from the parameter at the 95% confidence level.
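The Table 5 margins follow directly from the Table 4 vote shares; the short sketch below (helper name ours) reproduces two of the cells as a check.

# Margin = Democratic % minus Republican %; positive favors the Democrat.
def dem_rep_margin(dem_pct, rep_pct):
    return round(dem_pct - rep_pct, 1)

print(dem_rep_margin(53.5, 42.7))  # actual IL Senate (Durbin vs. Oberweis): 10.8
print(dem_rep_margin(38.9, 54.4))  # KnowledgePanel standard-model GA Governor: -15.5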

To assess the surveys' ability to generate a sample whose demographic traits match those of the overall electorate, we compared the survey results by model with the National Election Pool exit poll estimates of sex, race, Hispanic origin, age, and education level. We were also able to examine the actual share of the electorate by gender and race based on figures released by the Georgia Secretary of State. Tables 6 and 7 show weighted demographics among likely voters in each model and state, broken down by sample source.

Table 6: Georgia Demographic Comparison (% of likely voters)

                   KP Only             KP + Opt-in
                   Standard  Stated    Standard  Stated    Exit poll   Secretary of State
Men                44        49        47        48        48          45
Women              56        51        53        52        52          55
18-29              6         15        13        15        10          NA
30-44              35        35        35        36        27          NA
45-59              35        28        30        28        34          NA
60+                25        22        22        21        29          NA
White alone        58        66        64        68        65          64
Black alone        33        28        30        27        29          29
Hispanic origin    8         2         6         4         4           1
HS or less         39        39        34        38        18          NA
Some college       32        32        34        32        28          NA
College grad       29        29        32        31        54          NA

Table 7: Illinois Demographic Comparison (% of likely voters)

                   KP Only             KP + Opt-in
                   Standard  Stated    Standard  Stated    Exit poll
Men                48        43        49        47        50
Women              52        57        51        53        50
18-29              8         13        13        14        11
30-44              27        32        32        32        23
45-59              36        31        31        31        37
60+                28        23        25        23        29
White alone        75        76        76        77        75
Black alone        19        13        16        14        16
Hispanic origin    4         6         10        9         6
HS or less         31        36        33        32        19
Some college       33        31        31        33        30
College grad       37        34        36        35        51

Likely Voter Models

Estimates of candidate vote percentage show very few significant differences by sample within model. From the perspective of the margin of victory there are more significant differences, but a good deal of directional consistency; that is, in a majority of cases the call would have been correct. Significance aside, comparing the models across estimates by sample, the stated intention model was closer to the actual results than the standard model 70% of the time. This suggests that the stated intention model (fewer questions) may work well as a substitute for the standard model (more questions).

Conclusions

It seems clear that probabilistic online samples such as KnowledgePanel are a better choice when the budget and the number of available panelists make that choice feasible. When geographic or other constraints limit sample availability, supplementing these samples with online opt-in samples can work well in estimating election outcomes. However, several details are important in doing so. Bayesian statisticians argue that knowledge of the posterior distribution can help align samples so that they are unbiased. This is not dissimilar to weighting, but it extends beyond geodemographics. Care needs to be taken so that opt-in samples are designed not only to mimic the geodemographics but also to attend to other dimensions when aligning samples to these posteriors.

In this research, attitudes toward early adoption were used, and this steered the opt-in samples toward accuracy. What this suggests is that, as research practices continue to change and evolve, standard or typical weighting practices will need to become more creative, more aggressive, and often heroic in nature. These efforts will more likely than not come at the expense of greater variability due to weighting, but they will be deemed necessary for precision in population estimates. Last, for the likely voter models tested here, results are inconclusive on the basis of statistical significance; that is, there is no clear statistical winner. Even though the standard model comes closer to actual turnout among registered voters, the stated intention model performs equally well when election outcomes are estimated. Thus, given this outcome, one may opt to save questionnaire space and take the more direct stated intention approach. That, coupled with appropriate weighting, even when opt-in samples are used alone or calibrated, can produce the reliable estimates necessary. The choice is yours.
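To make the calibration step concrete, below is a minimal raking (iterative proportional fitting) sketch under our own assumptions: the paper does not specify the calibration algorithm, and the variable names, targets, and toy data are hypothetical. In practice the targets would be derived from the weighted KnowledgePanel sample and would cover the full set of demographics and early-adopter items.

# Minimal raking sketch: adjust opt-in weights until the sample's weighted margins
# match target shares (hypothetical targets standing in for KnowledgePanel-derived
# benchmarks on one demographic and one early-adopter attitude).
def rake(records, targets, max_iter=50, tol=1e-6):
    """records: list of dicts of categorical variables.
    targets: {variable: {level: target share}}; returns one raked weight per record."""
    weights = [1.0] * len(records)
    for _ in range(max_iter):
        max_shift = 0.0
        for var, shares in targets.items():
            total = sum(weights)
            for level, target_share in shares.items():
                idx = [i for i, r in enumerate(records) if r[var] == level]
                if not idx:
                    continue
                factor = target_share / (sum(weights[i] for i in idx) / total)
                for i in idx:
                    weights[i] *= factor
                max_shift = max(max_shift, abs(factor - 1.0))
        if max_shift < tol:
            break
    return weights

# Toy opt-in sample and hypothetical targets.
optin = [{"educ": "college", "early_adopter": "yes"},
         {"educ": "college", "early_adopter": "no"},
         {"educ": "no_college", "early_adopter": "no"},
         {"educ": "no_college", "early_adopter": "no"}]
targets = {"educ": {"college": 0.35, "no_college": 0.65},
           "early_adopter": {"yes": 0.15, "no": 0.85}}
print([round(w, 2) for w in rake(optin, targets)])  # approximately [0.6, 0.8, 1.3, 1.3]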