Supplementary Materials A: Figures for All 7 Surveys


A New Method for Pre-election Polling, Supplementary Materials (Online)

Figure S1-A: Distribution of Predicted Probabilities of Voting in Primary Elections
[Figure: density histograms of the predicted probability of voting. Columns: UT Republican Primary, UT Democratic Primary, CO Republican Primary, CO Democratic Primary, UT 3CD Republican Primary. Rows: Registered Voters, Likely Electorate Sample, Respondents, Actual Voters. Horizontal axis: Predicted Probability of Voting. Vertical axis: Density.]

Figure S1-B: Distribution of Predicted Probabilities for General Elections
[Figure: density histograms of the predicted probability of voting. Columns: UT General, FL General. Rows: Registered Voters, Likely Electorate Sample, Respondents, Actual Voters. Horizontal axis: Predicted Probability of Voting. Vertical axis: Density.]

Each histogram shows the distribution of the predicted probabilities described in the paper. Each column displays a different survey. Each row displays the distribution of probabilities at a different stage of the process. The first row displays the distribution for all registered voters (limited to those eligible to vote in the primary election in Figure S1-A). The second row displays the distribution for the sample of the predicted likely electorate. The third row displays the distribution for those who responded to each survey. The fourth row displays the distribution of predicted probabilities for all people who actually voted in the election, based on public records of individual turnout.

Figure S2: ROC Curves for Each Model
[Figure: one ROC curve per model. Panels: Florida General, CO Republican Primary, CO Democratic Primary, UT Republican Primary, UT Democratic Primary, UT General, UT 3CD Republican Primary.]

The area under each curve in Figure S2 is a measure of the accuracy of the model. A model that performed exactly randomly would follow the diagonal line from the bottom left to the top right corner (known as the "line of no discrimination") and have an area of 0.5. A model with a bias against predicting correct outcomes would have an area less than 0.5. A perfect model with no false negatives and no false positives would trace the y-axis and then the top edge of the plot and have an area of 1. Therefore, ROC curve areas close to 1 indicate a model that accurately predicts individual voter turnout for all potential predicted probabilities of voting.
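The area under a ROC curve can also be read as the probability that a randomly chosen voter receives a higher predicted probability than a randomly chosen non-voter, with ties counted as one half. The following is a minimal sketch of that calculation in Python. The function name `roc_auc` and all scores and turnout labels are made up for illustration; none of them come from the surveys.

```python
def roc_auc(scores, voted):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    pos = [s for s, v in zip(scores, voted) if v == 1]  # actual voters
    neg = [s for s, v in zip(scores, voted) if v == 0]  # non-voters
    if not pos or not neg:
        raise ValueError("need at least one voter and one non-voter")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # voter ranked above non-voter
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Illustrative values only (hypothetical scores and turnout records).
perfect = roc_auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
uninformative = roc_auc([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
reversed_model = roc_auc([0.1, 0.2, 0.8, 0.9], [1, 1, 0, 0])
```

The three toy cases correspond to the perfect model, the line of no discrimination, and a model biased against correct outcomes.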

Figure S3: Poll Forecasts Compared to Actual Election Results and Publicly Available Polls
[Figure: for each survey (FL General, CO GOP Primary, CO Dem Primary, UT GOP Primary, UT Dem Primary, UT General, UT 3CD GOP Primary) and each race polled, the plot shows our polls with 95% CI, other phone surveys, and the election results. Horizontal axis: Percentage of Votes Cast.]

Supplementary Materials B: Modeling Predicted Probability of Voting in Upcoming Election

Table S1: Logit Models to Predict Probabilities of Voting

Survey (columns): UT GOP Primary, UT General, UT GOP Primary, UT Dem Primary, CO GOP Primary, CO Dem Primary, FL General
Dependent Variable (columns): GOP Primary, General, Pres. Primary, Pres. Primary, GOP Primary, Dem Primary, General

General Election Index: () 9 (.3) ().57 () -. (.5) -.7 (.5).753 (.)
Primary Election Index: ().37 ().3 (.).3 (.7).35 (.).3 (.).9 (.)
Off-Year Election Index: .5 (.3).7 (.7) 7 (.) 33 (.)
Presidential Primary: 3 () 9 ()
Republican: .7 (.3) 57 (.5).553 (.9).9 (.3) 7 ()
Democrat: 7 (). (.9).97 (.3) -.3 ()
Other Party: . (.3)
Years Since Original Reg: - (.) -3 (.3) -. (.) -. (.) -.9 (.)
Years Since Last Reg Change: - () ()
Age: .3 (.) - (.) (.) 7 ().7 (.).3 (.) 7 (.)
Age Squared: . (.) -. (.) -. (.). (.) -. (.) -. (.)
Gender (Female): 7 (). () -.9 ()
Interactions
Age * Years Since Original Registration: . (.). (.)
General Index * Republican: -.79 (.5) 7 (.)
Primary Index * Republican: -.9 () -37 (.)
Off-Year Index * Republican: .5 (.)
General Index * Democrat: .9 (.). (.)
Primary Index * Democrat: .5 () - (.)
Off-Year Index * Democrat: .5 (.)
Competitive Dem District * Dem: .9 ()
Competitive Dem District * Unaf: -9 (.37)
Competitive GOP District * Rep: ()
Competitive GOP District * Unaf: -3 (5)
Constant: -.5 (.37) -.93 (.5) -3.57 () -39 (5) -57 (3) -5 () - (.7)
N: 33,3,9,3,,975 55,,53,5,9,3,73,

Note: This table displays the logistic regression models used for each survey to generate the predicted probability of voting in the upcoming election for each eligible registered voter. The predicted probability was then used to draw a probability proportionate to size (PPS) sample to reflect the likely electorate. Coefficients are displayed in logits. Standard errors are in parentheses. Statistical significance using two-tailed hypothesis testing: * p<.1, ** p<.05, *** p<.01.

The previous similar election used to create each model is listed in Table [...] of the paper. The variables differ across models because the information available on the voter files differs in each state. Definitions of the variables used in each election are found below.
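The two steps in the table note (convert logit coefficients into a predicted probability, then draw a fixed-size sample with selection probability proportional to that prediction) can be sketched as follows. This is only an illustration under stated assumptions: the coefficients, the synthetic voter file, and the function names are hypothetical, and systematic PPS selection is one common way to draw such a sample, not necessarily the procedure the authors used. With this scheme, unit i is included with probability n * size_i / sum(sizes), provided no size exceeds the sampling step.

```python
import math
import random
from bisect import bisect_right
from itertools import accumulate


def predicted_probability(coefs, intercept, features):
    """Inverse-logit of the linear predictor; coefficients are in logits."""
    z = intercept + sum(coefs[name] * features[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))


def systematic_pps_sample(sizes, n, rng):
    """Fixed-size PPS draw: walk the cumulative sizes in equal steps."""
    cumulative = list(accumulate(sizes))
    step = cumulative[-1] / n
    if max(sizes) > step:
        raise ValueError("a unit is larger than the sampling step")
    start = rng.random() * step  # random start in [0, step)
    picks = []
    for k in range(n):
        idx = bisect_right(cumulative, start + k * step)
        picks.append(min(idx, len(sizes) - 1))  # guard against float round-off
    return picks


# Hypothetical coefficients and synthetic voter file, for illustration only.
coefs = {"general_index": 0.8, "primary_index": 1.1}
intercept = -3.0
rng = random.Random(0)
voters = [
    {"general_index": rng.randint(0, 3), "primary_index": rng.randint(0, 3)}
    for _ in range(1000)
]
probs = [predicted_probability(coefs, intercept, v) for v in voters]
sample = systematic_pps_sample(probs, 50, rng)  # indices of sampled voters
```

Both the sample size (50 here) and each voter's selection probability are fixed before the draw, which is the defining property of PPS sampling.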

Description of Variables Used in Likely Voter Models

Survey: Florida General Election
Dependent Variable: The most recent mid-term general election, the Florida general election.
Independent Variables
General Election Index: An index indicating the number of elections the individual voted in among the three most recent general elections prior to the general election: [...], [...], and [...].
Primary Election Index: An index indicating the number of elections each individual voted in among the three most recent primary elections prior to the general election: [...], [...], and [...].
Republican: A dummy variable taking a value of 1 if the individual is a registered Republican.
Democrat: A dummy variable taking a value of 1 if the individual is a registered Democrat.
Years Since Original Registration: The time measured in years between the date of the general election and the date the individual first registered to vote in Florida.
Years Since Last Registration Change: The time measured in years between the date of the general election and the date of the last change in an individual's registration status. Changes could occur because the voter moved, changed party affiliation, or for other reasons.
Age: Measured in years from Election Day for the general election.
Age Squared: Used to account for nonlinearities in the effect of age on voting probability.
Gender: A dichotomous variable coded 1 for female.
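As an illustration of how variables of this kind could be derived from a voter-file record, the sketch below builds the two election indices, the party dummies, the registration durations, and the age terms. Every field name, party code, election label ("G1", "P2", and so on), and date is a hypothetical placeholder, since the layout of the actual voter files is not described here.

```python
from datetime import date


def years_between(earlier, later):
    """Elapsed time in years (365.25-day years; an approximation)."""
    return (later - earlier).days / 365.25


def build_features(record, election_day, general_elections, primary_elections):
    """Turn one voter-file record into model covariates (hypothetical schema)."""
    voted = record["voted_in"]
    age = years_between(record["birth_date"], election_day)
    return {
        # Count of past elections (0 to 3) in which the individual voted.
        "general_index": sum(e in voted for e in general_elections),
        "primary_index": sum(e in voted for e in primary_elections),
        "republican": int(record["party"] == "REP"),
        "democrat": int(record["party"] == "DEM"),
        "years_since_original_reg": years_between(
            record["first_registered"], election_day
        ),
        "years_since_last_reg_change": years_between(
            record["last_reg_change"], election_day
        ),
        "age": age,
        "age_squared": age ** 2,
        "female": int(record["gender"] == "F"),
    }


# Made-up record and dates, for illustration only.
record = {
    "voted_in": {"G1", "G3", "P2"},
    "party": "DEM",
    "birth_date": date(1970, 1, 1),
    "first_registered": date(1990, 1, 1),
    "last_reg_change": date(2005, 1, 1),
    "gender": "F",
}
features = build_features(
    record, date(2010, 1, 1), ["G1", "G2", "G3"], ["P1", "P2", "P3"]
)
```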

Survey: Colorado Statewide Democratic Primary
Dependent Variable: The statewide primary election. See Identification of Eligible Voters for Primary Elections below.
Independent Variables
General Election Index: An index indicating the number of elections the individual voted in using the three most recent general elections: [...], [...], and [...].
Primary Election Index: An index indicating the number of elections the individual voted in using the three most recent primary elections: [...], [...], and [...].
Municipal Election Index: An index indicating the number of elections that the individual voted in using the three most recent off-year elections: 2009, 2007, and 2005.
Democrat: A dummy variable taking a value of 1 if the individual is a registered Democrat.
Years Since Original Registration: The time measured in years between the date of the primary election and the date the individual first registered to vote in Colorado.
Years Since Last Registration Change: The time measured in years between the date of the primary election and the date of the last change in an individual's registration status. Changes could occur because the voter moved, changed party affiliation, or for other reasons.
Age: Measured in years from Election Day for the primary.
Age Squared: Used to account for nonlinearities in the effect of age on voting probability.
Gender: A dichotomous variable coded 1 for female.
General Election Index * Democrat: Used to account for a different effect among Democrats and unaffiliated voters in general election voting. We interact the general election index with the Democrat dummy variable.
Primary Election Index * Democrat: Used to account for a different effect among Democrats and unaffiliated voters in primary election voting. We interact the primary election index with the Democrat dummy variable.

Off-Year Election Index * Democrat: Used to account for a different effect among Democrats and unaffiliated voters in off-year election voting. We interact the off-year, municipal election index with the Democrat dummy variable.
Competitive Democratic District * Democrat: Used to account for different levels of salience in the elections geographically as well as between Democrats and unaffiliated voters. We interact the Democrat dummy variable with an indicator of competitiveness. The competitiveness variable takes a value of 1 if the individual lives in the 2nd Congressional District.
Competitive Democratic District * Unaffiliated: Used to account for different levels of salience in the elections geographically as well as between Democrats and unaffiliated voters. We interact the unaffiliated dummy variable with an indicator of competitiveness. The competitiveness variable takes a value of 1 if the individual lives in the 2nd Congressional District. With this variable and the Competitive Democratic District * Democrat variable above, the comparison group is Democrats and unaffiliated voters who live in districts that are uncompetitive in the Democratic primary.

Survey: Colorado Statewide Republican Primary
Dependent Variable: The statewide primary election. See Identification of Eligible Voters for Primary Elections below.
Independent Variables
General Election Index: An index indicating the number of elections the individual voted in using the three most recent general elections: [...], [...], and [...].
Primary Election Index: An index indicating the number of elections the individual voted in using the three most recent primary elections: [...], [...], and [...].
Municipal Election Index: An index indicating the number of elections the individual voted in using the three most recent off-year elections: 2009, 2007, and 2005.
Republican: A dummy variable taking a value of 1 if the individual is a registered Republican.
Years Since Original Registration: The time measured in years between the date of the primary election and the date the individual first registered to vote in Colorado.
Years Since Last Registration Change: The time measured in years between the date of the primary election and the date of the last change in an individual's registration status. Changes could occur because the voter moved, changed party affiliation, or for other reasons.

Age: Measured in years from Election Day for the primary.
Age Squared: Used to account for nonlinearities in the effect of age on voting probability.
Gender: A dichotomous variable coded 1 for female.
General Election Index * Republican: Used to account for a different effect among Republicans and unaffiliated voters in general election voting. We interact the general election index with the Republican dummy variable.
Primary Election Index * Republican: Used to account for a different effect among Republicans and unaffiliated voters in primary election voting. We interact the primary election index with the Republican dummy variable.
Off-Year Election Index * Republican: Used to account for a different effect among Republicans and unaffiliated voters in off-year, municipal election voting. We interact the off-year election index with the Republican dummy variable.
Competitive Republican District * Republican: Used to account for different levels of salience in the elections geographically as well as between Republicans and unaffiliated voters. We interact the Republican dummy variable with an indicator of competitiveness. The competitiveness variable takes a value of 1 if the individual lives in the 5th or [...]th Congressional District.
Competitive Republican District * Unaffiliated: Used to account for different levels of salience in the elections geographically as well as between Republicans and unaffiliated voters. We interact the unaffiliated dummy variable with an indicator of competitiveness. The competitiveness variable takes a value of 1 if the individual lives in the 5th or [...]th Congressional District. With this variable and the Competitive Republican District * Republican variable, the comparison group is Republicans and unaffiliated voters who live in districts that are uncompetitive in the Republican primary.
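To make the role of the interaction terms concrete, the sketch below writes out a reduced linear predictor (in logits) for the Republican primary model, where every eligible voter is either a registered Republican or unaffiliated. The coefficient values and names are invented for illustration and are not taken from Table S1; only one index and the two competitiveness interactions are included to keep the example short.

```python
def linear_predictor(b, general_index, republican, competitive):
    """Logit-scale prediction with party interactions (reduced, hypothetical model)."""
    unaffiliated = 1 - republican  # the only other eligible group
    return (
        b["const"]
        + b["general"] * general_index
        + b["republican"] * republican
        + b["general_x_rep"] * general_index * republican
        + b["comp_x_rep"] * competitive * republican
        + b["comp_x_unaf"] * competitive * unaffiliated
    )


# Made-up coefficients, for illustration only.
b = {
    "const": -2.0,
    "general": 0.5,
    "republican": 1.0,
    "general_x_rep": 0.25,
    "comp_x_rep": 0.75,
    "comp_x_unaf": -0.5,
}

# Effect of one more general election voted in, by party registration.
slope_unaffiliated = linear_predictor(b, 1, 0, 0) - linear_predictor(b, 0, 0, 0)
slope_republican = linear_predictor(b, 1, 1, 0) - linear_predictor(b, 0, 1, 0)

# Shift for a Republican living in a competitive district, relative to the
# comparison group (same party, uncompetitive district).
shift_rep_competitive = linear_predictor(b, 0, 1, 1) - linear_predictor(b, 0, 1, 0)
```

For unaffiliated voters the slope on the index is the main-effect coefficient alone; for Republicans it is the main effect plus the interaction coefficient.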

Survey: Utah 2nd Congressional District Democratic Primary
Dependent Variable: The statewide presidential primary election. We use this election as the dependent variable because Utah has not had a Democratic primary election (statewide or in the 2nd district) for more than a decade. This leaves us with no election that closely mirrors the 2nd Congressional District primary. Given this limitation, we select the 2008 Democratic presidential primary election contested by Hillary Clinton and Barack Obama. This election has the advantage of being recent, potentially competitive, and salient. We felt that these characteristics most closely mirrored the election.
Independent Variables
General Election Index: An index indicating the number of elections the individual voted in using the three most recent general elections: [...], [...], and [...].
Primary Election Index: An index indicating the number of elections the individual voted in using the three most recent primary elections: [...], [...], and [...].
Municipal Election Index: An index indicating the number of elections the individual voted in using the three most recent off-year elections: 2009, 2007, and 2005.
Democrat: A dummy variable taking a value of 1 if the individual is a registered Democrat.
Years Since Original Registration: The time measured in years between the date of the election and the date the individual first registered to vote in Utah.
Age: Measured in years from Election Day for the primary.
Age Squared: Used to account for nonlinearities in the effect of age on voting probability.
Age * Years Since Original Registration: An interaction of the age variable and the years since original registration variable.
General Election Index * Democrat: Used to account for a different effect among Democrats and unaffiliated voters in general election voting. We interact the general election index with the Democrat dummy variable.
Primary Election Index * Democrat: Used to account for a different effect among Democrats and unaffiliated voters in primary election voting. We interact the primary election index with the Democrat dummy variable.

Survey: Utah Statewide Republican Primary
Dependent Variable: The statewide presidential primary election. While not an exact match to other primaries, this was the only recent statewide primary election held in the state.
Independent Variables
General Election Index: An index indicating the number of elections the individual voted in using the three most recent general elections: [...], [...], and [...].
Primary Election Index: An index indicating the number of elections the individual voted in using the three most recent primary elections: [...], [...], and [...].
Municipal Election Index: An index indicating the number of elections the individual voted in using the three most recent off-year elections: 2009, 2007, and 2005.
Republican: A dummy variable taking a value of 1 if the individual is a registered Republican.
Years Since Original Registration: The time measured in years between the date of the election and the date the individual first registered to vote in Utah.
Age: Measured in years from Election Day for the primary.
Age Squared: Used to account for nonlinearities in the effect of age on voting probability.
Age * Years Since Original Registration: An interaction of the age variable and the years since original registration variable.
General Election Index * Republican: Used to account for a different effect among Republicans and unaffiliated voters in general election voting. We interact the general election index with the Republican dummy variable.
Primary Election Index * Republican: Used to account for a different effect among Republicans and unaffiliated voters in primary election voting. We interact the primary election index with the Republican dummy variable.

Survey: Utah General Election
Dependent Variable: The most recent presidential election, the Utah presidential election.
Independent Variables
General Election Index: An index indicating the number of elections the individual voted in using the three most recent general elections: [...], [...], and [...].

Primary Election Index: An index indicating the number of elections the individual voted in using the four most recent primary elections: [...], [...], [...], and [...].
Presidential Primary: A dichotomous variable taking the value of 1 if the individual voted in the presidential primary election in Utah.
Republican: A dummy variable taking a value of 1 if the individual is a registered Republican.
Democrat: A dummy variable taking a value of 1 if the individual is a registered Democrat.
Other Party: A dummy variable taking a value of 1 if the individual is registered with a party other than the Republican or Democratic Party and is not an unaffiliated voter.
Years Since Original Registration * Age: This variable is a series of five dummy variables indicating the quintile of the continuous registration variable to which the individual belongs, each interacted with each of five quintiles of the continuous age variable.
Age: Measured in years from Election Day for the general election.
Age Squared: Used to account for nonlinearities in the effect of age on voting probability.

Survey: Utah 3rd Congressional District Republican Primary
Dependent Variable: The 3rd Congressional District Republican primary election.
Independent Variables
General Election Index: An index indicating the number of elections the individual voted in using the six most recent general elections: 2006, 2004, 2002, 2000, 1998, and 1996.
Primary Election Index: An index indicating the number of elections the individual voted in using the four most recent primary elections: [...], [...], 199[...], and 199[...].
Presidential Primary: A dichotomous variable taking the value of 1 if the individual voted in the presidential primary election in Utah.
Republican: A dummy variable taking a value of 1 if the individual is a registered Republican.
Age: Measured in years from Election Day for the primary.

Registration * Republican: The continuous registration variable is divided into quintiles and interacted with a dummy variable indicating whether the individual is a registered Republican.

Identification of Eligible Voters for Primary Elections

In Colorado, unaffiliated voters are allowed to declare their affiliation to either party to vote in either primary. Therefore, the models and samples for each party's Colorado primary election included unaffiliated voters as well as the registered partisans. Unaffiliated voters in Colorado were eligible for sampling in both surveys, although they generally have low probabilities of voting in either primary and accordingly had a low chance of selection for either PPS sample. In order to avoid asking the same individual to respond to both surveys, we removed any individuals who were sampled for both surveys from one of the surveys. In the Colorado samples, one unaffiliated voter was removed from the Republican sample because she was also sampled for the Democratic survey.

In Utah, only registered Republicans can vote in a Republican primary. However, unaffiliated voters can register with the Republican Party at the polling location on Election Day. The Utah Democratic Party allows registered Democrats and unaffiliated voters to vote in its primary. Again, because of the potential for sampling an unaffiliated voter in both surveys, 5 unaffiliated respondents were deleted from the Utah Republican primary sample to avoid double sampling of individuals.
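The double-sampling rule amounts to dropping from one sample every individual who also appears in the other. A minimal sketch, with hypothetical voter IDs and a hypothetical function name, might look like this:

```python
def remove_double_sampled(kept_sample, other_sample):
    """Return other_sample without voters who also appear in kept_sample."""
    kept_ids = set(kept_sample)
    return [voter_id for voter_id in other_sample if voter_id not in kept_ids]


# Made-up voter IDs; "v03" is an unaffiliated voter drawn for both surveys.
democratic_sample = ["v01", "v02", "v03"]
republican_sample = ["v03", "v04", "v05"]
republican_sample_clean = remove_double_sampled(democratic_sample, republican_sample)
```

Here the overlapping voter stays in the Democratic sample and is removed from the Republican one, matching the direction of removal described for Colorado.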

Supplementary Materials C: Example Invitation Letters


Supplementary Materials D: Public Polling Data

Table S2: Public Polling Data for Figure S3

Columns: Election State, Election Type, Office, Polling Firm, Field Dates, Sample Size, Sample Type, Winner Public Poll Forecast, 2nd Place Public Poll Forecast, 3rd Place Public Poll Forecast, Forecast Closer to Actual Winner Vote Share

Utah Primary - Republican 3rd CD Deseret News /-/ 3 RV 7 5 - Our Poll
Utah General 1st CD Deseret News / - /3 5 RV 9. 3. - Public Poll
Utah General 2nd CD Deseret News / - /3 5 RV 7 - Our Poll
Utah General 3rd CD Deseret News / - /3 5 RV 7.7 9.3 - Our Poll
Utah General Governor Mason-Dixon /3 - /5 LV 9.. - Date Not Comparable
Utah General President Deseret News / - /3 5 RV. 3. - Public Poll
Utah General President Mason-Dixon /3 - /5 5 LV 3 3 - Public Poll
Colorado Primary - Republican Governor PPP (D) /7-/ 77 LV 9 5 - Our Poll
Colorado Primary - Republican Governor Denver Post/Survey USA 7/7-7/9 5 LV 5 7 - Our Poll
Colorado Primary - Republican Senate PPP (D) /7-/ 77 LV.9 5. - Our Poll
Colorado Primary - Republican Senate Denver Post/Survey USA 7/7-7/9 5 LV 5.9 5. - Our Poll
Colorado Primary - Democrat Senate PPP (D) /7-/ LV 53.3.7 - Public Poll
Colorado Primary - Democrat Senate Denver Post/Survey USA 7/7-7/9 53 LV 5 - Our Poll
Utah Primary - Democrat 2nd CD Deseret News /-/7 9 LV 3 - Our Poll
Utah Primary - Republican Senate Deseret News /-/7 5 LV. 5. - Our Poll
Florida General Governor Sunshine State News/VSS /3 - / 5 LV 5. 7.9 - Our Poll
Florida General Governor PPP (D) /3 - /3 773 LV 9.5 5.5 - Public Poll
Florida General Governor Quinnipiac /5 - /3 95 LV 9 5 - Public Poll
Florida General Governor Rasmussen Reports /7 - /7 75 LV 5 - Public Poll
Florida General Governor Mason-Dixon /5 - /7 5 LV.3 5.7 - Public Poll
Florida General Governor Florida Poll/NYT-USF /3 - /7 9 LV 53. 7. - Our Poll
Florida General Senate PPP (D) /3 - /3 773 LV. 3 Our Poll
Florida General Senate Sunshine State News/VSS /9 - /3 57 LV.5 3.3 Our Poll
Florida General Senate Quinnipiac /5 - /3 95 LV 7.9 33. 9. Our Poll
Florida General Senate Rasmussen Reports /7 - /7 75 LV 5. 3.3.7 Our Poll
Florida General Senate Mason-Dixon /5 - /7 5 LV 7.9 9.3 Our Poll
Florida General AG Mason-Dixon /5 - /7 5 LV 5.3 5.7 - Public Poll
Florida General AG Ipsos Public Affairs /5-/9 577 LV 55. 5. - Public Poll
Florida General CFO Mason-Dixon /5 - /7 5 LV 5 - Our Poll
Florida General CFO Ipsos Public Affairs /5-/9 577 LV 5 - Our Poll
Florida General Amdt Mason-Dixon /5 - /7 5 LV 3. 37. - Public Poll
Florida General Amdt Ipsos Public Affairs /5-/9 577 LV 3 - Our Poll

Note: Undecided voters in public polls are allocated proportionally for comparison with our forecasts, since our surveys did not allow an undecided response option.
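Proportional allocation of undecided voters means rescaling the decided shares so that they sum to 100 percent, which leaves the ratio between candidates unchanged. A minimal sketch, using a made-up poll with 15 percent undecided (the candidate names and shares are hypothetical):

```python
def allocate_undecided(decided_shares):
    """Rescale decided shares (in percent) so that they sum to 100."""
    decided_total = sum(decided_shares.values())
    return {
        name: 100.0 * share / decided_total
        for name, share in decided_shares.items()
    }


# Hypothetical poll: 45% for A, 40% for B, 15% undecided.
adjusted = allocate_undecided({"A": 45.0, "B": 40.0})
```

In this example candidate A moves from 45 to roughly 52.9 percent and candidate B from 40 to roughly 47.1 percent.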

Table S3: Pre-Election Forecasts and Actual Election Outcomes for Figure S3

Columns: Actual Results, Forecast Results, 95% M.E., N

Florida General Election
US Senate: Rubio .9.7 ± 99
US Senate: Meek 9. 7
US Senate: Crist 9.7 3. 3
Governor: Scott .9 5 ±.9 3
Governor: Sink 7.7 5.
Attorney General: Bondi 5 5 ±.9 9
Attorney General: Gelber
CFO: Atwater 57.3 57 ± 5. 5
CFO: Ausley 3.9.3 5
Amendment [...]: Yes 5 5 ± 5.3 7
Amendment [...]: No 7.5 5 5
Amendment [...]: Yes 5.9 ± 5. 5
Amendment [...]: No 37. 3. 7

Colorado GOP Primary
US Senate: Buck 5 5.3 ±.5 5
US Senate: Norton 7 3
Governor: Maes 5 5 ±.7
Governor: McInnis 9.3. 3

Colorado Dem Primary
US Senate: Bennet 5 5. ± 73
US Senate: Romanoff .9.9 7

Utah Statewide GOP Primary
US Senate: Lee 5 7 ± 9
US Senate: Bridgewater 5 33

Utah 2nd CD Dem Primary
US District 2: Matheson 7.3 3 ± 5. 5
US District 2: Wright 3.7 3 35

Utah General
President: McCain 3.. ±. 377
President: Obama 33.9 3 9
Governor: Huntsman 77.9 7 ±. 7
Governor: Springmeyer 9.5 7.5
Attorney General: Shurtleff 9.9 7 ±. 5
Attorney General: Hill 7 3
US District 1: Bishop 5.. ± 7
US District 1: Bowen 3. 3 3
US District 2: Matheson 3. ±.7
US District 2: Dew 3.7 33 75
US District 3: Chaffetz . 7 ± 7 3
US District 3: Bennion 7 7 53

3CD Utah GOP Primary
US District 3: Chaffetz 59 5 ±.5 3
US District 3: Cannon 5

This table shows the actual result of each race within each poll, the predicted result, the 95% margin of error for each question, and the number of people responding in each race. In every election, the actual result falls within the poll's margin of error.
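The comparison described in the table note can be sketched as a check that the actual vote share lies within the forecast plus or minus its margin of error. The formula below is the textbook 95% margin for a proportion under simple random sampling; it is an assumption made for illustration, since the way the margins in Table S3 were computed is not stated here, and a PPS design may call for a different variance estimate. All numbers are made up.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Textbook margin of error for a proportion p from n responses."""
    return z * math.sqrt(p * (1.0 - p) / n)


def within_margin(actual, forecast, moe):
    """True if the actual share lies inside forecast +/- moe."""
    return abs(actual - forecast) <= moe


# Illustrative values only: a forecast of 50% from 400 respondents.
moe = margin_of_error(0.50, 400)
close_race = within_margin(0.52, 0.50, moe)
missed_race = within_margin(0.58, 0.50, moe)
```

With 400 respondents the margin is about 4.9 percentage points, so an actual share of 52 percent falls inside the interval and one of 58 percent does not.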

Supplementary Materials E: Discussion of Sampling Methods

Probability Proportionate to Prediction vs. Probability Proportionate to Size Sampling

It is important to note that in PPS sampling both the probability of selection and the total sample size are known before sampling begins. Another method of unequal probability sampling is known as Probability Proportionate to Prediction sampling, or PPP sampling. In PPP sampling, unlike PPS sampling, the probability of inclusion in the sample is unknown before the sample is drawn. When drawing the sample, the researcher estimates an upper bound on the size of all units in the population and then chooses a value, L, larger than that estimate. As each unit is encountered, its size is observed, and a random number, x_i, is drawn from the interval [0, L]. If x_i is smaller than the measured size of the unit, then the unit is included in the sample. Thus, larger units have a higher probability of being included in the sample. Note that the total size of the sample is unknown until all units have been observed, and the probability of selection is not known before sampling begins. While we do use a predicted probability, our sampling method is closer to PPS than PPP sampling since we can calculate the probability of selection beforehand and the total sample size is determined before sampling begins.
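The PPP procedure described above can be sketched in a few lines. The function name and the unit sizes are hypothetical; the point of the example is that the number of selected units is random and differs from draw to draw, whereas a PPS draw fixes it in advance.

```python
import random


def ppp_sample(sizes, upper_bound, rng):
    """PPP sampling: include unit i when a uniform draw on [0, L] falls below its size."""
    if upper_bound <= max(sizes):
        raise ValueError("upper_bound (L) must exceed every unit size")
    selected = []
    for i, size in enumerate(sizes):
        x = rng.uniform(0.0, upper_bound)  # the random number x_i
        if x < size:
            selected.append(i)
    return selected


# Made-up population: 2000 units of size 0.5, with L = 1.0.
rng = random.Random(1)
sizes = [0.5] * 2000
draw_a = ppp_sample(sizes, 1.0, rng)
draw_b = ppp_sample(sizes, 1.0, rng)
```

Each unit is included with probability size / L, so the expected sample size here is 1000, but the realized sizes of `draw_a` and `draw_b` are only known after every unit has been visited.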

Simple Random Sampling vs. Our PPS Sampling Method

In the Utah primary and general elections, we compared the performance of surveys using a simple random sample (SRS) of registered voters to our approach of using PPS to sample the likely electorate. The SRS and PPS samples in the Utah primary and general elections have distributions of likelihood of voting that are typical for these types of elections. Figures S4 (Primary) and S5 (General) display the distribution of predicted probability of voting for all eligible registered voters, the distributions in the PPS and SRS samples, the distribution of respondents in each sample, and the distribution of actual voters according to individual turnout records from election officials (Figures S4 and S5 are similar to the corresponding figure in the main text).

[Figures S4 and S5 About Here]

We begin by comparing all eligible registered voters (first row) to the samples (second row). On the left, the distribution of the SRS sample mirrors the distribution for all registered voters in the top row, as expected. On the right, the PPS sample distributions are skewed towards registered voters who are more likely to vote and closely resemble the actual voters in the bottom row. The PPS sample distributions differ between the primary and general elections because the underlying distributions from which the PPS samples were drawn differ. In the general election, the differences are quite small because the probability of turning out in the general election resembles the uniform distribution of the SRS. The main difference between the SRS and PPS samples is that voters with a low turnout probability, on the left side of the histogram, are less likely to be included in the PPS sample. The impact of the PPS sample is more dramatic in low-turnout elections like the Utah primary.
For the primary, the SRS sample mirrors registered voters with a strong skew to the left because many registered voters have a low individual likelihood of voting in the primary. In the PPS sample, the large share of voters on the left is discounted by their low probability of voting, while the small share of voters on the right is inflated because of their high probability of voting. Balancing the density of the distribution of voters against the intensity from the predicted likelihood of voting creates a PPS sample that closely resembles the actual electorate in the last row.

Note: In the 3rd Congressional District primary, separate samples of voters were drawn using PPS sampling and simple random sampling. In the Utah general election, the same number of voters was sampled using each method.

Before examining the distribution of the respective survey respondents in the third row of Figures S4 and S5, we examine the response rates for each type of sample. Voting and participating in surveys both correlate with levels of interest in, engagement with, and knowledge about elections. Therefore, we expected that people with a higher probability of voting would also be more likely to complete the survey. Since higher probability voters make up a larger portion of the PPS samples than of the SRS samples in both types of elections, we expected higher response rates from the PPS samples. This hypothesis proved true in both Utah elections: in the primary, the completion rate was higher in the PPS sample than in the SRS sample, and in the general election the PPS response rate again exceeded the SRS response rate. The narrower gap in the general election is consistent with the smaller difference in the distributions of the general election PPS and SRS samples.

In the third row of Figures S4 and S5, we see that the respondents in both columns reflect the respective SRS and PPS samples (second row), as expected. Therefore, the SRS sample respondents continue to reflect the distribution of all eligible registered voters (top row), while the PPS sample respondents closely resemble the distribution of the actual turnout (bottom row). These results further support the idea of drawing likely voter samples by considering those observable characteristics that correlate with voting.
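The balancing of density and turnout probability described above can be made concrete with a small calculation. The sketch below assumes a hypothetical electorate grouped into three turnout-probability bins. The bin shares and probabilities are invented for illustration and are not taken from the Utah data, and `pps_composition` is a hypothetical helper name.

```python
def pps_composition(registered_share, turnout_prob):
    """Expected share of a PPS sample in each bin.

    Each bin's share of registered voters is weighted by its predicted
    turnout probability and the result is renormalized to sum to one.
    """
    weights = [share * prob for share, prob in zip(registered_share, turnout_prob)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical low-turnout primary: most registered voters are unlikely to vote.
registered_share = [0.70, 0.20, 0.10]  # share of registered voters per bin (assumed)
turnout_prob = [0.05, 0.40, 0.90]      # predicted turnout probability per bin (assumed)

srs_composition = registered_share  # an SRS mirrors the registered-voter distribution
pps_shares = pps_composition(registered_share, turnout_prob)

for label, srs, pps in zip(["low", "middle", "high"], srs_composition, pps_shares):
    print(f"{label:>6} probability bin: SRS share {srs:.2f}, PPS share {pps:.2f}")
```

If the predicted probabilities are well calibrated, the same weighted shares also describe the expected composition of the actual electorate, which is why the PPS sample tracks the bottom row of the figures more closely than the SRS does.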
A Kolmogorov-Smirnov test comparing distributions confirms that PPS outperforms SRS in both the primary and general elections. In both cases, the distributions of predicted probabilities among those who were sampled using PPS and those who responded from that sample are closer to the distribution of predicted probabilities of those who actually voted than are the distributions for the SRS sample and its respondents.

The distribution of respondents in the SRS samples illuminates why pre-election polling in low-turnout elections is so difficult and costly when using simple random samples of registered voters (and more so when starting with an SRS of the general population). SRS respondents are biased away from the sampling frame of voters in the upcoming election because low probability voters are substantially over-represented compared to the actual electorate. As the proportion of voters with a low individual probability of voting increases, as in primaries and local elections, the gap between an SRS sample and the likely electorate grows. In conventional pre-election polls for general elections that start with an SRS sample, techniques such as screens relying on self-reported vote intention need to do only a small amount of work to refine the sample of responses to be representative of the likely electorate. In a primary election, a survey using an SRS sample relies heavily on likely voter screens and other techniques for selecting or weighting responses to compensate for the gap between the sample and the likely electorate.

Using conventional deterministic approaches to screening SRS samples to identify likely voters can cause bias in either direction. When voters with a low individual probability of turning out in the upcoming election are screened out, the survey respondents are biased by the exclusion of many people who will actually vote, particularly in low-turnout elections like primaries. Although individual respondents have a low probability of voting, the people they represent in a sample may make up a significant share of the actual electorate. However, an SRS sample may also be biased by including too many people drawn from the low-turnout probability end of the distribution.
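The exclusion problem described above can be illustrated with simple arithmetic. The sketch below makes a strong simplifying assumption: the deterministic screen is modeled as a hard cutoff on predicted turnout probability, whereas real screens rely on self-reported intention. The bin values and the helper name `excluded_electorate_share` are hypothetical.

```python
def excluded_electorate_share(registered_share, turnout_prob, cutoff):
    """Share of the expected electorate removed by a hard cutoff screen.

    Expected votes per bin are share * probability. Bins whose probability
    falls below the cutoff are treated as screened out entirely.
    """
    expected_votes = [s * p for s, p in zip(registered_share, turnout_prob)]
    total = sum(expected_votes)
    excluded = sum(v for v, p in zip(expected_votes, turnout_prob) if p < cutoff)
    return excluded / total


# Hypothetical low-turnout primary (values assumed for illustration only).
registered_share = [0.70, 0.20, 0.10]
turnout_prob = [0.05, 0.40, 0.90]

share = excluded_electorate_share(registered_share, turnout_prob, cutoff=0.5)
print(f"Share of expected voters screened out: {share:.1%}")
```

In this invented example, every screened-out individual is more likely than not to stay home, yet together they account for more than half of the expected electorate, because the low-probability bins contain so many registered voters.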
The skew of initial respondents towards voters with low individual probabilities of turning out, especially in primaries (and other low-turnout elections), makes it possible that the screening questions will select a disproportionate number of individuals who over-report being personally likely to vote (due to social desirability and other biases in self-reporting). Thus, the available techniques for closing the gap between an SRS sample and the intended likely electorate sampling frame can fail in both directions. At best, voters across the distribution of turnout probability have similar preferences, so the sample distribution does not bias pre-election forecasts. Even in this best case, however, the use of these techniques is costly because of longer screening batteries, discarded responses from unlikely voters, and other factors.

Table S4 displays the pre-election forecasts in the Utah surveys from the PPS samples in comparison to the SRS samples. The point estimates of the PPS sample are more accurate than those of the SRS sample in five of seven races.

[Table S4 About Here]

We note that the PPS sample was not more accurate than the SRS sample in every race in Table S4. Therefore, future research should investigate a possible connection between forecast accuracy and level of interest, information, and/or contestation. Perhaps sampling based on previous voting behavior is superior when likely voters are also highly informed about the candidates, but if habitual voters are as uninformed or uninterested as unlikely voters, the difference between sampling methods may disappear.

Figure S4: Distribution of Predicted Probabilities for Primary Election by Sampling Method

[Figure: UT 3CD Republican Primary. Rows: Eligible Registered Voters; Sample; Respondents; Actual Voters. Columns: Simple Random Sample; PPS Sample. Horizontal axis: Predicted Probability of Voting. Vertical axis: Density.]

This figure shows the different distributions that arise from the different sampling methods used in the Utah 3rd Congressional District Republican primary. While the simple random sample closely mirrors the distribution of the voting population (row 1), the PPS sample closely mirrors the distribution of voters. Since our poll is concerned with the opinions of voters, not eligible voters, these distributions suggest the PPS method is superior to the simple random sample.

Figure S5: Distribution of Predicted Probabilities for General Election by Sampling Method

[Figure: UT General. Rows: Registered Voters; Sample; Respondents; Actual Voters. Columns: Simple Random Sample; PPS Sample. Horizontal axis: Predicted Probability of Voting. Vertical axis: Density.]

This figure shows the different distributions that arise from the different sampling methods used in the Utah general election. While the simple random sample closely mirrors the distribution of the voting population (row 1), the PPS sample closely mirrors the distribution of voters. In this case, the distribution of eligible voters is closer to the distribution of voters. However, the PPS distributions are closer to the distribution of voters than the SRS distributions.
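One way to quantify how close two of these distributions are is the two-sample Kolmogorov-Smirnov statistic, the largest vertical gap between two empirical cumulative distribution functions. The sketch below computes only the statistic, not a p-value, using the Python standard library. The function name and the toy probability lists are hypothetical and are not drawn from the survey data.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the empirical CDFs of
    the two samples, evaluated at every observed value.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    largest_gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # share of sample_a at or below x
        cdf_b = bisect_right(b, x) / len(b)  # share of sample_b at or below x
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Toy predicted turnout probabilities (assumed values for illustration).
actual_voters = [0.6, 0.7, 0.8, 0.9]
pps_sample = [0.5, 0.7, 0.8, 0.9]
srs_sample = [0.1, 0.2, 0.3, 0.9]

print("PPS vs. actual voters:", ks_statistic(pps_sample, actual_voters))
print("SRS vs. actual voters:", ks_statistic(srs_sample, actual_voters))
```

A smaller statistic means the sample's distribution of predicted probabilities sits closer to that of the actual voters. A full test would also convert the statistic into a p-value, which this sketch omits.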

Table S4: Comparison of Pre-Election Forecasts from PPS Samples and SRS Samples

Columns: PPS Sample (Forecast, N); Random Sample (Forecast, N); Actual Results; Winner

Race (candidates): Winner
3CD Utah GOP Primary, US District 3 (Chaffetz, Cannon): PPS
Utah General, President (McCain, Obama): SRS
Utah General, Governor (Huntsman, Springmeyer): PPS
Utah General, Attny General (Shurtleff, Hill): PPS
Utah General, US District (Bishop, Bowen): SRS
Utah General, US District (Matheson, Dew): PPS
Utah General, US District 3 (Chaffetz, Bennion): PPS

This table displays the comparison between the PPS samples and the simple random samples in the Utah surveys where the two methods were used. We see that the PPS sample was more accurate than the simple random sample in five of the seven races.