Universality of election statistics and a way to use it to detect election fraud.

Universality of election statistics and a way to use it to detect election fraud. Peter Klimek http://www.complex-systems.meduniwien.ac.at P. Klimek (COSY @ CeMSIIS) Election statistics 26. 2. 2013 1 / 53

Background: Elections in Russia

"It's not the people who vote that count; it's the people who count the votes." Joseph Stalin

Background: Russian legislative election, 2011

During and after the Dec 4, 2011 Russian legislative election, more than 1,100 official complaints were filed. International observers (OSCE) reported undue interference by state authorities owing to a convergence of the state and the governing party, in particular the government's control over the Central Election Commission. Protests started soon after the election (with more than 15,000 people gathering at the Red Square) and peaked in May 2012, with about 20,000 people protesting in Moscow on the day before Putin's inauguration. What ignited these public upheavals?

Background: Ballot boxes already filled before the polling station opens?

Background: Very motivated voters?

Background: Strange equipment?

A research question (purely hypothetical)

If practices like ballot stuffing and the re-casting of votes were widespread, would they leave a detectable imprint on the election statistics?

A statistical perspective

Elections can be seen as large-scale social experiments. A country is segmented into a large number of electoral units. Each unit represents an experiment, in which each citizen articulates his or her political preference through a ballot. What are the statistics of such a process? How does ballot stuffing influence these statistics?

P. Klimek, Y. Yegorov, R. Hanel, S. Thurner, Statistical detection of systematic election irregularities. Proc. Natl. Acad. Sci. USA, 109, 19151-4 (2012).

A statistically educated protest rally

Make a histogram of votes for a specific party over all electoral districts and you get... "We do not trust Churov, we trust Gauss!"

Introductory election statistics

Elections as experiments

Let us start with the simplest imaginable case, where each electoral unit contains the same number of people with the same distribution of preferences (we will relax all these assumptions later). An experiment is any procedure that can be repeated indefinitely and has a well-defined set of mutually exclusive outcomes. The set of all possible outcomes is the sample space $\Omega$.

Example I: Roll a six-sided die (experiment). The possible outcomes can be labeled 1, 2, 3, 4, 5, 6 and give the sample space $\Omega = \{1, 2, 3, 4, 5, 6\}$.

Example II: Ask N people whether they vote for a party (experiment). The outcome is the number of people who say yes, so the possible outcomes can be labeled 0, 1, 2, ..., N and give the sample space $\Omega = \{0, 1, 2, \dots, N\}$.

Elections as experiments

An event A is a subset of the sample space $\Omega$, i.e. it is a set of outcomes. Assume all outcomes are equally likely. Then the probability of event A, P(A), is defined by

$P(A) = \frac{|A|}{|\Omega|}$, (1)

where $|\cdot|$ denotes the number of elements in the respective set.

Example I: What is the probability of tossing a head with a fair coin? A = {head}, and $P(A) = \frac{1}{2}$.

Example II: What is the probability of rolling at most a 2 with a die? A = {1, 2}, and $P(A) = \frac{2}{6} = \frac{1}{3}$.
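The counting definition in equation (1) can be checked mechanically. The following is a minimal sketch, assuming equally likely outcomes; the function name `classical_probability` is introduced here for illustration and is not part of the talk.

```python
from fractions import Fraction


def classical_probability(event, sample_space):
    """Return |A| / |Omega| for equally likely outcomes (equation 1)."""
    event = set(event)
    sample_space = set(sample_space)
    if not event <= sample_space:
        raise ValueError("the event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))


coin = {"head", "tail"}
die = {1, 2, 3, 4, 5, 6}

# Example I: a head with a fair coin.
p_head = classical_probability({"head"}, coin)
# Example II: the die event A = {1, 2}.
p_one_or_two = classical_probability({1, 2}, die)

print(p_head, p_one_or_two)
```

Exact fractions are used instead of floats so that the results can be compared directly with the hand calculation.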

Elections as experiments

Let $n_T$ be the number of repetitions of the random experiment. Let $n_A$ be the number of times an outcome of the event A is observed. Frequentist position: in the long run, i.e. as the number of trials approaches infinity, the relative frequency of event A, $n_A / n_T$, approaches a true frequency, the probability P(A):

$P(A) = \lim_{n_T \to \infty} \frac{n_A}{n_T}$. (2)

The probability of the impossible event, i.e. the empty set of outcomes, is zero: $P(\emptyset) = 0$.
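The frequentist limit in equation (2) can be illustrated by simulation. This is a sketch under the assumption of a fair six-sided die; the helper name `relative_frequency` and the trial counts are chosen here for illustration only.

```python
import random


def relative_frequency(event, n_trials, rng):
    """Roll a fair die n_trials times and return n_A / n_T for the given event."""
    hits = sum(1 for _ in range(n_trials) if rng.randint(1, 6) in event)
    return hits / n_trials


rng = random.Random(0)  # fixed seed so the run is reproducible
event = {1, 2}          # classical probability 1/3

for n_trials in (100, 10_000, 200_000):
    freq = relative_frequency(event, n_trials, rng)
    print(n_trials, round(freq, 4))
```

The printed relative frequencies should settle near 1/3 as the number of trials grows, with fluctuations shrinking roughly like one over the square root of the number of trials.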

Elections as experiments: Random variables

A random variable X is a function from the sample space $\Omega$ to the real numbers, $X: \Omega \to \mathbb{R}$. X can be interpreted as a quantity whose value depends on the outcome of an experiment. X is a discrete random variable if it takes one out of a countable set of values. For each x in this countable set, define the probability mass function p(x) as

$p(x) := P(X = x)$. (3)

Intuition: Small x is a concrete outcome of a random experiment (the election result in a unit). We use capital X as a placeholder for x. We can then make general statements about the process or experiment generating x, without having to refer to the actual outcomes.

Elections as experiments: Random variables

The expectation value of the random variable X, denoted E(X), is defined as

$E(X) = \sum_{x:\,p(x)>0} x\,p(x)$. (4)

Example I: Toss two coins. Let X be the number of heads. What is E(X)? $\Omega = \{(H, H), (H, T), (T, H), (T, T)\}$, so $E(X) = \frac{1}{4} \cdot 2 + \frac{1}{4} \cdot 1 + \frac{1}{4} \cdot 1 + \frac{1}{4} \cdot 0 = 1$.

Example II: What is the expectation value of rolling a six-sided die? $E(X) = \frac{1}{6} \cdot 1 + \frac{1}{6} \cdot 2 + \frac{1}{6} \cdot 3 + \frac{1}{6} \cdot 4 + \frac{1}{6} \cdot 5 + \frac{1}{6} \cdot 6 = 3.5$.
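Both examples can be reproduced directly from the definition of the expectation value. The sketch below represents a probability mass function as a dictionary from values to probabilities; this representation and the function name `expectation` are choices made here, not something taken from the talk.

```python
from fractions import Fraction
from itertools import product


def expectation(pmf):
    """E(X) = sum of x * p(x) over all x with p(x) > 0."""
    return sum(x * p for x, p in pmf.items() if p > 0)


# Example I: number of heads in two fair coin tosses.
heads_pmf = {}
for outcome in product("HT", repeat=2):
    heads = outcome.count("H")
    heads_pmf[heads] = heads_pmf.get(heads, 0) + Fraction(1, 4)

# Example II: a fair six-sided die.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

print(expectation(heads_pmf), expectation(die_pmf))
```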

Elections as experiments: Random variables

Let X be a random variable with expectation value $\mu$, $E(X) = \mu$. The variance of X, denoted Var(X), is defined as

$\mathrm{Var}(X) = E\left((X - \mu)^2\right)$, (5)

which can be written as

$\mathrm{Var}(X) = \sum_{x:\,p(x)>0} (x - \mu)^2\,p(x)$. (6)

Intuition: The variance Var(X) measures how much the random variable X varies around its mean over consecutive trials; it is the expectation value of the squared distance from the mean.
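As a worked instance of the variance definition, the sketch below evaluates it for a fair six-sided die and for the number of heads in two coin tosses. The dictionary representation of the probability mass function and the function names are assumptions made here for illustration.

```python
from fractions import Fraction


def expectation(pmf):
    """E(X) = sum of x * p(x) over all x with p(x) > 0."""
    return sum(x * p for x, p in pmf.items() if p > 0)


def variance(pmf):
    """Var(X) = sum of (x - mu)^2 * p(x) over all x with p(x) > 0."""
    mu = expectation(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items() if p > 0)


die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
heads_pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

print(variance(die_pmf), variance(heads_pmf))
```

For the die the squared distances from 3.5 are 6.25, 2.25, 0.25, 0.25, 2.25, 6.25, which average to 35/12.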

Central Limit Theorem

Central limit theorem (CLT)

The central limit theorem is perhaps the single most theoretically striking and practically important result of probability theory. Let $\{X_i\}$ be random variables which are independent of each other and drawn from identical distributions (i.i.d. variables). The only things we know about the distribution of the $X_i$ are the mean $E(X_i) = \mu$ and the variance $\mathrm{Var}(X_i) = \sigma^2 < \infty$. What can we say about the normalized sum (the sample mean) of i.i.d. variables, $S_n = \frac{1}{n}(X_1 + X_2 + \dots + X_n)$? CLT: we can say a lot! But before we understand this, let us first understand the sum of two random variables...

Central limit theorem: Rolling dice

Example I: We roll two six-sided dice, a red and a blue one, with random variables $X_1$ and $X_2$. Both dice are fair, so each outcome has equal probability. What is the probability mass function for the sum of the two dice, i.e. $P(X_1 + X_2 = t)$? The following slides visualize the two probability mass functions of the dice and build up the sum step by step.

Central limit theorem: Rolling dice (figure sequence)

The probabilities $P(X_1 + X_2 = t)$ are constructed one value at a time for t = 2, 3, ..., 12, followed by $P(X_1 + X_2 + X_3 = t)$ and $P(X_1 + X_2 + X_3 + X_4 = t)$.
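The probability mass function of a sum of independent dice is the convolution of the individual mass functions. The sketch below computes it exactly for two, three and four fair dice; the function name `convolve` and the dictionary representation are choices made here for illustration.

```python
from fractions import Fraction


def convolve(pmf_a, pmf_b):
    """Return the pmf of A + B for independent discrete variables A and B."""
    result = {}
    for a, p_a in pmf_a.items():
        for b, p_b in pmf_b.items():
            result[a + b] = result.get(a + b, 0) + p_a * p_b
    return result


die = {face: Fraction(1, 6) for face in range(1, 7)}
two_dice = convolve(die, die)
three_dice = convolve(two_dice, die)
four_dice = convolve(three_dice, die)

for t in sorted(two_dice):
    print(t, two_dice[t])
```

The two-dice result is the familiar triangle peaking at t = 7; with three and four dice the shape already rounds off toward a bell curve.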

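Central limit theorem

Suppose $\{X_1, X_2, \dots\}$ is a sequence of i.i.d. variables with $E(X_i) = \mu$ and $\mathrm{Var}(X_i) = \sigma^2$, and let $S_n = \frac{1}{n}\sum_{i=1}^{n} X_i$. Then

$\sqrt{n}\,(S_n - \mu) \xrightarrow{d} N(0, \sigma^2)$ as $n \to \infty$, (7)

so that for large n the sample mean $S_n$ is approximately distributed as $N(\mu, \sigma^2/n)$. Here $N(\mu, \sigma^2)$ is the Gauss distribution or normal distribution with mean $\mu$ and variance $\sigma^2$, with density $\frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$. So why did the protester trust Gauss?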
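A quick simulation makes the theorem concrete. The sketch below averages n = 50 fair dice many times and compares the spread of those averages with the normal prediction, mean $\mu = 3.5$ and variance $\sigma^2/n$ with $\sigma^2 = 35/12$. The sample sizes and the seed are arbitrary choices made here.

```python
import random
import statistics


def sample_means(n, repetitions, rng):
    """Return `repetitions` averages, each over n fair die rolls."""
    return [
        sum(rng.randint(1, 6) for _ in range(n)) / n
        for _ in range(repetitions)
    ]


rng = random.Random(42)  # fixed seed for reproducibility
n = 50
means = sample_means(n, 10_000, rng)

mu = 3.5
sigma2 = 35 / 12
predicted_sd = (sigma2 / n) ** 0.5

observed_mean = statistics.fmean(means)
observed_var = statistics.pvariance(means)
# For a normal distribution about 68% of the mass lies within one standard deviation.
within_one_sd = sum(abs(m - mu) <= predicted_sd for m in means) / len(means)

print(observed_mean, observed_var, sigma2 / n, within_one_sd)
```

Because die averages are discrete, the fraction within one standard deviation only approximates the Gaussian value, but it should land close to it.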

Central limit theorem: Interpretation of the CLT for election results

Assume that we have an infinite (or very large) number of electoral units, each having the same number of voters. Assume that in each unit the people's preferences to vote for a party have the same expectation value and variance. CLT: the distribution of votes over electoral units must then be (approximately) normal! It follows in the same way that this holds not only for the vote count, but also for the turnout. Note that we only constrain the mean and variance of the distribution of preferences, not its shape!

Central limit theorem: Faith in Gauss (figures)

Central limit theorem: Putting the fun in distribution functions (figure)

Modeling election outcomes

Modeling election outcomes

Assume a country is segmented into n electoral units, labeled by i. We are interested in:

$N_i$: the vote-eligible population in unit i.
$V_i$: the number of valid votes cast in unit i.
$W_i$: the number of votes for the winning party in unit i.
$\bar{v} = \frac{1}{n}\sum_i \frac{W_i}{N_i}$: mean of votes for the winning party.
$\sigma_v$: variance of votes for the winning party.
$\bar{a} = \frac{1}{n}\sum_i \frac{V_i}{N_i}$: mean turnout (percentage).
$\sigma_a$: variance of turnout.
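These summary quantities are simple to compute from per-unit counts. The following sketch uses three made-up units purely for illustration; the numbers, the tuple layout (N_i, V_i, W_i) and the use of the population variance are assumptions made here, not data or choices from the talk.

```python
from statistics import fmean, pvariance

# Hypothetical units: (eligible N_i, valid votes V_i, winner votes W_i).
units = [
    (1000, 600, 300),
    (1200, 780, 420),
    (800, 440, 200),
]


def summarize(units):
    """Return mean and variance of W_i/N_i and of V_i/N_i over all units."""
    vote_shares = [w / n for n, _, w in units]
    turnouts = [v / n for n, v, _ in units]
    return {
        "mean_vote": fmean(vote_shares),
        "var_vote": pvariance(vote_shares),
        "mean_turnout": fmean(turnouts),
        "var_turnout": pvariance(turnouts),
    }


summary = summarize(units)
print(summary)
```

Each unit contributes equally to the means regardless of its size, which matches the 1/n prefactor in the definitions above.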

Modeling election outcomes

An extremely simple null model for fair election outcomes, using $\{N_i\}$, $\bar{v}$, $\bar{a}$:

For each unit i, take the electorate size $N_i$ from the data.
Draw the model votes for unit i, $v_i^{(m)}$, from the normal distribution with mean and variance estimated by $\bar{v}$, $\sigma_v$.
Draw the model turnout for unit i, $a_i^{(m)}$, from the normal distribution with mean and variance estimated by $\bar{a}$, $\sigma_a$.

How well does this model describe actual empirical vote and turnout distributions?
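A minimal sketch of this null model follows. It makes several assumptions that are not in the talk: the spread parameters are passed as standard deviations (as `random.gauss` expects), draws are clipped to the interval [0, 1], and all numerical values are hypothetical. Because vote and turnout are drawn independently, the simulated units show essentially no vote-turnout correlation.

```python
import random


def clip01(x):
    """Clip a value to the interval [0, 1] (an assumption made in this sketch)."""
    return min(1.0, max(0.0, x))


def simulate_fair_units(electorates, v_mean, v_sd, a_mean, a_sd, rng):
    """Draw independent model vote and turnout fractions for every unit."""
    units = []
    for n_i in electorates:
        v_i = clip01(rng.gauss(v_mean, v_sd))
        a_i = clip01(rng.gauss(a_mean, a_sd))
        units.append((n_i, v_i, a_i))
    return units


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(7)
electorates = [1000] * 5000  # hypothetical electorate sizes
fair_units = simulate_fair_units(electorates, 0.45, 0.08, 0.60, 0.07, rng)

votes = [v for _, v, _ in fair_units]
turnouts = [a for _, _, a in fair_units]
fair_correlation = pearson(votes, turnouts)
print(round(fair_correlation, 3))
```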

Modeling election outcomes: Comparison of election results from France to the model without fraud (figure)

Modeling election outcomes: Further election results (figures)

Modeling election outcomes: Further election results

Obviously, our model does not explain the Russian data well... Why is there a substantial correlation between vote and turnout? Why is there a large number of districts with almost one hundred percent of votes for the winner and one hundred percent turnout? We have to extend our model.

Ballot stuffing: A mechanism for electoral fraud

How would ballot stuffing influence these election statistics? Assume a large number of ballots with votes for one party is stuffed into an urn. More ballots in the urn → inflated turnout. If all stuffed ballots count for the same party → inflated vote numbers, always in conjunction with inflated turnout. The data also suggest a mode of extreme fraud, where all ballots are counted for only one party. Let us introduce these mechanisms into the model.
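The joint inflation of turnout and vote share follows from simple arithmetic. The sketch below uses a hypothetical unit with 1000 eligible voters, 600 valid votes and 300 votes for the winner; the function name `stuff_ballots` and all numbers are made up for illustration.

```python
def stuff_ballots(eligible, valid_votes, winner_votes, stuffed):
    """Add `stuffed` extra ballots, all for the winner, and recompute the ratios."""
    if stuffed < 0 or valid_votes + stuffed > eligible:
        raise ValueError("stuffed ballots must keep turnout between 0 and 100%")
    new_valid = valid_votes + stuffed
    new_winner = winner_votes + stuffed
    return {
        "turnout": new_valid / eligible,
        "winner_share": new_winner / new_valid,
    }


before = stuff_ballots(1000, 600, 300, stuffed=0)
after = stuff_ballots(1000, 600, 300, stuffed=200)
print(before, after)
```

In this example turnout rises from 60% to 80% while the winner's share of the valid votes rises from 50% to 62.5%, so stuffed units move toward the upper right of a vote-versus-turnout plot.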

Modeling election outcomes: Estimating effects of ballot stuffing

For each unit i, take the electorate size $N_i$ from the data.
Draw the model votes for unit i, $\hat{v}_i$, from the normal distribution with mean and variance estimated by $\bar{v}$, $\sigma_v$.
Draw the model turnout for unit i, $\hat{a}_i$, from the normal distribution with mean and variance estimated by $\bar{a}$, $\sigma_a$.
Incremental fraud: With probability $f_i$, ballots are taken away from both the non-voters and the opposition and added to the winning party's ballots.
Extreme fraud: With probability $f_e$, almost all ballots from the non-voters and the opposition are added to the winning party's ballots.

Modeling election outcomes: Estimating effects of ballot stuffing

The parameters $f_i$ and $f_e$ quantify how often incremental and extreme fraud take place. If incremental or extreme fraud takes place, its intensity is again estimated from the data. The model is executed for each pair of $(f_i, f_e)$ values. The result for the fraud parameters is the pair which offers the highest overlap with the data (as measured by a test statistic comparing observed and modeled vote distributions).
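The fitting step is a search over pairs of fraud parameters. The sketch below shows only the shape of such a search: the talk does not specify the test statistic, so a squared distance between histograms is assumed here, and `toy_simulate` is a deliberately trivial stand-in for the election model. All names and numbers are hypothetical.

```python
from itertools import product


def fit_fraud_parameters(observed, simulate, distance, grid):
    """Return the (f_i, f_e) pair whose simulated histogram is closest to the data."""
    best_pair, best_score = None, float("inf")
    for f_i, f_e in product(grid, repeat=2):
        if f_i + f_e > 1.0:  # assumption: the two fraud modes exclude each other
            continue
        score = distance(observed, simulate(f_i, f_e))
        if score < best_score:
            best_pair, best_score = (f_i, f_e), score
    return best_pair, best_score


def squared_distance(hist_a, hist_b):
    """Assumed test statistic: sum of squared bin differences."""
    return sum((p - q) ** 2 for p, q in zip(hist_a, hist_b))


def toy_simulate(f_i, f_e):
    """Toy stand-in model: fractions of fair, incremental-fraud, extreme-fraud units."""
    return [1.0 - f_i - f_e, f_i, f_e]


grid = [k / 10 for k in range(6)]  # 0.0, 0.1, ..., 0.5
observed = [0.7, 0.2, 0.1]         # made-up "data" histogram
best_pair, best_score = fit_fraud_parameters(
    observed, toy_simulate, squared_distance, grid
)
print(best_pair, best_score)
```

In a real application `simulate` would run the stochastic election model, typically averaged over several realizations, and return a histogram of simulated vote shares.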

Modeling election outcomes: Estimating effects of ballot stuffing

The left-handed variance $\sigma_v^L$ (computed from units below the mean vote) estimates the normal scatter of the voters' preferences. The right-handed variance $\sigma_v^R$ (computed from units above the mean vote) estimates the intensity of incremental ballot stuffing. $\sigma_x$ estimates the intensity of the extreme fraud mechanism.

Modeling election outcomes: Results

The incremental fraud mechanism explains the smearing out of the main blob towards the upper right. Extreme fraud explains the peak near one hundred percent vote and turnout. The data from Russia and Uganda are better explained by the model with ballot stuffing than by the model without electoral fraud. In all other countries studied, the fair model describes the data best. Not discussed here: the results of this method are robust with respect to the aggregation level of the data and the country size!

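Modeling election outcomes: All this with a relatively simple model

Estimate the means of the vote distribution ($\bar{v}$) and the turnout distribution ($\bar{a}$).

Estimate the variances:
$\sigma_v^{L/R} = \left\langle (\bar{v} - W_i/N_i)^2 \right\rangle_{W_i/N_i \,\lessgtr\, \bar{v}}$ (units with $W_i/N_i < \bar{v}$ for L, $W_i/N_i > \bar{v}$ for R),
$\sigma_a = \left\langle (\bar{a} - V_i/N_i)^2 \right\rangle_{(V_i/N_i < \bar{a}) \,\wedge\, (W_i/N_i < \bar{v})}$,
$\sigma_x = 0.075$.

Estimate the model turnout of unit i, $a_i^{(m)} \sim N(\bar{a}, \sigma_a)$, and the fair vote number, $v_i^{(m)} \sim N(\bar{v}, 2\sigma_v^L)$.

Incremental fraud: with probability $f_i$ choose $x_i \sim N(0, \sigma_v^R)$.
Extreme fraud: with probability $f_e$ choose $x_i \sim 1 - N(0, \sigma_x)$.

Apply the correction for fraud: the model vote count for the winning party in unit i becomes
$N_i \left( v_i^{(m)} a_i^{(m)} + x_i \left(1 - a_i^{(m)}\right) + x_i^{\alpha} \left(1 - v_i^{(m)}\right) a_i^{(m)} \right)$.

Apply a goodness-of-fit test to derive values for $f_i$ and $f_e$.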
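The per-unit recipe can be sketched as follows. Several details are assumptions made here rather than statements from the talk: the spread parameters are used as standard deviations, normal draws for $x_i$ are folded with an absolute value and clipped to [0, 1], the two fraud modes are treated as mutually exclusive, and the exponent `alpha` and all numerical inputs are hypothetical. Only $\sigma_x = 0.075$ is taken from the slide.

```python
import random


def clip01(x):
    """Clip a value to [0, 1] (an assumption of this sketch)."""
    return min(1.0, max(0.0, x))


def simulate_unit(n_i, v_mean, v_sd, a_mean, a_sd, sd_inc, sd_ext,
                  f_i, f_e, alpha, rng):
    """Simulate one unit: fair draw, then an optional fraud correction."""
    a_m = clip01(rng.gauss(a_mean, a_sd))  # fair model turnout
    v_m = clip01(rng.gauss(v_mean, v_sd))  # fair model vote fraction
    u = rng.random()
    if u < f_e:                            # extreme fraud: x close to 1
        x = clip01(1.0 - abs(rng.gauss(0.0, sd_ext)))
    elif u < f_e + f_i:                    # incremental fraud: small x
        x = clip01(abs(rng.gauss(0.0, sd_inc)))
    else:                                  # fair unit
        x = 0.0
    # Winner keeps own votes, gains a fraction x of the non-voters
    # and a fraction x**alpha of the opposition's votes.
    winner = n_i * (v_m * a_m + x * (1 - a_m) + x ** alpha * (1 - v_m) * a_m)
    ballots = n_i * (a_m + x * (1 - a_m))  # ballots in the urn after stuffing
    return {"x": x, "winner": winner, "ballots": ballots,
            "fair_turnout": a_m, "turnout": ballots / n_i}


def simulate_units(n_units, f_i, f_e, seed):
    """Run simulate_unit for many hypothetical units of 1000 voters each."""
    rng = random.Random(seed)
    return [simulate_unit(1000, 0.45, 0.08, 0.60, 0.07, 0.10, 0.075,
                          f_i, f_e, 1.5, rng) for _ in range(n_units)]


fair_run = simulate_units(500, f_i=0.0, f_e=0.0, seed=1)
extreme_run = simulate_units(500, f_i=0.0, f_e=1.0, seed=1)
print(fair_run[0]["turnout"], extreme_run[0]["turnout"])
```

With both fraud probabilities at zero the sketch reduces to the fair model, and with $f_e = 1$ every unit ends up near full turnout, which is the mechanism behind the peak near one hundred percent vote and turnout.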

FIN