Data manipulation in the Mexican Election? by Jorge A. López, Ph.D.

Many of us took advantage of the latest technology and followed last Sunday's elections in Mexico through a novel method: web postings of the votes through the Program of Preliminary Results, or PREP by its Spanish initials. What Mexico's Federal Electoral Institute (IFE) did not take into account is that the postings were not only informative; they also provided valuable data that can be, and was, examined to check its health. The bottom line is that the data presented is ill, so ill that it appears to have been given artificial life by a computer algorithm.

What the web surfers saw is that after an initial strong showing, which began at Sunday noon with a Calderon advantage of more than 4% over López Obrador (AMLO), the lead began to decrease in percentage terms. The diminishing trend continued and, around midnight, many of us went to bed forecasting a tie by 3:00 AM Monday and an AMLO advantage of about 1% by wake-up time on Monday. The morning surprise was that the trend had changed overnight and Calderon appeared with a slim but invariant advantage of about 1%; this sent many of us to what we physics professors do for a living: data analysis.

By Monday afternoon the first sets of PREP data began to circulate on blogs and chat rooms, and the hints of manipulation began to take shape. Mexico's UNAM physicist Luis Mochan and countless anonymous contributors helped to put the picture together.

The Data

After digging up data from several independent sources and confirming its reliability, the first sign of concern appeared when plotting the trends posted by the PREP as a function of time. The similarity between the curves of the votes belonging to different candidates was surprising: it presented a constant percentage-wise advantage of one candidate over the others, as shown in the figure.

This mirroring effect is not to be expected, as the votes being counted arrived from different parts of the country where the support of the different candidates varied by huge factors.

The Scoop

The immediate question was: how to quantify this abnormality? The obvious answer is by means of a test, for instance Pearson's product-moment correlation coefficient, which is used everywhere from social science to engineering. For two series $x_i$ and $y_i$ with means $\bar{x}$ and $\bar{y}$, the coefficient is defined by

$$ r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}} $$

which, in plain English, determines whether two variables vary together; a coefficient of zero means no linear relationship between the variables, and a value of one means total linear dependence. Not surprisingly, the Pearson coefficient of the PAN-PRD voting trend was found to be 0.999974! [For comparison, values of over 0.80 are generally viewed, e.g. by NASA teams, as an indication of reliability.] Correlations of the other curves were found to be 0.998205 (PAN-PRI) and 0.998196 (PRI-PRD); it was obvious that the control had been established over the PAN-PRD link and, more importantly, it was now clear that the data was, if not fake, at least modified, or "scooped," to put it in AMLO's southern Mexican jargon.

The Algorithm

Once a relationship had been uncovered, the next question was: what type of relationship was imposed on the artificial votes? As the curves look extremely parallel, one could expect a linear relationship between the voting trends. This is confirmed by the next graph.
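The correlation coefficient used in The Scoop can be computed directly from two vote series. What follows is a minimal sketch in Python, not the code behind the numbers quoted above; the function name `pearson_r` and the sample counts are illustrative assumptions and are not PREP data.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: product of the root sums of squared deviations.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical cumulative vote counts at successive postings (made-up numbers).
prd = [100, 250, 420, 610, 800]
pan = [130, 290, 455, 660, 845]
r = pearson_r(prd, pan)
print(f"r = {r:.6f}")
```

The same function would be applied to each pair of party series (PAN-PRD, PAN-PRI, PRI-PRD) sampled at the same posting times.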

The plot shows, on the y axis, the number of votes the PAN had at the time when the PRD and PRI had the votes given by the x coordinate. The linear relationship is obvious. A quick fit to the PAN-PRD line produces the expression $Y_{PAN} = 279926.7904 + 1.008060294\,X_{PRD}$. It is noteworthy that the Calderon advantage reported by the IFE is 257,532 votes, down from the 402,708 initially reported and more in line with the intercept determined by the fit.

Goodness of fit

As any statistician would argue, linear trends are not proof of data manipulation. With this in mind, the next question to answer is: can this fit be obtained from a sampling of numbers? The answer this time comes from a study of the deviations of the data from the mean behavior. Normal samplings always show small deviations from a trend, and these deviations tend to follow what is known as a Gaussian distribution, also called Normal for its repeatability in natural processes.

Taking the differences between the data points of the PAN-PRD curve of the previous chart and the analytic expression, one can obtain a distribution of these differences and plot it as a frequency chart, as in the next graph. With the dots representing the number of times a given difference between the data and the fit occurs, and the red curve representing the expected normal distribution for such a sample, it is clear that the data does not follow a Gaussian distribution. The fact that a difference of zero percent occurs many times more often than other values is a clear indication that the data was manufactured by an algorithm and does not stand a chance of passing as data originating from the actual voting.
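The straight-line fit and the study of deviations described above can be sketched as follows. This is an illustration under stated assumptions, not the author's procedure: it uses ordinary least squares, synthetic data with Gaussian scatter (not PREP data), and a crude check (`fraction_within`, a hypothetical helper) that counts how many residuals fall within one standard deviation, which is roughly 68% for a Gaussian sample.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b

def residuals(xs, ys, a, b):
    """Differences between the data and the fitted line."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

def fraction_within(res, k=1.0):
    """Fraction of residuals within k standard deviations of their mean."""
    n = len(res)
    mean = sum(res) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in res) / n)
    if sd == 0:
        return 1.0
    return sum(1 for r in res if abs(r - mean) <= k * sd) / n

# Synthetic example: a known line plus Gaussian noise (assumed values).
rng = random.Random(0)
xs = [float(i) for i in range(1, 201)]
ys = [5.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

a, b = fit_line(xs, ys)
res = residuals(xs, ys, a, b)
share = fraction_within(res)
print(f"intercept={a:.3f} slope={b:.3f} share within 1 sd={share:.2f}")
```

A residual distribution with a sharp spike at zero would give a share well above the Gaussian expectation; a histogram of `res` against a fitted normal curve is the fuller version of the comparison the text describes.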

The Scheme

What, then, was the scheme followed by the controllers of the PREP? This question can be answered by looking at the difference in votes between the two major candidates. The following two graphs show this difference, both in real votes and as a percentage of the total number of votes received, as a function of time.

The plan is now easy to spot. Apparently the algorithm in operation in the PREP count was programmed to give Calderon a large early advantage to forge an illusion of invincibility and to press IFE into declaring Calderon the winner at the Sunday 8:00 PM press conference. As the independent rapid count that IFE conducted under an independent group of five scientists did not ratify the PREP's fictitious advantage, the announcement of the winner was postponed to the 11:00 PM conference. As the decreasing trend of Calderon's advantage (in percentage) continued, the announcement was again postponed, and the program apparently entered a second mode of operation in which the fall of the advantage accelerated. [That's when many of us went to bed with a positive forecast for AMLO in mind.] But then, around 3:00 AM Monday, the code entered a third mode of operation and constrained the Calderon-AMLO difference to about 1%. Incredible as it sounds, the relationship between the votes of the two top candidates kept following the linear relationship imposed from the beginning.
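The two quantities plotted in this section, the lead in votes and the lead as a percentage of all votes counted, are simple to derive from the posted totals. A minimal sketch, with a hypothetical function name (`lead_series`) and made-up counts rather than PREP data:

```python
def lead_series(pan_votes, prd_votes, total_votes):
    """Return (lead in votes, lead as percent of total) for each posting."""
    leads = []
    for pan, prd, total in zip(pan_votes, prd_votes, total_votes):
        diff = pan - prd
        # Guard against an empty first posting with no votes counted yet.
        pct = 100.0 * diff / total if total else 0.0
        leads.append((diff, pct))
    return leads

# Hypothetical cumulative counts at three successive postings.
pan = [420, 800, 1500]
prd = [380, 760, 1480]
total = [1000, 2000, 4000]

for diff, pct in lead_series(pan, prd, total):
    print(f"lead = {diff} votes ({pct:.2f}% of votes counted)")
```

Plotting both columns against posting time would reproduce the kind of pair of graphs the text refers to.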

What Lies Ahead?

As I finish writing this manuscript, a recount of the votes is taking place in all of Mexico. News from half an hour ago (Wednesday 1:00 PM MST) shows AMLO leading Calderon by more than 3% with 37.32% of the ballots recounted. The moral of this exercise is twofold: 1) watch out for electronic methods of voting, and 2) never underestimate the power of statistics.