KNOW THY DATA AND HOW TO ANALYSE THEM! STATISTICAL ADVICE AND RECOMMENDATIONS


Ian Budge <budgi@essex.ac.uk>
Essex University
March 2013

Introducing the Manifesto Estimates

MPDb, the MAPOR database and associated facilities, covers both entry storage and distribution: several different kinds of linked datasets and complete text collections as well as numerical summaries of them. The focus here is on the latter: the updated Manifesto estimates (earlier versions are found on the CDs provided with Budge et al 2001 and Klingemann et al 2006, reproduced in MPDb). These estimates consist of the percentage distribution of (quasi-)sentences over each of 56 policy categories plus one uncoded category (Table 1) for each document in the collection.

TABLE 1: THE FULL SET OF MANIFESTO CODING CATEGORIES

Code  Name

Domain 1: External Relations
101   Foreign Special Relationships: Positive
102   Foreign Special Relationships: Negative
103   Anti-Imperialism
104   Military: Positive
105   Military: Negative
106   Peace
107   Internationalism: Positive
108   European Community/Union: Positive
109   Internationalism: Negative
110   European Community/Union: Negative

Domain 2: Freedom and Democracy
201   Freedom and Human Rights
202   Democracy
203   Constitutionalism: Positive
204   Constitutionalism: Negative

Domain 3: Political System
301   Federalism
302   Centralization
303   Government and Administrative Efficiency
304   Political Corruption
305   Political Authority

Domain 4: Economy
401   Free Market Economy
402   Incentives
403   Market Regulations
404   Economic Planning
405   Corporatism/Mixed Economy
406   Protectionism: Positive
407   Protectionism: Negative
408   Economic Goals
409   Keynesian Demand Management
410   Economic Growth: Positive
411   Technology and Infrastructure
412   Controlled Economy
413   Nationalisation
414   Economic Orthodoxy
415   Marxist Analysis: Positive
416   Anti-Growth Economy: Positive

Domain 5: Welfare and Quality of Life
501   Environmental Protection: Positive
502   Culture: Positive
503   Equality: Positive
504   Welfare State Expansion
505   Welfare State Limitation
506   Education Expansion
507   Education Limitation

Domain 6: Fabric of Society
601   National Way of Life: Positive
602   National Way of Life: Negative
603   Traditional Morality: Positive
604   Traditional Morality: Negative
605   Law and Order: Positive
606   Civic Mindedness: Positive
607   Multiculturalism: Positive
608   Multiculturalism: Negative

Domain 7: Social Groups
701   Labour Groups: Positive
702   Labour Groups: Negative
703   Agriculture and Farmers: Positive
704   Middle Class and Professional Groups: Positive
705   Underprivileged Minority Groups: Positive
706   Non-Economic Demographic Groups: Positive

These estimates have been aggregated into a single Right-Left scale (Table 2), based on all policy categories.

TABLE 2: THE MRG-CMP LEFT-RIGHT SCALE

Right emphases: sum of %s for
      Military: Positive
      Freedom and Human Rights
      Constitutionalism: Positive
      Political Authority
      Free Market Economy
      Economic Incentives
      Protectionism: Negative
      Economic Orthodoxy
      Welfare State Limitation
      National Way of Life: Positive
      Traditional Morality: Positive
      Law and Order
      Civic Mindedness: Positive

Left emphases: sum of %s for
      Anti-Imperialism
      Military: Negative
      Peace
      Internationalism: Positive
      Democracy
      Market Regulation
      Economic Planning
      Protectionism: Positive
      Controlled Economy
      Nationalisation
      Welfare State Expansion
      Education Expansion
      Labour Groups: Positive

Several policy sub-scales have been constructed in the same way (Budge et al 2001, Appendix V), covering free market, planned intervention, welfare and international peace.

The main characteristics of the policy estimates are:

1. their extension over 60 years and 54 countries.

2. the general statistical description they provide of whole documents, not geared to any particular research objective but aimed at supporting as wide a range of projects and applications as possible.

3. their openness and transparency. Estimates and scales all derive from simple counts and arithmetical operations (calculating, then summing and subtracting percentages). No additional assumptions are required for such measures, which have been extensively discussed in Mapping Policy Preferences (2001) and Mapping Policy Preferences II (2006).

4. their flexibility. Researchers can easily construct their own measures and procedures if they do not want to use the general ones provided. The estimates are organised and presented so as to make it easy to do so. The aggregate measures such as RILE are deliberately general and holistic, to provide a good fallback position for researchers who do not want or need to construct their own measure.

5. their quality. Over 1200 specific research applications in a variety of fields have found the original estimates and measures satisfactory. The main lesson from these is that the original estimates function well within standard procedures like regression or dimensional or discriminant analysis, and as series in space or over time.

6. their sensitivity, particularly seen in their ability to catch all relevant variation in policy preferences both over time and cross-nationally.

7. their constant expansion to new elections and countries. The dataset now is not the dataset of last year, nor of next year. New elections and countries are constantly being added, with considerable revision of earlier entries taking place to improve quality. Mindful of this, MPDb stores all versions of the dataset which have been requested, keeping them available for confirmatory analyses or checks.
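To illustrate how simple the arithmetic described in point 3 is, the following Python sketch turns quasi-sentence counts into percentages and then forms a right-left score from the Table 2 categories. The category codes are read off Table 1. The function names, the "uncoded" key and the example counts are hypothetical, and the sketch assumes the scale is the right sum minus the left sum. It is an illustration only, not the project's own software.

```python
# Category codes from Table 1, grouped as in Table 2.
RIGHT_CODES = [104, 201, 203, 305, 401, 402, 407, 414, 505, 601, 603, 605, 606]
LEFT_CODES = [103, 105, 106, 107, 202, 403, 404, 406, 412, 413, 504, 506, 701]


def percentages(counts):
    """Convert quasi-sentence counts per category into percentages of the document."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("document contains no quasi-sentences")
    return {code: 100.0 * n / total for code, n in counts.items()}


def right_left(pct):
    """Assumed scale: sum of right percentages minus sum of left percentages."""
    right = sum(pct.get(code, 0.0) for code in RIGHT_CODES)
    left = sum(pct.get(code, 0.0) for code in LEFT_CODES)
    return right - left


# Hypothetical document of 200 quasi-sentences (invented counts).
counts = {401: 30, 505: 10, 504: 50, 701: 20, 501: 70, "uncoded": 20}
pct = percentages(counts)
score = right_left(pct)
print(score)  # negative values lie to the left, positive values to the right
```

Because the percentages are taken over the whole document, including categories outside Table 2 and uncoded sentences, the score reflects the overall balance of emphases rather than only the 26 categories that enter the sums.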
The Manifesto estimates thus have characteristics which differentiate them from survey-based indicators or computerised word counts and which have to be kept in mind when analysing them; otherwise wrong conclusions will emerge. Practical recommendations for analysing the data correctly follow.

Recommended Analysis Strategies for the Manifesto Estimates

General advice must of course be tailored to particular research objectives, and users should always feel free to build their own measures and strategies after checking that they respect the characteristics of the estimates. However, the care we have taken to confront all the problems commonly raised by users and critics of the data set should render our counsels relevant to most analyses over a variety of sub-fields. Our general recommendations are:

1. Taken as a whole, i.e. as summarized in the right-left scale or where all 56, or substantial sub-sets, of the policy categories are input together, the data set has high validity and reliability (80-100%). Estimates at this level are best input to statistical routines (e.g. regression or dimensional analyses) as they stand, without distorting adjustments.

2. Sub-scales (Free Market, State Intervention, Welfare, Economy, Peace, European Union) and most of the original policy categories are also best input to statistical routines without prior adjustment.

3. Original unadjusted estimates should also be used in distributional comparisons across time or space (i.e. where time or other series are being compared as a whole).

4. The original right-left point estimates will not generally give misleading results in comparisons of party or other policy positions over time or space, provided there is reasonable discounting of small differences.

5. A guide to such discounting will soon be provided by the confidence intervals published in MPDb.

6. Similar confidence intervals at general, party and individualised levels will be available from MPDb for sub-scales and original policy categories.

7. The confidence intervals reported there are based on observed stability and patterned change in the Manifesto policy estimates included in the data set (MPDb). This captures all the types of error which affect variation, from document selection, coding, transcription etc., while making minimal assumptions about the way source documents are prepared or selected.

8. Final-estimate measures of uncertainty and error should always be used in preference to measures which make stronger assumptions about the nature of the documents or their selection or preparation, since these assumptions are often wrong and/or lead to paradoxical consequences. In particular, the length of a document is no guide to its reliability, given a) ambiguity about what length implies in terms of "noise"; b) the different types of document used to base estimates on, where the significance of length varies.

9. Final-estimate based measures are also the most relevant for researchers, whose main concern is with the party policy profile in each election. Thus it is the reliability and validity of policy indicators as such which is of most interest, and this is measured directly through the final-estimate approach.

10. Note that the complement of the reliability coefficient (one minus the coefficient) gives the percentage of variation in point estimates which can be attributed to non-systematic error. This then provides confidence intervals. Confidence intervals are probably most useful in deciding whether a given party policy move between elections is substantively significant or not, which enables us to check predictions about party movement in terms of percentage success for individual cases across time rather than averaged relationships as in regression equations. This facilitates a more powerful predictive test.

11. Confidence intervals can also be derived for Median Voter and Government Policy Intentions. Again, these are only relevant for point estimates of these variables, as the original values serve as the best input for multivariate or dimensional analyses or for distributional comparisons.

12. Comparisons across extended stretches of space and time can only be made with invariant indicators which are deductively rather than inductively derived. Changing the right-left scale or the original coding scheme to fit better for a given time period or country would undermine such extended comparisons.

13. Being derived from the ideological divisions around 1900 which produced the modern party system, the standard right-left scale has a contemporary relevance which is likely to continue into the future.

14. None of these caveats should prevent researchers devising alternatives for their own purposes, as the data are capable of supporting almost any number of combinations and re-combinations adapted to various research uses. They should only bear in mind that alternative measures are likely to be limited to particular periods or areas, and so will not serve as replacements for the standard measures. This is particularly true if they are inductively derived from some part of the data.

15. In comparing party positions estimated from the Manifesto data to electoral or other policy positions estimated from mass or expert surveys, analysts should re-centre the latter, as they miss the cross-national variation which the Manifesto estimates pick up. Without re-centring, like is not being compared with like across countries.

The Manifesto Estimates: Validated, Authoritative and Indispensable

A lot of our recommendations rest on the validity of the Manifesto estimates in a wide variety of areas. Validity is the demonstrated ability of the estimates to measure what they are supposed to measure, i.e. party policy at a particular point in time. This could be either general policy (best summarised by its Right-Left position over the whole document) or positions on individual policies (each of the 56 categories, or the sub-scales formed from them). Operationally, validity is the ability of the policy estimate to meet historical expectations of where a particular party should be at that time point, or to give sensible and understandable results when used in research.

Establishing the validity of estimates establishes their reliability as well, since measures could hardly be valid if they were unreliable, i.e. gave different results when applied repeatedly under the same circumstances. If an estimate gives the same results on, say, 85% of occasions, the 15 per cent where it varies can be used to estimate the amount of variation which is due to error and to provide standard errors of measurement (SEMs), and hence confidence intervals. Since validity entails reliability, validity checking forms the most important test we can make of data quality.

The best check on validity is users' satisfaction with the results of the research applications they make of the data, which in turn can be estimated from repeated use. We estimate that there have been almost 1200 specific research citations on Google of the two Mapping Policy Preferences books with the data (Budge et al 2001; Klingemann et al 2006), not counting the five research volumes published by the Manifesto Research Group (Budge, Robertson, Hearl eds 1987/2008; Laver & Budge eds 1992; Klingemann et al 1994; McDonald & Budge 2005; Budge et al 2012). All these books also report their own checks on validity. This all adds up to overwhelming evidence of estimate validity. Very few datasets, and certainly no policy series, have been the subject of such extensive checks.
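The error logic described above, in which the share of variation not accounted for by reliability (15 per cent in the example) is treated as error, can be sketched numerically. The Python sketch below assumes the classical test-theory formula SEM = SD x sqrt(1 - reliability), a normal-approximation multiplier of 1.96, and independent errors in the two estimates being compared. None of these is stated in the text, and the confidence intervals published in MPDb may well be derived differently. The standard deviation of 10 and the party scores are hypothetical.

```python
import math


def sem(sd, reliability):
    """Standard error of measurement under the classical test-theory assumption."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def interval(estimate, sd, reliability, z=1.96):
    """Approximate confidence interval around a single point estimate."""
    half_width = z * sem(sd, reliability)
    return estimate - half_width, estimate + half_width


def move_is_significant(before, after, sd, reliability, z=1.96):
    """Check whether a move between two elections exceeds measurement error.

    Assumes independent errors in both estimates, so the standard error of
    the difference is sqrt(2) times the SEM.
    """
    se_diff = math.sqrt(2.0) * sem(sd, reliability)
    return abs(after - before) > z * se_diff


# Hypothetical values: scale SD of 10 and reliability of 0.85.
low, high = interval(-15.0, 10.0, 0.85)
small_move = move_is_significant(-15.0, -8.0, 10.0, 0.85)   # 7-point shift
large_move = move_is_significant(-15.0, 0.0, 10.0, 0.85)    # 15-point shift
print(low, high, small_move, large_move)
```

Under these assumptions the 7-point shift falls within measurement error and would be discounted, while the 15-point shift would count as a substantive move, which is the kind of case-by-case check recommended for point estimates.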

References

Budge, Ian/Klingemann, Hans-Dieter/Volkens, Andrea/Bara, Judith/Tanenbaum, Eric (2001, 3rd reprint in 2010): Mapping Policy Preferences. Oxford: Oxford University Press.

Budge, Ian/McDonald, Michael D./Pennings, Paul/Keman, Hans (2012 forthcoming): Organizing Democratic Choice: Party Representation Over Time. Oxford: Oxford University Press.

Budge, Ian/Robertson, David/Hearl, Derek (1987, reprinted in 2004): Ideology, Strategy and Party Change: Spatial Analyses of Post-War Election Programmes in 19 Democracies. Cambridge: Cambridge University Press.

Klingemann, Hans-Dieter/Hofferbert, Richard I./Budge, Ian (1994): Parties, Policies, and Democracy. Boulder: Westview Press.

Klingemann, Hans-Dieter/Volkens, Andrea/Bara, Judith/Budge, Ian (2006, 2nd reprint in 2010): Mapping Policy Preferences II. Estimates for Parties, Electors, and Governments in Eastern Europe, European Union and OECD, 1990-2003. Oxford: Oxford University Press.

Laver, Michael J./Budge, Ian (eds.) (1992): Party Policy and Government Coalitions. New York: St. Martin's Press.

McDonald, Michael/Budge, Ian (2005): Elections, Parties, Democracy. Conferring the Median Mandate. Oxford: Oxford University Press.