Approaches to Analysing Politics
Variables &
Johan A. Elkink
School of Politics & International Relations
University College Dublin
6-8 March 2017
Outline 1 2 3
- A variable is an attribute that has two or more divisions, characteristics, or categories. The opposite is a constant, which is an attribute that does not vary.
- A sample is a subset of the population; the population is the set of all cases of interest.
- A case is an entity that displays or possesses the traits of a given variable.
- The unit of analysis refers to the level or class of the cases.
- Measurement is the process of determining and recording which of the possible traits of a variable an individual case exhibits or possesses.
(Argyrous, 1997, 3-4)
Variables: example
Hypothesis: Countries with high levels of inequality are more likely to experience civil war.
- Unit of analysis: countries (possibly a time series, i.e. the unit of analysis is a country-year)
- Population: e.g. all countries from 1945 to 2010
- Independent variable: inequality, e.g. the ratio-level Gini coefficient
- Dependent variable: civil war, e.g. a nominal variable, 1 = civil war, 0 = no civil war
- Example of a case: Kenya (or, with country-years, Kenya in 1993)
Variables: example
Hypothesis: Voters with low trust in politicians are more in favour of binding referendums.
- Unit of analysis: voters / individuals
- Population: e.g. all Dutch voters
- Sample: e.g. Dutch Parliamentary Election Study sample of 1,200 respondents
- Independent variable: trust, e.g. an ordinal Likert scale of low, medium, high trust
- Dependent variable: support for referendums, e.g. an ordinal Likert scale of disagree, neutral, agree
- Example of a case: an individual voter
Levels of measurement
Categorical
- Nominal: categories. Examples: binary (treaty signed; war initiated; gender; participated in protest; democracy-autocracy); multiple categories (electoral system; party family; urban-rural).
- Ordinal: categories in a particular order. Examples: Likert scales (disagree-neutral-agree; never-sometimes-often); other (democracy-anocracy-autocracy; peace-skirmish-war-world war).
Scale
- Interval: ordered values with meaningful distance. Examples: Polity democracy scale; sympathy scores; attitude scales.
- Ratio: meaningful distance and a meaningful zero. Examples: war duration; exports; Gini coefficient; battle deaths.
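The level of measurement determines which summaries are meaningful. The following is a minimal sketch in Python (standard library only), not part of the original slides: the values are made up for illustration, and the helper `ordinal_median` and the category ordering `TRUST_ORDER` are hypothetical names introduced here.

```python
from statistics import mean, mode

# Illustrative values only (hypothetical, not real survey data).
party = ["Labour", "Other", "Labour", "Sinn Fein"]    # nominal
trust = ["low", "high", "medium", "low", "medium"]    # ordinal
age = [31, 40, 53, 61, 70]                            # ratio

# Assumed ordering of the ordinal categories, from lowest to highest.
TRUST_ORDER = ["low", "medium", "high"]


def ordinal_median(values, order):
    """Return the middle category of an ordinal variable.

    Assumes an odd number of cases, so that a single middle case exists.
    """
    ranks = sorted(order.index(v) for v in values)
    return order[ranks[len(ranks) // 2]]


# Nominal: categories can only be counted, so the mode is the natural summary.
nominal_summary = mode(party)

# Ordinal: the order is meaningful, so a median category can be found.
ordinal_summary = ordinal_median(trust, TRUST_ORDER)

# Interval and ratio: distances are meaningful, so a mean can be computed.
scale_summary = mean(age)

print(nominal_summary, ordinal_summary, scale_summary)
```

A mean of the `party` or `trust` values would not be interpretable, which is why the sketch uses a different summary for each level.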
Example data set

District  System  Magnitude  Seats  Threshold  Proportionality
1         PR      10         80     Yes        0.8
2         PR      150        150    No         0.9
3         STV     9          100    No         0.8
4         FPTP    1          300    No         0.4
5         FPTP    1          600    No         0.5
6         PR      3          200    Yes        0.7
7         STV     5          125    No         0.7
8         PR      10         100    Yes        0.8
9         MIXED   15         500    Yes        0.6

PR = proportional representation; STV = single transferable vote; FPTP = first past the post; MIXED = mixed electoral system
Example data set

party        education        gender  leftright  age
Fianna Fail  below 2nd level  Female  7          70
Other        3rd level        Female  5          61
Fianna Fail  below 2nd level  Female  6          61
Fianna Fail  2nd level        Male    5          31
Fianna Fail  2nd level        Male    5          53
Independent  3rd level        Female  5          40
Other        3rd level        Female  5          30
Labour       3rd level        Female  5          41
Sinn Fein    2nd level        Male    7          60
Sinn Fein    2nd level        Male    5          39
Main graph types
Univariate
- categorical: pie charts, barplots
- scale: time plot, histogram, boxplot
Multivariate
- scale by scale: scatterplot
- scale by categorical: boxplots, barplot
- categorical by categorical: barplot
Categorical variables
For categorical variables, it is often useful to look at the number of cases or the proportion of cases in a particular category. Barplots and pie charts are useful for this.
[Figure: barplot of count (0 to 150) by party: Fianna Fail, Fine Gael, Labour, Other, Sinn Fein]
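The counts and proportions behind such a barplot can be sketched as follows. This is an illustration in Python (standard library only) rather than the lecture's own code; it uses the party column of the ten-respondent example data set shown earlier, and the text bars merely stand in for a real plot.

```python
from collections import Counter

# Party column of the ten-respondent example data set.
party = [
    "Fianna Fail", "Other", "Fianna Fail", "Fianna Fail", "Fianna Fail",
    "Independent", "Other", "Labour", "Sinn Fein", "Sinn Fein",
]

# Number of cases in each category.
counts = Counter(party)

# Proportion of cases in each category.
proportions = {p: n / len(party) for p, n in counts.items()}

# Crude text barplot, largest category first.
for p, n in counts.most_common():
    print(f"{p:<12} {n:>2}  {proportions[p]:.1f}  {'#' * n}")
```

Sorting the categories by size, as `most_common()` does here, usually makes a barplot easier to read than an arbitrary or alphabetical order.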
Pie chart
[Figure: pie chart of party of 1st preference vote: Fine Gael, Fianna Fail, Independent, Labour, Sinn Fein, Other]
Using a 3D projection leads to misleading interpretations.
Bar chart (univariate)
[Figure: bar chart of party of 1st preference vote, parties listed as Sinn Fein, Other, Labour, Independent, Fine Gael, Fianna Fail; proportion axis 0.00 to 0.20]
[Figure: the same bar chart with parties listed as Fine Gael, Fianna Fail, Other, Sinn Fein, Independent, Labour; proportion axis 0.00 to 0.20]
[Figure: the same bar chart with parties listed as Fine Gael, Fianna Fail, Other, Sinn Fein, Independent, Labour; proportion axis 0.0 to 1.0]
Time plot
When data are measured over time, another useful plot is a time plot, which shows trends over time.
[Figure: time plot of the proportion of democracies (0.0 to 1.0) by year, 1800 to 2000. Source: Polity IV (Marshall and Jaggers, 2002)]
Distributions
For graphs of distributions (histogram, density plot, boxplot, etc.) you want to get an impression of:
- the shape of the distribution;
- the center and spread of the distribution;
- the presence of outliers.
(Moore, 2003, 12)
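Center, spread, and outliers can also be computed directly. The sketch below is an illustration in Python (standard library only) using the age column of the ten-respondent example data set. The 1.5 x IQR cut-off for flagging outliers is an assumption made here (a common convention for box plots), not something stated in the slides.

```python
from statistics import median, quantiles

# Age column of the ten-respondent example data set.
age = [70, 61, 61, 31, 53, 40, 30, 41, 60, 39]

# Center: the median.
centre = median(age)

# Spread: the interquartile range (distance between first and third quartile).
q1, _, q3 = quantiles(age, n=4)
iqr = q3 - q1

# Outliers: assumed convention of flagging values more than
# 1.5 * IQR below the first or above the third quartile.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [a for a in age if a < lower or a > upper]

print(f"median={centre}, IQR={iqr}, outliers={outliers}")
```

The shape of the distribution is harder to reduce to a single number, which is one reason to look at a histogram or density plot as well.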
Histogram
For continuous (or scale) variables, we often want to get an idea of the distribution of values: how many low, medium, and high values are there? Histograms are useful to get an impression:
- bin the data using equal-distance cut-off points;
- then produce a barplot of the number of cases in each bin.
[Figure: histogram of "Probability ever vote for Labour" (x-axis 2 to 10; y-axis frequency, 0 to 300)]
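The two steps of a histogram, binning and counting, can be sketched as follows. This is an illustrative Python example (standard library only) on the age column of the ten-respondent example data set; the function `histogram` and the choice of four bins of width 10 starting at 30 are assumptions made for this sketch.

```python
# Age column of the ten-respondent example data set.
age = [70, 61, 61, 31, 53, 40, 30, 41, 60, 39]

LOW, WIDTH, N_BINS = 30, 10, 4   # assumed cut-offs: 30, 40, 50, 60, 70


def histogram(values, low, width, n_bins):
    """Count how many values fall into each equal-width bin.

    Assumes every value lies between low and low + width * n_bins.
    A value exactly on the top edge is counted in the last bin.
    """
    counts = [0] * n_bins
    for v in values:
        i = min(int((v - low) // width), n_bins - 1)
        counts[i] += 1
    return counts


counts = histogram(age, LOW, WIDTH, N_BINS)

# Crude text version of the resulting barplot.
for i, c in enumerate(counts):
    start = LOW + WIDTH * i
    print(f"{start}-{start + WIDTH}: {'#' * c}")
```

Changing `WIDTH` changes the picture: very wide bins hide the shape of the distribution, while very narrow bins leave most bins with zero or one case.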
Histogram
[Figure: histogram of age (x-axis 25 to 75; y-axis count, 0 to 125)]
Box plot
[Figure: box plot of age (y-axis 20 to 80)]
Scatter plot
[Figure: "Left right self placement by age"; x-axis Age (20 to 80), y-axis Left Right self placement (0 to 10)]
[Figure: the same scatter plot with points distinguished by gender (Female, Male)]
[Figure: the same scatter plot in two panels, Female and Male]
Bar charts
Bar charts can be used to visualise the distribution of a single categorical variable, using the proportion of cases in each category, but also to show relationships between two variables:
- With another categorical variable: by displaying, for each category, the proportions of the other variable.
- With a scale variable: by displaying, for each category, the mean or another statistic of the other variable.
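The numbers plotted in such a bar chart are group statistics. As a minimal sketch in Python (standard library only), the mean left-right self placement per party can be computed from the ten-respondent example data set; the variable names are illustrative, not from the lecture.

```python
from collections import defaultdict
from statistics import mean

# (party, leftright) pairs from the ten-respondent example data set.
respondents = [
    ("Fianna Fail", 7), ("Other", 5), ("Fianna Fail", 6), ("Fianna Fail", 5),
    ("Fianna Fail", 5), ("Independent", 5), ("Other", 5), ("Labour", 5),
    ("Sinn Fein", 7), ("Sinn Fein", 5),
]

# Group the scale variable by the categories of the categorical variable.
by_party = defaultdict(list)
for party, leftright in respondents:
    by_party[party].append(leftright)

# One bar per category: the mean of the scale variable within that category.
mean_by_party = {party: mean(values) for party, values in by_party.items()}

for party, m in sorted(mean_by_party.items()):
    print(f"{party:<12} {m:.2f}")
```

With only ten respondents, several of these means rest on one or two cases, so a bar chart of them would suggest more precision than the data support.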
Bar chart
[Figure: bar chart of average left right self placement by party (Fine Gael, Fianna Fail, Independent, Labour, Other, Sinn Fein); axis 0 to 10]
[Figure: the same bar chart with the axis restricted to 3.5 to 5.5]
[Figure: bar chart of the percentage of young voters by party (Sinn Fein, Independent, Labour, Other, Fine Gael, Fianna Fail); axis 0 to 100]
Box plots
Box plots can also be split by category on a different variable, to visualise the relationship between a categorical and a scale variable.
[Figure: box plots of age (y-axis 20 to 80) by party: Fianna Fail, Fine Gael, Independent, Labour, Other, Sinn Fein]
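A split box plot draws one box per category, each based on that group's own summary statistics. The sketch below, in Python (standard library only), computes them for age by gender in the ten-respondent example data set. It splits by gender rather than party only because the party groups in that small table are too small for quartiles; the helper `five_number_summary` is a name introduced here.

```python
from collections import defaultdict
from statistics import quantiles

# (gender, age) pairs from the ten-respondent example data set.
respondents = [
    ("Female", 70), ("Female", 61), ("Female", 61), ("Male", 31), ("Male", 53),
    ("Female", 40), ("Female", 30), ("Female", 41), ("Male", 60), ("Male", 39),
]


def five_number_summary(values):
    """Minimum, quartiles, and maximum: the values a box plot is drawn from."""
    q1, med, q3 = quantiles(values, n=4)
    return {"min": min(values), "q1": q1, "median": med,
            "q3": q3, "max": max(values)}


# Split the scale variable by the categories of the categorical variable.
ages = defaultdict(list)
for gender, age in respondents:
    ages[gender].append(age)

# One summary (one box) per category.
summary = {gender: five_number_summary(values) for gender, values in ages.items()}

for gender, s in summary.items():
    print(gender, s)
```

Comparing the medians and the box widths across categories gives a quick impression of whether the scale variable differs between groups.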
Conclusion
- Remember the keywords of measurement: levels of measurement, sample vs population, unit of analysis.
- Understand the relation between measurement, variables, and data sets.
- Understand the variation in types of graphs and how not to use them.
- Understand the difference between univariate and multivariate.
Argyrous, George. 1997. Statistics for social research. Basingstoke: MacMillan.
Marshall, M.G. and K. Jaggers. 2002. Polity IV project: political regime characteristics and transitions, 1800-2002. URL: http://www.bsos.umd.edu/cidcm/polity/
Moore, David S. 2003. The basic practice of statistics. 3rd ed. New York: W.H. Freeman.