Becoming an Effective Consumer of Descriptive Evidence
What Is This Course About?
  The use of data and statistics for the analysis of public policy
  Goal: Make you a better policy analyst and a better consumer of policy analysis
  Focus on opportunities to use data and research evidence to improve public policy
Agenda
  1. Data and Data Visualization
  2. Sampling Fluctuation
  3. Application: Understanding Migration Based on the Nepal Living Standards Surveys
What are Data?
  At root, data are any form of information
  The challenge of statistics is to organize this information into something useful
  Two methods:
    Graphics (today)
    Numbers: means, medians, standard deviations, etc.
Graphical Representation of Data
  Example Dataset: World Development Indicators (WDI)

  Country Name  GDP per capita  Population (MM)  Total Fertility  Infant Mortality  HIV Infection Rate
  Albania       $3,405          3.18             1.78             13.40             0.2
  Algeria       $3,996          33.85            2.38             32.80             0.1
  Angola        $3,623          16.95            5.80             115.70            2.1
  Argentina     $6,644          39.50            2.25             14.90             0.5
  Armenia       $3,059          3.01             1.70             21.90             0.1
  Australia     $39,066         21.02            1.93             4.90              0.2
  Austria       $44,879         8.32             1.38             3.60              0.2
  Azerbaijan    $3,652          8.56             2.00             34.41             0.2
  Bahamas       $19,844         0.33             2.02             12.20             3.0
  Bangladesh    $431            158.57           2.83             47.02             0.3
  Belarus       $4,615          9.70             1.29             11.94             0.2
  Belgium       $42,609         10.63            1.81             3.70              0.2
  Belize        $4,200          0.30             2.94             21.60             2.1
  Benin         $601            9.03             5.42             77.80             1.2
  Bhutan        $1,668          0.66             2.19             56.20             0.1
  ...
Graphical Representation of Data
  How can we visualize these data?
  Histogram:
    Based upon a single variable
    Group the data into bins
    The height of each bin indicates the number (or percent) of observations in that bin
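The binning behind a histogram can be sketched in a few lines of Python. This is only an illustration: it uses the infant mortality values of the 15 countries shown in the WDI excerpt (not the full dataset), and the bin edges and the helper name `histogram_counts` are assumptions chosen for the example.

```python
from bisect import bisect_right

# Infant mortality (deaths per 1000 live births) for the 15 countries
# listed in the WDI excerpt; the full dataset contains more rows.
infant_mortality = [13.40, 32.80, 115.70, 14.90, 21.90, 4.90, 3.60, 34.41,
                    12.20, 47.02, 11.94, 3.70, 21.60, 77.80, 56.20]

# Assumed bins: [0, 10), [10, 20), [20, 30), [30, 40), and 40 or more.
edges = [10, 20, 30, 40]
labels = ["0-10", "10-20", "20-30", "30-40", "40 or more"]


def histogram_counts(values, edges):
    """Count how many values fall into each bin defined by the upper edges."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        # bisect_right returns the index of the bin that v belongs to.
        counts[bisect_right(edges, v)] += 1
    return counts


counts = histogram_counts(infant_mortality, edges)

# Text histogram: bar length is the number of countries in the bin.
for label, count in zip(labels, counts):
    share = 100 * count / len(infant_mortality)
    print(f"{label:>10} | {'#' * count} {count} ({share:.0f}%)")
```

Changing `edges` changes the shape of the picture, which is why the choice of bins deserves attention when reading a histogram.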
[Figure: Graph 2. Histogram of infant mortality. X-axis: Deaths per 1000 Live Births (bins 0-10, 11-20, 21-30, 31-40, 41 or More). Y-axis: # of Countries.]
[Figure: Infant Mortality Rates. Histogram. X-axis: Deaths per 1000 Live Births. Y-axis: # of Countries.]
II Sampling Fluctuation
Sampling Fluctuation
  Goal of statistical inference: To use a sample to learn something about a population
  This something is called a population parameter
  Examples of population parameters:
    The mean income of households in the United States
    The proportion of children in India who have access to clean water
    The fraction of Americans that support the Democratic presidential candidate
  This type of inference is called descriptive inference
Population vs. Sample
  Ideally, you would like to determine the population parameter by having access to the whole population, the entire group in which you are interested
  But in the real world, this is often infeasible
  Instead, we often have access to a subset of the population called a sample
  In this class, we will assume that the sample is random
  Statistical inference, at a very basic level, allows us to extrapolate from the sample to the population
Political Polls
  Political polls are one example of a situation in which we want to learn something about a population using information contained in a sample
  Suppose we are interested in a particular election
  Population of interest: All voters
  Population parameter of interest: Fraction that support a certain candidate or party
Polls in this Classroom
  We will illustrate the key concepts of inference using this classroom as an example
  But, first, some test polling.
Do you think Brexit will cause other countries to leave the European Union?
  1. Yes
  2. No
Do you believe that DFID's decision to scale down lending in India will have an economic impact in Nepal?
  1. Yes
  2. No
Spending on education is the best way to promote growth in Nepal
  1. Yes
  2. No
Polls in this Classroom
  We will illustrate the key concepts of inference using this classroom as an example
  Question
  In this case,
  Population of interest: Our class
  Population parameter of interest: Fraction that
As we vote, record your estimates
  Record what happened
  Estimate from sample #1:
  Estimate from sample #2:
  Estimate from sample #3:
  Estimate from sample #4:
  True proportion in class: [population parameter]
Only the sample votes
  1. Yes
  2. No
Sampling Fluctuations
  Notice that different samples generated different estimates
  In some cases, we selected samples in which support was higher (or lower) than support in the overall population
  These are examples of sampling fluctuations
  Because of sampling fluctuation, we must be wary that the estimate from our sample may not be equal to or even close to the parameter from the population
  Understanding how likely we are to choose an unusual sample allows us to quantify how confident we are about a population based only on evidence from a sample
  This is the heart of statistical inference
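The classroom exercise can be imitated with a short simulation. The sketch below is an illustration under invented assumptions: a class of 40 students, 24 of whom would vote Yes, and samples of 10 students. None of these numbers come from the lecture, and the helper name `sample_yes_count` is hypothetical.

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible

# Hypothetical classroom (invented for illustration): 40 students, 24 vote Yes.
CLASS_SIZE = 40
YES_IN_CLASS = 24
SAMPLE_SIZE = 10

population = [1] * YES_IN_CLASS + [0] * (CLASS_SIZE - YES_IN_CLASS)
true_proportion = YES_IN_CLASS / CLASS_SIZE


def sample_yes_count(population, n):
    """Draw a simple random sample of n students (without replacement)
    and return the number of Yes votes in it."""
    return sum(random.sample(population, n))


# Four samples, as on the recording sheet: each gives its own estimate.
counts = [sample_yes_count(population, SAMPLE_SIZE) for _ in range(4)]
for i, c in enumerate(counts, start=1):
    print(f"Estimate from sample #{i}: {c / SAMPLE_SIZE:.0%}")
print(f"True proportion in class: {true_proportion:.0%}")

# Repeating the draw many times shows how often a sample is "unusual".
# Here, unusual means at least 2 votes (20 percentage points) away from
# the 6 Yes votes we would expect in a sample of 10.
expected_yes = YES_IN_CLASS * SAMPLE_SIZE // CLASS_SIZE
many = [sample_yes_count(population, SAMPLE_SIZE) for _ in range(10_000)]
far_off = sum(abs(c - expected_yes) >= 2 for c in many) / len(many)
print(f"Share of samples off by 20 points or more: {far_off:.1%}")
```

The last number is the kind of quantity statistical inference works with: it says how likely an unusual sample is, given the population and the sample size.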
Who do you support in the 2016 United States Presidential Race?
  http://www.realclearpolitics.com/epolls/2016/president/us/2016_republican_presidential_nomination-3823.html
  Nine different polls, nine different answers
III Application: Nepal Living Standards Surveys
Background: Migration in Nepal
  Remittances are 29% of GDP in Nepal
  Migration has the potential to reduce poverty and promote growth
  Migration is often politically contentious
Background: Nepal Living Standards Surveys
  Representative, random sample of people living in Nepal
  1995 survey covers 3373 households
  2010 survey covers 5988 households
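Sample size matters for how much an estimate fluctuates. The sketch below is a rough illustration, not a description of how the surveys were actually designed: it treats each survey as independent random draws of households and assumes, purely for the example, that 20% of all households have some trait of interest. The function name `survey_estimate` is hypothetical.

```python
import random
import statistics

random.seed(1)  # fixed seed so the illustration is reproducible

# Assumed for illustration only: 20% of all households have the trait.
TRUE_SHARE = 0.20
REPLICATIONS = 300  # number of simulated surveys per sample size


def survey_estimate(n_households):
    """Simulate one survey of n_households and return the estimated share."""
    hits = sum(random.random() < TRUE_SHARE for _ in range(n_households))
    return hits / n_households


# 3373 and 5988 are the household counts of the 1995 and 2010 surveys;
# 100 is added only as a small-sample contrast.
spreads = {}
for n in (100, 3373, 5988):
    estimates = [survey_estimate(n) for _ in range(REPLICATIONS)]
    spreads[n] = statistics.pstdev(estimates)
    print(f"n = {n:>4}: estimates range from {min(estimates):.1%} "
          f"to {max(estimates):.1%}, spread (std. dev.) = {spreads[n]:.4f}")
```

With several thousand households, the simulated estimates should cluster much more tightly around the assumed 20% than they do with only 100 households.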
[Figure: Poverty by Migration Status. Y-axis: Proportion of households. Groups: Household contains a migrant; No migrants in household. Legend: Below the poverty line; Above the poverty line. Bar labels: 76%, 79%, 24%, 21%.]
[Figure: Reasons for Migration. Three panels, y-axis: Proportion households. Legend: Below the poverty line; Above the poverty line. Reason 1: Searching for education or training (bar labels 94%, 6%). Reason 2: Marriage or relocation of family (bar labels 78%, 22%). Reason 3: Relocating for work or business (bar labels 72%, 28%).]
[Figure: Destination. Three panels, each for Reason: Relocating for work or business; y-axis: Proportion households. Legend: Below the poverty line; Above the poverty line. Destination 1: Within Nepal (bar labels 26%, 74%). Destination 2: India (bar labels 37%, 63%). Destination 3: Outside Nepal and India (bar labels 86%, 15%).]
[Figure: Changes in Poverty Levels over Time. Y-axis: Proportion of households. Groups: 1995; 2010. Legend: Below the national poverty line; Above the national poverty line. Legible bar labels: 65%, 35%.]
[Figure: Trends in Migration. X-axis: Year (1995, 2010). Y-axis: Proportion of households. Legend: Household contains a migrant; No migrants in household. Bar labels: 89%, 77%, 11%, 23%.]