Chapter. Describing the Relation between Two Variables Pearson Prentice Hall. All rights reserved


Chapter 4 Describing the Relation between Two Variables
2010 Pearson Prentice Hall. All rights reserved.

Section 4.1 Scatter Diagrams and Correlation


EXAMPLE Drawing and Interpreting a Scatter Diagram
The data shown to the right are based on a study for drilling rock. The researchers wanted to determine whether the time it takes to dry drill a distance of 5 feet in rock increases with the depth at which the drilling begins. So, depth at which drilling begins is the explanatory variable, x, and time (in minutes) to drill five feet is the response variable, y. Draw a scatter diagram of the data.
Source: Penner, R., and Watts, D.G. Mining Information. The American Statistician, Vol. 45, No. 1, Feb. 1991, p. 6.


Various Types of Relations in a Scatter Diagram


Determine the type of correlation between the variables.
A. Positive linear correlation
B. Negative linear correlation
C. Nonlinear correlation
Copyright 2010 Pearson Education, Inc.


EXAMPLE Determining the Linear Correlation Coefficient
Determine the linear correlation coefficient of the drilling data.


Calculate the linear correlation coefficient, r, for temperature (x) and number of ice cream cones sold per hour (y).
x:  65  70  75  80  85  90  95  100  105
y:   8  10  11  13  12  16  19   22   23
A. 0.946
B. 0.973
C. 17.694
D. 0.383
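The calculation asked for in this exercise can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the helper name `linear_correlation` is an illustrative assumption, and it uses the sum-of-products form of r applied to the temperature and ice cream data given above.

```python
import math

def linear_correlation(xs, ys):
    """Sample linear correlation coefficient r (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross products and squared deviations about the means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

temperature = [65, 70, 75, 80, 85, 90, 95, 100, 105]
cones_sold = [8, 10, 11, 13, 12, 16, 19, 22, 23]

r = linear_correlation(temperature, cones_sold)
print(round(r, 3))  # 0.973
```

Because r has no units and always lies between -1 and 1, a candidate answer such as 17.694 can be ruled out before any arithmetic is done.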


EXAMPLE Does a Linear Relation Exist?
Determine whether a linear relation exists between time to drill five feet and depth at which drilling begins. Comment on the type of relation that appears to exist between time to drill five feet and depth at which drilling begins.
The correlation between drilling depth and time to drill is 0.773. The critical value for n = 12 observations is 0.576. Since 0.773 > 0.576, there is a positive linear relation between time to drill five feet and depth at which drilling begins.
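The decision rule used in this example can be written as a small function. This is a sketch under stated assumptions: the function name `linear_relation` is hypothetical, the critical value 0.576 is taken from the example rather than looked up from a table, and the rule assumed is that a linear relation is claimed only when the absolute value of r exceeds the critical value.

```python
def linear_relation(r, critical_value):
    """Compare |r| with a tabulated critical value (hypothetical helper)."""
    if abs(r) <= critical_value:
        return "no linear relation"
    # The sign of r gives the direction of the relation.
    return "positive linear relation" if r > 0 else "negative linear relation"

# Values from the drilling example: r = 0.773, critical value 0.576 for n = 12.
result = linear_relation(0.773, 0.576)
print(result)
```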


According to data obtained from the Statistical Abstract of the United States, the correlation between the percentage of the female population with a bachelor's degree and the percentage of births to unmarried mothers since 1990 is 0.940. Does this mean that a higher percentage of females with bachelor's degrees causes a higher percentage of births to unmarried mothers? Certainly not! The correlation exists only because both percentages have been increasing since 1990. It is this relation that causes the high correlation. In general, time series data (data collected over time) will have high correlations because each variable is moving in a specific direction over time (both going up or down over time; one increasing, while the other is decreasing over time). When data are observational, we cannot claim a causal relation exists between two variables. We can only claim causality when the data are collected through a designed experiment.

Another way that two variables can be related even though there is not a causal relation is through a lurking variable. A lurking variable is related to both the explanatory and response variable. For example, ice cream sales and crime rates have a very high correlation. Does this mean that local governments should shut down all ice cream shops? No! The lurking variable is temperature. As air temperatures rise, both ice cream sales and crime rates rise.


This study is a prospective cohort study, which is an observational study. Therefore, the researchers cannot claim that increased cola consumption causes a decrease in bone mineral density. Some lurking variables in the study that could confound the results are: body mass index, height, smoking, alcohol consumption, calcium intake, and physical activity.

Section 4.2 Least-squares Regression

Using the following sample data:
(a) Find a linear equation that relates x (the explanatory variable) and y (the response variable) by selecting two points and finding the equation of the line containing the points. Using (2, 5.7) and (6, 1.9):
$$m = \frac{1.9 - 5.7}{6 - 2} = -0.95, \qquad y - 5.7 = -0.95(x - 2), \qquad y = -0.95x + 7.6$$
(b) Graph the equation on the scatter diagram.
(c) Use the equation to predict y if x = 3.
$$y = -0.95(3) + 7.6 = 4.75$$


The difference between the observed value of y and the predicted value of y is the error, or residual. Using the line from the last example, the observed point (3, 5.2), and the predicted value at x = 3:
residual = observed y - predicted y = 5.2 - 4.75 = 0.45
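The residual arithmetic can be reproduced in a few lines. The sketch below is an illustration only: `line_through` is a hypothetical helper, and the inputs are the two points (2, 5.7) and (6, 1.9) and the observed point (3, 5.2) from the example.

```python
def line_through(p1, p2):
    """Slope and intercept of the line through two points (hypothetical helper)."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept

slope, intercept = line_through((2, 5.7), (6, 1.9))

observed_x, observed_y = 3, 5.2
predicted_y = slope * observed_x + intercept
residual = observed_y - predicted_y  # observed minus predicted

print(round(slope, 2), round(intercept, 2))        # -0.95 7.6
print(round(predicted_y, 2), round(residual, 2))   # 4.75 0.45
```

A positive residual means the observed point lies above the line; a negative residual means it lies below.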


EXAMPLE Finding the Least-squares Regression Line
Using the drilling data:
(a) Find the least-squares regression line.
(b) Predict the drilling time if drilling starts at 130 feet.
(c) Is the observed drilling time at 130 feet above or below average?
(d) Draw the least-squares regression line on the scatter diagram of the data.

(a) We agree to round the estimates of the slope and intercept to four decimal places: $\hat{y} = 0.0116x + 5.5273$
(b) $\hat{y} = 0.0116(130) + 5.5273 = 7.035$
(c) The observed drilling time is 6.93 minutes. The predicted drilling time is 7.035 minutes. The drilling time of 6.93 minutes is below average.
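Parts (b) and (c) amount to evaluating the regression line and comparing the result with the observation. The sketch below assumes the rounded slope 0.0116 and intercept 5.5273 quoted elsewhere in this chapter for the drilling data; the function name `predict_time` is hypothetical, and the drilling data themselves are not reproduced here.

```python
# Rounded least-squares estimates for the drilling data (time in minutes, depth in feet).
SLOPE = 0.0116
INTERCEPT = 5.5273

def predict_time(depth_ft):
    """Predicted time (minutes) to drill five feet starting at depth_ft."""
    return INTERCEPT + SLOPE * depth_ft

predicted = predict_time(130)
observed = 6.93
residual = observed - predicted

print(round(predicted, 3))  # 7.035
print("below average" if residual < 0 else "above average")  # below average
```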

(d) [Figure: scatter diagram with the least-squares regression line; vertical axis: Time to Drill 5 Feet; horizontal axis: Depth Drilling Begins]

Find the least-squares regression line for temperature (x) and number of ice cream cones sold per hour (y).
x:  65  70  75  80  85  90  95  100  105
y:   8  10  11  13  12  16  19   22   23
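The slope and intercept for this exercise can be computed directly from the data. The following is a minimal sketch, with `least_squares_line` as a hypothetical helper name; it uses the formulas slope = S_xy / S_xx and intercept = (mean of y) - slope × (mean of x).

```python
def least_squares_line(xs, ys):
    """Slope and intercept of the least-squares line (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept

temperature = [65, 70, 75, 80, 85, 90, 95, 100, 105]
cones_sold = [8, 10, 11, 13, 12, 16, 19, 22, 23]

slope, intercept = least_squares_line(temperature, cones_sold)
print(round(slope, 3), round(intercept, 3))  # 0.383 -17.694
```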


Interpretation of Slope: The slope of the regression line is 0.0116. For each additional foot of depth at which we start drilling, the time to drill five feet increases by 0.0116 minutes, on average.
Interpretation of the y-Intercept: The y-intercept of the regression line is 5.5273. To interpret the y-intercept, we must first ask two questions:
1. Is 0 a reasonable value for the explanatory variable?
2. Do any observations near x = 0 exist in the data set?
A value of 0 is reasonable for the drilling data (this indicates that drilling begins at the surface of Earth). The smallest observation in the data set is x = 35 feet, which is reasonably close to 0. So, interpretation of the y-intercept is reasonable. The time to drill five feet when we begin drilling at the surface of Earth is 5.5273 minutes.

If the least-squares regression line is used to make predictions based on values of the explanatory variable that are much larger or much smaller than the observed values, we say the researcher is working outside the scope of the model. Never use a least-squares regression line to make predictions outside the scope of the model because we can't be sure the linear relation continues to exist.

The least-squares regression line for temperature (x) and number of ice cream cones sold per hour (y) is $\hat{y} = 0.383x - 17.694$. Predict the number of ice cream cones sold per hour when the temperature is 88º.
A. 51.4
B. 10.1
C. 16.0
D. 14.2
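The prediction can be checked by substituting into the regression line. This sketch assumes the line is $\hat{y} = 0.383x - 17.694$, which is what fitting the temperature and ice cream data in this section gives after rounding to three decimals; the function name `predict_cones` is hypothetical.

```python
# Assumed rounded coefficients from fitting the ice cream data in this section.
SLOPE = 0.383
INTERCEPT = -17.694

def predict_cones(temperature):
    """Predicted number of cones sold per hour at the given temperature."""
    return SLOPE * temperature + INTERCEPT

predicted = predict_cones(88)
print(round(predicted, 1))  # 16.0
```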

The data for temperature (x) and number of ice cream cones sold per hour (y) are shown.
x:  65  70  75  80  85  90  95  100  105
y:   8  10  11  13  12  16  19   22   23
It would be reasonable to use the least-squares regression line to predict the number of ice cream cones sold when it is 50 degrees.
A. True
B. False
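Whether a prediction falls within the scope of the model can be screened by comparing the input with the range of observed x values. The check below is a deliberately simple sketch: `within_scope` is a hypothetical helper, and treating everything outside the observed minimum and maximum as out of scope is a stricter rule than "much larger or much smaller".

```python
def within_scope(x, observed_xs):
    """True if x lies inside the range of observed explanatory values."""
    return min(observed_xs) <= x <= max(observed_xs)

temperature = [65, 70, 75, 80, 85, 90, 95, 100, 105]

print(within_scope(50, temperature))  # False: 50 is below the smallest observed value, 65
print(within_scope(88, temperature))  # True: 88 lies between 65 and 105
```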


To illustrate the fact that the sum of squared residuals for a least-squares regression line is less than the sum of squared residuals for any other line, use the regression by eye applet.

Section 4.3 The Coefficient of Determination


The coefficient of determination, $R^2$, measures the proportion of total variation in the response variable that is explained by the least-squares regression line. The coefficient of determination is a number between 0 and 1, inclusive. That is, $0 \le R^2 \le 1$. If $R^2 = 0$, the line has no explanatory value. If $R^2 = 1$, the line explains 100% of the variation in the response variable.

The data to the right are based on a study for drilling rock. The researchers wanted to determine whether the time it takes to dry drill a distance of 5 feet in rock increases with the depth at which the drilling begins. So, depth at which drilling begins is the predictor variable, x, and time (in minutes) to drill five feet is the response variable, y.
Source: Penner, R., and Watts, D.G. Mining Information. The American Statistician, Vol. 45, No. 1, Feb. 1991, p. 6.


Sample Statistics
          Mean     Standard Deviation
Depth     126.2    52.2
Time      6.99     0.781
Correlation Between Depth and Time: 0.773
Regression Analysis
The regression equation is Time = 5.53 + 0.0116 Depth

Suppose we were asked to predict the time to drill an additional 5 feet, but we did not know the current depth of the drill. What would be our best guess?
ANSWER: The mean time to drill an additional 5 feet: 6.99 minutes.

Now suppose that we are asked to predict the time to drill an additional 5 feet if the current depth of the drill is 160 feet.
ANSWER: Our guess increased from 6.99 minutes to 7.39 minutes based on the knowledge that drill depth is positively associated with drill time.
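The figure of 7.39 minutes is consistent with substituting a depth of 160 feet into the regression equation Time = 5.53 + 0.0116 Depth reported in the regression output. The worked arithmetic below is a check under that assumption, not a reproduction of the original slide:

```latex
\hat{y} = 5.53 + 0.0116(160) = 5.53 + 1.856 = 7.386 \approx 7.39 \text{ minutes}
```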


The difference between the observed value of the response variable and the mean value of the response variable is called the total deviation and is equal to $y - \bar{y}$.

The difference between the predicted value of the response variable and the mean value of the response variable is called the explained deviation and is equal to $\hat{y} - \bar{y}$.

The difference between the observed value of the response variable and the predicted value of the response variable is called the unexplained deviation and is equal to $y - \hat{y}$.


Total Variation = Unexplained Variation + Explained Variation
Dividing both sides by Total Variation:
$$1 = \frac{\text{Unexplained Variation}}{\text{Total Variation}} + \frac{\text{Explained Variation}}{\text{Total Variation}}$$
$$\frac{\text{Explained Variation}}{\text{Total Variation}} = 1 - \frac{\text{Unexplained Variation}}{\text{Total Variation}}$$
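The decomposition can be verified numerically on a small data set. The sketch below uses the temperature and ice cream data from earlier in the chapter as an illustration, with each variation taken as a sum of squared deviations; the variable names are assumptions. The identity holds when the predictions come from the least-squares line, which is why the line is fitted first.

```python
temperature = [65, 70, 75, 80, 85, 90, 95, 100, 105]
cones_sold = [8, 10, 11, 13, 12, 16, 19, 22, 23]

n = len(temperature)
x_bar = sum(temperature) / n
y_bar = sum(cones_sold) / n

# Least-squares slope and intercept.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temperature, cones_sold))
s_xx = sum((x - x_bar) ** 2 for x in temperature)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
predicted = [intercept + slope * x for x in temperature]

# Sums of squared total, explained, and unexplained deviations.
total = sum((y - y_bar) ** 2 for y in cones_sold)
explained = sum((p - y_bar) ** 2 for p in predicted)
unexplained = sum((y - p) ** 2 for y, p in zip(cones_sold, predicted))

print(round(total, 3), round(explained + unexplained, 3))
```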

To determine $R^2$ for the linear regression model, simply square the value of the linear correlation coefficient.

EXAMPLE Determining the Coefficient of Determination
Find and interpret the coefficient of determination for the drilling data.
Because the linear correlation coefficient, r, is 0.773, we have that $R^2 = 0.773^2 = 0.5975 = 59.75\%$. So, 59.75% of the variability in drilling time is explained by the least-squares regression line.

Calculate the coefficient of determination, $r^2$, for temperature (x) and number of ice cream cones sold per hour (y).
x:  65  70  75  80  85  90  95  100  105
y:   8  10  11  13  12  16  19   22   23
A. 0.946
B. 0.973
C. 0.923
D. 0.986
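Since the coefficient of determination for a linear regression model is the square of the linear correlation coefficient, this exercise reduces to computing r and squaring it. The following is a minimal sketch with assumed variable names, using the data given in the exercise.

```python
import math

temperature = [65, 70, 75, 80, 85, 90, 95, 100, 105]
cones_sold = [8, 10, 11, 13, 12, 16, 19, 22, 23]

n = len(temperature)
x_bar = sum(temperature) / n
y_bar = sum(cones_sold) / n

s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temperature, cones_sold))
s_xx = sum((x - x_bar) ** 2 for x in temperature)
s_yy = sum((y - y_bar) ** 2 for y in cones_sold)

r = s_xy / math.sqrt(s_xx * s_yy)  # linear correlation coefficient
r_squared = r ** 2                 # coefficient of determination

print(round(r, 3), round(r_squared, 3))  # 0.973 0.946
```

Note that 0.973 appears among the answer choices as a distractor: it is r, not its square.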

Draw a scatter diagram for each of these data sets. For each data set, the variance of y is 17.49.

Data Set A, Data Set B, Data Set C