Selected ACE: Data Distributions Investigation 1: #13, 17 Investigation 2: #3, 7 Investigation 3: #8 Investigation 4: #2


Investigation 1

ACE Problem

13. a. The table below shows the data for the brown candies from Bags 4-9 of Exercise 1. Make an ordered value bar graph and a line plot for these data.

Brown M&M's
Bag #                      4   5   6   7   8   9
Number of Brown Candies   14  14  15  12  16  24

b. What are the minimum and maximum values?
c. What is the range?
d. Are there gaps or clusters of data? Explain.
e. Would an ordered value bar graph or a line plot better represent the data? Explain.

Possible solution

13. a. Note: A value bar graph shows the number of M&M's in each bag. An ordered value bar graph shows the same thing, but the values are arranged in increasing order. These values are not the bag numbers, which are in essence just names. The bags could just as well have been named A, B, C, etc. The values that have to be ordered are the numbers of brown candies. These values range from 12 to 24.

[Ordered value bar graph: # of Brown M&M's (vertical axis, 0 to 24) against Bag # (horizontal axis), with the bags arranged in increasing order of value: 7, 4, 5, 6, 8, 9.]

b. Not answered here.
c. Not answered here.
d. Not answered here.

e. A line plot shows how frequently each value (# of M&M's) occurs. This is helpful in looking for clusters of data or unusual values. Thus a line plot could be helpful here in locating which values occur most frequently, where most values are clustered, which values are typical or unusual, and where significant gaps occur. However, with only 6 pieces of data, either graph would give the same information.

ACE Problem

17. a. Describe any trends or patterns in immigration to the United States from Asia from 1820 to 2000 using the graph below. (See student text for graph.)
b. Write two comparison statements about the trends from Mexico to the United States (Exercises 8-11) and from Asia to the United States from 1820 to 2000.
c. Look back at Graph 2 in Problem 1.2. As the trend for immigration from Europe was decreasing from 1961 to 2000, what happened to the trends for immigration from Mexico and Asia?

Possible solution

17. a. As a percent of total U.S. immigration, immigration from Asia was too small to be noted from 1820 to 1860, then fairly constant from 1860 to 1960 (less than 5%), and then increased dramatically from 1960 to 2000. Note: We should be cautious about deducing that the raw numbers of Asian immigrants follow the same pattern; to make that deduction we would have to know what the total immigration was for each decade. For example, we cannot say for sure that 5% of x < 35% of y without knowing the values of x and y.
b. We can compare the graphs in Exercises 10 and 17 (this exercise) because they both record immigration from a region as a percent of the total immigration to the U.S. The overall pattern of dramatic increase is very similar, but there are also differences. Mexican immigration began its increase slightly earlier than Asian immigration; Mexican immigration did not increase quite as much as Asian immigration in the decades 1970 to 1990; Mexican immigration is a smaller part of total immigration than Asian immigration in the most recent figures (in 2000); Asian immigration may have peaked in 1981-1990. Note: We could deduce that there were more Asian immigrants than Mexican immigrants in 1991 to 2000, in terms of raw numbers as well as percents, because both are percents of the same total immigration for that decade.
c. Not answered here.

Investigation 2

ACE Problem

3. a. What is the mean amount of caffeine in the soda drinks?

Soda drinks
Name                      A   B   C   D   E   F   G   H   J
Caffeine in 8 oz. (mg)   38  37  27  27  26  24  21  15  23

b. Make a line plot for the soda drinks.
c. What is the mean amount of caffeine in the other drinks?

Other drinks
Name                      A   B   C   D   Tea A  Tea B  Coffee  Cocoa  Juice
Caffeine in 8 oz. (mg)   77  70  25  21     19     10      83      2     33

d. Make a line plot for the other drinks.
e. Write three statements comparing the amount of caffeine in soda and other drinks.

Possible solution

3. Note: There are several ways to think about finding the mean. Three of these are shown below: balancing, sharing, and using an algorithm.
a. We could think of balancing the distribution of caffeine values. A line plot is a convenient way to show the distribution. The idea of balancing is similar to thinking of a teeter-totter. Students can make a quick estimate of the balance point and then use the exact values shown in the line plot to check. (This is a useful method for making an estimate when the exact values are not all known. See Samples and Populations.)

[Line plot of the soda caffeine values (axis from 15 to 38 mg) with the estimated mean marked at 26. Differences from 26 are marked on the plot: 11, 5, 3, and 2 below; +1, +1, +11, and +12 above.]

From the above graph we can see that the estimated mean of 26 is too low, because we have a total difference of +25 above the estimated mean and a total difference of only -21 below the estimated mean.

OR, we could think of sharing the amounts of caffeine, taking from higher amounts to add to lesser amounts. A bar graph would make the idea of sharing clear. The bar graph below has a horizontal line drawn across at 28. The bars are marked to show how values exceed or fall short of 28 mg of caffeine. We can see that the horizontal line has been set too high because we have only +19 mg of excess caffeine from the first two bars to share with the other values below 28 mg of caffeine. By trial and error we can find a horizontal line that makes the sharing process come out evenly.
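The two checks described above (balancing about an estimate of 26 and sharing down to a line at 28) can be repeated with a short sketch. The soda values come from the problem table; the helper name `differences` is only an illustrative choice, not part of the original materials.

```python
# Caffeine (mg per 8 oz) for soda drinks A-J, taken from the problem table.
soda = [38, 37, 27, 27, 26, 24, 21, 15, 23]


def differences(values, guess):
    """Return (total excess above guess, total shortfall below guess)."""
    above = sum(v - guess for v in values if v > guess)
    below = sum(guess - v for v in values if v < guess)
    return above, below


# Balancing about an estimated mean of 26.
above_26, below_26 = differences(soda, 26)
print(above_26, below_26)  # 25 21

# Sharing with the horizontal line drawn at 28.
above_28, below_28 = differences(soda, 28)
print(above_28, below_28)  # 19 33
```

At 26 the total above (25) is larger than the total below (21), so the estimate is too low; at 28 the excess (19) is smaller than the shortfall (33), so the line is too high. The balance point therefore lies between 26 and 28.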

[Bar graph of caffeine in mg (vertical axis, 0 to 40) by soda name (A-J), with a horizontal line at 28. Bars A and B exceed the line by +10 and +9; the remaining bars fall short by 1, 1, 2, 4, 7, 13, and 5.]

OR, we could use the algorithm. The algorithm has the advantage of giving an exact answer for the mean. (Perhaps a disadvantage is that students don't have a picture of how this mean relates to the rest of the distribution of values.)

38 + 37 + 2 x 27 + 26 + 24 + 21 + 15 + 23 = 238
238 / 9 = 26.4 (approx.)

b. See above.
c. Not answered here.
d. Not answered here.
e. Students could compare means, and use this measure of center to say that the typical other drink has a higher caffeine content. Or they might note that the mean for other drinks is affected by three very high values, so that the distribution for other drinks, shown as a line plot, has a very different shape from the distribution for soda drinks. They might choose to use the median as a measure of center instead of the mean. Notice that the mean and median are alike for the soda drinks, but quite different for the other drinks. They might say that the other drinks show much more variability, and they might measure this variability by using the range, which is 81 for other drinks and 23 for soda. This large variability for other drinks makes any attempt to say what is typical very unreliable. They might comment on significant gaps that appear in the distribution for other drinks.

ACE Problem

7. a. Compare the three sets of data. Which group of students has longer names? Explain your reasoning. (See student text for graphs.)
b. Look at the distribution for 30 students in the U.S. Suppose the data for the six names with 13 letters were each changed to 16 letters.
i. Draw a plot showing this change.
ii. Will this change affect the median name length? Explain.
iii. Will this change affect the mean name length? Explain.

Possible solution

7. a. Since the question is about longer names, students might compare the means or medians, and use these measures of center to say which data set has a longer typical name. Clearly the center is higher for Russian names.

OR, students might focus on the longest names, the maximum values in the data sets. Again the Russian set of names has the highest maximum, so the absolute longest name in all three sets is a Russian name. Note: This is often an unreliable way to decide on longest, since the maximum for any particular set is only one value, and may be very unlike other values in the set, giving a false overall impression.

OR, students might choose a benchmark, such as 15 letters, and say that more than half the Russian names are greater than or equal to 15 letters long, while only about 20% of Japanese and U.S. names are as long as this. (More on this idea of benchmarks in the next Investigation.)

b. i. If we move the 6 pieces of data from 13 letters to 16 letters, then the distribution changes its overall shape, from a generally mound-shaped distribution to a shape with two distinct mounds.

[Line plot of name lengths for the 30 U.S. students after the change; horizontal axis from 8 to 20 letters.]

ii. Notice that the 6 pieces of data that have been moved were already above the median. Moving them three units right does not change where the middle, or median, of the distribution is. What is important for calculating the median is the order of the data and the position of the middle piece of data in this order, not how far above (or below) the middle any particular group of data values is.
iii. Not answered here.

Investigation 3

ACE Problem

8. Use the line plots and table below. How much slower are the Trial 1 reaction times for nondominant hands than the Trial 1 reaction times for dominant hands? Explain. (See student text for graphs and table.)

Possible solution

8. Students have several ways to make a comparison. They might compare measures of center. From the graph we can see that the mean reaction time for the dominant hand is about 1.05 seconds, while the mean for the nondominant hand is 1.3 seconds. Typically the dominant hand is 0.25 seconds faster. If we compare the medians, the dominant hand is 0.2 seconds faster. (We can make more exact comparisons from the table.) The mean is higher than the median for the nondominant hand because of the influence of 3 unusually slow times (slower than 2 seconds).

OR, we might compare the maximum (slowest) values for each distribution. This would be a poor way to compare. The maximum values are about the same for both distributions, but we can see that the distribution of nondominant times is clearly shifted right of the dominant hand times.

OR, they might compare clusters, as a way of addressing typical times. The dominant times are clearly clustered around 1 second, while the nondominant times seem to have two clusters, around 0.8 seconds and around 1.2 seconds. This is not a very conclusive comparison because the nondominant times are more variable, and not so clearly clustered around a single value.

OR, we might choose a benchmark such as 1.4 seconds. We can say that only 4 times (out of 40) are equal to or slower than 1.4 seconds for the dominant hand, while 15 times (out of 40) are equal to or slower than 1.4 seconds for the nondominant hand.

Investigation 4

ACE Problem

2. The three pairs of line plots below display data about 50 wood roller coasters. Means and medians are marked on each graph. (See student text for graphs.)
a. Write three statements comparing wood roller coasters built before 1960 with wood roller coasters built in 1960 or later.
b. Hector says that there are too few roller coasters to make comparisons. Do you agree with Hector? Explain.

Possible solution

2. a. As in #8 of Investigation 3, we have several ways to make comparisons. Below are comparisons of Maximum Drop for the two time periods. The methods used to make the comparisons are comparing centers, comparing variability, and comparing to the same benchmark. (These same methods can be used to make comparisons of Maximum Heights and Top Speeds.)

Comparing centers: Both mean and median are greater for the later wood coasters. We can deduce that the typical wood coaster from the later era (1960-2004) has a greater maximum drop.

Comparing variability: The range for the later coasters is 215 - 35 = 180 feet. The range for the earlier coasters is 95 - 10 = 85 feet. From this we could deduce that the later coasters are more variable, BUT this range value is very much influenced by the very unusual value of 215 feet for the later coasters. If we exclude that value, the range for the later coasters would be 155 - 35, or 120 feet, which is still a larger range value than for the earlier coasters. OR, we could compare clusters. We can see that most of the later wood coasters cluster between 70 and 100 feet, while there is no evident cluster for the earlier coasters. Both of these ways of thinking about how spread out the data are for the two eras would lead us to conclude that the later era shows more variability. BUT there is so little data in the 1902-1959 set that it would be impossible for clusters to form. This makes judging variability very problematic.

Comparing to a benchmark: We could say that half the later coasters had maximum drops greater than or equal to 88 feet, while only 1 (out of 10) of the earlier coasters had a drop as great as this.

b. Not answered here.
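The range arithmetic used in the variability comparison for Investigation 4 can be checked with a minimal sketch. Only the minimum and maximum drops quoted in the solution are used; the helper name `data_range` is an illustrative choice, and no other coaster values are assumed.

```python
def data_range(minimum, maximum):
    """Range of a data set, computed from its smallest and largest values."""
    return maximum - minimum


# Maximum drop (feet), using the extremes quoted in the solution.
later_range = data_range(35, 215)                  # later coasters, all values
earlier_range = data_range(10, 95)                 # earlier coasters
later_range_without_215 = data_range(35, 155)      # later coasters, 215 excluded

print(later_range, earlier_range, later_range_without_215)  # 180 85 120
```

Removing the single value of 215 feet cuts the later range by a third, from 180 to 120 feet, which shows how strongly one unusual value can affect the range. Even so, 120 feet is still larger than the 85-foot range for the earlier coasters.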