The Circular Flow: Return Migration from the United States in the Early 1900s

University of Colorado, Boulder. CU Scholar. Economics Graduate Theses & Dissertations, Economics. Spring 1-1-2014. Zachary A. Ward, University of Colorado Boulder, zach.a.ward@gmail.com. Recommended Citation: Ward, Zachary A., "The Circular Flow: Return Migration from the United States in the Early 1900s" (2014). Economics Graduate Theses & Dissertations. 51. https://scholar.colorado.edu/econ_gradetds/51

The Circular Flow: Return Migration from the United States in the Early 1900s by Zachary A. Ward B.A., Wheaton College, 2008 M.A., University of Colorado, 2011 A thesis submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfillment of the requirements for the degree of Doctor of Philosophy Department of Economics 2014

This thesis entitled: The Circular Flow: Return Migration from the United States in the Early 1900s written by Zachary A. Ward has been approved for the Department of Economics Prof. Ann Carlos Prof. Michael Greenwood Date The final copy of this thesis has been examined by the signatories, and we find that both the content and the form meet acceptable presentation standards of scholarly work in the above mentioned discipline.

Ward, Zachary A. (Ph.D., Economics)
The Circular Flow: Return Migration from the United States in the Early 1900s
Thesis directed by Prof. Ann Carlos

Many migrants return to their home country after a short stay. Often these migrants are returning to poorer countries, which is at odds with a simple economic model in which individuals maximize lifetime earnings. In this dissertation, I explore the motivations for return migration in the early 1900s, the only time in United States history when the government recorded those leaving the country.

In the first paper, we estimate the effect of the 1920s immigration quotas on (1) out-migration rates, (2) emigration across skill groups, and (3) the duration of temporary migrants' stays in the U.S. Higher quota restrictions reduced emigration rates, mostly for unskilled laborers and farmers. Higher quota restrictions also increased duration of stay, as the share of migrants staying less than 5 years fell and the share staying 5 to 10 years rose.

In the second paper, I turn to the self-selection of return migrants. In addition to observing migrants who actually leave, I also have a dataset on migrants' intentions to leave, recorded at arrival. At least 45% more migrants returned home than had initially planned to. While those who planned to return home were negatively self-selected on skill, the negative self-selection intensified at departure. Low-skilled migrants were more likely to experience negative shocks in the United States; these failures drive the result that return migrants were negatively self-selected. However, following the migration quotas of the 1920s, return migrants were positively self-selected, as the failure rate decreased after a labor supply shock.

In the final paper, we estimate the self-selection of Mexican migrants into and out of the United States in the 1920s. Officials recorded migrant height on border crossing manifests, which we use to proxy migrant quality and to measure self-selection into migration in 1920. Migrants were positively selected on height compared to the Mexican population. We link these migrants to the 1930 U.S. and Mexican censuses to obtain samples of permanent and return migrants and to estimate the selection into return migration. Return migrants were not differentially self-selected on height relative to permanent migrants.

Dedication To my mother, who taught me the value of hard work.

Acknowledgements

This dissertation could not have been written without the help of many people. First and foremost, Ann Carlos has helped and encouraged me throughout the entire process. This project could not have been finished without her guidance. I would also like to thank Mike Greenwood, co-author of the first paper in this dissertation, who taught me the finer points of the research process. Edward Kosack, co-author of the third paper of this dissertation, was also tremendously helpful on the other chapters of this project and in learning together how to collect historical datasets. Amber McKinney has spent countless hours editing the following work. Others have been gracious with their time and help with various aspects of the project, including Lee Alston, Francisca Antman, Brian Cadena, Joe Ferrie, Zachary Feldman, Dustin Frye, Murat Iyigun, Myron Gutmann, Gisella Kagy, Priti Kalsi, Ian Keay, Frank Lewis, Jason Long, Terra McKinnish, Chris Minns, Gill Newton, Paul Rhode, Carol Shiue, and Steven Smith. I would also like to thank numerous seminar and conference participants for their input on earlier stages of this work.

Contents

Chapter

1 Introduction
2 Immigration Quotas, World War I, and Emigrant Flows from the United States in the Early 20th Century
  2.1 Introduction
  2.2 Historical Background
  2.3 Previous Work and Theory on Return Migration
  2.4 The Data
    2.4.1 Reports of the Commissioner General of Immigration (RCI)
    2.4.2 Calculating Emigration Rates
    2.4.3 Measuring Degree of Quota Restriction
    2.4.4 Emigrant Skill and Length of Stay
  2.5 Econometric Procedures
    2.5.1 Estimating Equation
    2.5.2 Measurement Error
  2.6 Empirical Results
    2.6.1 Emigration Rates
    2.6.2 Emigrant Skill
    2.6.3 Length of Stay
  2.7 Summary and Conclusions
3 Birds of Passage: Return Migration, Self-Selection, and Immigration Quotas
  3.1 Historical Background: Immigration to the United States
    3.1.1 Description of Quota Laws
  3.2 Theoretical Background
  3.3 Data
    3.3.1 Return Migrants at Departure: Administrative Data and IPUMS
    3.3.2 Planned Return Migrants at Arrival: Ship Manifests
    3.3.3 Measures of Migrant Quality: Occupational Scores and Height
  3.4 Descriptive Statistics
    3.4.1 Self-Selection of Return Migrants at Departure
    3.4.2 Planned Emigrants
    3.4.3 Self-Selection of Return Migrants Conditional on Observables
    3.4.4 Planned Length of Stay
  3.5 Heterogeneity in Return Migrant Selectivity: Uncovering the Role of Unplanned Return Migration
    3.5.1 Heterogeneity by Ethnicity
    3.5.2 Actual Return Rates and Planned Return Rates
    3.5.3 Transferring of Skill across Countries
    3.5.4 Heterogeneity by Time
  3.6 The Effect of Migration Quotas on Return Migration
    3.6.1 Effects on Immigrants at Arrival and Immigrants in Census
    3.6.2 Effects on Return Migrants at Departure
    3.6.3 Robustness Checks
  3.7 Conclusion
4 Who Crossed the Border? Self-Selection of Mexican Migrants in the Early Twentieth Century
  4.1 Introduction
  4.2 U.S.-Mexico Migration in 1920
  4.3 Selection into Migration
  4.4 Height as a Measure of Selection
  4.5 Data
    4.5.1 Border Crossing Manifests
    4.5.2 Comparison Samples: Military and Passport Data
  4.6 Estimating Self-Selection into Migration
    4.6.1 Robustness Checks
  4.7 Accounting for Return Migration
    4.7.1 Selection into Return Migration
    4.7.2 Linked Sample
    4.7.3 Estimating Selection into Return Migration
    4.7.4 Robustness of Results for the Linked Sample
  4.8 Conclusions
5 Conclusions and Further Research

Bibliography

Appendix

A Immigration Quotas, World War I, and Emigrant Flows from the United States: Appendix
  A.1 Mitchell, Maddison, and War Data
  A.2 Matching between Ethnicities and Countries
B Birds of Passage: Return Migration, Self-Selection and Immigration Quotas: Appendix
  B.1 Alternative Occupational Scores and Accounting for Sex
  B.2 Calculation of the Actual Return Migration Rate
C Who Crossed the Border? Self-Selection of Mexican Migrants in the Early 20th Century: Appendix
  C.1 Representativeness of the Sample
  C.2 Creating the Linked Sample

Tables

Table
2.1 Quota Law Allocations, by Country: 1921, 1924, and 1929
2.2 Total Emigrants and Immigrants, by Country: 1908-1932
2.3 Skill Composition and Length of Stay of Return Migrants, by Ethnicity: 1908-1932
2.4 Means and Standard Deviations
2.5 Log Emigration Rates for Country Regressions: Coefficients and Standard Errors
2.6 Log Emigrants for Ethnicity Regressions, by Skill Groups: Coefficients and Standard Errors
2.7 Relative Supply Shocks of Immigrants, by Skill Group: 1921-22 and 1924-25
2.8 Duration-of-Stay Regressions for Ethnicity Groups: Coefficients and Standard Errors
3.1 Self-Selection of Return Migrants, 1908-1932
3.2 Descriptives of Planned Return Migrants, 1917-1924
3.3 Self-Selection into Planned Migration on Quality, 1917-1924
3.4 Planned Length of Stay, 1917-1924
3.5 Planned and Actual Return Rates, 1917-1924
3.6 Transferring Skills from Source Country to the United States
3.7 Effect of Quotas on Immigrant's Occupational Score
3.8 Effect of Quotas on Out-Migrant's Occupational Score
3.9 Placebo Tests on Immigrants and Emigrant Skills
4.1 Summary Statistics for Migrant, Military, and Passport Samples
4.2 1920 Selection Regressions Comparing Migrants to the Military and Passport Samples
4.3 Alternative Sample Specifications for Migrant Selection Regressions
4.4 Summary Statistics for Permanent and Return Migrants
4.5 Regression Results for Return Selection
A.1 Matching Country to Ethnicity
B.1 Self-Selection of Actual Migrants in 1920
B.2 Self-Selection of Return Migrants, Alternative Occupational Scores
B.3 Actual Return Migrants Summary Statistics, by Ethnicity
B.4 Selection of Actual Return Migrants, 1908-1932: Percentage Point Differences
B.5 Self-Selection of Return Migrants on Occupation Score from 1910-1930
C.1 Summary Statistics for Migrant Sample and Comparison with 1920 Census
C.2 Matching Matrix

Figures

Figure
2.1 Emigration Rate for Sample Countries, 1908-1932
2.2 Logged Emigration Rates, Old versus New Migrants
2.3 Degree of Quota Restriction
2.4 Duration of U.S. Residence
2.5 Duration of Stay, Old versus New Migrants
3.1 Self-Selection of Return Migrants, 1908-1932
3.2 Planned and Actual Self-Selection of Return Migrants
3.3 Unexpected Returns and Self-Selection, 1920s
3.4 Self-Selection of Return Migrants, 1910-1930
3.5 Immigrant Occupational Scores by Cohort, Measured Upon Arrival
3.6 Immigrant Occupational Scores by Cohort, Measured at Census
3.7 Return Migrant Occupational Scores
4.1 Immigrant Flows to the United States, 1900-1929
4.2 Skill Composition and Literacy Rate of Mexican Migrants, 1908-1930
4.3 Location of Border Stations and Regions in Mexico
4.4 Heights: Immigrants, Soldiers and Passport Applicants
4.5 Heights: Permanent and Return Migrants
B.1 Self-Selection of Return Migrants, All Returns versus Male Returns
B.2 Self-Selection of Return Migrants using Alternative Occupational Scores
B.3 Self-Selection of Male Return Migrants by Decade

Chapter 1

Introduction

Migration is commonly thought to be a one-way street, where population flows primarily from low-income countries to high-income countries. However, flows often occur in both directions, as many individuals return to their country of origin. This returning flow is perplexing from the perspective of a simple economic model that predicts flows based on relative wages: return migrants often earn lower wages upon return, which may result in less consumption over the life cycle (Sjaastad, 1962). It could be that migrants are forced to return home, as in the contemporary United States, where migrants return home due to institutional constraints such as the expiration of work permits or migration visas; however, return migration was prevalent even prior to laws restricting the duration of migration (Bandiera, Rasul and Viarengo, 2013).

Return migration may be economically rational for several reasons. First, migrants may prefer a stream of consumption goods that can only be found in the source country (Dustmann and Weiss, 2007). A temporary sojourn could increase a migrant's wealth, which in turn could be used to consume a larger amount of goods at home. Second, migrants may have a promising investment opportunity at home but lack financing; accumulating savings in the host country is a way to overcome liquidity constraints (Mesnard, 2004). Third, the reality of living in the host country may deviate from expectations: migrants may decide after arrival to switch their plans from permanently migrating to returning home because of lower-than-expected income (Borjas and Bratsberg, 1996).

This dissertation explores these issues in the context of United States history, specifically the first few decades of the 20th century. The early 1900s witnessed some of the highest rates of immigration in the history of the United States, with millions of Italians, Greeks and Russians entering the country, initially without any restriction. At the same time, millions traveled back home to Europe despite higher real wages in the United States (Williamson, 1995). Migrants did not only arrive from and return to European sources; Canadian and Mexican migration also started to increase, particularly after the migration quotas of the 1920s limited European migration (Carter et al., 2006).

The early 20th century provides a unique setting to study return migration because of a fortunate (for the researcher) series of events that eventually led to the creation of return migration data, which seldom exist for other countries. The data ultimately exist because policy makers desired quantitative proof of the merits (or demerits) of allowing foreigners into the country; since the dissertation relies heavily on these data, the story of how they came to exist is worth telling (Zeidel, 2004).

Many native-born voters had been trying to restrict migration into the country for decades prior to the 1910s; migration restrictions were not a brand-new idea when they finally were implemented. Waves of nativist resentment against the foreign-born crop up often throughout United States history, usually coinciding with a rise in numbers of a new migrant group, wars abroad or economic downturns. One prominent example of an anti-foreigner movement was the advent of the Know Nothing party of the 1850s, which rallied to gain almost a quarter of the seats in the House of Representatives primarily on a platform of being anti-Catholic, anti-Irish and anti-German (Cohn, 2000).
During these xenophobic backlashes, Congress would attempt to limit incoming flows either through qualitative measures (e.g., ability to speak English) or quantitative measures (e.g., total number of migrants allowed annually). Most of these proposed laws failed in committee or on the floor of the House or Senate, but sometimes they would gain enough votes to pass through the legislative branch. When this happened, the proposed policies were often met with a presidential veto; for example, a literacy requirement for entrance was vetoed on four separate occasions from the 1890s to the 1910s (Goldin, 1994).

One particularly strong push to limit migration occurred in 1907 but ultimately succeeded only in limiting numerically insignificant classes of migrants, such as "imbeciles" and "feeble-minded" migrants. A more important consequence than limiting these small and ill-defined categories of migrants was the creation of the Dillingham Commission, a bipartisan committee appointed by Congress to conduct research on the types of people who entered the country, their actions within the United States and their effects on natives. Congress created the Dillingham Commission due to a healthy suspicion of the nativists' claim that migration was harmful to the economy. Many members of Congress wanted statistics to back this claim; indeed, the Dillingham Commission spent four years collecting data before publishing 41 volumes of statistics on migration to the United States.

Coincident with the creation of the Dillingham Commission was the collection of the first return migration statistics. Part of the concern over migration during the early 1900s was that migrants had trouble assimilating into American culture and the American economy, and that this trouble led migrants to return home. However, many aspects of the return flow were unknown, including its magnitude and its age, sex and skill composition. The Immigration Act of 1907 required outgoing ships to tabulate foreign-born passengers and send the ship manifests to the Bureau of Immigration. The Bureau then aggregated the ship manifests annually and reported to Congress the number of outgoing migrants. Thus, the first data on return migration in the United States were created. This data source continued to record in great detail the number and type of out-migrants between 1908 and 1932.

Fortunately, the data begin prior to a sequence of events that would forever alter immigration to the United States.
First, the era of free mass migration to the United States ended at the onset of World War I, as many ships halted their trips across the Atlantic. Second, a literacy test requirement was imposed in 1917; it was the first major qualitative restriction on migration and required incoming migrants to be able to read and write. Third, immigration quotas were first put in place in 1921 and would remain until a reworking of the quotas in 1965. Since the data begin prior to these major events in migration history, they allow the study of how these events affected return migration.

These shocks to migration flows led to major shifts in the number and types of people who arrived in the United States. For example, the first year of migration quotas coincided with a 50% drop in the volume of arriving migrants. At the same time, due to family reunification preferences under the quota law, the composition of flows shifted toward females and children, who came to join a husband or father already in the United States. Further, migration quotas did not restrict migrants from the Western Hemisphere; accordingly, more migrants started to arrive from Canada and Mexico (Carter et al., 2006). Using data on out-migrants, one can research how out-migrants responded to these policy shocks.

Besides data from Congressional reports, two other data sources help to uncover aspects of return migration. First, starting in 1917, incoming migrants had to report their intended duration of stay: whether they planned to stay permanently or planned to return home. I use these data to uncover whether return migration was planned prior to arrival or whether the decision was made after arrival. Understanding whether return migration was planned (e.g., to save money for investment back home) or unplanned (e.g., income lower than expected) is fundamental for understanding how economic shocks affected return migration (Yang, 2006). For example, a positive income shock could lead to more return migration, as a savings target is easier to hit; on the other hand, a positive income shock could lead to less return migration, as fewer migrants have lower-than-expected earnings. The second data source comes from linking migrants across countries.
With the advent of digitized censuses, one can use a migrant's name, birthplace and age to discover whether he stayed in the United States or returned home. This method is only available for countries that have digitized census data; I am able to apply it to Mexican migrants in the fourth chapter. The main advantage of this method is that it yields micro-data on the type of people who leave the United States; the annual reports can only show aggregates.

In the following papers of the dissertation, I use these data sources to address the old question of why migrants return home. Each of the papers studies the rate of return and the types of people who return home, approaching the question with different methods or data sources. One focus of the dissertation is the effect of immigration quotas, because they serve as a natural experiment that varies the number of incoming migrants each year; the resulting effects show that competition among foreigners and failures in the labor market are major explanations of return migration.

The research has implications for how the migrant stock evolves in the United States and for the rate of migrant assimilation. If return migrants were mostly lower skilled, then the measured rate of migrant assimilation in the United States would mechanically increase, simply because fewer low-skilled migrants would remain physically present (Borjas, 1985). Typically, migration policy attempts to increase assimilation rates by increasing the quality of incoming migrants; however, it could also increase assimilation rates by decreasing the quality of outgoing migrants. By learning how migration policy worked in the past, one can get closer to understanding the evolution of the migrant stock in the present.

Chapter 2

Immigration Quotas, World War I, and Emigrant Flows from the United States in the Early 20th Century

Co-authored with Michael Greenwood. Forthcoming in Explorations in Economic History.

2.1 Introduction

From 1908 to 1932, 12 million individuals migrated to the United States.[1] Over the same period, four million returned to their source country. Despite the magnitude of return migration, little is known about the factors that influenced out-migration of prior immigrants. Using annual U.S. administrative data from 1908 to 1932, we study a tumultuous period of U.S. immigration history that began with migrants freely moving across borders with limited institutional barriers and ended with tight restrictions on entering the U.S. and with the Great Depression. World War I interrupted migrant flows due to limited travel across the Atlantic, and soon after the War the U.S. imposed binding limits on immigration with the quota laws of the 1920s. This paper examines not only return flows to Europe, but also the skills and duration of stay of temporary migrants. While the quotas altered the numbers and composition of migrants into the U.S., little is known about how they changed migration out of the U.S.

[1] These figures are based on U.S. administrative records from 1908-1932.

Despite the importance of return migration, data on temporary migrants and return migrants are limited. Researchers either use panels that follow migrants across countries (e.g., the Mexican Migration Project) or use selective attrition from panels to estimate out-migration (Abramitzky et al., 2014; Lubotsky, 2007; Van Hook et al., 2006), methods that are costly and either do not comprehensively measure emigration or do not actually observe out-migrants.[2] Administrative records that aimed to enumerate every emigrant from 1908 to 1932 avoid these problems for historical U.S. emigration. Contemporary data on return migration do not exist, but historical data are reasonably detailed. The historical data report the number of out-migrants at the time of departure (including their duration of stay and occupation upon departure), which allows the study of how out-migration changed over time: under free mobility in and out of the United States (pre-1913), during World War I (1914-1918), and finally under the binding restrictions that followed (1917-1932).[3]

In this study, we take advantage of the richness of the emigration data for this time period to investigate how the quotas affected the number of departures, the duration of stay, and the occupations of emigrants. We use three distinct changes in policy that occurred in 1921, 1924, and 1929 to estimate the causal impact of the quotas on out-migration to major European countries. The quotas changed both in absolute number and in allocation across countries, which led to significant within-country changes in restrictions during the 1920s. We measure the restrictiveness of the quotas as the reduction from potential immigrant flows, with potential flows based on previous immigration flows from 1908 to 1914.
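The restrictiveness measure described above can be illustrated with a minimal sketch. It assumes that the "reduction from potential immigrant flows" is the share of the potential flow that the quota cuts off, and that the potential flow is the average annual inflow over 1908-1914; the exact construction used in the chapter may differ. The function names and all numbers are hypothetical and are not taken from the data.

```python
def average_flow(annual_flows: list[float]) -> float:
    """Mean annual inflow over the baseline years (assumed here: 1908-1914)."""
    return sum(annual_flows) / len(annual_flows)


def quota_restriction(potential_flow: float, quota: float) -> float:
    """Share of the potential immigrant flow that a quota cuts off."""
    if potential_flow <= 0:
        raise ValueError("potential_flow must be positive")
    # Assumption: a quota at or above the potential flow is non-binding,
    # so the measured restriction is floored at zero.
    return max(0.0, (potential_flow - quota) / potential_flow)


# Hypothetical annual inflows for one country, 1908-1914 (illustration only).
baseline = [90_000, 110_000, 100_000, 95_000, 105_000, 120_000, 80_000]
potential = average_flow(baseline)  # 100,000 migrants per year

# A hypothetical quota of 40,000 removes 60 percent of the potential flow.
restriction = quota_restriction(potential, quota=40_000)
print(f"{restriction:.2f}")
```

Under this reading, a restriction of 0.60 means that the quota admits only 40 percent of the flow that would have been expected from pre-war migration.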
Furthermore, since out-migration was recorded prior to the implementation of the quotas in 1921, we are able to verify that more and less restricted countries had similar trends in out-migration rates prior to the implementation of the restrictions, suggesting that a difference-in-differences empirical specification is valid for estimating the causal effect of the policy.

[2] Another recent strategy to create return migration data sets is found in Abramitzky et al. (2012) and in the third paper of this dissertation, where migrants to the United States are linked to source country censuses. However, this method is available only for countries that have digitized censuses.

[3] Congress qualitatively restricted migration by requiring migrants to pass a literacy test in 1917. Later, Congress quantitatively restricted migration through immigration quotas in 1921.

We find that following the implementation of quotas, out-migration fell more than immigration, leading to a decline in emigration rates. Prior evidence suggests that migrant inflows tend to drive out local residents (Boustan et al., 2010; Card, 2001), and that immigrants primarily affect previous immigrants in the labor market by being close substitutes (Giovanni and Peri, 2012). Many migrants appear to have taken advantage of reduced inflows and improved U.S. employment opportunities following immigration quotas by being more likely to remain permanently in the United States rather than return to their source country. For a quota that restricted 60 percent of the potential flow (the average restriction for the policy during the 1920s), emigration rates fell by 22 percent (when Germany is separated from the analysis) to 55 percent (when Germany is included), an economically significant change in out-migration.

The effect of the quotas fell more heavily on unskilled and agricultural workers, whose out-migration fell, whereas out-migration of semi-skilled workers and professionals was less affected. This evidence is consistent with a simple labor market model in which an immigration quota causes a large supply shock in unskilled labor markets and a mild supply shock in skilled labor markets. Moreover, those migrants who returned to their source countries stayed for a longer time before returning, likely taking advantage of improved job opportunities.

Our analysis also yields a number of other observations about emigration rates during this time period. Migrants were less likely to return to war-ravaged countries, as proxied by the source country's population loss. The data verify that out-migration was counter-cyclical (Jerome, 1926), and that emigration rates were associated with changes in the savings of incoming migrants, where more savings were correlated with longer durations of stay. Stronger network connections also were associated with lower out-migration rates. Finally, German migrants were much more likely to return home during the 1920s, perhaps as a result of discrimination following World War I (Moser, 2012).
2.2 Historical Background

Our study examines three important periods in U.S. immigration history: (1) the end of the Age of Mass Migration (1850-1913), (2) World War I (1914-1918), and (3) the beginning of the first major restrictions on immigration (1917-1932), which were at first qualitative (1917) and later quantitative (1921 and beyond). An extensive literature is available on migration to the U.S. during the period of mass migration, with Hatton and Williamson (1998) and many others discussing U.S. immigration from Europe during the late 19th and early 20th centuries. However, return migration during the Age of Mass Migration has garnered considerably less interest, with only a few scholars attempting to estimate out-migration rates (Bandiera et al., 2013; Gould, 1980; O'Rourke and Williamson, 1999); these estimates show that new migrants were much more likely to return than old migrants.[4] Biavaschi (2013) extends knowledge on the type of migrant who left the United States from 1908-1957, showing that out-migrants were of working age, male, and more likely to have unskilled occupations. We approach out-migration from a different perspective by focusing on how the rate of out-migration changed during the early 20th century, specifically in response to World War I and immigration quotas.

The shift in source-country composition of immigrants from Northern and Western Europe to Eastern and Southern Europe during the late 19th century resulted in political pressure to curtail these new migrants. Francis A. Walker, the first president of the American Economic Association, argued in 1896 that the U.S. needed to protect "the American rate of wages, the American Standard of living, and the quality of American citizenship from degradation through the tumultuous access of vast throngs of ignorant and brutalized peasantry from the countries of Eastern and Southern Europe" (Walker, 1896, p. 823). Not only did nativists argue that new migrants were less skilled, less ambitious, and less educated than previous migrants, but they also criticized these new migrants as "birds of passage" who made no effort to assimilate into American society since they returned home quickly (Bailey, 1912).
Nativists, it seems, preferred migrants to remain in Europe, but if they decided to come to the United States, they were expected to remain in the U.S., learn the language, and assimilate.

[4] New migrants refer to people born in Eastern and Southern Europe (Italy, Portugal and Spain in our sample). Old migrants refer to individuals born in Northern and Western Europe (Belgium, Denmark, England, France, Ireland, Germany, Netherlands, Norway, and Sweden in our sample). We have more old migrant countries because boundary changes for many Eastern European countries following World War I affect calculations of out-migration rates.

The Age of Mass Migration ended abruptly with the onset of World War I. The War not only limited mobility across the Atlantic, but also increased employment opportunities in the U.S.; both of these factors lowered return migrant flows. Although the U.S. remained neutral until 1917, the early years of war were marked by increased production of military goods to satisfy European demand. Employment increased by 3.3 million in the private non-farm sector from 1914 to 1918, causing the unemployment rate to drop from 7.9 percent in 1914 to 1.4 percent in 1918 (Rockoff, 2004, Table 2.2). When the War ended, shipping lines returned to normal, and U.S. military personnel returned from Europe. Pre-War immigrants emigrated from the U.S. en masse following the end of the war, in part because of the safer return journey and in part because of the increased competition with veterans for jobs.

The era of free entry into the U.S. survived until 1917 despite numerous attempts by Congress to restrict both the type and number of individuals who could enter the country (Goldin, 1994).[5] The literacy test was the first major method of restriction proposed to increase the quality of migrants. This requirement was controversial, as many argued that such a test would exclude industrious migrants from flourishing in the U.S. The literacy test was vetoed three separate times by Presidents Cleveland, Taft, and Wilson based on these suspicions. Congress, which finally was able to override President Wilson's veto and pass the Literacy Act of 1917, did not settle for this qualitative restriction on immigration, but went further to pursue numerical restrictions with the Emergency Quota Act of 1921 and the National Origins Acts of 1924 and 1929 (Zeidel, 2005). The Emergency Quota Act of 1921 limited total U.S. immigration to a little more than 300,000, well over a 50 percent drop from admittances in 1920.
The National Origins Act of 1924 limited immigration further, to around 165,000, with the National Origins Act of 1929 finally settling on an annual quota of 150,000 and reallocating the quotas to match the perceived ethnic origins of the United States.

[5] Migration from certain countries was restricted with the Chinese Exclusion Act of 1882 and the Gentlemen's Agreement with Japan in 1907. Moreover, a number of minor restrictions on entering the U.S. were also in place; these restrictions tended to be of a personal nature imposed on individuals such as polygamists and anarchists. Such restrictions kept out a very small number of potential migrants.

The quota acts radically transformed migration to the United States. Now migrants had to apply to foreign consuls and acquire visas to travel to the U.S. Since migrants needed approval across the Atlantic, this system obviated the need for immigration stations like Ellis Island that processed migrants upon arrival. Additionally, the quotas allowed immigrants to apply for a permit to leave the United States for up to a year and return as a non-quota migrant.[6] Finally, during this time period, reuniting with family overseas was much easier for those migrants who became naturalized citizens, because family preferences were established within the quota system.[7] Children (under age 18) and wives of naturalized citizens did not count against the quota, and preference within quotas was given to family members (parents and children under 21) if they were joining a naturalized citizen.

Table 2.1: Quota Law Allocations, by Country: 1921, 1924, and 1929

                                             Quota Laws             Mean Immigrant Flow
Country                               1921       1924       1929        (1908-1914)
Belgium                              1,563        512        517              5,186
Denmark                              5,694      2,789      1,181              6,117
France                               5,729      3,954      3,086              8,209
Germany                             68,059     51,227     25,957             31,292
Ireland                                  -     28,567     17,853             27,571
Italy                               42,057      3,845      5,802            202,222
Netherlands                          3,607      1,648      3,153              6,625
Norway                              12,202      6,453      2,377             11,874
Portugal                             2,520        503        440              9,166
Spain                                  912        131        252              5,021
Sweden                              20,042      9,561      3,314             16,642
UK (includes Ireland for 1921)      77,342     34,007     65,721             42,658

Sources: Quota limits taken from the Reports of Commissioner General of Immigration 1922, 1925 and 1930. Adapted from Table 2.2.1 in Greenwood and McDowell (1999).

As the quota laws were intended to do, they greatly restricted immigration from the new source countries, while having a limited influence on immigration from the old source countries. We show this greater restrictiveness on new migrants by listing the quota allocations by countries from the 1921, 1924 and 1929 laws in Table 2.1.
We also include the average number of migrants from 1908 to 1914 to compare quota limits to incoming flows for each country. Among many old source countries, the 1921 quota did not restrict flows at all; Germany's quota number was about 30,000 more than the average number of German immigrants from 1908 to 1914, and the United Kingdom's quota also did not bind. On the other hand, the new source countries, such as Italy and Spain, had highly restrictive quotas. The 1924 quota limited new source countries even more, but also caused the quotas to bind for some old source countries. Among selected old source countries, the number of admittances in fiscal year 1926 (after the second quantitative restriction of the 1920s) as a percentage of the 1921 number (before the first quantitative restriction became fully effective) was as follows: Germany, 741.6 percent; England (including Scotland and Wales), 49.9 percent; Ireland (including Northern Ireland), 87.6 percent; and Sweden, 92.8 percent. Among selected new source countries, comparable percentages are: Italy, 3.7 percent; Spain, 1.4 percent; Portugal, 3.5 percent; Greece, 3.9 percent; and Poland, 7.5 percent.

[6] Between 1921 and 1924, individuals returning from a temporary visit abroad were counted against the quota, but were allowed to enter the country even if the quota was full.

[7] To become a citizen, one needed to remain in the United States for at least five years, declare an intent to become a citizen two years previous to naturalization, and be able to speak (but not read or write) English.

2.3 Previous Work and Theory on Return Migration

Since they voluntarily return to (presumably) lower wages in their source countries, temporary migrants defy simple income-maximization hypotheses.[8] Whereas some migrants return home for family, cultural, and lifestyle reasons (Gibson and McKenzie, 2011), many temporary migrations are financially strategic.[9] For example, Yang (2006) finds that most Filipino migrants take advantage of higher wages from exchange rate shocks and extend the duration of migration. Moreover, since some emigrants return as entrepreneurs starting a business (Dustmann and Kirchkamp, 2002), certain economists have modeled temporary migration as a way to relieve capital constraints (Galor and Stark, 1991; Mesnard, 2003).
[8] See Sjaastad (1962) for an income maximization model of migration, and see Dustmann and Weiss (2007) for a review of contemporary return migration models.

[9] To motivate return migration, economists initially modeled temporary migrants with a preference for home country consumption, as individuals return to a familiar culture, food, and climate (Hill, 1987; Djajic and Milbourne, 1988). Gibson and McKenzie (2011) support the claim that out-migrants do not solely factor wages in their migration decision by finding that the highest skilled migrants from Pacific Island countries (New Zealand, Tonga, Papua New Guinea) return home mostly for family and lifestyle reasons. Constant and Massey (2003) support this claim by concluding that migrants to Germany are more likely to remain if they "feel German."

A simple labor market model would predict that severely restrictive immigration quotas would shift the supply of labor inward and drive up wages, but the effect is not that straightforward. The effect of an immigration supply shock on natives depends on the substitutability of immigrant and native workers, and various methods have been employed to estimate the elasticity of substitution, with no conclusive answer (Borjas, 2003; Borjas and Katz, 2007; Card, 2001).[10] While the impact of immigrants on natives is controversial, the effect of immigration on previous immigrants is less so: having similar language, education, skills, and experience, arriving migrants substitute for previous migrants and reduce their wages (Ottaviano and Peri, 2012).

While Ottaviano and Peri demonstrate this result for contemporary migration, it likely also holds for the early 20th century. The skill composition of incoming migrant cohorts was relatively constant over time before quotas were implemented, with 55 to 65 percent of old migrants and 70 to 80 percent of new migrants consistently reporting unskilled jobs upon arrival.[11] After the quotas, approximately 55 to 60 percent of both new and old migrants were unskilled. Having similar skills, the newer cohorts likely competed for jobs with the cohorts that had arrived just before them. Furthermore, arriving migrants tended to settle in the same areas (and thus labor markets) as previous migrants, with 86 percent of incoming migrants reporting that they were joining a friend or relative upon arrival. Thus, migrant quotas likely caused a leftward supply shock in previous migrants' labor markets. Lifecycle migration models predict that higher wages cause individuals to remain longer in the U.S. and that some migrants find the wage increase attractive enough to stay permanently, decreasing the out-migration rate.[12]

Economic theory also predicts that the effects of quotas on the migrant population could have been heterogeneous if the supply shocks were differential across skill groups.
Recent empirical strategies in Borjas (2003) and Ottaviano and Peri (2012) split immigrants and natives into skill groups partly based on education, arguing that substitutability is highest within education groups. They use variation in immigration flows across skill groups to estimate the effects of immigration on the labor market. If we apply this logic to the quotas, because policy caused the flow of unskilled migrants to fall relatively more than that of higher skilled groups, the unskilled immigrant labor market presumably sustained a stronger shock (Massey, 2012). In our study, we test the prediction that unskilled migrants respond more strongly to migration quotas than other skill groups, and expect the out-migration rate to decrease more for unskilled individuals than for semiskilled or professional workers.

[10] For a review of the earlier literature see Friedberg and Hunt (1995).

[11] These percentages are based on authors' calculations from Annual Reports.

[12] An alternative to life-cycle models of migration is savings-target models, which predict that duration of stay shortens as wages increase. The data in the paper are more consistent with life-cycle models.

Departing from the focus in the literature on the impact of immigration on natives, we study solely the foreign-born. Since temporary migration was a large fraction of the total migrant flow during the period of study (Bandiera et al., 2013), the impacts of the quotas on temporary and permanent migrants must be separated in order to assess the overall effects of policy. Here we study only the temporary migrants.

2.4 The Data

2.4.1 Reports of the Commissioner General of Immigration (RCI)

The United States recorded individuals entering the country beginning in 1820, but not until 1908 did officials keep track of those leaving.[13] From 1908 to 1932, the Commissioner General of Immigration issued a yearly report to Congress about the state of immigration in the United States (Report of the Commissioner General of Immigration, or RCI henceforth).[14] These reports are the basis for U.S. immigration rates in Ferenczi and Willcox (1929). For every emigrant leaving the country, ship captains filled out a manifest list similar to the ones completed for immigrants entering the country. The Commissioner General of Immigration collected these manifests, tabulated the data, and provided reports to Congress analyzing flows to and from the United States. These data are split into two categories, one listing out-migrants by country and the other listing out-migrants by ethnicity.
The country data list only total out-migrants, whereas the ethnicity data also contain occupations and length of stay of those leaving. However, ethnicity data are more aggregated than country data, causing them to yield fewer observations. Thus, in this paper we use the country data for out-migration rates but use ethnicity data for skill and duration of stay regressions.

[13] This is due to the Immigration Act of 1907, which also created the Dillingham Commission in order to collect more statistics on immigration.

[14] These reports go until 1957, but do not have sufficient detail for the study following 1932. For other studies with the same data, see Greenwood (2007), Greenwood (2008), and Lafortune and Tessada (2012).

Official statistics recorded individuals leaving as either emigrants (alien residents leaving permanently) or as non-emigrants (nonimmigrants leaving or alien residents departing temporarily and reentering with a permit).[15] While out-migrants leaving the United States temporarily only to quickly return comprise a noteworthy group, we focus on migrants intending to leave the United States permanently.

Administrative data have been criticized in that they significantly underestimated both the inflow and outflow of migrants from 1892 to 1924 (Bandiera et al., 2013). Underestimates were most significant following the end of World War I, and are attributed to haphazard collection of ship records and undercounting of cabin passengers. A systematic undercounting of emigrants would create non-classical measurement error in our dependent variable (emigration rates), potentially biasing estimation. We discuss the effects of measurement error in the empirical section and argue that the error biases the data against finding an effect of the quotas.

While criticisms of the data are noteworthy, official statistics provide advantages over other estimates of outflows from the United States. First, official statistics are the only available data to estimate the effects of quotas by providing annual variation in out-migration. Yearly data are crucial for understanding how policy changes in 1921, 1924, and 1929 influenced emigration. Second, official statistics record information on age, sex, and skills of the emigrants upon departure.
Whereas demographic techniques can capture age and gender outflows, they cannot recover information about the skills of those who left, which is critical for understanding the distributional impact of the quotas on the skill of the migrant stock. Third, demographic techniques do not capture variation in the duration of stay of emigrants.

[15] "In the classification of aliens the terms (1) immigrant and emigrant and (2) nonimmigrant and nonemigrant, respectively, relate (1) to permanent arrivals and departures and (2) to temporary arrivals and departures. In compiling the statistics under this classification the following rule is observed: arriving aliens whose permanent domicile has been outside the United States who intend to reside permanently in the United States are classed as immigrant aliens; departing aliens whose permanent residence has been in the United States who intend to reside permanently abroad are classified as emigrant aliens; all alien residents of the United States making a temporary trip abroad and all aliens resident abroad making a temporary trip to the United States are classed as nonemigrant aliens on the outward journey and nonimmigrants on the inward." (RCI, 1913, p.6)

Table 2.2: Total Emigrants and Immigrants, by Country: 1908-1932

Country          Emigrants    Immigrants    Emig/Imm    Emigration Rate (per 000s)
Ireland*            44,613       452,216       0.099                         4.013
Sweden              40,563       243,351       0.167                         6.031
Denmark             14,895        90,746       0.164                         6.636
Netherlands         13,319        88,844       0.150                         7.453
Germany             96,059       645,279       0.149                         8.077
Norway              41,984       178,503       0.235                         8.093
England            133,991       533,910       0.251                        10.211
Belgium             16,971        62,848       0.270                        14.300
Portugal            57,752       140,218       0.412                        21.455
France              66,962       133,350       0.502                        27.570
Italy            1,225,038     2,091,097       0.586                        30.531
Spain               65,585       107,556       0.610                        41.561
Total            1,817,732     4,767,918       0.381                        15.494

Sources: Reports of the Commissioner General of Immigration. Data covers 1908-1932.
*During this time period the Irish Free State split from the United Kingdom, but statistics still report separately the number of immigrants coming from the island of Ireland previous to 1922. For the years post 1922, we combine both the Republic of Ireland and Northern Ireland to have consistent borders.

In Table 2.2, we report basic immigration and emigration statistics for the 12 countries that form our database.[16] Over 1.8 million prior U.S. immigrants migrated back to these European countries between 1908 and 1932, nearly a third of immigrant flows. Our sample includes countries that have long traditions of migration to the United States (e.g., United Kingdom and Germany), and also the new migrants from Southern Europe (e.g., Spain, Portugal and Italy). Generally, new migrant countries had much higher return migration rates. Italy far surpassed other countries in the total number of individuals leaving, with 1.2 million departing the United States compared to 2 million entering. France also had a high ratio of return migrants to immigrants, abnormal among Western and Northern European countries.
While the ratio of total emigrants to immigrants is informative, it is not a rate because it has little reference to the population at risk to remigrate.[17] The last column in Table 2.2 is the emigration rate measure we create in the following section.

[16] We have data on some Eastern European countries, but country boundaries changed significantly in the aftermath of World War I, so we keep only the 12 listed in Table 2.2.

[17] Bandiera et al. (2013) term the ratio of outflows to inflows as the out-migration rate, but it is really a measure of the net interchange in a given decade.

2.4.2 Calculating Emigration Rates

Immigration rates are straightforward to calculate since those at risk to move are the population of the country at the beginning of the migration interval (including those who migrate over the interval). The dependent variable in our main regression is the emigration rate for a country/year, requiring a calculation of the at-risk population. The at-risk population for return migrants from the United States is the population of those from the source country currently living in the U.S.; however, only the census provides a complete count of the at-risk population, at ten-year intervals. Rather than interpolating the foreign-born population from census to census to proxy the at-risk population, we take advantage of information on how long migrants stayed within the U.S. before returning. For out-migrants, over 95 percent of those who departed had entered the country within the previous 20 years (see Table 2.3, last column). Since close to 100 percent of those who departed lived in the U.S. for 20 years or less, we use immigrants from the past twenty years to proxy the at-risk population. Thus, our emigration rate is

\[
\text{EmigrationRate}_{ct} = \frac{\text{Emigrants}_{ct}}{\sum_{\tau = t-20}^{t-1} \text{Immigrants}_{c\tau}} \tag{2.1}
\]

In Equation (2.1), c denotes country, t denotes the year, and the sum in the denominator runs over the twenty years preceding t. This measure captures emigration rate changes on an annual basis. A problem with the denominator in the measure is that it does not account for the out-migration of the prior immigrants over the previous 20 years, who are no longer at risk because they have already departed. Before 1908, no measure of out-migration exists, so we make no effort to adjust the population at risk. Since the true population at risk was actually smaller than our measure of the population at risk, the actual emigration rates were higher than our estimates.[18]

In Figure 2.1, we show emigration rates across time for the countries reported in Table 2.3.
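As a minimal sketch of how the measure in Equation (2.1) can be computed for one country, the following Python function sums immigrant arrivals over the twenty years before year t and divides that year's emigrants by the sum. The dictionary layout, the function name, and the toy counts are assumptions made for illustration and are not taken from the RCI; the scaling to a rate per thousand follows the "per 000s" label in Table 2.2.

```python
def emigration_rate(emigrants, immigrants, year, window=20, per=1000):
    """Emigration rate for one country in `year`, in the spirit of Equation (2.1).

    emigrants, immigrants: dicts mapping year -> annual count (hypothetical layout).
    The at-risk population is proxied by immigrant arrivals in the `window`
    years before `year`, i.e. t-20 through t-1.
    """
    prior_years = range(year - window, year)
    missing = [y for y in prior_years if y not in immigrants]
    if missing:
        raise ValueError(f"immigration counts missing for years: {missing}")
    at_risk = sum(immigrants[y] for y in prior_years)
    if at_risk <= 0:
        raise ValueError("at-risk population must be positive")
    # Scale to emigrants per `per` at-risk migrants (per thousand by default).
    return per * emigrants[year] / at_risk


# Toy numbers, NOT taken from the RCI: 1,000 arrivals per year for 1888-1907.
toy_immigrants = {y: 1_000 for y in range(1888, 1908)}
toy_emigrants = {1908: 300}

if __name__ == "__main__":
    # 300 emigrants against 20,000 prior arrivals -> 15 per thousand.
    print(emigration_rate(toy_emigrants, toy_immigrants, 1908))  # 15.0
```

Because the denominator uses only prior arrivals, the sketch inherits the limitation noted in the text: earlier departures are not subtracted from the at-risk population, so the computed rate understates the true rate.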
Large fluctuations are evident in the emigration rate, especially between 1914 and 1920. Out-migration rates fell to their lowest levels during World War I.[19] (Total emigration was 303,338 in 1914 but fell to 66,277 in 1917.) A combination of increased employment opportunities in the United States, the deteriorating situation in Europe, and disruptions to passenger travel led to the large decrease in emigration rates. Immediately after the War ended, a huge spike in emigration occurred as migrants who waited in the United States during the War returned home.[20] Another reason for the large increase in the 1920 rate is that immigration was low during the War years, so the denominator of the rate (i.e., immigration from country c over the last 20 years) did not keep pace relative to earlier years (total emigration actually fell from 303,338 in 1914 to 288,315 in 1920).

[18] The emigrants who had departed in the previous twenty years can be subtracted from the at-risk population for the years 1928-1932. The actual emigration rate is approximately 30% higher than the measured rate, but there is no difference in variation between the actual emigration rate and the measured emigration rate. The estimation uses country and year fixed effects, and thus the difference between the measured and actual emigration rate has no effect on results.

[19] Foreign-born non-citizens made up 11 percent of the draftees in World War I and numbered around 300,000. Given the death rates for those in the U.S. Army (2.6%), a back-of-the-envelope calculation suggests that around 8,000 non-citizen immigrants died in World War I, a number too small to skew results given the hundreds of thousands that emigrated after the war (Ford, 2001).

[20] The timing of the 1920 recession does not coincide with the large number of migrants leaving since the fourfold jump in emigration rates occurred within the first few months of the recession. The United States recorded the number of emigrants over the fiscal year, which was from July 1st to June 30th. Thus the 1920 emigration rate refers to the number of migrants who departed from July 1st, 1919 to June 30th, 1920. Unemployment rates from the 1920 recession peaked in 1921, when the emigration rate fell.

Figure 2.1: Emigration Rate for Sample Countries, 1908-1932.
Notes: Calculated from the Reports of the General Commissioner of Immigration 1908-1932. Emigration rate is the number of emigrants divided by the number of immigrants from the past twenty years.

The emigration rate started to drop after the first quota law was passed in 1921. Total emigration fell from 288,315 in 1920 to 247,718 in 1921,[21] to 198,712 in 1922, and then to 81,450 in 1923. By 1929, emigration had fallen to 69,203. By severely limiting immigration from the countries with the highest emigration rates, the quota laws guaranteed a major decrease in return migration. Moreover, by limiting immigration from countries with high immigrant sex ratios, the quota laws also altered the sex composition of return migrants. The emigrant sex ratio (males/females x 100) fell from 396.2 in 1914 to 322.8 in 1921 and 258.1 in 1922. By 1929 it stood at 205.3. After the quota laws, many women who immigrated may have been reuniting with spouses who had decided to remain in the U.S., which could then have resulted in fewer men returning to Europe.

As the economy deteriorated during the beginning years of the Great Depression, large numbers of foreigners left the country. In 1931, for the first time since official statistics on emigration were recorded, more migrants left than entered the country. By 1932, emigrant flows were about three times the size of immigrant flows, but the emigration rate was still lower than witnessed before the end of the Age of Mass Migration.

Quotas tended to reduce out-migration rates (Figure 2.2). The vertical lines in the graph refer to the implementation of the 1921 and 1924 quotas. New migrants out-migrated at higher rates than old migrants, and rates trended together prior to the implementation of the quotas, validating the use of a difference-in-difference strategy to estimate the causal impact of quotas on out-migration. Following the passage of the 1924 quota laws, emigration rates for old migrants trended upwards, but rates for new migrants fell, resulting in old migrants having higher out-migration rates by the beginning of the 1930s. The public perception that new migrants were more likely to stay temporarily than old migrants was no longer true by the end of the 1920s and beginning of the 1930s.
[21] The first quota act went into effect on May 19, 1921.

Figure 2.2: Logged Emigration Rates, Old versus New Migrants
Notes: Data is from the Reports of the General Commissioner of Immigration 1908-1932. The vertical lines are drawn for the 1921 and 1924 quota laws.

2.4.3 Measuring Degree of Quota Restriction

To estimate the effect of the quotas on emigration rates, we employ a continuous measure that takes advantage of the multiple changes in quota numbers and allocations across countries. The quota laws were first introduced in the fiscal year of 1921-22, restricting immigration to 3 percent of the total population from each country present in the U.S. at the time of the 1910 census. The quotas were specifically targeted to substantially reduce migrants from Eastern and Southern Europe while leaving migration from Northern and Western Europe relatively unaffected. Congress further restricted immigration in 1924 by basing the quota for each country on 2 percent of the number of foreign-born based on the 1890 Census. This change in rules caused quota numbers to drop even further for new migrants; for example, Italy's quota fell from 42,057 to 3,845 and Russia's quota from 24,405 to 2,248, whereas Germany's fell from 67,707 to 51,227 (RCI, 1925).[22] Our measure of quotas takes into account the degree of restrictiveness the quota changes caused. The National Origins Act of 1929 subtly altered the quota system, placing a cap of 150,000 immigrants but allocating them across countries based on the national origins of the United States population. For example, the United Kingdom's quota went from 34,007 under the 1924 Act to 65,721, while Germany's went from 51,227 to 25,957 (RCI, 1930).

[22] Germany's quota was changed from the 68,059 reported in Table 2.1 to 67,707 in 1923 after Congress corrected the initial number while reissuing the quota laws.

Our estimate of the restrictiveness of the quotas uses past immigration numbers as the total potential immigration in the absence of restrictions. Specifically, for each country, we take the difference between the average number of migrants who arrived in the United States from 1908 to 1914 and the quota limit.[23] This difference provides an estimate of the number of migrants restricted from entering the United States due to the quota. We then divide this difference by the average number of migrants from that country to get a fractional value for the quota restriction.

\[
\text{QuotaRestriction}_{jt} = \frac{\text{AvgMigrants}_{j,1908\text{-}1914} - \text{Quota}_{j,t}}{\text{AvgMigrants}_{j,1908\text{-}1914}} \tag{2.2}
\]

QuotaRestriction_{jt} changes both across countries/ethnicities and through time due to changes in the policy in 1921, 1924, and 1929. A 0.10 point increase in QuotaRestriction_{jt} is interpreted as restricting the flows from a country by 10 percentage points. For years previous to the Emergency Quota Act of 1921, the value of QuotaRestriction_{jt} is zero. These numbers can be calculated from Table 2.1, and we display the variation in QuotaRestriction_{jt} in Figure 2.3, distinguished by old and new migrants. Quota restriction increases in 1921 and 1924, with a smaller increase during 1929. The quotas produced sharp within-country changes in 1921, 1924, and 1929, which we use to more precisely estimate a causal effect of quotas on out-migration, and to promote ease of interpretation since our measure is the fractional restriction of total potential immigrant flows.

2.4.4 Emigrant Skill and Length of Stay

Prior to the imposition of the quotas, the government was more concerned with the ethnicity of immigrants than with their country of origin; thus, further details about occupation, duration of stay, gender, and age are recorded within the RCI by ethnicity.[24] Table 2.3 presents the skill composition and duration of stay for emigrants as determined by ethnicity. Migrants are placed into three different skill groups: unskilled, semi-skilled, and professional (see Appendix for groupings).
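To make the quota restriction measure of Equation (2.2) concrete, the short Python sketch below applies the formula to a few rows of Table 2.1. The function and variable names are hypothetical. The sketch evaluates only the raw formula; how values are treated when a quota exceeds the 1908-1914 average flow (as for Germany in 1921) is not stated in this section, so no adjustment is applied here.

```python
def quota_restriction(avg_migrants_1908_1914, quota):
    """Fractional quota restriction, following Equation (2.2).

    Returns (average pre-war flow - quota) / average pre-war flow.
    A negative value means the quota exceeded the historical flow.
    """
    if avg_migrants_1908_1914 <= 0:
        raise ValueError("average migrant flow must be positive")
    return (avg_migrants_1908_1914 - quota) / avg_migrants_1908_1914


# Selected rows of Table 2.1: mean immigrant flow 1908-1914 and quotas by law.
table_2_1 = {
    "Italy":   {"avg": 202_222, 1921: 42_057, 1924: 3_845,  1929: 5_802},
    "Spain":   {"avg": 5_021,   1921: 912,    1924: 131,    1929: 252},
    "Germany": {"avg": 31_292,  1921: 68_059, 1924: 51_227, 1929: 25_957},
}

if __name__ == "__main__":
    for country, row in table_2_1.items():
        for law_year in (1921, 1924, 1929):
            value = quota_restriction(row["avg"], row[law_year])
            print(f"{country} {law_year}: {value:.2f}")
```

For Italy under the 1924 law the formula gives about 0.98, which matches the maximum quota restriction reported with Table 2.4.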
25 The majority of those who left the United States claimed laborer as their occupation, which fits the 23 We use the average number of migrants from 1908-1914 rather than more recent years due to World War I and subsequent fallout that could affect the average number of migrants. 24 RCI uses the term races or peoples; we use the term ethnicity 25 Farmers and Farm Laborers are placed in the unskilled group. We do this because many times migrants misreported their occupation as farmer when they were a farm laborer; special instructions were given on manifests

Figure 2.3: Degree of Quota Restriction 22 Notes: Annual quota amounts taken from the Reports of the General Commissioner of Immigration 1922-1932. Calculation is described in text. narrative of temporary workers filling needs in manufacturing and agriculture. This high proportion of unskilled workers is especially seen among Southern European countries, where over 70 percent of those who left were laborers. Northern European countries had a sizable portion of emigrants with professional and semi-skilled occupational classifications, with the English having almost 60 percent who were either semi-skilled or professional, as opposed to Italians, who had only 8 percent in these occupational categories. The duration of stay also is listed in Table 2.3. Most immigrants who returned stayed only five years or less. Another much smaller group stayed between five and ten years, with only a few of those staying past ten years before returning home. Unlike return rates, no correlation appears to exist between the share staying less than five years and whether migrants were old or new ; for example, 64 percent of Italian out-migrants left before five years, whereas 72 percent of German and English migrants did so. Changes in the duration of stay of temporary migrants over time are displayed in Figure 2.4. During World War I and immediately following, the number of immigrants who had been in the to properly distinguish migrants between these two categories, but mistakes still were often made. Since it is unclear whether an individual is a farmer or farm laborer, we conservatively place them in the unskilled category.

Table 2.3: Skill Composition and Length of Stay of Return Migrants, by Ethnicity: 1908-1932 Duration in the United States (Years) Ethnicity Unskilled Semi-Skilled Professional <5 5 to 10 10 to 15 15 to 20 <20 Dutch and Flemish 59.2 21.1 19.7 64.8 24.1 7.9 0.3 97.0 English 40.9 27.9 31.2 71.1 19.3 6.4 0.5 97.2 French 52.2 16.2 31.6 69.5 19.9 6.9 0.5 96.8 German 55.0 18.8 26.2 72.6 19.4 5.1 0.4 97.5 Irish 69.8 15.9 14.3 57.3 27.2 10.2 0.9 95.6 Italian 91.5 5.7 2.9 63.9 27.8 6.1 0.1 97.9 Portuguese 81.4 15.3 3.4 57.2 27.0 11.7 0.1 96.1 Scandinavian 63.6 25.0 11.4 62.2 26.8 7.7 0.5 97.1 Spanish 69.2 14.4 16.4 74.3 18.6 5.4 0.1 98.5 Notes: Data is from Reports of the General Commissioner of Immigration 1908-1932. 23 United States from five to ten years increased, whereas the number who stayed less than five years dropped. Many migrants who arrived in the United States prior to the outbreak of World War I, and who presumably intended to return home at some time, remained in the U.S. during the War and returned afterwards. Figure 2.4: Duration of U.S. Residence Notes: Data is from the Reports of the General Commissioner of Immigration 1908-1932. Table 2.4 reports the means and standard deviations for variables used in the econometric analysis. We split the data into country data, used for regressions of emigration rates, and ethnicity

data, used for skill and length of stay regressions. We combine data from a variety of sources to create a balanced panel of 12 countries and 9 ethnicities across 25 years. World War I deaths are the percentage of home country population that died during the War, taken from various governmental sources. (See Appendix for sources.) Quota restriction measures the fractional reduction in migrant flows caused by the quotas. The maximum value for quota restriction is 0.98, a 98-percent drop in immigration flow, specifically for Italian migrants.

Table 2.4: Means and Standard Deviations

| Variable | Country Data: Mean | Country Data: Std. Dev. | Ethnicity Data: Mean | Ethnicity Data: Std. Dev. |
|---|---|---|---|---|
| Dependent Variables: | | | | |
| Emigration Rate | 12.39 | 12.16 | | |
| Log Emigration Rate | 2.10 | 1.01 | | |
| Less than 5 Years | | | 0.67 | 0.16 |
| 5-10 Years | | | 0.22 | 0.15 |
| More than 10 Years | | | 0.10 | 0.06 |
| Log Unskilled | | | 7.42 | 1.29 |
| Log Semi-skilled | | | 6.00 | 1.07 |
| Log Professional | | | 5.79 | 1.09 |
| Log Farmers | | | 4.62 | 1.00 |
| Log Farm Laborers | | | 3.02 | 1.59 |
| Explanatory Variables: | | | | |
| Quota Restriction | 0.23 | 0.35 | 0.23 | 0.35 |
| World War I Deaths ( | | | | |
| Natural Increase (t-20) | 9.94 | 4.06 | 9.20 | 4.13 |
| GDP per Capita (t-1) | 3.23 | 1.11 | 3.09 | 1.13 |
| GDP Growth (t-2 to t) | 2.74 | 8.58 | 2.49 | 8.54 |
| Foreign Sex Ratio (t-1) | 94.81 | 5.24 | 94.04 | 4.19 |
| Share in Agriculture | 0.37 | 0.14 | 0.39 | 0.15 |
| Share in Industry | 0.31 | 0.10 | 0.31 | 0.10 |
| Log Immigrant Occ. Score (t-5) | 2.94 | 2.88 | 2.96 | 0.17 |
| Immigrant Network (t-5) | 0.86 | 0.09 | 0.85 | 0.10 |
| Log Immigrant Savings (t-5) | 4.04 | 0.75 | 4.02 | 0.78 |
| Immigrant Literacy Rate (t-5) | 0.92 | 0.16 | 0.90 | 0.18 |
| Observations | 300 | 300 | 225 | 225 |

Notes: Emigration rates are from the Reports of General Commissioner of Immigration 1908-1932. Natural increase (crude birth rate minus death rate), share in agriculture, and foreign sex ratio are taken from Mitchell (1998). GDP figures are from Maddison (2008). World War I death rates are taken from various governmental sources, found in the Appendix.

In some specifications, we control for the composition of the at-risk migrant group by including measures of immigrants entering five years prior.
These measures include the average immigrant

occupational score, which measures the skill changes of immigrants, the average amount of money brought by immigrants, the fraction of immigrants joining a friend or relative (which we refer to as network), and the literacy rate of immigrants. [26] The data show that migrants were very likely to join a network in the United States, as 85 percent said they were coming to join a friend or a relative, and that the average migrant brought 60 dollars. Moreover, about 90 to 92 percent of migrants were literate, a percentage that increased after the literacy test was implemented in 1917. Besides including controls for the composition of prior migrants, we include other explanatory variables such as GDP data taken from Maddison (2008), the share of labor force in agriculture, the share of labor force in industry, natural increase lagged 20 years (Easterlin, 1961), and sex ratios in the home country (Mitchell, 1998). [27]

[26] This is done by matching occupations in the RCI to the variable occscore in IPUMS.
[27] We use the sex ratio for people in prime age rather than entire population.

2.5 Econometric Procedures

2.5.1 Estimating Equation

The panel structure of the data allows the estimation of the effects of quota laws on out-migration rates. Since the quota laws changed three times during the 1920s, the variation in quota number by country is used to estimate the effects of quota restrictions on emigration rates, emigration of specific skill groups, and duration of stay. This leads to the following estimating equation:

$$Y_{jt} = \beta_0 + \beta_1 \, \mathrm{QuotaRestriction}_{jt} + X_{jt}\pi + \gamma_j + \delta_t + \varepsilon_{jt} \tag{2.3}$$

where $j$ denotes either country or ethnicity and $t$ indicates the year. [28]

[28] Country data are used for emigration rates while ethnicity data are used for out-migration rates by skill group and duration of stay.

Moreover, $Y_{jt}$ represents the outcomes of interest, which are emigration rates, number of out-migrants by skill group, and fraction of emigrants by length of stay. The variable of interest $\mathrm{QuotaRestriction}_{jt}$ is as described above in Equation (2.2). In effect, $\mathrm{QuotaRestriction}_{jt}$ implements a difference-in-difference

strategy where pre-treatment years are 1908 to 1921, treatment years are 1922 to 1932, and treatment dosages vary with the restrictiveness of the quotas. Cultural or taste-based reasons for out-migration that are time-invariant (e.g., language) are taken into account by country/ethnicity fixed effects ($\gamma_j$). This is an important point because we are identifying the effect of quotas on out-migration by using three within-country/ethnicity quota changes. To capture trends in the economy across years, especially during World War I and entering the Great Depression, we include year fixed effects ($\delta_t$). For example, emigration rates increased during the Great Depression after quota restrictions increased in 1929, but it would be improper to attribute increased emigration to quotas. For $\beta_1$ to be properly identified, the model assumes there is no omitted variable in $\varepsilon_{jt}$ that is correlated with the severe changes in the quota laws in 1921, 1924, and 1929 and also with the dependent variable.

The $X_{jt}$ vector contains various control variables listed in Table 2.4. These variables are of interest in their own right because research on the association between economic variables and out-migration rates is limited. The World War I variable is calculated based on the number of deaths among a country's population; it enters the regression for the years 1919 to 1932 and is set to zero prior to 1919. [29] Moreover, $X_{jt}$ has certain variables that distinguish Germany's out-migration experience from other countries that did not suffer the same discrimination after the end of World War I (Moser, 2012).

[29] Alternate specifications could include lagged effects of World War I for years following 1919, but qualitative results are unchanged.

2.5.2 Measurement Error

The dependent variables for the out-migration rate regressions are based on the number of out-migrants, who were potentially undercounted (Bandiera et al., 2013). This undercount creates a systematic downward bias in the dependent variable. Suppose measurement error is modeled as

$$Y_{jt} = Y^{*}_{jt} + \vartheta_{jt} = Y^{*}_{jt} + (u_j + v_t + w_{jt}) \tag{2.4}$$

where $Y_{jt}$ is the measured dependent variable in Equation (2.3), $Y^{*}_{jt}$ is the true out-migration rate, and $\vartheta_{jt}$ is non-classical measurement error where $E[\vartheta_{jt}] < 0$. Now $\vartheta_{jt}$ can be decomposed into country-specific ($u_j$) and time-specific components ($v_t$) and a component that varies both across time and country ($w_{jt}$).

The measurement error of migrants resulted from careless collection of ship records and failure to enumerate cabin class passengers (Bandiera et al., 2013). Careless collection of ship records appears to have changed over time, especially after World War I, and thus is mostly in the time-specific error component ($v_t$). The main regression uses year fixed effects, which control for yearly measurement error. Otherwise, it is unclear if careless collection would be random or nonrandom across countries. The 1912 Report suggests that cabin class passengers were undercounted because these passengers were eager to avoid the 4 dollar head tax for arriving in the U.S. (RCI, 1912). This type of undercount would not be a concern for out-migrants because no head tax was imposed on emigrants.

However, our estimated effect of quotas on emigration rates is biased if the time- and country-specific measurement error ($w_{jt}$) is correlated with $\mathrm{QuotaRestriction}_{jt}$. Since most measurement error resulted from careless collection of ship records, it is plausible that as the number of migrants decreased, officials were less careless since they had fewer records to keep. If officials were better able to count migrants after quotas, the measured out-migration rate would artificially increase when the quota restriction variable increases. This leads to a positive bias on the estimate of the effect of $\mathrm{QuotaRestriction}_{jt}$ on out-migration rates. [30] A positive bias works against the findings of the paper, which suggest that quotas lowered the out-migration rates.

[30] Non-classical measurement error in the dependent variable can be thought of as an omitted variables bias. In other words, $\mathrm{Cov}(\mathrm{QuotaRestriction}_{jt}, w_{jt}) > 0$ (since $w_{jt}$ is closer to zero as the quota restriction goes up) and $\mathrm{Cov}(Y_{jt}, w_{jt}) > 0$ (since the measured out-migration rate $Y_{jt}$ is higher if $w_{jt}$ is closer to zero), biasing $\beta_1$ positively.
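The direction of this bias can be illustrated with a small synthetic panel. The sketch below is not the thesis data or code: the three units, their quota dosages, the true coefficient of -0.8, and the error components are all hypothetical, and the quota is simplified to a single change in 1922. It estimates the slope in Equation (2.3) with unit and year fixed effects by double-demeaning a balanced panel, first on the true rate and then on a measured rate whose undercount $w_{jt}$ shrinks toward zero as quota restriction rises, as in Equation (2.4).

```python
# Synthetic illustration only: every number below is hypothetical, not thesis data.

def double_demean(panel):
    """Subtract unit means and period means, adding back the grand mean."""
    n_units, n_periods = len(panel), len(panel[0])
    grand = sum(map(sum, panel)) / (n_units * n_periods)
    unit_means = [sum(row) / n_periods for row in panel]
    period_means = [sum(row[t] for row in panel) / n_units
                    for t in range(n_periods)]
    return [[panel[j][t] - unit_means[j] - period_means[t] + grand
             for t in range(n_periods)] for j in range(n_units)]


def twfe_slope(y, x):
    """Two-way fixed-effects slope for one regressor on a balanced panel."""
    y_dd, x_dd = double_demean(y), double_demean(x)
    num = sum(yv * xv for y_row, x_row in zip(y_dd, x_dd)
              for yv, xv in zip(y_row, x_row))
    den = sum(xv * xv for x_row in x_dd for xv in x_row)
    return num / den


YEARS = list(range(1908, 1933))      # 25 years, as in the panel described above
DOSAGE = [0.1, 0.5, 0.9]             # hypothetical quota restriction per unit
BETA_TRUE = -0.8                     # hypothetical true effect of restriction
UNIT_EFFECT = [1.0, 2.0, 3.0]        # gamma_j
UNIT_ERROR = [-0.2, -0.1, -0.3]      # u_j, unit-specific undercount

# Restriction is zero before 1922 and equals the unit's dosage afterwards.
quota = [[d if year >= 1922 else 0.0 for year in YEARS] for d in DOSAGE]

# True rate: beta * Q + gamma_j + delta_t, with delta_t a simple linear trend.
true_rate = [[BETA_TRUE * quota[j][t] + UNIT_EFFECT[j] + 0.05 * t
              for t in range(len(YEARS))] for j in range(len(DOSAGE))]

# Measured rate adds u_j, v_t = -0.01 * t, and w_jt = -0.3 * (1 - Q_jt),
# an undercount that moves toward zero as restriction rises.
measured_rate = [[true_rate[j][t] + UNIT_ERROR[j] - 0.01 * t
                  - 0.3 * (1.0 - quota[j][t])
                  for t in range(len(YEARS))] for j in range(len(DOSAGE))]

beta_clean = twfe_slope(true_rate, quota)
beta_measured = twfe_slope(measured_rate, quota)
print(f"true-rate slope: {beta_clean:.3f}, measured-rate slope: {beta_measured:.3f}")
```

In this construction the fixed effects absorb $u_j$ and $v_t$ entirely, and only the part of $w_{jt}$ that moves with the quota variable shifts the estimate, toward zero from a negative true effect.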

2.6 Empirical Results

2.6.1 Emigration Rates

For the country dataset, coefficients and standard errors of the effect of quotas on emigration rate are reported in Table 2.5. The first three columns show estimates of the effects based on the full data set, whereas the last two columns restrict the sample to only years following World War I (1919 to 1932). As predicted by a strong negative supply shock to the labor market, higher quota restrictions reduced out-migration rates. The first column of Table 2.5 does not include the full set of year fixed effects, but includes only time fixed effects for years prior to WW I (1908-1914), during WW I (1915-1918), and following WW I (1919-1932). [31] Excluding year fixed effects allows the estimation of the association between U.S. GDP and emigration rates. Our results confirm the prior observation that out-migration is countercyclical (Jerome, 1926); both increases in U.S. GDP per capita and U.S. GDP growth were correlated with falling emigration rates. [32]

Again referring to Equation (2.3), the full set of year fixed effects is added to the regression reported in the second column of Table 2.5. The more flexible specification increases the magnitude of the coefficient on quota restriction, where restricting immigrant flows by 60 percent leads to about a 66 percent (50 log points) fall in the out-migration rate. The effect of quotas on out-migration rates was not driven by population changes in World War I, but out-migrants were less likely to travel back to countries with higher death rates, likely proxying the extent of damage from the War. [33] Natural increase, a variable that is strongly positively correlated with migration out of the country (Hanson and McIntosh, 2010; Hatton and Williamson, 1998), had the exact opposite effect for return migration in that it decreased emigration rates.

[31] Results are qualitatively the same when excluding years of the Great Depression (fiscal years 1930-1932).
[32] Comprehensive wage data across immigrant groups do not exist during this time period to verify effects of a labor supply shock on wages, but some studies find that migration to a local labor market drives out previous members without an effect on wages (Card, 2001; Boustan et al., 2010). The Dillingham Report of 1911 provides wage data across immigrant groups, but not before and after the quotas.
[33] Regression specifications where World War I deaths have lagged effects on emigration for years following 1919 yield the same result of reduced emigration.
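A short arithmetic sketch of how a Quota Restriction coefficient maps into log points and percent changes for a 60 percent restriction. The coefficients are copied from Table 2.5 (Columns II, III, and V). The conversion formula is an assumption: the percentages quoted in this section (about 66, 54.9, and 21.8) appear to correspond to exp(|change|) - 1, while the proportional decline implied by a negative log change is 1 - exp(change), which is smaller. Both are computed so the two conventions can be compared.

```python
import math

# Quota Restriction coefficients copied from Table 2.5.
COEFFICIENTS = {"II": -0.840, "III": -0.729, "V": -0.329}
RESTRICTION = 0.60  # a 60 percent restriction of immigrant flows


def summarize(beta, dose):
    """Return the implied log change and two percent conversions."""
    log_change = beta * dose
    return {
        "log_points": round(100 * log_change, 1),
        # Convention that appears to match the percentages quoted in the text.
        "exp_abs_minus_1": round(100 * (math.exp(abs(log_change)) - 1), 1),
        # Proportional decline implied by a negative change in logs.
        "one_minus_exp": round(100 * (1 - math.exp(log_change)), 1),
    }


results = {col: summarize(beta, RESTRICTION) for col, beta in COEFFICIENTS.items()}
for col, row in results.items():
    print(col, row)
```

The gap between the two conversions grows with the size of the log change, so it matters most for the larger coefficients in Columns II and III.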

Table 2.5: Log Emigration Rates for Country Regressions: Coefficients and Standard Errors

| Variable \ Years | I: 1908-1932 | II: 1908-1932 | III: 1908-1932 | IV: 1919-1932 | V: 1919-1932 |
|---|---|---|---|---|---|
| Quota Restriction | -0.823*** | -0.840*** | -0.729** | -0.700* | -0.329* |
| | (0.207) | (0.226) | (0.311) | (0.412) | (0.196) |
| WW I Deaths (% of pop.) | -0.0569 | -0.149** | -0.201*** | | |
| | (0.0797) | (0.0672) | (0.0682) | | |
| Natural Increase (t-20) | -0.126*** | -0.118*** | -0.0831*** | -0.116*** | -0.0889*** |
| | (0.0383) | (0.0363) | (0.0319) | (0.0399) | (0.0328) |
| Foreign GDP per Capita (t-1) | 0.413** | 0.529*** | 0.420*** | 0.135 | 0.153 |
| | (0.163) | (0.150) | (0.153) | (0.174) | (0.136) |
| Foreign GDP Growth (t-2 to t) | 0.0124** | 0.00783 | 0.00800 | 0.00921* | 0.00528 |
| | (0.00542) | (0.00593) | (0.00570) | (0.00544) | (0.00344) |
| Foreign Share in Agriculture | 2.007 | 2.399 | 0.708 | 5.962*** | 7.477*** |
| | (2.226) | (1.896) | (1.648) | (2.269) | (1.820) |
| Foreign Share in Industry | -2.508 | -3.820** | -3.928*** | -4.747*** | -4.403*** |
| | (2.067) | (1.588) | (1.409) | (1.543) | (1.510) |
| Foreign Sex Ratio (t-1) | 0.0163* | 0.00927 | 0.0106 | -0.00445 | 0.000788 |
| | (0.00964) | (0.00961) | (0.00944) | (0.0137) | (0.0103) |
| US GDP per capita (t-1) | -0.232* | | | | |
| | (0.119) | | | | |
| US GDP growth (t-2 to t) | -0.0145*** | | | | |
| | (0.00474) | | | | |
| Immigrant Network (t-5) | | | -3.319*** | -2.891** | -2.143** |
| | | | (0.805) | (1.154) | (0.889) |
| Immigrant Occ. Score (t-5) | | | 0.00165 | -0.898 | -0.852 |
| | | | (0.467) | (0.634) | (0.590) |
| Immigrant Log Savings (t-5) | | | -0.308*** | -0.301*** | -0.277*** |
| | | | (0.0597) | (0.0558) | (0.0563) |
| Immigrant Lit. Rate (t-5) | | | 0.248 | 0.504 | 0.395 |
| | | | (0.565) | (0.579) | (0.536) |
| WW I x Germany | | | | | -3.290*** |
| | | | | | (0.504) |
| Post WWI x Germany | | | | | 1.277*** |
| | | | | | (0.447) |
| Country FE | X | X | X | X | X |
| Time FE | WWI/Post WWI | Yearly | Yearly | Yearly | Yearly |
| Observations | 300 | 300 | 300 | 168 | 168 |
| R-squared | 0.629 | 0.692 | 0.721 | 0.741 | 0.888 |

Notes: The log emigration rate is the dependent variable. GDP numbers are from Maddison (2008), sex ratio and natural increase from Mitchell (1998). Robust standard errors in parentheses. * p<0.10, ** p<0.05, *** p<0.01

Potential migrants also considered other aspects of the home country in making their decision to return.
Higher GDP in the source country was correlated with increased return rates, whereas a higher industry share reduced such rates. Higher male/female sex ratios in the home country

did not significantly increase return migration, so an ethnic-marriage market effect does not seem to have been at work. However, the lack of correlation between marriage markets and emigration rates is consistent with the composition of returnees: a larger fraction of returning males were married (56 percent), compared with 50 percent of the migrants in the United States. [34]

[34] Authors' calculation based on RCI data and 1910-1930 IPUMS, where the 50 percent of foreign-born married reflects those living in the United States for less than 20 years, to reflect the out-migrant population.

After the implementation of the quotas, out-migration could have changed because the composition of immigrants was more skilled or literate, and more highly skilled individuals were less likely to return. We attempt to separate the direct effect of the quotas on out-migration from the effects of changing composition by including controls for prior migrants, including prior immigrants' skills, savings, networks and literacy. The coefficient on quota restriction decreases slightly, indicating a limited influence of changing skill composition on emigration rates; a quota restriction of 60 percent is correlated with a 54.9 percent fall in out-migration rates. A 10 percent increase in savings was correlated with 3.1 percent fewer returns home, which could be due to those planning to return home bringing less savings in the first place, or to savings giving a migrant more cushion to remain in the United States if a negative income shock occurs. Moreover, if immigrants were joining friends or family, they were less likely to return, as a 1 percentage point increase in network connections was correlated with a 3.3 percent decrease in return rates. Interestingly, prior migrants' occupational scores or literacy rates were not correlated with returning.

The first three columns in Table 2.5 estimate the effects of quotas on emigration rates including the full set of years from 1908-1932, essentially using years prior to World War I as a counterfactual for years following World War I. This estimation could be problematic if there was a structural change in emigration following World War I that year fixed effects or our control for World War I death rates do not capture (Biavaschi, 2013). We restrict our sample to only years following World War I in Columns 4 and 5 to test if this alters the estimated effects of quotas. Dropping years before World War I (Column 3 to Column 4) yields a slight drop in the estimated coefficient on quota restriction, suggesting that years prior to World War I are an appropriate, but

not perfect, counterfactual for years following World War I. Now the effects of World War I are no longer identified because there is no variation in World War I deaths following World War I. Since the coefficient on quota restriction is very similar when keeping years prior to World War I and the World War I coefficient is identified, we prefer the results in Column 3.

Finally, it is possible that including Germany in the sample could be problematic if there was rampant discrimination against Germans during the 1920s (Moser, 2012). We separate Germany from the analysis by including variables where World War I and post-World War I are interacted with Germany. During the War, Germans were significantly less likely to return home, likely due to difficulties of travel. Even as the economic situation was deteriorating in Germany after the War, out-migration of Germans significantly increased. After separating Germany from the analysis, the quotas still impacted emigrants, with a quota restriction of 60 percent yielding a 21.8 percent (19.7 log points) drop in out-migration.

2.6.2 Emigrant Skill

Next, we study the return migration of each skill group (Unskilled, Semi-Skilled, and Professional). However, the regressions of Tables 2.5 and 2.6 differ in two major ways. First, Table 2.6 employs ethnicity instead of country data, which, because the reports aggregate countries into ethnicities, reduces our sample from 12 countries to 9 ethnicities. Second, a count of the at-risk population within each skill group is unavailable, so the log of total out-migrants in each skill group is now the dependent variable and the total migrant population from the past 20 years is a control variable. [35] Changing the specification so that the total migrant population enters as a control variable rather than in the denominator, as in Table 2.5, leads to a slightly higher estimated coefficient (from 0 to 10 percent higher) for quota restriction. [36]

[35] The two changes in sample and regression specification do not alter the conclusion from Table 2.5 that quotas depressed emigration rates. Specifically, we both collapse 12 countries into 9 ethnicities and alter the regression specification to match Table 2.6 to test the effect of quotas on log emigrants. This method yields the same qualitative results as in Table 2.5 with the coefficient on quota restriction slightly higher; however, the estimated coefficient on quota restriction when only including years 1919-1932 is statistically insignificant, likely due to a smaller number of observations after collapsing.
[36] Specifically, we rerun Table 2.5 with log emigrants as a dependent variable and the total number of immigrants as an independent variable. This specification is more flexible by allowing the coefficient on log immigrants to deviate

from one, unlike the regression specification we report in Table 2.5. We do this to match regression specifications for later ethnicity data. The coefficient on quota restriction ranges from 0-10 percent higher depending on variables included in the model. Results are available from authors upon request.

Table 2.6: Log Emigrants for Ethnicity Regressions, by Skill Groups: Coefficients and Standard Errors

| Variable | I: Unskilled | II: Semi-Skilled | III: Professional | IV: Farm Laborers | V: Farmers |
|---|---|---|---|---|---|
| Quota Restriction | -0.642** | -0.225 | -0.474*** | -0.889** | -0.699** |
| | (0.255) | (0.256) | (0.177) | (0.377) | (0.292) |
| World War I Deaths (%) | -0.321*** | -0.128* | -0.221*** | -0.393*** | 0.0291 |
| | (0.0659) | (0.0694) | (0.0476) | (0.129) | (0.0797) |
| Natural Increase (t-20) | -0.0848*** | -0.118*** | -0.0511** | 0.0625 | 0.00532 |
| | (0.0317) | (0.0352) | (0.0254) | (0.0657) | (0.0509) |
| Foreign GDP per capita (t-1) | 0.425*** | 0.743*** | 0.420*** | 0.752*** | 0.444** |
| | (0.157) | (0.217) | (0.149) | (0.277) | (0.199) |
| Foreign GDP growth (t-2 to t) | 0.00640 | 0.00853 | 0.00785** | 0.00957 | 0.00139 |
| | (0.00520) | (0.00614) | (0.00385) | (0.00924) | (0.00662) |
| Foreign Sex Ratio (t-1) | 0.0413 | 0.0266 | 0.0453** | 0.0935** | 0.0365 |
| | (0.0254) | (0.0266) | (0.0176) | (0.0440) | (0.0290) |
| Foreign Share in Agriculture | 4.100 | 3.462 | 0.370 | -7.223 | -3.127 |
| | (3.081) | (2.695) | (2.120) | (6.222) | (3.730) |
| Foreign Share in Industry | -7.204*** | -5.901*** | -8.200*** | -18.95*** | -4.264 |
| | (2.095) | (2.077) | (1.644) | (3.625) | (2.981) |
| Immigrant Network (t-5) | -2.013** | -2.881*** | -1.418* | -2.234* | -1.446 |
| | (0.905) | (0.845) | (0.755) | (1.333) | (1.033) |
| Immigrant Occ. Score (t-5) | -0.105 | 0.136 | 0.165 | 1.084 | 0.742 |
| | (0.572) | (0.556) | (0.425) | (1.073) | (0.667) |
| Immigrant Log Savings (t-5) | -0.267*** | -0.257*** | -0.153*** | -0.241** | -0.133* |
| | (0.0615) | (0.0620) | (0.0437) | (0.100) | (0.0686) |
| Immigrant Lit. Rate (t-5) | 1.384** | 0.570 | 0.471 | -0.908 | 0.0567 |
| | (0.574) | (0.655) | (0.435) | (1.019) | (0.617) |
| WW I x Germany | 0.0768 | -0.951*** | 0.0448 | -2.390*** | -1.311*** |
| | (0.342) | (0.349) | (0.260) | (0.545) | (0.390) |
| Post WWI x Germany | 0.963*** | 0.998** | 0.667** | 0.864** | 0.299 |
| | (0.362) | (0.385) | (0.304) | (0.414) | (0.384) |
| Control total (Im. past 20 years) | 1.508*** | 0.225 | 0.362** | -0.107 | 0.354 |
| | (0.207) | (0.209) | (0.154) | (0.352) | (0.244) |
| Ethnicity FE | X | X | X | X | X |
| Year FE | X | X | X | X | X |
| Observations | 225 | 225 | 225 | 225 | 225 |
| R-squared | 0.886 | 0.815 | 0.901 | 0.742 | 0.753 |

Notes: Dependent variable is the log of total number in that skill group. Data is from Reports of the Commissioner General of Immigration (1908-1932). * p<0.10, ** p<0.05, *** p<0.01

The results of the effect of the quotas on skill groups are presented in Table 2.6. The effect of quota restriction is not uniform across groups. A 10 percentage-point increase in the quota

restriction results in a 6.4 percent decrease in unskilled emigrants but has no effect on semi-skilled migrants; however, this 10 percentage-point increase leads to a 4.7 percent decline in professional out-migrants. This group largely consists of merchants who possibly found it more attractive to keep businesses in the United States during the 1920s. Much of the decrease for the unskilled group was specifically for both farmers and farm laborers, whose results are separated from the unskilled group and shown in Columns 4 and 5.

Our main interpretation of these findings attributes falling emigration rates to inward supply shocks caused by the quotas; these shocks were not only differential across ethnicities but also across skill groups. We show the magnitude of the labor supply shocks for each skill category resulting from the 1921 and 1924 quotas in Table 2.7, reported as a fractional change in number of immigrants. For example, between 1921 and 1922, the number of unskilled migrants from new source countries fell by 87.6 percent whereas the number for old source countries fell by 31.3 percent. However, our empirical strategy relies on differential shocks between new and old migrants to skill groups. In Table 2.7, we also calculate this relative supply shock caused by differentially high restrictive quotas. For example, the 1921 quota led to a larger relative supply shock for farmers (new source country farmers fell by 66.9 percentage points more than old source country farmers) and semi-skilled workers (64.5 percentage points more); the 1924 quota also led to the largest relative supply shock for farm laborers (80.4 percentage point difference). However, the 1924 quota had the smallest differential shock for semi-skilled workers (31.0 percentage point difference). If we take both quotas into account, the effect on migrant labor markets should have been relatively strongest for unskilled migrants, specifically farmers and farm laborers.
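The relative shocks quoted above are the new-migrant change minus the old-migrant change within each skill group. A minimal sketch of that subtraction, using the fractional changes for old and new migrants as reported in Table 2.7; the variable and function names are illustrative only.

```python
SKILL_GROUPS = ["Unskilled", "Semi-Skilled", "Professional",
                "Farm Laborers", "Farmers"]

# Fractional change in incoming flows, copied from Table 2.7.
FLOW_CHANGES = {
    "1921 to 1922": {
        "old": [-0.313, -0.218, -0.137, -0.352, -0.247],
        "new": [-0.876, -0.862, -0.625, -0.870, -0.916],
    },
    "1924 to 1925": {
        "old": [-0.421, -0.542, -0.376, 0.112, -0.115],
        "new": [-0.890, -0.852, -0.788, -0.692, -0.608],
    },
}


def relative_shock(panel):
    """New-minus-old change in incoming flows for each skill group."""
    return {group: round(new - old, 3)
            for group, old, new in zip(SKILL_GROUPS, panel["old"], panel["new"])}


relative = {period: relative_shock(panel) for period, panel in FLOW_CHANGES.items()}

# The most negative value marks the largest relative shock in each period,
# and the least negative value marks the smallest.
largest = {period: min(shocks, key=shocks.get) for period, shocks in relative.items()}
smallest = {period: max(shocks, key=shocks.get) for period, shocks in relative.items()}

for period, shocks in relative.items():
    print(period, shocks, "largest:", largest[period], "smallest:", smallest[period])
```

Recomputing from the rounded inputs gives -0.644 for semi-skilled workers in 1921 to 1922, against the -0.645 reported in the table; the difference is presumably rounding in the published old and new figures.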
Semi-skilled and professional workers would be relatively less affected. The results for the influence of the quotas on out-migration for each skill group as reported in Table 2.6 are largely consistent with the information presented in Table 2.7, where labor markets for the unskilled (especially agricultural markets) were the most affected by quotas. However, this interpretation relies on the substitutability between incoming and resident migrants within skill groups and the presumption that competition within group was sufficiently strong that new

migrants drove out resident migrants. Whereas strong competition may have existed in industrial occupations, the situation may have been different in farm labor markets, and the results may be due to another explanation besides supply shocks. [37]

Table 2.7: Relative Supply Shocks of Immigrants, by Skill Group: 1921-22 and 1924-25
Fraction Change of Incoming Flows

| Migrant Group | Unskilled | Semi-Skilled | Professional | Farm Laborers | Farmers |
|---|---|---|---|---|---|
| Panel A: 1921 to 1922 | | | | | |
| Old Migrants | -0.313 | -0.218 | -0.137 | -0.352 | -0.247 |
| New Migrants | -0.876 | -0.862 | -0.625 | -0.870 | -0.916 |
| Relative Shock (New-Old) | -0.563 | -0.645 | -0.488 | -0.518 | -0.669 |
| Panel B: 1924 to 1925 | | | | | |
| Old Migrants | -0.421 | -0.542 | -0.376 | 0.112 | -0.115 |
| New Migrants | -0.890 | -0.852 | -0.788 | -0.692 | -0.608 |
| Relative Shock (New-Old) | -0.469 | -0.310 | -0.412 | -0.804 | -0.493 |

Notes: Table 2.7 shows the fractional change in the number of incoming migrants by skill group before and after the 1921 and 1924 quotas. The fractional change is the average of each ethnicity's change within old and new migrant groups. Authors' calculations from RCI (1921, 1922, 1924, 1925).

[37] Some evidence that farming occupations were competitive comes from the fact that the percentage of foreign-born staying less than 20 years who were farmers in 1910-1930 IPUMS (2.6 percent) was the same as the percentage of out-migrants who were farmers (2.6 percent). This suggests that farming was no more or less competitive than other migrant occupations. The intuition is that a more competitive occupation would lead to more failures and higher return rates; a less competitive occupation would lead to lower return rates.
[38] Another possibility for the farming result is that the number of farmers is so small that if a handful of farmers change their decision, the percentage change in out-migration looks large.
[39] This is for years 1910-1930 from IPUMS and 1908-1932 in RCI.

In addition to the supply shock, the effect on farmers could have been large for two other reasons. [38] First, the options other than farming likely increased in the 1920s following the quotas, leading more migrants to switch occupation upon arrival from farming to industrial work. Occupational switching was prevalent; for example, whereas 7 percent of incoming migrants claimed to be farmers in their source country, 2.6 percent of the foreign-born were farmers in the United States. [39] Second, migrants may have misreported their occupation as farmer when they were actually a farm laborer; this was so often the case on incoming manifests that special instructions were provided on manifests for recorders to differentiate between the two groups. If officials did not take the same care with out-going manifests, then a fall in the number of out-migrant farmers

could truly be a fall in out-migrant farm laborers.

As an alternative to labor supply shocks increasing the benefits of migration, another possible interpretation is that reentry into the United States was limited, leading people to stay rather than move back and forth constantly. We think this is unlikely for two reasons. First, migrants could obtain a permit to reenter the United States after leaving for a brief time, at a cost of 3 dollars. The 1925 Report notes that this provision of the law was not well known initially but soon became very popular; applicants for reentry permits reached over 100,000 in the fiscal year of 1929-1930, compared to the 50,661 who left permanently. Second, those individuals who departed for a short time period were recorded by officials as non-emigrants and thus are not included in this analysis.

One of the main goals of the quota laws was to decrease the number of unskilled migrants, but even though the inflow of unskilled workers was reduced, those unskilled migrants who did arrive then decided to stay rather than return home. This observation is important because the unskilled group constituted the majority of the immigrant population. The influence of quotas on return migration composition, if strong enough, could have altered the skills of the overall stock as unskilled migrants tended to stay. However, further research is needed to answer this question.

2.6.3 Length of Stay

The quota laws may have reduced emigration rates by causing some potential return migrants to stay slightly longer rather than staying permanently. While Biavaschi (2013) shows in a similar methodology that the quotas were correlated with an increase in share of those staying more than five years, we extend her analysis by including World War I variables. Figure 2.5 shows how temporary migrants (distinguished by old and new migrants) changed their duration of stay following the quota laws.
The distribution for length of stay across old and new migrants is similar in spite of the new migrants having higher out-migration rates. After the quota laws were imposed, the share staying less than 5 years fell sharply for new migrants with a coinciding increase for the share staying 5 to 10 years. This observation is consistent with a life-cycle model in which increasing wages cause migrants to either stay longer or permanently (Dustmann and Weiss, 2007). In order

to determine if quotas affected length of stay, we use the share of emigrants that stayed less than 5 years, the share that stayed between 5 and 10 years, and the share that stayed more than 10 years as the dependent variables in Equation (2.3). The vector $X_{jt}$, in addition to controls for demographics and World War I, includes controls to determine how prior migrant composition affects duration of stay.

Figure 2.5: Duration of Stay, Old versus New Migrants
Notes: Data is from the Reports of the General Commissioner of Immigration 1908-1932.

Table 2.8 shows that quotas pushed the distribution of length of stay in the United States to the right. The share of individuals who stayed less than 5 years declined, that of those who stayed between 5 and 10 years increased, and that of those who stayed more than 10 years did not change. The first column shows that quotas were correlated with a smaller share of migrants staying less than five years; when including the effects of World War I, the interpretation does not change. World War I was correlated with a smaller share of those staying less than five years, likely due to individuals escaping destruction in their home economy. Column 3 uses the full set of controls, including composition of prior migrants, showing that the quotas caused a fall in the share of migrants staying less than five years. Columns 4-6 and 7-9 repeat the analysis for shares of emigrants staying, respectively, between five and ten years and more than ten years. The quotas

increased the share of those staying between five and ten years, but had no effect on those staying more than ten years. Not only did the quotas lower emigration rates and encourage people to stay in the U.S., but given that temporary migrants were going to leave at some time, they stayed longer. Once again, these results are consistent with improved employment opportunities in the United States that were available due to a strong labor shock in migrant labor markets.

2.7 Summary and Conclusions

Almost any question that research can ask about immigration could be asked about return migration. Temporary migration impacts labor markets (Durand and Massey, 2006; Dustmann and Weiss, 2007) and remittances (Yang, 2008), could reverse or amplify brain drain (Dustmann and Mestres, 2010; Dustmann et al., 2011; Mayr and Peri, 2009), and could accelerate technological diffusion (Dos Santos and Postel-Vinay, 2009; Ortega and Peri, 2013). Understanding the self-selection of return migrants is fundamental to estimating migrant assimilation into the labor market (Borjas, 1985). Finally, if return flows are sizable, they could reverse the impact of immigration on the labor market (Bandiera et al., 2013). However, whereas return migration is important in the contemporary world, as it was in the early 20th century, little of substance is known about many aspects of this phenomenon.

In the early 20th century, millions came to the United States for a short time before returning to their source country, negatively labeled by some U.S. natives as "birds of passage." Studying these birds of passage from the United States is possible for the early years of the 20th century because the United States recorded characteristics of out-migrants upon departure; the United States stopped such enumeration in the middle of the twentieth century.
Rather than using attrition from panel data to study return migration, as many studies of contemporary return migration need to do, we make use of data directly observing emigrants on departure in order to provide a more complete picture of out-migration. The major focus of this study is the effects of immigration quotas on out-migration. To

Table 2.8: Duration-of-Stay Regressions for Ethnicity Groups: Coefficients and Standard Errors

                            I          II         III        IV         V          VI         VII        VIII       IX
Variables                   <5         <5         <5         5 to 10    5 to 10    5 to 10    >10        >10        >10
Quota Restriction           -0.0648*   -0.0741*   -0.130**   0.0478     0.0503*    0.125**    0.0170     0.0238     0.00503
                            (0.0378)   (0.0377)   (0.0554)   (0.0292)   (0.0293)   (0.0483)   (0.0159)   (0.0156)   (0.0211)
WWI Deaths
                                       (0.0100)   (0.00864)             (0.00792)  (0.00749)             (0.00343)  (0.00290)
Prior Immigrant Characteristics:
  Occupational Score (t-5)                        0.0274                           -0.0657                          0.0383
                                                  (0.0875)                         (0.0754)                         (0.0358)
  Network (t-5)                                   0.327*                           -0.173                           -0.155**
                                                  (0.173)                          (0.145)                          (0.0649)
  Log Savings (t-5)                               -0.0648***                       0.0327***                        0.0321***
                                                  (0.0124)                         (0.00930)                        (0.00641)
  Literacy Rate (t-5)                             0.0491                           -0.0896                          0.0405
                                                  (0.104)                          (0.0861)                         (0.0475)
Year FE                     X          X          X          X          X          X          X          X          X
Ethnicity FE                X          X          X          X          X          X          X          X          X
Demographic/GDP Variables                         X                                X                                X
Observations                225        225        225        225        225        225        225        225        225
R-squared                   0.676      0.678      0.802      0.668      0.668      0.759      0.672      0.680      0.832

Notes: The dependent variable in the regression is the share of emigrants who stayed in the US less than 5 years, 5 to 10 years, and 10 or more years. Data are from the Report of the Commissioner General of Immigration (1908-1932). Robust standard errors in parentheses. * p<0.10, ** p<0.05, *** p<0.01

examine the effects of immigration policy, we take advantage of three quota-law changes in 1921, 1924, and 1929 that provide within-country variation and allow the identification of the causal effects of the quotas on emigration. The immigration quotas caused a strong supply shock in migrant labor markets. Whereas immigrants' effects on natives are somewhat controversial since the substitutability between immigrants and natives is debatable, the effects on prior immigrants are less so because recently arrived immigrants and prior immigrants operate in the same labor markets and are highly substitutable. The impact of immigration on previous immigrants is evident in our findings. Immigration quotas caused a reduction in out-migration rates, with a 60 percent restriction of immigrant flows reducing out-migration by 22 (with Germany excluded from the analysis) to 55 percent (with Germany included). Annual data on out-migrants allow us to test what drove annual variations in out-migration rates. Our secondary findings suggest that return migration was countercyclical, with out-migration falling when the U.S. economy improved. Population pressure was also correlated with decreased return migration as migrants remained in the United States when the source country's natural increase rose. Moreover, World War I severely interrupted migrant flows as temporary migrants could not safely cross the Atlantic to return home; after the War ended, migrants departed en masse. However, fewer migrants returned home to countries whose populations had been more devastated by the war. Finally, discrimination against Germans during the 1920s was prevalent, such that after the War, Germany had higher emigration rates than other countries. Quota laws were initially implemented by Congress to limit immigration of individuals from Southern and Eastern Europe, who were perceived as being of low skill and low education.
The popular argument of the time was that temporary migrants leaked savings out of the United States and failed to assimilate into society, staying only a few years. While the quota laws certainly accomplished the task of reducing immigration rates into the country, they also encouraged prior immigrants to stay in the United States longer. Emigration rates of prior immigrants fell as they stayed and assimilated in the U.S.; if migrants did intend to leave, they remained in the United States longer. Thus, in many respects, the quotas were a successful policy that kept new migrants

out and also encouraged migrants to stay permanently. However, the quota laws mainly reduced the emigration rates of the least skilled, so the overall quality of migrants who remained in the country may have actually decreased, since unskilled migrants were more likely to stay. Like so many other governmental policies, quotas may have created unintended consequences by potentially lowering the skill mix of the migrant stock. Another more recent immigration policy that may have had unintended consequences is the Immigration Reform and Control Act of 1986 (IRCA). Three major aspects of this law were to (for the first time) impose sanctions on employers for hiring those in the country illegally, to increase border control, and to grant amnesty to some who were here illegally. Ultimately, about 2.67 million individuals received amnesty. With the passage of IRCA, Congress hoped both to discourage illegal entry and to encourage illegal residents to depart (Rolph, 1992, p. 39). Whereas before IRCA much movement occurred back and forth across the border with Mexico, some hypothesized that those in the U.S. illegally would now tend to stay rather than risk a higher probability of detection upon return to the U.S. after a temporary, often seasonal, departure (Bean, Vernez, and Keely, 1989, p. 89). Given the experience of the 1920s, this hypothesis seems reasonable if the benefits to staying illegally increased due to a smaller inflow of these migrants. However, perhaps because of a hope on the part of potential migrants that future amnesties would be enacted and because border patrol resources were not greatly increased, illegal movement across the southern border was not greatly reduced. Although the amnesties affected the ability to confidently assess the effect on staying, some evidence suggests that IRCA did not appreciably influence the probability of staying in the U.S.
Of course, two major differences between the experiences of the 1920s and more recent years are that the migrants in the earlier period were legal whereas IRCA was directed at those here illegally, and that the cost of access to the U.S. differs greatly between Europe and countries to the south of the U.S.

Chapter 3

Birds of Passage: Return Migration, Self-Selection, and Immigration Quotas

Fundamentally, migration is about the flows of people across borders, but a fact often overlooked is that flows occur both into and out of a country. Return flows can be substantial; in the early 20th century United States they were about three-fifths the size of inflows, suggesting that the average incoming migrant was temporary rather than permanent (Bandiera, Rasul and Viarengo, 2013). Return migration influences the impact of immigration on labor markets, the magnitude of brain drain from the source country, and the transfer of financial capital across countries, some of the most important questions in the literature. Of particular interest are the characteristics of those who self-select into return migration. Selective out-migration changes the composition of migrants remaining in the United States and can alter our understanding of how well migrants assimilate (Borjas, 1985). Although return migration is of particular importance to policy makers, data on out-migrants are relatively rare. The United States does not track out-migrants, leaving the question of who self-selects unanswered; this lack of data also limits our knowledge of how policy affects out-migration. Despite extensive research on immigration and the migrant stock, the importance of return migration in shaping the migrant stock remains relatively understudied. To understand return migrant self-selection, I turn to the early 20th century, a period when data on out-migrants actually exist. From 1908 to 1932, the United States aimed to record the occupation of every migrant who left the country, providing the only period in the history of the

United States with out-migrant data.[1] Other papers that attempt to estimate the self-selection of return migrants rely on indirect methods with short-term migrants as residuals (Lubotsky, 2007; Abramitzky, Boustan and Eriksson, 2014).[2] Here I directly observe out-migrants. The data also cover a time period when migration law changed from an open borders policy in the 1900s to restrictive quotas in the 1920s, allowing estimates of how policy influenced return migration. Further, this time period offers unique insight into a common question encountered when modeling return migration: whether the decision to return home was planned before arrival or occurred after arrival. Those who decide before arrival to migrate temporarily are known as target savers, people who migrate in order to reach a savings target in the host country which will yield a premium on investment back home (Dustmann and Weiss, 2007).[3] Based on this view, a Roy model of selection would predict that return migrants could be either positively or negatively self-selected, depending on the relative wage distributions in the host and home country (Borjas and Bratsberg, 1996). However, this theory is at odds with empirical studies which mostly show that return migrants are negatively self-selected (Lubotsky, 2007; Abramitzky, Boustan and Eriksson, 2014). Such negative self-selection has led to an alternative view of return migration in which those with the least skill or ability return home after arrival because they failed in the labor market, typically during an unemployment spell (Bijwaard, Schluter and Wahba, 2014). Although return migrants are likely a combination of both target savers and failures in the labor market, it is important for policy makers to know which model dominates since target savers and failures respond to economic shocks differently.
For example, a positive income shock would lead to higher return rates of target savers, but lower return rates from failures in the labor market (Bijwaard and Wahba, 2014; Yang, 2006). To uncover the relative importance of planned and unplanned return migration, I compare two datasets: one observes migrants at arrival who planned to return home; the other directly observes

[1] Out-migration statistics continue until 1957, but do not disaggregate occupations by ethnicity after 1932.
[2] For indirect methods on the self-selection of return migrants, see Van Hook et al. (2006). Other studies on return migration either rely on high-quality government data (Bijwaard and Wahba, 2014) or linking migrants across historical censuses (Abramitzky, Boustan and Eriksson, 2012).
[3] It is also possible that human capital acquired in the host country is rewarded highly in the source country.

migrants who actually returned home. For the dataset on planned return migrants, starting in 1917, all migrants who entered the country were required to state whether they intended to return to their source country.[4] I randomly sample 1% of ships that entered Ellis Island between 1917 and 1924 to gather data on migrant intentions, leading to a sample of 20,156 individuals. For data on migrants who departed, the Bureau of Immigration published annual records on the characteristics of those leaving the country, aggregating passenger manifests from ships that sailed away from the United States. By comparing the self-selection of planned return migrants on arrival to the self-selection of actual return migrants upon departure, one can uncover which types of migrants were most likely to switch their plans after arrival. I find that return migrants at departure were negatively self-selected, as emigrants' occupations were less skilled on average than the occupations of the total foreign-born.[5] This finding is mostly consistent with Abramitzky, Boustan and Eriksson's (ABE, 2014) use of a residual method to study return migration. However, while migrants who actually out-migrated were negatively self-selected on skill, migrants who planned to return home were less negatively self-selected, suggesting that shocks after arrival intensified the negative selectivity pattern by driving out low-skilled migrants. The role of unplanned return migration is clear when comparing planned return migration rates to actual return migration rates. About 13% of migrants from 1917 to 1924 planned to return home, while at least 19% actually did, suggesting that, on average, more migrants experienced negative income shocks in the United States than experienced positive ones. Further, as the difference between the actual and planned return rates grows (i.e., the rate of unexpected returns grows), the self-selection of return migrants becomes more negative.
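The gap between planned and realized return can be made concrete with a short calculation. The sketch below is only illustrative: it uses the two rounded rates quoted above (13% planned, at least 19% actual), and the function name `unexpected_return` is a hypothetical helper, not part of the original analysis.

```python
# Rounded rates quoted in the text for arrivals from 1917 to 1924.
PLANNED_RATE = 0.13  # share who stated a plan to return at arrival
ACTUAL_RATE = 0.19   # lower bound on the share who actually returned


def unexpected_return(planned, actual):
    """Return the absolute and relative excess of actual over planned returns."""
    gap = actual - planned      # unexpected returns, as a share of arrivals
    relative = gap / planned    # excess relative to the planned rate
    return gap, relative


gap, relative = unexpected_return(PLANNED_RATE, ACTUAL_RATE)
print(f"unexpected returns: {gap:.2f} of arrivals")
print(f"actual exceeds planned by about {relative:.0%}")
```

With these rounded inputs the relative excess is roughly 46%, in line with the statement in the abstract that at least 45% more migrants returned home than had planned to.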
Migrants from new source countries (i.e., Eastern and Southern Europe) had higher unexpected return rates compared with old source country migrants (i.e., Northern and Western Europe); accordingly, new source country migrants were mostly negatively self-selected at departure, while old source country migrants were slightly positively self-selected.

[4] There was no benefit or penalty for declaring an intention to return home.
[5] A limit to this study is that I can only determine how temporary migrants and permanent migrants differ across occupations rather than within occupation. It is possible that return migrants were lower paid within an occupation, but it is impossible to determine whether or not this was true.

When migration quotas went into effect in the 1920s, they created a large labor supply shock as the migrant flow dropped by approximately 60% from 1921 to 1925 (Carter et al., 2006). Scarcity of migrant labor should improve employment rates or wages, which would reduce the failure rate. Indeed, return migration rates dropped during the 1920s, partially because there were fewer migrants who failed in the labor market. At the same time, return migrants were positively self-selected by 1930, which is also attributable to fewer failures following the migration quotas. The relative importance of failing in the labor market would end for a brief period of time, as eventually the Great Depression would drastically increase out-migration during the 1930s (see second chapter of this dissertation). By lowering the return rate for low-skilled migrants, quotas could also have had unintended consequences by leading to a lower-quality migrant stock. Initially, migrant quotas raised the skill level of entering migrants who claimed to have a job, where a 60% restriction of the migrant flow increased entering occupational scores by approximately 5%. However, when the same cohort is observed years later in the Census, migration quotas had zero effect on increasing skills. Part of this discrepancy is that migration quotas also increased the skills of the outgoing group by 3.2%, partially eliminating the gain in incoming skills. I interpret the results in terms of an Easterlin effect, where a smaller cohort following quotas led to less intense competition and thus fewer failures in the labor market, specifically amongst the least skilled foreign-born.
[6] However, the main reason why a 5% skill increase did not appear in the Census is that better occupations in the home country did not transfer well to better occupations in the United States, especially for new source countries. The findings show that migration policy aimed at improving migrant assimilation worked, as migrants stayed longer within the United States; however, the quotas' attempt to increase skills ultimately failed, partially due to unintended effects on temporary migration.

[6] See Pampel and Peters (1995) for an overview of the Easterlin effect.

3.1 Historical Background: Immigration to the United States

During the early decades of the Age of Mass Migration (1850-1913), rates of return migration were relatively low due to the high costs of migration (Bailey, 1912). Before steamships, migrants boarded sailing ships and paid high fares to travel for months across the Atlantic (Cohn, 2009). Not only did the high ticket prices and opportunity costs deter return migration, but also the trip itself was dangerous. Mortality rates on sailing ships were high, decreasing the incentive to return (Cohn, 1984). Following the Civil War, the shift in shipping technology from sail to steam lowered the costs of migration. While fares dropped slightly, the more important effect of steam technology on migration was that travel time fell from two months to two weeks (Cohn, 2009). This decrease in the cost of traveling to the United States altered the selection of arriving migrants; migrants who were formerly constrained by the costs of migration could now migrate, contributing to the falling quality of migrant cohorts in the late 1800s (Abramitzky, Boustan and Eriksson, 2014). Further, the development of networks in the 19th century also lowered costs for migrants, allowing for an increase in the volume of migration (Wegge, 1998; Spitzer, 2014). At the same time that the costs of migration were falling, the benefits of migration for unskilled labor were rising, as the United States offered a premium in real wages over European countries (Williamson, 1995). Accordingly, the composition of migrants changed during the late 19th century, shifting from a family movement where entire households moved to an influx of young males who were more mobile and more likely to return home (Gould, 1980; Baines, 1994). Millions came from Southern and Eastern Europe to find work in the relatively more industrialized United States economy. These jobs required skills that were different from agricultural jobs back home (Wyman, 1996).
However, if migrants were not successful in finding good jobs, it was relatively cheap and easy to return home; indeed, out-flows were large relative to inflows (Piore, 1979; Bandiera, Rasul and Viarengo, 2013).[7]

[7] There are several historical studies of return migration for specific countries during this time period. See Sarna (1981) for Jewish return migration, Kraljic (1978) for Croatian, and Saloutos (1956) for Greek return migration. Also, see Balch (1910) and Steiner (1906) for contemporary accounts of temporary migration. Finally, Gmelch (1980) has an anthropological survey of return migration.

The large increase in volume of new migrants soon led to a nativist backlash against the open-door policy of the United States. During the late 19th and early 20th centuries, members of Congress repeatedly tried to pass restrictions on the type of migrant who could enter, the most popular being that migrants should be literate (Zeidel, 2004). After many failed attempts, the literacy test was eventually put in place in 1917 during the anti-foreigner fervor of World War I (Goldin, 1994). This helped mark the beginning of the end of a free migration era.[8] With the end of World War I, migrant flows increased rapidly, which would eventually lead to a set of stronger restrictions on migration: the quota laws of the 1920s.

3.1.1 Description of Quota Laws

Passed in 1921, the first quota law (Emergency Quota Act) led to a fall in immigration flows of almost 55% in its first year. The annual quota limit for a given country was 3% of a country's migrant stock, as enumerated in the 1910 census. The law applied only to the Eastern Hemisphere (the Western Hemisphere was excluded from quota limits) and was especially restrictive for Eastern and Southern European countries with a small migrant stock in 1910. The Act categorized people into non-quota and quota immigrants, limiting only quota migrants after the numerical limit was reached; those traveling temporarily for business or vacation were still allowed to enter, not counting against the quota.[9] From July 1921 to June 1922, out of a possible 356,995 quota immigrants, only 243,953 entered, as many Western and Northern European countries did not fill their quotas (Report of the Commissioner General of Immigration or RCI, 1922).[10] The quota system of 1921 was temporary, designed to be in place for a year as Congress debated over a permanent system.
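The 1921 quota rule and the degree to which it was filled reduce to simple arithmetic. The following sketch is illustrative only: `annual_quota` and `fill_rate` are hypothetical helper names, the 100,000 migrant stock is a made-up input, and only the two aggregate counts (356,995 possible and 243,953 admitted quota immigrants) come from the text.

```python
def annual_quota(migrant_stock, share=0.03):
    """Quota for one country: a fixed share of its enumerated migrant stock.

    The default share of 3% corresponds to the 1921 rule described above.
    """
    return share * migrant_stock


def fill_rate(admitted, quota):
    """Fraction of the available quota that was actually used."""
    return admitted / quota


# Hypothetical country with 100,000 foreign-born residents in the 1910 census.
example_quota = annual_quota(100_000)

# Aggregate figures reported for July 1921 to June 1922.
aggregate_fill = fill_rate(243_953, 356_995)

print(f"quota for the hypothetical country: {example_quota:,.0f}")
print(f"share of the total quota filled: {aggregate_fill:.1%}")
```

Under these figures roughly two-thirds of the available quota slots were used in the first year, consistent with the observation that many Western and Northern European countries left their quotas unfilled.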
After extensions in 1922 and 1923 to keep quotas intact, Congress

[8] There were laws that restricted Chinese migrants, anarchists, polygamists and other smaller classes of people prior to 1917, but these laws did not limit a large number of potential migrants from Europe, the main source of immigration during this time period.
[9] Non-quota migrants (close family members, temporary travelers, certain occupations such as governmental officials, and migrants from the Western Hemisphere) were not counted against the quotas. In 1925, of the 250,912 non-quota immigrants that entered, 7,217 (2.9%) were wives and children of United States citizens, 64,632 (25.8%) were those returning after a visit abroad, and 175,069 (69.8%) were from countries in the Western Hemisphere (RCI, 1925).
[10] The total number of immigrants admitted during the fiscal year 1922 (July 1921-June 1922) was 309,556, a sharp drop from 805,228 the previous year (Report of the Commissioner General of Immigration or RCI, 1922).

passed the Immigration Act of 1924, which led to another fall in the migrant flow. Now the quota formula was 2% of the foreign-born population from a given country based on the 1890 Census. This change lowered the annual quota from 357,803 to 164,667, but unequally affected old source country migrants and new source country migrants.[11] For example, Italy's quota dropped from 42,607 to 3,845 and Russia's quota fell from 24,405 to 2,248, while Germany's only decreased from 67,607 to 51,227 (RCI, 1924-1925). These quota laws achieved their desired effect as migrant composition shifted sharply between 1914 and 1925 from new source countries back to old source countries. Northern and Western Europe's percentage of total immigrant flows increased from 20.8% to 75.7%, while Southern and Eastern Europe's dropped from 75.6% to 10.8% (RCI, 1925). These quota laws would govern United States migration policy for over four decades until the Immigration and Nationality Act of 1965.[12] Following implementation of the quota laws, the foreign-born population became a smaller fraction of the United States population; the era of millions of migrants arriving yearly into the United States had ended. But the quota laws did not just affect how many people entered; they also influenced who left. The quota laws lowered return rates from the United States, especially for those who were lower skilled (see second chapter of this dissertation); however, it is unknown whether this fall in return migration following the quotas was due to changes in planned return migration or unplanned return migration.

3.2 Theoretical Background

Following Borjas and Bratsberg (1996), I assume that there are two reasons for return migration.
[13] The first is to use savings accumulated in the United States for investment in a farm or

[11] Minor adjustments to countries' quotas were made between 1921 and 1924 as officials attempted to accurately estimate the effect of political boundary changes on quotas (RCI, 1922-1924).
[12] The quotas changed one more time in 1929. The quota numbers were no longer calculated based on the number of foreign-born in the United States, but now also attempted to reflect the national origins of United States citizens. This quota falls outside the time period of my study.
[13] There are many reasons for migrants to move temporarily rather than permanently even though many migrants return to poorer countries. One common suggestion is that migrants return due to preferences for consumption back home; culture, lifestyle and family are more familiar in the home country and the psychological costs of living in a different land might exceed the gains in utility from higher wages. While a taste for home consumption could explain return motivations, it does not give insight into selection in return migration based on skill, which is the goal of this paper.

business back home. This financial capital allows the migrant to earn a premium over his original earnings in the source country, which forms the motivation for temporary migration. The second reason for returning home is because of shocks to income after arrival; migrants may earn higher or lower earnings than expected and then decide to switch their plans after arrival to either stay or leave the United States. A simple Roy model of migration suggests that short-term migrants should be self-selected on human capital depending on the relative returns to skill across countries (Borjas and Bratsberg, 1996). To illustrate this, assume that wages in the host country (the United States for our purposes) are rewarded as follows:

$$ w_{us} = \mu_{us} + \phi h + \varepsilon_{us} \tag{3.1} $$

where $w_{us}$ is the wage in the United States, $\mu_{us}$ is the base wage in the United States, $\phi$ is the relative return to human capital $h$ in the United States, and $\varepsilon_{us}$ is a wage shock realized after arrival. Similarly, wages at home ($w_{home}$) are

$$ w_{home} = \mu_{home} + h \tag{3.2} $$

Suppose that temporary migration is rewarded by a premium to home earnings $\kappa$, which is known before migrating. If one migrates temporarily, he or she spends a fraction $t$ of his or her life in the United States. Thus the potential wage from migrating temporarily would be $w_{return} = t\,w_{us} + (1 - t)(w_{home} + \kappa)$. This sets up the motivation for temporary migration, where one plans to migrate temporarily if both $E[w_{return}] - 2C > w_{home}$ and $E[w_{return}] - C > E[w_{us}]$, where $C$ is the cost of migration.[14] If skill is rewarded relatively more in the host country ($\phi > 1$), then the higher skilled have a larger incentive to migrate, leading to the migrant population (both temporary and permanent migrants) being positively self-selected from the source country. Of these high-skilled migrants, the highest skilled gain the largest premium and decide to stay permanently. Other migrants, who

[14] This is assumed to be constant across all individuals, which could be relaxed easily.

are on the lower end of the already high-skilled migrant population, could decide to return home to earn a premium from investing capital; in other words, the return migrants are negatively self-selected relative to permanent migrants. This process leads to the best of the best remaining and return migration intensifying the original selection of migrants to be more positive. In the alternative case where skill is rewarded relatively less in the host country ($\phi < 1$), the worst of the worst remain permanently because they earn the highest premium (see Borjas and Bratsberg (1996) for a further discussion). However, this insight from Borjas and Bratsberg (1996) on the selection of return migrants is only for planned return migration rather than actual return migration. After arrival, one updates his or her decision dependent on the draw $\varepsilon_{us}$, which could lead to a desire to return, or $w_{return} - C > w_{us}$, even though prior to arrival one expected to migrate permanently. Further, there may be a correlation between $\varepsilon_{us}$ and $h$. In other words, selection on skill may change after arrival depending on which parts of the human capital distribution experience negative wage shocks. Wages may differ from expectations because skills do not transfer across economies, because of misinformation, or because of economy-wide downturns. Thus migrants may fail in the labor market because they earn much less than expected. If negative shocks are correlated with the least skilled, then return migrants may be negatively self-selected on skill even when a Roy model predicts positive self-selection.

3.3 Data

To determine the role of target savers versus failures in the labor market, I make two comparisons between return migrants and permanent migrants: one is at arrival where return migration is expected and the second is at departure when return migration is realized.
For migrants at arrival, I use incoming ship records; at departure, I contrast administrative data on out-going migrants with Census data provided by IPUMS. I begin with this latter group.
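Before turning to the data, the decision rules of Section 3.2 can be restated as a minimal numerical sketch. Everything below is an illustration under stated assumptions rather than the model as estimated: the parameter values are hypothetical, the expected shock is taken to be zero, and the rule for permanent migration (expected U.S. wage net of one migration cost exceeds the home wage) is an assumption added here to complete the classification. The two conditions for planned temporary migration and the post-arrival return condition follow the text.

```python
from dataclasses import dataclass


@dataclass
class RoyParams:
    mu_us: float    # base wage in the United States
    mu_home: float  # base wage at home
    phi: float      # relative return to human capital in the U.S.
    kappa: float    # premium to home earnings after a temporary stay
    t: float        # fraction of working life spent in the U.S.
    cost: float     # one-way migration cost C


def wages(h, p, eps=0.0):
    """Return (w_us, w_home, w_return) for skill h and U.S. wage shock eps."""
    w_us = p.mu_us + p.phi * h + eps
    w_home = p.mu_home + h
    w_return = p.t * w_us + (1 - p.t) * (w_home + p.kappa)
    return w_us, w_home, w_return


def plan_before_arrival(h, p):
    """Classify the plan made before migrating, assuming E[eps] = 0."""
    w_us, w_home, w_return = wages(h, p)
    if w_return - 2 * p.cost > w_home and w_return - p.cost > w_us:
        return "temporary"
    if w_us - p.cost > w_home:  # assumed rule for permanent migration
        return "permanent"
    return "stay home"


def wants_to_return(h, p, eps):
    """After arrival: return if w_return - C > w_us at the realized shock."""
    w_us, _, w_return = wages(h, p, eps)
    return w_return - p.cost > w_us


# Hypothetical parameters with phi > 1 (skill rewarded more in the U.S.).
params = RoyParams(mu_us=4.0, mu_home=5.0, phi=1.5, kappa=2.0, t=0.5, cost=0.5)
plans = {h: plan_before_arrival(h, params) for h in (1.0, 3.0, 5.0)}
no_shock = wants_to_return(5.0, params, eps=0.0)
bad_shock = wants_to_return(5.0, params, eps=-3.0)
print(plans)
print(no_shock, bad_shock)
```

With these illustrative values the lowest-skilled individual stays home, the middle one plans a temporary stay, and the highest-skilled plans to stay permanently, which is the pattern described for the case where skill is rewarded more in the host country. The same high-skilled migrant prefers to return once a sufficiently negative wage shock is drawn, which is the unplanned return channel.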

3.3.1 Return Migrants at Departure: Administrative Data and IPUMS

Data on out-migrants are found in the Annual Report of the Commissioner General of Immigration (henceforth RCI) between 1908 and 1932, which records statistics on emigrants at departure from the United States. When a ship left the United States, its captain had to deliver a passenger list to the port's customs agents, a manifest similar to those for arriving immigrants, which included a variety of demographic, economic and geographic characteristics. These ship manifests were forwarded to the Bureau of Immigration, which aggregated them into tables and reported them annually to Congress.[15] The annual reports include a reasonable amount of detail on the types of migrants who were leaving on a yearly basis. These are the only data from the United States that systematically observe departures. Importantly, these data provide only aggregations of those who leave, making analysis of micro-determinants of out-migration impossible. To determine how these individuals selected into out-migration, I compare their characteristics with the population they were drawn from: the migrant stock living in the United States. Data on the migrant population in the United States are taken from 1% IPUMS samples from 1910-1930 (Ruggles et al., 2010). Obviously, short-term migrants were very different from the migrant stock in terms of years of stay. Approximately 90% of out-migrants lived in the United States for less than ten years, while the corresponding number for the migrant stock is 30%. To make a better comparison between out-migrants and permanent migrants, I reweight the migrant stock to match the years of stay in the out-migrant data, which in effect places less weight on those staying more than ten years and more weight on those staying less than ten years.[16] My aim is to measure self-selection into return migration based not only on occupation, but also on other observables.
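The reweighting step can be sketched for a single year and ethnicity cell. This is a simplified illustration, not the procedure as implemented: the ten census records are hypothetical, the function name `duration_weights` is invented for the example, and only the 90% and 30% shares come from the text. Each record receives a weight equal to the target share of its duration bin divided by that bin's share in the census sample.

```python
from collections import Counter


def duration_weights(stock_bins, target_shares):
    """Weight each census record so duration-bin shares match the target.

    stock_bins: duration bin of each record in the migrant-stock sample.
    target_shares: share of out-migrants in each duration bin.
    """
    counts = Counter(stock_bins)
    n = len(stock_bins)
    return [target_shares[b] / (counts[b] / n) for b in stock_bins]


# Hypothetical cell: 30% of the migrant stock has stayed less than ten years,
# while 90% of out-migrants left within ten years (shares from the text).
stock = ["<10"] * 3 + ["10+"] * 7
target = {"<10": 0.9, "10+": 0.1}

weights = duration_weights(stock, target)
short_share = sum(w for w, b in zip(weights, stock) if b == "<10") / sum(weights)
print(round(short_share, 3))
```

After reweighting, the weighted share of short-stay records in the census sample equals the out-migrant share, so differences in occupation between the two groups are not driven by differences in time spent in the United States.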
When comparing characteristics such as age, sex, marital status, and location, I use the full sample of foreign-born in IPUMS. However, when comparing migrant skills (i.e., occupational scores), I

[15] These annual reports serve as a basis for Ferenczi and Willcox's (1929) expansive volume on migration data.
[16] The reweighting is matched by year and ethnicity. For example, if 73% of German out-migrants in 1911 left after less than five years of stay, then the census data is reweighted so that in 1910 73% of the German migrant stock stayed for less than five years. The years are matched from the fiscal year 1911, 1921, and 1931 annual reports to the 1910, 1920 and 1930 census. The fiscal year in the United States lasted from, for example in 1911, July 1st, 1910 to June 30th, 1911.

drop those under the age of 16 and over the age of 65 to represent those of working age. The administrative data recorded characteristics of out-migrants based on ethnicity rather than country of origin. While this is actually beneficial over the 1908-1932 time period as many countries changed their borders due to World War I, it requires matching individuals in the U.S. Census to an ethnicity. Rather than using country of birth to match ethnicity across the two datasets, I use the mother's language variable in IPUMS to connect them to an ethnicity, which is relatively straightforward and allows matching on ethnicities (e.g., "Hebrew") without a single defined country. The data on out-migrants are not perfect. Recent research by Bandiera, Rasul, and Viarengo (BRV, 2013) argues that these administrative records severely undercounted both immigrants and emigrants. Thus, the RCI data can be thought of as a sample of total out-migrants, just as the IPUMS is a sample of the entire population; however, the question is whether the RCI data is representative of the return migrant population. The main reason suggested by BRV for the undercounting is careless compiling of ship manifests by the Bureau of Immigration. Which ships were not recorded in the official tally is unclear, and it is plausible that this measurement error is random and does not affect the representativeness of the sample; however, there is no way to verify whether this is true since the original out-going manifests were not archived.[17]

3.3.2 Planned Return Migrants at Arrival: Ship Manifests

The other important aspect of return migration is how skilled migrants differed in their intentions to return. To understand return migrant intentions, I randomly sample 1% of ships that originated in Europe and arrived at Ellis Island from 1917 to 1924.
18 While a system for keeping 17 Another concern is whether or not cabin classes were excluded from the official statistics, which would create a negative bias since the recorded out-migrants would be less skilled than the overall returning population. Willcox (1931) reports that only steamship passengers are included in official statistics before 1904, but changes in classification of immigrants by 1907 solve the problem by including the cabin class. The definition of an immigrant in 1908 changed to one whose last permanent residence was outside the country and intended to reside within the United States for at least 12 months at a time, and thus included cabin class passengers who did not wish to be associated with the term immigrant. (Hutchinson, 1958) However, this problem seems to apply to only incoming migrants and not to out-going migrants because some cabin class migrants attempted to avoid the $4 head tax (RCI, 1912). 18 I randomly sample the ships from the Statue of Liberty Ellis-Island Foundation.

records of immigrants had been in place since 1820, an important question was added to these 52 manifests in 1917: asking whether migrants intended to return to their home country. 19 There was no benefit or penalty to stating a permanent or a return plan; further, these ship manifests were filled out by ship captains rather than United States border officials, which potentially reduced any misrepresentation if migrants would be more intimidated by border officials. Using this information, I can determine whether an individual was planning to return. However, there are some migrants (599 to be exact) who listed their intention to stay as indefinite or uncertain. I allocate these uncertain migrants to the planned return migrant group, leading the total of 2,533 planned return migrants. Allocating the uncertain migrants to the planned permanent group or dropping them from the sample does not change any conclusions of the paper; in fact, either way would make the conclusion that unplanned return migration (i.e., failures in the United States) was important to return migration even stronger since dropping the uncertain group would lead to fewer migrants planning to return home. Not all individuals who were on the Ellis Island manifests would be included as immigrants in the administrative data. To maintain consistency with United States definitions of immigrants and emigrants, I drop those who were traveling temporarily through the United States, had previously been in the United States in the same year, and those who were listed but did not actually embark on the trip. 20 The final sample contains 68 ships and 20,156 individuals. Since the sample is not a simple random sample of the migrant population but rather a sample of ship arrivals, I reweight the individuals to match the number of migrants from old source countries and new source countries per year according to the official records. 
This is done in order to make the sample representative of the incoming migrant population. 19 The text of the questions was as follows: Whether alien intends to return to country whence he came after engaging temporarily in laboring pursuits in the United States. Other questions include Length of time alien intends to remain in the United States, and Whether alien intends to become a citizen of the United States. The manifests also include whether the migrant had been in the United States previously, which is important to the self-selection of repeat migrants. However, this is not the focus of this paper. 20 Those that did not embark on the trip were often crossed out with a dark line. It is unclear why these migrants were not actually on the ship.
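The reweighting of the ship sample described above can be read as a form of post-stratification: within each fiscal year and source group, every sampled individual receives a weight equal to the official count for that cell divided by the number sampled. The following is a minimal sketch under that reading, not the procedure actually used for the manifests. The cell counts and the names `official_counts` and `sample_counts` are hypothetical.

```python
# Post-stratification sketch: weight = official cell count / sampled cell count.
# All numbers below are hypothetical and only illustrate the mechanics.

def poststratification_weights(official_counts, sample_counts):
    """Return one weight per (fiscal_year, source_group) cell."""
    weights = {}
    for cell, n_sample in sample_counts.items():
        if n_sample == 0:
            continue  # no sampled individuals to carry weight in this cell
        weights[cell] = official_counts[cell] / n_sample
    return weights


# Hypothetical official arrivals and sampled individuals for one fiscal year.
official_counts = {(1920, "old"): 90_000, (1920, "new"): 160_000}
sample_counts = {(1920, "old"): 1_500, (1920, "new"): 1_000}

weights = poststratification_weights(official_counts, sample_counts)

# Share of new source migrants before and after weighting.
weighted_new = weights[(1920, "new")] * sample_counts[(1920, "new")]
weighted_old = weights[(1920, "old")] * sample_counts[(1920, "old")]
weighted_new_share = weighted_new / (weighted_new + weighted_old)
unweighted_new_share = sample_counts[(1920, "new")] / sum(sample_counts.values())
```

In this made-up example the unweighted sample under-represents new source migrants, and the weights restore their share to the one implied by the official counts.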

3.3.3 Measures of Migrant Quality: Occupational Scores and Height

In order to determine whether return migrants were of higher or lower quality than permanent migrants, I use two separate measures to capture human capital. First, I use occupational scores, which are standard in other historical self-selection papers (Abramitzky, Boustan, and Eriksson, 2012, 2014; Collins and Wanamaker, 2014). Ideally, one would compare wages instead of occupations, but administrative data only records the occupations of returnees. 21 Lacking individual-specific wages, I assign an occupational score to each occupation to reflect its earnings, so that all individuals claiming an occupation receive the same score. Accordingly, self-selection estimates are based on how temporary and permanent migrants differed on the occupational ladder. Although actual return migrants may have had lower wages within occupation, this is impossible to determine with the data available.

Similar to Collins and Wanamaker (2014), I use income data in the 1% 1940 IPUMS sample to assign each occupation the mean wage based purely on migrants' earnings in 1940. 22 This has the advantage over the 1950 IPUMS occupational score, used in other papers, of being closer in time to the period of study (1908-1932), and also of reflecting migrants' earnings rather than those of natives. Furthermore, I vary occupational scores by new and old source countries, assigning each the mean wage within an occupation for that group. 23 A disadvantage of this method is that self-employed earnings are not reported, so I drop those with zero reported income when calculating occupational earnings. This is particularly important because many farmers' earnings were unreported. However, farmers make up a small portion of return migrant occupations (less than 4%), so estimates are not strongly sensitive to this restriction. For results using alternative occupational scores, see Appendix B.1.

As a second measure of migrant quality, only available for the planned migrant micro-data, I am able to use an individual-level measure of quality: the migrant's height. Height is positively correlated with wages, nutrition, intelligence and strength, all of which are important measures of quality (Steckel, 2009). Also, height has favorable attributes for measuring selection of migrants, being constant across borders and having a standard measurement, unlike occupation, education or wages (Spitzer and Zimran, 2014). Further, because height varies across individuals within an occupation while occupational scores do not, I am able to explore whether planned return migrants were shorter even when controlling for occupation.

21 Furthermore, wages are likewise not recorded in the census for stayers. 1940 was the first year that the census recorded wages, but it did not record the year of arrival of migrants, which is needed when comparing return and permanent migrants.

22 Collins and Wanamaker (2014) create another occupational score based on the Historical Statistics of the United States (Carter et al., 2006) and Lebergott's (1964) data on earnings between 1900 and 1928, a more suitable time period for my study. However, this data is based on very broad industry categories which cannot be applied to the RCI data since industry is not recorded for laborers.

23 Generally, I assign based on old source country migrants (Western and Northern Europe) and new source country migrants (Eastern and Southern Europe, South America, Asia and Africa). More specific earnings could be based on country of birth, but the 1940 1% sample does not have enough observations to fill occupation-country cells.

3.4 Descriptive Statistics

3.4.1 Self-Selection of Return Migrants at Departure

The characteristics of actual return migrants serve as a contrast to the group of migrants who planned to return at arrival, described in the next section. 24 Table 3.1 reports the differences between return migrants and those remaining in the United States as calculated from IPUMS. These differences are comparisons of unconditional means without controlling for sex, age or place of birth. Estimating self-selection on occupation given these types of controls would be informative, but is impossible with the highly aggregated out-migrant data. However, how the overall group of migrants differed unconditionally is arguably of first-order importance, especially for any policy maker.

When comparing the average out-migrant with the average migrant in the Census, there are a number of differences in the foreign-born who selected into return migration. First, the average return migrant was less skilled than the rest of the migrant population, earning about 3.5 percent less. This finding supports the conclusion by ABE (2014), who were only able to study return migration by residual methods, that return migrants were negatively self-selected on occupation. Since return migrants were lower skilled on average, a migrant who survived in the United States would likely be higher skilled than those leaving, which positively biases estimates of migrant assimilation (Borjas, 1985).

24 See Biavaschi (2013) for a further discussion of the self-selection of out-migrants between 1908 and 1957.
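The occupational-score construction in Section 3.3.3 and the 3.5 percent gap quoted above can be sketched in a few lines. This is a minimal illustration, not the code behind the thesis: the wage records, occupation labels and function names are hypothetical, and averaging logs (rather than taking the log of an average) is an assumption. Only the two values 6.825 and 6.860 are taken from Table 3.1.

```python
import math
from collections import defaultdict


def occupational_scores(wage_records):
    """Mean positive wage per (source_group, occupation) cell.

    Records with zero reported income are dropped, mirroring the restriction
    described in the text for unreported self-employed earnings.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for group, occupation, wage in wage_records:
        if wage <= 0:
            continue
        cell = totals[(group, occupation)]
        cell[0] += wage
        cell[1] += 1
    return {key: total / n for key, (total, n) in totals.items()}


def mean_log_score(people, scores):
    """Average log occupational score over (group, occupation) pairs."""
    logs = [math.log(scores[person]) for person in people]
    return sum(logs) / len(logs)


# Hypothetical 1940-style wage records: (source_group, occupation, annual wage).
wages_1940 = [
    ("new", "laborer", 700.0), ("new", "laborer", 900.0), ("new", "laborer", 0.0),
    ("new", "tailor", 1200.0), ("new", "tailor", 1000.0),
]
scores = occupational_scores(wages_1940)

# Hypothetical occupation mixes for returnees and stayers.
returnees = [("new", "laborer")] * 3 + [("new", "tailor")]
stayers = [("new", "laborer")] * 2 + [("new", "tailor")] * 2
gap = mean_log_score(returnees, scores) - mean_log_score(stayers, scores)

# Reading the log gap in Table 3.1 as a percentage difference in earnings.
table_gap = 6.825 - 6.860
percent_gap = (math.exp(table_gap) - 1) * 100
```

Because every person in an occupation receives the same score, the gap is driven entirely by the occupation mix of the two groups. For small values the log gap and the percentage gap are nearly identical, which is why a log difference of -0.035 is described as about 3.5 percent.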

Table 3.1: Self-Selection of Return Migrants, 1908-1932

Characteristics             Return Migrants   Census      Self-Selection
Log (Occupational Score)    6.825             6.860       -0.035
Male                        79.6              55.7        23.9
Age
  Less than 16              4.3               14.6        -10.3
  16-45                     79.4              75.9        3.5
  More than 45              16.3              9.5         6.8
Married*                    68.5              45.9        22.6
Old Source Country          24.1              50.6        -26.5
New Source Country          75.9              49.4        26.5
Region of Last Residence
  Northeast                 63.7              54          9.7
  Midwest                   20.7              25.3        -4.6
  South                     5.8               5.9         -0.1
  West                      9.7               14.8        -5.1
Total individuals           3,893,293         233,497

Notes: Out-migrant data is from the Annual Reports of the Commissioner General of Immigration (1908-1932). The Census is from the 1910-1930 IPUMS samples. Self-Selection is the difference between the out-migrants and the Census. A positive number in Column III indicates that return migrants have more of that characteristic. All numbers are percentages except for the occupational scores; further, those with no occupation are dropped for the occupational score calculation. *Marital status was only measured for out-migrants after 1918.

The finding that return migrants were lower skilled than the rest of the migrant population is perhaps surprising given that return migrants were more likely to be male than the foreign-born (79.6% versus 55.7%). It is possible that when controlling for sex, out-migrants would be even more negatively self-selected on occupation. On the other hand, a possible explanation for out-migrants' lower earnings is that return migrants were also more likely to be from the new source countries of Italy, Greece and Russia, migrants who did not have high-paying occupations in the United States.

The cost of return migration appears to have influenced who self-selected into out-migration, as the further a migrant was from a port of departure, the less likely he was to migrate.
Migrating back to Europe from the Northeast was much easier than from the Midwest or West because traveling across the country by railroad or car was costly in terms of ticket price and time; indeed, out-migrants were more likely to be from the Northeast compared to the migrant population (63.7% versus 54%). Besides the costs of traveling, the opportunity cost of being separated from family may have been high: return migrants were more likely to be married compared to the migrant population (68.5% versus 45.9%), a difference which holds even after dropping those under the age of sixteen in the census. 25

The differences between return migrants and permanent migrants could be for a simple reason: these were the migrants who had always planned to return home. Perhaps migrants from new source countries who were lower skilled, married, male, older and lived in the Northeast purposely migrated temporarily in order to accumulate savings quickly in the United States. For example, the lower skilled could have been more constrained by credit for investments in a business; cultural and language differences could have led to higher propensities to return among new source country migrants; married migrants would be more likely to return to family; and any foreign-born who planned to migrate temporarily should settle in a place with lower traveling costs, such as New York.

Alternatively, all of these migrants could have been more likely to fail in the United States. Discrimination could have driven out new source country migrants, the lower skilled could have found it difficult to upgrade occupations, married migrants could have tested labor outcomes prior to bringing their family over and failed, and those in the Northeast could have faced more competition from other migrants. For these reasons, it is important to look at migrant intention data to uncover which was the stronger explanation for return migration: failures in the labor market or plans to return home.

3.4.2 Planned Emigrants

Table 3.2 shows the difference between those who planned to return home versus those who planned to remain permanently in the United States.
If those who planned to return home did not change their plans after arrival, then the selection of planned return migrants on arrival should be the same as the selection of actual migrants at departure. However, the selection does not remain constant through time.

25 After dropping those under the age of sixteen, the marriage rate is 55.6%.

Table 3.2: Descriptives of Planned Return Migrants, 1917-1924

Characteristics                Return     Permanent   Planned Return Selection
Log (Occupational Score)       6.78       6.80        -0.02**
Male                           53.7       52.9        0.8
Age
  Less than 16                 2.8        7.3         -4.5***
  16-45                        89         82.8        6.2***
  More than 45                 8.3        10.0        -1.7**
Married                        32.3       40.9        -8.6***
Old Source Country             63.2       57.7        5.5***
New Source Country             36.8       42.3        -5.5***
Intended Region of Residence
  Northeast                    60.6       60.5        0.1
  Midwest                      24.6       29.3        -4.7***
  South                        3.2        3.8         -0.6
  West                         11.6       6.4         5.2***
Age (years)                    28.5       28.5        0.0
Been in US before              22.9       22.0        0.9
Number of Children             0.2        0.4         -0.2**
Join Family                    71.4       81.3        -9.9***
Traveling Alone                81.0       66.3        14.7***
Height (cm)                    166.7      166.8       -0.1
Big City (>100,000)            63.1       60.8        2.3**
Urban (>2,500)                 82.0       82.4        -0.4
Observations                   2,523      17,633

Notes: Data is from Ellis Island Records (1917-1924). Ellis Island records are weighted to be representative of the incoming migrant flow. Everything is in percentages unless otherwise noted. *** indicates a p-value of less than 0.01, ** a p-value of less than 0.05 and * a p-value of less than 0.10.

First, planned return migrants were less skilled than planned permanent migrants, with this difference in skill being more severe for migrants who actually left. For example, planned return migrants' occupations earned 2% less than those of planned permanent migrants, while in 1920 actual return migrants' occupations earned 8% less than those of permanent migrants. 26 The fall in the self-selection of return migrants from arrival to departure suggests that the lower skilled were more likely to change their plans after arrival; in other words, the lower skilled were more likely to fail in the labor market. 27

26 Table B.1 reports the characteristics of European migrants leaving in 1920, using the 1921 fiscal-year RCI data to compare to the census. This table is created as a more proper comparison to European migrants entering Ellis Island from 1917-1924.

Another difference between the self-selection of return migrants on arrival and departure is country of origin. Migrants from new source countries were less likely to plan to return home than migrants from old source countries. Yet as we have seen already in Table 3.1, new source migrants actually returned at much higher rates, which suggests that new source country migrants experienced worse outcomes than expected relative to old source country migrants. Part of this difference between arrival and departure may be attributable to the years of the sample, which cover the 1921 and 1924 immigration quotas. Migrant quotas lowered the rates of return of new source country migrants below the return rate of old source country migrants.

Planned return migrants were less likely to be married, unlike actual return migrants, who were more likely to be married. Planned return migrants did not intend to reside in the Northeast at higher rates than in other regions of the United States, unlike actual return migrants, who left disproportionately from the Northeast. Finally, males were not more likely to plan to return, unlike actual return migrants. Putting the differences between planned return migrants on arrival and actual return migrants on departure together suggests which parts of the migrant population were more likely to experience negative shocks after arrival; in other words, some parts of the migrant population were too optimistic about their outcomes in the United States. 28 Those who were low-skilled, married, male, from new source countries, and traveling to the Northeast were more likely to experience failures in the labor market.

Other characteristics of the self-selection of planned return migrants are listed at the bottom of Table 3.2. These are characteristics that cannot be compared to the actual out-migrant data due to limitations of the administrative data. Planned return migrants were no more or less likely than planned permanent migrants to have been in the United States before (i.e., repeat migrants).
This perhaps contradicts the stories of "birds of passage" who would travel back and forth between Europe and the United States many times. Planned return migrants also were less likely than planned permanent migrants to be traveling with children, less likely to join a family member, more likely to be traveling without any companion and more likely to be traveling to a large urban center such as New York, Philadelphia or Boston.

27 Alternatively, the higher skilled may have been more likely to switch their plans to stay in the United States; however, this is unlikely given that on average more migrants decided to switch their plans to leave the United States than to stay.

28 This is opposed to recent evidence that some migrants from Tonga underestimate their earnings when migrating (McKenzie, Gibson and Stillman, 2013).

3.4.3 Self-Selection of Return Migrants Conditional on Observables

Broad differences between return and permanent migrants provide a picture of who in the foreign-born population was more likely to return home; however, a limitation of this data is that it only holds aggregates. Using the micro-data on incoming migrants, one can control for observables to determine what is driving the selection of, at least, planned return migrants. I run the following regression of planned return migration on various observables:

\[ \text{PlannedReturn}_i = \beta_0 + \beta_1 \log(\text{OccScore}_i) + \gamma X_i + \varepsilon_i \tag{3.3} \]

where $\text{PlannedReturn}_i$ is a zero-one variable where one indicates that a migrant planned to return home. The main coefficient of interest is $\beta_1$, which indicates whether a migrant with a higher occupational score is more or less likely to plan to return home. Observables in $X_i$ include a migrant's fiscal year of arrival, ethnicity, age, sex, and other observables listed in Table 3.2. I also attempt to control for the effects of migration quotas on return migration. In particular, I include the variable $\text{New} \times \text{Post1921}_i$, which interacts whether a migrant is from a new source country with whether the fiscal year is after 1921, when the first quota was put in place. This variable is also of interest itself because it estimates the effect of migration quotas on planning to return. 29 I cluster standard errors by ethnicity in order to account for any serial correlation in the error term. Table 3.3 shows results from Equation 3.3.
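Because Equation 3.3 includes ethnicity fixed effects, the coefficient on the occupational score is identified from comparisons among migrants of the same ethnicity, and it need not share the sign of the pooled correlation. The sketch below illustrates this with a handful of hypothetical migrants and a single regressor. It uses the within transformation (demeaning by group), which gives the same slope as including group dummies, and it does not compute clustered standard errors. All data values and function names are made up.

```python
from collections import defaultdict


def slope(xs, ys):
    """OLS slope of y on x with an intercept."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var = sum((x - x_bar) ** 2 for x in xs)
    return cov / var


def demean_by_group(values, groups):
    """Subtract each group's mean: the within transformation behind fixed effects."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        sums[group] += value
        counts[group] += 1
    return [value - sums[group] / counts[group] for value, group in zip(values, groups)]


# Hypothetical migrants: ethnicity, log occupational score, planned return (0/1).
ethnicity = ["A"] * 4 + ["B"] * 4
log_score = [6.5, 6.6, 6.7, 6.8, 7.0, 7.1, 7.2, 7.3]
planned = [0, 1, 1, 1, 0, 0, 0, 1]

# Pooled slope: mixes differences across and within ethnicities.
pooled = slope(log_score, planned)

# Within slope: only differences among migrants of the same ethnicity.
within = slope(demean_by_group(log_score, ethnicity),
               demean_by_group(planned, ethnicity))
```

In this toy data the lower-scoring ethnicity plans to return more often, so the pooled slope is negative, while within each ethnicity the higher-scoring migrants are the ones planning to return, so the within slope is positive.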
The first column shows the simple correlation between the occupational score and planned return and demonstrates that occupational scores were negatively correlated with planning to return home. 30 The second column adds controls for year and birthplace, which turns the coefficient on occupational score positive. This change in the coefficient on occupational score shows that, within country, return migrants were positively self-selected, while across countries they were negatively self-selected. In other words, return migrants were more likely to come from countries that had lower occupational scores. However, one should note that the coefficient is rather small; a 10 percent increase in an occupational score increases the likelihood of planning to return home by only 0.25 percentage points. Going from a general laborer to an iron worker yielded a 50% premium in occupational score; the same worker would be 1.25 percentage points more likely to return.

After controlling for year and ethnicity, I add all observable controls for incoming migrants, including sex, age, marital status and whether they were planning to join someone at arrival. Including these controls leads to no change in the sign for occupational score, but the coefficient is now statistically insignificant. Using height as an alternative measure of the quality of a migrant in Column IV leads to the sign on height being negative but statistically insignificant and extremely small. Due to the small magnitude of the coefficients on height and occupational scores, the preferred interpretation is that across countries and within country, planned return migrants were close in skill to planned permanent migrants.

29 This is a difference-in-difference methodology to capture the effect of migration quotas. Any regression with new source country interacted with a post-1921 indicator also includes ethnicity and fiscal year fixed effects.

30 There is no statistically significant difference for occupational scores because Table 3.3 clusters standard errors by birthplace, while the difference in means from Table 3.1 does not.

3.4.4 Planned Length of Stay

Planned return migrants are generally considered to have a savings target to reach prior to migrating back to their home country. If the savings target for investment in a business back in the source country is constant across the skill distribution, then higher earners would plan to stay in the United States for a shorter period of time. We can test whether this is true by using data on planned migrants' expected length of stay. Not all migrants who planned to return home reported years of stay, but 1,317 observations did list a length of stay, averaging 4.3 years for those who held a job. Table 3.4 shows the results of estimating Equation 3.3, but using years of stay as the dependent

variable. Column I shows that the simple correlation between occupational score and length of stay is negative, estimating that a 10 percent increase in an occupational score decreases length of stay by 0.17 years, or 2 months. The estimated coefficient suggests that a laborer would plan to stay a year longer than a worker in the iron and steel industry. A negative estimated coefficient on occupational scores is consistent with the hypothesis that temporary migrants were aiming for a similar savings target.

Table 3.3: Self-Selection into Planned Migration on Quality, 1917-1924

                          I           II          III         IV          V
Log (Occupational Score)  -0.0185     0.0259*     0.0324                  0.0327
                          (0.0238)    (0.0139)    (0.0198)                (0.0195)
Height (cm)                                                   -0.000831   -0.000857
                                                              (0.000887)  (0.000875)
New x Post 1921                                   -0.115*     -0.120*     -0.117*
                                                  (0.0679)    (0.0684)    (0.0679)
Male                                              -0.0297     -0.0147     -0.0241
                                                  (0.0210)    (0.0196)    (0.0243)
Repeat Migrant                                    -0.0141**   -0.0137**   -0.0139**
                                                  (0.00615)   (0.00599)   (0.00618)
Join Family                                       -0.0302     -0.0336*    -0.0306
                                                  (0.0203)    (0.0197)    (0.0204)
Ever Married                                      -0.0263***  -0.0264***  -0.0262***
                                                  (0.00848)   (0.00848)   (0.00841)
Traveling Alone                                   0.0204***   0.0205***   0.0206***
                                                  (0.00660)   (0.00648)   (0.00657)
Number of Children                                -0.0174*    -0.0175**   -0.0172*
                                                  (0.00882)   (0.00834)   (0.00863)
Midwest                                           -0.00575    -0.00797    -0.00570
                                                  (0.0109)    (0.0120)    (0.0109)
West                                              0.0575*     0.0546      0.0580*
                                                  (0.0322)    (0.0346)    (0.0328)
South                                             0.00541     0.00701     0.00544
                                                  (0.0192)    (0.0195)    (0.0193)
Big City (> 100,000)                              0.0143      0.0164      0.0146
                                                  (0.0143)    (0.0151)    (0.0141)
Age FE                                            X           X           X
Ethnicity FE                          X           X           X           X
Year FE                               X           X           X           X
Observations              13,979      13,979      13,979      13,979      13,979
R-squared                 0.000       0.119       0.140       0.140       0.141

Notes: Data is from incoming passenger manifests from Ellis Island (1917-1924). The dependent variable is whether or not an individual planned to return home. Standard errors, in parentheses, are clustered by ethnicity.

Table 3.4: Planned Length of Stay, 1917-1924

                          I           II          III         IV          V
Log (Occupational Score)  -1.758***   -1.213***   -1.188***               -1.175***
                          (0.290)     (0.204)     (0.190)                 (0.200)
Height (cm)                                                   -0.0289*    -0.0271*
                                                              (0.0139)    (0.0134)
New x Post 1921                                   -0.405      -0.227      -0.392
                                                  (0.633)     (0.652)     (0.624)
Male                                              0.187       -0.0330     0.383**
                                                  (0.149)     (0.208)     (0.162)
Repeat Migrant                                    0.364**     0.379**     0.354**
                                                  (0.137)     (0.148)     (0.141)
Join Family                                       0.437*      0.552**     0.409*
                                                  (0.217)     (0.200)     (0.215)
Ever Married                                      0.134       0.132       0.129
                                                  (0.175)     (0.185)     (0.177)
Traveling Alone                                   0.511**     0.448*      0.520**
                                                  (0.225)     (0.245)     (0.224)
Number of Children                                0.265       0.182       0.295
                                                  (0.479)     (0.533)     (0.477)
Midwest                                           0.751***    0.829***    0.783***
                                                  (0.162)     (0.160)     (0.151)
West                                              1.134***    1.243***    1.191***
                                                  (0.185)     (0.185)     (0.187)
South                                             0.169       0.0633      0.162
                                                  (0.587)     (0.601)     (0.593)
Big City (> 100,000)                              0.184**     0.170*      0.204**
                                                  (0.0871)    (0.0953)    (0.0956)
Age FE                                            X           X           X
Ethnicity FE                          X           X           X           X
Year FE                               X           X           X           X
Observations              1,317       1,317       1,317       1,317       1,317
R-squared                 0.062       0.313       0.375       0.361       0.378

Notes: Data is from incoming passenger manifests from Ellis Island (1917-1924). Only those who planned to return home and listed the intended years of stay are included in the sample. The dependent variable is the number of years a return migrant plans to stay. Standard errors, in parentheses, are clustered by ethnicity.

Higher-skilled people intended to stay shorter lengths of time even after controlling for various observables. Column II looks at within-country differences in lengths of stay, which leads to a slight drop in the estimated association between skill and length of stay. Further, for a given birth location, sex, set of family characteristics and geographical preferences, higher-skilled return migrants planned to stay a shorter period of time. Using height as the measure of skill also provides evidence that taller individuals tended to migrate for a shorter duration.

3.5 Heterogeneity in Return Migrant Selectivity: Uncovering the Role of Unplanned Return Migration

3.5.1 Heterogeneity by Ethnicity

Both those who actually returned and those who planned to return were lower skilled than permanent migrants, but this masks considerable heterogeneity by ethnicity. 31 I have shown that planned return migrants were negatively self-selected across countries but positively self-selected within country. Now I turn to estimating the self-selection of actual return migrants at departure by ethnicity. Figure 3.1 plots the self-selection of return migrants, comparing outgoing migrants to permanent migrants of the same ethnicity. The chart is color coded to display differences between migrants from new source countries (white) and old source countries (black). The pattern shows that most migrants from new source countries were negatively self-selected on occupation at departure. Negative selection reflects the larger number of laborers, farm laborers and miners traveling back to the source country. The patterns are remarkably consistent with ABE's indirect estimates of self-selected migrants; for example, in their estimates and mine, Russian return migrants earned much less than permanent migrants and the Dutch and Flemish earned more than permanent migrants. 32 Although my levels of self-selection are more positive than ABE's, the ordering of countries is similar. 33

31 Heterogeneity in observables besides occupational scores by ethnicity is given in Table B.3.

32 I am comparing my results on the direct estimate of self-selection with ABE's indirect estimates of self-selection in Figure 5 (2014, page 492). The ethnicity "Dutch and Flemish" is matched with the Netherlands and Belgium in this paper; ABE only report positive selection for Belgium.

33 Differences in methodology and years of coverage likely explain the slightly more positive result when estimating self-selection of return migrants using direct data. For a further discussion, see Appendix B.1.

Once again, heterogeneity across countries in self-selection of return migrants could have reflected heterogeneity in who planned to return. In particular, it could be that migrants from new source countries negatively self-selected into planned return migration. In Figure 3.2, I plot both the self-selection of planned return migrants from 1917 to 1924 and the self-selection of actual return migrants in 1920. 34 Figure 3.2 shows that for many countries the self-selection of planned return migrants was actually positive, reinforcing the prior result from Table 3.3 that return migrants were positively self-selected within country.

The more important inference from Figure 3.2 is that within country the self-selection of return migrants at arrival was much more positive than at departure. For example, Slovakian return migrants were strongly positively self-selected at arrival, as return migrants had 21% higher occupational scores than permanent migrants. However, Slovakian return migrants at departure had 4% lower occupational scores than the Slovakian permanent migrants, a switch from strongly positive self-selection to negative self-selection. Something happened in between arrival and departure that drove out more low-skilled Slovakian migrants. This story is common across almost all ethnicities: the expected self-selection of out-migrants was more positive than the actual self-selection of migrants.

The positive self-selection of planned migrants is perhaps unsurprising because low-skilled migrants received high returns in the United States economy. The United States offered higher real wages than other economies during the 1920s, specifically for unskilled workers (Williamson, 1995). Since low-wage workers were more likely to plan to stay, this suggests that permanent migrants were negatively self-selected not only relative to temporary migrants but also relative to the source country's overall population (Borjas and Bratsberg, 1996). Without data on the source country's population, it is impossible to verify this theoretical result.

34 Note that the self-selection of return migrants at departure was more negative for European countries in 1920 (earnings were 8% less on average) than self-selection over the entire 1908 to 1932 time period (2% less on average), suggesting heterogeneity in self-selection across time; this heterogeneity will be explored in a later section.

3.5.2 Actual Return Rates and Planned Return Rates

It is possible that it was not low-skilled migrants who switched their plans to leave the United States, but rather high-skilled migrants who were more likely to switch their plans to stay. This switching would lead from positive self-selection on arrival to negative self-selection on departure. However, this was not what was happening, as we see when comparing expected return rates against actual return rates. The difference between the two rates reflects unplanned returns or stays. Overall, the planned return rate was approximately 13 percent, while the actual return migration rate was 19 percent. 35 The higher actual return migration rates suggest that failures in the United States drove a lot of return migration. 36 Given that the actual return rate could have been over twice as high due to miscounts of out-migrant data (Bandiera, Rasul and Viarengo, 2013), the actual return rate was at least 6 percentage points, or 45 percent, higher than the planned return migration rate.

Table 3.5 shows exactly how optimistic migrants were by ethnicity. Column III shows the difference between planned and actual return migration rates: a positive number (more returns than expected) indicates worse-than-expected outcomes in the United States, while a negative number indicates better-than-expected outcomes in the United States. For example, Polish and Scottish migrants both had low expected return rates, with less than 3 percent of migrants intending to return to their source country. However, the actual return rate for Polish migrants was 14.7% while for Scottish migrants it was 6.7%, suggesting that Polish migrants assimilated poorly into the United States economy relative to Scottish migrants. 37

35 The estimate for the actual rate of return migration is described in Appendix B.2.

36 A second possibility is that source country economic conditions improved, which allowed permanent migrants to return. However, the evidence that many migrants left the United States according to the business cycle points to the importance of United States conditions.

37 It is also possible that Polish migrants experienced positive shocks to returning home while Scottish migrants had a low premium to returning home.
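To make the comparison of planned and actual return rates in Section 3.5.2 concrete, the following minimal Python sketch computes the gap in percentage points and relative to the planned rate. The function name `unexpected_return` is hypothetical and introduced only for illustration; the inputs are the rounded aggregate rates quoted in the text and the Polish and Scotch rows of Table 3.5.

```python
def unexpected_return(planned_pct, actual_pct):
    """Return the gap between actual and planned return rates.

    Gives (gap in percentage points, gap as a percent of the planned rate).
    Hypothetical helper for illustration; not part of the original analysis.
    """
    gap_pp = actual_pct - planned_pct
    relative_pct = 100.0 * gap_pp / planned_pct
    return gap_pp, relative_pct


# Rounded aggregate rates quoted in the text: 13 percent planned, 19 percent actual.
overall_gap_pp, overall_relative_pct = unexpected_return(13.0, 19.0)

# Polish and Scotch rows of Table 3.5 (planned rate, actual rate).
polish_gap_pp, _ = unexpected_return(1.4, 14.7)
scotch_gap_pp, _ = unexpected_return(2.9, 6.7)

print(f"Overall: {overall_gap_pp:.1f} pp, {overall_relative_pct:.0f}% above the planned rate")
print(f"Polish: {polish_gap_pp:.1f} pp; Scotch: {scotch_gap_pp:.1f} pp")
```

With the rounded inputs the relative gap comes out near 46 percent, in line with the roughly 45 percent reported in the text; the small difference presumably reflects rounding of the underlying rates.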

Figure 3.1: Self-Selection of Return Migrants, 1908-1932
Notes: Data is from the Annual Report of the Commissioner General of Immigration (1908-1932), and IPUMS (1910-1930). The vertical axis is the log difference in average occupational scores of return migrants and permanent migrants. A positive value indicates that return migrants had higher occupational scores than the foreign born. The foreign-born occupational score is weighted to match the length of stay of the return migrant population.

Figure 3.2: Planned and Actual Self-Selection of Return Migrants
Notes: Data is from Ellis Island Records (1917-1924), the Annual Report of the Commissioner General of Immigration (1921), and IPUMS (1920). The vertical axis is the log difference in average occupational scores of return migrants and permanent migrants. Expected self-selection is based on migrants on arrival who plan to return; actual self-selection is based on migrants on departure who actually return.

Table 3.5: Planned and Actual Return Rates, 1917-1924

Ethnicity       Planned        Actual         Difference
                Return Rate    Return Rate
Polish            1.4            14.7           13.3
Hebrew            1.8             0.7           -1.1
Magyar            2.2            10.9            8.7
Croatian          2.4             6.8            4.4
Scotch            2.9             6.7            3.8
Syrian            3.5            11.4            7.9
German            5.1             7.0            1.9
Greek             5.2            43.9           38.7
Bulgarian         6.8            50.2           43.4
Russian           8.1            16.8            8.7
Slovak            8.2             6.1           -2.1
Romanian          8.7            39.7           31.0
English          13.9            14.4            0.5
Dutch            14.7            15.8            1.1
Italian          19.0            25.4            6.4
French           20.3             8.9          -11.4
Finnish          21.4            13.7           -7.7
Welsh            24.7             6.7          -18.0
Irish            26.1             4.4          -21.7
Scandinavian     27.1            12.7          -14.4
Spanish          43.6            44.3            0.7

Notes: Data is from Ellis Island passenger manifests and the Annual Reports of the Commissioner General of Immigration (1917-1932). The expected return rate is the percent of incoming migrants who planned to return home. See Appendix B.2 for the calculation of actual return rate.

A pattern of higher-than-expected return rates existed amongst new source countries. Greek, Bulgarian and Romanian migrants had return rates that were over 30 percentage points higher than their expected return rate. This could have been due to discrimination against new source country migrants that coincided with social movements in the 1920s determined to Americanize arriving migrants (Lleras-Muney and Shertzer, 2014). On the other hand, for many old source country migrants, the actual return rate was close to or below the expected rate. For example, Irish and Scandinavian actual return rates were 21.7 and 14.4 percentage points below their expected rates, respectively. Some old source country migrants appear to have experienced better-than-expected outcomes in the United States.

The new source country migrants experienced both negative self-selection and higher unexpected return rates. This correlation is clear when plotting self-selection against unexpected return rates for all countries, as seen in Figure 3.3. There is a negative correlation between the two: the higher the unexpected return migration rate, the more negative is the self-selection of return migrants. The R² for the figure is 0.19, suggesting that unexpected return rates explain approximately one-fifth of the variation in the self-selection of migrants across countries.

Figure 3.3: Unexpected Returns and Self-Selection, 1920s
Notes: Data is from Ellis Island Records (1917-1924), the Annual Report of the Commissioner General of Immigration (1921), and IPUMS (1920). The vertical axis is the log difference in average occupational scores of return migrants and permanent migrants. The horizontal axis is the difference between the actual return rate and expected return rate; a positive number indicates higher unexpected returns. The expected return rate is the percent of incoming migrants who planned to return home. See Appendix B.2 for the calculation of actual return rate.

3.5.3 Transferring of Skill across Countries

An obvious candidate to explain why migrants left the United States against expectations is that their skills did not transfer to the United States. If human capital acquired in the source country is not rewarded in the United States, a pattern that is true for recent migrants (Friedberg,

2000), then migrants may be more likely to return home. This is commonly known as occupational downgrading upon arrival: a doctor in the source country may be a laborer in the host economy.

A simple test of whether skills transferred across countries is to estimate the correlation between skills on arrival and skills in the census. This is done with aggregate data, measuring how a migrant cohort's skills changed from arrival to the Census. Skills on arrival (found in the RCI) should reflect jobs in the source country, while skills in the Census (found in IPUMS) reflect jobs in the United States. However, since return migration may bias a cohort's skills across time, this correlation can only be estimated for very recently arrived cohorts in the Census. Using IPUMS data, I calculate the average logged occupational score for those who have been in the United States for one year or less; then I regress this score on the skills of migrants at arrival for the same year. For example, Germans' occupational scores in the 1910 US census for those who arrived in 1909 or 1910 are regressed on German occupational scores in the RCI data of those who arrived in 1909 or 1910. 38 If skills are perfectly transferred from arrival to the United States, then the correlation between the two should be close to one. To create this sample, I only include ethnicities where there are more than 15 people with jobs observed in the recent cohort; this leads to only 41 observations.

The results are reported in Table 3.6. The first column reports that the elasticity between incoming migrant skills and skills in the census is 0.333. This suggests that a 10% increase in an incoming migrant cohort's skills is associated with a 3.3% increase in occupational score in the United States. 39 This number is rather low, and could give an indication of why some migrants decided to leave the United States if they were unable to acquire a job similar to the one they had in their source country.
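The cohort-level regression described above can be sketched in a few lines of Python. The occupational scores below are hypothetical placeholders constructed to have a constant elasticity of 0.333 (they are not the RCI or IPUMS data), and the helper names `ols_slope_intercept` and `implied_pct_change` are introduced only for illustration. The sketch fits a log-log line by ordinary least squares and then converts the elasticity into the percent change implied by a 10% increase in arrival skills.

```python
import math


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def implied_pct_change(elasticity, pct_increase):
    """Exact percent change in the outcome implied by a log-log elasticity."""
    return 100.0 * ((1.0 + pct_increase / 100.0) ** elasticity - 1.0)


# Hypothetical cohort-average occupational scores at arrival (placeholders).
arrival_scores = [18.0, 20.0, 22.0, 25.0, 28.0]
# Census scores generated with a constant elasticity of 0.333 (an assumption).
census_scores = [8.0 * score ** 0.333 for score in arrival_scores]

log_arrival = [math.log(score) for score in arrival_scores]
log_census = [math.log(score) for score in census_scores]
elasticity, intercept = ols_slope_intercept(log_arrival, log_census)

print(f"Estimated elasticity: {elasticity:.3f}")
print(f"Effect of a 10% skill increase: {implied_pct_change(elasticity, 10.0):.1f}%")
```

The linear approximation (0.333 times 10%) gives the 3.3% quoted in the text; the exact log-log calculation gives about 3.2%.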
This low correlation between prior jobs and US jobs masks heterogeneity across new and old source countries. I separate the two groups in Columns II and III to estimate the correlation within each set of countries. For old source countries the correlation is 0.606, indicating that jobs were much more easily transferred for these migrants. On the other hand, for new source countries the correlation of skills is 0.264. Column IV tests whether the difference in correlation between new and old source countries is statistically significant by interacting the incoming migrant cohort's skills with whether the cohort is from a new source country; indeed, new source country migrants' skills do not transfer as well as the skills of old source country migrants. An explanation for the difference in skill transference is that old source countries have established networks that could possibly provide job connections (Hatton and Leigh, 2011; Beaman, 2012; Patel and Vella, 2013); however, even after controlling for the foreign-born stock and the average years the stock had been in the United States, the correlation between migrant skills in the source country and host country remains weak.

38 Technically, I use the 1909 and 1910 RCI data because they report jobs on fiscal years (July 1st to June 30th). This would introduce a slight bias because the census year and the year of arrival do not coincide.

39 The regression treats skills in the source country and the United States as constant; however, the ordering of skills according to United States prices is likely not too different within Europe.

Table 3.6: Transferring Skills from Source Country to the United States

Column                          I          II         III        IV         V
Sample                          All        Old        New        All        All

Log Occ. Score at Arrival       0.333***   0.606***   0.264*     0.689***   0.656***
                                (0.0921)   (0.100)    (0.135)    (0.127)    (0.104)
Log Occ. Score on Arrival                                        -0.467***  -0.470***
  x New Source Country                                           (0.168)    (0.161)
New Source Country                                               3.256***   3.225***
                                                                 (1.142)    (1.086)
Log Years in USA                                                 0.0255     -0.0620
                                                                 (0.0542)   (0.0618)
Log Network Size                                                 0.0316*    0.0356**
                                                                 (0.0160)   (0.0150)
Year 1920                       0.0736* (0.0375)
Year 1930                       0.0922*** (0.0334) (0.0332)

Observations                    41         16         25         41         41
R-squared                       0.249      0.708      0.152      0.415      0.466

Notes: Data is from RCI (1909-1910; 1919-1920; 1929-1930) and IPUMS (1910-1930). The dependent variable is the average logged occupational score of a recently arrived immigrant cohort (≤ 1 year in US), as measured in the Census. An observation is an ethnicity-year. New source countries are from Eastern and Southern Europe; the reference group for new source countries is old source countries from Northern and Western Europe. The reference year is 1910.
The regressions suggest that higher skills at home did not perfectly transfer to higher skills in the United States, which serves as a potential explanation for why new source country migrants left at higher rates. However, many of these migrants actually enjoyed better paying jobs in the United States. In other words, a regression of United States occupational scores on home country occupational scores has a slope of less than one, but the intercept is positive. Rather than being a story of occupational downgrading, as is the case for migrants today, many new source country migrants upgraded their occupation in the United States. This occurred because numerous incoming migrants were farmers or farm laborers in their source country, typically a low-paid occupation, while in the United States they worked as general laborers, miners or in the manufacturing sector. This switch in occupation actually led to higher occupational scores, which theoretically should lower the rate of return. The increase in occupational scores on arrival casts some doubt on the hypothesis that lower-skilled migrants were not able to acquire higher paying jobs; however, this result does not rule out that while migrants claimed better jobs, they could have experienced high unemployment, which would lower annual income.

3.5.4 Heterogeneity by Time

How did the self-selection of return migrants change from 1910 to 1930? I use the RCI data from the 1911, 1921 and 1931 fiscal years (for example, the 1911 fiscal year records out-migrants leaving from July 1910 to June 1911) to compare return migrants to the migrant stock recorded immediately beforehand in the 1910, 1920 and 1930 Censuses. Migrants returning in the fiscal year following the decadal census are drawn precisely from the distribution just observed in the census, increasing the precision of self-selection estimates (Fernández-Huertas Moraga, 2011). A particularly interesting pattern is the self-selection of return migrants over time for old and new source countries, displayed in Figure 3.4.
For both 1910 and 1920, new migrants were more negatively self-selected than old migrants. The pattern of negative self-selection intensified from 1910 to 1920, which could be because those migrants who intended to leave following World War I were unable to, and these were lower-skilled migrants. While the 1920 result is partially contaminated by World War I, it is interesting that both old and new source country migrants were similarly affected.

Figure 3.4: Self-Selection of Return Migrants, 1910-1930
Notes: Data is from the Annual Reports of the Commissioner General of Immigration (1911, 1921, and 1931) and IPUMS (1910, 1920, and 1930). New source countries are those from Eastern and Southern Europe, and Asia. Old source countries are those from Northern and Western Europe and the Western Hemisphere.

The gap in self-selection between old and new migrants was completely erased by 1930, when migrants returning to new source countries earned 2.2% more than the stock and return migrants to old source countries earned 2.1% more than the stock. Now return migrants were on average positively self-selected. Something occurred between 1920 and 1930 that led new source country return migrants to be more positively self-selected. While this result is a natural corollary to the second chapter's result that quotas caused the lowest-skilled migrants' return rates to fall the most, it is a new finding that the self-selection for both new and old source countries turned positive. If the self-selection of return migrants was positive, it is possible that return migration actually lowered the skills of the migrant stock remaining in the United States, the opposite of the direction commonly assumed and found in the literature (Abramitzky, Boustan and Eriksson, 2014; Borjas,

1985; Lubotsky, 2007). To uncover the overall effects of quotas and self-selection of return migrants on the migrant stock, I turn to analyze the effect of quotas on those in the census.

3.6 The Effect of Migration Quotas on Return Migration

It appears that migration quotas led to return migrants being more positively self-selected on occupation. One possible explanation for this is that fewer low-skilled migrants failed following migration quotas, perhaps due to a labor supply shock in foreign-born labor markets. This is the mechanism suggested in the second chapter as the reason why return migration rates fell following the implementation of quotas. Using that chapter's estimate for the effect of quotas on return migration rates, a back-of-the-envelope calculation suggests that quotas caused a fall in out-migration rates of at least 11.8 percentage points. 40 The corresponding fall in planned return migration rates was approximately 7 percentage points, which is less than the drop in actual return migration rates. 41 Since the actual return migration rate dropped more than the planned return migration rate, the unexpected return rate also fell, evidence consistent with a labor supply shock leading to fewer failures. According to the correlation between unexpected return rates and negative self-selection, a drop in the failure rate should lead to more positive self-selection.

As more low-skilled migrants stayed or as return migrants became more positively self-selected, the migrant cohort could remain low-skilled over time; this is opposed to the typical suggestion that negatively self-selected return migrants lead to an increase in a migrant cohort's average wage over time. If the effect of migrant quotas was strong enough to keep a large percentage of low-skilled migrants in the United States, then the migrant stock would have a higher fraction of low-skilled migrants.
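The back-of-the-envelope calculation can be written out explicitly. The short Python sketch below uses only figures reported in the text and its footnotes (a pre-quota return rate of 21.5 percent, a 55 percent fall in the out-migration rate, and a 7 percentage point fall in the planned return rate); the variable names are illustrative.

```python
# Figures reported in the text and its footnotes.
pre_quota_return_rate_pct = 21.5       # European return rate before the 1921 quota
relative_fall_in_outmigration = 0.55   # fall in out-migration for a 60 percent restriction
fall_in_planned_rate_pp = 7.0          # fall in the planned return rate, percentage points

# Implied fall in the actual return rate, in percentage points.
fall_in_actual_rate_pp = pre_quota_return_rate_pct * relative_fall_in_outmigration

# Unexpected returns are actual minus planned returns, so the fall in the
# unexpected rate is the difference between the two falls.
fall_in_unexpected_rate_pp = fall_in_actual_rate_pp - fall_in_planned_rate_pp

print(f"Fall in actual return rate: {fall_in_actual_rate_pp:.1f} pp")
print(f"Fall in unexpected return rate: {fall_in_unexpected_rate_pp:.1f} pp")
```

The implied fall in the unexpected return rate, a little under 5 percentage points, is a derived illustration rather than a figure reported in the thesis, and it inherits the caveat that 11.8 percentage points is a lower bound.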
40 The second chapter suggests that a 60 percent restriction of the flow led to a 55 percent fall in the out-migration rate. The return migration rate for European countries, whose migrants mostly entered through Ellis Island, was approximately 21.5 percent prior to the 1921 quota. A 55 percent drop from a 21.5 percent return rate is 11.8 percentage points. However, the fall is at least 11.8 percentage points because the out-migration rate prior to quotas (21.5 percent) is likely an underestimate of the true out-migration rate (Bandiera, Rasul and Viarengo, 2013).

41 This 7 percentage point drop is based on a simple difference-in-difference regression similar to the coefficient on New Source Country x Post 1921 in Table 3.3, but with the entire group of entering migrants in the sample rather than just those holding jobs.

There is evidence that migrant quotas increased the skills of entering migrants for those who had jobs, likely because institutional constraints led to a higher cost of traveling to the United States and restricted lower-skilled individuals from migrating (Hatton and Williamson, 2005; Massey, 2012). If selection into return migration did not change following quotas, a skill increase at arrival should survive when the cohort is observed years later. However, if more low-skilled migrants remained in the United States for those countries most heavily restricted by quotas, then the skill increase on arrival should be diminished when the cohort is observed later, because low-skilled migrants who would normally have returned home in the absence of quotas decided to stay.

3.6.1 Effects on Immigrants at Arrival and Immigrants in Census

I estimate the effects of migration quotas on migrants at arrival and migrants observed years later in the census to see how any effect survived over time. In order to do this, I need to make additional adjustments to the data. First, to capture recent changes in the migrant stock that are due to return migration, I only include people who have been in the United States for ten years or less. 42 Since I am comparing the aggregated RCI data to Census data, I collapse the Census data into ethnicity/year-of-arrival cells from 1908 to 1929. 43 However, since the Census does not have observations in every ethnicity by year-at-arrival cell (for example, there are no Koreans in 1916), I drop ethnicities that do not have observations every year. This creates a balanced panel of 440 observations (20 ethnicities by 22 years), which are mostly European but also include Mexican and Spanish-American ethnicities. Finally, to account for other shocks to migration such as the 1917 literacy test and World War I, I include controls for the literacy rate of incoming migrants and World War I death rates. 44 In addition, I add controls for the home country's GDP, taken from Barro and Ursúa (2010).
World War I death rates and GDP data are rough estimates at best and likely contain much measurement error, but they will reduce omitted variables bias to the extent that they capture damage from World War I and the business cycle at home.

42 This prevents multiple censuses from contributing to an ethnicity/year-at-arrival cell. For example, people who arrived in 1911 from Italy are counted only in the 1920 census, not again in 1930, by which time they had been in the United States for 19 years.

43 The years 1908 to 1929 are chosen because out-migrant data does not exist prior to 1908 and the 1930 United States Census occurred in April, which would not observe the entire year of migration.

44 The World War I death rates are taken from the second chapter. For ethnicities without a specifically defined country, I proxy the death rate with the nearest country. However, not much faith should be put into this data because they are very rough estimates.

I use a regression framework to estimate the effect of the quotas on immigrants' occupation scores at arrival. Using a panel of ethnicities across years 1908-1929, I use a difference-in-difference framework where the treatments ($\mathrm{Treat}_t$) are different indicators for the 1921 quota, the 1924 quota, and an intensity measure of quota restrictiveness that estimates the fraction of the potential incoming flow that is restricted from entering (see second chapter). 45 The intensity measure, labeled Quota Restriction, takes advantage of the two changes in the dose of quotas in 1921 and 1924, which more properly identifies the effects of quotas than a blunt dummy variable for treated and non-treated.

$$\log(\mathrm{OccScore}_{jt}) = \alpha_0 + \alpha_1 \, \mathrm{New}_j \times \mathrm{Treat}_t + \delta X_{jt} + \nu_j + \phi_t + \epsilon_{jt} \qquad (3.4)$$

Equation (3.4) estimates the effects of the quota on immigrant occupational scores. In addition to ethnicity ($\nu_j$) and year fixed effects ($\phi_t$), I include controls for World War I death rates, literacy rates and the home country's logged GDP in $X_{jt}$. I verify that the empirical strategy is valid in a subsequent section by testing for pre-treatment trend differences.

The results of Equation (3.4) are presented in Panel A of Table 3.7. The results are consistent with the previous literature where quotas increased immigrant skills upon arrival, with an increase of 5.4% after the 1921 quota, or 7.1% after the 1924 quotas. When using both treatment variables, the 1924 quota had the larger effect on increasing occupation scores. Column IV shows that restricting migrant flows by 60% (the average quota restriction) increased occupational scores of immigrants upon arrival by 5.0%. While a priori it is unclear that a quota would increase skills since the quota is first-come first-in, the 1920s quotas clearly did. 46 Importantly, these results hold after controlling for literacy rates of the incoming migrant cohort.
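Equation (3.4) can be illustrated with a small self-contained sketch. The Python code below estimates the interaction coefficient of a two-way fixed effects regression on a balanced panel by demeaning the outcome and the treatment within ethnicity and within year; for a balanced panel with a single regressor this reproduces the least-squares coefficient obtained with ethnicity and year dummies. It omits the controls in X and clustered standard errors, and the panel is synthetic, built with a known effect of 0.05 so that the estimator can be checked. The function names (`quota_restriction`, `two_way_fe_coefficient`) and all data values are assumptions for illustration, not the thesis's code or data.

```python
def quota_restriction(quota_allowance, potential_flow):
    """Share of the potential flow barred from entry: 1 - quota / potential flow.

    Clamping to the [0, 1] range is an assumption made here for illustration.
    """
    return min(1.0, max(0.0, 1.0 - quota_allowance / potential_flow))


def two_way_fe_coefficient(y, d):
    """Coefficient on d in y = a*d + unit effect + period effect + error.

    y and d are dicts keyed by (unit, period) and must form a balanced panel.
    """
    units = sorted({u for u, _ in y})
    periods = sorted({t for _, t in y})

    def demean(v):
        # Subtract unit and period means and add back the grand mean.
        grand = sum(v.values()) / len(v)
        unit_mean = {u: sum(v[(u, t)] for t in periods) / len(periods) for u in units}
        period_mean = {t: sum(v[(u, t)] for u in units) / len(units) for t in periods}
        return {(u, t): v[(u, t)] - unit_mean[u] - period_mean[t] + grand
                for u in units for t in periods}

    y_dm, d_dm = demean(y), demean(d)
    numerator = sum(d_dm[k] * y_dm[k] for k in y_dm)
    denominator = sum(value ** 2 for value in d_dm.values())
    return numerator / denominator


# Synthetic balanced panel: hypothetical ethnicities and log occupational scores.
is_new_source = {"A": 1, "B": 1, "C": 0, "D": 0}
unit_effect = {"A": 3.00, "B": 3.05, "C": 3.20, "D": 3.25}
years = range(1917, 1925)
true_effect = 0.05

treat = {(j, t): is_new_source[j] * int(t >= 1921) for j in is_new_source for t in years}
log_occ = {(j, t): unit_effect[j] + 0.01 * (t - 1917) + true_effect * treat[(j, t)]
           for j in is_new_source for t in years}

alpha_1 = two_way_fe_coefficient(log_occ, treat)
print(f"Estimated alpha_1: {alpha_1:.4f}")
```

Replacing the 0/1 treatment with a continuous measure such as the output of `quota_restriction` would give an intensity-style specification along the lines of the Quota Restriction variable described above.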
The (unreported) coefficient on literacy rate is approximately 0.40, suggesting that a one percentage point increase in the literacy rate increased occupational scores by 0.4%.

45 The variable for quota restrictiveness is a 0 to 1 measure where a 0.60 value implies that the incoming flow is restricted by 60%. The measure is one minus the ratio of the quota allowance to the potential migrant flow, as proxied by the average flow from 1908 to 1914.

46 After the 1920s quotas were put in place, migrants were screened by consulates in the source country. This process could have restricted low-skilled migrants.

Table 3.7: Effect of Quotas on Immigrants' Occupational Score

                                             (1)        (2)        (3)        (4)
Panel A: Immigrants at Arrival
New Source Country x Post 1921               0.0539                0.0135
                                             (0.0352)              (0.0359)
New Source Country x Post 1924                          0.0709**   0.0607**
                                                        (0.0327)   (0.0279)
Quota Restriction                                                             0.0829*
                                                                              (0.0399)
Panel B: Immigrants at Census
New Source Country x Post 1921               -0.00938              -0.0337
                                             (0.0198)              (0.0210)
New Source Country x Post 1924                          0.0111     0.0385*
                                                        (0.0217)   (0.0221)
Quota Restriction                                                             -0.00782
                                                                              (0.0203)

Ethnicity FE                                 X          X          X          X
Year FE                                      X          X          X          X
Literacy Rates, Home GDP and World War I     X          X          X          X

Notes: Data is from Annual Report of the Commissioner General of Immigration (1908-1929) for Panel A and IPUMS (1910-1930) for Panel B. Both panels are ethnicity/year-at-arrival panel data with 440 observations. Standard errors are clustered by ethnicity for both panels. The dependent variable is the logged occupational score. Additional controls include literacy, the home country's GDP and World War I death rates. New source countries are from Eastern and Southern Europe; the reference group for new source countries is old source countries from Northern and Western Europe and the Western Hemisphere.

Migrant skills at arrival clearly increased, but what happened to a cohort's skill when it is observed years later? While I have motivated the difference between skills on arrival and skills observed years later as mainly due to selective return migration, it is also attributable to occupations not transferring across borders. In Table 3.6, I estimated that occupations from all source countries correlate with United States occupations by 0.333, and for new source countries by only 0.264. After correcting for this occupational transference, it is possible that the surviving migrant cohort will still have lower skills because of positively self-selected return migration.

I estimate the effect of quotas on new and old immigrants using the same econometric method as for incoming migrants (Equation (3.4)), but now with migrant cohorts observed in the Census. The results are in Panel B of Table 3.7. There is no evidence that quotas increased the skills of the migrant stock remaining in the United States. Column I shows that there was no effect of quotas on skill when measured after the 1921 quota, and Column II shows the same for the 1924 quota. Column III separates the effects of the 1921 and 1924 quotas and finds that the estimated coefficient is negative for 1921 and positive for 1924. Both of these estimated effects are smaller than the effect of quotas on incoming migrants; however, the census data does show that the 1924 quota increased occupational scores. This could be because return migration has less time to change a cohort's average skill the closer the year of arrival is to the census enumeration date. Finally, Column IV shows that the more precise measure of quota restriction has essentially zero effect on a cohort's occupational score. Thus, the effect of quotas on migrants' occupations at arrival does not show up when using the Census data.

3.6.2 Effects on Return Migrants at Departure

Selective return migration could explain why quotas increased skills at arrival but not years later if low-skilled migrants were more likely to stay permanently. Part of the explanation could be due to a low correlation of occupations across borders; to separate the two mechanisms, I look at the skills of those migrants who were leaving the United States. If the skills of return migrants increased following quotas, then selective return migration would partially wipe out the effect of higher incoming migrant skills. I use the same method to estimate the effect of quotas on out-migrants, but using out-migrant occupational scores as the dependent variable. One caveat is that measures for out-migrants are based on year of departure, not the year of arrival.
Out-migrants leaving in a given year are a mix of temporary migrants across cohorts, but with the data it is impossible to nail down the effect of quotas on temporary migrants for a given cohort.

Table 3.8 shows the results of the regression using out-migrants' occupational scores. The regression results are mostly consistent with quotas increasing out-migrants' occupational scores; however, the only statistically significant result is seen when using the intensity measure of quota restrictiveness. A 60% quota restriction increases the skills of out-going migrants by approximately 3.2%.

Table 3.8: Effect of Quotas on Out-Migrants' Occupational Score

Variables                                    (1)        (2)        (3)        (4)
New Source Country x Post 1921               0.0244                0.0299
                                             (0.0260)              (0.0231)
New Source Country x Post 1924                          0.0136     -0.00815
                                                        (0.0252)   (0.0178)
Quota Restriction                                                             0.0530*
                                                                              (0.0282)

Ethnicity FE                                 X          X          X          X
Year FE                                      X          X          X          X
Literacy Rates, Home GDP and World War I     X          X          X          X
Observations                                 440        440        440        440
R²                                           0.791      0.789      0.791      0.795

Notes: Data is from Annual Report of the Commissioner General of Immigration (1908-1929). Standard errors are clustered by ethnicity. Additional controls include literacy, the home country's GDP and World War I death rates. New source countries are from Eastern and Southern Europe; the reference group for new source countries is old source countries from Northern and Western Europe and the Western Hemisphere.

Selective return migration does have a role to play in why quotas did not increase occupational scores for those at the census. Given an average 5% increase in occupation scores for migrants at arrival and a correlation of 0.264 of occupational skills across borders for new source country migrants, the occupational scores for cohorts at the census should have increased by approximately 1.3%. Rather, the occupational scores of migrants observed at the census increased by 0.0%, creating a discrepancy between skills transferring at arrival and occupations at the census. Since out-migrants' occupational scores increased by 3.2% and return migration fell by 55% following quotas, it is reasonable that the quotas' effect on selective return migration partially eliminated the increase of skills on arrival.
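The arithmetic behind this comparison can be laid out step by step. The Python sketch below uses only coefficients reported in Tables 3.6, 3.7 and 3.8 together with the 60% average quota restriction; the variable names are illustrative.

```python
average_restriction = 0.60        # average quota restriction quoted in the text
arrival_coefficient = 0.0829      # Table 3.7, Panel A, Quota Restriction
outmigrant_coefficient = 0.0530   # Table 3.8, Quota Restriction
skill_transfer_new = 0.264        # Table 3.6, Column III (new source countries)

# Effect of a 60% restriction on occupational scores at arrival, in percent.
arrival_effect_pct = 100 * arrival_coefficient * average_restriction

# Effect expected at the census if only imperfect skill transfer were at work.
expected_census_effect_pct = arrival_effect_pct * skill_transfer_new

# Effect of a 60% restriction on out-migrants' occupational scores, in percent.
outmigrant_effect_pct = 100 * outmigrant_coefficient * average_restriction

print(f"At arrival: {arrival_effect_pct:.1f}%")
print(f"Expected at census: {expected_census_effect_pct:.1f}%")
print(f"Out-migrants: {outmigrant_effect_pct:.1f}%")
```

The gap between the expected census effect of about 1.3% and the essentially zero effect reported for the census is the portion the text attributes to selective return migration.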

3.6.3 Robustness Checks

Validity of Empirical Strategy

To convincingly show that a difference-in-difference strategy is causal, there must be no unobservable differences that vary over time between new and old source country migrants. One way to show this is by verifying that before quotas are implemented, new source country migrants' and old source country migrants' skills trend together. While visual evidence from Figures 3.5, 3.6, and 3.7 suggests that new migrants' and old migrants' skills trend closely together prior to migration restrictions, I use placebo treatment effects in a regression framework to argue that the empirical strategy is valid. More specifically, I use data on pre-treatment years (before the first 1921 quota) and vary the treatment year from 1909-1921 to determine if an effect of quotas would show up prior to 1921. If my empirical strategy is valid, there should not be an effect of placebo quotas on skills of migrants. I run placebo tests in three separate samples: one using RCI data for immigrant skills upon arrival, one using IPUMS data for immigrants observed years later at census, and one using RCI data for emigrants' skills upon departure.
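The placebo procedure can be sketched as a loop over hypothetical treatment years. The Python code below restricts a synthetic panel to pre-quota years, builds the outcome with common trends and no treatment effect, and re-estimates a two-way fixed effects coefficient for each placebo year; the estimates should then be indistinguishable from zero. The helper `two_way_fe_coefficient`, the 1908-1920 window, and all data values are assumptions for illustration; the thesis's regressions also include controls and clustered standard errors, which are omitted here.

```python
def two_way_fe_coefficient(y, d):
    """Coefficient on d with unit and period effects (balanced panel only)."""
    units = sorted({u for u, _ in y})
    periods = sorted({t for _, t in y})

    def demean(v):
        grand = sum(v.values()) / len(v)
        unit_mean = {u: sum(v[(u, t)] for t in periods) / len(periods) for u in units}
        period_mean = {t: sum(v[(u, t)] for u in units) / len(units) for t in periods}
        return {(u, t): v[(u, t)] - unit_mean[u] - period_mean[t] + grand
                for u in units for t in periods}

    y_dm, d_dm = demean(y), demean(d)
    return (sum(d_dm[k] * y_dm[k] for k in y_dm)
            / sum(value ** 2 for value in d_dm.values()))


# Synthetic pre-quota panel with parallel trends and no treatment effect.
is_new_source = {"A": 1, "B": 1, "C": 0, "D": 0}
unit_effect = {"A": 3.00, "B": 3.05, "C": 3.20, "D": 3.25}
pre_years = range(1908, 1921)
log_occ = {(j, t): unit_effect[j] + 0.01 * (t - 1908)
           for j in is_new_source for t in pre_years}

placebo_estimates = {}
for placebo_year in range(1909, 1921):
    # Pretend the quota began in placebo_year for new source countries.
    fake_treat = {(j, t): is_new_source[j] * int(t >= placebo_year)
                  for j in is_new_source for t in pre_years}
    placebo_estimates[placebo_year] = two_way_fe_coefficient(log_occ, fake_treat)

for year, estimate in placebo_estimates.items():
    print(year, f"{estimate:.6f}")
```

If the two groups instead followed different trends before 1921, the placebo coefficients would drift away from zero, which is the pattern the test is designed to detect.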

Figure 3.5: Immigrant Occupational Scores by Cohort, Measured Upon Arrival
Notes: Data is from the Annual Report of the Commissioner General of Immigration (1908-1929). The residual logged occupational scores are after controlling for ethnicity, year of arrival, literacy rates, World War I death rates and the home country's logged GDP. The shaded area is during World War I, and the two vertical lines coincide with the 1921 and 1924 quotas.

Figure 3.6: Immigrant Occupational Scores by Cohort, Measured at Census
Notes: Data is from the recent migrant stock of the past ten years, observed in IPUMS (1910-1930). Year is the year the cohort arrived, but note that cohorts are only measured later in the 1910, 1920, and 1930 censuses. The residual logged occupational scores are after controlling for ethnicity, year of arrival, literacy rates, World War I death rates and the home country's logged GDP. The shaded area is during World War I, and the two vertical lines coincide with the 1921 and 1924 quotas.

Figure 3.7: Return Migrant Occupational Scores
Notes: Data is from the Annual Report of the Commissioner General of Immigration (1908-1929). The residual logged occupational scores are after controlling for ethnicity, year of departure, literacy rates of the cohort three years ago, World War I death rates and the home country's logged GDP. The shaded area is during World War I, and the two vertical lines coincide with the 1921 and 1924 quotas.

Table 3.9: Placebo Tests on Immigrant and Emigrant Skills

Year                          Immigrants      Immigrants      Emigrants
                              Upon Arrival    at Census
1909                          0.0227          -0.0266         0.0121
1910                          0.0409          -0.0269         -0.00265
1911                          0.0337          -0.0246         0.00511
1912                          0.0302          -0.0113         -0.00443
1913                          0.0356          -0.0104         -0.00926
1914                          0.0459          -0.00739        -0.0115
1915                          0.058           0.000993        -0.0189
1916                          0.0623          0.00109         -0.0149
1917                          0.0542          0.00654         0.000965
1918                          0.0434          0.00370         0.0112
1919                          0.035           0.0414          0.00559
1920                          -0.00409        0.0351          0.0096

Ethnicity FE                  X               X               X
Year FE                       X               X               X
Literacy, Home GDP and WWI    X               X               X
Observations                  280             280             280

Notes: Data is from IPUMS (1910-1930) and RCI (1908-1921). The dependent variable is the logged occupational score. Each estimate is from a separate regression with a treatment variable for that year. Standard errors are clustered by ethnicity. *p<0.10, **p<0.05, ***p<0.01

The results of placebo tests on migrant skills are shown in Table 3.9. Each number in the columns represents a separate regression where the treatment variable is the year in the leftmost column. There is no statistically significant difference between new and old migrants, even going through World War I.

3.7 Conclusion

Of the millions who decided to immigrate to the United States in the early 20th century, many returned home. Migrants arrive in a country with expectations about how well they will perform; reality deviates from expectations as migrants perform either better or worse than expected. Fundamental to shaping the characteristics of the migrant stock are changes in both the expectations and reality of earnings in the United States economy. If migrants are too optimistic about their chances, then they will decide to return home and temporary migration will increase, possibly draining savings out of the economy.

Migrants from new source countries had low planned return rates but high actual return rates. Many low-skilled migrants did not survive the United States labor market, perhaps because of intense competition within the migrant pool or because of discrimination following World War I. Those with highly substitutable characteristics tended to leave more rapidly: prime-aged males living in the Northeast faced substantial competition from inflows of migrants, and they were also driven out.

Migration policy can alter return migration by either changing the planned return rate or changing the failure rate in the United States. The migration quotas of the 1920s changed both, lowering the planned return migration rate and the failure rate. The result was that fewer low-skilled migrants left the United States; instead, they remained in the migrant stock, ultimately slowing the rate of assimilation.

Migration policy aimed at affecting the characteristics of the migrant stock needs to account not just for effects on the incoming migrant flow but also on the outgoing migrant flow. It is possible that entry restrictions change the behavior of those already in the United States or those lucky enough to cross the border, which in the end alters the effects of policy on the overall migrant stock (Angelucci, 2012). Movement back to a free migration system that drastically lowers the costs of migration, as had been the case prior to the migration quotas, would encourage more migration, much of which would likely be temporary. The United States still offers a wage premium over many countries; while some may worry that a free migration system would lead to a large increase in the foreign-born stock, many migrants would likely return home. Further, pure economic forces of competition amongst migrants seem to lead to a better quality migrant stock remaining in the United States.

Chapter 4

Who Crossed the Border? Self-Selection of Mexican Migrants in the Early Twentieth Century

Co-authored with Edward Kosack
Forthcoming in the Journal of Economic History

4.1 Introduction

Through the beginning of the twentieth century, Europeans dominated migrant flows to the United States, arriving freely with few laws restricting entry. This era of free mass migration ended abruptly in the 1920s with the Immigration Acts of 1921 and 1924, which imposed quotas to curtail European migration to the United States; however, migration from Mexico remained relatively unrestricted.[1] More individuals from Mexico arrived in the United States during the 1920s than did migrants from many European countries (see Figure 4.1). Mexican migrants became an increasingly important source of labor in the United States in the early twentieth century, yet little is known about those who decided to migrate and, among the migrants, those who decided to either stay or return.

[1] The Emergency Immigration Act of 1921 and Immigration Act of 1924 placed annual limits on European migration while imposing no restrictions on Western Hemisphere countries. While Mexican migrants were not limited by quotas, the Immigration Act of 1917 did require all migrants to pass a literacy test and to pay an eight dollar head tax. World War I also caused a steep drop in migration, leading some economists to cite 1913 as the end of the Age of Mass Migration (Hatton and Williamson 1998).

Figure 4.1: Immigrant Flows to the United States, 1900-1929
Notes: Immigrant flows are aggregated in five year bins. Source: Historical Statistics of the United States (Carter et al., 2006)

In this paper we measure the pattern of selection into migration, and then examine whether there is any differential selection into return migration. Because only some individuals are willing to cross borders and leave their native land, the quality of migrants relative to those who remain behind could affect the home and host economies through multiple channels (Borjas 1987). For the United States, the specific pattern of selection affects both migrant assimilation (Chiswick 1978; Borjas 1985; Ferrie 1999) and the return to migration (Abramitzky et al. 2012). For Mexico, whether those leaving were of higher or lower quality than those staying is important for understanding potential brain drain (Gibson and McKenzie 2011), as well as income inequality (McKenzie and Rapoport 2007). If migrants were better than the general Mexican population in terms of productivity, education or health, then they were positively self-selected.[2]

[2] George Borjas (1987) defined selection not only in terms of comparing migrants' wages to the home country's distribution, but also in terms of how they compared to the host country's distribution of wages. We follow the recent direction of the literature comparing migrants only to those in the home country (see Daniel Chiquiar and Gordon Hanson (2005)).

To determine the pattern of selection one can compare the wages that Mexican migrants would earn in Mexico to the wages of those in Mexico who do not migrate (Borjas 1987). However, migrants and their wages are typically only observed in the host country. As prices for skills vary from country to country, comparing wages once migrants have crossed the border does not give the proper counterfactual. Techniques used to circumvent this problem and generate the appropriate comparison of migrants and non-migrants include propensity score matching (Chiquiar and Hanson 2005) and sibling fixed effects (Abramitzky et al. 2012).

Further, in many cases individual wages are not known. Some studies of historical selection use aggregated measures of human capital, such as occupational scores, to compare movers to stayers (Abramitzky et al. 2012; Collins and Wanamaker 2014). However, if occupations reported in the historical immigration statistics were not representative of an individual's place in the skill distribution because migrants listed downgraded occupations on arrival that reflected labor demand in the United States, then these data would systematically underestimate the true quality of a migrant worker. Additionally, occupational scores are not specific to the individual and do not allow us to look at how migrants differed from the home population within reported occupation or skill class.

We use height as an alternative measure of the historical self-selection of Mexican migrants. A long literature argues that greater stature is correlated with higher earnings, greater intelligence, and better health; in other words, height is positively correlated with quality (Steckel 1995; Steckel 2009). A migrant's height does not change as he crosses the border into the United States, unlike occupation or wages. Further, height gives a partial measure of human capital that is specific to the individual, which is important when there is little variation in migrant occupation. Since the vast majority of Mexican migrants claimed laborer or miner as their occupation, we are able to determine if the United States received the better laborers or the better miners by using height data.
Much of the migration from Mexico to the United States was temporary, and many individuals returned home instead of settling permanently (Gratton and Merchant 2013). Measuring the selection into migration is not sufficient for understanding the effect of migrants on the labor force in both Mexico and the United States since return migrants might be differentially self-selected (Borjas and Bratsberg 1996). However, the direction of selection for return migrants is unclear. Return migrants may have been target earners who migrated to accumulate savings in order to start a business back home, making the direction of selection ambiguous (Mesnard 2004; Piore 1979). On the other hand, return migrants could have been those who unexpectedly failed in the United States labor market and would thus be negatively self-selected (Abramitzky et al. 2014). Further, pressure on Mexicans to leave the country or deportation drives that began to occur in the late 1920s and 1930s could have changed the quality of return migrants (Hoffman 1974).

We utilize newly collected data from individual border manifests for migrants crossing through border towns in Arizona and Texas in 1920. To determine the selection of the migrant population compared to the home population, we compare heights for migrants to samples of heights for soldiers in the military and for those who applied for passports in Mexico.[3] Having estimated the self-selection of inflows, we estimate the self-selection of outflows. To do this, we link our sample of migrants who crossed the border in 1920 to the 1930 United States Census to create a sample of permanent migrants, and to the 1930 Mexican Census to create a sample of return migrants. We compare the heights of each sample to determine the self-selection of return migrants relative to permanent migrants.

[3] The military and passport height data were collected by Moramay López-Alonso and are publicly available at the ICPSR (2003).

We find that Mexican migrants in 1920 were positively self-selected on height from the Mexican population. They were four to five centimeters taller than soldiers in the military, who were typically members of the lower class of Mexican society, and they were only one and a half centimeters shorter than passport holders, who were typically members of the higher class of Mexican society (López-Alonso and Condey 2003). Our result holds within occupational skill class as the United States received the taller laborers, the taller skilled workers, and the taller professionals.

We also find that although a substantial proportion of Mexican migrants returned home (between 13 and 44 percent), there was no differential self-selection on height into return migration. Our measured result of positive selection for migrant inflows is therefore a good proxy for the change in the quality of the overall stock of Mexican migrants in the United States in the early twentieth century.

4.2 U.S.-Mexico Migration in 1920

There is an extensive literature on the history of migration between the United States and Mexico (see Lawrence Cardoso (1980), Patrick Ettinger (2009), and David Gutierrez (1995) for an overview). Indeed, Mexican migration patterns transformed dramatically during the early twentieth century. The Mexican Revolution pushed migrants out during the 1910s, while the immigration quotas of 1921 and 1924 curtailed unskilled labor from Eastern and Southern Europe in the 1920s and pulled Mexican workers into the United States. We choose 1920 as a benchmark year, falling as it does directly between these two major events, to reveal how the self-selection process operated with limited confounding institutional factors.

While there were some restrictions to entering the country in 1920, picking a year prior leads to several challenges for our analysis. First, most of the fighting in the Mexican Revolution occurred between 1910 and 1917, making it difficult to separate migrants moving for economic reasons from those fleeing as refugees. Although some small amount of fighting continued in 1920, it was limited to the North while most of our sample comes from central Mexico. Second, the United States only started to systematically collect immigrant records for individuals crossing the Mexican border in 1907, and the process was not firmly in place by 1909, the year before the Mexican Revolution (Immigration Act of 1907, Sec. 32).

The Mexican Revolution, a multi-sided conflict, raged during the early 1910s although the major fighting subsided by 1917.[4] At the beginning of the Revolution, conflict occurred throughout Mexico as revolutionaries from different states fought to overthrow President Díaz, with the most intense fighting occurring between 1913 and 1916. Following the creation of a new constitution in 1917, major warfare subsided with only Pancho Villa skirmishing in small battles in the North. By 1920 most fighting halted as Villa surrendered and Álvaro Obregón was elected to the presidency (Knight 1986).

[4] See Alan Knight (1986) for a review of the Mexican Revolution.

During the Revolution, thousands of Mexicans temporarily fled to the United States (United States Bureau of Immigration 1914). As refugees fled during the Revolution, migrant flows became more skilled between 1913 and 1916, the most intense period of fighting. By the end of the 1910s, however, the skill mix of the inflow had returned to pre-revolutionary levels (see Figure 4.2). Even though thousands crossed the border, the United States absorbed these migrants easily as World War I increased the demand for labor (Rockoff 2004). In fact, in 1917 the United States encouraged temporary Mexican migrants to work in agriculture, railroads, and mining, briefly suspending entry restrictions by allowing contract laborers, discontinuing the head tax, and waiving the literacy requirement (Cardenas 1975). By 1920 thousands of Mexicans traveled northward yearly to earn higher wages offered by employers in the United States, but many of these same migrants later returned home (Clark 1908; United States Bureau of Immigration 1920). In fact, Paul Taylor (1929), in his extensive study of Mexican migrants, notes that many employers in the 1920s perceived Mexicans to be more reliable than other workers and attempted to keep them from returning home after the harvest season.

Figure 4.2: Skill Composition and Literacy Rate of Mexican Migrants, 1908-1930
Notes: Skill classifications according to López-Alonso (2000). The vertical line at 1917 represents the year of the literacy requirement. Comparing the correlation between skill and literacy before and after the 1917 legislation provides evidence that the literacy requirement was not enforced (see text). The skill proportions add to one in each year. These numbers are based on authors' calculations from the Reports of the Commissioner General of Immigration (1908-1930). Before 1917 the literacy rate is calculated for males 14 years and older; after 1918 it is calculated for males 16 years and older. Source: Annual Reports of the Commissioner General of Immigration 1908 to 1930.

As Congress was encouraging migration to the United States from Mexico, it simultaneously passed qualitative restrictions on migration in 1917 by requiring migrants to be able to read and write in their own language, potentially limiting illiterate Mexican individuals. However, the United States did not consistently enforce this law for Mexican migrants, first waiving the literacy requirement in 1917 and reissuing the waiver from time to time until 1921 (Cardenas 1975; Cardoso 1976). The waiving of the literacy test is clear when comparing male migrant literacy rates to the skill mix of inflows, as we show in Figure 4.2. Prior to the literacy test in 1917, literacy and migrant skill level were positively correlated, as expected. Following 1917, however, the migrant flow became less skilled but more literate. By 1920 the percent unskilled was even higher than just before 1917, while the literacy rate increased to 99.4 percent. Even when agricultural workers were exempted from the literacy test and head tax, official statistics probably still recorded them as literate.

The literacy test did not appear to restrict migration from Southern and Eastern Europe substantially. Congress imposed quantitative restrictions in 1921 and 1924, dramatically reducing migration from Europe (Zeidel 2004). The quota system, however, placed no limits on migrants coming from the Western Hemisphere, and so Mexican migration was relatively unimpeded. Following the quotas, Mexican immigration increased dramatically as Mexicans acquired jobs due to a labor shortfall (Bloch 1929). The large increase in numbers would eventually lead to concerns over the racial origins of Mexican migrants (Foerster 1925), to the creation of the Mexican Border Patrol and to the criminalization of undocumented entry in the 1920s (Ngai 2002).

4.3 Selection into Migration

Migrants are not a random draw from their home country's population. Borjas (1987) argued that we can predict the direction of self-selection for migrants based on the relative distribution of wages across economies. He finds that if the United States has a more unequal income distribution than the sending country, then we can expect positive selection into migration to the U.S. If the migrants who leave from the home country are on average better (for example, more motivated, more educated, more productive, and so on) than those who stay, then the self-selection is positive; if migrants are worse along these dimensions than stayers, then self-selection is negative.

Selection is influenced by the variation in expected benefits of migration across the human capital distribution of potential migrants. In the early twentieth century, the benefits of migration were immediate as job opportunities were plentiful for Mexican workers. In the southwestern United States, many farms, railroads, and mines hired migrants directly at the border (United States Bureau of Immigration 1920).
Mexicans typically worked in these sectors in the Southwest, but throughout the 1920s meatpackers and manufacturers in the Midwest and Northeast recruited Mexicans from cities in Texas to fill jobs typically held by Southern and Eastern Europeans (Taylor 1929). While low-skilled jobs were readily available, high-skilled jobs were not as prevalent. Also, wages were higher for these unskilled laborers than in Mexico, suggesting a significant return to migration for unskilled laborers, which could lead to negative self-selection (Clark 1908).

Selection is also influenced by the variation in costs of migration across the human capital distribution of potential migrants. While high wages abroad may entice a low-skilled individual to move, his mobility might be restricted by the costs of moving (for example, transportation, psychological, informational or opportunity costs). Chiquiar and Hanson (2005) extend the Borjas (1987) model by adding costs to reconcile the theoretical prediction that contemporary Mexican migrants should be negatively self-selected with the empirical evidence that they are intermediately self-selected. This shift in the literature from focusing on the benefits of migration to the costs of migration has been used to explain differential patterns of Mexican migrant selection from urban and rural areas and from places with different intensities of migrant networks (Fernández-Huertas Moraga 2011; McKenzie and Rapoport 2010).

The benefits to migration were clear but high costs may have constrained individuals from traveling. While improvements in transportation from central Mexico to the United States border, especially the completion of the Mexican railroad in the late nineteenth century, lowered the cost of migration and subsequently spurred large waves of emigration, the cost of a ticket from central Mexico to the United States border was still high for poorer individuals (Clark 1908; Coatsworth 1981). Additionally, the 1917 migration legislation required all migrants to pay an eight dollar head tax. Although the enforcement of this law during 1920 is unclear, if low-earning individuals were unable to finance the trip abroad, then self-selection could have been positive.

While a handful of papers analyze the selection of migrants from Europe (Abramitzky et al. 2012, 2014; Hatton and Williamson 2006; Stolz and Baten 2012), little is known about selection of Mexican migration to the United States during the early twentieth century. Zadia Feliciano (2001) is, to our knowledge, the only paper that explores the historical self-selection of Mexican migrants, finding that in 1910 Mexicans in the United States had a higher rate of literacy than did the general Mexican population. We extend her results by incorporating evidence on immigrants following the Mexican Revolution, by using a measure (height) that is constant across borders, and by exploring the self-selection of return migrants, which could alter the quality of the stock of Mexican migrants observed in the census.

4.4 Height as a Measure of Selection

Multiple metrics of human capital have been used in studies of selection, including income (Chiquiar and Hanson 2005), skill class (Hatton and Williamson 2006), occupational scores (Abramitzky et al. 2014; Collins and Wanamaker 2014), age-heaping (Stolz and Baten 2012; A'Hearn et al. 2009), years of education and literacy (Feliciano 2001). We have no data on the wages and education level of Mexican migrants in 1920, and so we use height to measure the quality of an individual migrant.

When income and wage data are not available, economists must rely on other measures to proxy for standard of living. In particular, height as a measure has been used since it is positively correlated with income and improved health and nutrition (see Richard Steckel (1995) and (2009) for a review of height studies). Higher living standards with ample food during childhood increase height, while poor nutrition and health can stunt growth. Not only does the average height of a society indicate overall health and well-being, but taller people also earn more than their shorter counterparts within a country. For example, Paul Schultz (2002) shows that a one centimeter increase in height leads to an eight to ten percent increase in wages in Brazil and Ghana.

The return to physical strength is especially important in developing countries where large sectors of the economy rely on the physical productivity of labor. Height is a determinant of wages in these countries since larger and stronger men (as measured by Body Mass Index) are rewarded in the labor market (Thomas and Strauss 1997). Mexican migrants worked in labor-intensive industries, such as mining, railroad construction, and farm labor, where improved physiology could lead to higher productivity (Clark 1908; United States Bureau of Immigration 1920).
Nicola Persico, Andrew Postlewaite, and Dan Silverman (2004) argue that higher wages for taller individuals are due to non-cognitive characteristics (for example, confidence), while others (Case and Paxson 2008; Schick and Steckel 2010) argue that early childhood inputs into health and nutrition can increase the cognitive functioning of an individual later in life. For example, taller individuals are more likely to remember their exact date of birth (Humphries and Leunig 2009) and taller individuals score higher on early childhood cognitive and non-cognitive tests (Case and Paxson 2008). Either way, the evidence suggests that taller individuals, on average, earn higher wages. If the migrants who arrived in the United States were taller than those who remained in Mexico, then this would indicate a pattern of positive selection for Mexican migrants.

4.5 Data

4.5.1 Border Crossing Manifests

To understand exactly who migrated to the United States from Mexico in 1920, we construct a unique dataset from the manifest lists for those crossing at the border towns of Ajo, Arizona; Douglas, Arizona; Brownsville, Texas; and El Paso, Texas in 1920.[5] In Figure 4.3 we show the geographical coverage of our sample.[6] Height was recorded on each manifest by border officials and was often rounded to the nearest quarter inch. In addition to height, much more information about migrants upon arrival was recorded on the manifest, including demographic (age, sex, marital status), geographic (place of birth, place of last residence, intended destination), economic (occupation, savings), and network (join a friend, relative or employer) data. We collect all available data for each adult male (18 years or older) classified as an immigrant.[7] In total, we have microdata for 3,671 male migrants who crossed the border in 1920.

[5] National Archives, Mexican Border Crossing Records, Manifests of Alien Arrivals at Ajo, Lukeville, and Sonoyta (Sonoita), Arizona, Jan. 1919-Dec. 1952, and at Los Ebanos, Texas, Dec. 1950-May 1955 (2 rolls), no. A3377; Nonstatistical Manifests and Statistical Index Cards of Aliens Arriving at Douglas, Arizona, July 1908-December 1952 (4 rolls), no. M1759; Statistical and Nonstatistical Manifests of Alien Arrivals at Brownsville, Texas, February 1905-June 1953, and Related Indexes (40 rolls), no. M1502; and Manifests of Statistical Alien Arrivals at El Paso, Texas, May 1909-October 1924 (96 rolls), no. A3412.

[6] There is no systematic difference in the outcome of interest (height) across border towns after controlling for state and decade of birth fixed effects, suggesting that heights were consistently measured across border stations.

[7] An observation was collected if and only if the individual's intended length of stay was listed as permanent or indefinite, the last permanent residence was outside of the United States, the place of birth was outside of the United States, and the final destination was within the United States.
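The sample inclusion rule described above (adult males classified as immigrants, per the criteria in the footnote) can be written as a simple record filter. The following is only a minimal sketch: the record layout, the field names, and the coding of values such as "M" or "permanent" are hypothetical and are not taken from the manifests or from the authors' data files.

```python
from dataclasses import dataclass

US = "United States"


@dataclass
class ManifestRecord:
    # All field names and value codings here are hypothetical illustrations.
    sex: str                     # "M" or "F"
    age: int                     # age in years at crossing
    intended_stay: str           # e.g. "permanent", "indefinite", "6 months"
    last_residence_country: str
    birth_country: str
    destination_country: str


def in_sample(r: ManifestRecord) -> bool:
    """Return True if the record meets the inclusion criteria stated in the text."""
    is_adult_male = r.sex == "M" and r.age >= 18
    is_immigrant = (
        r.intended_stay in {"permanent", "indefinite"}
        and r.last_residence_country != US
        and r.birth_country != US
        and r.destination_country == US
    )
    return is_adult_male and is_immigrant


# Usage: a 25-year-old man born and last resident in Mexico, staying permanently.
example = ManifestRecord("M", 25, "permanent", "Mexico", "Mexico", US)
print(in_sample(example))
```

Writing the rule as a single predicate keeps each of the four immigrant criteria and the age and sex restriction visible in one place, which makes it easier to check against the footnote.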

Figure 4.3: Location of Border Stations and Regions in Mexico
Notes: Additional border stations were located in Lukeville, AZ; Naco, AZ; Nogales, AZ; Sasabe, AZ; Sonoyta, AZ; Columbus, NM; Andrade, CA; San Ysidro, CA; Del Rio, TX; Eagle Pass, TX; Laredo, TX; and Rio Grande City, TX. Region of birth is split into North, Bajío, Center, and South. North includes Baja California Sur, Baja California, Sonora, Sinaloa, Chihuahua, Coahuila, Nuevo León, and Tamaulipas. Bajío includes Jalisco, Colima, Michoacán, Nayarit, San Luis Potosí, Durango, Zacatecas, Aguascalientes, Querétaro, and Guanajuato. Center includes Distrito Federal, México, Morelos, Tlaxcala, Puebla, Veracruz, and Hidalgo. The South includes Guerrero, Oaxaca, Tabasco, Campeche, Yucatán, Quintana Roo, and Chiapas.

To determine the representativeness of our sample, we compare the characteristics of our migrants with those of similar migrants recorded in the 1920 United States Census. We use the one percent 1920 IPUMS sample to identify migrants who arrived in the previous year, who were literate, over the age of 18, and male (Ruggles et al. 2010). Our sample is representative of the distribution of skills for migrants recorded in the census with no statistical difference in occupational mix.[8] There is also no difference in marital status, although our sample is about two years younger and overrepresents people moving to Texas.[9]

[8] Results for the representativeness of the sample are available in the appendix.

[9] The fact that our sample overrepresents people headed to Texas is an artifact of the majority of it being recorded from the El Paso and Brownsville border stations.

Our dataset captures only documented migrants.[10] Although Louis Bloch (1929), in a comparison of census numbers with net migration flows, estimates that undocumented entries could have been substantial for the decade from 1910 to 1920, he also admits that there is a lack of reliable information to make study of this population feasible. To be precise, our results apply to those migrants who crossed through official border crossing stations, and not necessarily to all migrants.

[10] Migrating to the United States was not technically illegal until later in the 1920s, when the United States government created the Border Patrol in an attempt to stop Mexicans and migrants of European ethnicities who tried to enter the United States through the south (Foerster 1925; Ngai 2002). The Border Patrol began in 1925 with a force of 472 members (Carter et al. 2006, Table 4.Ad1076-1084).

The border-crossing data allow us to create a profile for the typical, documented, male migrant who crossed the border from Mexico to the United States in 1920, shown in Table 4.1. Male migrants to the United States were, on average, 29 years old, equally likely to be married as single, and almost universally recorded as literate. In Figure 4.3 we show the regional classification we use for the state of birth.[11] Immigrants came most often from central and northern Mexico, with very few coming from the southern states.[12] A large portion of our migrants were born in the Mexican states of Chihuahua, Guanajuato, and Jalisco, which are still high-sending states today, and most reported a final destination of Texas. Only 14 percent of migrants in the sample reported meeting someone (friend, relative or employer) upon entry, much lower than Europeans in 1920, with 83 percent of Germans, 96 percent of Italians, and 97 percent of Greeks joining a network upon arrival.[13] On average, Mexican migrants brought 39 dollars cash across the border.

We classify migrants as unskilled, skilled or professional workers based on their reported occupation.[14] The majority (about 87 percent) of immigrants in the sample were unskilled. It is because of this lack of variation in skill class that occupational rankings yield little information in determining self-selection. Height allows us to examine whether migrants, within a given occupational class, were better or worse than non-migrants remaining in Mexico.

[11] We follow the same region of birth classification as Moramay López-Alonso and Raul Porras Condey (2003) to maintain consistency across samples. A common birthplace for a migrant crossing at El Paso was Guadalajara, Jalisco. This represents a journey of about 1,266 kilometers.

[12] It is well noted that the construction of the Mexican railroad helped transport Mexicans to the United States. However, the railroad did not reach the southern states below Veracruz by 1920, which explains why few of our observations are from the southern Mexican states.

[13] Based on authors' calculations from the Report of the Commissioner General of Immigration (1920).

[14] We follow López-Alonso's (2000) occupational classification.

4.5.2 Comparison Samples: Military and Passport Data

To make an inference about the selection of migrants from Mexico we need to compare the heights of migrants to those living within Mexico. Here we use two distinct samples: military soldiers and passport holders. Howard Bodenhorn, Timothy Guinnane, and Thomas Mroz (2013) warn that samples of historical heights are likely selected, which could lead to incorrect inferences about the underlying population. We acknowledge that neither of these samples is representative, as the military sample is from the lower part of the height distribution of Mexico and passport records are from the upper part of the height distribution of Mexico (López-Alonso 2007, 2012; López-Alonso and Condey 2003). However, by comparing migrants to both samples and determining which sample migrants most closely resemble, we can infer whether migrants were positively or negatively selected.

The Secretaría Nacional de la Defensa houses federal military records in the Archivo de Concentración, recording deceased soldiers in the Sección de Personal Extinto and deserters in the Sección de Cancelados (López-Alonso and Condey 2003).[15] Since the military did not have required service until 1939, only those who made the choice to join the military appear in the data. Characteristics of the military sample are also listed in Table 4.1. We show that 77 percent of military males were in unskilled occupations and that individuals were well represented across different regions of Mexico. At first glance, the military sample appears to be higher skilled than the migrant group, since 87 percent of migrants were unskilled compared to 77 percent of individuals in the military, implying negative self-selection. However, migrants may have reported intended occupation rather than previous occupation, leaving their true position in the skill distribution of Mexico unclear.

[15] Birth records did not become widely available until the 1930s, so the military kept track of members (who might potentially desert) by recording their height, place of birth, age and occupation. The Sección de Cancelados contains information on members of the military who deserted the army before their service time ended, and the Sección de Personal Extinto contains individuals who died in service or retired and then died afterwards (López-Alonso 2012). The majority of the military data is for individuals who joined the Mexican Army between 1915 and 1935.

Importantly, a comparison of average height reveals that migrants were nearly five centimeters taller than those in the military. We illustrate this comparison in Figure 4.4 by showing that the estimated height distribution for the migrant sample lies well to the right of the estimated height distribution for the military sample.

Figure 4.4: Heights: Immigrants, Soldiers and Passport Applicants
Notes: Observations below 140 cm in height are dropped, although results are unchanged if they are included. Source: Migrant heights are from border crossing manifests. Soldier and passport applicant heights are from López-Alonso (2003).

Table 4.1: Summary Statistics for Migrant, Military, and Passport Samples

Variable                  Migrant Sample    Military Sample   Passport Sample
Height (centimeters)      168.66 (6.09)     163.83 (6.72)     170.15 (7.30)
Age at arrival (years)    27.86 (9.63)      28.37 (7.64)      38.63 (10.14)
Unskilled                 0.87 (0.33)       0.77 (0.42)
Skilled                   0.10 (0.29)       0.21 (0.41)
Professional              0.03 (0.17)       0.02 (0.13)
Literate                  0.99 (0.07)
Married                   0.49 (0.50)
Single                    0.48 (0.50)
Widowed                   0.02 (0.15)
Headed to California      0.07 (0.26)
Headed to Texas           0.81 (0.39)
Headed to Arizona         0.08 (0.27)
North                     0.22 (0.41)       0.19 (0.39)
Bajío                     0.75 (0.43)       0.30 (0.46)
Center                    0.03 (0.16)       0.40 (0.49)
South                     0.00 (0.04)       0.11 (0.32)
Meeting no one            0.86 (0.34)
Meeting friend            0.01 (0.10)
Meeting relative          0.13 (0.33)
Cash on hand (dollars)    38.73 (300.00)
Observations              3,671             3,884             1,249

Notes: Standard deviations are in parentheses. Proportions are reported unless otherwise noted. Source: Border crossing manifests and López-Alonso (2003).
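The raw height gaps discussed in the text follow directly from the means in Table 4.1. As a rough illustration, the sketch below recomputes those gaps and attaches an unequal-variance two-sample z-statistic built only from the published means, standard deviations, and sample sizes. This is an assumption-laden back-of-the-envelope check, not the authors' procedure: it ignores all controls, and it treats each sample as a simple random sample even though the text stresses that the military and passport samples are selected.

```python
import math

# (mean height in cm, standard deviation, observations), copied from Table 4.1.
SAMPLES = {
    "migrant": (168.66, 6.09, 3671),
    "military": (163.83, 6.72, 3884),
    "passport": (170.15, 7.30, 1249),
}


def gap_and_z(a: str, b: str) -> tuple[float, float]:
    """Return (mean_a - mean_b, z-statistic) using an unequal-variance standard error."""
    mean_a, sd_a, n_a = SAMPLES[a]
    mean_b, sd_b, n_b = SAMPLES[b]
    gap = mean_a - mean_b
    se = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return gap, gap / se


for other in ("military", "passport"):
    gap, z = gap_and_z("migrant", other)
    print(f"migrant - {other}: {gap:+.2f} cm (z = {z:.1f})")
```

The two gaps are +4.83 cm relative to the military sample and -1.49 cm relative to the passport sample, matching the "nearly five" and "about one and a half" centimeter differences described in the text.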

We also compare migrants to a sample of passport applications from Mexico collected by López-Alonso (2003) from the Archivo de Pasaportes. Unfortunately, this sample includes only age and does not give region of birth or skill classifications. Height was not measured for passports but was self-reported, possibly creating an upward bias since height tends to be over-reported (Spencer et al. 2002). Summary statistics in Table 4.1 show that passport applicants were only about one and a half centimeters taller than those immigrating to the United States. In Figure 4.4 we show that the estimated height distribution for the migrant sample lies very close to the estimated height distribution for the passport sample. While the average migrant was nearly five centimeters taller than the average member of the military, he was similar in height to the average passport applicant.

4.6 Estimating Self-Selection into Migration

We use a linear regression model to explore the pattern of selection among Mexican migrants in 1920 as measured by migrant height. Although the estimated densities in Figure 4.4 suggest a pattern of positive selection, it is possible that greater stature is simply correlated with other characteristics that are more prevalent in the migrant sample, such as a particular region of birth. Thus, we estimate Equation (4.1) to control for many of these additional characteristics that could confound our positive selection result.

\[ \text{Height}_i = \beta_0 + \beta_1 \text{Migrant}_i + \delta X_i + \varepsilon_i \tag{4.1} \]

An individual's height is regressed on a constant, an indicator variable for whether or not the individual is from the migrant sample, and a vector of controls. Final adult height may not be reached until 24 years of age, so individuals between 18 and 24 years old might still be growing. We include dummy variables for age bins of 18 to 20 years and 21 to 23 years to account for this pattern.¹⁶ We also include controls for decade of birth to account for any conditions that may have affected the height of all those born in Mexico during those times.¹⁷ Furthermore, we include geographic controls to account for any spatial pattern in Mexican heights.¹⁸ Finally, we include variables for occupational skill class, which allow us to describe how migrants differed from others within skill class.

¹⁶ The results are qualitatively similar in regressions that exclude those under 24 years of age.
¹⁷ Results are robust to the inclusion of birth year fixed effects.

Results of the selection regressions comparing the sample of male migrants to the sample of males in the Mexican military are presented in Table 4.2, Columns (1)-(4). First, our estimates reveal expected patterns in heights. For example, adults in the 18 to 20 year age bin were shorter than adults over 24 years old, while those in the 21 to 23 year age bin were only slightly shorter and the difference loses statistical significance, consistent with human growth patterns. Also, those in the skilled class were taller than those in the unskilled class, while those in the professional class were taller than individuals in either of the other two occupational skill classes, supporting the claim that height is correlated with income, productivity, and cognitive ability. Second, the result of positive selection as measured by height holds in each of these specifications, with the migrant sample measuring four to five centimeters taller than those individuals in the military sample. Migrants were taller than those in the military even though they reported lower-skilled occupations. Finally, in Column (4) we show that migrants were taller than those in the military within occupational skill class. Although the descriptive statistics show that those who chose to migrate tended to come from lower-skilled occupations, we find that within skill class the individuals who migrated tended to be taller than those who were in the military.

We also present in Table 4.2 the results of selection regressions comparing the sample of male migrants to the sample of males applying for Mexican passports in Columns (5) and (6).
Column (5) again shows a simple comparison of means between migrants and passport applicants, while Column (6) includes controls for ages less than 24 years and decade of birth. Those in the migrant sample were, on average, just under a centimeter and a half shorter than those in the passport sample. Given that the difference in height is quite small and that those holding passports probably came from the upper end of the distribution in Mexican society, this is additional evidence consistent with a pattern of positive selection into Mexican migration in 1920.

¹⁸ For example, those born in the North region are significantly taller than those in other regions, consistent with a diet richer in protein, which leads to taller individuals (Steckel 2009).

Table 4.2: 1920 Selection Regressions Comparing Migrants to the Military and Passport Samples

                          (1)         (2)         (3)         (4)         (5)         (6)
Comparison sample:        Military    Military    Military    Military    Passport    Passport
Migrant                   4.831***    5.062***    4.118***    4.160***    -1.484***   -1.432***
                          (0.147)     (0.157)     (0.191)     (0.192)     (0.230)     (0.268)
Age, 18-20 years                      -2.682***   -2.593***   -2.529***               -0.568
                                      (0.291)     (0.284)     (0.284)                 (0.553)
Age, 21-23 years                      -0.337      -0.278      -0.240                  0.200
                                      (0.211)     (0.210)     (0.209)                 (0.286)
Decade of birth, 1850                 0.352       -0.445      -0.511                  -0.857
                                      (0.917)     (0.886)     (0.830)                 (1.288)
Decade of birth, 1860                 -1.368**    -1.329**    -1.498**                -0.225
                                      (0.628)     (0.616)     (0.609)                 (0.702)
Decade of birth, 1870                 -0.912**    -0.866**    -0.920***               -0.158
                                      (0.354)     (0.346)     (0.347)                 (0.547)
Decade of birth, 1880                 -0.545*     -0.361      -0.381                  0.184
                                      (0.279)     (0.272)     (0.271)                 (0.512)
Decade of birth, 1890                 -0.533**    -0.426*     -0.439*                 0.491
                                      (0.231)     (0.225)     (0.225)                 (0.487)
Born, Center region                               1.033***    0.967***
                                                  (0.346)     (0.343)
Born, Bajio region                                2.608***    2.652***
                                                  (0.354)     (0.352)
Born, North region                                4.261***    4.300***
                                                  (0.365)     (0.364)
Skilled                                                       0.924***
                                                              (0.204)
Professional                                                  1.830***
                                                              (0.429)
Constant                  163.8***    164.7***    162.6***    162.3***    170.1***    169.9***
                          (0.108)     (0.211)     (0.365)     (0.367)     (0.206)     (0.417)
Observations              7,555       7,555       7,555       7,555       4,920       4,920
R-squared                 0.124       0.138       0.165       0.169       0.010       0.014

Notes: Robust standard errors are in parentheses. The omitted categories are those over age 24, those born prior to 1850, those born in the South region, and unskilled workers. * = Significant at the 10 percent level. ** = Significant at the 5 percent level. *** = Significant at the 1 percent level.
Source: Border crossing manifests and López-Alonso (2003).
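The estimates in Columns (1) and (5) of Table 4.2 can be checked against the summary statistics in Table 4.1. When the migrant indicator is the only regressor in Equation (4.1), the coefficient on that indicator is the difference in sample means, and the robust standard error should be close to the unequal-variance formula. The following Python sketch applies that formula to the rounded values in Table 4.1. The function name `diff_in_means` and the use of the unequal-variance approximation are assumptions made for illustration and are not taken from the estimation code behind Table 4.2.

```python
import math


def diff_in_means(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (a minus b, unequal-variance standard error of the difference)."""
    diff = mean_a - mean_b
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return diff, se


# Height in centimeters: (mean, standard deviation, observations), from Table 4.1.
MIGRANT = (168.66, 6.09, 3671)
MILITARY = (163.83, 6.72, 3884)
PASSPORT = (170.15, 7.30, 1249)

# Migrant minus comparison sample, mirroring the sign of the Migrant coefficient.
vs_military = diff_in_means(*MIGRANT, *MILITARY)
vs_passport = diff_in_means(*MIGRANT, *PASSPORT)

print(f"Migrant vs. military: {vs_military[0]:.2f} ({vs_military[1]:.3f})")
print(f"Migrant vs. passport: {vs_passport[0]:.2f} ({vs_passport[1]:.3f})")
```

Under these assumptions the sketch gives about 4.83 (0.147) against the military sample and about -1.49 (0.230) against the passport sample, in line with the reported 4.831 (0.147) and -1.484 (0.230). The small gaps in the point estimates reflect rounding in Table 4.1.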

4.6.1 Robustness Checks

We present alternative specifications in Table 4.3 to address concerns about the Mexican Revolution and the effect of the 1917 literacy test requirement. It is possible that the self-selection result is due not to economic forces but to refugees fleeing the Mexican Revolution. We test for differences in the pattern of selection by region of birth to determine whether positive selection was strongest in the North, where fighting continued, and show the results in Table 4.3. Similar to the result found for the whole sample in the main specification, migrants born in the northern parts of Mexico were just over four centimeters taller than non-migrants and did not exhibit an abnormal or extraordinary pattern of selection that would give cause for concern.

Table 4.3: Alternative Sample Specifications for Migrant Selection Regressions

Sample Specification    Migrant       Sample Specification    Migrant
Baseline                4.160***      Only literate           3.356***
                        (0.192)                               (0.379)
Only North region       4.265***      Only unskilled          4.015***
                        (0.335)                               (0.213)
Only Bajio region       3.740***      Only skilled            4.294***
                        (0.255)                               (0.458)
Only Center region      5.916***      Only professional       5.274***
                        (0.606)                               (0.970)
Only South region       7.638**
                        (3.638)

Notes: Robust standard errors are in parentheses. * = Significant at the 10 percent level. ** = Significant at the 5 percent level. *** = Significant at the 1 percent level. The dependent variable in each regression is height. Each regression includes the full set of controls for age, location of birth, and occupation, but only the coefficient on migrant is reported. The comparison group is the military sample.
Source: Border crossing manifests and López-Alonso (2003).

It is possible that the pattern of positive selection resulted from the literacy test imposed in 1917, which could have barred low-quality Mexican migrants. While the degree of enforcement of the literacy test for Mexican migrants in 1920 is ambiguous, as discussed earlier, we compare our sample of migrants to a subsample of 3,884 military deserters for whom we have literacy data, recognizing that there is a difference in how literacy is determined in the migrant and military samples. The literacy test required the migrant to read and write a paragraph of twenty-five words in a language of his choosing (Goldin 1994), while literacy in the military sample was determined by whether the soldier could sign his name (López-Alonso and Condey 2003). Our finding that migrants were positively selected still holds when comparing literate samples: literate migrants were over three centimeters taller than their counterparts in the military.

Our results indicate that documented migrants to the United States in 1920 were positively self-selected from the home distribution but do not account for undocumented entry. Bloch (1929) estimates that roughly 111,000 undocumented individuals entered the United States over the decade ending in 1920.¹⁹ Using this number in combination with the official statistics for migration in the 1910s, a back-of-the-envelope calculation suggests that the average undocumented migrant would need to have been 154.29 centimeters tall (nine and a half centimeters shorter than the average male in the military and fourteen and a half centimeters shorter than the average documented migrant) to erase the height advantage over the military.²⁰ This means that even though institutional constraints could cause negatively self-selected individuals to migrate unofficially, it is unlikely that undocumented migration would cause a reversal of our positive selection result.

¹⁹ This estimate in Bloch (1929) is based on estimates of the Mexican-born population in the United States found in official statistics. The Mexican-born population used is very similar to estimates made from IPUMS microdata (Gutmann et al. 2000; Gratton and Merchant 2013).
²⁰ The official migration statistics for the United States show that 219,004 individuals entered the country legally from Mexico from 1911 to 1920. Thus, the total flow from Mexico for the decade was 330,004, with undocumented entrants accounting for 33.6% and documented entrants accounting for 66.4% of that flow. If we use a weighted average of documented and undocumented migrants to measure selection, we can calculate how short the average undocumented migrant would need to be to erase the 4.83 centimeter advantage over the military. With the means from Table 4.1, solving 0.664 × 168.66 + 0.336 × h = 163.83 for the undocumented mean h gives roughly 154.29 centimeters.

4.7 Accounting for Return Migration

4.7.1 Selection into Return Migration

Measuring just the selection into migration is not sufficient to understand its long-term impact, especially when return migration was prevalent, as in the case of Mexico. Even though migrants were positively self-selected from the Mexican population, return migrants could be differentially selected from the overall set of migrants, changing the quality of the stock of migrants that remained in the United States permanently and the quality of the stock of labor in Mexico (Borjas 1985; Borjas and Bratsberg 1996).

Whether migrants who made the decision to return were positively or negatively self-selected from the migrant population is ambiguous. One possibility is that most migrants were target earners and returned when enough was saved to invest in capital back home, leaving their quality relative to permanent migrants unclear (Mesnard 2004; Angelucci 2012). Alternatively, if return migrants were those who failed in the labor market (Abramitzky et al. 2014), then return migrants would have been negatively self-selected.

Another possibility is that return migrants did not make the decision on their own and were not voluntarily self-selected, but were forcibly removed by federal officials as the labor market tightened at the onset of the Great Depression. Historians have placed emphasis on the injustices surrounding mass deportations of Mexicans, with some estimating that over one million Mexicans, including children and United States citizens, left the United States either forcibly or under the threat of removal (Balderrama and Rodriguez 2006). There is a debate, however, over the size of the dramatic fall in the Mexican population during the 1930s and whether the decrease was due to deportation or voluntary departure (Gratton and Merchant 2013; Taylor 1934). Much of the mass departure could simply have been a result of the worsening economic conditions.

Most of this massive outflow of Mexicans occurred later during the 1930s, outside the years of this study, but there was a significant southward movement in the late 1920s.
Some of this outward flow was due to official deportations under warrants, which increased almost 400 percent from 1,751 in 1925 to 8,438 in 1930 (Reports of the Commissioner General of Immigration 1925, 1930). This number of deportations in 1930 was larger than the 6,296 Mexicans leaving voluntarily, according to United States statistics. Mexican statistics, which are more reliable than United States data, suggest that deportations may not have been as important. According to Mexican sources, 70,129 individuals returned during 1930, making official deportations a much smaller percentage of total repatriations (Hoffman 1974; Taylor 1929).²¹ However, many more Mexicans may have left under the threat of deportation rather than being legally deported. Although the importance of deportations is under debate, the selection of return migrants on height could be altered by deportation pressures, depending on who was pressured to leave or forcibly removed.

4.7.2 Linked Sample

To estimate the selection of return migrants, we link our sample of 3,671 migrants forward to the 1930 United States Census to obtain a sample of permanent migrants, and forward to the 1930 Mexican Census to obtain a sample of return migrants. The link to the 1930 United States Census is based on four characteristics: first name, last name, year of birth, and country of birth (Mexico). We also link our sample to the 1930 Mexican Census based on the same four characteristics, and are additionally able to match on state of birth in Mexico. We follow an iterative matching procedure similar to that of Abramitzky, Boustan, and Eriksson (2014).²² In order to limit bias from transcription errors, we also standardize names using the Double Metaphone algorithm.²³

Our linking strategy produces a set of migrants who are uniquely linked to the United States Census or to the Mexican Census, linked to multiple people in the same census, not linked to either census, or linked to both censuses. Failure to link to either census is most likely due to death, name change, or transcription error, while linking to both censuses or multiple times to the same census is likely due to extremely common names. These groups are dropped from the sample. From the original 3,671 migrants, we have a sample of 632 individuals uniquely linked to the 1930 Mexican Census and 798 uniquely linked to the 1930 United States Census.²⁴

There are 632 return migrants out of a total of 1,430 uniquely linked migrants, which yields a 44.2 percent return rate after ten years of stay. This rate is likely an upper bound for the true rate of return since transcription errors and name changes were more likely to occur in the United States. It is likely that transcription errors were more prevalent in the United States than in Mexico, as Mexican enumerators were more familiar with Spanish names. Further, name changes were more likely to occur in the United States, as some migrants anglicized their names to gain favor in the labor market (Biavaschi et al. 2013). By comparison, the return migration rate for males calculated with administrative data for the decade from 1920 to 1930 was approximately 13.3 percent.²⁵ However, these administrative records probably undercounted out-migrants and therefore provide a lower bound on out-migration rates (Bandiera et al. 2013; Taylor 1929).

²¹ One significant discrepancy between the two sources is that Mexico enumerated every border crossing (including, for example, day trips), while the United States included only those who planned to leave permanently (Taylor 1929).
²² A detailed description of the matching procedure as well as the linking matrix can be found in the appendix.
²³ We use Ancestry.com to perform the linking process.
²⁴ In addition to these matches, there are 1,765 migrants who are unlinked and 261 matched to both censuses. The enumeration date for the United States Census was 1 April 1930, and the enumeration date for the Mexican Census was 15 May 1930, so it is possible that migrants left between the two dates and were counted in both countries. However, given that migrants had already been in the United States for ten years, this is unlikely.
²⁵ This is calculated as the total number of emigrants from 1921-1930 over the population at risk of returning home during the 1920s, proxied by the number of immigrants from 1916-1925 (since most emigrants had arrived within the previous five years). This rate formula is similar to the repatriation rate in Gould (1980) but does not correct for non-immigrants and non-emigrants, since our migrant sample contains only immigrants and not the other categories.
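The bookkeeping behind the linked sample can be sketched in a few lines of Python. The sketch classifies each migrant by the number of matching records found in each census and then computes the return rate from the two uniquely linked groups. The function names, the category labels, and the order in which the cases are checked are assumptions for illustration; the actual matching procedure is described in the appendix.

```python
from collections import Counter


def classify_link(us_matches, mx_matches):
    """Classify a migrant by the number of matching 1930 census records.

    us_matches and mx_matches are counts of candidate records in the
    United States and Mexican censuses (hypothetical inputs).
    """
    if us_matches == 0 and mx_matches == 0:
        return "unlinked"
    if us_matches > 0 and mx_matches > 0:
        return "both"
    if us_matches > 1 or mx_matches > 1:
        return "multiple"
    return "permanent" if us_matches == 1 else "return"


def return_rate(n_return, n_permanent):
    """Share of uniquely linked migrants who were found in the Mexican census."""
    return n_return / (n_return + n_permanent)


# Hypothetical match counts for five migrants, for illustration only.
example_counts = [(1, 0), (0, 1), (0, 0), (1, 1), (3, 0)]
print(Counter(classify_link(us, mx) for us, mx in example_counts))

# Counts of uniquely linked migrants reported in the text.
print(f"Return rate: {return_rate(632, 798):.1%}")
```

Only the "permanent" and "return" groups enter the return rate, which is why it is computed over 1,430 migrants rather than the original 3,671.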

Table 4.4: Summary Statistics for Permanent and Return Migrants

Variables                 Permanent Migrants  Return Migrants     Difference
Height (centimeters)      168.7               168.7               -0.00623
                          (5.943)             (6.126)
Age at arrival (years)    27.77               27.88               0.112
                          (8.720)             (8.916)
Unskilled                 0.855               0.888               0.0330*
                          (0.353)             (0.316)
Skilled                   0.113               0.0854              -0.0273*
                          (0.317)             (0.280)
Professional              0.0326              0.0269              -0.00568
                          (0.178)             (0.162)
Literate                  0.994               0.997               0.00310
                          (0.0790)            (0.0562)
Married                   0.439               0.472               0.0329
                          (0.497)             (0.500)
Single                    0.543               0.509               -0.0331
                          (0.498)             (0.500)
Widowed                   0.0188              0.0190              0.0002
                          (0.136)             (0.137)
Headed to California      0.0877              0.0633              -0.0244*
                          (0.283)             (0.244)
Headed to Texas           0.799               0.831               0.0312
                          (0.401)             (0.375)
Headed to Arizona         0.0739              0.0633              -0.0106
                          (0.262)             (0.244)
North                     0.256               0.225               -0.0310
                          (0.436)             (0.418)
Bajio                     0.707               0.764               0.0575**
                          (0.456)             (0.425)
Center                    0.0376              0.00949             -0.0281***
                          (0.190)             (0.0970)
South                     0.000               0.00158             0.00158
                          (0.000)             (0.0398)
Meeting no one            0.846               0.860               0.0124
                          (0.361)             (0.348)
Meeting friend            0.00922             0.0100              0.00078
                          (0.0957)            (0.0997)
Meeting relative          0.144               0.129               -0.0149
                          (0.351)             (0.335)
Cash on hand (dollars)    29.24               34.11               4.87
                          (92.26)             (183.3)
Observations              798                 632

Notes: Standard deviations are in parentheses. Proportions are reported unless otherwise noted. Permanent migrants are those migrants linked to the 1930 US Census and return migrants are those linked to the 1930 Mexican Census. * = Significant at the 10 percent level. ** = Significant at the 5 percent level. *** = Significant at the 1 percent level.
Source: Border crossing manifests.

Despite the fact that permanent migrants and return migrants ended up in different countries, their characteristics upon arrival, as shown in Table 4.4, were remarkably similar. Return migrants and permanent migrants were statistically indistinguishable in terms of age, marital status, and cash on hand at arrival. Perhaps surprisingly, there was no difference in network connections, which could have supported migrants and altered return behavior. Importantly, there was also no statistically significant difference in heights, which suggests that return migrants were not differentially selected from the migrant population (see Figure 4.5).

Figure 4.5: Heights: Permanent and Return Migrants
Notes: Observations below 140 cm in height are dropped, although results are unchanged if they are included. Permanent migrants are those migrants linked to the 1930 US Census and return migrants are those linked to the 1930 Mexican Census.
Source: Migrant heights are from border crossing manifests.

Return migrants and permanent migrants were not similar in every way. Migrants born in the Center region were more likely to be permanent migrants, while those born in the Bajio region were more likely to be return migrants.²⁶ Those who listed their intended destination as California were least likely to become return migrants. The greater distance between sending states and California likely increased the costs of returning, lowering return rates (Borjas and Bratsberg 1996). In addition, return migrants were slightly more likely to be unskilled, although the magnitude of this difference is small and only marginally statistically significant. While the difference in occupational class suggests that return migrants were negatively selected on occupation, it is unknown whether the recorded occupation was the intended or the previous occupation, leaving their true position in the skill distribution unclear.

²⁶ It is possible that migrants moved back and forth across the border multiple times. The Report of the Commissioner General of Immigration in 1908, the closest report to the 1920s with available information, shows that 22.5% of arriving Mexican immigrants had been in the United States previously.

4.7.3 Estimating Selection into Return Migration

We revisit the observation from Table 4.4 and Figure 4.5 that permanent and return migrants had similar heights (168.7 centimeters) and test whether this result holds when controlling for age and region of birth. Specifically, we pool the return and permanent migrant samples and regress height on an indicator for whether the migrant was a return migrant.

The results of the regression of height on return migration status are presented in Panel A of Table 4.5. A simple regression without controls in the first column shows that return migrants were 0.006 centimeters shorter than permanent migrants, a statistically and economically insignificant difference. In regressions including age and region of birth fixed effects, heights of return and permanent migrants continue to be statistically indistinguishable. Although occupational structures upon arrival were slightly different between return and permanent migrants, after controlling for occupational structure there was still no differential selection of return migrants. Panel B shows alternative specifications for samples including only unskilled, skilled, or professional workers, and for samples including only people born in the North, Bajio, or Center region. All regressions show no economically or statistically significant differences between return migrants and permanent migrants in terms of height.
Overall, our analysis suggests that return migrants and permanent migrants had similar levels of human capital.
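One way to read "statistically and economically insignificant" is through the confidence intervals implied by the estimates. The Python sketch below takes the return-migrant coefficients and robust standard errors reported in Panel A of Table 4.5 and forms approximate 95 percent intervals using the normal critical value of 1.96. The dictionary name, the function name, and the normal approximation are assumptions for illustration.

```python
# Return-migrant coefficient and robust standard error by column,
# as reported in Panel A of Table 4.5 (height in centimeters).
PANEL_A = {
    "(1)": (-0.00623, 0.322),
    "(2)": (-0.0267, 0.323),
    "(3)": (0.0586, 0.321),
    "(4)": (0.0744, 0.321),
    "(5)": (-0.0886, 0.339),
    "(6)": (0.00594, 0.303),
}

Z_95 = 1.96  # normal critical value, an approximation for samples of this size


def confidence_interval(coef, se, z=Z_95):
    """Approximate two-sided confidence interval for a regression coefficient."""
    return coef - z * se, coef + z * se


intervals = {col: confidence_interval(b, se) for col, (b, se) in PANEL_A.items()}

# Largest height difference (in absolute value) that any interval still allows.
largest_bound = max(max(abs(lo), abs(hi)) for lo, hi in intervals.values())

for col, (lo, hi) in intervals.items():
    print(f"Column {col}: [{lo:.3f}, {hi:.3f}]")
print(f"Largest bound in absolute value: {largest_bound:.2f} cm")
```

Every interval contains zero, and the bound furthest from zero is about 0.75 centimeters, which is small relative to the gap of more than four centimeters between migrants and the military sample in Table 4.2.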

Table 4.5: Regression Results for Return Selection

Panel A:              (1)         (2)         (3)         (4)         (5)                 (6)
Sample                All         All         All         All         Age at Arrival<40   Cross-Link
Return migrant        -0.00623    -0.0267     0.0586      0.0744      -0.0886             0.00594
                      (0.322)     (0.323)     (0.321)     (0.321)     (0.339)             (0.303)
Decade of birth                   X           X           X           X                   X
Age bins                          X           X           X           X                   X
Region of birth                               X           X           X                   X
Occupational class                                        X           X                   X
Observations          1,430       1,430       1,430       1,430       1,268               1,599

Panel B:
Sample                Unskilled   Skilled     Professional  North     Bajio               Center
Return migrant        0.0949      -0.211      0.193         0.260     0.0433              -2.046
                      (0.340)     (1.128)     (1.808)       (0.673)   (0.370)             (3.864)
Decade of birth       X           X           X             X         X                   X
Age bins              X           X           X             X         X                   X
Region of birth       X           X           X             N/A       N/A                 N/A
Occupational class    N/A         N/A         N/A           X         X                   X
Observations          1,243       144         43            346       1,047               36

Notes: Robust standard errors are in parentheses. * = Significant at the 10 percent level. ** = Significant at the 5 percent level. *** = Significant at the 1 percent level. The dependent variable in each regression is height. Each regression has different sample specifications. Permanent migrants are those migrants linked to the 1930 US Census and return migrants are those linked to the 1930 Mexican Census.
Source: Border crossing manifests.

4.7.4 Robustness of Results for the Linked Sample

Linked samples may not be representative of their underlying populations because the links are not made at random. Specifically, a migrant is more likely to be linked if he has a unique name, and he will not be linked if there was a death, name change, or transcription error. While transcription error was likely random with respect to height, name changes could have occurred more often for migrants intending to reside permanently in the United States. If those migrants were more adept at English or at acquiring United States-specific human capital, then our linked sample would underestimate the quality of permanent migrants.

Another concern is that mortality might bias results if taller individuals are healthier and likely to live longer. We restrict the sample to migrants arriving under the age of 40, who were less likely to die within ten years. The results, shown in Column (5) of Panel A of Table 4.5, indicate that even with the restricted sample we find no differential selection into return migration.

Lastly, it is possible that households in Mexico reported migrants in the United States as members of the household to enumerators. This error would imply that individuals linked to both the United States and Mexican censuses actually resided in the United States. We include 169 migrants who were uniquely linked to both the United States Census and the Mexican Census in our sample of permanent migrants and regress height on return migrant status. The result reported in Column (6) in Panel A of Table 4.5 confirms that return migrants defined in this manner were not differentially selected from the population.

4.8 Conclusions

In the early twentieth century, the United States labor market drew the taller workers from Mexico. Mexican migrants were over four centimeters taller than members of the Mexican military and only one and a half centimeters shorter than passport holders. The fact that Mexican migrants were positively self-selected is consistent with Borjas (1987) and Chiquiar and Hanson (2005), where migrants had high costs of travel or faced credit constraints, limiting the ability of lower-quality Mexicans to migrate. This positive self-selection represents a quality drain or productivity drain from Mexico to the United States.

From our linked sample we find that return migrants were not differentially selected. Taller individuals migrated from Mexico, and migrants observed years later in the United States were just as tall.
Time and return migration did not alter the quality drain on Mexico or the quality gain to the United States that resulted from positive self-selection into migration from Mexico.

This lack of selection on height for Mexican return migrants is in contrast to the negative self-selection of most European return migrants during the early 1900s (Abramitzky et al. 2014).²⁷ It is possible that this result could be due to deportation or the threat of deportation, which forced different people out of the United States than would otherwise have left and thus likely altered the selection of return migrants. If the United States deported Mexicans at random with respect to human capital or skill, then return migrants would have the same human capital as permanent migrants, which is what we find. While purely economic forces might lead to the lowest-performing migrants leaving the country, deportation policies expel both high- and low-quality migrants. However, other reasons (for example, proximity to Mexico) could also explain the different patterns in return migrant selection between Europeans and Mexicans.

A pattern of positive self-selection of Mexican migrants affects both Mexico and the United States in a variety of ways. The United States received the relatively more productive Mexican workers, and these workers would assimilate into the labor market more quickly than negatively selected migrants. For Mexico, the taller laborers, the taller miners, and the taller farmers left to work in the United States, draining Mexico of human capital and lowering the productivity of the average Mexican worker. However, the total effect on Mexican development is unclear, as migration affects not only labor markets but can also influence home country savings and investment by increasing remittances. It further affects the home country by changing political institutions if migrants return home, by increasing technological diffusion with the transmission of techniques or capital goods across borders, or by influencing future migration with the strengthening of networks. This study provides a valuable step toward better understanding the various effects of historical migration from Mexico to the United States on the economies of both nations.

²⁷ This non-negative selection is supported by official United States statistics, which show that Mexican return migrants were positively self-selected on occupation in the year 1930, but this could be due to deportation pressures or limitations in the data.

Chapter 5

Conclusions and Further Research

Migrating into and out of the United States was common in the early 20th century. Before migration policy limited the flows into the United States in the 1920s, migrants could enter freely; however, many also returned home even though no visa expiration or deportation policy required them to. Reasons for returning were varied, but many migrants returned home because they did not succeed in the United States economy, departing either when an economic shock increased the incidence of unemployment or when large inflows led to intense competition within a migrant group. The United States absorbed many migrants, but it did not absorb everyone.

While this dissertation has sought to answer which migrants decided to return home, there are still many aspects of return migration that likely had large influences on the types of migrants who remained and the types who returned home. First, repeat migration was a relatively common phenomenon, where migrants would travel back to their home country but then decide once again to enter the United States. Repeat migrants could have been those who were most sensitive to business cycles, both at home and abroad, and simply migrated to take advantage of the best economic opportunity available. However, we do not know who these repeat migrants were; the dataset on incoming migrants from Ellis Island does include those who had been in the United States previously, so there are data on this type of entrant.

I have presented research on return migration from the United States as a whole, but the Annual Reports also have a large amount of detail on the number of migrants who left from each state. Of particular interest would be how return migration responded to the business cycle across states, and whether we can proxy economic activity in different states with the number of out-migrants from the state. Further, as migrants returned, there was a decrease in the labor supply, which could have softened the negative effects of a shock. The Great Depression witnessed a large increase in out-migration from the United States, and testing how return migration affected other labor market outcomes during the business cycle would give insight into whether migration greases the wheels of the labor market.

Migrants often did not migrate in the first place without a network in the United States, but it is unknown how these networks influenced return migration. Migrant networks have been shown to lower the cost of migration and alter the selection of migrants entering the country. However, there has not been research on whether networks also increase the rate of selection into return migration; some evidence from those arriving shows that migrants planned to stay forever if they were joining a network, but this does not indicate whether or not the migrant actually remained in the United States. It could be that networks improved outcomes by connecting incoming migrants to jobs, providing a familiar cultural environment that decreased the opportunity costs of being away from home, and offering housing for those who had just entered. Either way, migration networks had a large role in the initial decision to migrate and likely had a large role in the decision to return home.

Further, when migrants returned home, the return had an impact not only on the host country's labor supply but also on the source country's labor supply. Migrants brought back with them the human capital they acquired in the United States: knowledge of techniques used in manufacturing processes and the agricultural sector, and also knowledge of United States cultural institutions and laws.
Return migrants likely diffused technology and ideas, something that would contribute to growth within the home economy. Also, the savings they brought back after a temporary trip abroad were used to finance capital. Estimating the effect of return migration on the productivity of the home economy would help determine the externalities of United States migration policy.

Finally, assimilation mattered not only through improvement in labor outcomes, but also through cultural assimilation in terms of learning United States customs and, specifically, language. The effect of English proficiency on migrant assimilation and return migration could potentially explain why different ethnicities have difficulty finding high-skilled jobs in the United States.

While these areas of research are promising and of interest to the researcher and policy maker, they will be difficult to understand with the current data available. Unfortunately for researchers, data on out-migrants have not been collected since the 1950s; further, due to privacy laws, it is impossible to link migrants across United States censuses following 1940. Due to these limitations, researchers need to rely either on expensive survey data or on source-country data to estimate when migrants return home and the types of people who return home. While this method can work, it can miss the large number of migrants who return home. Hopefully the United States will restart a program measuring outflows and learning about return migrants' characteristics, which will paint a much more robust picture of migration. As we have seen from history, migration policy can have unintended effects on the migrant population. The migrant stock is made up of two flows; paying attention to only one side of the flow captures half of the story of the complex mechanics of migration in the United States.

Bibliography

[1] Ran Abramitzky, Leah Platt Boustan, and Katherine Eriksson. Europe's Tired, Poor, Huddled Masses: Self-Selection and Economic Outcomes in the Age of Mass Migration. American Economic Review, 102(5):1832-1856, 2012.
[2] Ran Abramitzky, Leah Platt Boustan, and Katherine Eriksson. A Nation of Immigrants: Assimilation and Economic Outcomes in the Age of Mass Migration. Journal of Political Economy, 122(3):467-506, 2014.
[3] Brian A'Hearn, Joerg Baten, and Dorothee Crayen. Quantifying quantitative literacy: Age heaping and the history of human capital. The Journal of Economic History, 69(03):783-808, 2009.
[4] Manuela Angelucci. US border enforcement and the net flow of Mexican illegal migration. Economic Development and Cultural Change, 60(2):311-357, 2012.
[5] W. B. Bailey. The Bird of Passage. American Journal of Sociology, 18(3):391-397, 1912.
[6] Dudley Baines. Emigration from Europe 1815-1930, volume 11. Cambridge University Press, 1995.
[7] Emily Greene Balch. Our Slavic fellow citizens. Charities Publication Committee, New York, 1910.
[8] Oriana Bandiera, Imran Rasul, and Martina Viarengo. The Making of Modern America: Migratory Flows in the Age of Mass Migration. Journal of Development Economics, 102:23-47, 2013.
[9] Robert J Barro and José F Ursúa. Barro-Ursúa macroeconomic data, 2010.
[10] Lori A Beaman. Social networks and the dynamics of labour market outcomes: evidence from refugees resettled in the US. The Review of Economic Studies, 79(1):128-161, 2012.
[11] Frank D. Bean, Georges Vernez, and Charles B. Keely. Opening and closing the doors: Evaluating immigration reform and control, Vol I. The Urban Institute, New York, 1989.
[12] Costanza Biavaschi. Fifty Years of Compositional Changes in US Out-migration, 1908-1957. Technical report, Discussion Paper Series, Forschungsinstitut zur Zukunft der Arbeit, 2013.
[13] Costanza Biavaschi, Corrado Giulietti, and Zahra Siddique. The economic payoff of name Americanization. Technical report, IZA Discussion Paper, 2013.

[14] Govert E. Bijwaard, Christian Schluter, and Jackline Wahba. The Impact of Labor Market Dynamics on the Return Migration of Immigrants. The Review of Economics and Statistics, 96(3):483-494, 2014.
[15] Govert E Bijwaard and Jackline Wahba. Do High-Income or Low-Income Immigrants Leave Faster? Journal of Development Economics, 108:54-68, 2014.
[16] Louis Bloch. Facts about Mexican immigration before and since the quota restriction laws. Journal of the American Statistical Association, 24(165):50-60, 1929.
[17] Howard Bodenhorn, Timothy Guinnane, and Thomas Mroz. Problems of Sample-selection Bias in the Historical Heights Literature: A Theoretical and Econometric Analysis, 2013.
[18] George J Borjas. Assimilation, Changes in Cohort quality, and the Earnings of Immigrants. Journal of Labor Economics, pages 463-489, 1985.
[19] George J. Borjas. Self-selection and the earnings of immigrants. The American Economic Review, 77(4):531-553, 1987.
[20] George J. Borjas. The labor demand curve is downward sloping: reexamining the impact of immigration on the labor market. The Quarterly Journal of Economics, 118(4):1335-1374, 2003.
[21] George J. Borjas and Bernt Bratsberg. Who Leaves? The Outmigration of the Foreign-Born. The Review of Economics and Statistics, 78(1):165-176, 1996.
[22] George J. Borjas and Lawrence F. Katz. The evolution of the Mexican-born workforce in the United States. In Mexican immigration to the United States, pages 13-56. University of Chicago Press, 2007.
[23] Leah Platt Boustan, Price V. Fishback, and Shawn Kantor. The effect of internal migration on local labor markets: American cities during the Great Depression. Journal of Labor Economics, 28(4):719-746, 2010.
[24] David Card. Immigrant inflows, native outflows, and the local labor market impacts of higher immigration. Journal of Labor Economics, 19(1):22-64, 2001.
[25] Gilberto Cardenas. United States immigration policy toward Mexico: An historical perspective. Chicano L. Rev., 2:66-89, 1975.
[26] Lawrence A. Cardoso. Labor emigration to the Southwest, 1916 to 1920: Mexican attitudes and policy. The Southwestern Historical Quarterly, 79(4):400-416, 1976.
[27] Lawrence A. Cardoso. Mexican emigration to the United States, 1897-1931: Socio-economic patterns. University of Arizona Press, Tucson, 1980.
[28] Susan B Carter, Scott S Gartner, Michael R Haines, Alan L Olmstead, Richard Sutch, and Gavin Wright. Historical Statistics of the United States: Millennial edition, 2006.
[29] Anne Case and Christina Paxson. Stature and status: Height, ability, and labor market outcomes. Journal of Political Economy, 116(3):499-532, 2008.

[30] Daniel Chiquiar and Gordon H. Hanson. International migration, self selection, and the distribution of wages: Evidence from Mexico and the United States. Journal of Political Economy, 113(2):239-281, 2005.
[31] Barry R. Chiswick. The effect of Americanization on the earnings of foreign-born men. The Journal of Political Economy, 86(5):897-921, 1978.
[32] Victor S. Clark. Department of Commerce and Labor. Bulletin of the Bureau of Labor. Volume XVI, 1908, volume serial set no. 5327. Washington, DC, 1908.
[33] John H. Coatsworth. Growth against development: the economic impact of railroads in Porfirian Mexico. Northern Illinois University Press, DeKalb, 1981.
[34] Raymond L. Cohn. Mortality on Immigrant Voyages to New York, 1836-1853. The Journal of Economic History, 44(2):289-300, 1984.
[35] Raymond L Cohn. Mass Migration Under Sail: European Immigration to the Antebellum United States. Cambridge University Press, New York, 2009.
[36] William J Collins. When the Tide Turned: Immigration and the Delay of the Great Black Migration. Journal of Economic History, 57(3):607-632, 1997.
[37] William J Collins and Marianne H Wanamaker. Selection and Economic Gains in the Great Migration of African Americans: New Evidence from Linked Census Data. American Economic Journal: Applied Economics, 6(1):220-252, 2014.
[38] Amelie Constant and Douglas S. Massey. Self-selection, earnings, and out-migration: A longitudinal study of immigrants to Germany. Journal of Population Economics, 16(4):631-653, 2003.
[39] Jorge Durand and Douglas S. Massey. Crossing the border: Research from the Mexican migration project. Russell Sage Foundation, New York, 2006.
[40] Christian Dustmann, Itzhak Fadlon, and Yoram Weiss. Return migration, human capital accumulation and the brain drain. Journal of Development Economics, 95(1):58-67, 2011.
[41] Christian Dustmann and Oliver Kirchkamp. The optimal migration duration and activity choice after re-migration. Journal of Development Economics, 67(2):351-372, 2002.
[42] Christian Dustmann and Josep Mestres. Remittances and temporary migration. Journal of Development Economics, 92(1):62-70, 2010.
[43] Christian Dustmann and Yoram Weiss. Return Migration: Theory and Empirical Evidence from the UK. British Journal of Industrial Relations, 45(2):236-256, 2007.
[44] Richard A. Easterlin. Influences in European overseas emigration before World War I. Economic Development and Cultural Change, 9(3):331-351, 1961.
[45] Patrick W. Ettinger. Imaginary lines: border enforcement and the origins of undocumented immigration, 1882-1930. University of Texas Press, Austin, 1st edition, 2009.
[46] Zadia M. Feliciano. The skill and economic performance of Mexican immigrants from 1910 to 1990. Explorations in Economic History, 38(3):386-409, 2001.

[47] Imre Ferenczi and W. F. Willcox. International Migration Statistics. In International Migrations, Volume I: Statistics. NBER, 1929.
[48] Jesús Fernández-Huertas Moraga. New Evidence on Emigrant Selection. The Review of Economics and Statistics, 93(1):72-96, 2011.
[49] Joseph P. Ferrie. A new sample of males linked from the public use microdata sample of the 1850 US federal census of population to the 1860 US federal census manuscript schedules. Historical Methods: A Journal of Quantitative and Interdisciplinary History, 29(4):141-156, 1996.
[50] Joseph P. Ferrie. Yankeys Now: Immigrants in the Antebellum US 1840-1860. Oxford University Press, New York, 1999.
[51] Robert F. Foerster. The racial problems involved in immigration from Latin America and the West Indies to the United States: a report submitted to the Secretary of Labor. Govt. Print. Off., Washington, 1925.
[52] Nancy Gentile Ford. Americans All! Foreign-Born Soldiers in World War I. Texas A and M University Press, College Station, 2001.
[53] Rachel M Friedberg. You can't take it with you? Immigrant assimilation and the portability of human capital. Journal of Labor Economics, 18(2):221-251, 2000.
[54] Rachel M. Friedberg and Jennifer Hunt. The impact of immigrants on host country wages, employment and growth. The Journal of Economic Perspectives, 9(2):23-44, 1995.
[55] Oded Galor and Oded Stark. The probability of return migration, migrants' work effort, and migrants' performance. Journal of Development Economics, 35(2):399-405, 1991.
[56] John Gibson and David McKenzie. Eight questions about brain drain. The Journal of Economic Perspectives, 25(3):107-128, 2011.
[57] John Gibson and David McKenzie. The Microeconomic Determinants of Emigration and Return Migration of the Best and Brightest: Evidence from the Pacific. Journal of Development Economics, 95(1):18-29, 2011.
[58] George Gmelch. Return migration. Annual Review of Anthropology, pages 135-159, 1980.
[59] Claudia Goldin. The Political Economy of Immigration Restriction in the United States, 1890 to 1921. In The Regulated Economy: A Historical Approach to Political Economy, pages 223-258. University of Chicago Press, Chicago, 1994.
[60] Claudia Goldin and Robert A Margo. The Great Compression: The Wage Structure in the United States at Mid-Century. The Quarterly Journal of Economics, 107(1):1-34, 1992.
[61] John D Gould. European Inter-Continental Emigration. The Road Home: Return Migration from the USA. Journal of European Economic History, 9(1):41-112, 1980.
[62] Michael J. Greenwood. Modeling the age and age composition of late 19th century US immigrants from Europe. Explorations in Economic History, 44(2):255-269, 2007.

[63] Michael J. Greenwood. Family and sex-specific US immigration from Europe, 1870-1910: A panel data study of rates and composition. Explorations in Economic History, 45(4):356-382, 2008.
[64] Michael J. Greenwood and John M. McDowell. Legal US immigration: Influences on gender, age, and skill composition. Upjohn Press, New York, 1999.
[65] David G. Gutiérrez. Walls and mirrors: Mexican Americans, Mexican immigrants, and the politics of ethnicity. Univ of California Press, Berkeley, 1995.
[66] Timothy J Hatton and Andrew Leigh. Immigrants assimilate as communities, not just as individuals. Journal of Population Economics, 24(2):389-419, 2011.
[67] Timothy J. Hatton and Jeffrey G. Williamson. The age of mass migration: causes and economic impact. Oxford University Press, New York, 1998.
[68] Timothy J Hatton and Jeffrey G Williamson. Global Migration and the World Economy: Two Centuries of Policy and Performance. Cambridge Univ Press, New York, 2005.
[69] Timothy J. Hatton and Jeffrey G. Williamson. International migration in the long run: Positive selection, negative selection, and policy. Labor Mobility and the World Economy, pages 1-31, 2006.
[70] John K. Hill. Immigrant decisions concerning duration of stay and migratory frequency. Journal of Development Economics, 25(1):221-34, 1987.
[71] Jennifer Van Hook, Weiwei Zhang, Frank D. Bean, and Jeffrey S. Passel. Foreign-born emigration: A new approach and estimates based on matched CPS files. Demography, 43(2):361-382, 2006.
[72] Jane Humphries and Timothy Leunig. Was Dick Whittington taller than those he left behind? Anthropometric measures, migration and the quality of life in early nineteenth century London? Explorations in Economic History, 46(1):120-131, 2009.
[73] Edward P Hutchinson. Notes on immigration Statistics of the United States. Journal of the American Statistical Association, 53(284):963-1025, 1958.
[74] Harry Jerome. Migration and Business Cycles. NBER, 1926.
[75] Alan Knight. The Mexican Revolution, volume 54-55. Cambridge University Press, Cambridge Cambridgeshire; New York, 1986.
[76] Frances Kraljic. Croatian Migration to and from the United States 1900-1914. Ragusan Press, Palo Alto, CA, 1978.
[77] Jeanne Lafortune and José Tessada. Smooth(er) Landing? The Dynamic Role of Networks in the Location and Occupational Choice of Immigrants. 2012.
[78] David P. Lindstrom. Economic opportunity in Mexico and return migration from the United States. Demography, 33(3):357-374, 1996.

[79] Adriana Lleras-Muney and Allison Shertzer. Did the Americanization movement succeed? An evaluation of the effect of English-only and compulsory schooling laws on immigrants. American Economic Journal: Economic Policy, Forthcoming.
[80] Moramay López-Alonso. The military option: Health, nutrition and living conditions of Mexican soldiers. 2000.
[81] Moramay López-Alonso. Growth with inequality: Living standards in Mexico, 1850-1950. Journal of Latin American Studies, 39(01):81-105, 2007.
[82] Moramay López-Alonso. Measuring Up: A History of Living Standards in Mexico, 1850-1950. Stanford University Press, Palo Alto, 2012.
[83] Moramay López-Alonso and Raúl Porras Condey. The ups and downs of Mexican economic growth: the biological standard of living and inequality, 1870-1950. Economics and Human Biology, 1(2):169-186, 2003.
[84] Darren Lubotsky. Chutes or Ladders? A Longitudinal Analysis of Immigrant Earnings. Journal of Political Economy, 115(5):820-867, 2007.
[85] Angus Maddison and Organisation for Economic Co-operation and Development. The World Economy: Historical Statistics. Development Centre of the Organisation for Economic Co-operation and Development, Paris, France, 2008.
[86] Catherine Massey. Immigration quotas and Immigrant Skill Composition: Evidence from the Pacific Northwest. 2012.
[87] Karin Mayr and Giovanni Peri. Brain drain and brain return: theory and application to Eastern-Western Europe. The BE Journal of Economic Analysis and Policy, 9(1), 2009.
[88] David McKenzie, John Gibson, and Steven Stillman. A land of milk and honey with streets paved with gold: Do emigrants have over-optimistic expectations about incomes abroad? Journal of Development Economics, 102(0):116-127, 2013.
[89] David McKenzie and Hillel Rapoport. Self-selection patterns in Mexico-US migration: the role of migration networks. The Review of Economics and Statistics, 92(4):811-821, 2010.
[90] Alice Mesnard. Temporary migration and capital market imperfections. Oxford Economic Papers, 56(2):242-262, 2004.
[91] B. R. Mitchell. International Historical Statistics. Palgrave Macmillan, New York, 4th edition, 1998.
[92] Jesus Fernandez-Huertas Moraga. New evidence on emigrant selection. The Review of Economics and Statistics, 93(1):72-96, 2011.
[93] Petra Moser. Taste-based discrimination evidence from a shift in ethnic preferences after WWI. Explorations in Economic History, 49(2):167-188, 2012.
[94] Mae M. Ngai. The strange career of the illegal alien: Immigration restriction and deportation policy in the United States, 1924-1965. Law and History Review, 21(1):69-108, 2002.

[95] United States. Bureau of Immigration. Annual report of the Commissioner General of Immigration to the Secretary of Labor. 1903; 1932.
[96] Francesc Ortega and Giovanni Peri. The effect of trade and migration on income. Technical report, National Bureau of Economic Research, 2012.
[97] Gianmarco IP Ottaviano and Giovanni Peri. Rethinking the effect of immigration on wages. Journal of the European Economic Association, 10(1):152-197, 2012.
[98] Fred C Pampel and H Elizabeth Peters. The Easterlin effect. Annual Review of Sociology, pages 163-194, 1995.
[99] Krishna Patel and Francis Vella. Immigrant networks and their implications for occupational choice and wages. Review of Economics and Statistics, 95(4):1249-1277, 2013.
[100] Nicola Persico, Andrew Postlewaite, and Dan Silverman. The effect of adolescent experience on labor market outcomes: The case of height. Journal of Political Economy, 112(5):1019-1053, 2004.
[101] Michael J Piore. Birds of passage: migrant labor and industrial societies. Cambridge University Press, New York, 1979.
[102] Hugh Rockoff. Until it's over, over there: The US economy in World War I. NBER Working Paper, (w10580), 2004.
[103] Elizabeth S. Rolph. Immigration Policies: Legacy from the 1980s and issues for the 1990s. RAND Institute, Santa Monica, 1992.
[104] Steven Ruggles, T Alexander, K Genadek, R Goeken, M Schroeder, and M Sobek. Integrated Public Use Microdata Series (IPUMS): Version 5.0 [machine-readable database]. Minneapolis: University of Minnesota, 2010.
[105] Theodore Saloutos. They remember America: the story of the repatriated Greek-Americans. University of California Press, Berkeley, CA, 1956.
[106] Manon Domingues Dos Santos and Fabien Postel-Vinay. Migration as a source of growth: the perspective of a developing country. Journal of Population Economics, 16(1):161-175, 2003.
[107] Jonathan D Sarna. The myth of no return: Jewish return migration to Eastern Europe, 1881-1914. American Jewish History, 71(2):256-268, 1981.
[108] Andreas Schick and Richard Steckel. Height as a proxy for cognitive and non-cognitive ability. NBER Working Paper Series, 16570, 2010.
[109] T. Paul Schultz. Wage gains associated with height as a form of health human capital. American Economic Review, 92(2):349-353, 2002.
[110] Larry A. Sjaastad. The costs and returns of human migration. The Journal of Political Economy, 70(5):80-93, 1962.
[111] Elizabeth A. Spencer, Paul N. Appleby, Gwyneth K. Davey, and Timothy J. Key. Validity of self-reported height and weight in 4808 EPIC-Oxford participants. Public Health Nutrition, 5(04):561-565, 2002.

[112] Yannay Spitzer. Pogroms, Networks, and Migration: The Jewish Migration from the Russian Empire to the United States 1881-1914. 2014.
[113] Yannay Spitzer and Ariell Zimran. Migrant self-selection: Anthropometric evidence from the mass migration of Italians to the United States, 1907-1925. 2014.
[114] Richard H. Steckel. Stature and the standard of living. Journal of Economic Literature, 33(4):1903-1940, 1995.
[115] Richard H Steckel. Heights and Human Welfare: Recent Developments and New Directions. Explorations in Economic History, 46(1):1-23, 2009.
[116] Edward Alfred Steiner. On the Trail of the Immigrant. FH Revell Company, New York, 1906.
[117] Yvonne Stolz and Joerg Baten. Brain drain in the age of mass migration: Does relative inequality explain migrant selectivity? Explorations in Economic History, 49(2):205-220, 2012.
[118] Duncan Thomas and John Strauss. Health and wages: Evidence on men and women in urban Brazil. Journal of Econometrics, 77(1):159-185, 1997.
[119] Jennifer Van Hook, Weiwei Zhang, Frank D Bean, and Jeffrey S Passel. Foreign-born Emigration: A New Approach and Estimates Based on Matched CPS files. Demography, 43(2):361-382, 2006.
[120] Francis A. Walker. Restriction of immigration. Atlantic Monthly, 77(464):822-829, 1896.
[121] Simone A Wegge. Chain migration and information networks: Evidence from nineteenth-century Hesse-Cassel. Journal of Economic History, 58(4):957-86, 1998.
[122] Walter F Willcox. Appendices to International Migrations, Volume II: Interpretations. In International Migrations, Volume II: Interpretations. NBER, 1931.
[123] Jeffrey G Williamson. The Evolution of Global Labor Markets Since 1830: Background Evidence and Hypotheses. Explorations in Economic History, 32(2):141-196, 1995.
[124] Mark Wyman. Round-trip to America: the immigrants return to Europe, 1880-1930. Cornell University Press, New York, 1996.
[125] Dean Yang. Why do Migrants Return to Poor Countries? Evidence from Philippine Migrants' Responses to Exchange Rate Shocks. The Review of Economics and Statistics, 88(4):715-735, 2006.
[126] Robert F Zeidel. Immigrants, Progressives, and Exclusion Politics: The Dillingham Commission, 1900-1927. Northern Illinois University Press, 2004.

Appendix A

Immigration Quotas, World War I, and Emigrant Flows from the United States: Appendix

A.1 Mitchell, Maddison, and War Data

Emigration data have yearly variation, while many of the independent variables are recorded at multi-year intervals, typically decadal. We adjust the data to obtain yearly variation. European data are taken from Mitchell (1998); these include the foreign country's sex ratio, the share of labor in agriculture and industry, and natural increase (birth rate minus death rate, lagged twenty years). GDP figures are taken from Maddison (2008) and are extrapolated for years not covered. World War I death data for England and Ireland are from the Commonwealth War Graves Commission's Annual Report 2009-2010; for France, from La Population de la France by Michel Huber, 1931; for Italy, from La Salute pubblica in Italia durante e dopo la Guerra by Giorgio Mortara, 1925; for Portugal, from the British War Office, 1922; for Belgium, from L'Annuaire Statistique de la Belgique et du Congo 1915-1919, 1922; and for Germany, from Wars and Population by Boris Urlanis, 1971. Denmark, the Netherlands, Norway, Sweden, and Spain were neutral.

Census Dates:
Belgium: 1900, 1910, 1930
Denmark: 1906, 1911, 1916, 1921, 1925, 1930
France: 1906, 1911, 1921, 1926, 1931
Germany: 1900, 1910, 1925, 1933
Ireland: 1901, 1911, 1926 (Northern Ireland and the Republic of Ireland are combined for 1926)
Italy: 1901, 1911, 1921, 1931. Birth and death rates for the years 1888-1899 are not in Mitchell (1998); they are projected backwards linearly to fill in the time series.
Netherlands: 1909, 1920, 1930. Birth and death rates for the years 1888-1899 are not in Mitchell (1998) and are borrowed from Belgium.
Norway: 1900, 1910, 1920, 1930
Portugal: 1900, 1911, 1920, 1930
Spain: 1900, 1910, 1920, 1930
Sweden: 1900, 1910, 1915, 1920, 1925, 1930
United Kingdom: 1901, 1911, 1921, 1931

A.2 Matching between Ethnicities and Countries

The RCI occupation data are listed by ethnicity while other data are listed by country. We match ethnicity to country as follows, weighting by population.

Table A.1: Matching Country to Ethnicity

Country                          Ethnicity
Belgium, Netherlands             Dutch or Flemish
France                           French
Germany                          German
Ireland                          Irish
Italy                            Italian (south), Italian (north)
Denmark, Norway, Sweden          Scandinavian
Portugal                         Portuguese
Spain                            Spanish
United Kingdom (minus Ireland)   English
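The population weighting used in the ethnicity-to-country match of Section A.2 can be illustrated with a minimal sketch. It assumes that a country-level variable is averaged across the countries assigned to one ethnicity, with each country's population as the weight; the appendix does not spell out the exact formula, so this is one plausible reading. The function name `weighted_value` and all numbers are hypothetical placeholders, not historical data.

```python
from typing import Dict, List

# Ethnicity -> matched countries, following Table A.1.
ETHNICITY_TO_COUNTRIES: Dict[str, List[str]] = {
    "Dutch or Flemish": ["Belgium", "Netherlands"],
    "French": ["France"],
    "German": ["Germany"],
    "Irish": ["Ireland"],
    "Italian (south)": ["Italy"],
    "Italian (north)": ["Italy"],
    "Scandinavian": ["Denmark", "Norway", "Sweden"],
    "Portuguese": ["Portugal"],
    "Spanish": ["Spain"],
    "English": ["United Kingdom (minus Ireland)"],
}


def weighted_value(
    ethnicity: str,
    values: Dict[str, float],
    population: Dict[str, float],
) -> float:
    """Population-weighted mean of a country-level variable for one ethnicity.

    Assumption (not stated in the text): weights are the populations of the
    matched countries in the same year as the variable.
    """
    countries = ETHNICITY_TO_COUNTRIES[ethnicity]
    total_population = sum(population[c] for c in countries)
    return sum(values[c] * population[c] for c in countries) / total_population


# Hypothetical placeholder inputs (not historical figures).
example_population = {"Denmark": 2.0, "Norway": 2.0, "Sweden": 4.0}
example_values = {"Denmark": 10.0, "Norway": 20.0, "Sweden": 30.0}
scandinavian_value = weighted_value("Scandinavian", example_values, example_population)
```

For ethnicities matched to a single country, the weighted value is simply that country's value.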

Appendix B

Birds of Passage: Return Migration, Self-Selection and Immigration Quotas: Appendix

B.1 Alternative Occupational Scores and Accounting for Sex

The main occupational score used to proxy migrant skills is the mean wage from the 1940 census; an advantage of this occupational score is that it reflects foreign-born earnings, separated by new and old source countries. These occupational scores are matched to the RCI occupations using the crosswalk developed by Lafortune and Tessada (2013). After migrants in IPUMS are matched to these occupations, the average wage within the matched RCI occupation is the occupational score. There are a couple of rare instances where the 1940 IPUMS does not fill all cells by new and old source migrant; when this occurs, I apply the average wage of all foreign-born who hold that job.

There are two other metrics that could be used to estimate the self-selection of return migrants. One of them is the IPUMS variable occscore, which reflects the median earnings of individuals in an occupation in 1950. This is the measure used by ABE (2014) for their indirect estimation of the self-selection of return migrants at the turn of the 20th century. While this variable is easily utilized, it measures earnings in 1950, decades after the time period under study. There was significant wage compression between the early and mid-twentieth century (Goldin and Margo, 1992), which would positively bias self-selection estimates because a large number of out-migrants were laborers. The other alternative metric uses wages from the Cost of Living Survey (CLS) taken by the Bureau of Labor Statistics in 1901, recorded in Preston and Haines (1991) and used by Abramitzky, Boustan, and Eriksson (2012) to measure self-selection of Norwegian migrants. However, these earnings are based on urban, married households, which may not be representative of migrant earnings.¹

In addition to using these occupational scores as alternative metrics, I also develop another occupational score that is based on male jobs. Changes in the sex composition of return migrants would change the self-selection on occupation; in an attempt to control for this, I eliminate jobs from the RCI that are female-dominated, according to census data. This eliminates the jobs of Hat and Cap Makers, Milliners, Seamstresses, and Servants, which constitute about 9% of those who claim a job in the census.

Estimates of self-selection using the three alternative metrics (1940 only Male Jobs, 1901 CLS, 1950 IPUMS occscore) are presented in Table B.2. As a reference, I include the occupational score used in the main results section, which is based on 1940 jobs for both males and females. Excluding female jobs lowers the self-selection of return migrants from negative 3.5% to negative 4.9%. Return migrants were more likely to be male than the migrants enumerated in the census, and since males generally hold higher paying jobs, the self-selection correcting for male jobs is lower. One should note that there are still females in this sample if females held male-dominated jobs; however, this is the best one can do with the available data. Using the 1901 CLS score leads to return migrants being strongly negatively self-selected, earning 15.9% less than the migrant stock. This is opposed to using the 1950 occupational score, which shows that out-migrants earned 4.8% more than the migrant stock. These differences are almost completely explained by the relative earnings of laborers to the rest of the population.
Using the 1901 CLS occupational score, laborers earned 37 log points less than other occupations; the 1950 occupational score estimates that laborers earned only 5 log points less; the 1940 occupational score used in the paper estimates that laborers earned 10 log points less. Laborers are a large fraction of out-going migrants, making the self-selection of return migrants sensitive to this occupation. While this sensitivity may be worrisome for estimation, note that the conclusion of the paper is that return migrants were negatively self-selected; as one moves back from 1940 to the 1908-1932 time period, laborers earned relatively less than other occupations. It is therefore likely that the negative self-selection was stronger than the result that out-migrants earned 3.5% less than the migrant stock.

Interestingly, using the 1950 IPUMS occscore leads to an estimate of positive self-selection; this was the same occupational score used by ABE (2014), who found negative self-selection of return migrants. However, our estimates are not directly comparable for a couple of reasons. First, my data cover the years 1908 to 1932 while their data cover 1900 to 1920. Second, they only estimate self-selection of return migrants after the first observation of migrants. Their entering cohorts are from 1880 to 1900; many return migrants of these cohorts likely left prior to their first observation of migrants in 1900. If selection of return migrants changes with years since arrival, particularly if target savers are more likely to leave first and target savers are positively self-selected, that would lead to my estimate being higher than theirs. They are also able to control for age in their self-selection estimates, unlike mine; further, their sample only includes males. Finally, their estimate is based on differences in occupational upgrading, while my selectivity result is based on differences in levels. For these reasons, it is impossible to clearly identify why our levels of self-selection are different; however, our rankings of countries are relatively similar, lending some support to the indirect method of estimation. Results by ethnicity are presented in Figure B.2.

¹ When calculating self-selection with this occupational score, I use Abramitzky, Boustan, and Eriksson's (2012) calculation for farmers' earnings reported in their appendix.
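The sensitivity of the self-selection estimate to laborers' relative earnings can be illustrated with a stylized two-occupation sketch. It assumes that self-selection is the log difference between the mean occupational score of return migrants and that of the migrant stock, and that there are only two occupations (laborers and everyone else, whose score is normalized to 1). The laborer shares below are hypothetical placeholders; only the three laborer wage gaps (37, 10, and 5 log points) come from the text. Because the sketch has only two occupations, it shows the direction of the sensitivity but cannot reproduce the sign change obtained with the 1950 score.

```python
import math

# Hypothetical laborer shares (placeholders, not estimates from the data).
LABORER_SHARE_RETURN = 0.60  # share of return migrants who are laborers
LABORER_SHARE_STOCK = 0.30   # share of the migrant stock who are laborers


def self_selection(laborer_gap_log_points: float) -> float:
    """Log difference in mean occupational scores, return migrants vs. stock.

    Other occupations are normalized to a score of 1, so laborers earn
    exp(-gap), where gap is expressed in log points divided by 100.
    """
    laborer_score = math.exp(-laborer_gap_log_points)
    mean_return = (
        LABORER_SHARE_RETURN * laborer_score + (1 - LABORER_SHARE_RETURN)
    )
    mean_stock = (
        LABORER_SHARE_STOCK * laborer_score + (1 - LABORER_SHARE_STOCK)
    )
    return math.log(mean_return) - math.log(mean_stock)


# Laborer wage gaps reported in the text for each occupational score.
gaps = {"1901 CLS": 0.37, "1940 census": 0.10, "1950 occscore": 0.05}
selection_by_score = {name: self_selection(gap) for name, gap in gaps.items()}
```

Under these assumptions, a larger laborer wage gap mechanically produces more negative measured self-selection whenever laborers are over-represented among return migrants.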
Finally, Figure B.3 shows the self-selection of return migrants over time, using the male-dominated jobs occupational score. The convergence between new and old source countries by 1930 is largely driven by female migrants; more female migrants entered the country following migrant quotas, and these female migrants were more likely to stay. However, the result that quotas increased the self-selection of new source country migrants still holds; a simple difference-in-difference of scores from 1920 to 1930 shows that quotas increased self-selection scores by 0.030 percent using male and female jobs; the same difference-in-difference yields a result of 0.037 percent for only male jobs.

B.2 Calculation of the Actual Return Migration Rate

Estimating the return rates of migrants is difficult because it is impossible to track a specific cohort over time. The RCI lists how many migrants left the United States in a given year, but does not list how many left from a precise cohort of entry. For example, of the 5,715 Germans who left the United States in fiscal year 1922, it is unclear how many arrived in 1920 and how many arrived in 1910. However, the reports do list how many years these emigrants stayed in the United States: of the 5,715 German emigrants, 837 stayed less than five years (implying they arrived between 1917 and 1921) while 3,034 stayed between five and ten years (implying they arrived between 1912 and 1916). This information can be used to proxy the return rate for the 1917-1921 or 1912-1916 arrival cohorts.

I use information from the annual reports to estimate two different return rates for the cohort that entered between fiscal years 1917-1925.² The estimate follows the basic premise of adding up all the return migrants who entered the United States between 1917 and 1925 and left afterwards, and dividing by the number of migrants who entered between fiscal years 1917 and 1925. I assume that the average emigrant who claimed to have stayed less than five years actually stayed two years, and that those who claimed to have stayed between five and ten years actually stayed seven years.³ I sum those who claimed to have stayed less than five years for report years 1919 to 1927 and those who claimed to have stayed five to ten years for report years 1924 to 1932:

$$
\text{ActualReturnRate}_{e,\,1917\text{--}1925} = \frac{\text{EmigrantsLess5}_{e,\,1919\text{--}1927} + \text{Emigrants5to10}_{e,\,1924\text{--}1932}}{\text{Immigrants}_{e,\,1917\text{--}1925}} \tag{B.1}
$$

Equation (B.1) is my estimate of the actual emigration rate for leaving within approximately ten years of arrival. It should be noted that this estimated return rate is likely biased downward, given Bandiera, Rasul, and Viarengo's (2013) evidence that out-migrants were undercounted. This method is a rough approximation but works well when spanning a number of incoming migrant cohort years (for example, 1917-1925). It starts to fail when trying to estimate the out-migration rate for a single cohort year. This is because those who left after staying less than five years did not all enter exactly two years earlier; spanning multiple incoming years captures those who entered three or four years earlier, which reduces error.

² I use 1917 to 1925 instead of 1917 to 1924 because the RCI reports fiscal years. Dropping year 1925 or year 1917 does not lead to different conclusions.
³ The main reason for assuming two and seven years is that I have reports only through 1932; assuming stays longer than two and seven years would require data that I do not have.
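The calculation in Equation (B.1) can be sketched in a few lines. The function and argument names are hypothetical, and the yearly counts are placeholders rather than figures from the annual reports. The sketch assumes one ethnicity at a time, with dictionaries keyed by report (fiscal) year.

```python
from typing import Dict


def actual_return_rate(
    emigrants_less5: Dict[int, int],
    emigrants_5to10: Dict[int, int],
    immigrants: Dict[int, int],
) -> float:
    """Equation (B.1) for one ethnicity and the 1917-1925 arrival cohort.

    Emigrants reporting a stay under five years are assumed to have stayed
    two years (report years 1919-1927); those reporting five to ten years
    are assumed to have stayed seven years (report years 1924-1932).
    """
    short_stayers = sum(emigrants_less5[year] for year in range(1919, 1928))
    long_stayers = sum(emigrants_5to10[year] for year in range(1924, 1933))
    arrivals = sum(immigrants[year] for year in range(1917, 1926))
    return (short_stayers + long_stayers) / arrivals


# Hypothetical placeholder counts (not taken from the annual reports).
report_years = range(1908, 1933)
example_less5 = {year: 100 for year in report_years}
example_5to10 = {year: 50 for year in report_years}
example_immigrants = {year: 1000 for year in report_years}

example_rate = actual_return_rate(example_less5, example_5to10, example_immigrants)
```

With these placeholder counts, the numerator is 9 x 100 + 9 x 50 = 1,350 and the denominator is 9 x 1,000 = 9,000. Years outside the three windows are ignored, which mirrors the reliance on reports only through 1932.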

Figure B.1: Self-Selection of Return Migrants, All returns versus Male returns
Notes: Data are from the Annual Report of the Commissioner General of Immigration (1908-1932) and IPUMS (1910-1930). The vertical axis is the log difference in average occupational scores of return migrants and permanent migrants. A positive value indicates that return migrants had higher occupational scores than the foreign born. The foreign-born occupational score is weighted to match the length of stay of the return migrant population.

Figure B.2: Self-Selection of Return Migrants using Alternative Occupational Scores
Notes: Data are from the Annual Report of the Commissioner General of Immigration (1908-1932) and IPUMS (1910-1930). The vertical axis is the log difference in average occupational scores of return migrants and permanent migrants. A positive value indicates that return migrants had higher occupational scores than the foreign born. The foreign-born occupational score is weighted to match the length of stay of the return migrant population.

Figure B.3: Self-Selection of Male Return Migrants by Decade
Notes: Data are from the Annual Reports of the Commissioner General of Immigration (1908-1932) and IPUMS (1910-1930). The difference is between the logged occupational scores; a positive value indicates that return migrants have higher occupational scores than the foreign born. The foreign-born occupational score is weighted to match the gender and length of stay of the return migrant population.