An Application of Nested Logit Model to Rural-Urban Migration in China

Yaqin Su

The author is a Ph.D. candidate at the Department of Economics of the University at Buffalo. Contact info: Department of Economics, 435 Fronczak Hall, Buffalo, NY 14260. Phone: (716)8169686

Abstract

China has undergone a massive demographic movement in recent decades. Rural-urban migration, in particular, has drawn attention from both researchers and policy makers. Research on this phenomenon has proliferated in recent years and probably outnumbers the totality of previous work on the topic. Most studies agree on the motives for rural residents to migrate: they flock to cities in search of work and employment opportunities, mostly driven by poverty in rural villages. Interestingly, unlike most of the U.S. literature, which focuses on the determinants of migration, Chinese scholars have predominantly focused on the consequences of rural-urban migration: for example, whether migrants' remittances helped lift home villages out of poverty; the effects of migration on urban-rural income inequality and on income disparity in society as a whole; the pressure such large-scale migration could exert on cities' public goods and health care systems; and the extent to which rural migrants can adapt and assimilate into urban society.

Most Chinese studies have relied on province-to-province migration flows drawn from population census data. It is worth noting that interprovincial migration flows exclude those who have moved within their home provinces and treat them as non-movers. Importantly, various data sources in China indicate that intraprovincial rural migrants have always outnumbered their interprovincial counterparts. Somewhat surprisingly, the former have received little attention in the existing literature. This paper aims to shed new light on this topic by examining both inter- and intraprovincial migrants, paying close attention to the differences in their personal attributes and location choices. Utilizing a unique micro dataset based on a representative sample of rural migrants working in cities, I employ a nested logit approach that models a migrant's decision as first choosing whether to move within or out of his/her home province, and then choosing which city to migrate to based on a set of alternative-specific attributes.

I find a strong inclination for rural workers to migrate to major cities within their home province. Moving beyond one's home province has a strong deterrent effect on the probability of migration, analogous to the negative border effect identified in numerous studies in the migration literature. Consistent with typical findings in this field, migrant workers are drawn to cities that have higher wage differentials, larger populations, faster employment growth, and a higher standard of living. Meanwhile, there is empirical evidence that distance between origin and destination has a significant deterrent effect on the propensity to migrate, although this effect attenuates as distance increases. Furthermore, I find that interprovincial migrants differ significantly from intraprovincial migrants in a variety of individual characteristics. In general, they include a higher proportion of young, male workers; they are usually in very good physical condition and have fewer years of formal education.

Keywords: rural-urban migration, nested-logit model, China

1. Introduction

In the past three decades, China has witnessed a massive human migration unparalleled in world history. In particular, rural-to-urban migration has played a central role in the urbanization process and continues to be significant in scale. Since 1979, over 500 million people have been added to China's urban population, of which 78 percent was attributable to rural-to-urban migrants. According to the latest survey by the National Bureau of Statistics (NBS), a total of 269 million rural migrants were working and living in Chinese cities in 2014. Although the pace seems to have slowed in recent years, it is predicted that by 2030 China's cities will be home to 1 billion people, that is, one in every eight people on earth (Miller 2012). Currently, the majority of rural migrants in cities lead second-class lives without much access to urban benefits. To sustain rapid urban growth and promote healthy cities, policies based on sound economic reasoning are of practical value and urgently needed.

It is generally acknowledged that rural-urban migrants have greatly contributed to China's economic growth by providing inexpensive labor to the manufacturing and service industries that were rapidly expanding in Chinese cities. Several descriptive studies have presented clear evidence that the volume of rural-urban migration moved in parallel with the growth rate of per capita GDP and with the secondary and tertiary industries' share of GDP over the period 1997-2005. Although the extraordinary demographic transformation that has been underway in China has attracted attention from both policy makers and economists, it remains one of the most complex and least understood phenomena in the Chinese economy. Recently, there have been mounting concerns about whether China has run out of its cheap surplus of rural workers, which could potentially curb the country's future economic growth. Meanwhile, the Chinese government has pledged to gradually relax the rigid residence registration system, known as hukou, in an attempt to encourage labor mobility and stimulate spending in the cities.

Although there exists a large body of research on human migration, most studies in China have provided only a partial understanding of the migration puzzle. For example, a number of recent studies focus on various welfare aspects (e.g., working conditions and mental conditions) of rural workers in the cities, and on the impacts of migration on the returned migrants themselves and on the development of rural areas. Hu and Wu (2012) find that rural leaders with migration experience play a positive role in promoting income per capita and entrepreneurial activities in rural villages and towns. Wang and Yang (2013) examine the occupational choices of returned migrants and find that migration experience is positively related to wage employment, as opposed to self-employment. Liu, Wang and Tao (2013) look at the housing conditions of rural migrants in 12 cities and discover that migrants who are better connected with urban residents tend to have better housing conditions. Messinis (2013) studies the effect of migration on urban-rural wage differentials and suggests that education is critical in explaining the urban-rural wage gap. Liang, Yi and Sun (2014) discover that rural-to-urban migration has a significantly negative impact on the fertility level of rural households.

The migration decision is often characterized as resulting from push factors, for example, poverty or lack of job opportunities in rural regions, coupled with pull factors, such as jobs, amenities, and public goods in the destination localities. The conceptual background of migration research can be traced back to the work of Lewis (1954). In his model, internal migration is considered desirable and should be encouraged, on the argument that reallocating rural surplus labor from the agricultural sector, where productivity is low, to the more productive secondary and tertiary sectors leads to an overall productivity gain for the economy. For the purpose of discussion, we can broadly group the theoretical frameworks of the migration literature of the past five decades into three categories. The first is the dual economy model, which dominated the literature in the 1950s and 1960s. The second features the Harris-Todaro models formulated in the 1970s and 1980s. The third focuses on microeconomic models, also referred to as the New Economics of Labor Migration, which have been at center stage for the past 15 years.

Using a unique dataset of rural-urban migrant workers in 15 major migration destination cities, this paper aims to shed some light on the characteristics of rural migrants and the factors that determine a migrant's destination choice. We examine an individual's choice of location within the utility maximization framework. First, we apply the multinomial conditional logit model (MCLM), which allows us to examine individual characteristics as well as location attributes that vary by origin location and destination city. Additionally, we employ a nested logit approach that models a migrant's decision as first choosing whether to move within or out of his/her home province, and then choosing which destination city to migrate to.
We investigate the effects of various city-specific attributes, such as employment growth, wage differential, concentration of human capital, infrastructure, rent, and distance (between origin and destination location), on the probability of migrating to a particular destination.

Most of the Chinese literature on migration uses interprovincial migration flows based on province-level aggregate data (Bao, Hou, and Shi 2006; Lin, Wang and Zhao 2004; Poncet 2006). Much attention has been paid to whether migration plays a role in alleviating regional income inequality. However, my data suggest that at least 55 percent of rural migrants move within their home province. This immediately raises several concerns about using interprovincial migration ratios to analyze internal migration. First, the migration flow excludes rural-urban migrants who moved within their home province; they are instead counted as non-movers, so the actual migration flow can be grossly underestimated. Second, the interprovincial migration flow includes all types of interprovincial movement, not only rural-urban migration: for instance, city-dwellers who move between urban areas of different provinces (due to job changes, studying, or marriage) and urban-to-rural migrants (due to returning to rural origins or retirement). Third, a migration flow based on a 5-year period is very restrictive in the sense that it does not count rural workers who had stayed in the destination cities for more than 5 years at the time of the survey, or those who have returned to their hometown. Finally, the census surveys cover only those who are registered in the destination cities. Although China's current law mandates that anyone who stays more than 3 days at a place other than where his/her hukou is registered must report to the police and obtain a temporary resident permit, rural migrants generally do not comply with this requirement. As a result, they are invisible in the official statistics. To date, an accurate measure of the size of the floating population in China is not available; only estimates from various sources exist.
I find that there is a strong inclination for rural workers to migrate to major cities within their home province, even after controlling for distance between source and destination location. This seems to indicate that there are substantial psychological or information costs associated with migrating out of one's home province. Overall, rural-urban migrants prefer cities that have higher expected wage differentials, higher employment growth, lower concentration of human capital, and a higher standard of living. On the other hand, distance between origin and destination location has a significant deterrent effect on the probability of migration, although this effect attenuates as distance increases.

The rest of the paper is organized as follows. The next section discusses the econometric models and their recent development. Section 3 describes the data and descriptive statistics. Section 4 presents the estimation techniques and empirical findings, and the last section concludes.

2. Econometric models

2.1. Logit model consistent with utility maximization

The logit model (or the family of logit models) is probably the most widely used discrete choice model for analyzing the decisions of whether to migrate and where to migrate. Usually, the migration choice is studied within the random utility framework. Suppose that an individual $i$ faces $J$ alternatives, indexed by $j = 1, 2, \dots, J$. The utility he derives from choosing the $j$th alternative is modeled as

$$U_{ij} = V_{ij} + \varepsilon_{ij} \tag{1}$$

where $V_{ij}$ is the deterministic part of the utility function, usually specified as a linear-in-parameters function of observed attributes, and $\varepsilon_{ij}$ is the random component of the utility function, reflecting unobserved factors, unknown or unmeasurable to the researcher, that play a part in the decision-making process. For example, the taste or preference for a locality could vary substantially across individuals for unobserved reasons, and individuals may have varying degrees of information about actual conditions at the destination at the time of decision making.

The specification of the distribution of the random component is of crucial importance in understanding the logit model and its variations. In the standard logit model, $\varepsilon_{ij}$ is assumed to follow an iid (independently and identically distributed) extreme value (Gumbel) distribution with the following density function (Train, 2007):

$$f(\varepsilon_{ij}) = e^{-\varepsilon_{ij}} e^{-e^{-\varepsilon_{ij}}} \tag{2}$$

and the cumulative distribution

$$F(\varepsilon_{ij}) = e^{-e^{-\varepsilon_{ij}}} \tag{3}$$

Usually, the difference between extreme value error terms and normally distributed error terms is empirically indiscernible. The extreme value distribution, however, has more occurrences of extreme deviations than the normal distribution and thus slightly fatter tails. It is worth pointing out that the independence of the error terms essentially states that the unobserved part of utility is purely random. That is, the unobserved component of utility for one alternative is not correlated with the unobserved component for another alternative. Another way to view this property: when the deterministic part of the utility function is well specified, so that what remains is pure noise, the logit model is appropriate and sufficient.
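To make equations (2) and (3) concrete, the following minimal sketch evaluates the standard Gumbel density and cumulative distribution and draws values by inverting the CDF. The function names and the inverse-CDF sampler are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def gumbel_pdf(eps):
    """Density of the standard Gumbel distribution, equation (2)."""
    return math.exp(-eps) * math.exp(-math.exp(-eps))


def gumbel_cdf(eps):
    """Cumulative distribution of the standard Gumbel, equation (3)."""
    return math.exp(-math.exp(-eps))


def gumbel_draw(rng):
    """Draw one value by inverting the CDF: eps = -ln(-ln(u)), 0 < u < 1."""
    u = rng.random()
    while u == 0.0:  # random() can return 0.0, which would break the log
        u = rng.random()
    return -math.log(-math.log(u))


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [gumbel_draw(rng) for _ in range(100_000)]
    empirical = sum(d <= 1.0 for d in draws) / len(draws)
    print("empirical F(1):", round(empirical, 3))
    print("analytic  F(1):", round(gumbel_cdf(1.0), 3))
```

The empirical share of draws below a threshold should approach the analytic CDF as the number of draws grows, which is a quick way to check that the sampler matches equation (3).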

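Assuming that each individual is rational and utility maximizing, the probability that a person chooses alternative $j$ equals the probability that this choice yields the maximum level of utility, that is,

$$\begin{aligned}
P_{ij} &= \Pr(U_{ij} > U_{ik}) \quad \forall k \neq j \\
&= \Pr(V_{ij} + \varepsilon_{ij} > V_{ik} + \varepsilon_{ik}) \quad \forall k \neq j \\
&= \Pr(V_{ij} - V_{ik} + \varepsilon_{ij} > \varepsilon_{ik}) \quad \forall k \neq j
\end{aligned}$$

Taking $\varepsilon_{ij}$ as fixed at first, the conditional probability $P_{ij} \mid \varepsilon_{ij}$ for a single alternative $k \neq j$ is given by the cumulative distribution function in equation (3) evaluated at $V_{ij} - V_{ik} + \varepsilon_{ij}$, that is,

$$e^{-e^{-(V_{ij} - V_{ik} + \varepsilon_{ij})}} \tag{4}$$

Since the errors are independent, the cumulative distribution over all $k \neq j$ is the product of the terms in equation (4):

$$P_{ij} \mid \varepsilon_{ij} = \prod_{k \neq j} e^{-e^{-(V_{ij} - V_{ik} + \varepsilon_{ij})}} \tag{5}$$

In most cases $\varepsilon_{ij}$ is unknown, so the unconditional probability is obtained by integrating $P_{ij} \mid \varepsilon_{ij}$ over all values of $\varepsilon_{ij}$, weighted by the density in equation (2):

$$P_{ij} = \int \Big( \prod_{k \neq j} e^{-e^{-(V_{ij} - V_{ik} + \varepsilon_{ij})}} \Big) \, e^{-\varepsilon_{ij}} e^{-e^{-\varepsilon_{ij}}} \, d\varepsilon_{ij} \tag{6}$$

Solving this integral gives a closed-form expression, known as the logit probability:

$$P_{ij} = \frac{e^{V_{ij}}}{\sum_{k} e^{V_{ik}}} \tag{7}$$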
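The closed form in equation (7) can be checked numerically. The sketch below computes the logit probabilities for a hypothetical vector of deterministic utilities and compares them with the share of simulated individuals for whom each alternative yields the highest utility under iid Gumbel errors. The utility values and function names are assumptions chosen for illustration only.

```python
import math
import random


def logit_probabilities(v):
    """Closed-form logit probabilities of equation (7) for utilities v."""
    v_max = max(v)  # subtracting the maximum avoids overflow in exp()
    expv = [math.exp(x - v_max) for x in v]
    total = sum(expv)
    return [e / total for e in expv]


def simulated_shares(v, n_draws, rng):
    """Share of draws in which each alternative maximizes
    U_j = V_j + eps_j, with eps_j iid standard Gumbel (equation (1))."""
    counts = [0] * len(v)
    for _ in range(n_draws):
        utilities = []
        for v_j in v:
            u = rng.random()
            while u == 0.0:
                u = rng.random()
            utilities.append(v_j - math.log(-math.log(u)))
        counts[utilities.index(max(utilities))] += 1
    return [c / n_draws for c in counts]


if __name__ == "__main__":
    v = [1.0, 0.5, -0.2]  # hypothetical deterministic utilities
    print("closed form:", logit_probabilities(v))
    print("simulated  :", simulated_shares(v, 100_000, random.Random(0)))
```

With enough draws the simulated shares should agree with the closed form up to sampling error, which is the content of the step from equation (6) to equation (7).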

Further, the observed component of the utility function $V_{ij}$ is usually specified to be linear in parameters: $V_{ij} = \beta' X_{ij}$, where $X_{ij}$ is a vector of observed characteristics consisting of individual attributes, alternative-specific attributes, and variables that depend on the interaction between the two. Substituting $V_{ij}$ into the probability equation (7), we arrive at the familiar form of the logit:

$$P_{ij} = \frac{\exp(\beta' X_{ij})}{\sum_{k} \exp(\beta' X_{ik})} \tag{8}$$

2.2. Limitations of logit model

Although the logit model provides a powerful tool for analyzing discrete choice outcomes, its limitations have motivated a variety of more advanced discrete choice models in recent decades. First of all, the logit model assumes proportional substitution across alternatives, which can be seen directly from equation (8). The ratio of the probabilities of choosing alternative $j$ over alternative $m$, for example, is

$$\frac{P_{ij}}{P_{im}} = \frac{\exp(\beta' X_{ij})}{\exp(\beta' X_{im})} = \exp[\beta'(X_{ij} - X_{im})] \tag{9}$$

This indicates that the odds ratio for any pair of alternatives $(j, m)$, $j \neq m$, stays the same regardless of changes in the attributes of other alternatives, a property known as independence from irrelevant alternatives (IIA). This property can be viewed as a natural outcome of the iid assumption on the random component of the utility function. The independence of the error terms implies that there are no common unobserved factors affecting the utilities derived from different alternatives. If one alternative were removed from the choice set, the probabilities of all the other alternatives would increase by the same proportion, leaving the pairwise probability ratios unchanged, a substitution pattern called proportional substitution across alternatives. This property has been questioned and deemed implausible in some research settings, and researchers have formulated typical examples that invalidate the assumption.

The second limitation is that the logit model does not allow for heterogeneity in tastes arising from the unobserved portion of utility. Individuals who are otherwise similar in age, gender, education, etc., may weigh each alternative very differently due to idiosyncratic characteristics. Some may be extremely averse to moving far away from their family and children in the home village, while others may be completely uninhibited by distance and place much more importance on wage differentials and job opportunities. If individual responsiveness to the attributes of alternatives differs as a result of unobserved elements in the error terms, the standard logit can lead to biased and inconsistent parameter estimates (see Chamberlain, 1980). Recent research efforts have moved towards a heterogeneous logit model (Mixed Logit) that allows attitudes towards location-specific attributes to differ across individuals. However, this model takes far more computational resources to estimate and does not necessarily provide more economic insight than the standard logit.

As a result of these two limitations, a series of more advanced generalized extreme value (GEV) models have been put forward, which greatly enhance the flexibility of the logit model. Among these, the most prominent is the nested logit model proposed 35 years ago by McFadden (1978). The nested logit model relaxes the IIA property and allows the random portion of utility to be correlated over alternatives. It has gained popularity and been widely applied in a variety of fields such as consumer behavior, transportation choice, and residential housing (see Daly and Zachary, 1978; Ben-Akiva and Vovsha, 1997). Since then, various formulations of the GEV model have come into existence, including the Ordered GEV (OGEV) model (Small, 1987; Breshanan et al., 1997), the Paired Combinatorial Logit (PCL) model (Chu, 1990; Koppelman and Wen, 2000), the multinomial logit-ordered GEV (MNL-OGEV) model (Bhat, 1998a), and the cross-nested logit (CNL) model, which has received considerable attention in the recent decade (Vovsha, 1997; Vovsha and Bekhor, 1998; Ben-Akiva and Bierlaire, 1999; Papola, 2000; Bierlaire, 2001). In particular, Wen and Koppelman (2001) formulated a general GEV structure, referred to as the generalized nested logit (GNL) model, and showed that other GEV models can be obtained as special cases of the GNL model. These GEV-type models differ from one another in the way they specify the correlation of the error terms, but they share the property that, when the correlation is zero, all of them reduce to the standard logit model.

2.3. Nested logit model

One major contribution of the nested logit model is that it allows the unobserved attributes of alternatives to be correlated. The alternatives can be grouped into subsets, called nests, based on their perceived similarities. The level of substitution or competition is thus greater among alternatives in the same nest than across nests. That is, the IIA condition holds for the alternatives within each nest, but does not have to hold for alternatives in different nests.

One caveat: many standard statistical software packages require the set of alternatives to be partitioned into non-overlapping subsets for the nested logit model. In other words, each alternative must be assigned to exactly one nest. This restriction causes difficulties in some research situations, which gave rise to the cross-nested logit (CNL) and the generalized nested logit (GNL) models; these address the limitation by allowing each alternative to appear in more than one nest. An alternative can thus be allocated among the nests in a flexible way.

Starting from the random utility specification in equation (1), the nested logit model can be derived by assuming that the random component follows a generalized extreme value (GEV) distribution. Specifically, the joint cumulative distribution of the error terms $\varepsilon_i = (\varepsilon_{i1}, \dots, \varepsilon_{iJ})$ takes the form:

$$F(\varepsilon_i) = \exp\Big( -\sum_{l=1}^{N} \Big( \sum_{k \in B_l} e^{-\varepsilon_{ik}/\lambda_l} \Big)^{\lambda_l} \Big) \tag{10}$$

This distribution is the same as that of the logit model except that the alternatives are now partitioned into $N$ nests, labelled $B_1, \dots, B_N$. The parameter $\lambda_l$, called the dissimilarity coefficient, represents the level of independence among the alternatives within nest $l$. This coefficient is the same for all alternatives within a nest, but varies across nests. A high $\lambda$ means greater dissimilarity and less correlation; thus $1 - \lambda$ provides a measure of correlation. It is easy to see that when $\lambda = 1$ for all nests, meaning that the correlation among the unobserved portions of utility is zero for all alternatives, the nested logit model collapses to the standard logit model. With this distribution, the probability that an individual chooses alternative $j$ from the choice set is:

$$P_j = \frac{e^{V_j/\lambda_n} \big( \sum_{k \in B_n} e^{V_k/\lambda_n} \big)^{\lambda_n - 1}}{\sum_{l=1}^{N} \big( \sum_{k \in B_l} e^{V_k/\lambda_l} \big)^{\lambda_l}} \tag{11}$$

To simplify the notation, we let alternative $j$ reside in nest $n$ and drop the individual index $i$ at this point. Further, after some mathematical manipulation, we can show that $P_j$ can be expressed as

$$P_j = \frac{\big( e^{V_j/\lambda_n} \big) \big( \sum_{k \in B_n} e^{V_k/\lambda_n} \big)^{\lambda_n}}{\big( \sum_{k \in B_n} e^{V_k/\lambda_n} \big) \sum_{l=1}^{N} \big( \sum_{k \in B_l} e^{V_k/\lambda_l} \big)^{\lambda_l}} \tag{12}$$

Utilizing the property $e^{x} b^{c} = e^{x + c \ln b}$, we obtain

$$P_j = \frac{\big( e^{V_j/\lambda_n} \big) \big( e^{\lambda_n IV_n} \big)}{\big( \sum_{k \in B_n} e^{V_k/\lambda_n} \big) \big( \sum_{l=1}^{N} e^{\lambda_l IV_l} \big)} \tag{13}$$

where $IV$ stands for the inclusive value or log-sum

$$IV_n = \log \sum_{k \in B_n} e^{V_k/\lambda_n} \tag{14}$$

Based on equation (13), we can see that the choice probabilities of the nested logit model can be decomposed into two parts:

$$P_j = P_{j \mid n} \, P_n \tag{15}$$

Thus, for each individual $i$, the probability that alternative $j$ is chosen can be equivalently expressed as the product of the conditional probability of choosing $j$, given that nest $n$ has been chosen, that is,

$$P_{j \mid n} = \frac{e^{V_j/\lambda_n}}{\sum_{k \in B_n} e^{V_k/\lambda_n}} \tag{16}$$

and the marginal probability of choosing nest $n$,

$$P_n = \frac{e^{\lambda_n IV_n}}{\sum_{l=1}^{N} e^{\lambda_l IV_l}} \tag{17}$$
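The decomposition in equations (14) to (17) can be sketched in a few lines. The example below computes nested logit probabilities for four hypothetical destination cities grouped into a within-province nest and an out-of-province nest. City names, utilities, and dissimilarity coefficients are invented for illustration and are not estimates from the paper.

```python
import math


def nested_logit_probabilities(v, nests, lam):
    """Nested logit choice probabilities via P_j = P_{j|n} * P_n.

    v     : dict alternative -> deterministic utility V_j
    nests : dict nest -> list of alternatives (non-overlapping)
    lam   : dict nest -> dissimilarity coefficient lambda_n
    """
    # Inclusive value (log-sum) of each nest, equation (14).
    iv = {n: math.log(sum(math.exp(v[k] / lam[n]) for k in alts))
          for n, alts in nests.items()}
    # Denominator of the nest probability in equation (17).
    denom = sum(math.exp(lam[n] * iv[n]) for n in nests)
    probs = {}
    for n, alts in nests.items():
        p_nest = math.exp(lam[n] * iv[n]) / denom       # equation (17)
        within = sum(math.exp(v[k] / lam[n]) for k in alts)
        for j in alts:
            p_cond = math.exp(v[j] / lam[n]) / within   # equation (16)
            probs[j] = p_cond * p_nest                  # equation (15)
    return probs


if __name__ == "__main__":
    # Hypothetical utilities for four cities (illustrative values only).
    v = {"city_a": 1.0, "city_b": 0.5, "city_c": 0.2, "city_d": -0.3}
    nests = {"within_province": ["city_a", "city_b"],
             "out_of_province": ["city_c", "city_d"]}
    lam = {"within_province": 0.5, "out_of_province": 0.8}
    for city, p in nested_logit_probabilities(v, nests, lam).items():
        print(city, round(p, 4))
```

Two properties stated in the text can be read off this sketch: setting every dissimilarity coefficient to 1 reproduces the standard logit probabilities of equation (7), and the ratio of probabilities for two alternatives in the same nest does not depend on the other alternatives, while a ratio across nests generally does.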

Taking the log of equation (15) and summing over the $J$ alternatives, the log-likelihood for individual $i$, based on the observed decision, takes the form:

$$LL_i = \sum_{n=1}^{N} \Big[ d_n \ln P_n + \sum_{j \in B_n} d_j \ln P_{j \mid n} \Big] \tag{18}$$

where $d_n$ is an indicator variable that takes the value 1 if nest $n$ is chosen and zero otherwise; similarly, $d_j$ equals 1 if alternative $j$ within nest $n$ is chosen and zero otherwise. This log-likelihood function gives a sequential interpretation to the decision-making process: it can be viewed as if a nest were chosen first, and then an alternative within that nest. Summing equation (18) over individuals gives the log-likelihood for the entire sample:

$$LL(\beta, \lambda) = \sum_{i} LL_i \tag{19}$$

The parameters of the log-likelihood function are usually estimated by maximum likelihood estimation (MLE). They can also be estimated sequentially, by first estimating the conditional probabilities specified in equation (16) and then estimating the marginal nest probabilities based on equation (17). Simultaneous estimation is preferred because it takes advantage of all available information. However, sequential estimation provides consistent estimates when the simultaneous approach is difficult to carry out. Note that, although the sequential structure of the nested logit model is often interpreted as implying that higher-level decisions are made first, followed by decisions at lower levels, no such temporal ordering is necessarily implied. In fact, the nested logit model is appropriate as long as we believe the alternatives within a nest are similar to each other in unobserved factors.
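As a minimal sketch of equations (18) and (19), the function below evaluates the sample log-likelihood for a list of observed choices. For brevity it assumes that the deterministic utilities are common to all individuals, whereas in the paper $V_{ij}$ varies by individual. The data, names, and the crude grid search are illustrative stand-ins for a proper numerical maximization.

```python
import math


def log_likelihood(choices, v, nests, lam):
    """Sample log-likelihood of equations (18)-(19).

    choices : list of chosen alternatives, one per individual
    v       : dict alternative -> V_j (assumed common to all individuals)
    nests   : dict nest -> list of alternatives (non-overlapping)
    lam     : dict nest -> dissimilarity coefficient lambda_n
    """
    nest_of = {j: n for n, alts in nests.items() for j in alts}
    iv = {n: math.log(sum(math.exp(v[k] / lam[n]) for k in alts))
          for n, alts in nests.items()}
    log_denom = math.log(sum(math.exp(lam[n] * iv[n]) for n in nests))
    total = 0.0
    for j in choices:
        n = nest_of[j]
        ln_p_nest = lam[n] * iv[n] - log_denom   # ln P_n, equation (17)
        ln_p_cond = v[j] / lam[n] - iv[n]        # ln P_{j|n}, equation (16)
        total += ln_p_nest + ln_p_cond           # only the chosen terms have d = 1
    return total


if __name__ == "__main__":
    # Hypothetical utilities and observed choices (illustrative only).
    v = {"city_a": 1.0, "city_b": 0.5, "city_c": 0.2}
    nests = {"within": ["city_a", "city_b"], "outside": ["city_c"]}
    choices = ["city_a", "city_a", "city_b", "city_c", "city_a"]
    # Crude grid search over one dissimilarity coefficient.
    grid = [g / 10 for g in range(1, 11)]
    best = max(grid, key=lambda l: log_likelihood(
        choices, v, nests, {"within": l, "outside": 1.0}))
    print("lambda on grid with highest log-likelihood:", best)
```

In an actual application the utilities would be written as $\beta' X_{ij}$ and the log-likelihood maximized jointly over $\beta$ and $\lambda$ with a numerical optimizer.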

3. Data and descriptive statistics

3.1. The data

The analyses in this paper are mainly based on the RUMiC (Rural Urban Migration in China) survey data. The RUMiC survey is a large-scale household survey initiated by a group of researchers at the Australian National University, the University of Queensland and the Beijing Normal University, and supported by the Institute for the Study of Labor (IZA). The surveys consist of 5,000 migrant households in cities, 5,000 urban resident households and 8,000 rural households (with or without migrant workers). The survey is designed to provide a longitudinal dataset covering a four-year span from 2008 to 2012. Since 2008, five waves of the migrant household surveys have been conducted. At this point, the results of the first two waves have been made available to researchers in general.

The RUMiC survey selects locations based on whether a province is one of the major sending or receiving regions. As a result, the Migrant Survey was conducted in 15 cities across nine provinces or metropolitan areas: Shanghai, Guangdong, Jiangsu, Zhejiang, Anhui, Hubei, Sichuan, Chongqing and Henan, where the first four locations are the largest migration destinations and the remaining five are among the largest migration sending areas.

The RUMiC survey aims at providing a representative and unbiased sample of rural migrants in the cities, including those who live at their workplaces, such as factory dormitories and construction sites. So far, most existing surveys regarding rural migrants use registered residential addresses as a basis for sampling. The biases associated with such a sampling frame can arise from two sources. First, as mentioned previously, although China requires anyone who stays more than 3 days in a place other than where his/her hukou is registered to report to the local police and obtain a temporary residence permit, many rural migrants fail to comply with this regulation in order to avoid the fee involved in the process. As a result, a large proportion of temporary migrant workers, the so-called floating population, was left out and became invisible in most censuses carried out at the national level. Secondly, surveys based on residential addresses leave out a large number of migrants who live in dormitories and at construction sites provided by their employers.

To address these sampling biases, the RUMiC survey employed a workplace-based sampling strategy. As the first step, each selected city is divided into hundreds of equal-sized blocks within defined city boundaries (for example, 0.5 km by 0.5 km), from which 20-50 blocks are randomly drawn for survey purposes. Within each selected block, a census of workplaces is undertaken and information about the number of migrant workers at each workplace is collected. The total number of migrant workers can then be aggregated and used as an estimate for the total size of the migrant population in the cities. Finally, a random sample of migrant workers is selected for a face-to-face interview with the enumerator.

Based on the survey results, Dongguan, Shanghai, Guangzhou, Hangzhou, and Shenzhen are the top five migration destination cities and have the largest migrant populations. When it comes to the density of migrant workers, based on the number of migrants per census block, Dongguan is the most densely populated migrant city, followed by Shenzhen, Wuxi and Guangzhou. The total number of migrants in the 15 survey cities amounts to about 12 million. The RUMiC survey provides a useful source of micro data to analyze some under-researched topics concerning rural-urban migration.
So far, the topics explored using RUMiC data include occupational choice and entrepreneurship, subjective well-being, wage inequality and labor market segmentation, and the determinants and consequences of migration (see Akguc et al. 2013 for a review of existing studies based on RUMiC data). The survey offers a rich set of information covering various aspects of the life of rural migrants. Besides the usual demographic and socioeconomic variables (such as age, gender, education, income and occupation), the survey provides a detailed description of migrants' physical and mental health, household characteristics such as income, expenditure, housing and living conditions, as well as the school performance and mental health of migrants' left-behind children.

The paper mainly utilizes the Migrant Survey of the first two waves in 2008 and 2009. Despite substantial efforts made by the survey team to track every migrant as long as they remain in the surveyed cities and villages, there is a high attrition rate in the migrant sample from 2008 to 2009. By 2009, the RUMiC survey had lost track of more than half (58 percent) of the migrant individuals contained in the 2008 migrant sample. This is partly due to the fact that migrant workers are temporary in nature and highly mobile; they move from city to city in search of higher-paying jobs. The high attrition rate can also be largely attributed to the global financial crisis that hit China in 2009, which especially affected the export industry in coastal cities such as Guangzhou and Dongguan, where most migrant workers are concentrated. About 23 million migrant workers lost their jobs in early 2009. To restore the migrant sample to the original size, the RUMiC team surveyed a new random sample of migrants in 2009. As a result, the dataset utilized in this paper contains 8,449 individual migrants based on the survey in 2008 and 5,426 additional migrant individuals in 2009. The resulting sample has 13,872 individual records. The panel dimension offered by RUMiC data is not yet explored in this analysis.

3.2. Descriptive statistics

Compared with the body of research that focuses on net migration in the developed countries, rural-urban migration in China is, by and large, unidirectional and highly concentrated in a number of economically advanced cities in the coastal region. Generally speaking, there are four levels of administrative divisions in China: in descending order, the provincial level (including 4 municipalities administered directly by the central government), the prefecture level, the county level (xian) and the township level (xiang). Migration occurs at all levels, both horizontally (within the same administrative level) and vertically (from the bottom township level upwards), but the majority of migrants originate from the county and township levels and diffuse to places higher in the hierarchical order. In particular, the township level, which can be further divided into streets (jiedao), towns (zhen) and townships (xiang), is the primary provider of migrant workers for all the cities across the country.

Several data sources have shown that intra-provincial migration dominates inter-provincial migration. For example, based on the 2000 census, the numbers of intra- and inter-provincial migrants are 88.9 million and 32.3 million, respectively, indicating that around 73% of migration takes place within provincial boundaries. However, due to data limitations, most studies concerning rural-urban migration in China have focused solely on inter-provincial migration, whereas intra-provincial migration has not been examined adequately. The RUMiC data indicate that overall 55 percent of migrants move within their home provinces.

Table 1 presents the directional migration flows from eight major origin provinces to the 15 city destinations. Although the rural migrants in our dataset come from all provinces in the country, these eight provinces provide over 85 percent of the migrant workers for the cities in our sample. A clear pattern emerges from the statistics presented in Table 1: the vast majority of rural workers choose the capital or provincial-level cities in their home province, with the rest of them split almost equally between Guangdong province (Pearl River Delta) and the Shanghai area (Yangtze River Delta), and less than 10 percent scattered across the rest of the cities. While it is common for migrants to move from the West and Middle regions to the East, it is rare for those from the East to move westwards.

Take Sichuan province for example. With a total population of 87 million, Sichuan claims to be the most populous province in the country. Sichuan has been singled out in several studies concerning rural-urban migration as the chief provider of migrant laborers, partly due to its large impoverished rural areas. Despite the impressive volume of migrant workers from Sichuan registered in some prosperous coastal cities such as Guangzhou and Shanghai, Table 1 indicates that 62 percent of rural laborers from Sichuan choose to go to Chengdu, the capital city of Sichuan, and another 15 percent choose Chongqing, a municipality in the proximity of Sichuan. Around 11 percent of Sichuan peasants flow to Guangdong province, and 10 percent go to the Shanghai area. The rest of the cities combined attract merely 4 percent of the rural laborers from Sichuan.

The destination distribution of the migrants from other source provinces shares a similar pattern. Seventy percent of the peasants from Henan choose Zhengzhou (the capital city of Henan) and Luoyang, another major city in Henan. Thirteen percent migrate to the Yangtze River region due to its proximity to Henan, and 10 percent are drawn to Guangdong province in the south. None of them move westwards to Chongqing or Chengdu, and the remaining 6 percent is shared by other major interior cities in its vicinity (Hefei, Bengbu and Wuhan).
If a migrant is from the rural areas of one of the most popular migration destinations, the decision is fairly simple: almost all peasants (98 percent) from Guangdong province head towards the three cities within Guangdong: Guangzhou, Shenzhen and Dongguan. The share of migrants going elsewhere is next to none.

In contrast with several studies claiming that distance is not an important factor in distributing migrant laborers in China, the RUMiC data suggest otherwise. Of course, the strong preference for migrating within the province cannot be explained by distance alone. Long-distance migration is often associated with less information regarding the destination, and thus a higher level of uncertainty. Besides, it is reasonable to assume that the choices of migrants are greatly influenced by their family and relatives, as well as the fellow villagers who migrate with or before them. Their social connections tend to be stronger within the province than in destinations farther away. Regardless, it would be interesting to examine intra-provincial migration, since it has received much less attention in the literature.

When it comes to the incentives for migration, there is no discernible difference between inter-provincial and intra-provincial migrants. The top three reasons based on the responses of the migrants are: (1) too poor at home (29 percent for both inter- and intra-provincial migrants); (2) no future in hometown (22 percent for intra-provincial migrants and 28 percent for their inter-provincial counterparts); (3) want to accumulate work experience (20 percent and 18 percent for intra- and inter-provincial migrants, respectively). Other reasons include a dislike of farming (around 7 percent) and a preference for city life (2 percent). This seems to indicate that the economic condition of the rural hometown is the most important factor in determining whether to migrate.

Table 2 presents some basic characteristics of rural migrants based on 12,336 migrant individuals aged 16 to 65. The migrants in our sample are, on average, 31 years old. The majority of them (66 percent) are young workers aged 16 to 35, and 34 percent are older than 35. Fifty-seven percent of migrants are male, consistent with the conventional view that men are more likely to migrate than women. Their average educational attainment is 9.1 years, only slightly higher than the 9 years of education mandated by the Compulsory Schooling Law in China. Most migrants (61 percent) have junior high education levels or below; 33 percent attended senior high or vocational schools; and 7 percent are college educated. In addition, the majority of them are married (61 percent). On average, they make around 1,600 yuan (about 250 U.S. dollars) per month in the destination cities. The self-employed earn noticeably more than the wage earners: 872 yuan (around $140) more per month on average. Note that a nontrivial proportion of migrants are self-employed, constituting 22% of the entire migrant sample. On average, eight years have elapsed since they first migrated from their home villages, and most of them have changed cities in between. Overall, they are very healthy: 84% of them consider themselves to be in good or excellent health.

To see if the intra-provincial migrants differ from their inter-provincial counterparts in these individual characteristics, I generated descriptive statistics for these two groups separately and performed t-tests on the difference in means. The results are summarized in Table 3. With the exception that there is no discernible difference in the proportions of married workers, these two groups are significantly different in terms of all other personal characteristics. In summary, I find that being younger, male, less educated and healthier increases the probability of inter-provincial migration. Inter-provincial migrants also tend to come from richer rural areas and earn more at their current destinations.
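For readers who wish to reproduce this kind of group comparison, the sketch below computes a two-sample t statistic for a difference in means. It assumes the unequal-variance (Welch) form, which is one common choice; the text does not state which variant was used for Table 3. The function name and the age values are hypothetical and are not taken from the RUMiC data.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Welch two-sample t statistic and approximate degrees of freedom."""
    nx, ny = len(x), len(y)
    # Sample variances divided by sample sizes (squared standard errors).
    sx2, sy2 = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(sx2 + sy2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (sx2 + sy2) ** 2 / (sx2 ** 2 / (nx - 1) + sy2 ** 2 / (ny - 1))
    return t, df

# Hypothetical ages for two small groups of migrants.
intra_ages = [34, 29, 41, 38, 27, 45, 33]
inter_ages = [25, 31, 22, 28, 36, 24, 30]
t_stat, dof = welch_t(intra_ages, inter_ages)
print(t_stat, dof)
```

The statistic would then be compared with the t distribution with `dof` degrees of freedom to judge whether the difference in means is significant.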
Inter-provincial migrants are also more mobile: they have migrated for a shorter period of time on average, yet have changed cities more often than their intra-provincial counterparts. On the other hand, intra-provincial migrants seem to have a higher proportion of college graduates and are more likely to become self-employed.

Rural migrants are often labeled as a scattered, isolated and marginalized group of laborers at the lowest level of society. Yet, as an indispensable part of the Chinese economy, they play an active role in every corner of the cities. They are the workers at construction sites and on manufacturing assembly lines, waitresses at restaurants, hairdressers, nannies and vegetable vendors. The huge and flexible pool of cheap services they provide has greatly contributed to China's urban growth since the mid-1990s and will continue to shape the face of modern cities for years to come. Highly mobile and swift to take any opportunity that opens up to them, they also constitute the smallest unit of entrepreneurship. However, due to their rural origin and typically low level of education, they are often subject to contempt and discrimination from society. Existing studies have described them as working in "3D" occupations: dirty, dangerous and demeaning.

To gain some insight into their labor market conditions, I group the 25 occupations in the RUMiC data into eight broader groups based on the similarities of the jobs. Table 4 depicts the occupational distribution of the working-age migrant workers. It clearly shows that a large proportion of workers (nearly 40 percent) are engaged in service jobs, including food preparation and serving, maid and housekeeping services, personal care such as beauty, massage and hairdressing, and tour guiding. These are followed by sales-related jobs (16.4 percent), including vegetable and beverage vendors, and manufacturing jobs (16.3 percent). Around 8 percent of migrants work as construction laborers, only 4.5 percent work as clerks or administrative support personnel, and less than 3 percent have technical jobs or work as managers. The results seem to confirm that migrant workers typically take the jobs at the lowest end of the earnings spectrum, the jobs shunned by urbanites. Based on the median wage by occupational group, the highest-paid migrants are the managers, who usually make 2,000 yuan a month (about 300 dollars), followed by business owners and the self-employed; workers in service jobs make about 1,200 yuan a month; and maids and housekeeping cleaners usually have the lowest wage, 800 yuan a month (equivalent to 120 dollars). Despite the relatively meager earnings, migrant workers have demonstrated extraordinary tenacity and grit. They are usually hard-working and optimistic. Our data indicate that the majority of them consider themselves fairly happy or no less happy than usual.

A comparison between the inter-provincial and intra-provincial migrants shows that a higher proportion of inter-provincial migrants are engaged in service jobs. Looking into the jobs within the service category, I find that a larger share of inter-provincial migrants work in personal care jobs such as beauty services and barbering, whereas a smaller proportion work in restaurants preparing and serving food. Inter-provincial migrants also have a higher proportion working in manufacturing factories, which is expected since Guangdong province attracts migrant workers from across the country and, by utilizing the cheap labor they provide, has become the factory of the world. In contrast, intra-provincial migrants seem to have a higher proportion working in sales-related jobs and restaurants. Consistent with my previous findings, a higher proportion of migrants who moved within the province are business owners or self-employed, which can be partly due to fewer employment opportunities in the interior cities and partly due to their higher level of educational attainment.
There is empirical evidence that individuals with more human capital tend to choose self-employment rather than being employed by others (Meng 2001).

Besides the type of jobs they undertake, other aspects of employment help shed further light on their labor market situation. Table 5 provides a summary of migrants' job-related characteristics. The RUMiC data show that the unemployment rate for the migrant workers is fairly low, under 1 percent. Intra-provincial migrants seem to have a slightly higher unemployment rate than inter-provincial migrants. On average, they spent 14 days finding their current job, suggesting a relatively short search span. Usually, they work long hours: 9.7 hours per day and 70 hours per week. Over 40 percent of their employers provide them with food and accommodation. However, less than 4 percent of their employers provide unemployment insurance. Around 7.7 percent of migrant workers have work injury insurance, and 13.5 percent of them had injuries in the past three months, of which 26 percent were fairly serious. When it comes to medical insurance coverage, the vast majority of them have rural cooperative medical insurance; about 6.7 percent have medical care provided by their jobs; and less than 4 percent have access to public health insurance.

When it comes to the approaches they used to find their current job, the data reveal that over 59 percent of migrants obtained their jobs through family, relatives or friends, which supports the view that social networking plays an important role in the migration process. Furthermore, a higher proportion of intra-provincial migrants have relied on this channel to find their current work, consistent with my earlier proposition that one of the reasons why intra-provincial migration dominates inter-provincial migration is that the social network effect is stronger within the province. The results also indicate that migrants rarely move alone: only 10.8 percent migrate by themselves, whereas close to 90 percent traveled with relatives or fellow villagers.
Furthermore, roughly one-fifth of the migrants are still in the same province where their first job was located; the majority of them have moved.

Finally, it seems unreasonable to assume that the migration decision is made by individuals separately from the rest of the household. In our data, 61 percent of migrants are married and the majority of couples migrated to the same city seeking employment. Recently, there have been growing concerns regarding the large number of migrant children left behind in the countryside. According to the All-China Women's Federation, about 61 million Chinese children have not seen one or both parents for at least three months. In no other country on earth are there so many children who live largely on their own. Thus, the migration decision not only affects the migrants themselves, but also has substantial impacts on their family members and society as a whole.

Table 6 presents information at the household level, including income, major expenditures, spouse and children, housing and living conditions. Based on 8,429 migrant households in the RUMiC data, there are 1-2 persons per household. Thirty-four percent of migrants live apart from their spouses, among whom 28 percent are also migrants in the cities. On average, the migrant couples lived apart from each other for 8 months in the past year. There are 1.5 children per married household, and 73 percent of children are left in rural hometowns. The average household income is 2,223 yuan per month, 70 percent of which is spent on food, clothing and housing. Total consumption over the last 12 months is 16,582 yuan, including expenditures on medical care, transportation, communication and education. The average spending on entertainment is 245 yuan per annum, equivalent to 40 US dollars. This suggests that migrants usually live a modest lifestyle.

In terms of their residential conditions, 41 percent of households live in dormitories provided by their workplace; 8 percent live at construction sites or other types of working areas; and nearly half rent their housing independently or share it with other people. In general, migrant households live in crowded spaces. The average living area per household is 32 square meters, which is typically shared by 3-4 persons. Thirty-six percent of the houses have no toilet or bathroom, and the families have to rely on public sanitary facilities. Of note, 47 percent of migrant households choose to live close to their fellow villagers, which seems to confirm the view in a number of studies that migrants are usually segregated from the local urbanites and rely heavily on other migrants of the same origin for information and support.

4. Empirical implementation

4.1. Basic setup

I examine the migrant's location choice within a utility-maximizing, discrete choice model that incorporates personal attributes of the migrant, the economic conditions at the alternative destination cities, and the costs of moving, such as distance. Specifically, I model the individual's decision sequentially, as first deciding whether to migrate within or outside his/her origin province, and then choosing the destination city. Recall that the utility individual i derives from choosing destination j takes the form of equation (1):

$$U_{ij} = V_{ij} + \varepsilon_{ij}$$

Let the deterministic component of the utility function for individual i, $V_{ij}$, be a function of personal and location-specific characteristics,

$$V_{ij} = F(X_i, Y_j, Z_{ij}) \tag{20}$$

where $X_i$ is a vector of individual characteristics such as age, sex, marital status and education; $Y_j$ is a set of destination-specific attributes that do not depend on one's origin; and $Z_{ij}$ is a set of variables that depend on both origin and destination characteristics. The destination attributes identified as important in the existing migration literature include the economic strength of a locality (such as GDP and average wage), population size, employment growth, cost of living (such as rent) and amenities. Examples of $Z_{ij}$ include the distance between the origin and the destination and the wage differential between the destination city and the rural village. Note that although distance is usually pinpointed as one of the most important deterrents to migration, the cost of migration goes beyond what is captured by distance alone. There are other unobservable factors related to the cost of moving long distances, such as psychological costs and information costs. As noted earlier, one's social connections seem to be stronger within the origin province.

In developed countries, migration is often modelled as an act of investing in human capital. People migrate to a place where their skills can be better rewarded, which is largely consistent with the empirical evidence that people with higher educational attainment are more likely to migrate, especially those who move across a country's border. Thus, there exists a line of research that focuses on the skill distribution and skill premium at the destination cities. However, the majority of the rural migrants examined in this study have a relatively low level of educational achievement: junior high school or below. Besides, the RUMiC data show that less than 3 percent of the migrants make any form of investment in education after they migrate. Thus, investing in human capital is not among the major incentives to migrate for rural migrants. Rather, employment opportunities and wage differentials play a bigger role in distributing rural laborers across the country.
Nonetheless, there is empirical evidence showing that migrant workers gain skills through the migration experience. For example, a number of recent studies find that migrants returning to their home villages have demonstrated greater leadership skills and contributed significantly to the entrepreneurial activities and economic prosperity of the rural villages (Hu and Wu, 2012).

I employ a two-level nested logit specification to examine the migration decision-making process. The upper-level branches represent the choice of whether to migrate within the source province or to a province other than that of one's origin. It seems reasonable to assume that personal attributes such as age, gender, health and education level will exert a bigger influence on the top-level decision. Meanwhile, the distance between the origin locality and the two prominent city clusters in the coastal regions, namely Guangzhou and Shanghai, should also play an important role in determining inter- versus intra-provincial migration. Besides, the share of villagers in the origin area who have moved inter-provincially should also play a considerable role in the top-level model. A number of studies have found that migrants tend to follow the path trodden by previous migrants, partly because more information is available for the chosen destination.

Conditional on the decision made in the top-level model, the bottom-level choice set is partitioned into the city destinations within one's origin province and those outside one's origin. For example, if rural migrants from Hubei province choose to stay within Hubei, the city choice will be Wuhan, the capital city of Hubei. On the other hand, if they choose to move inter-provincially, the set of city choices will be the remaining 14 cities excluding Wuhan. Note that, as in the case of Hubei province, the Within branch has a degenerate structure since there is only one choice in the nest of Within, and a non-degenerate