Probabilistic Regional Population Forecasts: The Example of Queensland, Australia


Geographical Analysis ISSN 0016-7363

Tom Wilson, Martin Bell

Queensland Centre for Population Research, School of Geography, Planning and Architecture, The University of Queensland, St Lucia, Brisbane, Australia

The variability of demographic trends at the subnational scale, particularly internal and international migration, renders subnational population forecasting more difficult than at the national scale. Illustrating the uncertainty of the demographic future for subnational regions is therefore a crucial element of any set of subnational population forecasts. However, subnational forecasts are currently prepared using deterministic models, which fail to properly address the issue of demographic uncertainty. The traditional high, medium, and low variants approach employed by many national statistical offices poses a number of problems. Probabilistic population forecasting models have the potential to overcome many of these problems, but these models have so far been limited to national-level forecasts. This article reports a first attempt to implement a probabilistic approach to subnational population forecasting using a biregional projection framework. The article sets out the forecasting framework, outlines the approach adopted to formulate each of the assumptions, and presents probabilistic forecasts for 2002–2051 for Queensland and the rest of Australia. The forecasts show a two-thirds probability that Queensland's population in 2051 will be between 5.4 and 7.7 million, while the same range for the rest of the country is 18.6 to 22.7 million. The forecasts quantify the extent to which uncertainty about the demographic future is greater at the subnational than at the national scale.

Introduction

Deterministic population forecasts frequently turn out to be rather inaccurate, sometimes within the embarrassingly short period of just 1 or 2 years of their publication.
This inaccuracy arises from a number of sources, such as an incomplete understanding of the drivers of demographic trends, an inherent randomness in demographic processes which renders a precise prediction impossible even if the trend prediction is correct, and errors in the jump-off population (de Beer 2000).

Correspondence: Tom Wilson, School for Social and Policy Research, Charles Darwin University, Darwin, Northern Territory, Australia. e-mail: tom.wilson@cdu.edu.au
Submitted: June 26, 2004. Revised version accepted: November 23, 2004.
Geographical Analysis 39 (2007) 1–25 © 2007 The Ohio State University

The conventional method of illustrating the uncertainty of the demographic future is to produce variant population forecasts with different assumptions about the future of fertility, mortality, and migration. Many national statistical offices and international organizations take combinations of the variant fertility, mortality, and migration assumptions to produce high, medium, and low forecast variants. But while this approach would seem a sensible way of dealing with the uncertainty issue, closer inspection reveals several major shortcomings. First, no indication is given as to the likelihood of the low and high variants coming true (Lutz and Scherbov 1998). Are the high and low variants quite likely or very unlikely? Will the future population almost certainly be within the high–low range? The variants cannot be meaningfully interpreted. Second, the future trajectories of fertility, mortality, and migration are nearly always assumed to be linear or to change smoothly over time. This simply does not match what is known about past trends. Cyclical behavior and random fluctuations are ruled out (Lee 1999). If, as is often the case, the high and low variants of fertility, mortality, or migration are slowly trended in over many years from the most recently observed value, then the high–low range will open up quite slowly. The chance of actual trends exceeding that high–low range in the early years of the forecast is therefore quite high (Bryant 2003). Third, the fixed relationships between the fertility, mortality, and migration assumptions in variant population forecasts give high–low ranges which will vary in their probabilistic coverage from one output variable (e.g., total population) to another (e.g., the elderly dependency ratio; Keilman, Pham, and Hetland 2002).
A fourth and related point is that fixed combinations of fertility, life expectancy at birth, and international migration preclude the many alternatives (e.g., high fertility with low international migration) that could exist in the future.

Probabilistic population forecasts overcome these limitations. Over the last decade probabilistic population forecasting methods have been progressively developed and applied to a number of countries, including Australia (Wilson and Bell 2004a), Austria (Lutz and Scherbov 1998), Norway (Keilman, Pham, and Hetland 2002), Finland (Alho 2002), the Netherlands (de Beer and Alders 1999), Sweden (Cohen 1986), the United States (Lee and Tuljapurkar 1994), and for world regions (Lutz, Sanderson, and Scherbov 2001; Lutz and Scherbov 2003). To date, however, there have been very few attempts to apply the probabilistic approach to population forecasting at the subnational scale. Exceptions include the work of Rees and Turton (1998), Gullickson and Moen (2001), Gullickson (2001), Miller (2002), Lee, Miller, and Edwards (2003), and Smith and Tayman (2004). Rees and Turton were probably the first to produce subnational probabilistic population forecasts when they used a multiregional model to prepare forecasts for the NUTS2 regions of the European Union. In a further innovation these researchers handled the huge computational task by employing parallel processing on a supercomputer. Limited past time series of data, however, forced the authors to use educated guesses in setting the predictive intervals for the input variables, and perfect correlation between regions was assumed for fertility, mortality, and

migration. Gullickson and Moen (2001) extended subnational probabilistic forecasting methods beyond just population numbers by forecasting population and emergency hospital admissions for a two-region system in Minnesota, using net migration rates because of data limitations. In another article Gullickson (2001) discusses many of the issues involved in placing multiregional population forecasts in a probabilistic framework, suggesting the use of log-linear models to simplify the forecasting of internal migration. In our view this approach holds much promise for multiregional forecasting and we hope to investigate it in a subsequent article. More recently Lee, Miller, and Edwards (2003) presented probabilistic population and fiscal forecasts for the state of California, but their approach employed a single-region model which forecast net interstate and net international migration. Focusing on just total population size, Smith and Tayman (2004) employed autoregressive integrated moving average (ARIMA) models to forecast the populations of selected U.S. states. Their research did not convince them that ARIMA models, at least in the form used in their experiments, could provide suitable predictive intervals for population forecasts.

A slightly different approach involves the production of point forecasts using standard deterministic models and then the application of predictive intervals based on past forecast errors. Smith and Sincich (1988) discovered sufficient temporal stability in forecast errors for U.S. states for them to be used as an indication of future forecast uncertainty. This is difficult, however, when few forecasts have been made.
Tayman, Schafer, and Carter (1998) overcame this limitation by sampling errors from one forecast period but a large number of small areas, thereby estimating the relationship between population size and forecast error, and thus providing a guide to forecast uncertainty.

There appears to be little other work on subnational probabilistic population forecasting. At one level this lack of attention is not surprising, as the addition of internal migration to the three variables that must be considered for national forecasts brings further challenges to an already complex modeling exercise. At the same time, the imperative for a probabilistic approach is more pressing because uncertainty rises as population size falls. Not only is there one extra variable to be considered; it is also the most volatile of the components of population change. Table 1 illustrates the extent to which forecast uncertainty is greater at the subnational scale by showing selected forecast errors from the Australian Bureau of Statistics (ABS) projections for Queensland and Australia as a whole. It is clear that forecasting Queensland's population has proved considerably more difficult than forecasting that of the whole nation.

This article sets out one approach to the production of subnational probabilistic population forecasts using a biregional projection framework. Next we set out the case for modeling gross rather than net migration flows in subnational probabilistic modeling. The probabilistic forecasting model is then specified and the processes used to set assumptions are described. The model was applied to Australia to generate 2002-based population forecasts for Queensland and the rest of Australia up to 2051, and selected results from this application are presented in the following

section. The article concludes by noting some challenging issues that need to be tackled in the further development of probabilistic population forecasting methods.

Table 1. Percentage Error of the ABS Principal Series† Population Projections

Jump-off year  Region       Percentage error after
                            5 years   10 years  15 years  20 years
1972           Queensland   4.74      8.26      9.04      12.53
               Australia    0.58      3.46      5.95      7.52
1978           Queensland   6.61      9.23      14.57     18.19
               Australia    1.58      2.14      2.67      11.24
1984           Queensland   3.57      6.72      8.11
               Australia    2.57      2.58      0.25
1987           Queensland   1.62      3.55      4.04
               Australia    1.24      0.46      1.31
1989           Queensland   0.53      1.10
               Australia    1.22      1.70
1993           Queensland   0.09      0.20
               Australia    1.68      1.31
1995           Queensland   2.43
               Australia    0.40
1997           Queensland   0.14
               Australia    0.34

Source: calculated from Australian Bureau of Statistics (ABS) population projections and estimated resident populations. Percentage error is calculated as (projection − actual)/actual × 100.
† Where a principal projection series was not officially assigned we used the series most likely to be designated the principal series by users.

The importance of modeling gross migration flows in probabilistic regional population forecasting

The population forecasts in this article were produced using a probabilistic version of the biregional cohort-component model (Isserman 1985; Rogers 1985; Wilson and Bell 2004b). In biregional models the world is divided into two regions internal to a country, plus the rest of the world. The defining feature of the biregional model lies in its handling of migration between internal regions as place-to-place flows rather than net migration. The specific version implemented for our forecasts also incorporates international migration as gross flows rather than net migration. But why was this model chosen, given that a simpler method of producing regional population projections would have involved a cohort-component model which just dealt with net migration?
Exactly the same model used for national probabilistic forecasts could have been implemented at the regional scale. This would have required less time, less input data, less assumption setting, and less computing time. The answer to this question lies in the advantages of modeling place-to-place

migration flows rather than net migration in probabilistic regional population forecasting. We identify three key points.

First, as Rogers (1990) and others have stressed, there is no such thing as a net migrant, only people moving from one place to another. Modeling net migration is akin to projecting natural change rather than fertility and mortality separately, and while, like natural change, net migration is a useful demographic measure, it is unsuitable for modeling. From a conceptual standpoint gross migration flows constitute a better representation of real-world demographic processes.

Second, forecasting interregional migration with origin–destination migration rates relates the volume of migration to the population at risk of migration in the origin region. Models incorporating net migration do not. This is not simply a conceptual refinement. Migration rates also have practical benefits: they avoid the nonsensical negative populations which can occur with net migration models if a fixed amount of net migration is subtracted from the population each year, emptying the population of its members before the end of the forecast horizon. In some deterministic forecasting applications, when net migration is always positive, the negative population problem will be avoided. But if net migration were modeled probabilistically, part of the net migration predictive distribution would often lie below zero. In such cases some net migration sample paths would remain negative for the whole forecast horizon, thereby generating negative populations in many situations.

The third problem concerns net migration age–sex profiles. Where population projections are produced using a net migration model, a fixed age–sex profile of net migration is usually assumed.
The total value of net migration may vary over the forecast horizon but the shape of the net migration age–sex profile remains fixed. This prevents the age–sex profile of migration from responding to the age–sex structure of the base population. A fixed net migration profile is also problematic when different levels of in- and out-migration are projected. While it is known that the shapes of in- and out-migration age–sex profiles exhibit considerable stability over time (Rogers, Raquillet, and Castro 1978), their levels do not. However, different levels of in- and out-migration can result in substantial variations in the shape of the net migration age–sex profile. Fig. 1 illustrates this point using model migration schedules fitted to annual average interstate migration flows to and from Queensland over the period 1997–2001. Graphs (a), (b), and (d) of Fig. 1 show different levels of in- and out-migration. The net migration age profiles shown in graph (c) result from the combination of the in-migration numbers in graph (a) with the out-migration figures in graph (b). Graph (e) shows the combination of the in- and out-migration in graphs (a) and (d), demonstrating that the same total net migration figure derived from different in- and out-migration levels may be associated with different net migration age profiles. In probabilistic population forecasts in which the levels of in- and out-migration fluctuate widely, the use of a fixed net migration age–sex profile is clearly a poor substitute for separate in- and out-migration profiles.
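The dependence of the net migration age profile on the levels of in- and out-migration can be checked with a few lines of arithmetic. The sketch below is only an illustration: the two bell-shaped age profiles are hypothetical stand-ins rather than the fitted model migration schedules, and the function names are invented for this example. The totals mirror the in- and out-migration levels shown in Fig. 1(a) and 1(d), each pair giving a net total of 10,000.

```python
import numpy as np

ages = np.arange(100)  # single years of age, 0-99

def bell_profile(peak_age, spread, baseline=0.2):
    """Hypothetical age profile (not a fitted schedule), scaled to sum to one."""
    raw = np.exp(-0.5 * ((ages - peak_age) / spread) ** 2) + baseline
    return raw / raw.sum()

# Assumed fixed shapes for in- and out-migration; only the levels vary.
in_shape = bell_profile(peak_age=25, spread=8)
out_shape = bell_profile(peak_age=22, spread=6)

def net_by_age(in_total, out_total):
    """Net migration by age = in-migration by age minus out-migration by age."""
    return in_total * in_shape - out_total * out_shape

# Three (in, out) pairs with the same net total of 10,000.
scenarios = [(120_000, 110_000), (100_000, 90_000), (80_000, 70_000)]
net_profiles = [net_by_age(i, o) for i, o in scenarios]

for (i, o), net in zip(scenarios, net_profiles):
    print(f"in={i:>7,} out={o:>7,} net total={net.sum():>8,.0f} "
          f"net at age 22={net[22]:>7,.0f}")
```

Although every scenario has the same net total, the net figure at any given age differs between scenarios, which is why a single fixed net profile cannot represent all three.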

Figure 1. The effect of different levels of in- and out-migration on the net migration age profile. Panels, each plotted against age (0–100): (a) in-migration at totals of 120,000, 100,000, and 80,000; (b) out-migration at totals of 80,000, 75,000, and 70,000; (c) net migration at totals of 40,000, 25,000, and 10,000; (d) out-migration at totals of 110,000, 90,000, and 70,000; (e) net migration at a total of 10,000 in each case. Source: model migration schedules fitted to Australian Bureau of Statistics Medicare-based data.

The forecasting model

Our population forecasts were produced using a probabilistic biregional cohort-component model. The operation of the cohort-component part of this model is well known. In this application we started with a June 30, 2002 jump-off population, forecast in 1-year steps up to June 30, 2051, and disaggregated the population by sex and single years of age from 0 to the highest attained age (with the model operating up to age 120). The methods of multiregional demography, which incorporate biregional models, are also widely described (Rees 1984; Rees and Wilson 1977; Rogers 1985, 1995). Similarly, the publications on probabilistic population forecasting now amount to a considerable literature from which methods and advice on probabilistic forecasting at the national scale can be obtained.

The challenges in the development of our model lay in drawing together these areas of literature. In extending national probabilistic methods to the subnational scale three challenges stood out in particular: (i) switching from a probabilistic single-region model to a biregional model to handle internal migration, (ii) the way in which regional correlations of variables should be handled, and (iii) dealing with the spatial operation of an international migration system. The following six steps summarize our approach to tackling these challenges.

Step 1
First, the population accounting framework was designed. Four types of international migration flow were distinguished: permanent immigration, permanent emigration, long-term immigration, and long-term emigration, definitions of which are given later in this section. The population accounting equation of the model can therefore be summarized as

$$
\begin{aligned}
P_i^{a+1}(t+1) = {} & P_i^{a}(t) - D_i^{a,a+1}(t,t+1) + IM_i^{a,a+1}(t,t+1) - OM_i^{a,a+1}(t,t+1) \\
& + PI_i^{a,a+1}(t,t+1) - PE_i^{a,a+1}(t,t+1) + LI_i^{a,a+1}(t,t+1) - LE_i^{a,a+1}(t,t+1)
\end{aligned}
\tag{1}
$$

where P denotes population; D is deaths; IM is interstate in-migration; OM is interstate out-migration; PI is permanent immigration; PE is permanent emigration; LI is long-term immigration; LE is long-term emigration; i represents region; t is June 30 of a given year; t, t+1 is the 1-year period from June 30 of one year to June 30 of the next; a is age group; and a, a+1 is the parallelogram period-cohort space on the Lexis diagram which over the t to t+1 interval shifts from age a to age a+1.

Step 2
The summary indicators of the demographic components of change were selected.
These are: the total fertility rate (TFR), life expectancy at birth ($e_0$), gross migraproduction rates (GMRs) for internal migration, total numbers for national permanent immigration, a proportion of the national permanent immigration total going to Queensland, a GMR for permanent emigration, and for long-term immigration and emigration a long-term migration average total. These are described in detail below.

Step 3
The models to generate the sample paths of the summary parameters were designed. The equations for these models are outlined below.

Step 4
The forecast assumptions were then formulated. Assumptions needed to be made in terms of the median and widths of the summary parameter forecast distributions, the parameters of the models used to generate the individual sample paths, and the degree of regional and sex correlation between variables.

Step 5
For each of the summary parameters 3000 sample paths covering the forecast horizon 2002–2051 were generated. For some summary parameters ceiling and floor limits were set to prevent sample paths attaining impossible or implausible values. Following the practice in Keilman, Pham, and Hetland (2002), any sample path which exceeded the specified limits was rejected and another generated. This continued until 3000 permissible sample paths had been created.

Step 6
Deterministic age and sex profiles were applied to these summary parameters to provide the age- and sex-specific rates and flows. The cohort-component model was then run 3000 times.

Details of the models used to generate sample paths of each of the summary parameters are now described.

Fertility and mortality

The TFR was used as the summary indicator for fertility and life expectancy at birth ($e_0$) for mortality. Separate TFRs were modeled for the two regions; $e_0$ simulations were generated for each region and each sex. Random walk with drift models were used to generate the TFR and $e_0$ sample paths, and thus the predictive distributions, that is

$$
v(t) = v(t-1) + e_v(t) + \mathrm{drift}_v(t) \tag{2}
$$

where $v$ denotes a summary parameter, $e_v$ a random error drawn from a normal distribution with mean zero, $\mathrm{drift}_v$ a value that fixes the median of the distribution to the specified median assumption, and $t$ a 1-year period of time. Because past data reveal regional correlations in fertility, and sex and regional correlations in mortality, correlated random errors were generated for the TFR and $e_0$ simulations using Cholesky decomposition (Press et al. 2001, pp. 89–91). Predictive distributions were obtained from the random walk model without prescribing any limits to the values of these parameters. The age-specific rates required by the cohort-component model were generated by applying fixed age schedules to the summary parameters.
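The sample-path generation described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `simulate_paths`, the jump-off values, the standard deviations, and the 0.8 correlation are all hypothetical, and the drift is assumed to equal the year-on-year change in the assumed median path, which keeps the median of a symmetric random walk on that path. The optional floor and ceiling follow the rejection rule of Step 5; the article applies no limits to the TFR and $e_0$.

```python
import numpy as np

def simulate_paths(start, median_path, sd, corr, n_paths, rng,
                   floor=None, ceiling=None):
    """Correlated random walks with drift for k series (e.g. two regional TFRs).

    start: (k,) jump-off values; median_path: (T, k) assumed medians;
    sd: (k,) annual error standard deviations; corr: (k, k) correlations.
    Returns an array of shape (n_paths, T, k).
    """
    start = np.asarray(start, dtype=float)
    median_path = np.asarray(median_path, dtype=float)
    sd = np.asarray(sd, dtype=float)
    cov = np.outer(sd, sd) * np.asarray(corr, dtype=float)
    chol = np.linalg.cholesky(cov)  # lower-triangular L with L @ L.T == cov
    # Assumption: drift(t) is the change in the assumed median from t-1 to t.
    drift = np.diff(np.vstack([start, median_path]), axis=0)
    n_years, k = median_path.shape
    accepted = []
    while len(accepted) < n_paths:
        z = rng.standard_normal((n_years, k))   # independent N(0, 1) draws
        errors = z @ chol.T                      # correlated errors e_v(t)
        path = start + np.cumsum(errors + drift, axis=0)  # equation (2)
        # Reject any path that breaches a limit and draw another (Step 5).
        if floor is not None and (path < floor).any():
            continue
        if ceiling is not None and (path > ceiling).any():
            continue
        accepted.append(path)
    return np.stack(accepted)

rng = np.random.default_rng(0)
n_years = 49  # annual steps from 2003 to 2051
median = np.tile([1.8, 1.7], (n_years, 1))  # hypothetical constant medians
paths = simulate_paths(start=[1.8, 1.7], median_path=median, sd=[0.03, 0.03],
                       corr=[[1.0, 0.8], [0.8, 1.0]], n_paths=3000, rng=rng)
lower, upper = np.percentile(paths[:, -1, 0], [16.7, 83.3])
print(f"Two-thirds interval for series 0 in the final year: {lower:.2f} to {upper:.2f}")
```

Predictive intervals are then read directly from the percentiles of the accepted paths in each year, as in the last two lines.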
For fertility, sets of graduated rates scaled to sum to unity were simply multiplied by the TFR to give age-specific fertility rates. To obtain age–sex-specific mortality rates an iterative algorithm was programmed to link each simulated $e_0$ value to an appropriate set of rates.

Internal migration

In a biregional model internal migration flows are forecast by multiplying age- and sex-specific origin–destination migration rates by the origin population at risk. Four sets of rates are therefore involved: migration from Queensland to the rest of Australia for males and females separately and migration in the opposite direction, also broken down by sex. The GMR was employed as the summary parameter for these internal migration rates. Being the sum of all the age-specific migration rates, this measure is analogous to the TFR. Unlike the TFR, however, there is no obvious

upper age at which to end the summation, and it is usually quite sensitive to the choice of terminal age (Rees et al. 2000). We calculated the GMR over ages 0–99. Given that our model was designed to use movement-type migration data rather than the transition data available from censuses (Rees and Willekens 1986), the past time series of GMRs were calculated using Medicare change-of-address data. To create the random GMR sample paths the same random walk with drift model as used for the TFR and $e_0$ was used. Floor and ceiling limits were set to prevent extremely low and high values. Age-specific rates were generated in the same way as for fertility, by multiplying the GMRs by age schedules scaled to sum to unity.

International migration

The separation of international migrants into the four categories listed in equation (1) is made possible by information collected on the arrival and departure cards that all passengers passing through airports and ports must complete. The rationale for modeling international migration in these categories lies in marked differences in the demographic characteristics of permanent and long-term migration, both in terms of long-run trend and age–sex migration profiles. Classification into permanent and long-term categories is based on self-reported status of intended stay and visa category. Permanent immigrants are identified as arrivals holding migrant visas, regardless of stated period of intended stay. Permanent emigrants consist of permanent residents who on departure state that they intend to settle permanently in another country. Long-term arrivals (departures) are defined as Australian residents returning after (leaving for) a year or more overseas, together with overseas visitors stating an intention to spend a year or more in Australia (having spent a year or more in Australia). Many of the latter are temporary skilled workers and foreign students.
New Zealand citizens receive special treatment in the migration figures. They may live and work in Australia without being required to obtain permanent residence status, and are classified as either permanent or long-term migrants depending on what they state on their arrival or departure cards.

Different approaches were used to model the international migration flows to reflect the different characteristics of those flows. Because the level of permanent immigration is controlled to a large extent by Australian government policies, a trend-plus-error model was chosen to represent this type of migration, that is

$$
PI(t) = PT(t) + e_{PT}(t) + e_{JO} \tag{3}
$$

where $PI$ denotes the total number of permanent immigrants, $PT$ the set permanent immigration trend, $e_{PT}$ the fluctuation around that trend, and $e_{JO}$ a jump-off year error. The trend part of the model was designed to capture the policy influences, while the $e_{PT}$ error reflects delays between visa issue and actual date of arrival, plus fluctuations in permanent immigration from New Zealand. The $e_{PT}$ were modeled as a random walk. The third term on the right-hand side of equation (3) is a jump-off year error (with a mean of zero and a guesstimated standard deviation), included because international migration data in Australia are thought to have suffered from a

Geographical Analysis
number of problems in recent years (McDonald, Khoo, and Kippen 2003; Australian Bureau of Statistics 2003a).

The permanent immigration model of equation (3) was applied only at the national scale. Distribution to the two regions was achieved by assigning Queensland a proportion of the national immigration flow based on its share of the national population, with the remainder being allocated to the rest of Australia. A strong linear relationship was found to exist between Queensland's share of Australia's population and its share of permanent immigration ($r = 0.86$). To avoid iterative calculations this relationship was modeled using the previous year's population share:

$$\hat{Q}(t) = a + b\,\frac{P_{Qld}(t-1)}{P_{Aus}(t-1)} \tag{4}$$

where $\hat{Q}$ denotes the proportion of national permanent immigration allocated to Queensland as implied by the linear model; $a$ and $b$ are intercept and slope parameters, respectively; and $P$ represents total population size for Queensland (Qld) and Australia (Aus). As this relationship does not hold precisely, the actual share of national permanent immigration was modeled as a function of the share implied by the linear regression plus a random error component, that is

$$Q(t) = \hat{Q}(t) + e_{Q}(t) \tag{5}$$

where $Q$ is the proportion of national permanent immigration allocated to Queensland. The error values $e_Q$ were estimated from the linear regression and modeled as a random walk.

Permanent emigration GMR sample paths were produced in a simple fashion using the random walk with drift model as described in equation (2) but with an added jump-off year error to reflect uncertainty over recent migration figures. This jump-off year error distribution was assumed to be normal with a mean of zero and guesstimated standard deviation. Ceiling and floor limits were applied to prevent the negative and extremely high values that would otherwise result from the large year on year GMR differences.
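The regional allocation in equations (4) and (5) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the parameter values for $a$, $b$ and the error standard deviation, the lagged population shares, and the national immigration total are all hypothetical, and the error increments are assumed to be normal. In the full model the lagged population share would itself differ between simulations, whereas here a single fixed series is used for simplicity.

```python
import numpy as np

def simulate_queensland_share(pop_share_lagged, a, b, sigma_e, n_paths, seed=0):
    """Sample paths of Queensland's share of national permanent immigration.

    Equation (4): Q_hat(t) = a + b * P_Qld(t-1) / P_Aus(t-1)
    Equation (5): Q(t) = Q_hat(t) + e_Q(t), with e_Q a driftless random walk.
    Normal increments are an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    share = np.asarray(pop_share_lagged, dtype=float)
    q_hat = a + b * share                                  # shape (T,)
    steps = rng.normal(0.0, sigma_e, size=(n_paths, share.size))
    e_q = np.cumsum(steps, axis=1)                         # random-walk error
    return q_hat + e_q                                     # shape (n_paths, T)

# Hypothetical inputs, for illustration only.
lagged_share = [0.189, 0.191, 0.193]   # P_Qld(t-1) / P_Aus(t-1)
paths = simulate_queensland_share(lagged_share, a=-0.05, b=1.2,
                                  sigma_e=0.005, n_paths=3000)

national_pi = 90_000                   # hypothetical national permanent immigration
qld_pi = paths * national_pi           # allocated to Queensland
rest_pi = national_pi - qld_pi         # remainder to the rest of Australia
```

Because the rest of Australia receives the remainder, the two regional flows always sum to the national figure in every simulation.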
For long-term immigration and emigration a different model seemed appropriate. Because long-term migration is by definition nonpermanent, long-term immigrants should, after an interval of a few years, become long-term emigrants, and vice versa. When averaged over several years, then, long-term immigration and emigration at the national level should be roughly similar. At the subnational scale, internal migration of these long-term international migrants provides the potential for greater separation in the long-run average values of long-term immigration and emigration, but past data reveal a reasonably close association for the two regions in question. These past trends suggested the following two-stage model of long-term migration. First, long-term immigration and emigration were conceptualized as a single long-term immigration and emigration average, modeled as a random walk with drift:

$$LT(t) = LT(t-1) + e_{LT}(t) + \mathit{drift}_{LT}(t) + e_{JO} \tag{6}$$

where $LT$ denotes long-term migration and $e_{JO}$ a jump-off year error to account for recent uncertainty surrounding the true values of long-term migration. As before, it was assumed to be normally distributed with a mean of zero and guesstimated standard deviation. Second, separate long-term immigration and emigration sample paths were simulated as the product of this long-term average value and a randomly fluctuating scaling factor, that is

$$LI(t) = LT(t)\,[1 + f(t)], \qquad LE(t) = LT(t)\,[1 - f(t)] \tag{7}$$

where $LI$ and $LE$ denote long-term immigration and long-term emigration, respectively, and $f$ is a scaling factor modeled as a random walk within specified limits.

Once 3000 simulations of the summary parameters of international migration had been generated, age–sex-specific rates and flows were prepared by multiplying each parameter by a set of rates (for permanent emigration) or a set of proportions (for the other three international migration flows), which had been smoothed and scaled to unity.

Forecast assumptions

For the model to produce the forecast distributions of the summary indicators of demographic change, four sets of variables had to be specified:

(i) a trajectory describing the most likely future for each of the summary demographic indicators. These formed the medians of the forecast distributions;
(ii) the distributions of the year on year differences in the summary indicators;
(iii) the degree of correlation between variables;
(iv) floor and ceiling limits to prevent very unlikely or nonsensical values being forecast.

The values adopted for each indicator are discussed in the text below and a selection of the predictive distributions is shown.

Fertility

Both past trends and theory (McDonald 2003) suggest that Australia's TFR will continue to fall in coming years. Queensland's TFR has traditionally been very slightly above that of the rest of Australia.
In these forecasts the most recently observed TFR for Queensland (1.78 in 2001–2002) and the rest of Australia (1.72) are assumed to decline gradually to 1.60 and 1.58, respectively, in 20 years' time. These assumptions are consistent with the most recent Queensland Government medium series population projections (Queensland Government 2003; Wilson et al. 2004). Taking the view that the era of below-replacement fertility with small annual fluctuations in the TFR is likely to continue, the distribution of year on year TFR differences observed as Australian fertility fell below replacement level was

assumed to continue. The standard deviations from this period are 0.046 for Queensland and 0.038 for the rest of Australia. The correlation between the two regions for the year on year TFR differences was found to be quite low over the 1971–2001 period at 0.4. This level of correlation was assumed for the forecasts. The forecast distribution for Queensland's TFR is shown in Fig. 2.

Figure 2. Queensland's recorded total fertility rate and predictive distribution (median with 67%, 80%, and 95% predictive intervals; estimates and forecasts, 1970-71 to 2050-51). Source of estimates: Australian Bureau of Statistics.

Mortality

Although Australia is one of the lowest mortality countries in the world, increases in its life expectancy at birth have been substantial in recent decades (Australian Bureau of Statistics 2002), and our forecasts are based on the assumption that this trend will continue. The medians of the life expectancy distributions were obtained by extrapolating from mortality rates for the 1971–2001 period (earlier data not being readily available). The rate of mortality change over this period was then smoothed, including aggressive smoothing to change the slight increase in mortality among young adult men to a slight decrease. Mortality rates were then forecast holding these rates of change constant, and the resulting $e_0$ values were obtained from life tables. A very slight adjustment was then made to bring the forecast $e_0$ figures in line with the recent Queensland Government medium series forecasts, which for Queensland are 87.7 years for men and 90.0 years for women by 2050–2051 (and almost exactly the same for the rest of Australia). These figures are more optimistic than the ABS medium series projections of 84.0 years for men and 87.7 years for women for the same period (Australian Bureau of Statistics 2003b).
The use of the Queensland Government forecasts rather than those of the ABS is supported by the fact that every national-level medium series $e_0$ forecast produced by the ABS over the last 30 years has proved too low from 5 years into the forecast horizon.

The year on year variability in $e_0$ is small, with the standard deviation of Queensland's $e_0$ year on year variation being 0.52 years for males and 0.45 years for females (and 0.28 and 0.27 years for males and females, respectively, for the rest of Australia). Strong correlations between male and female year on year $e_0$ differences are evident from the past data for both regions and between the regions. This is as expected given that period factors which affect mortality are likely to be national in effect and will therefore influence both men and women and both Queensland and the rest of the country. These year on year variations and correlations were assumed to continue into the future. The forecast distribution for female life expectancy at birth in Queensland is presented in Fig. 3.

Figure 3. The observed female $e_0$ in Queensland and predictive distribution (life expectancy at birth; median with 67%, 80%, and 95% predictive intervals; estimates and forecasts, 1970-71 to 2050-51). Source of estimates: calculated from Australian Bureau of Statistics data.

Internal migration

Past data show that rates of migration from Queensland to the rest of Australia have increased modestly over past decades, and rates of migration to Queensland have increased more substantially (see Fig. 4), with the intensity of both migration flows fluctuating considerably. The simplest method of forecasting the median of the internal migration GMR distributions would be to fit a linear regression and extrapolate, but internal migration is a complex phenomenon. Although many causes and explanatory models have been identified (Bell and Newton 1996; Champion et al. 1998; Champion et al. 2002), it remains insufficiently understood for this knowledge to be translated into reliable forecasts.
While a number of socio-economic changes would point to higher mobility in the future (rising incomes, continued professionalisation of the workforce, an increasingly flexible labor market, and wider experiences of places through travel), others suggest a possible dampening of the current upward mobility trend. Such developments include rising housing and higher education costs, a greater preponderance of two-earner households,

increased commuting fields facilitated by transport developments, and increasingly similar regional economies which reduce spatial mismatches of labor and therefore labor migration. For our median GMR distributions we decided to use a linear extrapolation of the more modestly rising GMR for migration from Queensland to the rest of the country and an increase in the rest of Australia to Queensland GMR set equal to half that of a linear extrapolation in order to account for possible diseconomies of a rapidly growing Queensland. However, in order to account for futures in which the intensity of migration increases significantly, or decreases, wide ceiling and floor values were set for the GMRs. The floor values were set at half the lowest recorded GMR of the last 30 years and the ceiling values the same distance above the assumed median as the floor values were below it. While arbitrary, these limits were needed to prevent GMRs from attaining values well below those recorded in the available time series of past data (including negative values) and from being assigned values which are not only much higher than in the past but would imply a phenomenal, and to us implausible, increase in the intensity of Australian internal migration. The year on year GMR differences and correlations from the 1971 to 2001 period were assumed to continue. The predictive distribution of the GMR for male migration from the rest of Australia to Queensland is shown in Fig. 4.

Figure 4. The observed GMR for male migration from the rest of Australia to Queensland and predictive distribution (Gross Migraproduction Rate; median with 67%, 80%, and 95% predictive intervals; estimates and forecasts, 1970-71 to 2050-51). Source of estimates: calculated from Australian Bureau of Statistics data.
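The floor and ceiling rule described for the internal migration GMRs, combined with a random walk with drift, might be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the function names and all input values are hypothetical, the year on year differences are assumed to be normal and uncorrelated between series, the drift is taken to be the annual change in the assumed median path, and the limits are enforced by simple clipping (the text does not say how they were applied).

```python
import numpy as np

def gmr_limits(past_gmr, median_path):
    """Floor: half the lowest recorded GMR.
    Ceiling: as far above the assumed median as the floor lies below it."""
    median_path = np.asarray(median_path, dtype=float)
    floor = 0.5 * float(np.min(past_gmr))
    ceiling = median_path + (median_path - floor)
    return floor, ceiling

def simulate_gmr(last_obs, median_path, sd, floor, ceiling, n_paths, seed=0):
    """Random walk with drift, clipped to [floor, ceiling] each year."""
    rng = np.random.default_rng(seed)
    median_path = np.asarray(median_path, dtype=float)
    # Assumed drift: the annual increment of the median trajectory.
    drift = np.diff(np.concatenate(([last_obs], median_path)))
    paths = np.empty((n_paths, median_path.size))
    current = np.full(n_paths, float(last_obs))
    for t in range(median_path.size):
        current = current + drift[t] + rng.normal(0.0, sd, n_paths)
        current = np.clip(current, floor, ceiling[t])   # assumed clipping rule
        paths[:, t] = current
    return paths

# Hypothetical past GMRs and median trajectory, for illustration only.
past = [0.6, 0.8, 1.0]
median = np.array([1.00, 1.02, 1.04])
floor, ceiling = gmr_limits(past, median)
paths = simulate_gmr(last_obs=1.0, median_path=median, sd=0.1,
                     floor=floor, ceiling=ceiling, n_paths=3000, seed=1)
```

With these hypothetical inputs the floor is 0.3 and the ceiling rises from 1.70 to 1.78, so the limits are symmetric about the median in each forecast year, as the text describes.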
International migration

Permanent immigration data show a roughly stable trend over the past 30 years with large fluctuations around this trend. Will this trend continue roughly at the same level, or will the increasing size of the Australian population and economy (along with a likely increase in permanent emigration numbers) lead the government to gradually increase the number of visas allocated in the Migration and

Humanitarian Programs? Our position is that a modest increase is probable, and a rise of 500 per year is assumed for the median of the national immigration assumption. The precise value of this annual increment was chosen so that, together with the other international migration assumptions, the median of the Australia-wide net international migration distribution was 100,000 throughout the forecast horizon. Ceiling and floor limits were applied to the $e_{PT}$ error term in equation (3) to prevent excessive values. The past errors of permanent immigration from a linear regression of these values were calculated, and the ceiling and floor limits were set at three standard deviations of these errors either side of the forecast median trend.

Permanent emigration GMRs were assumed to continue their upward trends. The medians of the predictive distributions were calculated from linear regressions. Ceiling and floor values were set in the same manner as for the internal migration GMRs, with the lower bound defined as half the lowest ever recorded GMR, and the upper limit as the set median value plus the difference between the median and floor limit. The standard deviations and regional and sex correlations between the year on year GMR errors observed over the 1971–2001 period were assumed to continue.

Setting assumptions for long-term international migration proved particularly challenging due to the unreliability of recent data, both in terms of its numbers and its regional allocation. Because long-term immigration figures for recent years show explosive and implausible growth, we decided to set the median of the long-term migration predictive distribution from linear extrapolations of the long-term emigration figures. These suggest $\mathit{drift}_{LT}$ values of 650 per year for Queensland and 3000 for the rest of Australia.
The unreliability of the recent figures was taken into account by random jump-off values, the mean and standard deviation of which could only be educated guesses. However, the uncertainty surrounding the jump-off value is probably not too important given that equations (6) and (7) ensure a close association between long-term immigration and emigration. Considerable uncertainty also exists over the future of long-term immigration to Australia. It is not known whether the upward trend observed over the last couple of decades is the start of a long-run increase or a logistic curve-shaped shift to a higher level (Hugo 1999). For this reason, and because no negative values were generated, ceiling and floor limits were not applied to the long-term migration trend simulations (equation [6]).

Results

Total populations

Fig. 5 shows the forecast distribution of Queensland's population from 2002 to 2051. The median of the distribution reaches 5.3 million by 2026 and 6.5 million by 2051, by which time the two-thirds predictive interval ranges from 5.4 to 7.7 million. Summary statistics on the total population predictive distributions for Queensland, the rest of Australia, and Australia as a whole are given in Table 2.

Figure 5. Estimated total population of Queensland and forecast distribution (population in millions; median with 67%, 80%, and 95% predictive intervals; estimates and forecasts, 1971 to 2051). Source of estimates: Australian Bureau of Statistics.

The uncertainty of Queensland's demographic future compared with that of the country as a whole may be assessed by the relative interdecile range. This measure, described by Lutz, Sanderson, and Scherbov (2004), is defined as the width of the 80% predictive interval divided by the median. As expected, it can be seen that Queensland's demographic future is relatively more uncertain than that of the larger region which forms the rest of the country, which in turn has a more uncertain demographic future than Australia as a whole.

How does the predictive distribution for Queensland compare with the forecast errors of ABS projections as reported in Table 1? Suppose that two population forecasts for Queensland followed the upper and lower 95% bounds of the predictive distribution while the actual population trajectory turned out to follow the median. After 10 years the lower population of 4.06 million would represent a 7.9% error while the higher forecast of 4.74 million would be in error by 7.6%. The 15-year forecast errors would be 12.4% and 12.1%, respectively. The 95% predictive intervals therefore incorporate most of the past ABS forecast errors, and those excluded are not too far off. If it is accepted that forecasting Queensland's population remains as difficult as in the past, then these figures indicate, very roughly, that our predictive intervals are about right.

It is also of interest to briefly compare our total population forecast distribution with the latest sets of projections produced by the Australian Bureau of Statistics (2003b) and the Queensland Government (2003).
Table 3 contains figures for selected years from these two sets of projections and the probabilities covered by the high–low ranges as implied by our probabilistic forecasts. It can be seen that, while both medium projections are very close to the median of our forecast distribution, the ABS high–low range covers a slightly wider range of possible population futures than the Queensland Government projections. As mentioned in the introduction to the article, one important advantage of probabilistic population forecasts is that they are

Table 2. Summary Statistics of the Forecast Distributions for Total Population (in millions)

| | 2002 | 2011 | 2026 | 2051 |
|---|---|---|---|---|
| Queensland | | | | |
| Lower 95% | | 4.04 | 4.29 | 4.36 |
| Lower 80% | | 4.14 | 4.63 | 5.03 |
| Lower 67% | | 4.19 | 4.78 | 5.40 |
| Median | 3.71 | 4.34 | 5.31 | 6.53 |
| Upper 67% | | 4.49 | 5.81 | 7.74 |
| Upper 80% | | 4.54 | 5.99 | 8.12 |
| Upper 95% | | 4.64 | 6.32 | 8.98 |
| RIDR | | 0.092 | 0.255 | 0.473 |
| Rest of Australia | | | | |
| Lower 95% | | 16.42 | 16.82 | 16.70 |
| Lower 80% | | 16.65 | 17.43 | 17.96 |
| Lower 67% | | 16.75 | 17.73 | 18.56 |
| Median | 15.93 | 17.07 | 18.67 | 20.56 |
| Upper 67% | | 17.38 | 19.63 | 22.68 |
| Upper 80% | | 17.48 | 19.93 | 23.40 |
| Upper 95% | | 17.72 | 20.66 | 24.96 |
| RIDR | | 0.049 | 0.134 | 0.264 |
| Australia | | | | |
| Lower 95% | | 20.71 | 21.89 | 22.77 |
| Lower 80% | | 20.95 | 22.66 | 24.29 |
| Lower 67% | | 21.07 | 22.98 | 24.90 |
| Median | 19.64 | 21.40 | 23.99 | 27.15 |
| Upper 67% | | 21.76 | 25.00 | 29.44 |
| Upper 80% | | 21.87 | 25.35 | 30.17 |
| Upper 95% | | 22.11 | 26.02 | 31.95 |
| RIDR | | 0.043 | 0.112 | 0.217 |

NOTE: Source of 2002 figures: Australian Bureau of Statistics. The relative interdecile range (RIDR) is the range between the first and ninth deciles divided by the median.

Table 3. ABS and Queensland Government Projections (in millions) for Queensland

| | 2011 | 2026 | 2051 |
|---|---|---|---|
| ABS projections | | | |
| Series A | 4.50 | 5.88 | 8.09 |
| Series B | 4.43 | 5.31 | 6.43 |
| Series C | 4.17 | 4.76 | 5.17 |
| High–low probability interval (%) | 72 | 71 | 77 |
| Queensland Government projections | | | |
| High series | 4.44 | 5.69 | 7.77 |
| Medium series | 4.35 | 5.29 | 6.47 |
| Low series | 4.28 | 4.91 | 5.28 |
| High–low probability interval (%) | 41 | 53 | 70 |

NOTE: Source: Australian Bureau of Statistics (2003b) and Queensland Government (2003).
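As a worked illustration of the relative interdecile range defined in the note to Table 2, the short Python sketch below recomputes the Queensland RIDR values from the published lower and upper 80% bounds (the first and ninth deciles) and the medians. The function names are hypothetical, and because the inputs are the rounded table values, a recomputed figure may differ from the published one in the third decimal place. A second helper shows how the same measure could be obtained directly from a set of simulated values, assuming they are held in an array.

```python
import numpy as np

def ridr(lower80, upper80, median):
    """Relative interdecile range: (ninth decile - first decile) / median."""
    return (upper80 - lower80) / median

def ridr_from_samples(samples):
    """Same measure computed from simulated outcomes for one forecast year."""
    d1, med, d9 = np.quantile(np.asarray(samples, dtype=float), [0.1, 0.5, 0.9])
    return (d9 - d1) / med

# Queensland rows of Table 2: (lower 80%, upper 80%, median), in millions.
table2_qld = {
    2011: (4.14, 4.54, 4.34),
    2026: (4.63, 5.99, 5.31),
    2051: (5.03, 8.12, 6.53),
}

for year, (lo, hi, med) in table2_qld.items():
    print(year, round(ridr(lo, hi, med), 3))
```

The widening of the ratio across the three years mirrors the growth in forecast uncertainty over the horizon that the table reports.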

always probabilistically consistent between variables and over time. Table 3 provides a simple illustration of how pairs of deterministic projection variants cannot be meaningfully assigned probability ranges because these probabilities vary over time.

Figure 6. The evolution of forecast uncertainty surrounding Queensland's age–sex structure: (a) 2002, (b) 2011, (c) 2021, (d) 2031 (population pyramids by age, males and females, population in thousands; median with 67% and 95% predictive intervals). Source of 2002 data: Australian Bureau of Statistics.

Age–sex structure

For many users of population forecasts the age–sex structure is as important as, or more important than, total numbers. The forecast distributions of Queensland's age–sex structure for the years 2011, 2021, and 2031 are illustrated in Fig. 6. These provide snapshots of the uneven way in which uncertainty unfolds across the age groups over time. In the first few years of the forecasts the fairly high uncertainty over the numbers of births translates into relatively wide predictive intervals in the youngest childhood age groups (Fig. 6b). The similar width intervals in the 20s, 30s, and early 40s reflect migration uncertainty in the peak migration ages of the 20s and early 30s, which then propagates up the population pyramid over the years

2002–2011. By 2021 this migration uncertainty has traveled to higher ages and the uncertainty over mortality starts to become evident in the elderly ages (Fig. 6c). By around the fourth decade into the forecast horizon the effects of fertility and migration have existed for long enough to reduce the variation in the predictive intervals for most childhood and adult ages. Beyond this time there is little change to the shape of the predictive distributions as they continue to widen (so their age–sex structures are not shown).

Figure 7. Estimated elderly dependency ratio of Queensland and forecast distribution (median with 67%, 80%, and 95% predictive intervals; estimates and forecasts, 1971 to 2051). Source of estimates: Australian Bureau of Statistics.

Population ageing is one of the dominant themes in Australian and international demography today and will continue to be so in the coming decades. One indicator of the impact of this ageing (albeit a fairly crude one) is the elderly dependency ratio (EDR), defined here as the population aged 65 and over divided by the population aged 20–64. Fig. 7 presents the forecast distribution of the EDR for Queensland, indicating that the modest upward movements in this measure experienced over the last three decades will soon change. The forecasts suggest a two-thirds probability that the 2051 EDR will lie between 0.427 and 0.534, a substantial increase over the current figure of 0.196. The forecasts also demonstrate how deterministic variant projections understate the degree of uncertainty surrounding the future of the EDR. The 2002-based ABS projections indicate 2051 EDRs of 0.538 (series A), 0.500 (series B), and 0.560 (series C), giving a high–low range of 0.060.
This covers just 24% of the possible outcomes in our probabilistic forecasts, much smaller than the probability interval covered by the ABS high and low total population projections, again indicating the probabilistic inconsistency of a deterministic variants approach.

Components of change

Examining the forecast demographic components of change helps to shed light on the forecast direction of total population change and the widths of its predictive

distribution. Summary figures of the predictive distributions for births, deaths, net internal migration and net international migration for Queensland are given in Table 4.

Table 4. Summary Statistics of the Forecast Distributions (in thousands) for Selected Demographic Components of Change

| | 2001–2002 | 2010–2011 | 2025–2026 | 2050–2051 |
|---|---|---|---|---|
| Births | | | | |
| Lower 95% | | 40.0 | 35.4 | 27.9 |
| Lower 80% | | 42.9 | 40.9 | 36.4 |
| Lower 67% | | 44.3 | 43.6 | 41.1 |
| Median | 47.7 | 48.7 | 52.7 | 56.6 |
| Upper 67% | | 53.2 | 62.4 | 76.9 |
| Upper 80% | | 54.5 | 66.1 | 82.9 |
| Upper 95% | | 57.6 | 72.9 | 100.9 |
| Deaths | | | | |
| Lower 95% | | 22.9 | 28.6 | 40.8 |
| Lower 80% | | 24.8 | 32.0 | 46.2 |
| Lower 67% | | 25.6 | 33.3 | 48.7 |
| Median | 23.3 | 28.3 | 38.3 | 58.3 |
| Upper 67% | | 31.0 | 43.7 | 68.8 |
| Upper 80% | | 32.0 | 45.6 | 72.3 |
| Upper 95% | | 33.9 | 49.8 | 80.3 |
| Net internal migration | | | | |
| Lower 95% | | -12.4 | -25.1 | -39.7 |
| Lower 80% | | 2.1 | -8.0 | -18.0 |
| Lower 67% | | 10.2 | 0.9 | 7.6 |
| Median | 29.0 | 32.1 | 28.8 | 23.9 |
| Upper 67% | | 55.2 | 55.6 | 58.2 |
| Upper 80% | | 61.6 | 65.3 | 68.1 |
| Upper 95% | | 75.0 | 84.8 | 93.0 |
| Net international migration | | | | |
| Lower 95% | | 1.9 | -3.9 | -10.3 |
| Lower 80% | | 3.5 | 2.4 | 1.7 |
| Lower 67% | | 6.5 | 5.6 | 2.7 |
| Median | 28.7 | 15.2 | 16.5 | 16.8 |
| Upper 67% | | 24.5 | 28.7 | 33.3 |
| Upper 80% | | 27.4 | 32.6 | 38.3 |
| Upper 95% | | 34.4 | 40.9 | 49.7 |

NOTE: Source of 2001–2002 figures: Australian Bureau of Statistics.

The wide forecast intervals for births reflect cumulative uncertainty over the generations. Not only is there uncertainty over the TFR, but from around 2020 onwards some births are being produced by mothers whose generation size is also uncertain because they themselves were born after 2002. In contrast, the forecast distributions for deaths are narrower. Partly this is due to relatively narrow forecast