Overall Methodology

National projections are usually done with what is called the cohort-component model, in which the initial population is carried forward into the future by adding new births, subtracting deaths, adding people moving into the country (immigrants), and subtracting people moving out (emigrants). The model used for the Pew projections is a variant of the cohort-component method modified to incorporate immigrant generations by Edmonston and Passel (1992). In this application, five generation groups for U.S. residents are defined: (1) the foreign-born population, or the first generation; (2) the Puerto Rican-born population8; (3) the U.S.-born population of foreign (or mixed) parentage, or the second generation; (4) the U.S.-born population of Puerto Rican (or mixed) parents; and (5) the U.S.-born population of U.S.-born parents. In simplifying to three generations, the third-and-higher generations group is defined as U.S. natives born to U.S.-native parents and is the sum of groups 2, 4 and 5.

In the projection methodology, each of the five generation groups is carried forward separately. Immigrants and emigrants enter and leave the first generation; migrants from Puerto Rico enter (and leave) the Puerto Rican-born population.9 Births are assigned to generations based on the generation of the mother and a matrix allowing for cross-generational fertility. All births to first-generation women are assigned to the second generation; all births to the Puerto Rican-born population are assigned to the Puerto Rican parentage population. Most births to the second and third-and-higher generations are assigned to the third-and-higher generations, but some are assigned to the second generation to allow for mixed generation couples that include immigrants. Likewise, most births to women of Puerto Rican parentage are assigned to the third-and-higher generation, but some are assigned to the Puerto Rican parentage population to allow for mixed couples including Puerto Rican-born migrants. The generational assignment matrix (or G-matrix) is based on race/ethnic origin but is allowed to vary dynamically based on relative generational sizes.

For these projections, the entire population is divided into five mutually exclusive racial/ethnic groups: Hispanic origin; white, not Hispanic; black, not Hispanic; Asian and Pacific Islander, not Hispanic; and American Indian/Alaska Native, not Hispanic. The report also includes a historical analysis using data developed with the same projection methods and estimates of the demographic components. The components are estimated so as to reproduce as closely as possible the decennial censuses from 1960 to 2000 and the 2005 population by age, sex, race/Hispanic origin and generation. The projections and historical analyses use five-year age groups up to 85 years and older by sex. The projections are done for five-year time steps from April 1, 2005 to April 1, 2050.
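The bookkeeping behind a single five-year step can be sketched as follows. This is a minimal illustration under stated assumptions, not the Pew model itself: the function name, the three-group example and every number are hypothetical, and the generation, sex and race/ethnicity dimensions are left out.

```python
def project_one_step(pop, survival, net_migrants, births):
    """Advance a population by one five-year step (illustrative sketch).

    pop          -- counts by five-year age group, youngest first;
                    the last group is open-ended (for example, 85 and older)
    survival     -- share of each age group surviving the five-year step
    net_migrants -- net migrants added to each age group at the end of the step
    births       -- surviving births during the step (the new youngest group)
    """
    n = len(pop)
    aged = [0.0] * n
    aged[0] = float(births)
    # With five-year age groups and five-year steps, survivors of each
    # group move up exactly one group.
    for a in range(n - 1):
        aged[a + 1] += pop[a] * survival[a]
    # Survivors of the open-ended group stay in it.
    aged[-1] += pop[-1] * survival[-1]
    return [p + m for p, m in zip(aged, net_migrants)]


# Made-up inputs with only three age groups, to keep the arithmetic visible.
example = project_one_step(
    pop=[100.0, 100.0, 100.0],
    survival=[1.0, 0.9, 0.5],
    net_migrants=[0.0, 10.0, 0.0],
    births=50.0,
)
print(example)  # [50.0, 110.0, 140.0]
```

In the full method, a step of this kind would be repeated for each generation group, sex and racial/ethnic group, with births routed between generations.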

The remainder of this section describes the underlying data and assumptions for the projections and historical analyses. The first section treats the assumptions for the major demographic components of immigration, fertility and mortality for 2005–2050, with a particular emphasis on immigration. The next section presents the methods for defining and measuring the racial/ethnic groupings and the generational groups. Within each section, the data and the methods used to define the historical population and components of change are described. A more detailed treatment of the overall methods and especially the historical analysis is available (Passel, 2004).

Demographic Components

Demographic components of population change account for all additions to and subtractions from the national U.S. population. Births and deaths are the largest of the components, but measurement of immigration is far more complicated because there are multiple channels of entry to and exit from the U.S. population. For some of the components, such as legal immigration, the available data are better, and accurate measurement is easier than for others, such as unauthorized or illegal immigration. The measurement methods differ among the immigration components, in part because of the nature of the data and in part because some of the immigration concepts dictate particular methods.

The demographic components included in the population projection model are:

  • Births (or fertility rates)
  • Deaths (or mortality rates)
  • Immigration
    • Legal Immigration (including refugees and legalized aliens)
    • Net Undocumented Immigration
    • Emigration
    • Net Movement from Puerto Rico
    • Other Immigration Components

Immigration

The immigration assumptions are critical for both the prospective projections and the historical analyses. Immigration increased substantially over the 1960–2000 period with particularly large increases for Hispanics and Asians. The rapid growth in the Hispanic population is attributable principally to immigrants and their descendants.

Background

Immigration has been the most difficult demographic component to forecast in the last several decades. It is directly affected by national policies and other events in ways that fertility and mortality are not. Although many of the social and economic factors affecting migration trends are reasonably well known, no broadly accepted theoretical framework can be readily applied in a projections framework (Howe and Jackson, 2005). Further, this component has proved quite difficult to measure even in historical analyses (Passel, 2001; Robinson, 2001).

Total immigration (legal plus unauthorized) has shown a steady upward trend since the 1930s, regardless of the period measured—the last 25, 50 or 75 years. Annual immigration has grown by about 4% per year since the early 1950s to the point where average gross annual immigration for 1995–2005 exceeded 1.6 million. Even with a significant drop during 2002–2004, possibly due to an economic slowdown and heightened security concerns, total immigration for 2000–2005 averaged almost 1.5 million per year (Passel and Suro, 2005). (Figure A1)

During the long period of immigration growth, immigration increased more rapidly than the total population, which averaged 1.2% annual growth over the entire period. As a result, the migration rate (defined as number of immigrants during a period per 1,000 population at the beginning) increased steadily. For 1930–1945, the migration rate was only 0.5 per 1,000, but by 1990–2005 it exceeded six immigrants per 1,000 population.

Growth in the scale of immigration can be linked to a number of factors. Major legislative changes in 1965, 1976, 1980 and 1990 greatly expanded legal immigration.10 Unauthorized migration began to increase in the 1970s as a result of, among other factors, the United States ending a key temporary worker program with Mexico, changes in Mexican society and its economy, and increased entry to the U.S. by foreign nationals as temporary residents or tourists. Large and increasing numbers of unauthorized migrants began to settle permanently in the U.S. with their families, replacing what in previous decades had largely been circular temporary migration from Mexico. The buildup of a significant undocumented population led to the passage in 1986 of the Immigration Reform and Control Act (IRCA). This law had two major provisions—one that legalized about 2.6 million formerly illegal immigrants and a second that made it illegal for employers knowingly to hire unauthorized migrants. By the mid-1990s, however, the unauthorized population was growing rapidly. In 2006, an estimated 11.5 million to 12 million illegal immigrants were living in the United States, compared with 3 million to 5 million at the passage of IRCA 20 years earlier (Passel, 2006).

As the U.S. economy expanded rapidly in the late 1990s, large numbers of workers arrived, some as legal immigrants, many as unauthorized migrants and increasing numbers as legal temporary workers (Passel and Suro, 2005). In addition, larger numbers of people came to the United States for tourism or business purposes as part of an increasingly globalized economy. Even though only a small fraction of the visitors settled in the United States as unauthorized migrants (i.e., “overstayers”), the numbers eventually reached significant levels, about 4 million to 5.5 million by 2005 (Passel, 2006a). Relatively easy and inexpensive international communication and travel facilitated the settlement of both legal and unauthorized migrants in the U.S. Although immigration to the U.S. has been trending upward for 75 years, most of the factors affecting current immigration have their roots within the last 25 to 35 years and point to this period as a basis for projecting immigration into the future.

Legal and Net Unauthorized Immigration

Setting the assumptions for projected immigration involved two basic steps: (1) determining the level of current immigration to use as a launch point; and (2) setting a trajectory for the future course of immigration. The projection involves assumptions for the combined level of legal immigration and unauthorized migration. Emigration of legal immigrants is measured as a separate component, described below. Unauthorized migration, however, has been measured only as a net figure historically. Thus, the projection assumptions are formed for the total of legal in-migration and net unauthorized migration.

Current Immigration (2000–2010)

Total immigration for 2000–2005 was 7.4 million—an amount consistent with the increase in the total population from 281.6 million in 2000 to 295.7 million in 2005, and in the foreign-born population from 31.2 million to 35.5 million. However, within this period, there was considerable variability in annual immigration. Data from the March supplements to the Current Population Survey (CPS) and other sources showed that annual immigration for 2002–2003 was 20% to 30% lower than the peak values attained in 1998–2000 (Passel and Suro, 2005). After this marked drop in immigration, more recent data show a strong tendency toward recovery to the pre-boom immigration levels of the early 1990s. These analyses point to a value for 2005 legal and unauthorized immigration of slightly less than 1.4 million and an upward trend for 2004–2007. Combining this estimate with growth rates extrapolated from data for 1992–2004 gives a figure for immigration in 2005–2010 of just less than 7.1 million.

This immigration total for 2005–2010 is very slightly lower than the measured value for 1990–1995. In terms of the migration rate, the 2005–2010 estimate is 4.8 per 1,000, which is almost exactly the average for the five-year rates from 1970–1975 through 2000–2005. This migration rate is the starting point for projecting immigration over the 2010–2050 period.
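As a rough arithmetic check, the migration rate can be recomputed from figures quoted in this section: about 7.1 million immigrants over the five years 2005–2010 and a 2005 population of 295.7 million. The function name below is an illustrative assumption.

```python
def annual_migration_rate(immigrants, years, start_population):
    """Average annual immigrants per 1,000 population at the start of the period."""
    return immigrants / years / start_population * 1000.0


# Figures quoted in the text: roughly 7.1 million immigrants for 2005-2010
# and a population of 295.7 million in 2005.
rate = annual_migration_rate(7.1e6, 5, 295.7e6)
print(round(rate, 1))  # 4.8
```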

Projected Immigration (2005–2050)

Ideally, immigration would be projected in a model-based framework with an underlying theoretical foundation using high quality data on the history of immigration and its determinants. Unfortunately, in spite of a considerable amount of recent research, virtually none of these items exists in a suitable form. There is a reasonable set of historic estimates of the total amount of immigration to the U.S. over the last 100 years or so. (See below for a description of how some of the key elements have been estimated.) Some recent work commissioned by the Social Security Administration brings together the literature on migration theory and migration determinants to address the issue of modeling and projecting immigration (Howe and Jackson, 2005; Jackson, 2006) and identifies six broad theoretical frameworks that could be used to develop projection models:

1. Policy framework—national immigration policy determines levels of immigration

2. Neoclassical framework—in a global labor market, labor will migrate toward higher wages if the wage differential is larger than the moving cost

3. World systems framework—immigration occurs when countries are incorporated in a global free market system; attitudinal shifts, remittances and community effects lead to “cumulative causation” and increasing levels of immigration

4. New economics framework—extended family economic units participate in a series of decisions and moves to maximize income, diversify income sources and insure against risk

5. Social network framework—networks of kin and other social contacts in both sending and receiving areas reduce costs and risks of migration, facilitating movement and settlement; momentum develops over time to increase migration

6. Dual labor market framework—segmentation of labor market in receiving countries sets up niches for immigrants and encourages migration.

Only the policy framework points to potential decreases in net immigration.

Over the long history where immigration to the U.S. has been measured (1820–2006), the migration rate11 has averaged 4.4 immigrants per year per 1,000 U.S. residents. This estimate averages periods of very high immigration (e.g., 1850–1855 with a rate of 16.5 and 1905–1910 with a rate of 12.1) and periods of very low immigration (e.g., 1930–1950, when the rate was 0.3–0.7). For the period since major immigration reform was enacted in 1965, the migration rate has averaged 4.6 and since 1980, it has averaged 5.4.

Either of these historic time scales (40 or 25 years) offers a reasonable basis for extrapolating into the future. Major immigration reform in 1965 ushered in a new era of migration from virtually all parts of the world to the United States; a new migration regime encompassing significant numbers of unauthorized migrants reached maturity in the 1975–1985 period. As discussed above, the annual estimated migration rate of 4.8 immigrants per 1,000 population for 2005–2010 falls comfortably near the middle of the rates experienced over this time and represents a realistic launch value for the projections.

The immigration assumption for a population projection can be implemented in many different ways. The migration rate is one such formulation, but future immigration levels would then depend on the exact implementation of other assumptions about fertility, mortality and even other migration components. Specifying future levels of immigration rather than future rates offers some advantages in programming of the projections; however, as noted above, the migration rate has some conceptual advantages. Accordingly, the immigration assumption for these Pew projections specifies the levels of immigration for each future period but in such a way as to maintain a relationship to the expected migration rate—that is, allowing immigration to change in concert with the population growth to maintain an approximately constant migration rate. To meet these conditions, immigration is assumed to increase by 5% from one five-year period to the next so that the migration rate will stay very close to the starting value of 4.8 and so that the assumption can be implemented easily within the context of the projection technology. With this assumption, immigration is projected to increase smoothly from 7.1 million (or just over 1.4 million per year) for the 2005–2010 period to 10.4 million (or just under 2.1 million per year) for 2045–2050. (See Figure 3.)
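The 5% rule can be illustrated with a short calculation. The sketch below starts from a rounded 7.1 million for 2005–2010; because the text gives the launch value as just less than 7.1 million, the end point comes out slightly above the 10.4 million cited. The function and variable names are assumptions made for illustration.

```python
def immigration_path(start_level, growth, periods):
    """Immigration level for each five-year period, compounding at a fixed rate."""
    levels = [start_level]
    for _ in range(periods - 1):
        levels.append(levels[-1] * (1 + growth))
    return levels


# Nine five-year periods: 2005-2010 through 2045-2050.
period_starts = list(range(2005, 2050, 5))
path = immigration_path(7.1, 0.05, len(period_starts))  # millions per period

for start, level in zip(period_starts, path):
    print(f"{start}-{start + 5}: {level:.1f} million")

print(round(path[-1], 1))  # 10.5
```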

Historical Immigration (1960–2005)

Because previous analysis has uncovered problems with some of the historical measures, especially for the 1980s and 1990s (Passel, 2004), development of an accurate and consistent time series of data for legal immigration and net unauthorized immigration required application of the multigenerational projection methods and an iterative approach. First, historical measures of total net immigration (i.e., the components listed above) were developed by five-year period for 1960–2005 from Census Bureau and other official data sources. One major issue in creating this data series was the treatment of the large number of immigrants who acquired legal status under IRCA. The vast majority of these 2.6 million legal immigrants appear in official admission statistics for the years in which they received their “green cards” (that is, fiscal years 1989–1991). In reality, most arrived in the U.S. between 1970 and 1986. In our historical data series for immigration, they are assigned to actual periods of arrival, not to the 1989–1991 period.

Then, preliminary “projections” were done for each historical census to the next to assess the accuracy of the immigration figures for estimating the foreign-born population at the second census date; e.g., the foreign-born population from the 1960 Census was “projected” to 1970 using the initial estimates of immigration and then compared with the foreign-born population in the 1970 Census. To the extent that the projected population diverged from the population at the second census date, adjustments were made in the net immigration component. The projections were rerun until the immigration components agreed with the series of population data from the 1960, 1970, 1980, 1990 and 2000 Censuses and the 2005 estimates derived from the Current Population Survey. Similar methods were used by Passel and Edmonston (1994) to generate a consistent set of immigration measures for 1900 through 1960.
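The iterative logic can be sketched with a deliberately simplified, one-number version of the problem. Everything here is an assumption made for illustration: the function name, the retention parameters and the input values are hypothetical, and the actual procedure works on detailed age-sex-race/ethnicity distributions rather than on a single total.

```python
def fit_immigration(start_pop, target_pop, initial_immigration,
                    retention, arrival_retention,
                    tolerance=1.0, max_iterations=100):
    """Adjust an immigration estimate until a toy projection matches the next census.

    retention         -- share of the starting foreign-born population still
                         present at the second census (hypothetical value)
    arrival_retention -- share of arrivals during the interval still present
                         at the second census (hypothetical value)
    """
    immigration = initial_immigration
    for _ in range(max_iterations):
        projected = start_pop * retention + immigration * arrival_retention
        gap = target_pop - projected
        if abs(gap) <= tolerance:
            break
        # Add the shortfall (or remove the excess) and rerun the projection.
        immigration += gap
    return immigration


# Made-up inputs: a starting foreign-born population, the count at the
# next census, and an initial immigration estimate to be revised.
fitted = fit_immigration(
    start_pop=9_700_000, target_pop=9_600_000, initial_immigration=3_000_000,
    retention=0.70, arrival_retention=0.95,
)
print(round(fitted))
```

Because not all arrivals remain until the second census, each pass only partly closes the gap, which is why the projection has to be rerun until the two figures agree.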

This approach yields estimates of the components of immigration that are consistent with the foreign-born (and Puerto Rican-born) population by age-sex-race/ethnicity for the 1960–2005 period. Significant revisions to the existing “official” estimates of immigration12 were required. For 1960–1970, the revised measure of immigration is about 15%, or 500,000, less than the previously used estimates.13 For 1970–1980, the methods imply additional immigration of about 1 million, or 15% more than the previous estimates; this amount was added to net unauthorized migration. Larger numeric increases in the assumed levels of immigration are required for each of the next two decades: 2.0 million in the 1980s and 3.3 million in the 1990s. Although immigration has grown with each decade, these revisions amount to significant changes in estimates of unauthorized migration, about one-quarter for the 1980s and one-third for the 1990s. While these changes are large, these levels of immigration must have occurred if the census and survey measures of the foreign-born population are to be consistent with immigration flows.

Emigration

Emigration of legal immigrants has proved to be another elusive component of population change. The measures used in the historical analysis incorporate revised measures based on variations of “residual” calculations using successive censuses to incorporate the detailed census figures on the foreign-born population. (See Passel, 2004, for a detailed description of the estimation methodology.) For each five-year period from 1960 through 2005, a set of emigration rates was developed relating the revised measures of emigration to estimates of the foreign-born population.

The emigration measures developed in this manner show steady increases in the number of foreign-born emigrants. However, much of the increased level of emigration seems to be related to the sizable increase in the population “at risk” of emigrating, i.e., the foreign-born population over the 1960–2005 period. Although the number of emigrants has increased, rates of emigration have decreased. In the 1960s about 1.6 million former immigrants left the country, representing rates of about 7% for each five-year period. For the 1995–2005 decade, emigration was more than 2.9 million but the rate of emigration had dropped to roughly 4.5% for each five-year period. (See Ahmed and Robinson, 1994, Mulder et al., 2002 and Passel, 2004, for previous research on this topic.)

Because of the relationship between foreign-born population size and emigration, the projection methodology used here employs a set of emigration rates applied to the foreign-born population rather than a fixed amount of migration out of the country by former immigrants. Since the historical data show steadily decreasing rates of emigration (even with increasing numbers of emigrants), emigration rates within each racial/ethnic group are assumed to decline in the projection, but at a decreasing rate, reaching slightly more than 2.5% overall for 2045–2050. With these rates, emigration amounts in the main projection increase from about 3.1 million in the 2000s to just under 4 million in the 2040s.
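A small sketch shows why a rate-based approach lets the number of emigrants rise even as the rate falls. The rates of 7% and 4.5% per five-year period echo the values cited above, but the foreign-born population sizes and the function name are hypothetical placeholders chosen for illustration.

```python
def emigrants_in_period(foreign_born, five_year_rate):
    """Emigrants during a five-year period, as a rate applied to the population at risk."""
    return foreign_born * five_year_rate


# Hypothetical population sizes; the rates echo those cited in the text.
early = emigrants_in_period(foreign_born=10_000_000, five_year_rate=0.07)
late = emigrants_in_period(foreign_born=30_000_000, five_year_rate=0.045)

print(round(early), round(late))
print(late > early)  # a lower rate on a much larger base yields more emigrants
```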

Remaining Immigration Components

Net Movement from Puerto Rico

Measuring movement between Puerto Rico and the U.S. is necessary in accounting for changes in the U.S. population. As included in Census Bureau estimates and projections, this movement is generally positive (i.e., into the United States) for children and for adults up to about age 35. For older ages, the component is generally negative, reflecting the propensity of former migrants to return to Puerto Rico after a period of time in the United States. The official estimates were incorporated as initial estimates into projections for 1960–2005. The iterative procedure described above for the immigration component showed that the Census Bureau’s age and sex structure was essentially correct but that the level of movement into the U.S. had to be nearly doubled—from about 600,000 for 1960–2005 to 1.1 million—to account for growth in the Puerto Rican-born population living in the United States. For the Pew projections, the initial value for 2005–2010 is set to the average level for 1995–2005 and is assumed to increase at the same rate as overall immigration—by 5% for each subsequent five-year period.14

“Other” Immigration

The “Other Immigration” components are mostly small relative to legal immigration, net unauthorized migration, and emigration, amounting to 300,000 to 500,000 per decade. They are: Net Change in Temporary Foreign Residents (called “Net Change in Foreign Students” for 1960–1980 by the Census Bureau); Net Movement of Civilian Citizens; Net Recruits and Deaths to the Armed Forces Overseas; and Emigration of U.S. Natives. The last three mainly affect the native population. The first listed component consists entirely of the foreign born, however. These components have almost no impact on the historical analyses, but data from Census 2000 and the annual March supplements to the Current Population Surveys of 1995–2006 suggest that the change in temporary foreign residents was seriously underestimated for 1960–1995 and is probably about triple the average of 50,000 per five-year period in the official estimates for 1960–1995.15 For the projections, all of these components for 2005–2010 are set to their average values for 1995–2005 and then assumed to increase by 5% in each subsequent five-year projection period.

Alternative Immigration Scenarios

Immigration levels are almost certain to fluctuate in the future and diverge from the baseline assumption. To encompass a reasonable range of alternatives for future immigration and to assess their impact on future population, two alternative scenarios were developed—one with higher levels of immigration and one with lower levels. In the higher-immigration scenario, all immigration components (except emigration rates) are set to 150% of the values in the baseline or main scenario. In the lower-immigration scenario, all of these components are set to 50% of the baseline values. For both the higher- and lower-immigration scenarios, emigration rates remain at the values projected for the baseline scenario.
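The scenario construction amounts to scaling every immigration component except the emigration rates. A minimal sketch follows; the component names and baseline values are placeholders chosen for illustration, not figures from the projections.

```python
SCENARIO_MULTIPLIERS = {"lower": 0.5, "baseline": 1.0, "higher": 1.5}


def scenario_components(baseline, scenario):
    """Scale all immigration components except emigration rates."""
    factor = SCENARIO_MULTIPLIERS[scenario]
    return {
        name: value if name == "emigration_rate" else value * factor
        for name, value in baseline.items()
    }


# Placeholder baseline for a single five-year period (millions, except the rate).
baseline = {
    "legal_and_net_unauthorized": 7.1,
    "net_from_puerto_rico": 0.1,
    "other_immigration": 0.2,
    "emigration_rate": 0.045,
}

for name in ("lower", "baseline", "higher"):
    print(name, scenario_components(baseline, name))
```

Note that emigration counts still differ across scenarios, because the unchanged rates are applied to foreign-born populations of different sizes.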

Contribution of Immigration to Population Change

The contribution of immigration to population growth goes beyond the numbers of immigrants added to the population because once the immigrants have arrived in the country, many give birth to children in the U.S. In the long run, the immigrants themselves will die, but their U.S.-born offspring will have children themselves, followed by grandchildren and subsequent generations. The use of a population projection methodology permits measurement of future contributions of immigrants to long-run population growth as well as an assessment of the role of past immigration in population change.

In measuring the contribution of future immigration to the projected population in 2050 (or any other future date), an alternative population projection is carried out in which the various immigration components (i.e., legal immigration, net unauthorized immigration and “other” immigration) are set to zero. With this assumption, not only are no future immigrants added to the population, but there are no other contributions from these immigrants to population change through future births, deaths or emigration16 as all of these components are computed by applying rates to the population. The difference between the “zero immigration” projection and the baseline projection represents the contribution of future immigrants to future population change.
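The comparison can be illustrated with a toy projection in which natural increase is a rate applied to the whole population. The function names, the growth rate and the input numbers are all assumptions for the sake of the example; the real model applies age-specific fertility, mortality and emigration rates by generation.

```python
def project_total(start_pop, immigration_by_period, natural_increase_rate):
    """Toy projection: apply a natural-increase rate each period, then add immigrants."""
    pop = start_pop
    for immigrants in immigration_by_period:
        pop = pop * (1 + natural_increase_rate) + immigrants
    return pop


def immigration_contribution(start_pop, immigration_by_period, natural_increase_rate):
    """Baseline projection minus a zero-immigration projection."""
    baseline = project_total(start_pop, immigration_by_period, natural_increase_rate)
    zero = project_total(start_pop, [0.0] * len(immigration_by_period), natural_increase_rate)
    return baseline - zero


# Made-up numbers: 20 immigrants arrive over two periods, but their
# contribution is larger because earlier arrivals also grow by natural increase.
contribution = immigration_contribution(100.0, [10.0, 10.0], 0.10)
print(round(contribution, 6))  # 21.0
```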

Following the work of Passel and Edmonston (1994), this same methodology can be used to assess the contribution of past immigration to past population change because the time series of historical populations has been developed with a population projection methodology. Thus, past immigration can be set to zero and a “projection” carried out to estimate what would have happened had there been no immigrants during the entire 1960–2005 period or during intervals within the period. This methodology works because the time series of population change has been reconstructed using a projection methodology based on rates of fertility, mortality and emigration rather than past numbers of births, deaths and emigrants.

Fertility

Projected Fertility, 2005–2050

The generational pattern of fertility for racial/ethnic groups is drawn from analysis of recent data from the June Fertility Supplements of the CPS for 1994–2004 and the historical analyses described below. Historical total fertility rates (TFR, defined as a measure of lifetime births per woman) of 1960–2004 for the first generation exceed those for the third-and-higher generations (for all racial/ethnic groups except blacks), with the second generation falling either roughly between the extremes or close to the level of the third-and-higher generations. This pattern is maintained throughout the projection horizon. (Figure A2) Over time, TFRs are assumed to move toward 2.0 (i.e., with increases for whites and Asians, decreases for the other groups) with a significantly faster rate of change for 2005–2025 than for 2025–2050. This assumption follows those used by the Census Bureau (2000, 2004) in recent projections. (Note that the ultimate TFRs are not forced to any specific value.)
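For readers unfamiliar with the measure, the standard way to compute a TFR from age-specific fertility rates for five-year age groups is to sum the annual rates and multiply by the width of the age groups. The rates below are invented for illustration and do not come from the projections.

```python
def total_fertility_rate(age_specific_rates, group_width=5):
    """TFR from annual births per woman in each age group (standard definition)."""
    return group_width * sum(age_specific_rates)


# Invented annual rates for ages 15-19, 20-24, ..., 45-49.
rates = [0.040, 0.100, 0.115, 0.095, 0.045, 0.010, 0.0005]
tfr = total_fertility_rate(rates)
print(round(tfr, 2))  # 2.03
```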

This series of complex assumptions involving some TFR series trending upward and some downward with differences by race and generation ultimately results in little overall movement of the TFR, with total projected TFRs falling in a very narrow range of 1.99 to 2.03 for the entire 2005–2050 period. Notwithstanding the more complex methodology, this result is very similar to the overall assumption made by the Social Security Trustees (2007) in their projections, which do not incorporate any differences by race/ethnicity. The Social Security assumptions, based on global assumptions about fertility, were strongly endorsed by the 2007 Technical Panel on Assumptions and Methods (TPAM, 2008).

The overall TFR for Hispanics fell rapidly from 4.7 in the first half of the 1960s to 3.4 in the second. By 2005, the Hispanic TFR dropped further to 2.5 children per woman. This value is higher than for any of the race groups; white and Asian TFRs are about 1.8 and the black TFR is about 2.2. The higher rate for Hispanic women is, in large part, due to the relatively high fertility of Hispanic immigrants who have a TFR of about 2.8. Fertility rates are assumed to decrease overall by 2050, with Hispanic fertility for the 2040s reaching slightly more than 2.1. Part of the drop can be attributed to a higher share of native-born women in the childbearing ages by the 2040s.

Historical Fertility, 1960–2005

The first step in analyzing fertility for the projections was the development of a set of generational total fertility rates for each racial/ethnic group consistent with the historical data. The TFRs are constrained to meet a number of conditions: (1) the total number of births in each period17 must agree with totals of registered births from the National Center for Health Statistics—by racial/ethnic group, when available; (2) survivors of the births that occurred in the 10 years before a census must equal the totals for the population under age 10 at the next census; and (3) the generational distribution of the survivors under age 10 must agree with the estimated population at the census, when available. An initial set of race/ethnic generational TFRs was developed from patterns in the June Fertility Supplements to the CPS for 1994–2004 and a similar set of TFRs estimated by Passel and Edmonston (1994) for 1960–1990.
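One simple way to satisfy the first condition is to rescale an initial set of generational TFRs proportionally until the births they imply match the registered total. The sketch below shows only that single step and rests on assumptions: the names and numbers are hypothetical, births are taken to be proportional to the TFR with the age pattern and number of women held fixed, and the report does not state that the adjustment was done this way.

```python
def scale_tfrs_to_births(tfrs, implied_births, registered_births):
    """Proportionally rescale generational TFRs to hit a registered birth total.

    tfrs              -- initial TFR for each generation
    implied_births    -- births each generation's initial TFR would produce
    registered_births -- control total from vital statistics
    """
    factor = registered_births / sum(implied_births.values())
    return {generation: tfr * factor for generation, tfr in tfrs.items()}


# Hypothetical inputs: the initial TFRs imply 500 births, but 550 were registered.
adjusted = scale_tfrs_to_births(
    tfrs={"first": 3.0, "second": 2.0},
    implied_births={"first": 300.0, "second": 200.0},
    registered_births=550.0,
)
print({generation: round(tfr, 2) for generation, tfr in adjusted.items()})
```

The second and third conditions would require further adjustment, since a uniform scaling changes the level of births but not their distribution across generations.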

In implementing the multigeneration projection methodology, a “G-matrix” is required to distribute the births of mothers in each generation to a generation for the children. Births to immigrant mothers go into the second generation, and all births to Puerto Rican-born women go into the Puerto Rican parentage population. For the second and third-and-higher generations, some births are distributed back to the second generation as a result of cross-generational childbearing of mixed couples made up of first and second generations or first and third-and-higher generations. This matrix is estimated for each race group using data on exogamous couples from the Current Population Surveys for 1995–2000. Analysis of the initial G-matrices showed a strong relationship between the percentage of cross-generational births and the relative sizes of the generations. Accordingly, this relationship is built into the historical analyses and prospective projections to allow for dynamic changes in cross-generational marriage patterns.
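The role of the G-matrix can be sketched as a table of shares that routes births from the mother's generation to the child's generation. The shares below are invented for illustration, the Puerto Rican groups are omitted, and the dynamic adjustment for relative generation sizes is not modeled; the function and key names are assumptions.

```python
def assign_births(births_by_mother_generation, g_matrix):
    """Distribute births to children's generations using shares in a G-matrix."""
    children = {}
    for mother_generation, births in births_by_mother_generation.items():
        for child_generation, share in g_matrix[mother_generation].items():
            children[child_generation] = children.get(child_generation, 0.0) + births * share
    return children


# Invented shares; each row sums to 1 so that no births are lost.
g_matrix = {
    "first": {"second": 1.0},
    "second": {"second": 0.20, "third_plus": 0.80},
    "third_plus": {"second": 0.05, "third_plus": 0.95},
}

births = {"first": 100.0, "second": 100.0, "third_plus": 1000.0}
print(assign_births(births, g_matrix))
```

With these made-up shares, the second generation receives all births to immigrant mothers plus a minority of births to U.S.-born mothers, reflecting couples in which the other parent is an immigrant.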

Mortality

Age-sex-specific mortality rates for race groups in the prospective projection are drawn directly from Census Bureau (2000) projections; the same mortality rates are applied to all generations within a racial/ethnic group. Life expectancy at birth overall increases from 74.9 years for males in 2005 to 81.2 in 2050; for females, the change is from 80.7 to 86.7. These improvements are greater than those assumed in the Social Security Trustees’ (2007) projections; the Trustees’ assumed values for life expectancy at birth in 2050 are 83.5 years for women and 79.4 years for men. The Census Bureau’s assumptions, adopted wholesale for the Pew projections, are much closer to the recommendations of the technical panel (TPAM, 2008) than are the Social Security assumptions.

Life expectancy at birth for Hispanic men is projected to increase from 77.8 years in 2005 to 83.0 years in 2050. For Hispanic women, the projected change is from 84.3 years to 88.4. At each date for both sexes, Hispanic life expectancy at birth is exceeded only by that of Asians. (Figure A3)

Definitions and Initial Census-Based Populations: 1960–2005

Racial/Ethnic Groups

The projections and historical analysis use five mutually exclusive and exhaustive race groups: Hispanic; white, not Hispanic; black, not Hispanic; American Indian/Alaska Native, not Hispanic; and Asian/Pacific Islander, not Hispanic. The basic population distributions are constructed from decennial census data for 1960–2000 and official population estimates for 2005. The age-sex-racial/ethnic-generation data were developed initially from tabulations of microdata obtained from the Integrated Public Use Microdata Samples (IPUMS) project. For 2005, the initial age-sex-race/ethnic distributions are from the Census Bureau’s population estimates with generational information from the March 2005 CPS. None of the data sources provided complete information along all of the dimensions. Accordingly, initial approximations to the detailed distributions were developed and adjusted through an iterative process to produce population distributions consistent with the available data on the components of change and vice versa.

Hispanic Origin

The Hispanic population can be identified with the Hispanic origin variable as collected (without modification) in census data for 1990 and 2000 and in CPS data from 1995 onward. For 1980, there was a small amount of misreporting of Hispanic origin; the initial population for the historical analysis is based on a corrected version of these data developed by using additional information, including data on ancestry, place of birth, surname, language and ethnicity of other household members. The net effect of the modification is to reduce the size of the Hispanic population in 1980 by about 3%, or 400,000 persons.

The Hispanic origin variable in 1970 suffered from both a major overstatement in some categories and a rather serious overall understatement. The responses in the census showed about 1.5 million persons choosing the “Central or South American” category, whereas tests in the 1969 and 1971 Current Population Surveys showed about 1 million fewer. Further analyses of these data showed that the overstatement was most severe in central and southern states of the country, suggesting a complete misinterpretation of the question on the part of a large number of respondents. At the same time, the total from the Hispanic origin question was at least 500,000 lower than other measures of the Hispanic population, notwithstanding the roughly 1 million overstatement just noted.

For 1970, the Hispanic population for historical analysis was constructed using information in the census on place of birth, mother’s and father’s place of birth, mother tongue, Spanish surname, current residence, residence five years ago, mother tongue of parent(s), grandparents’ place of birth and the same information about other household members. The final estimate of the Hispanic population in 1970 was adjusted to ensure consistency with the historic data series through an iterative process involving forward projections to 1980 and backward “projections” to 1960.

The 1960 Census did not include a variable directly identifying persons of Hispanic origin. The methods used to identify Hispanics in the 1970 data were also applied to 1960. Some minor adjustments were required because less information was available in 1960 for U.S. natives.

Race

Methods for classifying individuals by race and collecting racial information in the census have changed several times during the period covered by this analysis. Production of a consistent set of information in which all non-Hispanic persons are assigned to one of the four major race groups required some adjustments and modifications to each census and to the CPS for 2001 and later. In the censuses of 1970–2000, some persons did not choose one of the major races but opted for something different, resulting in assignment of “some other race.” For 1980–2000, the Census Bureau developed procedures to reclassify these individuals and released aggregated data by age-sex-race/Hispanic origin on so-called “modified” race groups in which all persons were assigned to the major races. A variant on these procedures was applied here to the IPUMS data. Specifically, individuals who were initially classified as “other” were reassigned into one of the four major groups on the basis of (in order): race of parent(s), place of birth, ancestry (first and second), language spoken in the home, race of other household members and, if the case was still unresolved, a default assignment as white.
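
The ordered reassignment of persons coded “other” can be illustrated with a minimal Python sketch. It assumes that each piece of evidence has already been translated into the major race group it implies (or left missing); that translation step, the field names and the group labels are hypothetical and are not taken from the Census Bureau or IPUMS procedures.

```python
# The four major race groups used for non-Hispanic persons (labels are illustrative).
MAJOR_RACES = {"white", "black", "american_indian", "asian_pacific"}

# Evidence consulted in the order given in the text; field names are hypothetical.
EVIDENCE_ORDER = [
    "race_of_parents",
    "place_of_birth",
    "first_ancestry",
    "second_ancestry",
    "home_language",
    "race_of_household_members",
]


def reassign_other_race(record: dict) -> str:
    """Return a major race group for a person initially coded 'other'.

    Each field in `record` is assumed to hold the major race group implied by
    that piece of evidence, or to be absent/None when it is uninformative.
    """
    for field in EVIDENCE_ORDER:
        implied = record.get(field)
        if implied in MAJOR_RACES:
            return implied  # first informative item in the ordering decides
    return "white"  # default assignment when nothing else resolves the case


example = {"place_of_birth": "asian_pacific", "home_language": "white"}
print(reassign_other_race(example))
```

Because place of birth is consulted before language spoken in the home, the example record is assigned to the Asian/Pacific Islander group; a record with no informative evidence falls through to the default.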

In the 1960 and 1970 Censuses, there was no overarching category corresponding to the Asian/Pacific Islander race group used in the 1980 Census and later data. These earlier censuses included some specific “race” groups, such as Chinese and Japanese, but no global category. Further, according to the definitions of the time, persons from South Asia (India, Pakistan, Bangladesh) were classified as “white.” To bring these censuses in line with the subsequent ones, an “Asian” category was created from the reported data on race and the variables listed above used to assign Hispanic origin. After these assignments were done, a small number of individuals remained in the residual “other race” category. These were assigned to major race groups using a hierarchical procedure similar to that developed for 1980 and later.

Beginning with the 2000 Census and the 2001 CPS, individuals were permitted to choose more than one racial group. While only a small percentage of the non-Hispanic population chose two or more races (1.9% of non-Hispanics in Census 2000 and 1.8% in the 2006 American Community Survey or ACS), more than 4 million individuals were counted as multiracial. Individuals choosing more than one race were reassigned to one of the five race/Hispanic groups using a hierarchical system applied in the following order: Hispanic; black; Asian or Native Hawaiian/Other Pacific Islander; white; and then American Indian/Alaska Native.
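
The hierarchy for multiracial respondents amounts to taking the first group in the stated order that a person reported. A minimal Python sketch follows; the function name and group labels are illustrative, and Asian and Native Hawaiian/Other Pacific Islander responses are assumed to have been combined into a single label beforehand.

```python
# Race groups in the priority order given in the text (Hispanic origin is checked first).
RACE_HIERARCHY = ["black", "asian_pacific", "white", "american_indian"]


def assign_group(is_hispanic: bool, races: set) -> str:
    """Assign a person to one of the five mutually exclusive groups."""
    if is_hispanic:
        return "hispanic"
    for group in RACE_HIERARCHY:
        if group in races:
            return group  # highest-priority race among those reported
    raise ValueError("no major race group reported")


print(assign_group(False, {"white", "american_indian"}))
print(assign_group(False, {"white", "black"}))
print(assign_group(True, {"asian_pacific"}))
```

Under this ordering a non-Hispanic person reporting white and American Indian/Alaska Native is counted as white, while one reporting white and black is counted as black.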

Generation

The historical analyses and projections use five generation groups:

Foreign-born Population or First Generation—individuals born outside the United States who were not U.S. citizens at birth.

Puerto Rican-born Population—U.S. citizens (at birth) born in Puerto Rico, U.S. territories, or other outlying areas.

Foreign- and Mixed-Parentage Population or Second Generation—U.S.-born citizens with one or both parents born outside the United States (including persons born as U.S. citizens in foreign countries with one or two foreign-born parents).

Natives of Puerto Rican Parentage or the Second Puerto Rican Generation—U.S.-born citizens with one or both parents born in Puerto Rico (including persons born as U.S. citizens in foreign countries with one or two Puerto Rican-born parents).

U.S. Natives with Native Parentage or Third-and-higher Generations—U.S.-born citizens with both parents born in the United States (including persons born as U.S. citizens in foreign countries with two U.S.-born parents).
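
The five definitions above can be read as a classification rule on a person's birthplace, citizenship at birth and parents' birthplaces. The Python sketch below is one such reading. The three-way birthplace coding ("us", "pr", "foreign") and the function name are hypothetical, and the definitions do not say how a person with one foreign-born and one Puerto Rican-born parent is classified, so the precedence given to the foreign-born parent here is an assumption.

```python
def classify_generation(birthplace: str, citizen_at_birth: bool,
                        mother_birthplace: str, father_birthplace: str) -> str:
    """Classify a person into one of the five generation groups.

    Birthplaces are coded "us", "pr" (Puerto Rico, U.S. territories and
    other outlying areas) or "foreign". This coding is hypothetical.
    """
    if birthplace == "foreign" and not citizen_at_birth:
        return "first"  # foreign-born population
    if birthplace == "pr":
        return "pr_born"  # Puerto Rican-born population
    # Remaining cases: born in the U.S., or born abroad as a U.S. citizen.
    parents = {mother_birthplace, father_birthplace}
    if "foreign" in parents:
        return "second"  # assumption: a foreign-born parent takes precedence
    if "pr" in parents:
        return "pr_parentage"
    return "third_plus"  # both parents U.S.-born


print(classify_generation("us", True, "foreign", "us"))
print(classify_generation("foreign", True, "us", "us"))
```

The second call shows the parenthetical case in the definitions: a person born abroad as a U.S. citizen to two U.S.-born parents falls in the third-and-higher generations.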

The full array of generations can be obtained directly only from the 1960 and 1970 Censuses because these two include the questions on nativity, citizenship and parents’ places of birth that are required to produce tabulations of these five generations. In censuses since 1970, the parental birthplace questions were dropped, so the 1980, 1990 and 2000 Censuses can provide direct data only for the foreign-born, Puerto Rican-born and native populations; the native population encompasses the second generation, natives of Puerto Rican parentage and the third-and-higher generations. For 1980 and 1990, “projections” from 1960 and 1970 were used to produce an initial approximation to the full five generations. The native population at each census date was distributed to the more detailed generations using these projections.
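
One simple way to distribute a census native total across the three native generations is to apply the shares implied by the projected counts for the same age-sex-race cell. The text does not spell out the exact allocation rule, so the proportional split in the Python sketch below is an assumption, and the function name and figures are purely illustrative.

```python
def distribute_native(census_native: float, projected: dict) -> dict:
    """Split a census native count across generations in proportion to projected counts.

    `projected` maps each native generation to its projected population in the
    same age-sex-race cell. Proportional allocation is an assumed rule.
    """
    total = sum(projected.values())
    if total <= 0:
        raise ValueError("projected counts must sum to a positive number")
    return {gen: census_native * count / total for gen, count in projected.items()}


# Illustrative numbers only, not taken from the report.
projected_cell = {"second": 150.0, "pr_parentage": 50.0, "third_plus": 800.0}
adjusted = distribute_native(1200.0, projected_cell)
print(adjusted)
```

The allocated counts always sum to the census native total, so the foreign-born, Puerto Rican-born and native totals from the census are preserved while the projection supplies only the generational split.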

For 2000, CPS data can provide a five-generation distribution by age, sex and race. However, these data are subject to substantial sampling variability because of the relatively small CPS sample size. The final population data for five generations in 2000 therefore incorporate information from both the CPS and “projections” from 1970 to distribute the native population from Census 2000 to more detailed generations. For 2005, the March CPS again provides information for a fully detailed tabulation of five generations by racial/ethnic group. Although the CPS sample size was expanded between 2000 and 2005, it is still small. The final age-sex-race/ethnic-generation distribution for the population in 2005 represents a combination of the projected population from 2000, the March CPS generational detail, the official Census Bureau population estimates by age-sex-race/ethnicity and information on native/Puerto Rican/foreign populations from the 2005 ACS. This estimate for 2005 is the starting population for the projections.