Jeffrey S. Passel, senior demographer at the Pew Research Center, spoke at a forum on the 2010 Census on Jan. 21 about challenges the Census Bureau faces in attempting to count everybody. He also talked about the potential problem of differing numbers between the 2010 Census and the American Community Survey. The event was held at the center and was co-sponsored by the American Statistical Association and the DC chapter of the American Association for Public Opinion Research.

In this edited transcript, ellipses have been omitted to facilitate reading.

I’m going to talk about the next year to two years.  The Census Bureau has in many ways, I think, had an extraordinary decade.  Not without issues, but I’m going to focus more on the positive than the negative.

Census 2000 was in many ways extremely successful. The net undercount was very low, notwithstanding some issues with duplicates. The black/not-black difference in coverage was reduced substantially. They reached a timely decision not to adjust [the counts]. Be it right or wrong, they did it on time and they got data out in a very usable way very quickly. The challenge in many ways, I think, is to repeat that and do at least as well and hopefully improve.

The second challenge is the American Community Survey. It may not be on the scale of the census, but it's close. And it's been rather remarkable. It has changed the culture of the Census Bureau in many ways, some very apparent and some subtle, in the way the analysts at the Census Bureau work.

It became fully operational in 2006, when group quarters were included. They released multiyear data for the first time in 2007, and we now have what we and others call a fire hose of data. We're no longer sipping data; we're inundated with it, and it's difficult to keep up.

The intersection of these two, I think, is going to be a challenge. I’m going to talk a little bit more about that as a challenge for the Census Bureau but also for you and for us to deal with in the coming couple of years.

There have been a lot of concerns; this wasn't exactly a smooth road. Connie [Citro] mentioned the handhelds, getting the ACS up and running, fixing some of the weighting problems. Planning was not smooth, but it never is.

But we now have a rather remarkable set of data products that we can look at on an ongoing basis. The ACS has contributed in substantial ways to the improvements in the census. A lot of the language-targeting information [used in the census outreach campaign], for example, comes from the ACS. It’s how to get that feedback working that is a continuing challenge.

Duplicating the 2000 Census Success

This is just to illustrate what I mean [see slide]. The 2000 Census was a success. These are demographic analysis estimates of undercount. We had rather steady improvements: through 1980, each census was better than the previous one in terms of the percentage of people missed. In terms of the number of people missed, the numbers sometimes stayed about the same even as the population was growing. We saw steady reductions in the undercount.

1990, however, was a bump in the road. Lack of paid advertising, which we heard about, I think, was a big problem. The undercount went up in percentage terms; it went up even more in numeric terms. And it's the first time, at least in the cycles we've been able to measure, that we saw that. And 2000 represented a major turnaround. The demographic estimate is that, on net, basically a tenth of a percent of the population was missed – a rather remarkable figure that we'll be hard-pressed to duplicate.

This is black and not-black undercount [see slide]. The numbers aren't so important, but the downward trends are the same, and the lack of improvement from 1980 to '90 occurred for both black and not-black.

This was the challenge of 2000. We had big reductions: the black undercount rate was cut in half, and a small net overcount showed up for the rest of the population. There were big improvements in coverage. The differential [in the black and not-black undercount] turned around as well. I think the 2000 Census was very successful in these terms.

This is the challenge going forward: what happens? Where do we go from here? And consider the scale of this: a 1% undercount is 3 million people, so it's not a small number of people. The challenge of getting the counts right and keeping the undercount rates low is going to be difficult. The operations and the planning and the advertising, I think, were key in 2000, and it sounds to me like that lesson has been learned and improved upon.

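As a back-of-the-envelope check on that scale (my illustration, not part of the talk, assuming a U.S. population of roughly 300 million around 2010):

\[
0.01 \times 300{,}000{,}000 \approx 3{,}000{,}000 \ \text{people missed}
\]
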
American Community Survey and the 2010 Census

The American Community Survey is really a rather remarkable operation. I was skeptical in the '90s that this could be pulled off, but it seems to be working. We get detailed, census-like data. (And by the way, I say "census-like" data: it's not quite the same, but the majority of data users think it is the same as the census, and for the non-sophisticated users, the distinction is completely missed.) We get annual data, based on averages across 12 months of surveys, as well as data averaged across 36 months and across 60 months. We get annual data for the total population of areas and for their characteristics.

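To picture what those multiyear figures are (a simplified sketch, not the bureau's actual estimator, which pools the underlying sample records with period weights): for roughly comparable annual samples, a 3-year estimate behaves like a sample-size-weighted average of the annual ones,

\[
\hat{X}_{\text{3-yr}} \;=\; \frac{n_1 \hat{X}_1 + n_2 \hat{X}_2 + n_3 \hat{X}_3}{n_1 + n_2 + n_3},
\]

where \(\hat{X}_t\) is year \(t\)'s estimate and \(n_t\) its sample size; the 60-month figures extend the same idea across five years.
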
The distinction here is that those population totals don't come from the ACS survey itself. The totals, including the totals by race, come from the Census Bureau's population estimates program, and they are not census-like.

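To make concrete what it means for ACS totals to be controlled to the population estimates rather than to the survey itself, here is a minimal sketch of the kind of ratio adjustment involved. The function name and the numbers are hypothetical; the bureau's actual ACS weighting uses many control cells and several adjustment steps.

```python
def control_weights(weights, control_total):
    """Ratio-adjust survey weights so they sum to an external control total.

    Toy version of benchmarking: the published total reflects the control
    (here, standing in for the population estimates program), not the
    survey's own coverage.
    """
    factor = control_total / sum(weights)
    return [w * factor for w in weights]

# Hypothetical area: 100 respondents whose initial weights represent
# 9,500 people, while the population estimates say the area has 10,000.
raw_weights = [95.0] * 100
adjusted = control_weights(raw_weights, 10_000)
print(round(sum(adjusted)))  # 10000 -- the total now matches the estimate
```

The point of the sketch is that whatever the survey's own coverage, the published total is forced to match the external estimate, which is why those totals are not census-like.
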
This distinction is an important one, and it's overlooked by most of the people who use the data. We're seeing a broad user community develop, and the ACS is an invaluable tool for census planning. There are a lot of sophisticated users. There are a lot of unsophisticated users. And there are a lot of people who should be using it who aren't. But the confusion between "census" and "survey" is ongoing and is difficult.

The challenge and the potential train wreck, I’m afraid, is that the data users have been getting data from the ACS for their areas, for their communities, for several years.  There are going to be a series of numbers out there that people will have used.  And the census is going to come in and the numbers are going to be different from those in the ACS.  It’s going to be a lot different in some places because of coverage error; it’s going to be a lot different in some places because of estimation error.  But it’s going to be different.

The problem, I think, is that the credibility of both data systems is going to be at issue. There are going to be large differences in some places. The political users of this will want whichever number is larger, and they'll challenge whichever one is smaller. We, the data user community and the Census Bureau as the producer, are going to need a strong defense of both data systems. It's going to require help from key users like us to emphasize that these are both valuable systems. We have to explain why we do both. I think we need to preempt that criticism as soon as we can and talk about the value of both of these data systems.

The census, in addition to its political uses for reapportionment and redistricting, offers us an opportunity to re-benchmark the population estimates and the ACS. It's critical that we be able to do that periodically. And the ACS provides a broad range of information and gives us up-to-date annual data, a tremendous resource and much better than anything we had before.

The feedback of the ACS into the census process was alluded to earlier. I think it's been a major breakthrough in planning for the census, and I don't think its value can be overstated. The feedback that we haven't seen yet, to my satisfaction at least, is the feedback from the ACS into the population estimates program, which could improve those estimates and make the ACS more census-like, if you will.

Getting the Message Out

I think what we need over the next year or so is to be very clear in presenting ACS data that it's not the census, that these figures are based on [population] estimates, and that the census is coming and is going to help us improve them. The Census Bureau will be releasing 2009 ACS data in September, and I'm sure that people will think those are the census figures. It's going to be very important that those data be divorced from the census product, that the bureau present them as something different.

The first tract-level information from the ACS is about to be presented in September, and those data are going to be based on 10-year-old [population] estimates, or estimates carried forward by 10 years. They're going to be quite different from what we'll see in the census. I think they need to be labeled in a way that makes it very, very clear that this is not the census, and handled very carefully. I think it's very important, and I think the plans are in place, that the 2010 ACS data, which will be released in late 2011, be weighted to the census counts. I think those are the plans – I'm not sure.

But it’s important, especially for small areas, that we don’t get two sets of numbers for 2010; that can be confusing.  My own preference would be to even delay those [2010 ACS numbers] a little bit, rather than release data with 2000-based weights, when we have the 2010 data.

Finally, I think, as an analyst who tries to look at this data over time, it’s been very difficult to monitor changes from year to year, because the weighting has been adjusted three times in the last three years.  I think to make this data most useful for analytic purposes, it would be extremely important to go back, look at the last five years of ACS data and produce a set of consistent numbers that are weighted both to the 2000 Census and 2010 Census so that we have a clean product going forward.

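One way to picture the consistent re-weighting being asked for here (my sketch under simple assumptions, not a Census Bureau method): interpolate control totals between the 2000 and 2010 census benchmarks and rescale each intercensal year's survey-based total to match. The census counts below are rounded from the actual 2000 and 2010 results; the ACS totals are made up for illustration.

```python
# Illustrative re-benchmarking of an annual series between two censuses:
# linearly interpolate a control total for each intercensal year, then
# compute the factor that would rescale that year's weighted estimates.
CENSUS_2000 = 281_400_000   # rounded actual 2000 count
CENSUS_2010 = 308_700_000   # rounded actual 2010 count

def interpolated_control(year):
    """Linearly interpolate a national control total for an intercensal year."""
    frac = (year - 2000) / 10
    return CENSUS_2000 + frac * (CENSUS_2010 - CENSUS_2000)

acs_totals = {2005: 288_000_000, 2007: 293_000_000, 2009: 298_000_000}  # made up

for year, total in sorted(acs_totals.items()):
    benchmark = interpolated_control(year)
    ratio = benchmark / total   # factor to rescale that year's weights
    print(year, round(benchmark), round(ratio, 4))
```
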
I think there’s a challenge for the Census Bureau, certainly, but I think for us, as sophisticated data users, there’s a challenge that these data will present, to be clear to our consumers and the population at large.

Earlier postings from this event include the transcript and audio of a presentation by Census Bureau Director Robert Groves, as well as a transcript from speaker Constance Citro and a transcript from speaker Joseph Salvo.