General information on census data, including area classifications, definitions of topics, accuracy of the data, and collection and processing techniques, is provided in decennial census publications (and with census data sets on the Internet). The census of the United States has been taken primarily on a de jure (usual place of residence) basis rather than on a de facto (location at the time of the census) basis. Estimates of census coverage and net under-enumeration have been prepared for the decennial census on a regular basis since 1940. While the estimated rates of net undercount have varied somewhat, they have generally been higher for males than for females, for young adults than for other age groups, and for minority groups than for the White (or White non-Hispanic) population.
Since 1940, some data in the decennial census have been collected on a sample basis, and since 1960, this has been the case for data on most social and economic characteristics. The use of sample data (in decennial census publications and tabulations based on IPUMS) is indicated in headnotes for the graphics. In general, estimates of sampling error are provided in decennial census publications that show sample data.
Sample estimates may differ somewhat from the data that would have been obtained if information had been collected for the entire population. In addition to sampling error for data based on a sample, both 100-percent data and sample data are subject to nonsampling error. Nonsampling error may be introduced during any of the numerous operations used to collect and process data. Such errors may include the following: not enumerating every household or every person in the population, failing to obtain all required information from the respondents, obtaining incorrect or inconsistent information, and recording information incorrectly. In addition, errors can occur during the review of the enumerators’ work, during clerical handling of the questionnaires, and during the processing of the questionnaires.
The magnitude of sampling error is determined primarily by sample size and to a lesser degree by the sampling rate. Since the sample data shown in the graphics are for the United States, regions, states, and large cities (and not, for example, for small towns), the samples on which the sample data are based are sufficiently large that the resulting sampling errors are relatively small. As noted above, information on sampling error typically is provided in decennial census publications; however, the following very general guideline is offered. It applies unless there is particular reason to question the comparability of the data (e.g., because of changes in definitions). Changes (over time) and differences (for the same census year) of less than one or two percentage points (in the case of percentages) or of less than one or two percent (in the case of other measures, such as ratios or numbers) do not merit emphasis. Such differences may not be statistically significant because of sampling error, nonsampling error, or both. In addition, such small changes may not be of substantive significance, even if they are of statistical significance.
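The relative roles of sample size and sampling rate can be illustrated with a minimal sketch. It assumes simple random sampling without replacement, which is a simplification (census samples are not simple random samples), and the function name `proportion_se` and all numbers below are hypothetical, chosen only for illustration.

```python
import math

def proportion_se(p, n, sampling_rate):
    """Approximate standard error of an estimated proportion p.

    Assumes simple random sampling without replacement (a simplification).
    n is the number of sample cases; sampling_rate is n divided by the
    population size, which enters only through the finite population
    correction (1 - sampling_rate).
    """
    fpc = 1.0 - sampling_rate
    return math.sqrt(p * (1.0 - p) / n * fpc)

# Same sampling rate (5 percent), sample size increased 100-fold.
se_small = proportion_se(0.10, 500, 0.05)
se_large = proportion_se(0.10, 50_000, 0.05)

# Same sample size, sampling rate increased from 5 to 20 percent.
se_rate_low = proportion_se(0.10, 50_000, 0.05)
se_rate_high = proportion_se(0.10, 50_000, 0.20)

print(f"n = 500:       SE = {100 * se_small:.2f} percentage points")
print(f"n = 50,000:    SE = {100 * se_large:.2f} percentage points")
print(f"rate 5% -> 20%: SE falls by {100 * (1 - se_rate_high / se_rate_low):.0f} percent")
```

Under these assumptions, a 100-fold increase in sample size cuts the standard error by a factor of 10, while quadrupling the sampling rate at a fixed sample size reduces it by less than one-tenth.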
This general guideline does not apply to estimates based on sample data for net migration of the population born in the United States. In this case, estimates of in-migration and of out-migration are each subject to sampling error, and the resulting estimate of net migration, which may be a much smaller number, may have a large sampling error relative to the size of the estimate.
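A minimal sketch of why net migration is a special case follows. It assumes that the in-migration and out-migration estimates are independent, so that their variances add; the function name `net_migration_se` and every figure below are hypothetical and are not taken from any census tabulation.

```python
import math

def net_migration_se(se_in, se_out):
    """Standard error of net migration (in-migrants minus out-migrants).

    Assumes the two estimates are independent, so the variance of the
    difference is the sum of the two variances.
    """
    return math.sqrt(se_in ** 2 + se_out ** 2)

# Hypothetical sample-based estimates and standard errors.
in_migrants, se_in = 500_000, 5_000
out_migrants, se_out = 480_000, 5_000

net = in_migrants - out_migrants
se_net = net_migration_se(se_in, se_out)

rel_in = se_in / in_migrants   # relative standard error of in-migration
rel_net = se_net / abs(net)    # relative standard error of net migration

print(f"net migration = {net:,}, SE = {se_net:,.0f}")
print(f"relative SE: in-migration {100 * rel_in:.1f}%, net migration {100 * rel_net:.1f}%")
```

In this hypothetical case each gross flow has a relative standard error of about 1 percent, but the much smaller net figure has a relative standard error of roughly 35 percent.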
For discussion and estimates of net undercount, see U.S. Bureau of the Census, 1975 (Historical Statistics of the United States, Part 1, p. 1); Fay et al., 1988; Robinson et al., 1993; and U.S. Census Bureau, 2003.
For illustration, an example at the national level is provided using 1950 census data on children ever born to ever-married women. These data were based on a 3 1/3-percent sample, a much smaller sample than for most census sample data. For a weighted population of 1,000,000 (meaning about 33,000 sample cases), the standard error on an estimated percentage of 10 percent with zero children ever born is 0.2 percentage points, and the standard error on an estimated rate of 3.00 lifetime births per woman is 0.02 births (U.S. Bureau of the Census, 1955). There is about a 68-percent chance that the sample-based estimates would be within one standard error (and about a 95-percent chance within two standard errors) of what would have been obtained from a complete census.
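As a rough check on the figures in this example, the sketch below approximates the standard error of the 10-percent estimate and computes the probability of falling within one or two standard errors. It assumes simple random sampling and a normally distributed estimate; the published standard errors come from the 1955 source and may reflect design details that this approximation ignores. The variable and function names are illustrative only.

```python
import math

weighted_population = 1_000_000
sampling_fraction = 1 / 30          # a 3 1/3-percent sample
n = weighted_population * sampling_fraction   # about 33,000 sample cases

# Simple-random-sampling approximation for an estimated proportion of 0.10.
p = 0.10
se_pct = 100 * math.sqrt(p * (1 - p) / n)     # in percentage points

def prob_within(k):
    """Probability that a normal estimate lies within k standard errors."""
    return math.erf(k / math.sqrt(2))

print(f"sample cases: about {n:,.0f}")
print(f"approximate SE: {se_pct:.2f} percentage points")
print(f"within 1 SE: {100 * prob_within(1):.1f}%")
print(f"within 2 SE: {100 * prob_within(2):.1f}%")
```

The approximation gives a standard error of about 0.16 percentage points, which matches the published 0.2 when rounded to one decimal place, and normal-distribution probabilities of about 68 and 95 percent.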