Session Slot: 10:30-12:20 Tuesday

Estimated Audience Size: 100

AudioVisual Request: regular - overhead projector

Panel Session Title: Should We Continue To Release Public-Use Microdata Files?

Statistical data from censuses and surveys are disseminated in text, tabular, graphic, and microdata formats. The last of these, which has substantially enhanced the utility of census and survey data, is a product of the computer age. Data users receive files of individual records, giving them the ability to interact with the data set and perform the tabulations and statistical analyses that are best suited to their own purposes. Two principles have been applied to the release of microdata files - restricted data and restricted access. For virtually all releases of microdata files, data are restricted by removing explicit identifiers and taking other steps to limit the possibility that individuals can be identified. In some instances, access is restricted to certain categories of users, such as those who have research grants with a federal agency or those who are willing to work with the files at a federal site. Users in these categories are generally subject to penalties for violating the terms of their access to the files. A public-use microdata file is one for which there are no restrictions on access, other than payment of the cost of a tape or CD-ROM. Some public-use microdata files can be downloaded via the Internet. In recent years, Rubin, Fienberg, and others have proposed eliminating all risk of identification by releasing microdata files that contain only synthetic data - records that do not contain information for specific individuals but, in the aggregate, retain the statistical properties of the real records.

Several federal statistical agencies currently release public-use microdata sets and plan to continue doing so, e.g., from the 2000 Census of Population. However, some respected statisticians argue that this practice cannot continue indefinitely - that with the rapid development of more efficient computer matching programs and the proliferation of computerized information about individuals available from both the public and private sectors, the risk of disclosing the identities of persons whose records are included in public-use data sets will become unacceptably high. This panel will examine the risks and benefits associated with continued release of public-use microdata files and weigh the pros and cons of two alternative approaches - restricted access and release of synthetic data files.

Theme Session: Yes

Applied Session: Yes

Panel Organizer: Jabine, Thomas B. Independent Consultant







Panel Organizer: de Wolf, Virginia A. US Bureau of Labor Statistics








Session Timing: 110 minutes total, allocated as follows:

Opening Remarks by Chair - 5 minutes
First Panelist - 15 minutes
Second Panelist - 15 minutes
Third Panelist - 15 minutes
Fourth Panelist - 15 minutes
Fifth Panelist - 15 minutes
Rebuttals/Discussion Among Panel Members - 15 minutes
Floor Discussion - 15 minutes

Panel Chair: Kirkendall, Nancy J. US Office of Management and Budget








David, Martin   University of Wisconsin








Fienberg, Stephen E.   Carnegie Mellon University








Gordon, Nancy M.   Census Bureau








Juster, F. Thomas   University of Michigan








Scheuren, Frederick J.   Ernst & Young







List of speakers who are nonmembers: None

David Scott