“Using Partially Synthetic Data to Replace Suppression in the Business Dynamics Statistics: Early Results”. Javier Miranda (U.S. Census Bureau) and Lars Vilhuber (NCRN, Cornell University)
Abstract: “The Business Dynamics Statistics is a product of the U.S. Census Bureau that provides measures of business openings and closings, and job creation and destruction, by a variety of cross-classifications (firm and establishment age and size, industrial sector, and geography). Sensitive data are currently protected through suppression. However, as additional tabulations are being developed, at ever more detailed geographic levels, the number of suppressions increases dramatically. This paper explores the option of providing public-use data that are analytically valid and without suppressions, by leveraging synthetic data to replace observations in sensitive cells.”
Proceedings are available at Springer. A working paper is available in our eCommons repository.
The North American Data Documentation Initiative Conference (NADDI) is an opportunity for those using metadata standards and those interested in learning more to come together and learn from each other. Modeled on the successful European DDI User Conference, NADDI 2015 will be a three-day conference (April 8-10) with invited and contributed presentations, and should be of interest to both researchers and data professionals in the social sciences and other disciplines.
Cornell’s Bill Block is on the Program Committee.
Bridging The Data Divide: Data In The International Context
The theme of our 2015 conference is Bridging the Data Divide: Data in the International Context. Going hand in hand with the well-known digital divide is a growing inequity in access to data. Increasing budget concerns have placed strains on governments, universities, and other institutions in the provision of data services. From the cancellation of the Statistical Abstract of the United States, to the controversy over the Canadian Census long form, to political barriers in the data collection process in some countries, access to data and the data divide present organizational, economic and educational challenges to the community of data professionals worldwide.
“Improving Access and Data Security to Confidential Labor Market Data”, Warren Brown (Cornell University), Stephanie Jacobs (Cornell University), David Schiller (German Institute for Employment Research), Jörg Heining (German Institute for Employment Research)
Abstract: The Cornell Institute for Social and Economic Research (CISER), Cornell University, and the Institute for Employment Research (IAB), German Federal Employment Agency, are collaborating to expand use of IAB’s confidential Sample of Integrated Labour Market Biographies (SIAB). DDI 2.5 is used to enable researchers to discover the files by means of variable-level searching in a repository of metadata on U.S. and German labor-market-related data files. The repository is the Comprehensive Extensible Data Documentation and Access Repository (CED2AR) being developed by researchers at Cornell University with funding from the U.S. National Science Foundation. CED2AR provides researchers access to machine-readable codebooks with variable characteristics, thus enabling researchers to develop detailed proposals for access to these data that are submitted to IAB. Researchers with approved projects are able to access and analyze the data using the Cornell Restricted Access Data Center (CRADC), a remote access virtual data enclave using remote desktop protocol. In the initial testing phase, several researchers located in Europe and North America are successfully accessing and analyzing the Scientific Use Files of the SIAB. The project is well on its way to realizing the goal of wider access to researchers while improving secure management of confidential data.
The presentation can be found at http://hdl.handle.net/1813/44707
Lars Vilhuber speaks about “Disclosure Limitation and Confidentiality Protection in Linked Data” at the Center for Interuniversity Research and Analysis of Organizations’ conference on “Facilitate the access to Quebec data: How and to what ends?” The conference is jointly organized with the Quebec inter-University Centre for Social Statistics (QICSS). The presentation relies on joint work with John M. Abowd and Ian M. Schmutte.