“Synthetic Longitudinal Business Databases for International Comparisons”, Joerg Drechsler (Institute for Employment Research), Lars Vilhuber (Cornell University)
International comparison studies of economic activity are often hampered by very limited access to business microdata at the international level. A recently launched project aims to overcome these limitations by using synthetic data to improve access to Business Censuses from multiple countries. The starting point is the synthetic version of the Longitudinal Business Database (LBD), the longitudinally edited version of the U.S. Business Register. The idea is to create similar data products in other countries by applying the synthesis methodology developed for the LBD to generate synthetic replicates that could be distributed without confidentiality concerns. In this paper we present first results of this project, based on German business data collected at the Institute for Employment Research.
“Total Variability Measures for Selected Quarterly Workforce Indicators and LEHD Origin Destination Employment Statistics in OnTheMap”, Kevin McKinney (U.S. Census Bureau), Lars Vilhuber (Cornell University and U.S. Census Bureau), John Abowd (Cornell University and U.S. Census Bureau), Andrew Green (Cornell University)
We report results from the first comprehensive total quality evaluation of three major indicators in the Quarterly Workforce Indicators (QWI) of the U.S. Census Bureau’s Longitudinal Employer-Household Dynamics (LEHD) Program: beginning-of-quarter employment, full-quarter employment, and average monthly earnings of full-quarter employees. Beginning-of-quarter employment is also the main tabulation variable in the LEHD Origin-Destination Employment Statistics (LODES) workplace reports as displayed in OnTheMap (OTM). The evaluation is conducted using the multiple threads generated by the edit and imputation models used in the LEHD Infrastructure File System. These threads conform to the Rubin (1987) multiple imputation model. Each implicate is the output of formal probability models that address coverage, edit, and imputation errors. Design-based sampling variability and finite population corrections are also included in the evaluation. We derive special formulas for the Rubin total variability and its components that are consistent with the disclosure avoidance system used for QWI and LODES/OTM workplace reports. These formulas allow us to publish the complete set of detailed total quality measures for QWI and LODES. The analysis reveals that the three publication variables under study are estimated very accurately for tabulations involving at least 10 jobs. Tabulations involving three to nine jobs have acceptable quality. Tabulations involving one or two jobs, which are generally suppressed in the QWI, have substantial total variability, but their publication in LODES allows the formation of larger custom aggregations, which will in general have the accuracy estimated for QWI tabulations of similar magnitude.
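For readers unfamiliar with the Rubin (1987) framework mentioned above, the following is a minimal sketch of the standard textbook combining rules for total variability across implicates. It is not the special formulas derived in the paper, which are adapted to the disclosure avoidance system and are not reproduced here. The function name `rubin_combine` and the input values are hypothetical and purely illustrative.

```python
from statistics import mean, variance


def rubin_combine(estimates, within_variances):
    """Combine per-implicate results with the standard Rubin (1987) rules.

    estimates        -- one point estimate per implicate (hypothetical inputs)
    within_variances -- the estimated sampling variance of each point estimate
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need at least two implicates and matching lengths")

    q_bar = mean(estimates)            # combined point estimate
    u_bar = mean(within_variances)     # average within-implicate variance
    b = variance(estimates)            # between-implicate variance (m - 1 denominator)
    total = u_bar + (1 + 1 / m) * b    # Rubin total variability

    return {"estimate": q_bar, "within": u_bar, "between": b, "total": total}


# Illustrative values only; these are not taken from QWI or LODES.
result = rubin_combine([10.0, 12.0, 14.0], [1.0, 1.0, 1.0])
```

The between-implicate term captures the uncertainty introduced by the edit and imputation models, while the within-implicate term captures ordinary sampling variability. The factor (1 + 1/m) corrects for using a finite number of implicates.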
“Formal Privacy Protection for Data Products Combining Individual and Employer Frames”, Ashwin Machanavajjhala (Duke University), Samuel Haney (Duke University), Matthew Graham (U.S. Census Bureau), Mark Kutzbach (U.S. Census Bureau), Lars Vilhuber (Cornell University and U.S. Census Bureau), John Abowd (Cornell University and U.S. Census Bureau)