After 40 Years, How Representative Are Labor Market Outcomes in the NLSY79?
Abstract
In 1979, the National Longitudinal Survey of Youth 1979 (NLSY79) began following a group of U.S. residents born between 1957 and 1964 and has continued to reinterview these same individuals for more than four decades. Despite this long sampling period, attrition remains modest. This article shows that after 40 years of data collection, the remaining NLSY79 sample continues to be broadly representative of its national cohorts regarding key labor market outcomes. For NLSY79 age cohorts, life-cycle profiles of employment, hours worked, and earnings are comparable to those in the Current Population Survey. Moreover, the distribution of lifetime earnings over the age range 25 to 55 closely aligns with the distribution found in Social Security Administration data. Our results suggest that the NLSY79 can continue to provide useful data for economists and other social scientists studying life-cycle and lifetime labor market outcomes, including earnings inequality.
Introduction
The National Longitudinal Survey of Youth 1979 (NLSY79) is a long-running panel dataset for the U.S. It began in 1979 by interviewing a group of U.S. residents aged 14 to 22 (born 1957 to 1964) and has continued to reinterview these same individuals for more than four decades. The NLSY79 collects information on a wide range of topics, including demographics, family structure, labor market outcomes, health, and criminal activity. This rich information, combined with a long panel, has made the NLSY79 a valuable data source for economists and other social scientists. For example, between 2010 and 2023, the NLSY79 was used in at least 34 articles published in the “top 5” economics journals.
Research on inequality has long recognized the importance of distinguishing between the transitory and persistent components of inequality. Motivated by this, recent work has used administrative data to document features of lifetime inequality (see, e.g., Guvenen et al. (2022), who use Social Security Administration (SSA) data). A key advantage of administrative data is the large sample size that they offer. However, there are also some disadvantages to relying on administrative datasets: Access to such data is extremely limited, especially in the U.S., and some variables of interest are typically not present. For example, because the SSA data do not include information on hours worked, they cannot distinguish between inequality in earnings and inequality in wage rates.
The NLSY79’s long panel provides a publicly available dataset that can now be used to study lifetime inequality for a specific set of cohorts. Moreover, because it provides information on both earnings and hours, it can distinguish between inequality in earnings and inequality in wage rates. For example, in Bick, Blandin, and Rogerson (2024) we use the sample constructed in this article to document the relationship between lifetime hours worked, hourly wages, and earnings.
Citation
Alexander Bick, Adam Blandin and Richard Rogerson, "After 40 Years, How Representative Are Labor Market Outcomes in the NLSY79?," Federal Reserve Bank of St. Louis Review, First Quarter 2025, Vol. 107, No. 2, pp. 1-50.
https://doi.org/10.20955/r.2025.02
Editors in Chief
Michael Owyang and Juan Sanchez
This journal of scholarly research delves into monetary policy, macroeconomics, and more. Views expressed are not necessarily those of the St. Louis Fed or Federal Reserve System.