Combining probability and nonprobability samples on an aggregated level
Villalobos-Aliste, Sofía F.; Scholtus, Sander; de Waal, Ton
Abstract
Probability surveys currently face important drawbacks: costs are relatively high and participation rates are decreasing, which could yield less accurate estimates. Meanwhile, nonprobability samples such as administrative records are gaining popularity because of their convenience and low cost. Unfortunately, nonprobability samples are often selective and, because the underlying sampling design is unknown, estimators based on such samples are generally biased. Research is ongoing on how to deal with this selection bias. In this paper, we propose a method that combines estimators from a probability sample and a nonprobability sample on an aggregated level. Our estimator is constructed as a weighted mean of the two estimators. The weight is chosen to minimize the expected value of the mean squared error (MSE) of the combined estimator under an assumed model for the bias in the estimator based on the nonprobability sample. Our method does not require any data at the level of the individual units in the samples. We performed simulation studies in which two different methods of modeling the bias in the nonprobability sample were tested. We also applied one of these methods to a real dataset from Statistics Netherlands and found that the MSE was indeed reduced in this application.
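The weighted-mean idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the two estimators are independent, that the probability-sample estimator is unbiased, and that the expected squared bias of the nonprobability estimator under the assumed bias model is supplied as a single number. Under those assumptions the expected MSE is w^2 V_P + (1 - w)^2 (V_NP + E[b^2]), which is minimized at w = (V_NP + E[b^2]) / (V_P + V_NP + E[b^2]). All function and parameter names are hypothetical.

```python
def optimal_weight(var_prob, var_nonprob, expected_sq_bias):
    """Weight on the probability-sample estimator that minimizes expected MSE.

    Assumes independent estimators and an unbiased probability-sample
    estimator (illustrative assumptions, not taken from the paper).
    """
    # Expected MSE of the nonprobability estimator: variance plus squared bias.
    mse_nonprob = var_nonprob + expected_sq_bias
    return mse_nonprob / (var_prob + mse_nonprob)


def combine(est_prob, est_nonprob, weight):
    """Weighted mean of the two aggregate-level estimates."""
    return weight * est_prob + (1.0 - weight) * est_nonprob


def expected_mse(weight, var_prob, var_nonprob, expected_sq_bias):
    """Expected MSE of the combined estimator under the same assumptions."""
    return (weight ** 2) * var_prob + ((1.0 - weight) ** 2) * (
        var_nonprob + expected_sq_bias
    )


if __name__ == "__main__":
    # Made-up inputs, for illustration only.
    w = optimal_weight(var_prob=4.0, var_nonprob=1.0, expected_sq_bias=3.0)
    print(w, combine(10.0, 12.0, w), expected_mse(w, 4.0, 1.0, 3.0))
```

Only aggregate quantities enter the calculation (two point estimates, two variances, and an expected squared bias), which matches the abstract's point that no unit-level data are needed. In this simplified setting the combined estimator's expected MSE is never larger than that of either input estimator.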
Date
2025-06
Keywords
administrative data, data integration, probability samples, nonprobability samples
Citation
Villalobos-Aliste, S F, Scholtus, S & de Waal, T 2025, 'Combining probability and nonprobability samples on an aggregated level', Journal of Official Statistics, vol. 41, no. 2, pp. 619-648. https://doi.org/10.1177/0282423X241293751
