eprintid: 48531
rev_number: 12
eprint_status: archive
userid: 1482
importid: 105
dir: disk0/00/04/85/31
datestamp: 2024-01-12 09:15:51
lastmod: 2024-09-10 14:50:27
status_changed: 2024-09-10 14:50:27
type: article
metadata_visibility: show
creators_name: Medous, Estelle
creators_name: Goga, Camelia
creators_name: Ruiz-Gazen, Anne
creators_name: Beaumont, Jean-François
creators_name: Dessertaine, Alain
creators_name: Puech, Pauline
creators_id: anne.ruiz-gazen@tse-fr.eu
creators_idrefppn: 076419851
creators_idrefppn: 085824089
creators_halaffid: 1002422
title: QR prediction for statistical data integration
ispublished: pub
subjects: subjects_ECO
abstract: In this paper, we investigate how a big non-probability database can be used to improve estimates of finite population totals from a small probability sample through data integration techniques. In the situation where the study variable is observed in both data sources, Kim and Tam (2021) proposed two design-consistent estimators that can be justified through dual frame survey theory. First, we provide conditions ensuring that these estimators are more efficient than the Horvitz-Thompson estimator when the probability sample is selected using either Poisson sampling or simple random sampling without replacement. Then, we study the class of QR predictors, introduced by Särndal and Wright (1984), to handle the less common case where the non-probability database contains no study variable but auxiliary variables. We also require that the non-probability database is large and can be linked to the probability sample. We provide conditions ensuring that the QR predictor is asymptotically design-unbiased. We derive its asymptotic design variance and provide a consistent design-based variance estimator. We compare the design properties of different predictors, in the class of QR predictors, through a simulation study. This class includes a model-based predictor, a model-assisted estimator and a cosmetic estimator. In our simulation setups, the cosmetic estimator performed slightly better than the model-assisted estimator. These findings are confirmed by an application to La Poste data, which also illustrates that the properties of the cosmetic estimator are preserved irrespective of the observed non-probability sample.
date: 2023-12
date_type: published
publisher: Statistics Canada
id_number: ISSN: 0714-0045
official_url: http://tse-fr.eu/pub/128953
faculty: tse
divisions: tse
keywords: Cosmetic estimator
keywords: Dual frame
keywords: GREG estimator
keywords: Non-probability sample
keywords: Probability sample
keywords: Variance estimator
language: en
has_fulltext: FALSE
doi: ISSN: 0714-0045
view_date_year: 2023
full_text_status: none
publication: Survey Methodology
volume: vol. 49
number: n° 2
place_of_pub: Ottawa
pagerange: 385-410
refereed: TRUE
issn: 0714-0045
oai_identifier: oai:tse-fr.eu:128953
harvester_local_overwrite: creators_idrefppn
harvester_local_overwrite: pending
harvester_local_overwrite: creators_id
harvester_local_overwrite: number
harvester_local_overwrite: volume
harvester_local_overwrite: issn
harvester_local_overwrite: publisher
harvester_local_overwrite: pagerange
harvester_local_overwrite: place_of_pub
harvester_local_overwrite: hal_id
harvester_local_overwrite: hal_version
harvester_local_overwrite: hal_url
harvester_local_overwrite: hal_passwd
harvester_local_overwrite: creators_halaffid
oai_lastmod: 2024-09-06T09:51:29Z
oai_set: tse
site: ut1
hal_id: hal-04390127
hal_passwd: v6@&0ft
hal_version: 1
hal_url: https://hal.science/hal-04390127
citation: Medous, Estelle, Goga, Camelia, Ruiz-Gazen, Anne, Beaumont, Jean-François, Dessertaine, Alain and Puech, Pauline (2023) QR prediction for statistical data integration. Survey Methodology, vol. 49 (n° 2). pp. 385-410.