
Optimal data collection for randomized control trials

Carneiro, P; Lee, S; Wilhelm, D; (2020) Optimal data collection for randomized control trials. The Econometrics Journal, 23 (1) pp. 1-31. 10.1093/ectj/utz020. Green open access

utz020.pdf - Published Version


In a randomized control trial, the precision of an average treatment effect estimator and the power of the corresponding t-test can be improved either by collecting data on additional individuals, or by collecting additional covariates that predict the outcome variable. To design the experiment, a researcher needs to solve this trade-off subject to her budget constraint. We show that this optimization problem is equivalent to optimally predicting outcomes by the covariates, which in turn can be solved using existing machine learning techniques using pre-experimental data such as other similar studies, a census, or a household survey. In two empirical applications, we show that our procedure can lead to reductions of up to 58% in the costs of data collection, or improvements of the same magnitude in the precision of the treatment effect estimator.
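
The trade-off described in the abstract can be made concrete with a small, hedged sketch. It is not the authors' procedure: it is a brute-force illustration under simplifying assumptions that are not taken from the article. It assumes that the variance of the treatment effect estimator is proportional to (1 - R^2) / n, where R^2 is the share of outcome variance predicted by the chosen covariates and n is the affordable sample size, and that each individual costs a fixed base amount plus a per-covariate amount. The function name `best_design`, the covariate names, the costs, and the R^2 values are all hypothetical.

```python
from itertools import combinations


def best_design(budget, base_cost, covariate_costs, r_squared):
    """Return (variance_proxy, covariates, n) minimizing (1 - R^2) / n.

    budget          : total data-collection budget
    base_cost       : cost of surveying one individual with no covariates
    covariate_costs : dict name -> extra cost per individual (hypothetical)
    r_squared       : dict frozenset(names) -> assumed R^2 for that set,
                      e.g. estimated beforehand on pre-experimental data
    """
    best = None
    names = sorted(covariate_costs)
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            cost_per_person = base_cost + sum(covariate_costs[c] for c in subset)
            n = int(budget // cost_per_person)  # affordable sample size
            if n < 2:
                continue  # too few individuals to run an experiment
            variance_proxy = (1.0 - r_squared[frozenset(subset)]) / n
            if best is None or variance_proxy < best[0]:
                best = (variance_proxy, subset, n)
    return best


# Hypothetical inputs, chosen only to illustrate the trade-off.
costs = {"baseline_outcome": 5, "household_size": 1}
r2 = {
    frozenset(): 0.0,
    frozenset({"baseline_outcome"}): 0.50,
    frozenset({"household_size"}): 0.05,
    frozenset({"baseline_outcome", "household_size"}): 0.52,
}
variance, chosen, n = best_design(10000, 10, costs, r2)
print(chosen, n, variance)
```

With these made-up numbers, collecting only the strongly predictive covariate beats both collecting nothing extra (a larger but noisier sample) and collecting everything (a smaller sample with little added predictive power). Enumerating all subsets is only feasible for a handful of covariates, which is one reason a practical approach would rely on a prediction or variable-selection method rather than exhaustive search.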

Type: Article
Title: Optimal data collection for randomized control trials
Open access status: An open access version is available from UCL Discovery
DOI: 10.1093/ectj/utz020
Publisher version: https://doi.org/10.1093/ectj/utz020
Language: English
Additional information: This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.
Keywords: C51 - Model Construction and Estimation
C52 - Model Evaluation, Validation, and Selection
C55 - Large Data Sets: Modeling and Analysis
C81 - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL SLASH
UCL > Provost and Vice Provost Offices > UCL SLASH > Faculty of S&HS
UCL > Provost and Vice Provost Offices > UCL SLASH > Faculty of S&HS > Dept of Economics
URI: https://discovery.ucl.ac.uk/id/eprint/10098349