On the Asymptotic Properties of Debiased Machine Learning Estimators

Last version · JMP version · Online Appendix

Abstract: This paper studies debiased machine learning (DML) under a novel asymptotic framework. DML is a two-step estimation method for econometric models in which the parameter of interest depends on unknown nuisance functions. It uses $K$-fold cross-fitting to accommodate flexible machine-learning estimators of those nuisance functions. Practitioners implementing DML confront multiple decisions: whether to use DML1 or DML2 (two variants of the DML estimator), and how to choose $K$. Existing fixed-$K$ asymptotic theory establishes that DML1 and DML2 are asymptotically equivalent, offering no formal guidance on which variant to use or how to select $K$. Under a framework in which $K$ can grow with the sample size $n$, we demonstrate that DML2 offers theoretical advantages over DML1 in terms of bias, mean-squared error (MSE), and inference. When the first-step estimators admit a linear stochastic expansion, we further show that for DML2 with a scalar parameter of interest, the choice $K=n$ is asymptotically optimal in terms of second-order asymptotic bias and MSE.
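To fix ideas, the following is a minimal sketch of the two variants in a partially linear model $Y = \theta D + g(X) + \varepsilon$: both residualize $Y$ and $D$ via $K$-fold cross-fitting, but DML2 solves one pooled moment equation while DML1 averages $K$ fold-specific estimates. The quadratic least-squares learner is a hypothetical stand-in for any machine-learning first step; function names and the data-generating process below are illustrative, not from the paper.

```python
import numpy as np

def fit_nuisance(X, y):
    """Stand-in 'ML' learner: quadratic-polynomial least squares.
    (Assumption: any cross-fitted learner could be used here.)"""
    B = np.column_stack([np.ones(len(X)), X, X**2])
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return lambda Xn: np.column_stack([np.ones(len(Xn)), Xn, Xn**2]) @ coef

def dml_plm(Y, D, X, K=5, variant="DML2", seed=0):
    """Cross-fitted estimate of theta in Y = theta*D + g(X) + eps."""
    n = len(Y)
    folds = np.random.default_rng(seed).permutation(n) % K
    U = np.empty(n)  # cross-fitted residual Y - E[Y|X]
    V = np.empty(n)  # cross-fitted residual D - E[D|X]
    for k in range(K):
        test = folds == k
        train = ~test
        U[test] = Y[test] - fit_nuisance(X[train], Y[train])(X[test])
        V[test] = D[test] - fit_nuisance(X[train], D[train])(X[test])
    if variant == "DML2":
        # one moment equation pooled across all folds
        return (V @ U) / (V @ V)
    # DML1: solve the moment equation fold by fold, then average
    return np.mean([(V[folds == k] @ U[folds == k]) /
                    (V[folds == k] @ V[folds == k]) for k in range(K)])
```

With simulated data satisfying the model, both variants recover $\theta$; the paper's results concern their second-order differences as $K$ is allowed to grow with $n$.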

Amilcar Velez
Provost New Faculty Fellow

I am currently a Provost New Faculty Fellow in the Department of Economics at Cornell University and will join the faculty as an Assistant Professor of Economics in July 2026.