On the Asymptotic Properties of Debiased Machine Learning Estimators


Abstract: This paper studies debiased machine learning (DML) when the number of cross-fitting folds, $K_n$, may grow with the sample size $n$. Existing fixed-$K$ asymptotic theory implies that DML1 and DML2, two variants of DML estimators, are asymptotically equivalent, providing no guidance on which variant to use or how to choose $K_n$. We first show that when $K_n \propto \sqrt n$, DML1 can exhibit asymptotic bias, implying that standard inference based on DML1 may fail, whereas inference based on DML2 remains valid. We then show that, under an algorithmic-stability condition, the standard first-order asymptotic theory for DML2 remains valid for any $K_n \le n$. Finally, for scalar DML2 estimators whose first-step estimators admit a linear stochastic expansion, we derive a second-order approximation showing that larger $K_n$ reduces the magnitude of second-order asymptotic bias and mean-squared error, although the marginal improvements decline as $K_n$ increases. Within this framework, among common choices for DML2, $K_n=10$ is preferred to $K_n=5$.
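The DML1/DML2 distinction described above can be sketched in the partially linear model $y = \theta d + g(x) + \varepsilon$: DML1 solves the estimating equation separately on each held-out fold and averages the $K$ fold-wise estimates, while DML2 pools the cross-fitted moment condition across folds and solves it once. The sketch below is illustrative only and is not the paper's implementation; the function name `dml_plr` and the OLS nuisance fits are assumptions made to keep the example self-contained (in practice the nuisances would be estimated by machine learning).

```python
import numpy as np

def dml_plr(y, d, x, K=5, seed=0):
    """Cross-fitted estimates of theta in y = theta*d + g(x) + eps.

    Illustrative sketch: nuisances E[y|x] and E[d|x] are fit by OLS
    (any ML learner could be substituted). Returns (DML1, DML2).
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), K)
    fold_thetas, num, den = [], 0.0, 0.0
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        X_tr = np.column_stack([np.ones(len(train)), x[train]])
        X_te = np.column_stack([np.ones(len(test)), x[test]])
        # Fit nuisance regressions on the training folds only.
        by = np.linalg.lstsq(X_tr, y[train], rcond=None)[0]
        bd = np.linalg.lstsq(X_tr, d[train], rcond=None)[0]
        # Residualize on the held-out fold (Neyman-orthogonal score).
        ry = y[test] - X_te @ by
        rd = d[test] - X_te @ bd
        # DML1: solve the moment condition fold by fold, then average.
        fold_thetas.append((rd @ ry) / (rd @ rd))
        # DML2: accumulate the pooled moment condition across folds.
        num += rd @ ry
        den += rd @ rd
    return float(np.mean(fold_thetas)), float(num / den)
```

With the nuisances well estimated, both variants recover $\theta$; the paper's results concern how their higher-order behavior differs as $K_n$ grows.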

Amilcar Velez
Provost New Faculty Fellow

I am currently a Provost New Faculty Fellow in the Department of Economics at Cornell University and will join the faculty as an Assistant Professor of Economics in July 2026.