On the Asymptotic Properties of Debiased Machine Learning Estimators
Abstract: This paper studies debiased machine learning (DML) when the number of cross-fitting folds, $K_n$, may grow with the sample size $n$. Under existing fixed-$K$ asymptotic theory, the two common variants of the DML estimator, DML1 and DML2, are asymptotically equivalent, so that theory offers no guidance on which variant to use or how to choose $K_n$. We first show that when $K_n \propto \sqrt{n}$, DML1 can exhibit asymptotic bias, so standard inference based on DML1 may fail, whereas inference based on DML2 remains valid. We then show that, under an algorithmic-stability condition, the standard first-order asymptotic theory for DML2 remains valid for any $K_n \le n$. Finally, for scalar DML2 estimators whose first-step estimators admit a linear stochastic expansion, we derive a second-order approximation showing that a larger $K_n$ reduces the magnitude of the second-order asymptotic bias and mean-squared error, with diminishing marginal improvements as $K_n$ increases. Within this framework, among common choices of $K_n$ for DML2, $K_n = 10$ is preferred to $K_n = 5$.
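For concreteness, the DML1/DML2 distinction can be stated in the standard notation of the cross-fitting literature; this is a sketch using generic symbols (a score $\psi$, observations $W_i$, folds $I_1, \dots, I_{K_n}$ partitioning the sample, and nuisance estimates $\hat\eta_k$ fit on the data outside fold $I_k$), which may differ from the paper's own notation. DML1 solves the moment condition fold by fold and averages the resulting roots:
\[
\frac{1}{|I_k|} \sum_{i \in I_k} \psi\bigl(W_i; \check\theta_k, \hat\eta_k\bigr) = 0 \quad (k = 1, \dots, K_n), \qquad \hat\theta_{\mathrm{DML1}} = \frac{1}{K_n} \sum_{k=1}^{K_n} \check\theta_k,
\]
whereas DML2 solves a single moment condition that pools all folds:
\[
\frac{1}{n} \sum_{k=1}^{K_n} \sum_{i \in I_k} \psi\bigl(W_i; \hat\theta_{\mathrm{DML2}}, \hat\eta_k\bigr) = 0.
\]
Under fixed-$K$ asymptotics the two aggregation schemes are first-order equivalent; the abstract's results concern how they diverge when $K_n$ is allowed to grow with $n$.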