# On the generalization error of norm penalty linear regression models

Joint with José Luis Montiel Olea , Cynthia Rush, and Johannes Wiesel

Abstract: We study linear regression problems with convex penalty functions and empirical measure of the data. Well known examples include the square-root lasso, square-root sorted L1-penalization, and penalized least absolute deviations regression. We show that, under regularity assumptions on the penalty function, such procedures naturally provide robust generalization, as the problem can be reformulated as a distributionally robust optimization (DRO) problem for max-sliced Wasserstein ball (a type of uncertainty set of adversarial distributions), i.e. the estimate solves the penalized linear regression problem iff it solves a best (worst-case) linear prediction problem. Our proof of this result is constructive and does not rely on duality results. We argue that the max-sliced Wasserstein ball are the natural balls to consider in this framework, as they provide a computationally efficient procedure comparable to non-robust methods and optimal robustness guarantees. In fact, our generalization bounds are of order $d/n$, up to logarithmic factors, and thus do not suffer from the curse of dimensionality as is the case for known generalization bounds when using the Wasserstein metric on $\mathbb{R}^d$. Moreover, the bounds provide theoretical support for recommending a regularization parameter $\delta$ of the same order for the linear regression problem.