How does generalization behave under suitable model capacities in modern machine learning? From deterministic equivalence to function spaces
In this talk, I will discuss some fundamental questions in modern machine learning:
- What is a suitable notion of model capacity for modern machine learning models?
- How can the test risk be precisely characterized under such a capacity?
- What function space is induced by such a capacity?
- What are the fundamental limits of statistical and computational learning efficiency within this space?
My talk will partially answer these questions through the lens of norm-based capacity control. Using deterministic equivalence, we give a precise characterization of how the estimator's norm concentrates and how it governs the associated test risk. Our results show that the predicted learning curve exhibits a phase transition from the under- to the over-parameterized regime but no double descent, and that it reshapes existing scaling laws. I will also discuss path-norm-based capacities and the induced Barron spaces as a way to understand the fundamental limits of statistical efficiency, particularly sample complexity and dimension dependence, highlighting key statistical-computational gaps.
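As a purely illustrative example of a norm-based capacity measure, the sketch below computes an l1 path norm for a random two-layer ReLU network, defined here as the sum over input-to-output paths of the product of absolute weights. This is one common convention and not necessarily the exact quantity analyzed in the papers above; the network sizes, weights, and the `path_norm` helper are hypothetical.

```python
# Minimal sketch (assumed convention, not taken from the papers):
# for f(x) = sum_j a_j * relu(<w_j, x>), the l1 path norm is taken to be
# sum_j |a_j| * ||w_j||_1, i.e. the sum over all input->hidden->output
# paths of the product of absolute edge weights.
import numpy as np

rng = np.random.default_rng(0)
d, m = 10, 50                                # input dimension, hidden width
W = rng.normal(size=(m, d)) / np.sqrt(d)     # hidden-layer weights w_j (rows)
a = rng.normal(size=m) / np.sqrt(m)          # output-layer weights a_j

def path_norm(W: np.ndarray, a: np.ndarray) -> float:
    """l1 path norm of a two-layer ReLU network: sum_j |a_j| * ||w_j||_1."""
    return float(np.sum(np.abs(a) * np.abs(W).sum(axis=1)))

print(f"Path norm of the random network: {path_norm(W, a):.3f}")
```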
This talk is based on https://arxiv.org/abs/2502.01585 and https://arxiv.org/abs/2404.18769.
