Preprints, Working Papers · Year: 2018

A Note on Lazy Training in Supervised Differentiable Programming

Abstract

In a series of recent theoretical works, it has been shown that strongly over-parameterized neural networks trained with gradient-based methods can converge linearly to zero training loss, with their parameters hardly varying. In this note, our goal is to exhibit the simple structure behind these results. In a simplified setting, we prove that "lazy training" essentially solves a kernel regression. We also show that this behavior is not so much due to over-parameterization as to a choice of scaling, often implicit, that makes it possible to linearize the model around its initialization. These theoretical results, complemented with simple numerical experiments, make it seem unlikely that "lazy training" is behind the many successes of neural networks in high-dimensional tasks.
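The scaling argument in the abstract can be illustrated numerically. The following is a minimal sketch (not the authors' code): it trains a small two-layer ReLU network h(w, x) on a toy regression task under the scaled objective F_alpha(w) = ||alpha * h(w) - y||^2 / (2 * alpha^2) and reports how far the parameters move from their initialization as the scale alpha grows. The width, step size, and synthetic data are illustrative choices, not taken from the paper.

    # Minimal sketch of lazy training via scaling (illustrative, not the authors' code).
    import numpy as np

    rng = np.random.default_rng(0)
    n, d, m = 20, 5, 200                      # samples, input dimension, hidden width
    X = rng.normal(size=(n, d))
    y = rng.normal(size=n)

    def forward(W, a, X):
        # Two-layer ReLU network: h(w, x) = (1/m) * sum_j a_j * relu(W_j . x)
        H = np.maximum(X @ W.T, 0.0)          # (n, m) hidden activations
        return H @ a / m

    def gradients(W, a, X, residual):
        # Gradient of 0.5 * ||residual||^2 w.r.t. W and a, where residual = alpha*h - y
        H = np.maximum(X @ W.T, 0.0)
        grad_a = H.T @ residual / m
        grad_W = ((residual[:, None] * (H > 0.0) * a[None, :]).T @ X) / m
        return grad_W, grad_a

    for alpha in [1.0, 10.0, 100.0]:
        W = rng.normal(size=(m, d))
        a = rng.normal(size=m)
        W0, a0 = W.copy(), a.copy()
        for _ in range(2000):
            residual = alpha * forward(W, a, X) - y
            gW, ga = gradients(W, a, X, residual)
            # Gradient step on F_alpha; the extra 1/alpha comes from the 1/alpha^2 in the loss
            W -= 0.5 / alpha * gW
            a -= 0.5 / alpha * ga
        drift = np.sqrt(np.linalg.norm(W - W0) ** 2 + np.linalg.norm(a - a0) ** 2)
        fit = 0.5 * np.sum((alpha * forward(W, a, X) - y) ** 2)
        print(f"alpha={alpha:6.1f}   squared loss={fit:.3e}   parameter drift={drift:.3e}")

Under this scaling, the parameter drift shrinks roughly like 1/alpha: for large alpha the model stays close to its linearization around the initialization, and gradient descent behaves like regression with the tangent kernel at initialization, which is the "lazy" regime discussed in the abstract.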
Main file: chizatbach2018lazy.pdf (759.19 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01945578 , version 1 (05-12-2018)
hal-01945578 , version 2 (11-12-2018)
hal-01945578 , version 3 (21-02-2019)
hal-01945578 , version 4 (08-06-2019)
hal-01945578 , version 5 (18-06-2019)
hal-01945578 , version 6 (07-01-2020)

Identifiers

  • HAL Id : hal-01945578 , version 1

Cite

Lenaic Chizat, Francis Bach. A Note on Lazy Training in Supervised Differentiable Programming. 2018. ⟨hal-01945578v1⟩
5441 views, 4495 downloads
