Learning through atypical phase transitions in overparameterized neural networks

Baldassi, C.; Lauditi, C.; Malatesta, E. M.; Pacelli, R.; Perugini, G.; Zecchina, R.

doi:10.1103/PhysRevE.106.014116

Current deep neural networks are highly overparameterized (up to billions of connection weights) and nonlinear. Yet they can fit data almost perfectly through variants of gradient descent algorithms and achieve unexpected levels of prediction accuracy without overfitting. These are formidable results that defy predictions of statistical learning and pose conceptual challenges for nonconvex optimization. In this paper, we use methods from statistical physics of disordered systems to analytically study the computational fallout of overparameterization in nonconvex binary neural network models, trained on data generated from a structurally simpler but "hidden" network. As the number of connection weights increases, we follow the changes of the geometrical structure of different minima of the error loss function and relate them to learning and generalization performance. A first transition happens at the so-called interpolation point, when solutions begin to exist (perfect fitting becomes possible). This transition reflects the properties of typical solutions, which however are in sharp minima and hard to sample. After a gap, a second transition occurs, with the discontinuous appearance of a different kind of "atypical" structures: wide regions of the weight space that are particularly solution dense and have good generalization properties. The two kinds of solutions coexist, with the typical ones being exponentially more numerous, but empirically we find that efficient algorithms sample the atypical, rare ones. This suggests that the atypical phase transition is the relevant one for learning. The results of numerical tests with realistic networks on observables suggested by the theory are consistent with this scenario.

Learning through atypical phase transitions in overparameterized neural networks / Baldassi, C., Lauditi, C., Malatesta, E.M., Pacelli, R., Perugini, G., Zecchina, R. - In: PHYSICAL REVIEW. E. - ISSN 2470-0053. - 106:1(2022), p. 014116. [10.1103/PhysRevE.106.014116]

Year of publication: 2022
DOI: https://dx.doi.org/10.1103/PhysRevE.106.014116
Journal: PHYSICAL REVIEW. E
Appears in categories: 1.1 Journal article

Files in this record:

File	Size	Format
2110.00683.pdf (open access; type: 2. Post-print / Author's Accepted Manuscript; license: Public - All rights reserved)	1.52 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2983563

PORTO @ Archivio Istituzionale della Ricerca
