Unreasonable effectiveness of learning neural networks: From accessible states and robust ensembles to basic algorithmic schemes

Baldassi, Carlo; Borgs, Christian; Chayes, Jennifer T; Ingrosso, Alessandro; Lucibello, Carlo; Saglietti, Luca; Zecchina, Riccardo

doi:10.1073/pnas.1608103113

In artificial neural networks, learning from data is a computationally demanding task in which a large number of connection weights are iteratively tuned through stochastic-gradient-based heuristic processes over a cost function. It is not well understood how learning occurs in these systems, in particular how they avoid getting trapped in configurations with poor computational performance. Here, we study the difficult case of networks with discrete weights, where the optimization landscape is very rough even for simple architectures, and provide theoretical and numerical evidence of the existence of rare, but extremely dense and accessible, regions of configurations in the network weight space. We define a measure, the robust ensemble (RE), which suppresses trapping by isolated configurations and amplifies the role of these dense regions. We analytically compute the RE in some exactly solvable models and also provide a general algorithmic scheme that is straightforward to implement: define a cost function given by a sum of a finite number of replicas of the original cost function, with a constraint centering the replicas around a driving assignment. To illustrate this, we derive several powerful algorithms, ranging from Markov chains to message passing to gradient descent processes, where the algorithms target the robust dense states, resulting in substantial improvements in performance. The weak dependence on the number of precision bits of the weights leads us to conjecture that very similar reasoning applies to more conventional neural networks. Analogous algorithmic schemes can also be applied to other optimization problems.
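The replicated cost function described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the centering constraint is enforced, so several choices here are assumptions made purely for illustration: the constraint is softened into a penalty weighted by a coupling `gamma`, the distance is the Hamming distance between binary weight vectors, and the original cost is the error count of a single perceptron. The names `perceptron_errors` and `replicated_cost` are hypothetical.

```python
import numpy as np

def perceptron_errors(w, X, y):
    """Original cost (assumed): number of patterns misclassified by sign(X @ w)."""
    return int(np.sum(np.sign(X @ w) != y))

def replicated_cost(center, replicas, X, y, gamma):
    """Sum of the original cost over all replicas, plus a penalty that
    keeps each replica close to the driving assignment `center`.
    The soft penalty with coupling `gamma` is an assumption of this sketch."""
    energy = sum(perceptron_errors(w, X, y) for w in replicas)
    # Hamming distance between +/-1 vectors: count of disagreeing components
    distance = sum(int(np.sum(w != center)) for w in replicas)
    return energy + gamma * distance

# Toy data: N is odd so that X @ w is never zero for +/-1 entries.
rng = np.random.default_rng(0)
N, P, n_replicas = 21, 10, 3
X = rng.choice([-1, 1], size=(P, N))
teacher = rng.choice([-1, 1], size=N)
y = np.sign(X @ teacher)  # labels produced by a hypothetical teacher

center = rng.choice([-1, 1], size=N)
replicas = [rng.choice([-1, 1], size=N) for _ in range(n_replicas)]
print(replicated_cost(center, replicas, X, y, gamma=0.5))
```

Any local search (for example, single-weight flips accepted when they lower `replicated_cost`) could then be run on the replicas and the center jointly; increasing `gamma` pulls the replicas together, which is one way to bias the search toward dense regions rather than isolated minima.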

Unreasonable effectiveness of learning neural networks: From accessible states and robust ensembles to basic algorithmic schemes / Baldassi, Carlo; Borgs, Christian; Chayes, Jennifer T; Ingrosso, Alessandro; Lucibello, Carlo; Saglietti, Luca; Zecchina, Riccardo. - In: PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA. - ISSN 0027-8424. - Print. - 113:48(2016), pp. E7655-E7662. [10.1073/pnas.1608103113]

Year of publication: 2016
DOI: https://dx.doi.org/10.1073/pnas.1608103113
Journal title: PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
Publication type: 1.1 Journal article

Files in this record:

File	Size	Format
accessible-arxiv.pdf	958.05 kB	Adobe PDF

Access: open access
Description: Main article and Supporting Material
Type: 1. Preprint / submitted version [pre-review]
License: Public - All rights reserved

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2660985

Warning: the data shown have not been validated by the university.

PORTO @ Archivio Istituzionale della Ricerca
