Machine Learning (ML) heavily relies on optimization techniques built upon gradient descent. Numerous gradient-based update methods have been proposed in the scientific literature, particularly in the context of neural networks, and have gained widespread adoption as optimizers in ML software libraries. This paper introduces a novel perspective by framing gradient-based update strategies using the Moreau-Yosida (MY) approximation of the loss function. Leveraging a first-order Taylor expansion, we demonstrate how the MY approximation can be concretely exploited to generalize the model update process. This enables the evaluation and comparison of the regularization properties underlying popular optimizers such as gradient descent with momentum, ADAGRAD, RMSprop, and ADAM. The MY-based unifying view opens up possibilities for designing new update schemes with customizable regularization properties. To illustrate this potential, we propose a case study that redefines the concept of closeness in the parameter space using network outputs. We present a proof-of-concept experimental procedure, demonstrating the effectiveness of this approach in continual learning scenarios. Specifically, we employ the well-known permuted MNIST dataset, progressively-permuted MNIST and CIFAR-10 benchmarks, and a non-i.i.d. stream. Additionally, we validate the update scheme's efficacy in an offline-learning scenario. By embracing the MY-based unifying view, we pave the way for advancements in optimization techniques for machine learning.
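For reference, the Moreau-Yosida approximation mentioned in the abstract is the standard Moreau envelope of the loss; the following sketch (in our own notation, not necessarily the paper's) shows how a first-order Taylor expansion of the loss inside the envelope recovers the plain gradient-descent update:

```latex
% Moreau-Yosida approximation (Moreau envelope) of a loss L, with step \lambda > 0:
L_\lambda(w) \;=\; \inf_{u}\Big[\, L(u) + \tfrac{1}{2\lambda}\,\lVert u - w\rVert^2 \,\Big],
\qquad
w_{t+1} \;=\; \operatorname{prox}_{\lambda L}(w_t)
          \;=\; \arg\min_{u}\Big[\, L(u) + \tfrac{1}{2\lambda}\,\lVert u - w_t\rVert^2 \,\Big].
% Replacing L(u) with its first-order Taylor expansion around w_t,
%   L(u) \approx L(w_t) + \nabla L(w_t)^{\top}(u - w_t),
% the argmin becomes a quadratic problem whose solution is plain gradient descent:
w_{t+1} \;=\; w_t - \lambda\, \nabla L(w_t).
```

Choosing a different notion of closeness in place of the squared Euclidean distance in the envelope is what yields the alternative update schemes the abstract alludes to.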
A new perspective on optimizers: leveraging Moreau-Yosida approximation in gradient-based learning / Betti, Alessandro; Ciravegna, Gabriele; Gori, Marco; Melacci, Stefano; Mottin, Kevin; Precioso, Frédéric. - In: INTELLIGENZA ARTIFICIALE. - ISSN 1724-8035. - 18:2(2024), pp. 301-311. [10.3233/ia-240047]
A new perspective on optimizers: leveraging Moreau-Yosida approximation in gradient-based learning
File: betti-et-al-2024-a-new-perspective-on-optimizers-leveraging-moreau-yosida-approximation-in-gradient-based-learning.pdf
Access: restricted (request a copy)
Type: 2a Post-print editorial version / Version of Record
License: Non-public / restricted access
Size: 500.56 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2996193