Network Support for High-performance Distributed Machine Learning

Malandrino, Francesco; Chiasserini, Carla Fabiana; Molner, Nuria; de la Oliva, Antonio

doi:10.1109/TNET.2022.3189077

The traditional approach to distributed machine learning is to adapt learning algorithms to the network, e.g., reducing updates to curb overhead. Networks based on intelligent edge, instead, make it possible to follow the opposite approach, i.e., to define the logical network topology around the learning task to perform, so as to meet the desired learning performance. In this paper, we propose a system model that captures such aspects in the context of supervised machine learning, accounting for both learning nodes (that perform computations) and information nodes (that provide data). We then formulate the problem of selecting (i) which learning and information nodes should cooperate to complete the learning task, and (ii) the number of epochs to run, in order to minimize the learning cost while meeting the target prediction error and execution time. After proving important properties of the above problem, we devise an algorithm, named DoubleClimb, that can find a (1 + 1/|I|)-competitive solution (with I being the set of information nodes), with cubic worst-case complexity. Our performance evaluation, leveraging a real-world network topology and considering both classification and regression tasks, also shows that DoubleClimb closely matches the optimum, outperforming state-of-the-art alternatives.

Network Support for High-performance Distributed Machine Learning / Malandrino, Francesco; Chiasserini, Carla Fabiana; Molner, Nuria; de la Oliva, Antonio. - In: IEEE-ACM TRANSACTIONS ON NETWORKING. - ISSN 1063-6692. - Print. - 31:1(2023), pp. 264-278. [10.1109/TNET.2022.3189077]


Publication year: 2023
DOI: https://dx.doi.org/10.1109/TNET.2022.3189077
Journal title: IEEE-ACM TRANSACTIONS ON NETWORKING
Appears in categories: 1.1 Journal article

Files in this record:

File	Access	Description	Type	License	Size	Format
journal_R2_v1_embedded.pdf	Open access	Main article	2. Post-print / Author's Accepted Manuscript	Public - All rights reserved	890.05 kB	Adobe PDF
Network_Support_for_High-Performance_Distributed_Machine_Learning-2.pdf	Restricted access	-	2a. Post-print, publisher's version / Version of Record	Non-public - Private/restricted access	1.65 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2969437

PORTO @ Archivio Istituzionale della Ricerca
