Dependable Distributed Training of Compressed Machine Learning Models / Malandrino, F.; Di Giacomo, G.; Levorato, M.; Chiasserini, C. F. - ELECTRONIC. - (2024). (Paper presented at IEEE WoWMoM 2024, held in Perth, Australia, 04-07 June 2024) [DOI: 10.1109/WoWMoM60985.2024.00036].
Dependable Distributed Training of Compressed Machine Learning Models
G. Di Giacomo; C. F. Chiasserini
2024
Abstract
The existing work on the distributed training of machine learning (ML) models has consistently overlooked the distribution of the achieved learning quality, focusing instead on its average value. This leads to a poor dependability of the resulting ML models, whose performance may be much worse than expected. We fill this gap by proposing DepL, a framework for dependable learning orchestration, able to make high-quality, efficient decisions on (i) the data to leverage for learning, (ii) the models to use and when to switch among them, and (iii) the clusters of nodes, and the resources thereof, to exploit. For concreteness, we consider as possible available models a full DNN and its compressed versions. Unlike previous studies, DepL guarantees that a target learning quality is reached with a target probability, while keeping the training cost at a minimum. We prove that DepL has constant competitive ratio and polynomial complexity, and show that it outperforms the state-of-the-art by over 27% and closely matches the optimum.
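The abstract describes DepL as selecting, among candidate configurations (e.g., a model variant paired with a node cluster), one that reaches a target learning quality with a target probability at minimum training cost. The sketch below is purely illustrative and is not the algorithm from the paper: the `Config` and `cheapest_dependable` names, the sample numbers, and the Gaussian model of the achieved quality are all hypothetical, used only to show the flavor of the dependability constraint.

```python
# Illustrative sketch only -- NOT the DepL algorithm from the paper.
# Assumption: achieved learning quality of each configuration is modeled
# as Gaussian with a known mean and standard deviation (hypothetical).
from dataclasses import dataclass
from math import erf, sqrt


@dataclass
class Config:
    name: str      # e.g., a (model variant, node cluster) pair
    cost: float    # training cost of this configuration
    mean_q: float  # estimated mean of the achieved learning quality
    std_q: float   # estimated std dev of the achieved learning quality


def prob_quality_reached(c: Config, target_q: float) -> float:
    """P[quality >= target_q] under the (hypothetical) Gaussian model."""
    z = (target_q - c.mean_q) / c.std_q
    return 1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0)))  # 1 - Gaussian CDF at z


def cheapest_dependable(configs, target_q: float, target_p: float):
    """Cheapest configuration meeting the quality target with prob >= target_p."""
    feasible = [c for c in configs if prob_quality_reached(c, target_q) >= target_p]
    return min(feasible, key=lambda c: c.cost, default=None)


if __name__ == "__main__":
    candidates = [
        Config("full-DNN/cluster-A", cost=10.0, mean_q=0.92, std_q=0.01),
        Config("pruned-DNN/cluster-A", cost=6.0, mean_q=0.90, std_q=0.02),
        Config("quantized-DNN/cluster-B", cost=4.0, mean_q=0.88, std_q=0.03),
    ]
    # Cheaper configurations reach 0.89 on average but too unreliably,
    # so the dependability constraint selects the full DNN here.
    print(cheapest_dependable(candidates, target_q=0.89, target_p=0.95))
```

Note that the actual framework makes joint decisions over data, models (including when to switch among them), and clusters of nodes, with proven constant competitive ratio and polynomial complexity; the sketch covers only the per-configuration feasibility check.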
| File | Type | License | Size | Format | |
|---|---|---|---|---|---|
| pipes-5.pdf (open access) | 2. Post-print / Author's Accepted Manuscript | Public - All rights reserved | 2.08 MB | Adobe PDF | View/Open |
| Chiasserini-Dependable.pdf (restricted access) | 2a. Post-print, editorial version / Version of Record | Non-public - Private/restricted access | 482.48 kB | Adobe PDF | View/Open, Request a copy |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2986252