
γ-Razor: Hardness-Aware Dataset Pruning for Efficient Neural Network Training / Liu, Lei; Zhang, Peng; Liang, Yunji; Liu, Junrui; Morra, Lia; Guo, Bin; Yu, Zhiwen; Zhang, Yanyong; Zeng, Daniel D.. - In: IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS. - ISSN 2329-924X. - (2024), pp. 1-15. [10.1109/tcss.2024.3453600]

γ-Razor: Hardness-Aware Dataset Pruning for Efficient Neural Network Training

Morra, Lia;
2024

Abstract

Training deep neural networks (DNNs) on large-scale datasets is often inefficient, incurring large computational costs and significant energy consumption. Although great efforts have been devoted to optimizing DNNs, few studies have focused on the inefficiency caused by data samples that contribute little to model training. In this article, we empirically demonstrate that sample complexity matters for model efficiency and that selecting representative samples improves it. In particular, we propose a hardness-aware dataset pruning method (γ-Razor) that selects representative samples from large-scale datasets and removes the less valuable samples for model training. γ-Razor is a two-stage framework comprising interclass sampling and intraclass sampling. First, we introduce an inverse self-paced learning strategy to learn hard samples and adaptively adjust their weights according to the inverse frequency of effective samples in each class. For intraclass sampling, a hardness-aware cluster sampling algorithm is proposed to downsample easy samples within each class. To evaluate the performance of γ-Razor, we conducted extensive experiments on three large-scale datasets for image classification tasks. The experimental results show that models trained on the pruned datasets achieve performance competitive with their counterparts trained on the original large-scale datasets in terms of robustness and efficiency. Furthermore, models trained on the pruned datasets converge faster with lower energy consumption.
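To make the two-stage idea in the abstract concrete, the following Python sketch illustrates a generic hardness-aware pruning pipeline: a per-class budget weighted toward classes with more hard samples (interclass step), followed by within-class clustering that keeps the hardest samples of each cluster (intraclass step). This is not the authors' γ-Razor implementation; the function name `prune_dataset`, the `keep_ratio` parameter, and the use of per-sample loss as a hardness proxy are illustrative assumptions.

```python
# Minimal sketch of a hardness-aware dataset pruning pipeline.
# NOTE: not the authors' gamma-Razor code; budgets, the hardness proxy,
# and clustering choices below are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans


def prune_dataset(features, labels, hardness, keep_ratio=0.5, n_clusters=8, seed=0):
    """Return indices of a pruned subset that favors hard samples.

    features : (N, D) per-sample embeddings (e.g., from a proxy model)
    labels   : (N,) integer class labels
    hardness : (N,) per-sample hardness scores (e.g., per-sample loss)
    """
    classes = np.unique(labels)

    # Interclass step: give classes with more hard samples a larger share of
    # the budget, mimicking weighting by the inverse frequency of easy samples.
    hard_mask = hardness > np.median(hardness)
    class_hard = np.array([hard_mask[labels == c].sum() + 1 for c in classes], float)
    budget = int(keep_ratio * len(labels))
    class_budget = np.maximum(1, (budget * class_hard / class_hard.sum()).astype(int))

    kept = []
    for c, b in zip(classes, class_budget):
        idx = np.where(labels == c)[0]
        b = min(b, len(idx))
        # Intraclass step: cluster the class, then keep the hardest samples of
        # each cluster so that easy, redundant samples are downsampled.
        k = min(n_clusters, len(idx))
        assign = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(features[idx])
        per_cluster = max(1, b // k)
        for cl in range(k):
            members = idx[assign == cl]
            order = np.argsort(-hardness[members])  # hardest first
            kept.extend(members[order[:per_cluster]])
    return np.array(sorted(set(kept)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, D = 1000, 16
    X = rng.normal(size=(N, D))              # synthetic embeddings
    y = rng.integers(0, 10, size=N)          # 10 classes
    h = rng.random(N)                        # stand-in for per-sample loss
    subset = prune_dataset(X, y, h, keep_ratio=0.3)
    print(f"kept {len(subset)} of {N} samples")
```

In practice, the hardness scores would come from a trained or partially trained model (e.g., per-sample training loss), and the pruned index set would be used to build a smaller training dataset.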
Files in this record:
File	Size	Format
-Razor_Hardness-Aware_Dataset_Pruning_for_Efficient_Neural_Network_Training.pdf

not available

Type: 2. Post-print / Author's Accepted Manuscript
License: Non-public - Private/restricted access
Size: 2.36 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2993544