Accelerating and pruning CNNs for semantic segmentation on FPGA / Mori, Pierpaolo; Vemparala, Manoj-Rohit; Fasfous, Nael; Mitra, Saptarshi; Sarkar, Sreetama; Frickenstein, Alexander; Frickenstein, Lukas; Helms, Domenik; Nagaraja, Naveen-Shankar; Stechele, Walter; Passerone, Claudio. - In: DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference (2022), pp. 145-150. (Paper presented at the 59th ACM/IEEE Design Automation Conference, held in San Francisco (USA), July 10-14, 2022) [10.1145/3489517.3530424].

Accelerating and pruning CNNs for semantic segmentation on FPGA

Mori, Pierpaolo; Passerone, Claudio
2022

Abstract

Semantic segmentation is one of the most popular tasks in computer vision, providing pixel-wise annotations for scene understanding. However, segmentation-based convolutional neural networks require tremendous computational power. In this work, a fully-pipelined hardware accelerator with support for dilated convolution is introduced, which cuts down the redundant zero multiplications. Furthermore, we propose a genetic-algorithm-based automated channel pruning technique to jointly optimize computational complexity and model accuracy. Finally, hardware heuristics and an accurate model of the custom accelerator design enable a hardware-aware pruning framework. We achieve 2.44X lower latency with minimal degradation in semantic prediction quality (−1.98 percentage points in mean intersection over union) compared to the baseline DeepLabV3+ model, evaluated on an Arria-10 FPGA. The binary files of the FPGA design and of the baseline and pruned models can be found at github.com/pierpaolomori/SemanticSegmentationFPGA.
ISBN: 978-1-4503-9142-9
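To illustrate the zero-skipping idea behind the dilated-convolution support described in the abstract, the following is a minimal NumPy sketch. The function name, the single-channel setting, and the padding-free layout are assumptions made for illustration; they do not reflect the paper's actual hardware implementation.

```python
import numpy as np

def dilated_conv2d_skip_zeros(x, w, dilation=2):
    """Single-channel 2D dilated convolution (no padding, stride 1).

    Instead of expanding the kernel to a (dilation*(k-1)+1)^2 grid full of
    zeros and multiplying every position, only the k*k non-zero taps are
    visited -- this is the redundant zero multiplication that a dilated-aware
    accelerator can avoid.
    """
    kh, kw = w.shape
    out_h = x.shape[0] - dilation * (kh - 1)
    out_w = x.shape[1] - dilation * (kw - 1)
    y = np.zeros((out_h, out_w))
    for i in range(kh):          # only non-zero kernel rows
        for j in range(kw):      # only non-zero kernel columns
            y += w[i, j] * x[i * dilation : i * dilation + out_h,
                             j * dilation : j * dilation + out_w]
    return y

# Example: a 3x3 kernel with dilation 2 covers a 5x5 window, yet only
# 9 multiplications per output pixel are needed instead of 25.
x = np.random.rand(8, 8)
w = np.random.rand(3, 3)
print(dilated_conv2d_skip_zeros(x, w, dilation=2).shape)  # (4, 4)
```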
Files in this product:

File: 3489517.3530424.pdf (open access)
Description: Accelerating and Pruning CNNs for Semantic Segmentation on FPGA
Type: 2a Post-print publisher's version / Version of Record
License: PUBLIC - All rights reserved
Size: 3.11 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2970797