Synetgy: Algorithm-hardware Co-design for ConvNet Accelerators on Embedded FPGAs

Yang, Yifan; Huang, Qijing; Wu, Bichen; Zhang, Tianjun; Ma, Liang; Gambardella, Giulio; Blott, Michaela; Lavagno, Luciano; Vissers, Kees; Wawrzynek, John; Keutzer, Kurt

doi:10.1145/3289602.3293902

Using FPGAs to accelerate ConvNets has attracted significant attention in recent years. However, FPGA accelerator design has not leveraged the latest progress in ConvNets. As a result, key application characteristics such as frames per second (FPS) are ignored in favor of simply counting GOPs, and results on accuracy, which is critical to application success, are often not even reported. In this work, we adopt an algorithm-hardware co-design approach to develop a ConvNet accelerator called Synetgy and a novel ConvNet model called DiracDeltaNet. Both the accelerator and the ConvNet are tailored to FPGA requirements. DiracDeltaNet, as the name suggests, is a ConvNet with only 1x1 convolutions, while spatial convolutions are replaced by more efficient shift operations. DiracDeltaNet achieves competitive accuracy on ImageNet (89.0% top-5), but with 48x fewer parameters and 65x fewer OPs than VGG16. We further quantize DiracDeltaNet’s weights to 1 bit and its activations to 4 bits, with less than 1% accuracy loss. These quantizations are well suited to FPGA hardware. In short, DiracDeltaNet’s small model size, low computational OP count, ultra-low precision and simplified operators allow us to co-design a highly customized computing unit for an FPGA. We implement the computing units for DiracDeltaNet on an Ultra96 SoC system through high-level synthesis. Our accelerator’s final top-5 accuracy of 88.2% on ImageNet is higher than that of all previously reported embedded FPGA accelerators. In addition, the accelerator reaches an inference speed of 96.5 FPS on the ImageNet classification task, surpassing prior works with similar accuracy by at least 16.9x.
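The operators named in the abstract (a shift in place of a spatial convolution, a 1x1 convolution, 1-bit weights and 4-bit activations) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the per-channel offset list, the mean-absolute-value weight scale and the fixed clipping range are all assumptions chosen for illustration, and the record does not say which quantization scheme the paper uses.

```python
import numpy as np

def shift_channel(plane, dy, dx):
    """Move a 2-D (H, W) plane by (dy, dx) pixels, zero-filling exposed borders."""
    h, w = plane.shape
    out = np.zeros_like(plane)
    if abs(dy) >= h or abs(dx) >= w:
        return out  # everything is shifted out of the frame
    out[max(0, dy):min(h, h + dy), max(0, dx):min(w, w + dx)] = \
        plane[max(0, -dy):min(h, h - dy), max(0, -dx):min(w, w - dx)]
    return out

def shift(x, offsets):
    """Apply one (dy, dx) offset per channel of a (C, H, W) feature map.
    The shift moves data only; it has no weights and no multiplications."""
    return np.stack([shift_channel(x[c], *offsets[c]) for c in range(x.shape[0])])

def pointwise_conv(x, w):
    """1x1 convolution: mix channels at each pixel. x: (C_in, H, W), w: (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)

def binarize_weights(w):
    """Generic 1-bit weights: sign times a single scale (assumed: mean |w|)."""
    scale = np.abs(w).mean()
    return np.where(w >= 0, scale, -scale)

def quantize_activations(x, bits=4, clip=1.0):
    """Generic uniform quantizer: clip to [0, clip], round to 2**bits levels."""
    levels = 2 ** bits - 1
    return np.round(np.clip(x, 0.0, clip) / clip * levels) / levels * clip

# Hypothetical usage: shift two channels in different directions, then mix them.
x = np.random.rand(2, 4, 4)
w = np.random.randn(3, 2)
y = pointwise_conv(shift(x, [(1, 0), (0, -1)]), binarize_weights(w))
y_q = quantize_activations(y)
```

In this sketch the shift gathers spatial context at no arithmetic cost, so all multiply-accumulate work sits in the 1x1 convolution, which is the property the abstract credits for the simplified computing unit.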

Synetgy: Algorithm-hardware Co-design for ConvNet Accelerators on Embedded FPGAs / Yang, Yifan; Huang, Qijing; Wu, Bichen; Zhang, Tianjun; Ma, Liang; Gambardella, Giulio; Blott, Michaela; Lavagno, Luciano; Vissers, Kees; Wawrzynek, John; Keutzer, Kurt. - ELECTRONIC. - (2019), pp. 23-32. (27th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays) [10.1145/3289602.3293902].


Year of publication: 2019
ISBN: 9781450361378
Appears in categories: 4.1 Contribution in conference proceedings

Files in this item:

There are no files associated with this item.


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2726980

Warning: the data displayed have not been validated by the university.

PORTO @ Archivio Istituzionale della Ricerca (Institutional Research Archive)
