This project aims to accelerate the inference and training of Deep Neural Networks (DNNs) using FPGAs, targeting high energy efficiency and low latency in data centers.
We have been developing a CNN (Convolutional Neural Network) accelerator based on an embedded FPGA platform. To improve bandwidth and resource utilization, we propose a dynamic-precision data quantization method and a convolver design that is efficient for all layer types in a CNN. Results show that our data quantization flow introduces only 0.4% accuracy loss for the very deep VGG16 model when 8/4-bit quantization is used. As a case study, VGG16-SVD is implemented on an embedded FPGA platform (Xilinx Zynq). The system on the Xilinx Zynq ZC706 board achieves a frame rate of 4.45 fps with a top-5 accuracy of 86.66% using 16-bit quantization. The average performance of the convolutional layers and of the full CNN is 187.8 GOP/s and 137.0 GOP/s, respectively, at a 150 MHz working frequency.
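The core idea of dynamic-precision quantization is to keep a fixed total bit width but let the radix-point position (the number of fractional bits) vary per layer, chosen to minimize quantization error on that layer's data. A minimal sketch of that per-tensor search, assuming NumPy and a simple sum-of-absolute-errors criterion (the function names and the exact search range are illustrative, not the project's actual API):

```python
import numpy as np

def quantize(x, bits, frac_bits):
    """Round x to signed fixed-point with `bits` total bits and
    `frac_bits` fractional bits, saturating at the representable range."""
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (bits - 1))
    qmax = 2 ** (bits - 1) - 1
    q = np.clip(np.round(x * scale), qmin, qmax)
    return q / scale  # de-quantized value, for measuring error

def best_frac_bits(x, bits):
    """Pick the radix-point position that minimizes quantization error
    for this tensor -- the per-layer search at the heart of
    dynamic-precision quantization. Search range is an assumption."""
    errors = {f: np.abs(quantize(x, bits, f) - x).sum()
              for f in range(-2, bits + 2)}
    return min(errors, key=errors.get)

# Example: quantize one layer's weights to 8 bits with a per-layer
# fractional bit width (4 bits would be used for some layers in the
# 8/4-bit configuration described above).
weights = np.random.randn(64, 3, 3, 3).astype(np.float64) * 0.1
f = best_frac_bits(weights, 8)
w_q = quantize(weights, 8, f)
```

Different layers then simply carry different `frac_bits` values, so the datapath width stays constant while the represented dynamic range adapts per layer.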