Accelerating Low Bit-width Neural Networks at the Edge, PIM or FPGA: A Comparative Study
Document Type
Conference Proceeding
Publication Date
6-5-2023
Abstract
Deep Neural Network (DNN) acceleration with digital Processing-in-Memory (PIM) platforms at the edge is an actively explored domain with great potential not only to address memory-wall bottlenecks but also to offer orders-of-magnitude performance improvement over the von Neumann architecture. On the other hand, FPGA-based edge computing has been pursued as a potential solution to accelerate compute-intensive workloads. In this work, adopting low-bit-width neural networks, we perform a rigorous comparative inference performance analysis of a recent processing-in-SRAM tape-out against a low-resource FPGA board and a high-performance GPU to provide a guideline for the research community. We explore and highlight the key architectural constraints of these edge candidates that impact their overall performance. Our experimental data demonstrate that the processing-in-SRAM design achieves up to ∼160x speed-up and up to 228x higher efficiency (img/s/W) compared to the under-test FPGA on the CIFAR-10 dataset.
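The abstract centers on two quantities: low-bit-width weight representations and an img/s/W efficiency metric. Purely as an illustrative sketch (the paper's actual quantization scheme, platforms, and measurements are not reproduced here; the function names and all numbers below are hypothetical), the following Python snippet shows one generic symmetric uniform weight quantizer and how the efficiency metric is computed from throughput and power.

    import numpy as np

    def quantize_weights(w, n_bits=4):
        """Generic symmetric uniform quantization of a weight tensor to
        n_bits (n_bits >= 2). An illustrative scheme only; the paper's
        networks may use a different quantizer."""
        levels = 2 ** (n_bits - 1) - 1            # e.g. 7 levels for 4-bit
        scale = np.max(np.abs(w)) / levels        # per-tensor scale factor
        q = np.clip(np.round(w / scale), -levels, levels)
        return q.astype(np.int8), scale           # integer codes + scale

    def img_per_sec_per_watt(throughput_img_s, power_w):
        """Energy-efficiency metric used in the comparison (img/s/W)."""
        return throughput_img_s / power_w

    # Illustrative placeholder values, not measurements from the paper:
    rng = np.random.default_rng(0)
    w = rng.standard_normal((64, 3, 3, 3)).astype(np.float32)
    q, s = quantize_weights(w, n_bits=4)
    approx = q * s                                # dequantized weights
    print("max abs quantization error:", np.max(np.abs(w - approx)))
    print("efficiency:", img_per_sec_per_watt(1000.0, 2.5), "img/s/W")

The quantizer maps each weight to a small signed integer plus a shared scale, which is what lets low-bit-width inference map onto bitwise PIM operations or narrow FPGA datapaths; the efficiency function is simple division, included to make the units of the 228x figure explicit.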
Identifier
85163164087 (Scopus)
ISBN
9798400701252
Publication Title
Proceedings of the ACM Great Lakes Symposium on VLSI (GLSVLSI)
External Full Text Location
https://doi.org/10.1145/3583781.3590213
First Page
625
Last Page
630
Grant
2228028
Fund Ref
National Science Foundation
Recommended Citation
Kochar, Nakul; Ekiert, Lucas; Najafi, Deniz; Fan, Deliang; and Angizi, Shaahin, "Accelerating Low Bit-width Neural Networks at the Edge, PIM or FPGA: A Comparative Study" (2023). Faculty Publications. 1670.
https://digitalcommons.njit.edu/fac_pubs/1670
