[Slide] ISSCC 2017 Deep learning processor
14.1 A 2.9TOPS/W Deep Convolutional Neural Network SoC in FD-SOI 28nm for Intelligent Embedded Systems
In Paper 14.1, STMicroelectronics presents a deep convolutional neural network SoC in 28nm FD-SOI with energy efficiency of 2.9TOPS/W and peak performance of more than 676GOPS, operating at 200MHz with supply voltage of 0.575V.
14.2 DNPU: An 8.1TOPS/W Reconfigurable CNN-RNN Processor for General-Purpose Deep Neural Networks
In Paper 14.2, KAIST presents a reconfigurable CNN-RNN processor SoC in 65nm CMOS with energy efficiency of 8.1TOPS/W, operating at 50MHz with supply voltage of 0.77V.
14.3 A 28nm SoC with a 1.2GHz 568nJ/Prediction Sparse Deep-Neural-Network Engine with >0.1 Timing Error Rate Tolerance for IoT Applications
In Paper 14.3, Harvard University presents a fully connected (FC)-DNN accelerator SoC in 28nm CMOS, which achieves 98.5% accuracy for MNIST inference with 0.36μJ/prediction at 667MHz and 0.57μJ/prediction at 1.2GHz.
14.4 A Scalable Speech Recognizer with Deep-Neural-Network Acoustic Models and Voice-Activated Power Gating
In Paper 14.4, MIT presents an IC designed in a 65nm LP process for DNN-based automatic speech recognition (ASR) and voice-activity detection (VAD). Real-time ASR capability scales from 11 words (172μW) to 145k words (7.78mW) and the noise-robust VAD has power consumption of 22.3μW.
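To put the ASR scaling range above in proportion, the short sketch below compares how much the vocabulary grows against how much the power grows between the two quoted operating points. It uses only the four figures from the summary; the variable names are illustrative, and the ratios are plain arithmetic rather than results reported by the paper.

```python
# Figures quoted in the 14.4 summary (smallest and largest ASR configurations).
SMALL_VOCAB_WORDS = 11
SMALL_VOCAB_POWER_W = 172e-6     # 172 uW
LARGE_VOCAB_WORDS = 145_000
LARGE_VOCAB_POWER_W = 7.78e-3    # 7.78 mW

# How much each quantity grows between the two endpoints.
vocab_ratio = LARGE_VOCAB_WORDS / SMALL_VOCAB_WORDS
power_ratio = LARGE_VOCAB_POWER_W / SMALL_VOCAB_POWER_W

print(f"vocabulary grows by about {vocab_ratio:.0f}x")
print(f"power grows by about {power_ratio:.1f}x")
```

Under these figures the vocabulary grows by roughly four orders of magnitude while the power grows by well under two, which is the sense in which the recognizer is described as scalable.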
14.5 ENVISION: A 0.26-to-10TOPS/W Subword-Parallel Dynamic-Voltage-Accuracy-Frequency-Scalable Convolutional Neural Network Processor in 28nm FDSOI
In Paper 14.5, KU Leuven presents an energy-scalable CNN processor in 28nm FDSOI achieving efficiencies up to 10TOPS/W by modulating computational accuracy, voltage and frequency, while maintaining recognition rate and throughput.
14.6 A 0.62mW Ultra-Low-Power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-On Haar-Like Face Detector
In Paper 14.6, KAIST presents an ultra-low-power CNN-based face recognition (FR) processor and a CMOS image sensor integrated with an always-on Haar-like face detector in 65nm CMOS. The analog-digital hybrid Haar-like face detector improves the energy efficiency of face detection by 39% and the FR system dissipates 0.62mW at 1fps.
14.7 A 288μW Programmable Deep-Learning Processor with 270KB On-Chip Weight Storage Using Non-Uniform Memory Hierarchy for Mobile Intelligence
In Paper 14.7, the University of Michigan presents a programmable fully connected (FC)-DNN accelerator in 40nm CMOS. It achieves 374GOPS/W at 0.65V (288μW) and 3.9MHz, with configurable data precision, strategic data assignment in NUMA memory (270KB) and dynamic drowsy memory operation.
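The efficiency, power and clock figures quoted for 14.7 can be combined into an implied throughput, since energy efficiency is throughput divided by power. The sketch below assumes that 374GOPS/W, 288μW and 3.9MHz all describe the same operating point, as the summary suggests; the function names are illustrative, and the derived numbers are arithmetic on the quoted figures, not values reported by the paper.

```python
def implied_throughput_ops_per_s(efficiency_gops_per_w: float, power_w: float) -> float:
    """Throughput (ops/s) implied by an efficiency in GOPS/W and a power in W."""
    return efficiency_gops_per_w * 1e9 * power_w


def implied_ops_per_cycle(throughput_ops_per_s: float, clock_hz: float) -> float:
    """Average operations completed per clock cycle at the given clock rate."""
    return throughput_ops_per_s / clock_hz


# Figures quoted in the 14.7 summary, assumed to share one operating point.
EFFICIENCY_GOPS_PER_W = 374.0
POWER_W = 288e-6      # 288 uW
CLOCK_HZ = 3.9e6      # 3.9 MHz

throughput = implied_throughput_ops_per_s(EFFICIENCY_GOPS_PER_W, POWER_W)
per_cycle = implied_ops_per_cycle(throughput, CLOCK_HZ)

print(f"implied throughput: {throughput / 1e6:.1f} MOPS")
print(f"implied ops per cycle: {per_cycle:.1f}")
```

The same two helper functions apply to any of the other entries where efficiency, power and clock are quoted together for a single operating point.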
14.8 A 135mW Fully Integrated Data Processor for Next-Generation Sequencing
In Paper 14.8, National Taiwan University presents a data processor for Next-Generation Sequencing (NGS) in 40nm CMOS, which realizes DNA mapping, including suffix-array (SA) sorting and backward searching. Consuming 135mW at 200MHz, it achieves significantly higher energy efficiency than CPU/GPU-based implementations.