Designing Highly-Efficient Hardware Accelerators for Robust and Automatic Deep Learning Technologies






Deep-learning-based AI technologies, such as deep neural networks (DNNs), have recently achieved remarkable success in numerous applications, including image recognition and autonomous driving. However, conventional DNN applications face two critical issues. The first is safety. DNN models can become unreliable due to uncertainty in the data, e.g., insufficient labeled training data, measurement errors, and label noise. To address this issue, Bayesian deep learning has become an appealing solution because it provides a mathematically grounded framework for quantifying the uncertainty in a model's final prediction. As a key example, Bayesian Neural Networks (BNNs) are among the most successful Bayesian models and are increasingly employed in a wide range of real-world AI applications that demand reliable and robust decisions. However, the stochastic nature of BNN inference and training incurs orders of magnitude higher computational cost than conventional DNN models, which poses a daunting challenge to traditional hardware platforms such as CPUs and GPUs. The second issue is the labor-intensive design process. Designing the architecture of a DNN model demands significant effort and many design cycles from machine learning experts. Fortunately, the recent emergence of Neural Architecture Search (NAS) has brought neural architecture design into an era of automation. Nevertheless, the search cost remains prohibitively expensive for practical large-scale deployment in real-world applications.

This dissertation focuses on designing high-speed and energy-efficient hardware accelerators for robust and automatic deep learning technologies, namely BNNs and NAS. Two BNN accelerators, Fast-BCNN and Shift-BNN, are proposed to accelerate BNN inference and training, respectively. Furthermore, an efficient in-situ NAS search engine is introduced for large-scale deployment in real-world applications. The proposed accelerators show promise for efficiently addressing the challenges that arise when executing BNN and NAS workloads.



Hardware acceleration, Deep learning