Enabling Efficient Neural Network Computation Via Hardware And Software Co-Design

dc.contributor.advisorFu, Xin
dc.contributor.committeeMemberChen, Jinghong
dc.contributor.committeeMemberPan, Miao
dc.contributor.committeeMemberJackson, David R.
dc.contributor.committeeMemberWu, Xuqing
dc.creatorZhang, Xingyao
dc.creator.orcid0000-0002-8874-9520
dc.date.accessioned2022-06-30T23:37:55Z
dc.date.createdAugust 2020
dc.date.issued2020-08
dc.date.submittedAugust 2020
dc.date.updated2022-06-30T23:37:57Z
dc.description.abstractIn recent years, neural networks have achieved great success in many areas, e.g., autonomous driving, medical applications, and Intelligent Personal Assistants (IPAs). Among neural network models, the Long Short-Term Memory network (LSTM) and the Capsule Network (CapsNet) are popular but exhibit low efficiency when executed on hardware devices. In this dissertation, I introduce two hardware and software co-design approaches to efficiently execute the inference stage of the LSTM and the CapsNet. In the first work, we observe that LSTMs exhibit inefficient memory access patterns when executed on mobile GPUs, due to redundant data movement and limited off-chip bandwidth. To address the redundancy, we propose inter-cell level optimizations that improve data locality across cells with negligible accuracy loss. To relax the pressure on the limited off-chip memory bandwidth, we propose intra-cell level optimizations that dynamically skip the loads and computations of rows in the weight matrices that make only a trivial contribution to the outputs. We also introduce a lightweight module into the GPU architecture to perform runtime row skipping in the weight matrices. In the second work, we observe that CapsNet execution suffers from low efficiency due to the execution features of its routing procedure, including massive unshareable intermediate variables and intensive synchronizations. We propose a software-hardware co-designed optimization, SH-CapsNet, which includes software-level optimizations named S-CapsNet and a hybrid computing architecture design named PIM-CapsNet. At the software level, S-CapsNet reduces computation and memory accesses by exploiting the computational redundancy and data similarity of the routing procedure. At the hardware level, PIM-CapsNet leverages the processing-in-memory capability of today's 3D stacked memory to provide an off-chip in-memory acceleration solution for the routing procedure, while pipelining with the GPU's on-chip computing capability to accelerate the CNN-type layers in CapsNet.
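The intra-cell row-skipping idea described in the abstract can be sketched as follows. This is a minimal illustration, not the dissertation's exact mechanism: the contribution estimate (row L1 norm scaled by the mean input magnitude) and the threshold value are assumptions made for demonstration.

```python
import numpy as np

def skip_rows_matvec(W, x, threshold=0.1):
    """Sketch of intra-cell row skipping for a weight-matrix/vector product.

    Rows of W whose estimated contribution to the output falls below
    `threshold` are neither loaded nor computed; their outputs stay zero.
    The contribution estimate here is illustrative only.
    """
    out = np.zeros(W.shape[0])
    # Cheap per-row contribution estimate (assumption, not the thesis's criterion).
    contrib = np.abs(W).sum(axis=1) * np.abs(x).mean()
    keep = contrib >= threshold
    # Only the kept rows participate in the matrix-vector product.
    out[keep] = W[keep] @ x
    return out
```

With `threshold=0.0` this reduces to the full product; a larger threshold trades a small output error for fewer weight loads, which is the bandwidth-saving effect the intra-cell optimization targets.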
dc.description.departmentElectrical and Computer Engineering, Department of
dc.format.digitalOriginborn digital
dc.format.mimetypeapplication/pdf
dc.identifier.citationPortions of this document appear in: Zhang, Xingyao, et al. "Towards memory friendly long-short term memory networks (LSTMs) on mobile GPUs." 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). IEEE, 2018; and in: Zhang, Xingyao, et al. "Enabling Highly Efficient Capsule Networks Processing Through A PIM-Based Architecture Design." 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 2020.
dc.identifier.urihttps://hdl.handle.net/10657/10256
dc.language.isoeng
dc.rightsThe author of this work is the copyright owner. UH Libraries and the Texas Digital Library have their permission to store and provide access to this work. UH Libraries has secured permission to reproduce any and all previously published materials contained in the work. Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s).
dc.subjectComputer Architecture
dc.subjectMachine Learning Acceleration
dc.subjectEmerging Technology
dc.subjectProcessing in Memory
dc.titleEnabling Efficient Neural Network Computation Via Hardware And Software Co-Design
dc.type.dcmiText
dc.type.genreThesis
local.embargo.lift2022-08-01
local.embargo.terms2022-08-01
thesis.degree.collegeCullen College of Engineering
thesis.degree.departmentElectrical and Computer Engineering, Department of
thesis.degree.disciplineElectrical Engineering
thesis.degree.grantorUniversity of Houston
thesis.degree.levelDoctoral
thesis.degree.nameDoctor of Philosophy

Files

Original bundle
- ZHANG-DISSERTATION-2020.pdf (10.1 MB, Adobe Portable Document Format)

License bundle
- PROQUEST_LICENSE.txt (4.43 KB, Plain Text)
- LICENSE.txt (1.81 KB, Plain Text)