Document Type

Dissertation

Date of Award

Spring 5-31-2019

Degree Name

Doctor of Philosophy in Electrical Engineering - (Ph.D.)

Department

Electrical and Computer Engineering

First Advisor

Bipin Rajendran

Second Advisor

Abu Sebastian

Third Advisor

Durgamadhab Misra

Fourth Advisor

Osvaldo Simeone

Fifth Advisor

Hieu Pham Trung Nguyen

Abstract

Brain-inspired computation promises a paradigm shift in information processing, both in terms of its parallel processing architecture and its ability to learn to tackle problems deemed unsolvable by traditional algorithmic approaches. The computational capability of the human brain is believed to stem from an interconnected network of 100 billion compute nodes (neurons) that interact with each other through approximately 10¹⁵ adjustable memory junctions (synapses). The conductance of synapses is modifiable, allowing the network to learn and perform various cognitive functions. Artificial neural networks inspired by this architecture have demonstrated even super-human performance in many complex tasks.

Computational systems based on the von Neumann architecture, however, are ill-suited to optimize and operate these large networks, as they have to constantly move data between the physically separated processor and memory units.

Crossbar arrays of nanoscale analog memory devices could store large network weight matrices in their respective conductances and could perform matrix operations without moving the weights to a processor. While this 'in-memory computation' provides an efficient and scalable architecture, the limited precision, stochasticity, and non-linearity of the memory devices constrain their trainability, which poses a major challenge.
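The in-memory matrix operation described above can be illustrated with a small numerical sketch. It assumes an idealized crossbar in which each weight is stored as the difference of two non-negative conductances (a differential-pair mapping, which the abstract does not specify) and in which device noise and non-linearity are ignored. The function names, the `g_max` parameter, and the scaling scheme are hypothetical and chosen only for illustration.

```python
import numpy as np

def weights_to_conductances(W, g_max):
    """Map a real-valued weight matrix to two non-negative conductance arrays.

    Assumption: a differential pair (g_pos, g_neg) per weight, scaled so that
    the largest |weight| maps to g_max. W must contain a non-zero entry.
    """
    scale = g_max / np.max(np.abs(W))
    g_pos = np.clip(W, 0.0, None) * scale   # positive part of each weight
    g_neg = np.clip(-W, 0.0, None) * scale  # magnitude of the negative part
    return g_pos, g_neg, scale

def crossbar_matvec(g_pos, g_neg, v, scale):
    """Idealized crossbar read-out.

    Each device contributes a current I = G * V (Ohm's law); currents on a
    shared line add up (Kirchhoff's current law), which yields a dot product.
    """
    i_out = g_pos @ v - g_neg @ v  # differential current per output line
    return i_out / scale           # convert currents back to weight units

W = np.array([[1.0, -2.0], [0.5, 0.0]])
v = np.array([1.0, 1.0])
g_pos, g_neg, scale = weights_to_conductances(W, g_max=25e-6)
y = crossbar_matvec(g_pos, g_neg, v, scale)
```

In this idealized setting `y` matches `W @ v` up to floating-point error; the weights themselves are never moved, only the input voltages and output currents are.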

In this thesis, a mixed-precision architecture is demonstrated which uses a high-precision digital memory to compensate for the limited precision of the synaptic devices during the training of deep neural networks. In the proposed architecture, the desired weight updates are accumulated in high precision and transferred to the synaptic devices when the accumulated update exceeds a threshold representing the average device update granularity. With this approach, deep neural networks based on experimental nanoscale devices are shown to achieve performance comparable to high-precision software simulations.
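The accumulate-and-transfer rule described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the thesis: the names `chi` (the high-precision accumulator), `epsilon` (the assumed average device update granularity, expressed in weight units), and `sigma` (the relative spread of a single device update) are hypothetical, and the Gaussian device model is an assumption made only so the example runs.

```python
import numpy as np

def mixed_precision_step(chi, delta_w, epsilon):
    """Accumulate desired updates and decide how many device pulses to apply.

    chi      -- high-precision accumulator, same shape as the weights
    delta_w  -- desired weight update (e.g. from backpropagation)
    epsilon  -- assumed average device update granularity
    """
    chi = chi + delta_w
    # Whole multiples of epsilon held in the accumulator, rounded toward
    # zero so that the sign of the update is preserved.
    pulses = np.trunc(chi / epsilon).astype(int)
    # Only the part not transferred to the devices stays in the accumulator.
    chi = chi - pulses * epsilon
    return chi, pulses

def apply_pulses(w_device, pulses, epsilon, rng, sigma=0.3):
    """Hypothetical device model: each pulse changes the weight by epsilon on
    average, with a Gaussian spread of sigma * epsilon per pulse."""
    spread = sigma * epsilon * np.sqrt(np.abs(pulses))
    return w_device + pulses * epsilon + rng.normal(0.0, spread)

rng = np.random.default_rng(0)
w = np.zeros(3)
chi = np.zeros(3)
chi, pulses = mixed_precision_step(chi, np.array([0.04, -0.12, 0.25]), epsilon=0.1)
w = apply_pulses(w, pulses, epsilon=0.1, rng=rng)
```

Updates smaller than `epsilon` are not lost: they remain in `chi` and contribute to a later pulse, which is how the digital memory compensates for the coarse granularity of the devices in this sketch.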

Phase-change memory (PCM) devices on a prototype chip from IBM are used to experimentally demonstrate the proposed architecture. Artificial neural networks whose synapses are realized using PCM devices are trained to classify images of handwritten digits from the MNIST dataset, and the mixed-precision approach achieves training accuracies comparable to floating-point simulations. An on-chip inference experiment using the PCM devices shows that the network states are retained reliably for more than 10⁶ s. The architecture is estimated to achieve approximately 20X acceleration in training these networks compared to high-precision implementations and has the potential for at least a 100X efficiency gain in inference.

Supervised training and inference of third-generation spiking neural networks using PCM are also demonstrated on the hardware platform. New array-level conductance scaling methods are demonstrated for adaptively mapping device conductances to network weights and for compensating for the effect of conductance drift. During the course of the study, Ge₂Sb₂Te₅-based PCM and Cu/SiO₂/W-based resistive random access memories are characterized for their gradual conductance modulation behavior, and statistically accurate models are created. The models are used to pre-validate the experiments and to test the efficacy of different synapse configurations in the training of neural networks.
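One way to picture array-level scaling for drift compensation is sketched below. The abstract does not describe the method, so everything here is an assumption: conductance drift is modeled with the commonly used empirical power law G(t) = G(t₀)·(t/t₀)^(−ν), and the compensation is a single scalar per array, computed as the ratio of a stored reference sum of conductances to the currently measured sum. The function names and the value of `nu` are hypothetical.

```python
import numpy as np

def drifted(g0, t, t0=1.0, nu=0.05):
    """Assumed power-law drift: G(t) = G(t0) * (t / t0) ** (-nu)."""
    return g0 * (t / t0) ** (-nu)

def array_scale_factor(g_ref_sum, g_now):
    """One scalar for the whole array: stored reference sum of conductances
    divided by the sum measured at read time."""
    return g_ref_sum / np.sum(g_now)

g0 = np.array([5e-6, 10e-6, 20e-6])   # conductances right after programming
g_now = drifted(g0, t=1e6)            # the same devices after 10^6 s
factor = array_scale_factor(np.sum(g0), g_now)
g_compensated = factor * g_now
```

With an identical drift exponent for every device, as assumed here, the scalar restores the original conductances exactly. If the exponent varies from device to device, a single array-level factor can only correct the average drift.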

Collectively, this work demonstrates the feasibility of realizing high-performance learning systems that use low-precision nanoscale memory devices, with accuracies comparable to those obtained from high-precision software training. Such learning systems could have widespread applications, including energy- and memory-constrained edge computing and the Internet of Things.

Recommended Citation

Sasidharan Rajalekshmi, Nandakumar, "High-performance learning systems using low-precision nanoscale devices" (2019). Dissertations. 1405.
https://digitalcommons.njit.edu/dissertations/1405

Included in

Electrical and Computer Engineering Commons, Neurosciences Commons
