Date of Award
Doctor of Philosophy in Electrical Engineering - (Ph.D.)
Electrical and Computer Engineering
Hieu Pham Trung Nguyen
Brain-inspired computation promises a paradigm shift in information processing, both in terms of its parallel processing architecture and its ability to learn to tackle problems deemed unsolvable by traditional algorithmic approaches. The computational capability of the human brain is believed to stem from an interconnected network of 100 billion compute nodes (neurons) that interact with each other through approximately $10^{15}$ adjustable memory junctions (synapses). The conductance of the synapses is modifiable, allowing the network to learn and perform various cognitive functions. Artificial neural networks inspired by this architecture have demonstrated even superhuman performance in many complex tasks.
Computational systems based on the von Neumann architecture, however, are ill-suited to optimize and operate these large networks, as they have to constantly move data between the physically separated processor and memory units.
Crossbar arrays of nanoscale analog memory devices could store large network weight matrices in their respective conductances and could perform matrix operations without moving the weights to a processor. While this 'in-memory computation' provides an efficient and scalable architecture, the trainability of the memory devices is constrained by their limited precision, stochasticity, and non-linearity, and therefore poses a major challenge.
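The in-memory matrix operation described above can be illustrated with a minimal sketch. It assumes an idealized crossbar in which each signed weight is stored as the difference of two non-negative conductances (a differential pair, which is an assumption made here and not a detail stated in this abstract), inputs are applied as voltages, and each output is the summed current on a line. The function names, `g_max`, and the Gaussian `read_noise` term are hypothetical and serve only to illustrate the idea.

```python
import random


def weights_to_conductances(weights, g_max=1.0):
    """Map signed weights to differential conductance pairs (assumed scheme).

    Returns (g_plus, g_minus, scale), where scale converts weight units
    into conductance units so that the largest |weight| maps to g_max.
    """
    w_max = max(abs(w) for row in weights for w in row) or 1.0
    scale = g_max / w_max
    g_plus = [[max(w, 0.0) * scale for w in row] for row in weights]
    g_minus = [[max(-w, 0.0) * scale for w in row] for row in weights]
    return g_plus, g_minus, scale


def crossbar_matvec(g_plus, g_minus, voltages, scale, read_noise=0.0, rng=None):
    """Idealized crossbar read: current = sum of conductance * voltage.

    The weights never leave the array; only input voltages and output
    currents cross its boundary. read_noise is a hypothetical Gaussian
    perturbation of each effective conductance.
    """
    rng = rng or random.Random(0)
    outputs = []
    for row_p, row_m in zip(g_plus, g_minus):
        current = 0.0
        for gp, gm, v in zip(row_p, row_m, voltages):
            g = gp - gm
            if read_noise:
                g += rng.gauss(0.0, read_noise)
            current += g * v  # Ohm's law per device, summed per line
        outputs.append(current / scale)  # convert back to weight units
    return outputs


weights = [[1.0, -2.0], [0.5, 0.0]]
g_plus, g_minus, scale = weights_to_conductances(weights)
result = crossbar_matvec(g_plus, g_minus, [1.0, 1.0], scale)
```

With `read_noise` left at zero the result matches an ordinary matrix-vector product; raising it gives a rough feel for how device non-idealities perturb the computation.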
In this thesis, a mixed-precision architecture is demonstrated that uses a high-precision digital memory to compensate for the limited precision of the synaptic devices during the training of deep neural networks. In the proposed architecture, the desired weight updates are accumulated in high precision and transferred to the synaptic devices when the accumulated update exceeds a threshold representing the average device update granularity. Using this approach, deep neural networks based on experimental nanoscale devices are shown to achieve performance comparable to high-precision software simulations.
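The accumulate-and-transfer rule described above can be sketched for a single synapse as follows. This is a simplified illustration under stated assumptions, not the thesis implementation: the class name, the `pulse_noise` parameter, and the Gaussian model of a programming pulse are hypothetical, and `epsilon` stands for the average device update granularity.

```python
import random


class MixedPrecisionSynapse:
    """One synapse with a high-precision accumulator and a coarse device.

    epsilon     : assumed average change in device weight per pulse
    pulse_noise : hypothetical std. dev. of the change per pulse
    """

    def __init__(self, epsilon, pulse_noise=0.0, seed=0):
        self.epsilon = epsilon
        self.pulse_noise = pulse_noise
        self.chi = 0.0            # accumulated update, kept in high precision
        self.device_weight = 0.0  # weight actually stored on the device
        self._rng = random.Random(seed)

    def update(self, delta_w):
        """Accumulate delta_w; program the device only in whole pulses."""
        self.chi += delta_w
        n_pulses = int(self.chi / self.epsilon)  # truncates toward zero
        if n_pulses == 0:
            return 0  # below threshold: the device is left untouched
        sign = 1 if n_pulses > 0 else -1
        for _ in range(abs(n_pulses)):
            step = self.epsilon + self._rng.gauss(0.0, self.pulse_noise)
            self.device_weight += sign * step
        # Remove only the nominal transferred amount; the remainder
        # stays in the accumulator for later updates.
        self.chi -= n_pulses * self.epsilon
        return n_pulses


syn = MixedPrecisionSynapse(epsilon=0.1)
pulses = [syn.update(0.04) for _ in range(3)]  # only the third call fires
```

Small gradient contributions that a low-precision device could not represent individually are therefore not lost; they build up in `chi` until they amount to at least one programming pulse.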
Phase-change memory (PCM) devices on a prototype chip from IBM are used to experimentally demonstrate the proposed architecture. Artificial neural networks whose synapses are realized using PCM devices are trained to classify images of handwritten digits from the MNIST dataset, and the mixed-precision approach achieves training accuracies comparable to floating-point simulations. An on-chip inference experiment using the PCM devices shows that the network states are retained reliably for more than $10^6$ s. The architecture is estimated to achieve approximately 20X acceleration in training these networks compared to high-precision implementations and has the potential for at least a 100X efficiency gain in inference.
Supervised training and inference of third-generation spiking neural networks using PCM are also demonstrated on the hardware platform. New array-level conductance scaling methods are demonstrated for adaptively mapping the device conductances to network weights and for compensating for the effect of conductance drift. During the course of the study, Ge$_2$Sb$_2$Te$_5$-based PCM and Cu/SiO$_2$/W-based resistive random access memories are characterized for their gradual conductance modulation behavior, and statistically accurate models are created. The models are used to pre-validate the experiments and to test the efficacy of different synapse configurations in the training of neural networks.
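The abstract does not spell out the scaling method, so the following is only one plausible form of array-level drift compensation, offered as a sketch. It assumes the commonly used empirical power-law model of PCM drift, $G(t) = G(t_0)\,(t/t_0)^{-\nu}$, and corrects all read-outs with a single factor derived from the summed array conductance. The function names and the value of `nu` are hypothetical.

```python
def drifted_conductance(g0, t, t0=1.0, nu=0.05):
    """Empirical power-law drift model; nu is an assumed drift exponent."""
    return g0 * (t / t0) ** (-nu)


def array_scale_factor(reference, current):
    """One global factor that restores the summed array conductance."""
    total_now = sum(current)
    if total_now == 0:
        return 1.0  # nothing to rescale
    return sum(reference) / total_now


reference = [1.0, 2.0, 4.0]  # conductances right after programming
current = [drifted_conductance(g, t=1e6) for g in reference]
alpha = array_scale_factor(reference, current)
compensated = [alpha * g for g in current]
```

When every device drifts with the same exponent, as assumed in this example, the single factor recovers the original values exactly. With device-to-device variation in `nu`, it would correct only the average drift across the array.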
Collectively, this work demonstrates the feasibility of realizing high-performance learning systems that use low-precision nanoscale memory devices, with accuracies comparable to those obtained from high-precision software training. Such learning systems could have widespread applications, including energy- and memory-constrained edge computing and the Internet of Things.
Sasidharan Rajalekshmi, Nandakumar, "High-performance learning systems using low-precision nanoscale devices" (2019). Dissertations. 1405.