Document Type

Dissertation

Date of Award

8-31-2021

Degree Name

Doctor of Philosophy in Computing Sciences - (Ph.D.)

Department

Computer Science

First Advisor

Usman W. Roshan

Second Advisor

Zhi Wei

Third Advisor

Ioannis Koutis

Fourth Advisor

Hai Nhat Phan

Fifth Advisor

Ji Meng Loh

Sixth Advisor

William Graves

Abstract

The zero-one loss function is less sensitive to outliers than convex surrogate losses such as hinge and cross-entropy. However, as a non-convex function it has a large number of local minima, and its non-differentiability rules out backpropagation, the method widely used to train current state-of-the-art neural networks. When zero-one loss is applied to deep neural networks, the entire training process therefore becomes challenging. On the other hand, the large set of non-unique solutions may also yield different decision boundaries when optimizing zero-one loss, making it possible to defend against transferable adversarial examples, a common weakness of deep neural network models.
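The contrast between zero-one loss and its convex surrogates can be made concrete with a short sketch (the function names and the sample margins below are illustrative, not from the dissertation): zero-one loss is bounded, so an extreme outlier costs no more than any other mistake, while hinge and cross-entropy losses grow without bound as the margin becomes more negative.

```python
import numpy as np

def zero_one_loss(margin):
    # 1 if misclassified (margin <= 0), else 0; bounded and non-convex in w
    return (margin <= 0).astype(float)

def hinge_loss(margin):
    # convex surrogate: grows linearly with the margin violation
    return np.maximum(0.0, 1.0 - margin)

def logistic_loss(margin):
    # cross-entropy for +/-1 labels: also unbounded as margin -> -inf
    return np.log1p(np.exp(-margin))

# margins for a well-classified point, a borderline error, and a gross outlier
margins = np.array([2.0, -0.5, -10.0])
print(zero_one_loss(margins))  # [0. 1. 1.] -- the outlier costs no more than any error
print(hinge_loss(margins))     # [0.  1.5 11.] -- the outlier dominates the loss
```

The unbounded losses let a single outlier dominate the objective and drag the decision boundary toward it; the flat, bounded zero-one loss does not, which is the robustness property the abstract refers to.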

This dissertation introduces a stochastic coordinate descent method to optimize linear classification models under zero-one loss. Variants of the method are then successfully applied to multi-layer neural networks with sign activations and to multi-layer convolutional neural networks to obtain higher image classification performance. On some image benchmarks, the stochastic coordinate descent method achieves accuracy close to that of stochastic gradient descent. In addition, heuristic techniques such as random node optimization, a feature pool, warm starts, step training, and additional backpropagation penetration are used to speed up training and reduce memory usage. Finally, the models' adversarial robustness is analyzed by conducting white-box attacks and decision boundary attacks, comparing zero-one loss models to those trained with more traditional loss functions such as cross-entropy.
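Because zero-one loss has no useful gradient, coordinate descent proceeds by trial updates rather than gradient steps. The following is a minimal sketch of that idea for a linear classifier, not the dissertation's exact algorithm: pick a random coordinate of the weight vector, try a small perturbation in each direction, and keep it only if the (non-differentiable) zero-one training error strictly decreases. All names, the step size, and the toy data are hypothetical choices for illustration.

```python
import numpy as np

def zero_one_train_error(w, X, y):
    # fraction of points with non-positive margin y * (X @ w)
    return np.mean(y * (X @ w) <= 0)

def stochastic_coordinate_descent(X, y, iters=2000, step=0.1, seed=0):
    """Sketch: perturb one random coordinate at a time and accept the
    update only if it lowers the zero-one loss (no gradients needed)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    best = zero_one_train_error(w, X, y)
    for _ in range(iters):
        j = rng.integers(X.shape[1])        # random coordinate to update
        for delta in (step, -step):
            w_try = w.copy()
            w_try[j] += delta
            err = zero_one_train_error(w_try, X, y)
            if err < best:                   # greedy accept
                w, best = w_try, err
                break
    return w, best

# toy linearly separable data (illustrative only)
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = np.sign(X @ rng.normal(size=5))
w, err = stochastic_coordinate_descent(X, y)
print(err)  # final zero-one training error; only decreases by construction
```

Since every accepted update strictly lowers the training error, the procedure is monotone and needs no differentiability; the dissertation's contribution lies in the heuristics (feature pool, warm start, step training, and so on) that make this search tractable for deep networks.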
