Document Type
Dissertation
Date of Award
8-31-2021
Degree Name
Doctor of Philosophy in Computing Sciences - (Ph.D.)
Department
Computer Science
First Advisor
Usman W. Roshan
Second Advisor
Zhi Wei
Third Advisor
Ioannis Koutis
Fourth Advisor
Hai Nhat Phan
Fifth Advisor
Ji Meng Loh
Sixth Advisor
William Graves
Abstract
The zero-one loss function is less sensitive to outliers than convex surrogate losses such as hinge and cross-entropy. However, as a non-convex function it has a large number of local minima, and because it is non-differentiable it cannot be optimized with backpropagation, the method widely used to train current state-of-the-art neural networks. Training deep neural networks with zero-one loss is therefore challenging. On the other hand, the large number of distinct solutions likely yields different decision boundaries when zero-one loss is optimized, which may help defend against transferable adversarial examples, a common weakness of deep neural network models.
This dissertation introduces a stochastic coordinate descent algorithm for optimizing linear classification models under zero-one loss. Variants of the algorithm are successfully applied to multi-layer neural networks with sign activations and to multi-layer convolutional neural networks to obtain higher image classification performance. On some image benchmarks, stochastic coordinate descent achieves accuracy close to that of stochastic gradient descent. Several heuristic techniques, including random node optimization, a feature pool, warm start, step training, and additional backpropagation penetration, are used to speed up training and reduce memory usage. Furthermore, the models' adversarial robustness is analyzed by conducting white-box attacks and decision boundary attacks and by comparing zero-one loss models with models trained using more traditional loss functions such as cross-entropy.
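The general idea of gradient-free coordinate descent on zero-one loss can be illustrated with a minimal sketch. This is not the dissertation's algorithm: the function names, the fixed set of candidate step sizes, the batch fraction, and the acceptance rule are all assumptions chosen for illustration, and labels are assumed to lie in {-1, +1}.

```python
import random

def zero_one_loss(w, b, X, y):
    """Fraction of points where sign(w.x + b) disagrees with the label in {-1, +1}."""
    errors = 0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        pred = 1 if score >= 0 else -1
        errors += pred != yi
    return errors / len(X)

def scd_zero_one(X, y, w=None, b=0.0, iters=200, batch_frac=0.75,
                 steps=(-1.0, -0.1, 0.1, 1.0), seed=0):
    """Illustrative stochastic coordinate descent (hypothetical parameters).

    Each iteration draws a random batch, picks one random coordinate
    (a weight or the bias), tries a few candidate updates, and keeps the
    one that strictly lowers the zero-one loss on the batch. No gradients
    are used at any point.
    """
    rng = random.Random(seed)
    d = len(X[0])
    w = list(w) if w is not None else [rng.uniform(-1, 1) for _ in range(d)]
    n = len(X)
    for _ in range(iters):
        idx = rng.sample(range(n), max(1, int(batch_frac * n)))
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        j = rng.randrange(d + 1)  # index d stands for the bias term
        best_loss, best_w, best_b = zero_one_loss(w, b, Xb, yb), w, b
        for s in steps:
            if j == d:
                cand_w, cand_b = w, b + s
            else:
                cand_w = w[:]
                cand_w[j] += s
                cand_b = b
            loss = zero_one_loss(cand_w, cand_b, Xb, yb)
            if loss < best_loss:
                best_loss, best_w, best_b = loss, cand_w, cand_b
        w, b = best_w, best_b
    return w, b
```

Because a candidate is accepted only when it strictly reduces the batch loss, setting `batch_frac=1.0` makes the loss on the full training set non-increasing across iterations; with smaller batches the full-data loss may fluctuate.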
Recommended Citation
Xue, Yunzhe, "Gradient free sign activation zero one loss neural networks for adversarially robust classification" (2021). Dissertations. 1545.
https://digitalcommons.njit.edu/dissertations/1545
Included in
Artificial Intelligence and Robotics Commons, Biomedical Engineering and Bioengineering Commons