Un-Fair Trojan: Targeted Backdoor Attacks Against Model Fairness
Document Type
Conference Proceeding
Publication Date
1-1-2022
Abstract
Machine learning models have proven able to make accurate predictions on complex data such as images and graphs. However, they are vulnerable to various backdoor and data poisoning attacks, which adversely affect model behavior. These attacks become more prevalent and complex in federated learning, where multiple local models contribute to a single global model and communicate using only local gradients. Additionally, these models tend to make unfair predictions with respect to certain protected features. Previously published works address these issues both individually and jointly. However, there has been little study of how an adversary can launch an attack that controls model fairness. In this work, we demonstrate that a flexible attack, which we call Un-Fair Trojan, can target model fairness while remaining stealthy and can have devastating effects on machine learning models, increasing their demographic parity by up to 30% without causing a significant decrease in model accuracy.
Identifier
85150681381 (Scopus)
ISBN
9798350346718
Publication Title
2022 9th International Conference on Software Defined Systems, SDS 2022
External Full Text Location
https://doi.org/10.1109/SDS57574.2022.10062890
Recommended Citation
Furth, Nicholas; Khreishah, Abdallah; Liu, Guanxiong; Phan, Nhat Hai; and Jararweh, Yasser, "Un-Fair Trojan: Targeted Backdoor Attacks Against Model Fairness" (2022). Faculty Publications. 3358.
https://digitalcommons.njit.edu/fac_pubs/3358