User-Entity Differential Privacy in Learning Natural Language Models
Document Type
Conference Proceeding
Publication Date
1-1-2022
Abstract
In this paper, we introduce a novel concept of user-entity differential privacy (UeDP) to provide formal privacy protection simultaneously to both sensitive entities in textual data and data owners in learning natural language models (NLMs). To preserve UeDP, we develop a novel algorithm, called UeDP-Alg, that optimizes the trade-off between privacy loss and model utility with a tight sensitivity bound derived from seamlessly combining user and sensitive entity sampling processes. Extensive theoretical analysis and evaluation show that UeDP-Alg outperforms baseline approaches in model utility under the same privacy budget consumption on several NLM tasks, using benchmark datasets.
Identifier
85147902940 (Scopus)
ISBN
9781665480451
Publication Title
Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022
External Full Text Location
https://doi.org/10.1109/BigData55660.2022.10020247
First Page
1465
Last Page
1474
Grant
CNS-1850094
Fund Ref
National Science Foundation
Recommended Citation
Lai, Phung; Phan, Nhat Hai; Sun, Tong; Jain, Rajiv; Dernoncourt, Franck; Gu, Jiuxiang; and Barmpalios, Nikolaos, "User-Entity Differential Privacy in Learning Natural Language Models" (2022). Faculty Publications. 3254.
https://digitalcommons.njit.edu/fac_pubs/3254