Author ORCID Identifier
0000-0002-9821-4183
Document Type
Dissertation
Date of Award
8-31-2025
Degree Name
Doctor of Philosophy in Business Data Science - (Ph.D.)
Department
Data Science
First Advisor
Dantong Yu
Second Advisor
Yi Chen
Third Advisor
Michael A. Ehrlich
Fourth Advisor
Junmin Shi
Fifth Advisor
Guiling Wang
Sixth Advisor
Yao Ma
Abstract
This work proposes methods for integrating domain-specific knowledge into natural language processing tasks through the use of graphs, aiming to enhance model performance across domains including finance and healthcare. The approaches fuse graph structures with modern deep learning techniques to address the challenges of missing word embeddings, label prediction, and graph representation learning for large language models.
First, a powerful embedding method built on recent advances in latent graph learning is introduced to address the critical problem of word embedding imputation. Second, a graph-enhanced label attention model designed for medical coding from clinical text is presented; it incorporates both the semantic relationships in medical data and the hierarchical structure of medical codes to improve classification accuracy. Finally, node prompt, a simple and efficient approach for node classification in text-attributed graphs, is proposed, allowing simultaneous processing of raw text and graph structure information.
Through these contributions, the work demonstrates the potential of graph-based approaches in addressing complex challenges in natural language processing. The methods developed advance the state of the art in integrating graph information into natural language processing models, leading to more accurate and robust solutions for real-world problems in different fields.
Recommended Citation
Varolgunes, Uras, "Harnessing graphs for knowledge representation in natural language processing" (2025). Dissertations. 1857.
https://digitalcommons.njit.edu/dissertations/1857
