Document Type

Thesis

Date of Award

5-31-2024

Degree Name

Master of Science in Data Science (M.S.)

Department

Data Science

First Advisor

James Geller

Second Advisor

Lijing Wang

Third Advisor

Akshay Rangamani

Abstract

Generative Artificial Intelligence has recently garnered enormous attention for a variety of reasons, one of them being its capacity to automate complex tasks. As the potential for integrating Generative Artificial Intelligence across technology, media, and healthcare grows, many tasks that previously relied on manual or algorithm-intensive methods can be simplified significantly. One such task is concept extraction, a labor-intensive process, especially when performed manually to build robust ontologies. It involves analyzing text to identify and extract relevant concepts, key phrases, entities, and the relationships between those entities. This thesis explores several challenges of concept extraction from the ABCD (Adolescent Brain Cognitive Development) database, focusing on concepts pertaining to Social Determinants of Health. It addresses language-specific text separation for data processing, fine-tuning of GPT models, and tools for measuring semantic similarity. The thesis aims to enhance the understanding and application of Generative Artificial Intelligence in refining existing concept extraction processes.
