Document Type

Thesis

Date of Award

Fall 12-31-2017

Degree Name

Master of Science in Computer Science - (M.S.)

Department

Computer Science

First Advisor

James Geller

Second Advisor

Soon Ae Chun

Third Advisor

Hai Nhat Phan

Abstract

This project analyzes drug-related tweets. The goal is to build a machine learning model that can distinguish tweets that indicate drug abuse from other tweets that contain the name of a drug but do not describe abuse. The drugs in question can be illegal, such as heroin, or legal with a potential for abuse, such as painkillers. Building a good machine learning model, however, requires a large amount of training data, and for each training tweet a human expert must determine whether it indicates drug abuse. This labeling is difficult work for humans. In this project, a new “Looping Predictive Method” was developed that generates large training datasets from a small seed set of tweets by repeatedly adding machine-labeled tweets to the human-labeled tweets. With this method, an accuracy improvement of 15.4% was achieved by expanding the training set from an initial 1,075 tweets to 29,908 tweets.
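
The looping idea described in the abstract, where machine-labeled tweets are repeatedly added to the human-labeled seed set, can be illustrated with a generic self-training loop. The sketch below is not the thesis's implementation. The abstract does not state which classifier or selection rule was used, so the tiny Naive Bayes model, the confidence `threshold`, the `max_rounds` limit, and all function names here are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def train(labeled):
    """Fit a tiny multinomial Naive Bayes model on (text, label) pairs.

    Labels are 1 (indicates abuse) or 0 (does not). This classifier is an
    assumption for illustration; the abstract does not name the model used.
    """
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in labeled:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict_proba(model, text):
    """Return the model's estimate of P(label = 1 | text), add-one smoothed."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label in (0, 1):
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log((class_counts[label] + 1) / (total + 2))
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        log_scores[label] = score
    top = max(log_scores.values())
    exp = {k: math.exp(v - top) for k, v in log_scores.items()}
    return exp[1] / (exp[0] + exp[1])


def looping_self_training(seed, unlabeled, threshold=0.8, max_rounds=5):
    """Grow the training set by looping: train, label the pool, keep confident labels.

    `threshold` and `max_rounds` are hypothetical parameters, not values
    taken from the thesis.
    """
    labeled = list(seed)   # starts as the human-labeled seed set
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)
        confident, remaining = [], []
        for text in pool:
            p = predict_proba(model, text)
            if p >= threshold:
                confident.append((text, 1))
            elif p <= 1 - threshold:
                confident.append((text, 0))
            else:
                remaining.append(text)
        if not confident:  # nothing new was learned, so stop looping
            break
        labeled.extend(confident)
        pool = remaining
    return train(labeled), labeled
```

Calling `looping_self_training(seed, unlabeled)` with a list of human-labeled `(text, label)` pairs and a list of unlabeled tweet texts returns the final model and the expanded training set. A confidence threshold is one common way to limit the noise that machine-generated labels introduce; whether the thesis filters labels this way is not stated in the abstract.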

Recommended Citation

Pogili, Subramanyam Reddy, "Looping predictive method to improve accuracy of a machine learning model" (2017). Theses. 45.
https://digitalcommons.njit.edu/theses/45

Included in

Computer Sciences Commons
