Topical classification of domain names based on subword embeddings

Document Type

Article

Publication Date

3-1-2020

Abstract

A good domain name can help a company rapidly increase its brand awareness, attract more visitors, and therefore obtain more customers. Due to the exponential increase in the number of domain names, registrants are often frustrated because their preferred domain names are already taken. To improve registrants’ satisfaction and efficiency, and to increase the revenue of registrars (e.g., GoDaddy, Yahoo, Squarespace), it is important to suggest alternative domain names that are available. The first step is to detect registrants’ needs by classifying the attempted domain name into one of a set of categories. This study is the first to define the problem of domain name classification, which assigns a registrant’s preferred domain name to pre-defined categories. The paper proposes deep neural networks with subword embeddings built using multiple strategies. We build embeddings for the character n-grams of a domain name in one of three ways: by learning them from the training data, by learning them from an external corpus, or by learning them from an external corpus and then adjusting them on the training data. The experiments show that the proposed methods significantly outperform the baselines.
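
As an illustration of the subword units mentioned in the abstract, the following minimal sketch splits a domain name into character n-grams. The function name `char_ngrams`, the n-gram range, the `<` and `>` boundary markers, and the choice to drop everything after the first dot are all assumptions made for this example; the abstract does not specify how the paper extracts n-grams.

```python
def char_ngrams(domain, n_min=3, n_max=5):
    """Return the character n-grams of a domain name's first label.

    Hypothetical helper: the n-gram range and boundary markers are
    illustrative assumptions, not details taken from the paper.
    """
    # Keep only the label left of the first dot, e.g. "shop" from "shop.com".
    label = domain.lower().split(".")[0]
    # Mark the start and end of the label so prefixes and suffixes
    # yield n-grams distinct from those in the middle of the name.
    padded = "<" + label + ">"
    grams = []
    for n in range(n_min, n_max + 1):
        # Slide a window of width n across the padded label.
        for i in range(len(padded) - n + 1):
            grams.append(padded[i:i + n])
    return grams


if __name__ == "__main__":
    print(char_ngrams("shop.com", 3, 3))
```

Each n-gram produced this way could then be mapped to an embedding vector, whether that vector is learned from training data, taken from an external corpus, or taken from an external corpus and then fine-tuned.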

Identifier

85081131105 (Scopus)

Publication Title

Electronic Commerce Research and Applications

External Full Text Location

https://doi.org/10.1016/j.elerap.2020.100961

ISSN

15674223

Volume

40

