A human-in-the-loop attribute design framework for classification

Document Type

Conference Proceeding

Publication Date

5-13-2019

Abstract

In this paper, we present a semi-automated, “human-in-the-loop” framework for attribute design that helps human analysts transform raw attributes into effective derived attributes for classification problems. Our proposed framework is optimization-guided and fully agnostic to the underlying classification model. We present an algebra with various operators (arithmetic, relational, and logical) to transform raw attributes into derived attributes and solve two technical problems: (a) the top-k buckets design problem, which aims to present human analysts with k buckets, each containing promising choices of raw attributes, so that they can focus on these alone without having to look at all raw attributes; and (b) the top-l snippets generation problem, which iteratively aids human analysts with the top-l derived attributes involving a given attribute. For the former problem, we present an effective exact bottom-up algorithm with pruning capability, as well as random-walk-based heuristic algorithms that are intuitive and work well in practice. For the latter, we present a greedy heuristic algorithm that is scalable and effective. Rigorous evaluations on six real-world datasets show that our framework generates effective derived attributes compared to fully manual or fully automated methods.
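
To make the idea of an operator algebra and top-l snippets concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the abstract does not specify the operator set, the optimization objective, or the greedy heuristic, so the operator table `OPERATORS`, the scoring function `score`, the helper `top_l_snippets`, and the toy data are all hypothetical stand-ins. The sketch enumerates derived attributes that combine one chosen raw attribute with every other raw attribute, scores each with an assumed model-agnostic measure, and returns the l best.

```python
import operator

# Hypothetical operator algebra: the abstract names three operator families
# (arithmetic, relational, logical) but not the concrete operators used.
OPERATORS = {
    "+": operator.add,                               # arithmetic
    "-": operator.sub,                               # arithmetic
    "*": operator.mul,                               # arithmetic
    ">": lambda a, b: int(a > b),                    # relational
    "and": lambda a, b: int(bool(a) and bool(b)),    # logical
    "or": lambda a, b: int(bool(a) or bool(b)),      # logical
}


def derive(rows, left, right, op):
    """Apply one operator to two raw attributes, row by row."""
    fn = OPERATORS[op]
    return [fn(row[left], row[right]) for row in rows]


def score(values, labels):
    """Assumed stand-in objective: the best accuracy achievable by a single
    threshold rule on the derived values, in either direction."""
    n = len(labels)
    best = 0.0
    for threshold in set(values):
        predictions = [int(v >= threshold) for v in values]
        accuracy = sum(p == y for p, y in zip(predictions, labels)) / n
        best = max(best, accuracy, 1 - accuracy)
    return best


def top_l_snippets(rows, labels, anchor, l):
    """Return the l best-scoring derived attributes that involve `anchor`.
    This enumerates all candidates; it is not the paper's greedy heuristic."""
    candidates = []
    for other in rows[0]:
        if other == anchor:
            continue
        for op in OPERATORS:
            values = derive(rows, anchor, other, op)
            candidates.append((score(values, labels), f"{anchor} {op} {other}"))
    # Highest score first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return candidates[:l]


# Toy data (invented for illustration): the label is 1 when income exceeds debt.
rows = [
    {"income": 5, "debt": 3},
    {"income": 2, "debt": 4},
    {"income": 7, "debt": 1},
    {"income": 3, "debt": 6},
]
labels = [1, 0, 1, 0]

for snippet_score, snippet in top_l_snippets(rows, labels, "income", 2):
    print(f"{snippet}: {snippet_score:.2f}")
```

In a human-in-the-loop setting, the analyst would inspect the returned snippets, keep the ones that make domain sense, and repeat with another attribute; the scoring function is the piece most likely to differ from what the framework actually optimizes.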

Identifier

85066894041 (Scopus)

ISBN

9781450366748

Publication Title

Web Conference 2019: Proceedings of the World Wide Web Conference, WWW 2019

External Full Text Location

https://doi.org/10.1145/3308558.3313547

First Page

1612

Last Page

1622

Grant

1814595

Fund Ref

National Science Foundation

