A probabilistic framework for estimating pairwise distances through crowdsourcing

Document Type

Conference Proceeding

Publication Date

1-1-2017

Abstract

Estimating the distances between all pairs of objects in a set has wide applicability in computational problems across databases, machine learning, and statistics. This work presents a probabilistic framework for estimating all pairwise distances through crowdsourcing, in which human workers provide the distances between some object pairs. Since workers are subject to error, their responses are given a probabilistic interpretation. In particular, the framework comprises three problems: (1) Given multiple feedback responses on an object pair, how do we aggregate them into a probability distribution of the distance? (2) Since the number of possible pairs is quadratic in the number of objects, how do we estimate, from the known feedback for a small number of object pairs, the unknown distances among all other object pairs? For this problem, we leverage the metric property of distance, in particular the triangle inequality, in a probabilistic setting. (3) Finally, how do we improve our estimate by soliciting additional feedback from the crowd? For all three problems, we present principled modeling and solutions. We experimentally evaluate the proposed framework on multiple real-world and large-scale synthetic datasets, enlisting workers from a crowdsourcing platform.
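
The first two problems in the abstract can be illustrated with a minimal Python sketch. This is not the paper's method: the function names, the fixed-width histogram, the bucket count, and the normalized distance range [0, 1] are all assumptions made here for illustration. The first function turns several worker responses for one pair into an empirical distribution; the second gives the deterministic range that the triangle inequality allows for an unknown distance d(i, j) when d(i, k) and d(k, j) are known.

```python
from collections import Counter


def aggregate_feedback(responses, num_buckets=4, max_distance=1.0):
    """Turn worker responses for one object pair into a histogram distribution.

    Hypothetical helper: assumes distances are normalized to [0, max_distance]
    and uses equal-width buckets, which the source does not specify.
    """
    if not responses:
        raise ValueError("at least one response is required")
    width = max_distance / num_buckets
    # Clamp the top edge so a response equal to max_distance lands in the last bucket.
    counts = Counter(min(int(r / width), num_buckets - 1) for r in responses)
    total = len(responses)
    return [counts.get(b, 0) / total for b in range(num_buckets)]


def triangle_bounds(d_ik, d_kj):
    """Range of d(i, j) permitted by the triangle inequality.

    For any metric: |d(i,k) - d(k,j)| <= d(i,j) <= d(i,k) + d(k,j).
    """
    return abs(d_ik - d_kj), d_ik + d_kj


distribution = aggregate_feedback([0.1, 0.2, 0.6, 0.3])
lower, upper = triangle_bounds(0.5, 0.25)
```

In a probabilistic setting the known distances would themselves be distributions rather than point values, so these bounds would have to be applied to the distributions' supports or combined in some other way; this sketch covers only the deterministic case.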

Identifier

85046421978 (Scopus)

ISBN

9783893180738

Publication Title

Advances in Database Technology - EDBT

External Full Text Location

https://doi.org/10.5441/002/edbt.2017.24

e-ISSN

2367-2005

First Page

258

Last Page

269

Volume

2017-March

Grant

W911NF-15-1-0020

Fund Ref

Army Research Office
