Proportionate Diversification of Top-k LLM Results using Database Queries

Document Type

Conference Proceeding

Publication Date

1-1-2023

Abstract

Result diversification aims to return relevant results that cover a variety of perspectives. Attribute-based diversification groups results by shared attributes (e.g., genre for movies) and selects a number of items from each group proportional to that group's share of the underlying data. However, large language models (LLMs) are not designed to produce proportionally diverse results. In this work, we propose leveraging external data sources to determine the distribution of groups related to a query and prompting LLMs to produce proportionally diverse results. This can improve result diversity by representing groups in proportion to their prevalence. Specifically, we first argue for the benefits of making top-k results from LLMs proportionally diverse. We then show how to use external benchmark databases to enable proportional diversity. Finally, we outline a framework that prompts LLMs with proportionality information from external data and discuss challenges in automating this process. Our approach provides a path to overcoming LLMs' limitations in producing proportionally diverse responses.
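
The idea in the abstract, turning group counts from an external database into per-group quotas for a top-k prompt, can be illustrated with a minimal sketch. This is not the authors' framework: the function names `proportional_quota` and `build_prompt`, the genre counts, the SQL query in the comment, and the use of the largest remainder method for rounding are all assumptions made for illustration.

```python
def proportional_quota(group_counts, k):
    """Split k result slots across groups in proportion to their counts.

    Rounding uses the largest remainder method (an assumed choice).
    """
    total = sum(group_counts.values())
    if total == 0 or k <= 0:
        return {g: 0 for g in group_counts}
    # Exact (fractional) share of the k slots for each group.
    exact = {g: k * c / total for g, c in group_counts.items()}
    # Start from the integer part of each share.
    quota = {g: int(share) for g, share in exact.items()}
    leftover = k - sum(quota.values())
    # Hand the remaining slots to the groups with the largest fractional parts.
    by_remainder = sorted(group_counts, key=lambda g: exact[g] - quota[g], reverse=True)
    for g in by_remainder[:leftover]:
        quota[g] += 1
    return quota


def build_prompt(query, quota, attribute):
    """Append the per-group quotas to the user query (hypothetical prompt wording)."""
    parts = [f"{n} with {attribute} '{g}'" for g, n in quota.items() if n > 0]
    k = sum(quota.values())
    return f"{query} Return exactly {k} results: " + ", ".join(parts) + "."


# Hypothetical counts, e.g. the result of:
#   SELECT genre, COUNT(*) FROM movies GROUP BY genre
genre_counts = {"drama": 500, "comedy": 300, "horror": 200}
quota = proportional_quota(genre_counts, k=10)
prompt = build_prompt("List top movies.", quota, "genre")
print(prompt)
```

With these made-up counts, the ten slots split 5, 3, and 2 across drama, comedy, and horror. Other rounding rules would also be reasonable when the shares are not whole numbers; the abstract does not say which one the authors use.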

Identifier

85171260694 (Scopus)

Publication Title

CEUR Workshop Proceedings

ISSN

1613-0073

Volume

3462

Grant

1814595

Fund Ref

National Science Foundation
