Active Few-Shot Learning for Sound Event Detection

Document Type

Conference Proceeding

Publication Date

1-1-2022

Abstract

Few-shot learning has shown promising results in sound event detection, where a model can learn to recognize novel classes given only a few labeled examples (typically five) at inference time. Most research studies simulate this process by sampling support examples randomly and uniformly from all test data with the target class label. However, in many real-world scenarios, users might not even have five examples at hand, or these examples may come from a limited context and not be representative, resulting in model performance lower than expected. In this work, we relax these assumptions, and to recover model performance, we propose to use active learning techniques to efficiently sample additional informative support examples at inference time. We developed a novel dataset simulating the long-term temporal characteristics of sound events in real-world environmental soundscapes. We then ran a series of experiments with this dataset to explore the modeling and sampling choices that arise when combining few-shot learning and active learning, including different training schemes, sampling strategies, models, and temporal windows in sampling.
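One common way to realize the idea the abstract describes, combining prototypical few-shot inference with uncertainty-based active sampling, can be sketched as below. This is an illustrative sketch only, not the authors' implementation: the embedding dimensions, random features, and the margin-based uncertainty criterion are all assumptions standing in for a trained audio embedding model and a real query strategy.

```python
import numpy as np

def prototype(support_embeddings):
    """Mean embedding of the labeled support examples for one class."""
    return support_embeddings.mean(axis=0)

def query_uncertain(pool_embeddings, pos_proto, neg_proto, k=5):
    """Return indices of the k pool items the model is least certain about.

    Uncertainty is taken as the margin between the distances to the
    positive and negative prototypes: items nearly equidistant from both
    are the most informative ones to ask the user to label next.
    """
    d_pos = np.linalg.norm(pool_embeddings - pos_proto, axis=1)
    d_neg = np.linalg.norm(pool_embeddings - neg_proto, axis=1)
    margin = np.abs(d_pos - d_neg)  # small margin = high uncertainty
    return np.argsort(margin)[:k]

# Toy usage: random vectors stand in for embeddings of audio clips.
rng = np.random.default_rng(0)
support_pos = rng.normal(1.0, 0.5, size=(5, 16))   # 5 labeled positives
support_neg = rng.normal(-1.0, 0.5, size=(5, 16))  # 5 labeled negatives
pool = rng.normal(0.0, 1.0, size=(100, 16))        # unlabeled test audio

picks = query_uncertain(pool, prototype(support_pos), prototype(support_neg))
print(picks)  # indices of clips to present to the user for labeling
```

In an actual active few-shot loop, the newly labeled picks would be added to the support set, the prototypes recomputed, and the query repeated for a fixed budget of rounds.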

Identifier

85140098080 (Scopus)

Publication Title

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

External Full Text Location

https://doi.org/10.21437/Interspeech.2022-10907

e-ISSN

1990-9772

ISSN

2308-457X

First Page

1551

Last Page

1555

Volume

2022-September

Grant

1544753

Fund Ref

National Science Foundation
