Faculty Publications

Multi-modal dialog scene detection using hidden Markov models for content-based multimedia indexing

A. Aydin Alatan, Middle East Technical University (METU)
Ali N. Akansu, New Jersey Institute of Technology
Wayne Wolf, School of Engineering and Applied Science

Document Type

Article

Publication Date

6-1-2001

Abstract

A class of audio-visual data (fiction entertainment: movies, TV series) is segmented into scenes that contain dialogs, using a novel hidden Markov model (HMM)-based method. Each shot is classified using both the audio track (via classification of speech, silence and music) and the visual content (face and location information). The result of this shot-based classification is an audio-visual token, which the HMM state diagram uses to perform scene analysis. In simulations with circular and left-to-right HMM topologies, both are observed to perform very well with multi-modal inputs. Moreover, for the circular topology, comparisons between different training and observation sets show that audio and face information together give the most consistent results across observation sets.
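The token-then-HMM pipeline summarized in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the two state names, the fully connected two-state topology, the token layout `(audio, has_face, location_changed)`, and all probability values are hypothetical, and the parameters are hand-set rather than trained. The sketch only shows how per-shot audio-visual tokens might be decoded into scene labels with the standard Viterbi algorithm.

```python
import math

# Hypothetical scene states; the article's actual state diagram is not reproduced here.
STATES = ("dialog", "non_dialog")

# Made-up, hand-set parameters (a real system would estimate them from training data).
START = {"dialog": 0.5, "non_dialog": 0.5}
TRANS = {
    "dialog": {"dialog": 0.8, "non_dialog": 0.2},
    "non_dialog": {"dialog": 0.2, "non_dialog": 0.8},
}
AUDIO_PROB = {
    "dialog": {"speech": 0.8, "silence": 0.1, "music": 0.1},
    "non_dialog": {"speech": 0.2, "silence": 0.3, "music": 0.5},
}
FACE_PROB = {"dialog": 0.8, "non_dialog": 0.3}          # P(face present | state)
LOC_CHANGE_PROB = {"dialog": 0.2, "non_dialog": 0.6}    # P(location changed | state)


def emission(state, token):
    """Probability of one shot token, assuming its three cues are independent."""
    audio, has_face, location_changed = token
    p_face = FACE_PROB[state] if has_face else 1.0 - FACE_PROB[state]
    p_loc = LOC_CHANGE_PROB[state] if location_changed else 1.0 - LOC_CHANGE_PROB[state]
    return AUDIO_PROB[state][audio] * p_face * p_loc


def viterbi(tokens):
    """Return the most likely state sequence for a list of shot tokens (log domain)."""
    if not tokens:
        return []
    scores = {s: math.log(START[s]) + math.log(emission(s, tokens[0])) for s in STATES}
    backpointers = []
    for token in tokens[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor of state s for this shot.
            prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[prev] + math.log(TRANS[prev][s]) + math.log(emission(s, token))
            )
            pointers[s] = prev
        backpointers.append(pointers)
        scores = new_scores
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Three speech-with-face shots in one location, then three music shots without faces.
shots = [("speech", True, False)] * 3 + [("music", False, True)] * 3
labels = viterbi(shots)
```

With these illustrative parameters, the first three shots are labeled `dialog` and the last three `non_dialog`. Dropping a cue from the token (for example, the location flag) would be one way to imitate the comparison of observation sets mentioned in the abstract, although the article's own experimental setup may differ.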

Identifier

0035368101 (Scopus)

Publication Title

Multimedia Tools and Applications

External Full Text Location

https://doi.org/10.1023/A:1011395131992

ISSN

13807501

First Page

137

Last Page

151

Recommended Citation

Alatan, A. Aydin; Akansu, Ali N.; and Wolf, Wayne, "Multi-modal dialog scene detection using hidden Markov models for content-based multimedia indexing" (2001). Faculty Publications. 15163.
https://digitalcommons.njit.edu/fac_pubs/15163

DOI

10.1023/A:1011395131992