Comparative analysis of hidden Markov models for multi-modal dialogue scene indexing

Document Type

Conference Proceeding

Publication Date

1-1-2000

Abstract

A class of audio-visual content is segmented into dialogue scenes using the state transitions of a novel hidden Markov model (HMM). Each shot is classified using both the audio track and the visual content to determine the state/scene transitions of the model. In simulations with circular and left-to-right HMM topologies, both are observed to perform very well with multi-modal inputs. Moreover, for the circular topology, comparisons between different training and observation sets show that audio and face information together give the most consistent results across observation sets.
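The difference between the two topologies named in the abstract can be illustrated with a small sketch. It is not the authors' model: the abstract does not give the states, features, or probabilities used in the paper, so the two states (dialogue, non-dialogue), the two per-shot observation symbols, and every number below are assumptions chosen only for illustration. The function names (left_to_right, circular, viterbi, scene_boundaries) are likewise hypothetical.

```python
import math


def left_to_right(n, stay=0.7):
    """Transition matrix in which state i may only stay or advance to i+1."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i == n - 1:
            A[i][i] = 1.0  # the last state is absorbing
        else:
            A[i][i] = stay
            A[i][i + 1] = 1.0 - stay
    return A


def circular(n, stay=0.7):
    """Like left-to-right, but the last state wraps around to the first."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] += stay
        A[i][(i + 1) % n] += 1.0 - stay
    return A


def _log(p):
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, A, B, pi):
    """Most likely state sequence for a discrete-observation HMM."""
    n = len(A)
    score = [_log(pi[s]) + _log(B[s][obs[0]]) for s in range(n)]
    back = []
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + _log(A[r][s]))
            ptr.append(best)
            score.append(prev[best] + _log(A[best][s]) + _log(B[s][o]))
        back.append(ptr)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def scene_boundaries(path):
    """Shot indices at which the decoded state (scene type) changes."""
    return [t for t in range(1, len(path)) if path[t] != path[t - 1]]


# Assumed toy setup: state 0 = dialogue, state 1 = non-dialogue.
# Observation 0 = shot with speech and a face, 1 = shot with neither.
B = [[0.9, 0.1],   # emission probabilities in the dialogue state
     [0.2, 0.8]]   # emission probabilities in the non-dialogue state
pi = [0.5, 0.5]
shots = [0, 0, 0, 1, 1, 1, 0, 0, 0]

circular_cuts = scene_boundaries(viterbi(shots, circular(2), B, pi))
ltr_cuts = scene_boundaries(viterbi(shots, left_to_right(2), B, pi))
print("circular:", circular_cuts)
print("left-to-right:", ltr_cuts)
```

In this toy setup the circular topology can return to the dialogue state after leaving it, while the two-state left-to-right topology cannot, so the two decoders may place scene boundaries differently on the same shot sequence.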

Identifier

0033708104 (Scopus)

ISBN

[0780362934]

Publication Title

ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings

External Full Text Location

https://doi.org/10.1109/ICASSP.2000.859325

ISSN

15206149

First Page

2401

Last Page

2404

Volume

4
