Comparative analysis of hidden Markov models for multi-modal dialogue scene indexing
Document Type
Conference Proceeding
Publication Date
1-1-2000
Abstract
A class of audio-visual content is segmented into dialogue scenes using the state transitions of a novel hidden Markov model (HMM). Each shot is classified using both the audio track and the visual content to determine the state/scene transitions of the model. Simulations with circular and left-to-right HMM topologies show that both perform very well with multi-modal inputs. Moreover, for the circular topology, comparisons between different training and observation sets show that audio and face information together give the most consistent results across observation sets.
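The idea in the abstract, labelling each shot with an audio-visual token and reading scene changes off the decoded HMM state sequence, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three state names, the shot tokens, all probabilities, and the helper names viterbi and scene_boundaries. The sketch uses standard Viterbi decoding, and the two transition matrices differ only in whether the last state may return to the first (circular) or not (left-to-right).

```python
import math

# Hypothetical scene states; the paper's actual state set may differ.
STATES = ["establishing", "dialogue", "transitional"]

# Circular topology: each state may stay or advance, and the last wraps to the first.
CIRCULAR = [[0.6, 0.4, 0.0],
            [0.0, 0.7, 0.3],
            [0.3, 0.0, 0.7]]

# Left-to-right topology: same as above, but the last state is absorbing.
LEFT_TO_RIGHT = [[0.6, 0.4, 0.0],
                 [0.0, 0.7, 0.3],
                 [0.0, 0.0, 1.0]]

# Illustrative emission probabilities for per-shot tokens (made-up values).
EMIT = [
    {"speech+face": 0.10, "music": 0.60, "silence": 0.30},  # establishing
    {"speech+face": 0.80, "music": 0.05, "silence": 0.15},  # dialogue
    {"speech+face": 0.10, "music": 0.30, "silence": 0.60},  # transitional
]

START = [1.0, 0.0, 0.0]  # assume every sequence opens with an establishing shot


def _log(p):
    """Log-probability that maps zero to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, start, trans, emit):
    """Return the most likely state label for each shot token in obs."""
    n = len(STATES)
    score = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n)]
    back = []  # back[t][j] = best predecessor of state j at step t + 1
    for token in obs[1:]:
        prev, score, pointers = score, [], []
        for j in range(n):
            best = max(range(n), key=lambda i: prev[i] + _log(trans[i][j]))
            pointers.append(best)
            score.append(prev[best] + _log(trans[best][j]) + _log(emit[j][token]))
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]


def scene_boundaries(path):
    """Indices of shots at which the decoded state changes."""
    return [t for t in range(1, len(path)) if path[t] != path[t - 1]]


if __name__ == "__main__":
    shots = ["music", "speech+face", "speech+face", "silence", "music", "speech+face"]
    for name, trans in (("circular", CIRCULAR), ("left-to-right", LEFT_TO_RIGHT)):
        labels = viterbi(shots, START, trans, EMIT)
        print(name, labels, scene_boundaries(labels))
```

Running the script prints the decoded labels and boundary indices for both topologies on the same toy shot sequence, which makes the effect of the wrap-around transition easy to compare.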
Identifier
0033708104 (Scopus)
ISBN
0780362934
Publication Title
ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings
External Full Text Location
https://doi.org/10.1109/ICASSP.2000.859325
ISSN
1520-6149
First Page
2401
Last Page
2404
Volume
4
Recommended Citation
Aydin Alatan, A.; Akansu, Ali N.; and Wolf, Wayne, "Comparative analysis of hidden Markov models for multi-modal dialogue scene indexing" (2000). Faculty Publications. 15733.
https://digitalcommons.njit.edu/fac_pubs/15733
