This work develops a system that extracts semantic video units, i.e., scenes and events, from a movie sequence. The system draws on multiple media cues, including audio and visual information. In addition, speaker identification is applied to recognize the speakers present in movie dialogs.
Video [Win Media] [Quicktime]