Abstract
Static documents play a central role in multimodal applications such as meeting recording and browsing. They provide a variety of structures, in particular thematic ones, for segmenting meetings; such structures are often hard to extract from audio and video. In this article, we present four steps for creating a strong link between static documents and multimedia meeting archives. First, a document-centric meeting environment is introduced. Then, a document analysis tool is presented, which builds a multi-layered representation of documents and creates indexes that are subsequently used by document/speech and document/video alignment methods. Finally, a document-based browsing system, integrating the various alignment results, is described along with a preliminary user evaluation.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lalanne, D., Ingold, R., von Rotz, D., Behera, A., Mekhaldi, D., Popescu-Belis, A. (2005). Using Static Documents as Structured and Thematic Interfaces to Multimedia Meeting Archives. In: Bengio, S., Bourlard, H. (eds) Machine Learning for Multimodal Interaction. MLMI 2004. Lecture Notes in Computer Science, vol 3361. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30568-2_8
DOI: https://doi.org/10.1007/978-3-540-30568-2_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24509-4
Online ISBN: 978-3-540-30568-2
eBook Packages: Computer Science, Computer Science (R0)