An analytical evaluation of search by content and interaction patterns on multimodal meeting records

  • REGULAR PAPER
  • Published in: Multimedia Systems

Abstract

It has been suggested that combining content-based indexing with automatically generated temporal metadata might help improve search and browsing of recordings of computer-mediated collaborative activities such as on-line meetings, which are characterised by extensive multimodal communication. This paper presents an analytical evaluation of the effectiveness of these techniques as implemented through automatic speech recognition and temporal mapping. In particular, it assesses the extent to which this strategy can help uncover contextual relationships between audio and text segments in recorded remote meetings. Results show that even simple temporal mapping can effectively support retrieval of recorded audio segments, improve retrieval performance in situations where speech recognition alone would have exhibited prohibitively high word error rates, and provide a basic form of semantic adaptation.

Author information

Corresponding author

Correspondence to Saturnino Luz.

Additional information

The authors are listed in alphabetical order.

About this article

Cite this article

Bouamrane, MM., Luz, S. An analytical evaluation of search by content and interaction patterns on multimodal meeting records. Multimedia Systems 13, 89–102 (2007). https://doi.org/10.1007/s00530-007-0087-8

  • DOI: https://doi.org/10.1007/s00530-007-0087-8
