Abstract
This paper describes the iterative design and comprehensive evaluation of the Meeting Miner, a tool for browsing recorded multimedia meetings. The Meeting Miner provides access to the speech content of recordings by tracking non-verbal interaction events collected in real time during online collaborative meeting activities. It emphasises semantic relationships between speech and discrete actions performed during the meeting, and aggregates information through patterns of co-location and co-occurrence of actions. We report on the experience gained in developing functionality to enhance the user’s browsing experience, on requirements regarding information feedback, and on the importance of flexibility in the browsing tool. In particular, we demonstrate how iterative development and evaluation can reveal areas where interface adaptation can play a useful role in enhancing the system’s usability.
Cite this article
Bouamrane, MM., Luz, S. Uncovering non-verbal semantic aspects of collaborative meetings: iterative design and evaluation of the Meeting Miner. SIViP 2, 337–353 (2008). https://doi.org/10.1007/s11760-008-0085-0
DOI: https://doi.org/10.1007/s11760-008-0085-0