Abstract
Content-based querying of multimedia databases whose items are sequential is a difficult problem, especially in large-scale applications. A key issue is estimating the similarity between a query sequence and the elements of the database. In this paper, we investigate two ways to compare multimedia sequences: one, taken from the literature, is computed in the feature space, while the other is computed in a model space, leading to a representation that is less sensitive to noise. We compare these approaches on a real audio dataset, and the results point out the utility of working in the model space.


Notes
Note that this method might overestimate dissimilarity if the natural path contains a significant amount of nondiagonal parts.
Cite this article
Tavenard, R., Amsaleg, L. & Gravier, G. Model-based similarity estimation of multidimensional temporal sequences. Ann. Telecommun. 64, 381–390 (2009). https://doi.org/10.1007/s12243-009-0091-4
DOI: https://doi.org/10.1007/s12243-009-0091-4