Abstract
Comparing video sequences is an important operation in many multimedia information systems. The similarity measure used for comparison is typically chosen to correlate with the perceptual similarity (or difference) among the video sequences, or with the similarity (or difference) in the semantics associated with them. In content-based similarity analysis, the video data are expressed in terms of different features. Similarity matching is then performed by quantifying the feature relationships between the target and query video shots, using either an individual feature or a combination of features. This study proposes two approaches to the similarity analysis of video shots. In the first approach, mosaic images are created from the video shots, and similarity is determined by comparing the mosaic images. In the second approach, key frames are extracted from each video shot, and similarity is determined by comparing the key frames of the shots. The extracted features include image histograms, slopes, edges, and wavelets. Both individual features and feature combinations are used for similarity matching with an artificial neural network. The similarity rank of the query video shots is determined from the values of the coefficient of determination and the mean absolute error. The results reported in this paper show that mosaic-based similarity analysis can be expected to yield more reliable results, whereas key frame-based similarity analysis could potentially be applied to a wider range of applications. A weighted non-linear feature combination is shown to yield better results than a single feature, and the coefficient of determination is shown to be a better criterion than the mean absolute error for similarity matching.
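The ranking step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the artificial neural network is omitted, the feature vector of each query shot is compared directly with that of the target shot, and the function names (`gray_histogram`, `r_squared`, `mean_absolute_error`, `rank_shots`) are hypothetical. The sketch only shows how the coefficient of determination and the mean absolute error could be computed over histogram features and used to order query shots.

```python
def gray_histogram(pixels, bins=16):
    """Normalized histogram of 8-bit intensities (iterable of ints in 0..255)."""
    counts = [0] * bins
    total = 0
    for p in pixels:
        counts[p * bins // 256] += 1  # map intensity 0..255 to a bin index
        total += 1
    return [c / total for c in counts]


def r_squared(target, predicted):
    """Coefficient of determination, treating `predicted` as an estimate of `target`."""
    mean_t = sum(target) / len(target)
    ss_res = sum((t - p) ** 2 for t, p in zip(target, predicted))
    ss_tot = sum((t - mean_t) ** 2 for t in target)
    if ss_tot == 0:
        raise ValueError("target feature vector has zero variance")
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(target, predicted):
    """Average absolute difference between two feature vectors."""
    return sum(abs(t - p) for t, p in zip(target, predicted)) / len(target)


def rank_shots(target_feat, query_feats):
    """Return (name, R^2, MAE) tuples, most similar first (highest R^2).

    Assumption: shots are ordered by R^2 alone; MAE is reported alongside.
    """
    scores = [
        (name, r_squared(target_feat, feat), mean_absolute_error(target_feat, feat))
        for name, feat in query_feats.items()
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical feature vectors standing in for key-frame or mosaic features.
    target = [0.1, 0.2, 0.3, 0.4]
    queries = {
        "shot_a": [0.12, 0.18, 0.31, 0.39],
        "shot_b": [0.4, 0.3, 0.2, 0.1],
    }
    for name, r2, mae in rank_shots(target, queries):
        print(f"{name}: R^2={r2:.3f}, MAE={mae:.3f}")
```

A higher coefficient of determination and a lower mean absolute error both indicate a closer match. The two criteria need not always agree on the ordering, which is why the sketch reports both.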
Cite this article
Bhandarkar, S.M., Chen, F. Similarity Analysis of Video Sequences Using an Artificial Neural Network. Appl Intell 22, 251–275 (2005). https://doi.org/10.1007/s10791-005-6622-3
DOI: https://doi.org/10.1007/s10791-005-6622-3