Abstract
Multimedia data mining refers to pattern discovery, rule extraction, and knowledge acquisition from multimedia databases. Two typical tasks in multimedia data mining are the classification and clustering of visual data in terms of semantics. The performance of such classification or clustering systems is often unsatisfactory, both because images are represented by low-level features and because improper similarity metrics are used to measure the closeness between multimedia objects. This paper considers the problem of modeling similarity for semantic image clustering. A collection of semantic images and feed-forward neural networks are used to approximate a characteristic function of equivalence classes, which is termed a learning pseudo metric (LPM). Empirical criteria for evaluating the goodness of the LPM are established. An LPM-based k-means rule is then employed for semantic image clustering, where two impurity indices, classification performance, and robustness are used for performance evaluation. An artificial image database with 11 semantic classes is employed in our simulation studies. Results demonstrate the merits and usefulness of the proposed techniques for multimedia data mining.
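The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; the abstract does not specify the network, features, or clustering details. The sketch assumes that a small one-hidden-layer network is trained on pair features |x - y| to output 0 for same-class pairs and 1 for different-class pairs, and that the learned function then replaces Euclidean distance in a k-means-style assignment step. The synthetic 2-D data, the function names (`make_pairs`, `train_lpm`, `lpm_kmeans`), and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_pairs(X, labels):
    """Pair features |x_i - x_j| with target 0 (same class) or 1 (different class)."""
    i, j = np.triu_indices(len(X), k=1)
    feats = np.abs(X[i] - X[j])
    targets = (labels[i] != labels[j]).astype(float)
    return feats, targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_lpm(feats, targets, hidden=8, lr=0.5, epochs=500):
    """Fit a one-hidden-layer network by full-batch gradient descent on cross-entropy."""
    d = feats.shape[1]
    W1 = rng.normal(0, 0.5, (d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, hidden)
    b2 = 0.0
    n = len(feats)
    for _ in range(epochs):
        H = np.tanh(feats @ W1 + b1)          # hidden activations
        p = sigmoid(H @ W2 + b2)              # predicted dissimilarity in (0, 1)
        g = (p - targets) / n                 # gradient of the loss w.r.t. the output logit
        gH = np.outer(g, W2) * (1 - H ** 2)   # back-propagated through tanh
        W2 -= lr * (H.T @ g)
        b2 -= lr * g.sum()
        W1 -= lr * (feats.T @ gH)
        b1 -= lr * gH.sum(axis=0)

    def lpm(x, y):
        # Symmetric by construction because the input is |x - y|.
        h = np.tanh(np.abs(x - y) @ W1 + b1)
        return float(sigmoid(h @ W2 + b2))

    return lpm

def lpm_kmeans(X, k, lpm, iters=10):
    """k-means-style loop (an assumed reading of the 'k-means rule'):
    assign by learned dissimilarity, update centres as feature means."""
    centres = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(iters):
        D = np.array([[lpm(x, c) for c in centres] for x in X])
        assign = D.argmin(axis=1)
        for c in range(k):
            if np.any(assign == c):
                centres[c] = X[assign == c].mean(axis=0)
    return assign

# Hypothetical toy data: two well-separated classes in 2-D feature space.
X = np.vstack([rng.normal(0.0, 0.3, (15, 2)), rng.normal(3.0, 0.3, (15, 2))])
labels = np.repeat([0, 1], 15)

feats, targets = make_pairs(X, labels)
lpm = train_lpm(feats, targets)
assign = lpm_kmeans(X, 2, lpm)
```

In this sketch the learned function is symmetric only because the network sees |x - y|; nothing forces it to return exactly 0 for identical inputs or to satisfy the triangle inequality, so its quality would have to be checked empirically, in the spirit of the goodness criteria the abstract mentions.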
Cite this article
Wang, D., Kim, YS., Park, S.C. et al. Learning Based Neural Similarity Metrics for Multimedia Data Mining. Soft Comput 11, 335–340 (2007). https://doi.org/10.1007/s00500-006-0086-2
DOI: https://doi.org/10.1007/s00500-006-0086-2