Abstract
This paper proposes an unsupervised image classification technique that combines features from different media levels. In particular, geometrical models of visual features are integrated with textual descriptions derived from Web pages through Information Extraction. Although the greater expressivity of the combined descriptions increases the complexity of the clustering algorithms adopted, dimensionality reduction methods (namely LSA, latent semantic analysis) are applied effectively. The evaluation on an image classification task confirms that the proposed Web mining model outperforms methods acting on the individual levels for cost-effective annotation.
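The pipeline outlined in the abstract (combine visual and textual features, reduce dimensionality with LSA, then cluster) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy feature values, the per-modality L2 normalization, the farthest-point initialization, and the choice of k-means as the clustering step are all assumptions made for illustration, and LSA is approximated here by a plain truncated SVD of the combined feature matrix.

```python
import numpy as np

def l2_normalize(M):
    # Scale each row to unit length so neither modality dominates (assumed choice).
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    return M / np.where(norms == 0, 1.0, norms)

def lsa_reduce(X, k):
    # Truncated SVD: keep the coordinates of each row along the top-k singular directions.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]

def farthest_point_init(X, n_clusters):
    # Deterministic seeding: start from row 0, then repeatedly add the row
    # farthest from all centers chosen so far.
    idx = [0]
    for _ in range(1, n_clusters):
        d = np.min(np.linalg.norm(X[:, None, :] - X[idx][None, :, :], axis=2), axis=1)
        idx.append(int(d.argmax()))
    return X[idx].copy()

def kmeans(X, n_clusters, n_iter=50):
    # Plain Lloyd iterations; an empty cluster keeps its previous center.
    centers = farthest_point_init(X, n_clusters)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def combine_and_cluster(visual, textual, k, n_clusters):
    # Concatenate the normalized modalities, reduce with LSA, then cluster.
    combined = np.hstack([l2_normalize(visual), l2_normalize(textual)])
    reduced = lsa_reduce(combined, k)
    labels, _ = kmeans(reduced, n_clusters)
    return reduced, labels

# Toy data (invented): six images, three visual features and four term counts each.
visual = np.array([
    [0.90, 0.10, 0.00], [0.80, 0.20, 0.00], [0.85, 0.15, 0.00],
    [0.00, 0.10, 0.90], [0.00, 0.20, 0.80], [0.05, 0.10, 0.85],
])
textual = np.array([
    [2, 1, 0, 0], [1, 2, 0, 0], [2, 2, 0, 0],
    [0, 0, 1, 2], [0, 0, 2, 1], [0, 0, 2, 2],
], dtype=float)

reduced, labels = combine_and_cluster(visual, textual, k=2, n_clusters=2)
print(reduced.shape, labels)
```

In this sketch each image is represented by a single row that joins both modalities, so the SVD can pick up correlations between visual features and terms before clustering runs in the reduced space. Weighting schemes, the number of retained dimensions, and the clustering algorithm would all need to follow the paper's full text rather than this illustration.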
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Basili, R., Petitti, R., Saracino, D. (2007). Mining Web Data for Image Semantic Annotation. In: Basili, R., Pazienza, M.T. (eds) AI*IA 2007: Artificial Intelligence and Human-Oriented Computing. AI*IA 2007. Lecture Notes in Computer Science, vol 4733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74782-6_58
DOI: https://doi.org/10.1007/978-3-540-74782-6_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74781-9
Online ISBN: 978-3-540-74782-6
eBook Packages: Computer Science, Computer Science (R0)