Abstract
Over the last two decades, images have been produced in increasing amounts in several application domains. In medicine, for instance, a large number of images from various imaging modalities (e.g. computed tomography, magnetic resonance, nuclear imaging) are produced daily to support clinical decision-making. A fully functional Image Management System has therefore become a requirement for end-users. Despite current research, practice has shown that the problem of image management is closely tied to image representation. The contribution of this paper is twofold: it facilitates the representation of images and the extraction of their content and context descriptors. First, we introduce an expressive and extensible XML-based meta-model able to capture the metadata and content-based descriptors of images. Second, we propose an information extraction approach that provides an automatic description of image content using related metadata. It automatically generates XML instances, which mark up metadata and salient objects matched by extraction patterns. We illustrate our proposal in the medical domain of lung x-rays and present our first experimental results.
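To make the idea of pattern-driven markup concrete, the following is a minimal sketch only, not the meta-model or extractor described in the paper. It assumes a free-text report attached to an image and uses plain regular expressions as stand-ins for extraction patterns. The element names (`image`, `modality`, `salient_object`, `location`), the patterns, the function `describe_image`, and the sample report are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical extraction patterns: (XML element name, regular expression).
# These names and vocabularies are illustrative assumptions, not the paper's.
PATTERNS = [
    ("modality", re.compile(r"\b(x-ray|CT|MRI)\b", re.IGNORECASE)),
    ("salient_object", re.compile(r"\b(nodule|opacity|effusion)\b", re.IGNORECASE)),
    ("location", re.compile(r"\b(left|right) (upper|middle|lower) lobe\b", re.IGNORECASE)),
]


def describe_image(image_id, report):
    """Return an XML instance that marks up every pattern match in the report."""
    root = ET.Element("image", {"id": image_id})
    for name, pattern in PATTERNS:
        for match in pattern.finditer(report):
            # One child element per match, holding the normalized matched text.
            child = ET.SubElement(root, name)
            child.text = match.group(0).lower()
    return ET.tostring(root, encoding="unicode")


# Invented sample report, used only to exercise the sketch.
report = "Chest X-ray shows a nodule in the right upper lobe."
xml_instance = describe_image("img-001", report)
print(xml_instance)
```

In this sketch each match becomes one child element of the image description, so adding a new kind of descriptor only requires adding a pattern to the list.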
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Badr, Y., Chbeir, R. (2006). Automatic Image Description Based on Textual Data. In: Spaccapietra, S. (ed.) Journal on Data Semantics VII. Lecture Notes in Computer Science, vol 4244. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11890591_7
DOI: https://doi.org/10.1007/11890591_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46329-0
Online ISBN: 978-3-540-46330-6
eBook Packages: Computer Science, Computer Science (R0)