Abstract
With the Internet, the bulk of predictive intelligence can be obtained from public, unclassified sources, which are more accessible, ubiquitous, and valuable. Up to 80% of electronic data is textual, and the most valuable information is often encoded in pages that are neither structured nor classified. The process of accessing all this raw data, which is heterogeneous in language, and transforming it into information is therefore inextricably linked to textual analysis and synthesis, and hinges greatly on the ability to master the problems of multilinguality. Through Multilingual Text Mining, users can get an overview of large volumes of textual data in a highly readable grid, which helps them discover meaningful similarities among documents and find all related information. This paper describes the approach used by SYNTHEMA and presents a content enabling system for OSINT that provides deep semantic search and information access to large quantities of distributed multimedia. SPYWatch provides language-independent search and dynamic classification features for a broad range of data collected from several sources in a number of culturally diverse languages.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Neri, F., Priamo, A. (2008). SPYWatch, Overcoming Linguistic Barriers in Information Management. In: Ortiz-Arroyo, D., Larsen, H.L., Zeng, D.D., Hicks, D., Wagner, G. (eds) Intelligence and Security Informatics. EuroIsI 2008. Lecture Notes in Computer Science, vol 5376. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89900-6_8
DOI: https://doi.org/10.1007/978-3-540-89900-6_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89899-3
Online ISBN: 978-3-540-89900-6
eBook Packages: Computer Science, Computer Science (R0)