Abstract
We present the architecture of a system for searching and synthesizing information within document repositories that originate from different sources, where documents are not necessarily provided in the same format or at the same level of detail. The system is expected to provide domain knowledge interfaces that enable its internal algorithms to identify relationships between documents (as well as authors, institutions, and so on) and concepts (such as areas of science) extracted from various types of knowledge bases. The system should be scalable with respect to scientific content storage, the performance of analytic processes, and the speed of search. For compound computational tasks (such as producing richer semantic indexes to improve search), it should follow the paradigms of hierarchical modeling and computing, designed as an interaction between domain experts, system experts, and appropriately implemented intelligent modules.
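To make the idea of relating documents to knowledge-base concepts more concrete, the following is a minimal sketch under strong assumptions, not the system's actual method. The toy `KNOWLEDGE_BASE`, the function names, and the term-overlap criterion are all hypothetical and introduced here only for illustration.

```python
from collections import defaultdict

# Hypothetical toy knowledge base (an assumption, not taken from the chapter):
# each concept maps to a set of indicative terms.
KNOWLEDGE_BASE = {
    "rough sets": {"rough", "approximation", "reduct"},
    "information retrieval": {"search", "index", "query"},
}


def tokenize(text):
    """Lowercase the text and split it into tokens stripped of punctuation."""
    return {tok.strip(".,;:()").lower() for tok in text.split()} - {""}


def link_documents_to_concepts(documents, knowledge_base, min_overlap=1):
    """Map each document id to the concepts sharing >= min_overlap terms with it."""
    links = defaultdict(list)
    for doc_id, text in documents.items():
        tokens = tokenize(text)
        for concept, terms in sorted(knowledge_base.items()):
            if len(tokens & terms) >= min_overlap:
                links[doc_id].append(concept)
    return dict(links)


def related_documents(links):
    """Return sorted pairs of document ids that share at least one concept."""
    by_concept = defaultdict(set)
    for doc_id, concepts in links.items():
        for concept in concepts:
            by_concept[concept].add(doc_id)
    pairs = set()
    for docs in by_concept.values():
        ordered = sorted(docs)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                pairs.add((first, second))
    return sorted(pairs)
```

Calling `link_documents_to_concepts` on a dictionary of document texts and passing the result to `related_documents` yields document pairs connected through a shared concept. A real system of the kind described would presumably replace the term-overlap test with richer semantic indexing.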
Copyright information
© 2012 Springer-Verlag GmbH Berlin Heidelberg
About this chapter
Cite this chapter
Nguyen, H.S., Ślęzak, D., Skowron, A., Bazan, J.G. (2012). Semantic Search and Analytics over Large Repository of Scientific Articles. In: Bembenik, R., Skonieczny, Ł., Rybiński, H., Niezgódka, M. (eds) Intelligent Tools for Building a Scientific Information Platform. Studies in Computational Intelligence, vol 390. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24809-2_1
DOI: https://doi.org/10.1007/978-3-642-24809-2_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24808-5
Online ISBN: 978-3-642-24809-2
eBook Packages: Engineering, Engineering (R0)