Abstract
Scientific data constitutes a great asset. However, its volume is far greater than any human can comprehend. Therefore, automatic solutions for analysis, search and indexing are called for. In this paper we present the architecture and the data model of such a system. SONCA (Search based on ONtologies and Compound Analytics) is a platform for implementing and exploiting intelligent algorithms that identify relations between various types of objects (publications, inventions, scientists and institutions). The results of these algorithms can be used to build semantic search engines, and can also be fed into further analytical algorithms in order to find even more associations. We also present an experimental evaluation of the performance of SONCA. Its results are promising and we argue that SONCA's architecture is robust.
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Grzegorowski, M., Pardel, P.W., Stawicki, S., Stencel, K. (2013). SONCA: Scalable Semantic Processing of Rapidly Growing Document Stores. In: Pechenizkiy, M., Wojciechowski, M. (eds) New Trends in Databases and Information Systems. Advances in Intelligent Systems and Computing, vol 185. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32518-2_9
DOI: https://doi.org/10.1007/978-3-642-32518-2_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32517-5
Online ISBN: 978-3-642-32518-2
eBook Packages: Engineering, Engineering (R0)