Abstract
The Web has profoundly reshaped our vision of information management and processing, highlighting the power of a collaborative model of information production and consumption. This new vision also influences the Knowledge Discovery in Databases domain. In this paper we propose a service-oriented, semantics-supported approach to developing a platform for sharing and reusing resources (data processing and mining techniques). The platform enables the management of different implementations of the same technique and takes a community-centered attitude: it offers functionalities for both resource production and consumption, supporting end users with different skills as well as resource providers with different technical and domain-specific capabilities. We first describe the semantic framework underlying the approach, then show how the platform's functionalities exploit this framework to serve its users.
Cite this article
Diamantini, C., Potena, D. & Storti, E. A virtual mart for knowledge discovery in databases. Inf Syst Front 15, 447–463 (2013). https://doi.org/10.1007/s10796-012-9399-0