Abstract
This paper proposes a Service-oriented Knowledge Discovery (SoKD) framework and a prototype implementation named Orange4WS. To give the framework semantics, we use the Knowledge Discovery Ontology, which defines relationships among the ingredients of knowledge discovery scenarios. The ontology makes it possible to reason about which algorithms can produce the results required by a specified knowledge discovery task, and to query the results of such tasks. It can also be used to annotate manually created workflows automatically, which facilitates their reuse. The proposed framework thus provides an approach to third-generation data mining: the integration of distributed, heterogeneous data and knowledge resources and software into a coherent and effective knowledge discovery process. The capabilities of the prototype implementation are demonstrated on a text mining use case featuring publicly available data repositories, specialized algorithms, and third-party data analysis tools.
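The idea of reasoning about which algorithms can produce a required result can be illustrated with a minimal sketch. It is not the Orange4WS planner or the Knowledge Discovery Ontology itself: the algorithm names, the data type labels, and the simple forward-chaining strategy are all assumptions made for illustration. Each algorithm is annotated with the types it consumes and the type it produces, and a workflow is composed by repeatedly applying any algorithm whose inputs are already available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Algorithm:
    """A toy annotation: which data types an algorithm needs and produces."""
    name: str
    inputs: frozenset
    output: str


# Hypothetical annotations; a real system would read these from an ontology.
ALGORITHMS = [
    Algorithm("FetchDocuments", frozenset({"Query"}), "DocumentCollection"),
    Algorithm("ExtractTerms", frozenset({"DocumentCollection"}), "TermSet"),
    Algorithm("BuildTermNetwork",
              frozenset({"DocumentCollection", "TermSet"}), "Network"),
    Algorithm("TrainClassifier", frozenset({"LabeledDataset"}), "Classifier"),
]


def producers(goal, algorithms=ALGORITHMS):
    """Return the names of algorithms whose declared output is `goal`."""
    return [a.name for a in algorithms if a.output == goal]


def compose_workflow(available, goal, algorithms=ALGORITHMS):
    """Forward-chain from the available types until `goal` is produced.

    Returns an ordered list of algorithm names, or None if the goal cannot
    be reached. The plan may contain steps that are not strictly needed.
    """
    known = set(available)
    plan = []
    progress = True
    while goal not in known and progress:
        progress = False
        for algo in algorithms:
            if goal in known:
                break
            if algo.output not in known and algo.inputs <= known:
                known.add(algo.output)
                plan.append(algo.name)
                progress = True
    return plan if goal in known else None


if __name__ == "__main__":
    print(producers("Network"))
    print(compose_workflow({"Query"}, "Network"))
    print(compose_workflow({"Query"}, "Classifier"))
```

In this sketch a goal that no chain of annotated algorithms can reach yields `None`, which loosely mirrors a planner reporting that no workflow exists for the requested task.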
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Podpečan, V., Žáková, M., Lavrač, N. (2010). Workflow Construction for Service-Oriented Knowledge Discovery. In: Margaria, T., Steffen, B. (eds) Leveraging Applications of Formal Methods, Verification, and Validation. ISoLA 2010. Lecture Notes in Computer Science, vol 6415. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16558-0_27
DOI: https://doi.org/10.1007/978-3-642-16558-0_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16557-3
Online ISBN: 978-3-642-16558-0
eBook Packages: Computer Science, Computer Science (R0)