Abstract
Business, scientific and engineering experiments, medical studies, and governments generate huge amounts of information. The problem is how to extract knowledge from all this information. Data mining provides the means for at least a partial solution to this problem. However, it would be too expensive for all these areas of human activity and companies to develop their own data mining solutions, develop software, and deploy it on their private infrastructure. This chapter presents the CloudMiner, which offers a cloud of data mining services (Software as a Service) running on a cloud service provider's infrastructure. The architecture of the CloudMiner is shown and its main components are discussed: the MiningCloud, which contains all published data mining services; the BrokerCloud, to which mining service providers publish services; the DataCloud, which contains the collected data; and the Access Point, which allows users to access the Service Broker to discover mining services and supports the selection and invocation of mining services. The chapter finishes with a short presentation of two use cases.
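To make the division of roles among the four components more concrete, the following is a minimal in-memory sketch in Python. It is an illustration only, not the CloudMiner implementation: every class, method, and service name (`publish`, `discover`, `run`, `mean-v1`, `build_demo`) is hypothetical, the "mining service" is a toy averaging function, and the selection policy (first match by name) is an assumption made for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# All names below are hypothetical; the chapter's real interfaces may differ.
Algorithm = Callable[[List[float]], float]


@dataclass
class MiningCloud:
    """Hosts the published mining services (here: plain Python callables)."""
    services: Dict[str, Algorithm] = field(default_factory=dict)


@dataclass
class DataCloud:
    """Holds the collected data sets, keyed by name."""
    datasets: Dict[str, List[float]] = field(default_factory=dict)


@dataclass
class BrokerCloud:
    """Registry that providers publish service descriptions to."""
    # Maps a service name to the mining task it claims to solve.
    descriptions: Dict[str, str] = field(default_factory=dict)

    def publish(self, name: str, task: str) -> None:
        self.descriptions[name] = task

    def discover(self, task: str) -> List[str]:
        return sorted(n for n, t in self.descriptions.items() if t == task)


@dataclass
class AccessPoint:
    """User entry point: discover, select, and invoke a mining service."""
    broker: BrokerCloud
    mining: MiningCloud
    data: DataCloud

    def run(self, task: str, dataset: str) -> float:
        candidates = self.broker.discover(task)
        if not candidates:
            raise LookupError(f"no service published for task {task!r}")
        selected = candidates[0]  # assumed selection policy: first match
        return self.mining.services[selected](self.data.datasets[dataset])


def build_demo() -> AccessPoint:
    """Wire up a toy deployment with one service and one data set."""
    mining = MiningCloud({"mean-v1": lambda xs: sum(xs) / len(xs)})
    broker = BrokerCloud()
    broker.publish("mean-v1", task="summary")
    data = DataCloud({"readings": [1.0, 2.0, 3.0]})
    return AccessPoint(broker, mining, data)


if __name__ == "__main__":
    print(build_demo().run("summary", "readings"))
```

In this sketch the user only ever talks to the Access Point; the broker answers the discovery query, and the selected service is then invoked on data held in the DataCloud. A real deployment would presumably replace the dictionaries with remote services and a richer, attribute-based selection step.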
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Goscinski, A., Janciak, I., Han, Y., Brezany, P. (2011). The CloudMiner. In: Fiore, S., Aloisio, G. (eds) Grid and Cloud Database Management. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20045-8_10
DOI: https://doi.org/10.1007/978-3-642-20045-8_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20044-1
Online ISBN: 978-3-642-20045-8
eBook Packages: Computer Science (R0)