Integrating Clustering Data Mining into the Multidimensional Modeling of Data Warehouses with UML Profiles

  • Conference paper
Data Warehousing and Knowledge Discovery (DaWaK 2007)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 4654))

Abstract

Clustering can be considered the most important unsupervised learning technique for finding similar behaviors (clusters) in large collections of data. Data warehouses (DWs) can help users analyze stored data because they contain data preprocessed for analysis purposes. Furthermore, the multidimensional (MD) model of a DW intuitively represents the underlying system. However, most clustering data-mining techniques are applied at a low level of abstraction to complex unstructured data. While there are several approaches to clustering on DWs, there is still no conceptual model for clustering that facilitates modeling with this technique on the MD model of a DW. Here, we propose (i) a conceptual model for clustering that helps focus the data-mining process at the adequate abstraction level and (ii) an extension of the Unified Modeling Language (UML), by means of the UML profiling mechanism, that allows us to design clustering data-mining models on top of the MD model of a DW. This allows us to avoid duplicating the time-consuming preprocessing stage and simplifies clustering design on top of DWs, improving knowledge discovery.

Editor information

Il Yeal Song, Johann Eder, Tho Manh Nguyen

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zubcoff, J., Pardillo, J., Trujillo, J. (2007). Integrating Clustering Data Mining into the Multidimensional Modeling of Data Warehouses with UML Profiles. In: Song, I.Y., Eder, J., Nguyen, T.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2007. Lecture Notes in Computer Science, vol 4654. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74553-2_18

  • DOI: https://doi.org/10.1007/978-3-540-74553-2_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74552-5

  • Online ISBN: 978-3-540-74553-2

  • eBook Packages: Computer Science, Computer Science (R0)
