Clustering Complex Data Represented as Propositional Formulas

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10235)

Abstract

Clustering has been studied extensively for many kinds of data. Usually, each object in a dataset is represented as an n-dimensional vector of attributes that take numerical or nominal (categorical) values. Symbolic data extends this setting to more complex objects, whose attributes may be intervals, multi-categorical values, or modal values. However, new applications may give rise to even more complex data, describing for example customer desires, constraints, and preferences. Such data can be expressed more compactly using logic-based representations. In this paper, we introduce a new clustering framework in which complex objects are described by propositional formulas. First, we extend two well-known techniques, k-means and hierarchical agglomerative clustering. Second, we introduce a new divisive algorithm for clustering objects represented explicitly by their sets of models. Finally, we propose a propositional satisfiability (SAT) based encoding of the problem of clustering propositional formulas that does not require an explicit representation of their models. Preliminary experimental results validating the proposed framework are provided.
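To make the idea of comparing objects through their sets of models concrete, here is a minimal sketch. It rests on assumptions not stated in the abstract: models are enumerated by brute force over a tiny variable set, and dissimilarity is measured with the Jaccard distance between model sets. The helper names `models` and `jaccard_distance`, and the three example formulas, are hypothetical illustrations and not the paper's algorithms.

```python
from itertools import product

def models(formula, variables):
    """Return all satisfying assignments of `formula` as a set of tuples.

    Brute-force enumeration over 2^n assignments; only viable for a
    handful of variables. `formula` is a callable on a dict of booleans.
    """
    return {
        bits
        for bits in product((False, True), repeat=len(variables))
        if formula(dict(zip(variables, bits)))
    }

def jaccard_distance(a, b):
    """Jaccard distance between two model sets (assumed measure)."""
    union = a | b
    if not union:
        # Two unsatisfiable formulas have identical (empty) model sets.
        return 0.0
    return 1.0 - len(a & b) / len(union)

# Hypothetical objects: three formulas over the same variables.
VARIABLES = ("a", "b", "c")
f1 = lambda v: v["a"] or v["b"]
f2 = lambda v: v["a"] and not v["c"]
f3 = lambda v: v["a"] or v["b"] or v["c"]

m1, m2, m3 = (models(f, VARIABLES) for f in (f1, f2, f3))

# Pairwise distances that a distance-based clustering method could consume.
d12 = jaccard_distance(m1, m2)
d13 = jaccard_distance(m1, m3)
d23 = jaccard_distance(m2, m3)
```

Explicit enumeration grows exponentially with the number of variables, which is the kind of cost that an encoding avoiding explicit model representation, as mentioned in the abstract, is meant to sidestep.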


Notes

  1. https://dtai.cs.kuleuven.be/CP4IM/


Corresponding author

Correspondence to Abdelhamid Boudane.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Boudane, A., Jabbour, S., Sais, L., Salhi, Y. (2017). Clustering Complex Data Represented as Propositional Formulas. In: Kim, J., Shim, K., Cao, L., Lee, JG., Lin, X., Moon, YS. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science, vol 10235. Springer, Cham. https://doi.org/10.1007/978-3-319-57529-2_35

  • DOI: https://doi.org/10.1007/978-3-319-57529-2_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57528-5

  • Online ISBN: 978-3-319-57529-2

  • eBook Packages: Computer Science, Computer Science (R0)
