On the Abstraction of a Categorical Clustering Algorithm

Sheikhalishahi, Mina; Mejri, Mohamed; Tawbi, Nadia

doi:10.1007/978-3-319-41920-6_51

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9729)

Included in the following conference series:

International Conference on Machine Learning and Data Mining in Pattern Recognition

Abstract

Although clustering is one of the most common approaches in unsupervised data analysis, very little literature exists on the formalization of clustering algorithms. This paper proposes a semiring-based methodology, named Feature-Cluster Algebra, which is applied to abstract the representation of a labeled tree structure representing a hierarchical categorical clustering algorithm, named CCTree. The elements of the feature-cluster algebra are called terms. We prove that a specific kind of term, under some conditions, fully abstracts a labeled tree structure. The abstraction methodology maps the original problem to a new representation by removing unwanted details, which makes it simpler to handle. Moreover, we present a set of relations and functions on the algebraic structure that specify the requirements a term must satisfy to represent a CCTree structure. The proposed formal approach can be generalized to other categorical clustering (classification) algorithms in which features play key roles in specifying the clusters (classes).
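
The abstract does not reproduce the paper's definitions, so the following is only an illustrative sketch under our own assumptions, not the authors' construction. It assumes a labeled tree whose internal nodes split on one categorical attribute and whose leaves are clusters, and flattens it into a sum-of-products "term": each product is the set of (attribute, value) features on one root-to-leaf path, and the sum is the set of all such products. The names `Node`, `tree_to_term`, and `assign` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Set, Tuple

Feature = Tuple[str, str]      # (attribute, value), e.g. ("color", "red")
Monomial = FrozenSet[Feature]  # product of features: one root-to-leaf path
Term = Set[Monomial]           # sum of products: the whole tree


@dataclass
class Node:
    """Hypothetical labeled tree node; a node without children is a leaf (cluster)."""
    attribute: str = ""
    children: Dict[str, "Node"] = field(default_factory=dict)


def tree_to_term(node: Node, path: Monomial = frozenset()) -> Term:
    """Flatten the tree into a set of monomials, one per leaf."""
    if not node.children:
        return {path}
    term: Term = set()
    for value, child in node.children.items():
        # Extend the current path with the feature labeling this edge.
        term |= tree_to_term(child, path | {(node.attribute, value)})
    return term


def assign(term: Term, record: Dict[str, str]) -> Monomial:
    """Return the single monomial whose features all match the record."""
    matches = [m for m in term if all(record.get(a) == v for a, v in m)]
    if len(matches) != 1:
        raise ValueError("term does not assign this record to exactly one cluster")
    return matches[0]


# Assumed toy tree: split on color, then split the red branch on shape.
tree = Node("color", {
    "red": Node("shape", {"circle": Node(), "square": Node()}),
    "blue": Node(),
})
term = tree_to_term(tree)
cluster = assign(term, {"color": "red", "shape": "square"})
```

In this toy setting the term carries the same cluster assignments as the tree while discarding the tree's shape, which is the general flavor of abstraction the abstract describes; the conditions under which a term fully abstracts a CCTree are those proved in the paper, not anything checked by this sketch.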

This research has been supported by Natural Sciences and Engineering Research Council of Canada (NSERC).

Author information

Authors and Affiliations

Department of Computer Science, Université Laval, Québec City, Canada
Mina Sheikhalishahi, Mohamed Mejri & Nadia Tawbi

Corresponding author

Correspondence to Mina Sheikhalishahi.

Editor information

Editors and Affiliations

IBaI, Inst of Comp Vision and applied Comp Sci, Leipzig, Sachsen, Germany
Petra Perner

About this paper

Cite this paper

Sheikhalishahi, M., Mejri, M., Tawbi, N. (2016). On the Abstraction of a Categorical Clustering Algorithm. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2016. Lecture Notes in Computer Science, vol 9729. Springer, Cham. https://doi.org/10.1007/978-3-319-41920-6_51

DOI: https://doi.org/10.1007/978-3-319-41920-6_51
Published: 28 June 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41919-0
Online ISBN: 978-3-319-41920-6
eBook Packages: Computer Science, Computer Science (R0)
