A new semi-supervised hierarchical active clustering based on ranking constraints for analysts groupization

Ben Ahmed, Eya; Nabli, Ahlem; Gargouri, Faiez

doi:10.1007/s10489-012-0407-3


Published: 23 January 2013

Applied Intelligence, volume 39, pages 236–250 (2013)



Abstract

Groupization aims to enrich an individual's preferences using data from similar individuals, so that query results can be adapted more effectively to user expectations. In this paper, we aim to optimally identify analysts' groups in a data warehouse. To that end, we study the similarity between the queries selected from the analytical history. To enhance the quality of the derived groups of analysts, we introduce a new method of semi-supervised hierarchical clustering under ranked constraints, which handles cases where some constraints are more important than others and must be enforced first during the groupization process. Four axes for group identification are distinguished: (i) the function exerted, (ii) the responsibilities granted to accomplish goals, (iii) the source of group identification, and (iv) the dynamicity of the discovered groups. Experiments carried out on real log files used for decision-maker groupization in a data warehouse confirm the soundness of our approach. Our findings demonstrate that groupization improves upon personalization for several group types, mainly for function-based groupization and explicitly identified groups.
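
To make the idea of ranked constraints concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It assumes that each analyst is summarized by a set of query features, that similarity is measured with the Jaccard coefficient, that constraints are must-link or cannot-link pairs carrying a numeric rank (lower rank is enforced first), and that clusters are merged by average linkage down to a similarity threshold. The function names, the analyst names, the feature sets, and the `min_sim` parameter are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def ranked_constrained_clustering(profiles, constraints, min_sim=0.3):
    """Agglomerative clustering under ranked pairwise constraints (sketch).

    profiles:    {analyst: set of query features}            (assumed input shape)
    constraints: list of (rank, kind, a, b), kind is "must" or "cannot";
                 a lower rank means a more important constraint.
    """
    clusters = [{name} for name in profiles]  # start from singletons
    cannot = set()                            # accepted cannot-link pairs

    def find(name):
        return next(c for c in clusters if name in c)

    def violates(c1, c2):
        return any(frozenset((x, y)) in cannot for x in c1 for y in c2)

    def merge(c1, c2):
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 | c2)

    # Phase 1: enforce constraints in rank order. A constraint that
    # contradicts an already enforced, higher-ranked one is dropped.
    for _rank, kind, a, b in sorted(constraints, key=lambda c: c[0]):
        ca, cb = find(a), find(b)
        if kind == "must":
            if ca is not cb and not violates(ca, cb):
                merge(ca, cb)
        elif ca is not cb:
            cannot.add(frozenset((a, b)))

    def avg_sim(c1, c2):
        total = sum(jaccard(profiles[x], profiles[y]) for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    # Phase 2: average-linkage merging that never breaks a cannot-link.
    while True:
        candidates = [
            (avg_sim(c1, c2), c1, c2)
            for c1, c2 in combinations(clusters, 2)
            if not violates(c1, c2)
        ]
        if not candidates:
            break
        best_sim, c1, c2 = max(candidates, key=lambda t: t[0])
        if best_sim < min_sim:
            break
        merge(c1, c2)
    return clusters


# Hypothetical analysts and query features, for illustration only.
profiles = {
    "ana": {"sales", "region", "year"},
    "bob": {"sales", "region", "month"},
    "cy": {"stock", "price", "day"},
    "dee": {"stock", "price", "month"},
}
constraints = [
    (1, "cannot", "ana", "bob"),  # most important, enforced first
    (2, "must", "ana", "bob"),    # contradicts rank 1, so it is dropped
    (3, "must", "bob", "cy"),
]
groups = ranked_constrained_clustering(profiles, constraints)
print(sorted(sorted(g) for g in groups))  # [['ana'], ['bob', 'cy', 'dee']]
```

In this toy run the rank-1 cannot-link overrides the lower-ranked must-link on the same pair, which is the behavior the abstract describes for constraints that "must be enforced first". Resolving conflicts by dropping the lower-ranked constraint is one possible policy chosen for the sketch; the paper may resolve them differently.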

Notes

The data warehouse is built using the information available at http://www.bvmt.com.tn/publications/?view=cours.
http://www.cs.waikato.ac.nz/~ml/weka/.


Author information

Authors and Affiliations

High Institute of Management of Tunis, University of Tunis, Tunis, Tunisia
Eya Ben Ahmed
Faculty of Sciences of Sfax, University of Sfax, Sfax, Tunisia
Ahlem Nabli
High Institute of Computer and Multimedia of Sfax, University of Sfax, Sfax, Tunisia
Faiez Gargouri


Corresponding author

Correspondence to Eya Ben Ahmed.


About this article

Cite this article

Ben Ahmed, E., Nabli, A. & Gargouri, F. A new semi-supervised hierarchical active clustering based on ranking constraints for analysts groupization. Appl Intell 39, 236–250 (2013). https://doi.org/10.1007/s10489-012-0407-3

Issue Date: September 2013
