An Effective Algorithm for Extracting Maximal Bipartite Cliques

Authors: Raghda Fawzey Hriez, Ghazi Al-Naymat, Arafat Awajan

DATA'21: International Conference on Data Science, E-learning and Information Systems 2021

Pages 76 - 81

https://doi.org/10.1145/3460620.3460735

Published: 04 June 2021

Editorial Notes

NOTICE OF CONCERN: ACM has received evidence that casts doubt on the integrity of the peer review process for the DATA 2021 Conference. As a result, ACM is issuing a Notice of Concern for all papers published and strongly suggests that the papers from this Conference not be cited in the literature until ACM's investigation has concluded and final decisions have been made regarding the integrity of the peer review process for this Conference.

Abstract

Reducing the bipartite clique enumeration problem to a clique enumeration problem is a well-known approach for extracting maximal bipartite cliques. In this approach, graph inflation transforms a bipartite graph into a general graph, after which any maximal clique enumeration algorithm can be applied. However, the traditional inflation algorithm adds a new edge between every two vertices in the same set. This incurs high computational overhead, which makes the approach impractical and prevents it from scaling to large graphs. This paper proposes a new algorithm for extracting maximal bipartite cliques based on an efficient graph inflation algorithm that adds only the minimal number of edges required to convert all maximal bipartite cliques into maximal cliques. The proposed algorithm is evaluated on several real-world benchmark graphs in terms of correctness, running time (in both the inflation and enumeration steps), and the overhead that inflation imposes on the size of the generated general graph. The empirical evaluation shows that the proposed algorithm is accurate, efficient, and effective, and that it is more applicable to real-world graphs than the traditional algorithm.
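
The traditional reduction described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's proposed algorithm, whose details are not given here. It only shows the baseline: inflate the bipartite graph by connecting every pair of vertices within the same set, enumerate maximal cliques with a basic Bron-Kerbosch recursion (reference [7], without pivoting), and split each clique back into its two sides. The function names `inflate`, `bron_kerbosch`, and `maximal_bicliques` are hypothetical, and the sketch assumes the two vertex sets use distinct labels.

```python
from itertools import combinations


def inflate(left, right, edges):
    """Traditional inflation (assumed form): keep the bipartite edges and
    add an edge between every two vertices that lie in the same set."""
    adj = {v: set() for v in list(left) + list(right)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for side in (left, right):
        for u, v in combinations(side, 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def bron_kerbosch(adj, r, p, x, out):
    """Basic Bron-Kerbosch recursion without pivoting.
    r: current clique, p: candidate vertices, x: already-processed vertices."""
    if not p and not x:
        out.append(frozenset(r))
        return
    for v in list(p):
        bron_kerbosch(adj, r | {v}, p & adj[v], x & adj[v], out)
        p = p - {v}
        x = x | {v}


def maximal_bicliques(left, right, edges):
    """Enumerate maximal bicliques via inflation plus clique enumeration."""
    adj = inflate(left, right, edges)
    cliques = []
    bron_kerbosch(adj, set(), set(adj), set(), cliques)
    left_set = set(left)
    result = set()
    for clique in cliques:
        a = frozenset(clique & left_set)
        b = frozenset(clique - left_set)
        if a and b:  # discard cliques that lie entirely within one side
            result.add((a, b))
    return result


# Small illustrative graph (made-up data, not one of the paper's benchmarks).
left = ["a", "b", "c"]
right = [1, 2, 3]
edges = [("a", 1), ("a", 2), ("b", 1), ("b", 2), ("c", 2), ("c", 3)]
bicliques = maximal_bicliques(left, right, edges)
```

With n vertices on one side, this inflation adds n(n-1)/2 edges to that side regardless of the bipartite structure. That quadratic growth is the overhead the abstract attributes to the traditional algorithm.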

Cited By

Lao H, Chen H, Li F, Lyu S (2023). New Constant Dimension Subspace Codes From the Mixed Dimension Construction. IEEE Transactions on Information Theory 69:7 (4333-4344). Online publication date: Jul-2023
https://doi.org/10.1109/TIT.2023.3255929

Index Terms

An Effective Algorithm for Extracting Maximal Bipartite Cliques
1. Mathematics of computing
  1. Discrete mathematics
    1. Graph theory
      1. Graph algorithms
2. Theory of computation

Index terms have been assigned to the content through auto-classification.

Published In

DATA'21: International Conference on Data Science, E-learning and Information Systems 2021

April 2021

277 pages

ISBN:9781450388382

DOI:10.1145/3460620

Editors:
Juan Alfonso Lara Torralbo (UDIMA, Madrid, Spain)
Shadi A. Aljawarneh (JUST, Jordan)
Vangipuram Radhakrishna (VNRVJIET, India)
Arun N. (JAIN, India)

Copyright © 2021 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

DATA'21

DATA'21: International Conference on Data Science, E-learning and Information Systems 2021

April 5 - 7, 2021

Ma'an, Jordan

Acceptance Rates

Overall Acceptance Rate 74 of 167 submissions, 44%

Bibliometrics

Article Metrics

Total Citations: 1
Total Downloads: 81
Downloads (last 12 months): 19
Downloads (last 6 weeks): 0

Reflects downloads up to 02 Mar 2025
