A Novel Approach for Identifying Banded Patterns in Zero-One Data Using Column and Row Banding Scores

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8556)

Abstract

Zero-one data is frequently encountered in the field of data mining. A banded pattern in zero-one data is one where the attributes (columns) and records (rows) are organized in such a way that the “ones” are arranged along the leading diagonal. The significance is that rearranging zero-one data so as to feature bandedness enhances the operation of some data mining algorithms that work with zero-one data. The fact that a dataset features banding may also be of interest in its own right with respect to various application domains. This paper presents an effective banding algorithm designed to reveal banding in 2D data by rearranging the ordering of columns and rows. The challenge is the large number of potential row and column permutations. To address this issue, a column and row scoring mechanism is proposed that allows columns and rows to be ordered so as to reveal bandedness without the need to consider large numbers of permutations. This mechanism has been incorporated into the Banded Pattern Mining (BPM) algorithm proposed in this paper. The operation of BPM is fully discussed. A complete evaluation of the BPM algorithm is also presented, clearly indicating the advantages offered by BPM with respect to a number of competitor algorithms in the context of a collection of UCI datasets.
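
The idea of ordering rows and columns by a score, rather than searching over permutations, can be illustrated with a minimal sketch. This is not the BPM banding score, which the abstract does not define. As an assumption for illustration, it swaps in the simpler barycenter (mean-position) heuristic: each row is scored by the mean column index of its ones, rows are sorted by that score, and the same is then done for columns, repeating until the ordering stops changing. The names `mean_position_scores`, `reorder_rows` and `band` are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def mean_position_scores(matrix: Matrix) -> List[float]:
    """Score each row by the mean column index of its ones.

    Illustrative stand-in only: this is the barycenter heuristic,
    not the banding score used by BPM.
    """
    n_cols = len(matrix[0])
    scores = []
    for row in matrix:
        ones = [j for j, value in enumerate(row) if value == 1]
        # Rows without any ones are pushed to the end of the ordering.
        scores.append(sum(ones) / len(ones) if ones else float(n_cols))
    return scores


def transpose(matrix: Matrix) -> Matrix:
    """Swap rows and columns so columns can be scored like rows."""
    return [list(column) for column in zip(*matrix)]


def reorder_rows(matrix: Matrix) -> Tuple[Matrix, List[int]]:
    """Sort rows by score; return the new matrix and the permutation used."""
    scores = mean_position_scores(matrix)
    order = sorted(range(len(matrix)), key=lambda i: scores[i])  # stable sort
    return [matrix[i] for i in order], order


def band(matrix: Matrix, max_iters: int = 20) -> Tuple[Matrix, List[int], List[int]]:
    """Alternate row and column reordering until nothing moves.

    Assumes a non-empty rectangular 0/1 matrix. Returns the rearranged
    matrix plus the original indices of its rows and columns.
    """
    current = [row[:] for row in matrix]  # leave the input untouched
    row_order = list(range(len(matrix)))
    col_order = list(range(len(matrix[0])))
    for _ in range(max_iters):
        current, r = reorder_rows(current)
        row_order = [row_order[i] for i in r]
        reordered_t, c = reorder_rows(transpose(current))
        current = transpose(reordered_t)
        col_order = [col_order[j] for j in c]
        # Stop once both passes leave the ordering unchanged.
        if r == list(range(len(r))) and c == list(range(len(c))):
            break
    return current, row_order, col_order
```

Each pass costs one score computation and one sort per dimension, so the number of orderings examined stays small compared with the factorial number of possible row and column permutations. Like any such heuristic, this sketch may settle on an ordering that is not the best achievable banding.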

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Abdullahi, F.B., Coenen, F., Martin, R. (2014). A Novel Approach for Identifying Banded Patterns in Zero-One Data Using Column and Row Banding Scores. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2014. Lecture Notes in Computer Science, vol 8556. Springer, Cham. https://doi.org/10.1007/978-3-319-08979-9_5

  • DOI: https://doi.org/10.1007/978-3-319-08979-9_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08978-2

  • Online ISBN: 978-3-319-08979-9

  • eBook Packages: Computer Science, Computer Science (R0)
