Banded Pattern Mining Algorithms in Multi-dimensional Zero-One Data

Transactions on Large-Scale Data- and Knowledge-Centered Systems XXVI

Part of the book series: Lecture Notes in Computer Science (TLDKS, volume 9670)

Abstract

A zero-one high-dimensional data set is said to be banded if all the dimensions can be reorganised such that the “non zero” entries are arranged along the leading diagonal across the dimensions. Our goal is to develop effective algorithms that identify banded patterns in multi-dimensional zero-one data by automatically rearranging the ordering of all the dimensions. Rearranging zero-one data so as to feature “bandedness” allows for the identification of hidden information and enhances the operation of many data mining algorithms (and other algorithms) that work with zero-one data. In this paper two N-Dimensional Banded Pattern Mining (NDBPM) algorithms are presented. The first is an approximate algorithm (NDBPM\(_{APPROX}\)) and the second an exact algorithm (NDBPM\(_{EXACT}\)). Two variations of NDBPM\(_{EXACT}\) are presented (Euclidean and Manhattan). Both algorithms are fully described together with evaluations of their operation.
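To make the notion of "bandedness" concrete, the following is a minimal two-dimensional sketch. It is not the NDBPM\(_{APPROX}\) or NDBPM\(_{EXACT}\) algorithm from the chapter; it swaps in a simple barycenter-style reordering heuristic, and the function names (`diagonal_distance`, `barycenter_order`, `band`) and the scoring formula are assumptions chosen for illustration only.

```python
def diagonal_distance(matrix):
    """Mean distance of the 1-entries from the leading diagonal.

    Illustrative score only (an assumption, not the chapter's measure):
    lower values mean the data is closer to being banded.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    total, count = 0.0, 0
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value:
                # Normalise both positions to [0, 1] so non-square data works.
                total += abs(i / max(n_rows - 1, 1) - j / max(n_cols - 1, 1))
                count += 1
    return total / count if count else 0.0


def barycenter_order(matrix):
    """Row indices sorted by the mean column position of each row's 1-entries."""
    def centre(row):
        ones = [j for j, value in enumerate(row) if value]
        # Rows with no 1-entries are pushed to the end.
        return sum(ones) / len(ones) if ones else float("inf")
    return sorted(range(len(matrix)), key=lambda i: centre(matrix[i]))


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def band(matrix, sweeps=10):
    """Alternately reorder rows and columns; stop when a sweep changes nothing."""
    current = [list(row) for row in matrix]  # work on a copy
    for _ in range(sweeps):
        rows = barycenter_order(current)
        reordered = [current[i] for i in rows]
        cols = barycenter_order(transpose(reordered))
        reordered = [[row[j] for j in cols] for row in reordered]
        if reordered == current:
            break
        current = reordered
    return current


# A diagonal pattern hidden by a shuffled row order.
data = [
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 0],
]
banded = band(data)
print(diagonal_distance(data), diagonal_distance(banded))
```

On this small example the reordering recovers the diagonal arrangement, so the score drops after banding. A heuristic of this kind gives no guarantee of an optimal ordering in general, and extending it beyond two dimensions is exactly the problem the chapter's algorithms address.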

Notes

  1. http://www.csc.liv.ac.uk/~/frans/KDD/Software/LUCS_KDD_DN_ARM.

  2. The Total From Partial (TFP) algorithm [8] was used for this purpose, but any alternative frequent itemset mining (FIM) algorithm could equally well have been used.

  3. http://www.csc.liv.ac.uk/~/frans/KDD/Software/LUCS_KDD_DN_ARM.

References

  1. Abdullahi, F.B., Coenen, F., Martin, R.: A scalable algorithm for banded pattern mining in multi-dimensional zero-one data. In: Bellatreche, L., Mohania, M.K. (eds.) DaWaK 2014. LNCS, vol. 8646, pp. 345–356. Springer, Heidelberg (2014)

  2. Agrawal, R., Imielinski, T., Swami, A.: Mining association rules between sets of items in large databases. In: SIGMOD 1993, pp. 207–216 (1993)

  3. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: Proceedings 20th International Conference on Very Large Data Bases (VLDB 1994), pp. 487–499 (1994)

  4. Alizadeh, F., Karp, R.M., Newberg, L.A., Weisser, D.K.: Physical mapping of chromosomes: a combinatorial problem in molecular biology. Algorithmica 13, 52–76 (1995)

  5. Baeza-Yates, R., Ribeiro-Neto, B.: Modern Information Retrieval. Addison-Wesley, Wokingham (1999)

  6. Blake, C.I., Merz, C.J.: UCI repository of machine learning databases (1998). http://www.ics.uci.edu/mlearn/MLRepository.htm

  7. Cheng, K.Y.: Minimising the bandwidth of sparse symmetric matrices. Computing 11, 103–110 (1973)

  8. Coenen, F., Goulbourne, G., Leng, P.: Computing association rules using partial totals. In: Siebes, A., De Raedt, L. (eds.) PKDD 2001. LNCS (LNAI), vol. 2168, pp. 54–66. Springer, Heidelberg (2001)

  9. Cuthill, A.E., McKee, J.: Reducing bandwidth of sparse symmetric matrices. In: Proceedings of the 1969 29th ACM National Conference, pp. 157–172 (1969)

  10. Fortelius, M., Kai Puolamaki, M.F., Mannila, H.: Seriation in paleontological data using Markov Chain Monte Carlo method. PLoS Comput. Biol. 2, e6 (2006)

  11. Gemma, G.C., Junttila, E., Mannila, H.: Banded structures in binary matrices. Knowl. Discov. Inf. Syst. 28, 197–226 (2011)

  12. Green, D.M., Kao, R.R.: Data quality of the Cattle Tracing System in Great Britain. Vet. Rec. 161, 439–443 (2007)

  13. Junttila, E.: Pattern in Permuted Binary Matrices. PhD thesis (2011)

  14. Von Luxburg, U.A.: A tutorial on spectral clustering. Stat. Comput. 17, 395–416 (2007)

  15. Makinen, E., Siirtola, H.: The barycenter heuristic and the reorderable matrix. Informatica 29, 357–363 (2005)

  16. Mannila, H., Terzi, E.: Nestedness and segmented nestedness. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2007, New York, NY, USA, pp. 480–489 (2007)

  17. Mueller, C.: Sparse matrix reordering algorithms for cluster identification. Mach. Learn. Bioinform. 1532 (2004)

  18. Papadimitriou, C.H.: The NP-completeness of the bandwidth minimisation problem. Computing 16, 263–270 (1976)

  19. Nohuddin, P.N.E., Christley, R., Coenen, F., Setzkorn, C.: Trend mining in social networks: a study using a large cattle movement database. In: Perner, P. (ed.) ICDM 2010. LNCS, vol. 6171, pp. 464–475. Springer, Heidelberg (2010)

  20. Robinson, S., Christley, R.M.: Identifying temporal variation in reported birth, death and movements of cattle in Britain. BMC Vet. Res. 2, 11 (2006)

  21. Rosen, R.: Matrix bandwidth minimisation. In: ACM National Conference Proceedings, pp. 585–595 (1968)

Author information

Corresponding author

Correspondence to Fatimah B. Abdullahi.

Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Abdullahi, F.B., Coenen, F., Martin, R. (2016). Banded Pattern Mining Algorithms in Multi-dimensional Zero-One Data. In: Hameurlain, A., Küng, J., Wagner, R., Bellatreche, L., Mohania, M. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems XXVI. Lecture Notes in Computer Science, vol 9670. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-49784-5_1

  • DOI: https://doi.org/10.1007/978-3-662-49784-5_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-49783-8

  • Online ISBN: 978-3-662-49784-5

  • eBook Packages: Computer Science, Computer Science (R0)
