Multiple labels associative classification

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Building fast and accurate classifiers for large-scale databases is an important task in data mining. There is growing evidence that integrating classification and association rule mining can produce more efficient and accurate classifiers than traditional techniques. In this paper, we investigate the problem of producing rules with multiple labels and propose a multi-class, multi-label associative classification approach (MMAC). In addition, we present four measures for evaluating the accuracy of classification approaches on a wide range of traditional and multi-label classification problems. Results for 19 different data sets from the UCI data collection and nine hyperheuristic scheduling runs show that the proposed approach is an accurate and effective classification technique that is highly competitive and scalable when compared with other traditional and associative classification approaches.

Author information

Corresponding author

Correspondence to Fadi Abdeljaber Thabtah.

Additional information

Fadi Abdeljaber Thabtah received a B.S. degree in Computer Science from Philadelphia University, Jordan, in 1997 and an M.S. degree in Computer Science from California State University, USA, in 2001. From 1996 to 2001, he worked as a professional in database programming and administration at United Insurance Ltd. in Amman. In 2002, he started his academic career and joined Philadelphia University as a lecturer. He is currently a final-year graduate student at the Department of Computer Science, Bradford University, UK. He has published about seven scientific papers in the areas of data mining and machine learning. His research interests include machine learning, data mining, artificial intelligence and object-oriented databases.

Peter Cowling is a Professor of Computing at the University of Bradford. He obtained M.A. and D.Phil. degrees from the University of Oxford. He leads the Modelling Optimisation Scheduling And Intelligent Control (MOSAIC) research centre (http://mosaic.ac), whose main research interests lie in the investigation and development of new modelling, optimisation, control and decision support technologies, which bridge the gap between theory and practice. Applications include production and personnel scheduling, intelligent game agents and data mining. He has published over 40 scientific papers in these areas and is active as a consultant to industry.

Yonghong Peng's research areas include machine learning, data mining and bioinformatics. He has published more than 35 scientific papers in related areas. Dr. Peng is a member of the IEEE and Computer Society, and has been a member of the programme committee of several conferences and workshops. Dr. Peng referees papers for several conferences and for several journals, including the IEEE Trans. on Systems, Man and Cybernetics (part C), IEEE Trans. on Evolutionary Computation, Journal of Fuzzy Sets and Systems, Journal of Bioinformatics, and Journal of Data Mining and Knowledge Discovery.

About this article

Cite this article

Thabtah, F.A., Cowling, P. & Peng, Y. Multiple labels associative classification. Knowl Inf Syst 9, 109–129 (2006). https://doi.org/10.1007/s10115-005-0213-x
