Combining rough decisions for intelligent text mining using Dempster’s rule

Abstract

An important issue in text mining is how to make use of multiple pieces of discovered knowledge to improve future decisions. In this paper, we propose a new approach to combining multiple sets of rules for text categorization using Dempster’s rule of combination. We develop a boosting-like technique for generating multiple sets of rules based on rough set theory, and we model the classification decisions from those rule sets as pieces of evidence that can be combined by Dempster’s rule of combination. We apply these methods, individually and in combination, to 10 of the 20-newsgroups, a benchmark data collection (Baker and McCallum 1998). Our experimental results show that the performance of the best combination of the multiple sets of rules on the 10 groups of the benchmark data is significantly better than that of the best single set of rules. A comparative analysis between the Dempster–Shafer and majority voting (MV) methods, along with an overfitting study, confirms the advantage and robustness of our approach.
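
As a rough illustration of the combination step described above (not taken from the paper), the following Python sketch applies Dempster’s rule of combination to two basic probability assignments, such as the classification decisions produced by two rule sets for a single document. Each non-empty focal element A receives the sum of m1(B)·m2(C) over all pairs with B ∩ C = A, normalised by 1 − K, where K is the mass falling on conflicting (empty-intersection) pairs. The category names and mass values are invented for illustration only.

    from itertools import product

    def dempster_combine(m1, m2):
        # m1, m2: mass functions as dicts mapping frozenset(categories) -> mass,
        # defined over the same frame of discernment, each summing to 1.
        combined = {}
        conflict = 0.0  # K: total mass assigned to conflicting (empty) intersections
        for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
            intersection = b & c
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_b * mass_c
            else:
                conflict += mass_b * mass_c
        if conflict >= 1.0:
            raise ValueError("total conflict: the evidence cannot be combined")
        # Normalise by 1 - K so the combined masses sum to 1 again
        return {a: v / (1.0 - conflict) for a, v in combined.items()}

    # Hypothetical evidence from two rule sets about a single document,
    # over a two-category frame (labels chosen only for illustration).
    theta = frozenset({"sci.space", "rec.autos"})
    m1 = {frozenset({"sci.space"}): 0.7, theta: 0.3}
    m2 = {frozenset({"sci.space"}): 0.6, frozenset({"rec.autos"}): 0.2, theta: 0.2}
    print(dempster_combine(m1, m2))  # most of the combined mass falls on {"sci.space"}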

References

  • Aphinyanaphongs Y, Aliferis CF (2003) Text categorization models for retrieval of high quality articles in internal medicine. In: Proceedings of the American Medical Informatics Association (AMIA) annual symposium, Washington, DC, USA, pp 31–35

  • Apte C, Damerau F and Weiss S (1994). Automated learning of decision rules for text categorization. ACM Trans Inf Syst 12(3): 233–251

  • Baker D, McCallum A (1998) Distributional clustering of words for text classification. In: Proceedings of 21st ACM international conference on research and development in information retrieval, pp 96–103

  • Bi Y (2004) Combining multiple classifiers for text categorization using Dempster’s rule of combination. PhD dissertation, University of Ulster

  • Bi Y, Anderson T, McClean S (2004a) Combining rules for text categorization using Dempster’s rule of combination. In: Proceedings of the 5th international conference on intelligent data engineering and automated learning. LNCS 3177, Springer-Verlag, pp 457–463

  • Bi Y, Bell D, Guan JW (2004b) Combining evidence from classifiers in text categorization. In: Proceedings of the 8th international conference on knowledge-based intelligent information & engineering systems. LNCS 3215, Springer, pp 521–528

  • Chouchoulas A and Shen Q (2001). Rough set-aided keyword reduction for text categorization. Appl Artif Intell 15(9): 843–873

  • Cohen WW, Singer Y (1999) A simple, fast, and effective rule learner. In: Proceedings of the annual conference of the American Association for Artificial Intelligence, pp 335–342

  • Denoeux T (2000). A neural network classifier based on Dempster–Shafer theory. IEEE Trans Syst Man Cybern A 30(2): 131–150

  • Freund Y, Schapire R (1996) Experiments with a new boosting algorithm. In: Machine learning: proceedings of the thirteenth international conference, pp 148–156

  • Freund Y and Schapire RE (1997). A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 55(1): 119–139

  • Friedman J, Hastie T, Tibshirani R (1998) Additive logistic regression: A statistical view of boosting (Technical Report). Stanford University Statistics Department. http://www.stat-stanford.edu/~tibs

  • Grzymala-Busse J (1992) LERS—a system for learning from examples based on rough sets. In: Slowinski R (ed) Intelligent decision support. Kluwer Academic, pp 3–17

  • Guan JW and Bell D (1998). Rough computational methods for information systems. Artif Intell 105: 77–103

  • Kittler J, Hatef M, Duin RPW and Matas J (1998). On combining classifiers. IEEE Trans Pattern Anal Mach Intell 20(3): 226–239

  • Kuncheva L (2001) Combining classifiers: soft computing solutions. In: Pal SK, Pal A (eds) Pattern recognition: from classical to modern approaches. World Scientific, pp 427–451

  • Lam L (2000) Classifier combinations: implementation and theoretical issues. In: Kittler J, Roli F (eds) Multiple classifier systems. LNCS 1857, Springer, pp 78–86

  • Mitchell T (1999). Machine learning and data mining. Commun ACM 42(11): 31–36

  • Nardiello P, Sebastiani F, Sperduti A (2003) Discretizing continuous attributes in AdaBoost for text categorization. In: Proceedings of 25th European conference on information retrieval. LNCS 2633, Springer-Verlag, Berlin, pp 320–334

  • Opitz D and Maclin R (1999). Popular ensemble methods: an empirical study. J Artif Intell Res 11: 169–198

  • Pawlak Z (1991) Rough sets: theoretical aspects of reasoning about data. Kluwer Academic

  • Quinlan JR (1996) Bagging, boosting, and C4.5. In: Proceedings of the thirteenth national conference on artificial intelligence, pp 725–730

  • Schapire RE and Singer Y (2000). BoosTexter: a boosting-based system for text categorization. Mach Learn 39(2/3): 135–168

  • Shafer G (1976). A mathematical theory of evidence. Princeton University Press, Princeton

  • Skowron A and Grzymala-Busse J (1994). From rough set theory to evidence theory. In: Yager R, Fedrizzi M and Kacprzyk J (eds) Advances in the Dempster–Shafer theory of evidence. Wiley, New York, pp 193–236

  • Tumer K and Ghosh J (2002). Robust combining of disparate classifiers through order statistics. Pattern Anal Appl 6(1): 41–46

  • Xu L, Krzyzak A and Suen CY (1992). Several methods for combining multiple classifiers and their applications in handwritten character recognition. IEEE Trans Syst Man Cybern 22(3): 418–435

  • Yao YY and Lingras PJ (1998). Interpretations of belief functions in the theory of rough sets. Inf Sci 104(1–2): 81–106

  • Yang Y (1999). An evaluation of statistical approaches to text categorization. J Inf Retr 1(1/2): 67–88

  • van Rijsbergen CJ (1979) Information retrieval, 2nd edn. Butterworths

  • Weiss S, Kulikowski C (1991) Computer systems that learn: classification and prediction methods from statistics, neural nets, machine learning, and expert systems. Morgan Kaufmann

  • Weiss SM, Indurkhya N (2000) Lightweight rule induction. In: Proceedings of the seventeenth international conference on machine learning, pp 1135–1142

  • Whitaker CJ, Kuncheva L (2003) Examining the relationship between majority vote accuracy and diversity in bagging and boosting. Technical report. University of Wales, Bangor

Author information

Corresponding author

Correspondence to Yaxin Bi.

About this article

Cite this article

Bi, Y., McClean, S. & Anderson, T. Combining rough decisions for intelligent text mining using Dempster’s rule. Artif Intell Rev 26, 191–209 (2006). https://doi.org/10.1007/s10462-007-9049-y
