Using Intuitionistic Fuzzy Sets in Text Categorization

  • Conference paper
Artificial Intelligence and Soft Computing – ICAISC 2008 (ICAISC 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5097)

Abstract

We address a crucial problem in text categorization: local feature selection. Intuitionistic fuzzy sets appear to be an effective and efficient tool for assessing each term (from the feature set of each category) in terms of both its indicative and its non-indicative ability. This matters especially in high-dimensional problems, where text filtering can be improved through confident rejection of non-relevant documents. Moreover, we indicate that intuitionistic fuzzy sets are a good tool for classifying imbalanced and overlapping classes, a case commonly encountered in text categorization.
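As a rough illustration of the idea of assessing a term from two sides, the sketch below encodes the standard definition of an intuitionistic fuzzy value: a membership degree mu, a non-membership degree nu with mu + nu <= 1, and a hesitation margin pi = 1 - mu - nu. Reading mu as a term's indicative ability and nu as its non-indicative ability is an interpretation of the abstract, and the class name `TermAssessment` and the numeric values are hypothetical. This is not the authors' procedure for deriving the degrees from a corpus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermAssessment:
    """Intuitionistic fuzzy assessment of one term for one category (illustrative)."""

    mu: float  # assumed reading: degree to which the term indicates the category
    nu: float  # assumed reading: degree to which the term indicates against it

    def __post_init__(self):
        # Each degree lies in [0, 1], and together they may not exceed 1.
        if not (0.0 <= self.mu <= 1.0 and 0.0 <= self.nu <= 1.0):
            raise ValueError("mu and nu must each lie in [0, 1]")
        if self.mu + self.nu > 1.0 + 1e-12:
            raise ValueError("mu + nu must not exceed 1")

    @property
    def hesitation(self) -> float:
        """Hesitation margin pi = 1 - mu - nu, the part left undecided."""
        return 1.0 - self.mu - self.nu


# Hypothetical degrees, chosen only for illustration.
term = TermAssessment(mu=0.6, nu=0.3)
print(term.hesitation)
```

Keeping nu separate from 1 - mu is what distinguishes this from an ordinary fuzzy set: a term can be weakly indicative without being strongly non-indicative, and the remainder is recorded as hesitation.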

Editor information

Leszek Rutkowski Ryszard Tadeusiewicz Lotfi A. Zadeh Jacek M. Zurada

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Szmidt, E., Kacprzyk, J. (2008). Using Intuitionistic Fuzzy Sets in Text Categorization. In: Rutkowski, L., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing – ICAISC 2008. ICAISC 2008. Lecture Notes in Computer Science, vol 5097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69731-2_35

  • DOI: https://doi.org/10.1007/978-3-540-69731-2_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69572-1

  • Online ISBN: 978-3-540-69731-2

  • eBook Packages: Computer Science, Computer Science (R0)
