The Comprehensive Analysis of the Effect of Chinese Word Segmentation on Fuzzy-Based Classification Algorithms for Agricultural Questions

International Journal of Fuzzy Systems

Abstract

Fuzzy logic is a core method for handling uncertain and vague information in agricultural natural language processing, and it also plays a crucial role in neural-network-based word segmentation and text classification algorithms. Word segmentation is often the first step in Chinese text classification and has a profound effect on the generalization ability of classification algorithms based on fuzzy logic. However, the structural complexity of text classification models and the specificity of agricultural data make it very challenging to study the effect of word segmentation. Although there have been several attempts to address this issue, they focus mainly on word segmentation precision or on the generalization performance of multiple word segmentation methods under a single classification algorithm, and they do not involve agricultural text. To address this problem through both theoretical and empirical analysis, this paper presents a comprehensive analysis of the effect of Chinese word segmentation on fuzzy-based classification algorithms for agricultural questions. It first discusses the characteristics of agricultural questions as a basis for the subsequent analysis of the domain adaptability of word segmentation and classification algorithms, employs fuzzy logic to convert the Chinese word segmentation task into a sequence labeling problem, and then analyzes the characteristics, techniques, and performance differences of seven mainstream open-source Chinese word segmentation tools. It then explores the impact of Chinese word segmentation on the generalization performance of classification algorithms under a proposed unified model framework for text classification based on fuzzy logic.
Finally, extensive experiments are performed on real data crawled from typical agricultural websites to study empirically how the effect of different word segmentation tools on classification performance differs and how robust it is, as well as the contribution of an external dictionary. Comparative experimental results show that word segmentation tools have a substantial effect on classification performance and that this effect is robust across the typical text feature extraction layers used for classification tasks, whereas the external dictionary has no significant effect on classification performance. These results provide a useful reference for selecting appropriate word segmentation tools for Chinese natural language processing tasks in the future.
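
The abstract mentions recasting Chinese word segmentation as a sequence labeling problem. The sketch below illustrates that general idea only: it assumes the common B/M/E/S character tagging scheme (begin, middle, end, single), which the abstract does not name, and it does not reproduce the fuzzy-logic formulation used in the paper. The function names and the sample question are hypothetical.

```python
from typing import List


def words_to_tags(words: List[str]) -> List[str]:
    """Map a segmented sentence to one B/M/E/S tag per character.

    The B/M/E/S scheme is an assumption made for illustration.
    """
    tags: List[str] = []
    for word in words:
        if not word:
            continue  # skip empty tokens so every tag matches a character
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # first character, any middle characters, last character
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars: str, tags: List[str]) -> List[str]:
    """Rebuild the word sequence from characters and their tags."""
    words: List[str] = []
    current = ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends at an E or S tag
            words.append(current)
            current = ""
    if current:  # keep any trailing characters left by an incomplete tag run
        words.append(current)
    return words


# Hypothetical agricultural question, already segmented into words.
segmented = ["小麦", "锈病", "如何", "防治"]
tags = words_to_tags(segmented)
restored = tags_to_words("".join(segmented), tags)
```

Under this framing, a segmentation tool becomes a model that predicts one tag per character, and the word boundaries follow from the predicted tags.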

Acknowledgements

This research was supported by Shandong Provincial Natural Science Foundation, China, grant number ZR2020MF146.

Author information

Corresponding author

Correspondence to Yunsheng Song.

Ethics declarations

Funding

Shandong Provincial Natural Science Foundation, China: ZR2020MF146.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhao, X., Huang, J., Zhang, J. et al. The Comprehensive Analysis of the Effect of Chinese Word Segmentation on Fuzzy-Based Classification Algorithms for Agricultural Questions. Int. J. Fuzzy Syst. 26, 2726–2749 (2024). https://doi.org/10.1007/s40815-024-01724-0

  • DOI: https://doi.org/10.1007/s40815-024-01724-0