
Natural Language Processing Pretraining Language Model for Computer Intelligent Recognition Technology

Published: 07 August 2024

Abstract

Computer intelligent recognition technology uses computer vision, Natural Language Processing (NLP), machine learning, and related techniques to enable computers to recognize, analyze, understand, and respond to human language and behavior. Its common applications include image recognition, NLP, face recognition, and target tracking. NLP is a field of computer science concerned with the interaction between computers and natural language; NLP techniques process, analyze, and generate natural language data such as text, speech, and images, with common applications in language translation, sentiment analysis, text classification, speech recognition, and question answering systems. A language model is a machine learning model trained on large amounts of text data to learn the linguistic patterns and relationships within that data. Although language models have made great progress in recent years, they still face challenges, including weak semantic understanding, confusion in multilingual processing, and slow language processing. To address these shortcomings, this article studies a pre-training language model based on NLP technology, aiming to use NLP techniques to optimize and improve the performance of the language model and thereby strengthen computer intelligent recognition technology. The proposed model offers stronger language understanding and more accurate prediction; by training on a large corpus, it learns linguistic rules and structures and thus better understands natural language. In the experiments, the traditional Generative Pretrained Transformer-2 (GPT-2) language model processed a data size of 10 GB with a total computing time of 97 hours; BERT (Bidirectional Encoder Representations from Transformers) processed 12 GB in 86 hours; and the NLP-based pre-training language model processed 18 GB in 71 hours. The NLP-based model therefore handled a larger data size in a shorter computing time. The experimental results show that NLP technology can effectively optimize the language model and improve its capabilities. This work opens a new development direction for computer intelligent recognition technology and provides technical support for the development of language models.
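The article's own implementation is not reproduced on this page, but the pretraining objective the abstract describes can be illustrated with a short sketch. The example below performs BERT-style masked-language-model pretraining with the Hugging Face transformers and datasets libraries; the checkpoint name, the wikitext corpus, and all hyperparameters are illustrative assumptions standing in for the paper's unreported configuration.

# Minimal sketch of masked-language-model pretraining (the BERT-style
# objective discussed in the abstract). All names and hyperparameters
# here are illustrative assumptions, not the paper's configuration.
from datasets import load_dataset
from transformers import (
    BertForMaskedLM,
    BertTokenizerFast,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Any plain-text corpus works here; wikitext-2 stands in for the much
# larger corpora (e.g., the 18 GB dataset) reported in the abstract.
corpus = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Randomly mask 15% of input tokens (the standard BERT pretraining setup);
# the collator also builds the matching labels for the masked positions.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="mlm-pretrain",
        per_device_train_batch_size=16,
        num_train_epochs=1,
    ),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()  # learns language patterns by predicting the masked tokens

A GPT-2-style causal variant differs mainly in the objective: loading GPT2LMHeadModel and passing mlm=False to the collator trains the model to predict each next token rather than masked tokens.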




      Published In

      ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 23, Issue 8
      August 2024
      343 pages
      EISSN:2375-4702
      DOI:10.1145/3613611

      Publisher

      Association for Computing Machinery, New York, NY, United States

      Publication History

      Published: 07 August 2024
      Online AM: 20 June 2023
      Accepted: 05 June 2023
      Revised: 21 April 2023
      Received: 20 February 2023
      Published in TALLIP Volume 23, Issue 8


      Author Tags

      1. Natural Language Processing
      2. Computer Intelligent Recognition Technology
      3. Pre-trained Language Model
      4. Computer Vision
      5. Machine Learning

      Qualifiers

      • Research-article

      Funding Sources

      • Sound Speed Input Method Foundation of the Chinese Dictionary Research Center of the National Language Committee of Ludong University


      Cited By
      • (2024) An Exploration of the Application of Natural Language Processing Technology in the Quality Enhancement of Cross-Cultural Language Conversion. Applied Mathematics and Nonlinear Sciences 9, 1. DOI: 10.2478/amns-2024-3329. Online publication date: 18-Nov-2024.
      • (2024) Theoretical Approach of Implementing Blockchain and Artificial Intelligence for Diploma Verification. 2024 13th Mediterranean Conference on Embedded Computing (MECO), 1–4. DOI: 10.1109/MECO62516.2024.10577813. Online publication date: 11-Jun-2024.
      • (2024) Enhancing Academic Credentials: The Synergy of Blockchain and Artificial Intelligence. 2024 7th International Balkan Conference on Communications and Networking (BalkanCom), 206–211. DOI: 10.1109/BalkanCom61808.2024.10557185. Online publication date: 3-Jun-2024.
      • (2024) Discovering Personally Identifiable Information in Textual Data - A Case Study with Automated Concatenation of Embeddings. Advanced Information Networking and Applications, 145–158. DOI: 10.1007/978-3-031-57916-5_13. Online publication date: 9-Apr-2024.
      • (2023) Low-Resource Language Processing Using Improved Deep Learning with Hunter–Prey Optimization Algorithm. Mathematics 11, 21, 4493. DOI: 10.3390/math11214493. Online publication date: 30-Oct-2023.
