DOI: 10.1145/3603781.3603797
research-article

Information extraction method of topic webpage based on multi-angle feature learning

Published: 27 July 2023

ABSTRACT

Finding topic-relevant information in web pages is difficult: searching for specific information by hand is slow, and the results of commonly used automatic methods are inaccurate. This paper proposes a multi-angle feature analysis method for identifying web information. The method mines the characteristics of web page content comprehensively. Focusing on those characteristics, the page text is segmented, and features are extracted and quantified from multiple perspectives. A fully connected deep neural network model is used for training, and linear classifiers are used to classify the web pages. The final experiment shows that this method improves the F value by more than 4% compared with the keyword method and the SVM (Support Vector Machine) method.
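As a rough illustration of the two ideas in the abstract, quantifying a text segment from several angles and scoring a classifier by its F value, the following minimal sketch may help. It is not the authors' implementation: the keyword set, the four features (word count, keyword ratio, link density, punctuation density), and the function names are all assumptions chosen for illustration, and the F value is assumed here to be the standard F1 score.

```python
import re

# Hypothetical topic keywords; the abstract does not list the real ones.
TOPIC_KEYWORDS = {"crawler", "extraction", "webpage"}


def block_features(text, link_chars=0):
    """Quantify one text segment from several angles.

    The four features are illustrative assumptions, not the paper's set:
    word count, topic-keyword ratio, link density, punctuation density.
    `link_chars` is the number of characters that sit inside hyperlinks.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    n_words = len(words)
    n_chars = len(text)
    keyword_ratio = (
        sum(w in TOPIC_KEYWORDS for w in words) / n_words if n_words else 0.0
    )
    link_density = link_chars / n_chars if n_chars else 0.0
    punct_density = (
        sum(c in ".,;:!?" for c in text) / n_chars if n_chars else 0.0
    )
    return [float(n_words), keyword_ratio, link_density, punct_density]


def f_measure(y_true, y_pred):
    """F1 score for binary labels (1 = topic-relevant, 0 = not relevant)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

In a pipeline of the kind the abstract describes, feature vectors like these would be fed to the trained model, and the predicted labels would be compared with the ground truth through `f_measure`.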

Published in

CNIOT '23: Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things
May 2023
1025 pages
ISBN: 9798400700705
DOI: 10.1145/3603781

Copyright © 2023 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher: Association for Computing Machinery, New York, NY, United States


Qualifiers: research-article, Research, Refereed limited

Overall Acceptance Rate: 39 of 82 submissions, 48%