DOI: 10.1145/2459118.2459133
research-article

Automatic gazette creation for named entity recognition and application to resume processing

Published: 23 January 2012

Abstract

Named entities are important content-carrying units within documents. Consequently, named entity recognition (NER) is an important part of information extraction. One fast and accurate approach to NER uses a list, or gazette, of known instances. The gazette creation problem asks how to automatically build a comprehensive gazette from a given repository of unlabeled documents. We describe an unsupervised algorithm for automatic gazette creation, adapted from [5]. We propose a fast NER algorithm that uses a large gazette and show that it significantly outperforms a naïve approach based on regular expressions. We report experimental results from using the system to create gazettes for various resume-related named entities (e.g., ORG, DEGREE, EDUCATIONAL_INSTITUTE, DESIGNATION) and to perform the associated NER on a large set of real-life resumes.
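
To make the idea of gazette-based NER concrete, here is a minimal sketch of one common way to match a large gazette against text: a token-level trie with longest-match lookup. The abstract does not specify the paper's actual algorithm, so this is an illustrative assumption and not the authors' method. The names `build_trie`, `tag`, `tokenize`, and `SAMPLE_GAZETTE`, as well as the sample entries, are hypothetical.

```python
from typing import Dict, List, Tuple

END = "__label__"  # hypothetical sentinel key: marks the end of a gazette entry

# Made-up sample entries using the entity types named in the abstract.
SAMPLE_GAZETTE: Dict[str, str] = {
    "Indian Institute of Technology": "EDUCATIONAL_INSTITUTE",
    "Software Engineer": "DESIGNATION",
    "Senior Software Engineer": "DESIGNATION",
    "B.Tech": "DEGREE",
}


def tokenize(text: str) -> List[str]:
    """Split on whitespace and trim surrounding punctuation from each token."""
    stripped = (word.strip(".,;:()") for word in text.split())
    return [token for token in stripped if token]


def build_trie(gazette: Dict[str, str]) -> dict:
    """Build a token-level trie from {entry: label}; keys are lowercased."""
    root: dict = {}
    for entry, label in gazette.items():
        node = root
        for token in tokenize(entry):
            node = node.setdefault(token.lower(), {})
        node[END] = label
    return root


def tag(text: str, trie: dict) -> List[Tuple[str, str]]:
    """Return (phrase, label) pairs, taking the longest gazette match at each position."""
    tokens = tokenize(text)
    results: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        node = trie
        best_end, best_label = -1, None
        j = i
        # Walk the trie as far as the text allows, remembering the last complete entry.
        while j < len(tokens) and tokens[j].lower() in node:
            node = node[tokens[j].lower()]
            j += 1
            if END in node:
                best_end, best_label = j, node[END]
        if best_label is not None:
            results.append((" ".join(tokens[i:best_end]), best_label))
            i = best_end  # continue after the matched phrase
        else:
            i += 1
    return results
```

With this structure, the cost of tagging depends on the length of the text and of the longest entry, not on the number of gazette entries. A naïve baseline that tries one regular expression per entry, by contrast, does work proportional to the size of the gazette.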

References

[1] Collins, M. and Singer, Y. 1999. Unsupervised models for named entity classification. Proc. EMNLP.
[2] Etzioni, O., Cafarella, M., Downey, D., Popescu, A.-M., Shaked, T., Soderland, S., Weld, D. S. and Yates, A. 2005. Unsupervised named-entity extraction from the Web: An experimental study. Artificial Intelligence, 165, pp. 91-134.
[3] Nadeau, D., Turney, P. and Matwin, S. 2006. Unsupervised named-entity recognition: generating gazetteers and resolving ambiguity. Proc. 19th Canadian Conf. Artificial Intelligence.
[4] Palshikar, G. K. 2011. Techniques for named entity recognition: a survey. TRDDC Technical Report.
[5] Thelen, M. and Riloff, E. 2002. A bootstrapping method for learning semantic lexicons using extraction pattern contexts. Conference on Empirical Methods in Natural Language Processing (EMNLP 2002).

      Published In

      COMPUTE '12: Proceedings of the 5th ACM COMPUTE Conference: Intelligent & scalable system technologies
      January 2012
      146 pages
      ISBN:9781450314404
      DOI:10.1145/2459118

      Sponsors

      • ACM Pune Professional Chapter

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. gazette creation
      2. information extraction
      3. information retrieval
      4. named entity extraction
      5. named entity recognition
      6. resume processing

      Qualifiers

      • Research-article

      Conference

      Compute '12
      Sponsor:
      • ACM Pune Professional Chapter

      Acceptance Rates

      COMPUTE '12 paper acceptance rate: 18 of 116 submissions (16%)
      Overall acceptance rate: 114 of 622 submissions (18%)

      Cited By

      • (2024) Teaching and Recognition of Skills in the Digital Era Through OER-Based Personalized and Gamified Learning: The ENCORE Project. 2024 IEEE Global Engineering Education Conference (EDUCON), pp. 1-10. DOI: 10.1109/EDUCON60312.2024.10578861. Online publication date: 8-May-2024
      • (2024) Extracting IT Knowledge Using Named Entity Recognition Based on BERT from IOB Annotated Job Descriptions. Artificial Intelligence, Data Science and Applications, pp. 241-247. DOI: 10.1007/978-3-031-48573-2_35. Online publication date: 30-Jan-2024
      • (2023) Adlandırılmış Varlık Tanıma Modelleri ile Türkçe Sosyal Medya Metinlerinde Küfürlü Sözlerin Sansürlenmesi [Censorship of Profanity Words in Turkish Social Media Texts with Named Entity Recognition Models]. Afyon Kocatepe University Journal of Sciences and Engineering, 23:1, pp. 72-88. DOI: 10.35414/akufemubid.1115786. Online publication date: 1-Mar-2023
      • (2023) Text and Dynamic Network Analysis for Measuring Technological Convergence: A Case Study on Defense Patent Data. IEEE Transactions on Engineering Management, 70:4, pp. 1490-1503. DOI: 10.1109/TEM.2021.3078231. Online publication date: Apr-2023
      • (2023) ResuFormer: Semantic Structure Understanding for Resumes via Multi-Modal Pre-training. 2023 IEEE 39th International Conference on Data Engineering (ICDE), pp. 3154-3167. DOI: 10.1109/ICDE55515.2023.00242. Online publication date: Apr-2023
      • (2023) RINX: A system for information and knowledge extraction from resumes. Data & Knowledge Engineering, 147, article 102202. DOI: 10.1016/j.datak.2023.102202. Online publication date: Sep-2023
      • (2023) Novel data augmentation for named entity recognition. International Journal of Speech Technology, 26:4, pp. 869-878. DOI: 10.1007/s10772-023-10055-8. Online publication date: 3-Nov-2023
      • (2022) On the link between Education and Industry 4.0: a framework for a data-driven education design. 2022 IEEE Global Engineering Education Conference (EDUCON), pp. 1670-1677. DOI: 10.1109/EDUCON52537.2022.9766534. Online publication date: 28-Mar-2022
      • (2020) Semi-supervised deep learning based named entity recognition model to parse education section of resumes. Neural Computing and Applications. DOI: 10.1007/s00521-020-05351-2. Online publication date: 18-Sep-2020
      • (2015) LN-Annote. Proceedings of the 24th International Conference on World Wide Web, pp. 538-548. DOI: 10.1145/2736277.2741633. Online publication date: 18-May-2015
