NERSE: Named Entity Recognition in Software Engineering as a Service

  • Conference paper
Service Research and Innovation (ASSRI 2018)

Abstract

Named Entity Recognition (NER) is a computational linguistics task that seeks to classify every word in a document into one of a set of predefined categories. NER serves as an important component of many domain-specific expert systems. Software engineering is one such domain, where very little work has been done on identifying domain-specific entities. In this paper, we present NERSE, a tool that enables the user to identify software-specific entities. It is built on machine learning models trained on software-specific entity categories using Conditional Random Fields (CRF) and Bidirectional Long Short-Term Memory - Conditional Random Fields (BiLSTM-CRF). NERSE identifies 22 different categories of entities specific to the software engineering domain with 0.85% and 0.95% for the CRF and BiLSTM-CRF models respectively (source code for both models is available at https://github.com/prathapreddymv/NERSE).
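
To make the tagging task concrete, the following is a minimal sketch of how a linear-chain CRF assigns one label per token by Viterbi decoding. It is not the NERSE implementation: the three labels (`O`, `LANG`, `LIB`), the small gazetteers, and all scores are hypothetical and hand-set for illustration, whereas a trained CRF would learn its weights from annotated data.

```python
# Toy linear-chain CRF decoding (Viterbi) for token-level entity tagging.
# Labels, weights, and gazetteers below are hypothetical illustrations,
# not the categories or parameters used by NERSE.

LABELS = ["O", "LANG", "LIB"]

# Hypothetical gazetteers used as emission features.
KNOWN_LANGS = {"python", "java"}
KNOWN_LIBS = {"numpy", "pandas"}


def emission_score(token, label):
    """Score how well a single token fits a label (hand-set weights)."""
    word = token.lower()
    if label == "LANG":
        return 2.0 if word in KNOWN_LANGS else -1.0
    if label == "LIB":
        return 2.0 if word in KNOWN_LIBS else -1.0
    # "O": mild preference for tokens not found in any gazetteer.
    return 0.5 if word not in (KNOWN_LANGS | KNOWN_LIBS) else -0.5


# Hypothetical transition scores between adjacent labels (default 0.0).
TRANSITIONS = {("LANG", "LANG"): -1.0, ("LIB", "LIB"): -1.0}


def viterbi(tokens):
    """Return the highest-scoring label sequence for the tokens."""
    if not tokens:
        return []
    # best[i][label] = (score of best path ending in label at i, previous label)
    best = [{lab: (emission_score(tokens[0], lab), None) for lab in LABELS}]
    for i in range(1, len(tokens)):
        column = {}
        for lab in LABELS:
            prev_lab, score = max(
                ((p, best[i - 1][p][0] + TRANSITIONS.get((p, lab), 0.0))
                 for p in LABELS),
                key=lambda pair: pair[1],
            )
            column[lab] = (score + emission_score(tokens[i], lab), prev_lab)
        best.append(column)
    # Backtrack from the best final label.
    label = max(LABELS, key=lambda lab: best[-1][lab][0])
    path = [label]
    for i in range(len(tokens) - 1, 0, -1):
        label = best[i][label][1]
        path.append(label)
    return path[::-1]


tokens = "I use numpy with Python".split()
print(list(zip(tokens, viterbi(tokens))))
```

With these toy scores, `numpy` is tagged `LIB`, `Python` is tagged `LANG`, and the remaining tokens are tagged `O`. A BiLSTM-CRF keeps the same decoding step but replaces the hand-written emission scores with scores produced by a bidirectional LSTM.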

Notes

  1. https://archive.org/details/stackexchange

Author information

Corresponding author

Correspondence to M. Veera Prathap Reddy.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Veera Prathap Reddy, M., Prasad, P.V.R.D., Chikkamath, M., Mandadi, S. (2019). NERSE: Named Entity Recognition in Software Engineering as a Service. In: Lam, HP., Mistry, S. (eds) Service Research and Innovation. ASSRI 2018. Lecture Notes in Business Information Processing, vol 367. Springer, Cham. https://doi.org/10.1007/978-3-030-32242-7_6

  • DOI: https://doi.org/10.1007/978-3-030-32242-7_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32241-0

  • Online ISBN: 978-3-030-32242-7

  • eBook Packages: Computer Science, Computer Science (R0)
