Name identification and extraction with formal concept analysis

Taghva, Kazem

doi:10.1007/s13042-016-0514-2

Original Article
Published: 18 March 2016

Volume 8, pages 171–178, (2017)
International Journal of Machine Learning and Cybernetics

Kazem Taghva ORCID: orcid.org/0000-0001-8320-0080¹

339 Accesses
2 Citations
Abstract

One application of formal concept analysis (FCA) is the extraction of structured information from textual documents. Typically, one defines a set of attributes that characterize the objects of interest, and standard FCA algorithms then extract the objects so defined. In this paper, we describe how FCA identifies and extracts personal names as units of thought, similar to the way the Viterbi algorithm decodes text sequences with Hidden Markov Models. We further show how FCA mimics the thought process that goes into a rule-based information extraction system. We then observe that the formal approach of FCA, combined with established computational techniques such as the bottom-up intersection algorithm, avoids the difficulties associated with hand coding and maintaining rule-based systems.
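As a rough illustration of the idea in the abstract, the following Python sketch builds the formal concepts of a tiny token context by repeatedly intersecting object intents, which is one common reading of a bottom-up intersection approach. The tokens, the attribute names (`capitalized`, `title`, `first_name_list`, `surname_list`), and the function names are all hypothetical and chosen for illustration; the paper's actual attributes and algorithm details are not shown in this preview.

```python
from itertools import chain

# Hypothetical formal context: tokens are the objects, and each token maps
# to the set of attributes it has. These attributes are illustrative only.
CONTEXT = {
    "Dr.":    {"capitalized", "title"},
    "Kazem":  {"capitalized", "first_name_list"},
    "Taghva": {"capitalized", "surname_list"},
    "paper":  set(),
}


def all_intents(context):
    """Return every concept intent by closing object intents under intersection.

    Start from the full attribute set (the intent of the bottom concept) and,
    for each object, intersect its attribute set with every intent found so far.
    """
    all_attrs = frozenset(chain.from_iterable(context.values()))
    intents = {all_attrs}
    for obj_attrs in context.values():
        row = frozenset(obj_attrs)
        # The comprehension is fully evaluated before the set is updated.
        intents |= {row & intent for intent in intents}
    return intents


def extent(context, intent):
    """Return the objects that carry every attribute in the given intent."""
    return frozenset(obj for obj, attrs in context.items() if intent <= attrs)


def concepts(context):
    """Return all formal concepts as (extent, intent) pairs."""
    return {(extent(context, intent), intent) for intent in all_intents(context)}


if __name__ == "__main__":
    for ext, intent in sorted(concepts(CONTEXT), key=lambda c: len(c[1])):
        print(sorted(ext), sorted(intent))
```

In this toy context, the concept whose intent is `{"capitalized"}` groups the three capitalized tokens, while more specific intents isolate candidate first names and surnames. A real extractor would presumably go on to combine adjacent tokens into full names, which this sketch does not attempt.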

Acknowledgments

The author would like to thank anonymous reviewers for their contributions to this paper.

Author information

Authors and Affiliations

Department of Computer Science, University of Nevada, Las Vegas, USA
Kazem Taghva

Corresponding author

Correspondence to Kazem Taghva.

About this article

Cite this article

Taghva, K. Name identification and extraction with formal concept analysis. Int. J. Mach. Learn. & Cyber. 8, 171–178 (2017). https://doi.org/10.1007/s13042-016-0514-2

Received: 31 August 2015
Accepted: 16 February 2016
Published: 18 March 2016
Issue Date: February 2017
DOI: https://doi.org/10.1007/s13042-016-0514-2
