Mutual Screening Graph Algorithm: A New Bootstrapping Algorithm for Lexical Acquisition

  • Conference paper
Information Retrieval Technology (AIRS 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5839)

Abstract

Bootstrapping is a weakly supervised approach that has attracted attention in many areas of Information Extraction (IE) and Natural Language Processing (NLP), especially for learning semantic lexicons. In this paper, we propose a new bootstrapping algorithm, the Mutual Screening Graph Algorithm (MSGA), for learning semantic lexicons. The approach uses only an unannotated corpus and a few seed words to learn new words for each semantic category. By changing the format of the extracted patterns and the method for scoring patterns and words, we improve on earlier bootstrapping algorithms. We also compare the semantic lexicons produced by MSGA with those produced by two previous bootstrapping algorithms, Basilisk [1] and GMR (Graph Mutual Reinforcement based Bootstrapping) [4]. Experiments show that MSGA can outperform both approaches.
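
The abstract does not specify MSGA's pattern format or scoring functions, so the following is not MSGA. It is a minimal sketch of the general bootstrapping loop the abstract refers to, for a single semantic category, using the RlogF pattern score and AvgLog word score as commonly described for Basilisk [1]. The function name, its parameters, and the toy (pattern, word) pairs are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def bootstrap_lexicon(extractions, seeds, iterations=2, top_patterns=2, words_per_iter=1):
    """Grow a lexicon for one category from seed words (illustrative sketch only).

    extractions: iterable of (pattern, word) pairs taken from an unannotated corpus.
    """
    words_by_pattern = defaultdict(set)
    patterns_by_word = defaultdict(set)
    for pattern, word in extractions:
        words_by_pattern[pattern].add(word)
        patterns_by_word[word].add(pattern)

    lexicon = set(seeds)
    for _ in range(iterations):
        # Pattern score, RlogF = (F / N) * log2(F), where F is the number of
        # lexicon members the pattern extracts and N is everything it extracts.
        pattern_scores = {}
        for pattern, words in words_by_pattern.items():
            f = len(words & lexicon)
            if f > 0:
                pattern_scores[pattern] = (f / len(words)) * math.log2(f)
        pool = sorted(pattern_scores, key=lambda p: (-pattern_scores[p], p))[:top_patterns]

        # Candidates are the not-yet-accepted words extracted by the pattern pool.
        candidates = set().union(*(words_by_pattern[p] for p in pool)) - lexicon

        # Word score, AvgLog = mean of log2(F_j + 1) over all patterns that
        # extract the word.
        def avg_log(word):
            pats = patterns_by_word[word]
            total = sum(math.log2(len(words_by_pattern[p] & lexicon) + 1) for p in pats)
            return total / len(pats)

        ranked = sorted(candidates, key=lambda w: (-avg_log(w), w))
        lexicon.update(ranked[:words_per_iter])
    return lexicon


# Hypothetical toy data: patterns with a slot <X> and the words that filled it.
toy_extractions = [
    ("lived in <X>", "paris"), ("lived in <X>", "london"), ("lived in <X>", "berlin"),
    ("flew to <X>", "paris"), ("flew to <X>", "london"),
    ("flew to <X>", "tokyo"), ("flew to <X>", "berlin"),
    ("visited <X>", "paris"), ("visited <X>", "museum"), ("visited <X>", "tokyo"),
    ("ate <X>", "pizza"), ("ate <X>", "pasta"),
]

learned = bootstrap_lexicon(toy_extractions, seeds={"paris", "london"})
print(sorted(learned))  # ['berlin', 'london', 'paris', 'tokyo']
```

In this toy run, "berlin" is accepted first because both patterns that extract it also extract the seeds, and "tokyo" follows in the second iteration. "museum" and "pizza" never enter the lexicon, since the patterns that extract them score too low to join the pattern pool.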

References

  1. Thelen, M., Riloff, E.: A bootstrapping method for learning semantic lexicons using extraction pattern contexts. In: Proceedings of the ACL 2002 conference on Empirical methods in natural language processing, Philadelphia, USA, vol. 10, pp. 214–221 (2002)

  2. Blum, A., Mitchell, T.: Combining labeled and unlabeled data with co-training. In: Proceedings of the eleventh annual conference on Computational learning theory, Madison, Wisconsin, United States, pp. 92–100 (1998)

  3. Phillips, W., Riloff, E.: Exploiting Role-Identifying Nouns and Expressions for Information Extraction. In: 2007 Proceedings of Recent Advances in Natural Language Processing, RANLP 2007 (2007)

  4. Hassan, H., Hassan, A., Emam, O.: Unsupervised Information Extraction Approach Using Graph Mutual Reinforcement. In: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP 2006), pp. 501–508 (2006)

  5. Patwardhan, S., Riloff, E.: Learning Domain-Specific Information Extraction Patterns from the Web. In: Proceedings of the Workshop on Information Extraction Beyond The Document, pp. 66–73 (2006)

  6. Florian, R., Hassan, H., Ittycheriah, A., Jing, H., Kambhatla, N., Luo, X., Nicolov, N., Roukos, S.: A statistical model for multilingual entity detection and tracking. In: HLT-NAACL 2004: Main Proceedings, pp. 1–8 (2004)

  7. Kambhatla, N.: Combining lexical, syntactic, and semantic features with maximum entropy models for information extraction. In: The Companion Volume to the Proceedings of 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain, pp. 178–181 (2004)

  8. Collins, M., Singer, Y.: Unsupervised models for named entity classification. In: Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora. University of Maryland, MD (1999)

  9. Etzioni, O., Cafarella, M., Downey, D., Popescu, A., Shaked, S.T.: Unsupervised named-entity extraction from the web: An experimental study. Artificial Intelligence 165(1), 91–134 (2005)

  10. Riloff, E., Wiebe, J., Wilson, T.: Learning subjective nouns using extraction pattern bootstrapping. In: Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003, Edmonton, Canada, vol. 4, pp. 25–32 (2003)

  11. Riloff, E.: Automatically generating extraction patterns from untagged text. In: Proceedings of the Thirteenth National Conference on Artificial Intelligence, Portland, Oregon, pp. 1044–1049 (1996)

  12. Riloff, E., Phillips, W.: An Introduction to the Sundance and AutoSlog Systems (2004)

  13. COAE Proceedings: COAE proceedings. In: Proceedings of Chinese Opinion Analysis Evaluation 2008, COAE 2008 (2008)

  14. Riloff, E., Jones, R.: Learning dictionaries for information extraction by multi-level bootstrapping. In: Proceedings of the 16th National Conference on Artificial Intelligence, Orlando, USA, pp. 474–479 (1999)

  15. Hirschman, L., Light, M., Breck, E., Burger, J.D.: Deep read: A reading comprehension system. University of Maryland, United States (1999)

  16. Moldovan, D., Harabagiu, S., Pasca, M., Mihalcea, R., Goodrum, R., Girju, R., Rus, V.: Lasso: A tool for surfing the answer net. In: Proceedings of the Eighth Text REtrieval Conference, TREC-8 (1999)

  17. Riloff, E., Schmelzenbach, M.: An empirical approach to conceptual case frame acquisition. In: Proceedings of the Sixth Workshop on Very large Corpora, Montreal, Canada (August 1998)

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, Y., Zhou, Y. (2009). Mutual Screening Graph Algorithm: A New Bootstrapping Algorithm for Lexical Acquisition. In: Lee, G.G., et al. Information Retrieval Technology. AIRS 2009. Lecture Notes in Computer Science, vol 5839. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04769-5_25

  • DOI: https://doi.org/10.1007/978-3-642-04769-5_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04768-8

  • Online ISBN: 978-3-642-04769-5

  • eBook Packages: Computer Science, Computer Science (R0)
