Context Enhanced Keyword Extraction for Sparse Geo-Entity Relation from Web Texts

  • Conference paper
Web Technologies and Applications (APWeb 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9865)

Abstract

Geo-entity relation recognition from rich texts requires robust and effective keyword extraction. Compared with supervised learning methods, unsupervised methods attract more attention because they can capture dynamic feature variation in text and discover additional relation types. Frequency-based keyword extraction methods have been widely studied. However, they are difficult to apply directly to geo-entity keyword extraction because geo-entity relations are sparsely distributed in texts. In addition, few studies address Chinese keyword extraction. This paper proposes a context-enhanced keyword extraction method. First, the contexts of geo-entities are enhanced to reduce the sparseness of terms. Second, two well-known frequency-based statistical methods, document frequency (DF) and Entropy, are used to build a large-scale corpus automatically from the enhanced contexts. Third, lexical features and their weights are determined statistically from the corpus to make terms more distinguishable. Finally, all terms in the enhanced contexts are scored with the lexical features, and the most important terms are selected as the keywords of geo-entity pairs. Experiments are conducted on a large collection of real Chinese web texts. Compared with DF and Entropy, the presented method improves precision by 41% and 36%, respectively, in discovering sparsely distributed keywords, and generates an additional 60% of correct keywords for geo-entity relation recognition.
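To make the two frequency-based baselines named in the abstract concrete, the following is a minimal Python sketch of DF and entropy scoring over tokenized contexts. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the exact formulas, so DF is taken here as the number of contexts containing a term, and entropy as the Shannon entropy of a term's occurrences across contexts. The function names and the toy English contexts are hypothetical; Chinese text would first need word segmentation.

```python
import math
from collections import Counter


def document_frequency(contexts):
    """Count, for each term, how many contexts contain it at least once."""
    df = Counter()
    for tokens in contexts:
        df.update(set(tokens))  # set() so repeats within a context count once
    return df


def term_entropy(contexts):
    """Shannon entropy (in bits) of each term's occurrences across contexts.

    Assumed definition: p_i is the share of the term's total occurrences
    that fall in context i. A term confined to one context scores 0.
    """
    per_context = [Counter(tokens) for tokens in contexts]
    totals = Counter()
    for counts in per_context:
        totals.update(counts)

    entropy = {}
    for term, total in totals.items():
        h = 0.0
        for counts in per_context:
            c = counts.get(term, 0)
            if c:
                p = c / total
                h -= p * math.log2(p)
        entropy[term] = h
    return entropy


# Hypothetical toy contexts, each already split into terms.
contexts = [
    ["river", "flows", "into", "lake"],
    ["river", "north", "of", "city"],
    ["city", "capital", "of", "province"],
]

df = document_frequency(contexts)
entropy = term_entropy(contexts)
```

In this toy input, a term that occurs in only one context has DF 1 and entropy 0, which illustrates why purely frequency-based scores struggle to separate sparsely distributed relation keywords from noise.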

Notes

  1. http://baike.baidu.com
  2. http://www.datatang.com/data/42306/
  3. http://www.360doc.com/content/11/0110/01/694750_85358960.shtml
  4. https://gate.ac.uk/

Acknowledgments

This work was partially supported by the National High-Tech Research and Development Program of China (2013AA120305) and the National Natural Science Foundation of China (41271408).

Author information

Correspondence to Feng Lu.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Yu, L., Lu, F., Zhang, X., Liu, X. (2016). Context Enhanced Keyword Extraction for Sparse Geo-Entity Relation from Web Texts. In: Morishima, A., et al. Web Technologies and Applications. APWeb 2016. Lecture Notes in Computer Science, vol 9865. Springer, Cham. https://doi.org/10.1007/978-3-319-45835-9_22

  • DOI: https://doi.org/10.1007/978-3-319-45835-9_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45834-2

  • Online ISBN: 978-3-319-45835-9

  • eBook Packages: Computer Science, Computer Science (R0)
