Probabilistic classification techniques to perform geographical labeling of web objects

Cluster Computing

Abstract

Web search engines return documents relevant to a user's query. These result documents may also contain redundant information that the user does not need, so the user must navigate each document to find the relevant parts. To avoid this overhead, Web object search engines were proposed. These systems provide a powerful vertical search facility in which the result set of a query contains only the relevant Web object information. Many techniques have been proposed to geographically label documents for Web search engines; geographical labeling of Web objects, however, has received limited attention. Noise in the Web objects, introduced by an inaccurate object extraction process, complicates the task of assigning geographical labels. A Gaussian mixture model oriented classification technique was recently proposed for geographical labeling of Web objects, but ample scope remains to improve labeling accuracy. In this work, two probabilistic classifiers, namely a Bayesian Classifier and a Variational Inference Classifier, are used to geographically label Web objects. The proposed technique provides at least 30% better labeling accuracy and twice the computational efficiency of the contemporary technique.
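To make the idea of probabilistic geographical labeling concrete, the following is a minimal sketch of a generic multinomial naive Bayes classifier with Laplace smoothing. It is not the Bayesian Classifier described in the article, whose details are not given in this abstract. The sample data, the label names, and the function names `train`, `posterior`, and `classify` are all hypothetical, and the handling of noise (skipping tokens never seen in training) is an assumption made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data: (tokens extracted from a Web object, geographic label).
SAMPLES = [
    (["pizza", "broadway", "manhattan"], "New York"),
    (["museum", "manhattan", "subway"], "New York"),
    (["cafe", "louvre", "seine"], "Paris"),
    (["bakery", "seine", "montmartre"], "Paris"),
]

def train(samples):
    """Count labels and per-label token frequencies."""
    label_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        label_counts[label] += 1
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return {"labels": label_counts, "tokens": token_counts, "vocab": vocab}

def posterior(model, tokens):
    """Return P(label | tokens) for every label, using Laplace smoothing."""
    total = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    log_scores = {}
    for label, n in model["labels"].items():
        counts = model["tokens"][label]
        denom = sum(counts.values()) + vocab_size
        score = math.log(n / total)  # log prior
        for tok in tokens:
            # Assumption: tokens unseen in training are treated as noise and skipped.
            if tok in model["vocab"]:
                score += math.log((counts[tok] + 1) / denom)
        log_scores[label] = score
    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    norm = sum(math.exp(s - top) for s in log_scores.values())
    return {label: math.exp(s - top) / norm for label, s in log_scores.items()}

def classify(model, tokens):
    """Pick the label with the highest posterior probability."""
    post = posterior(model, tokens)
    return max(post, key=post.get)

model = train(SAMPLES)
print(classify(model, ["cafe", "seine", "zzz_noise"]))
```

Returning the full posterior rather than only the winning label lets a caller reject low-confidence assignments, which may be useful when the extracted object text is noisy.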

Corresponding author

Correspondence to T. Satish Kumar.

Cite this article

AnjanKumar, K.N., Chitra, S. & Satish Kumar, T. Probabilistic classification techniques to perform geographical labeling of web objects. Cluster Comput 22 (Suppl 1), 277–285 (2019). https://doi.org/10.1007/s10586-018-1822-y
