
Building Enhanced Link Context by Logical Sitemap

  • Conference paper
Knowledge Science, Engineering and Management (KSEM 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8041)

Abstract

Link contexts have been applied to enrich document representation for a variety of information retrieval tasks. However, valuable site-specific hierarchical information has not yet been exploited to enrich link contexts. In this paper, we propose to enhance link contexts by mining the underlying information organization architecture of a Web site, which we term a logical sitemap to distinguish it from the sitemap pages that sites supply. We reconstruct a logical sitemap for a Web site by mining existing navigation elements such as menus, breadcrumbs, and sitemaps. We then enrich the contexts of a link by aggregating contexts according to the hierarchical relationships in the mined logical sitemap. The experimental results show that the proposed approach can reliably construct a logical sitemap for a general site, and that the enriched link contexts derived from the logical sitemap noticeably improve site-specific known-item search performance.
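To make the two steps in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that breadcrumb trails have already been extracted as root-to-page URL lists and that each URL has a list of anchor texts; the function names (`build_sitemap`, `ancestors`, `enrich_contexts`), the sample URLs, and the simple "append ancestor anchor text" aggregation rule are all hypothetical choices made for illustration.

```python
def build_sitemap(breadcrumbs):
    """Build a child -> parent map from breadcrumb trails.

    Each trail is a list of URLs ordered from the site root to the page.
    (Assumption: trails are already extracted from the pages.)
    """
    parent = {}
    for trail in breadcrumbs:
        for up, down in zip(trail, trail[1:]):
            # Keep the first parent seen; a real system would have to
            # reconcile conflicting evidence from menus and sitemaps.
            parent.setdefault(down, up)
    return parent


def ancestors(url, parent):
    """Yield the ancestors of url, nearest first, guarding against cycles."""
    seen = {url}
    while url in parent:
        url = parent[url]
        if url in seen:
            break
        seen.add(url)
        yield url


def enrich_contexts(anchor_text, parent):
    """Append ancestor anchor texts to each page's own link context."""
    enriched = {}
    for url, texts in anchor_text.items():
        context = list(texts)
        for anc in ancestors(url, parent):
            context.extend(anchor_text.get(anc, []))
        enriched[url] = context
    return enriched


# Hypothetical toy site used only to exercise the sketch.
breadcrumbs = [
    ["/", "/products", "/products/laptops"],
    ["/", "/products", "/products/phones"],
    ["/", "/support"],
]
anchor_text = {
    "/": ["Home"],
    "/products": ["Products"],
    "/products/laptops": ["Laptops"],
    "/products/phones": ["Phones"],
    "/support": ["Support"],
}

parent = build_sitemap(breadcrumbs)
enriched = enrich_contexts(anchor_text, parent)
print(enriched["/products/laptops"])  # ['Laptops', 'Products', 'Home']
```

In this sketch a page inherits the anchor text of every ancestor in the mined hierarchy, so a query such as "products laptops" can match the laptops page even if its own anchor text is only "Laptops". How the paper actually weights or combines contexts is not stated in the abstract.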

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yang, Q., Niu, Z., Zhang, C., Huang, S. (2013). Building Enhanced Link Context by Logical Sitemap. In: Wang, M. (eds) Knowledge Science, Engineering and Management. KSEM 2013. Lecture Notes in Computer Science, vol 8041. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39787-5_4

  • DOI: https://doi.org/10.1007/978-3-642-39787-5_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39786-8

  • Online ISBN: 978-3-642-39787-5

  • eBook Packages: Computer Science, Computer Science (R0)
