Topic-Based Website Feature Analysis for Enterprise Search from the Web

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4255)

Abstract

Efficient and accurate enterprise search over specific resources available on the web is a challenging and important problem. Domain-specific enterprise websites are similar in their topic structures and textual content. Taking the semantic information of website content terms into account, a novel website feature vector modelling method for representing website topics is proposed on the basis of the vector space model. The feature vector elements integrate textual semantic information about topic content and structural information, through different semantic terms and a weighting scheme respectively. Comparative recognition results demonstrate that this feature analysis approach to website topics shows strong potential for domain-specific enterprise web search.
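The idea of a website-level feature vector built on the vector space model can be illustrated with a minimal sketch. This is not the authors' method, whose details are not given in the abstract. It assumes a website is a list of `(depth, text)` pages, that structural information enters through a hypothetical weight `1 / (1 + depth)` (pages closer to the homepage count more), and that topic similarity is measured with cosine similarity. The names `website_vector` and `cosine` are illustrative only.

```python
import math
from collections import Counter


def website_vector(pages, vocabulary):
    """Build one feature vector for a whole website.

    pages: list of (depth, text) tuples, where depth is the link distance
           from the homepage (an assumed stand-in for structure).
    vocabulary: ordered list of topic terms that define the vector axes.
    """
    weights = Counter()
    for depth, text in pages:
        # Assumed structural weighting: shallower pages contribute more.
        structure_weight = 1.0 / (1 + depth)
        counts = Counter(text.lower().split())
        for term in vocabulary:
            weights[term] += structure_weight * counts[term]
    return [weights[term] for term in vocabulary]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

Under these assumptions, a site would be compared against a topic by computing `cosine(website_vector(pages, vocabulary), topic_vector)`, where `topic_vector` is a reference vector over the same vocabulary, for example the average of vectors from known on-topic sites.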

This work was partially supported by the Natural Science Foundation of China (No. 60374057) and the Key Program of the Ministry of Education of China (No.211CERS-8).

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dong, B., Liu, H., Hou, Z., Liu, X. (2006). Topic-Based Website Feature Analysis for Enterprise Search from the Web. In: Aberer, K., Peng, Z., Rundensteiner, E.A., Zhang, Y., Li, X. (eds) Web Information Systems – WISE 2006. WISE 2006. Lecture Notes in Computer Science, vol 4255. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11912873_11

  • DOI: https://doi.org/10.1007/11912873_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-48105-8

  • Online ISBN: 978-3-540-48107-2

  • eBook Packages: Computer Science, Computer Science (R0)
