Approximately Similarity Measurement of Web Sites

  • Conference paper
Neural Information Processing (ICONIP 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9492)

Abstract

In this paper we present a method for approximately measuring the similarity of two web sites. The web sites considered contain only HTML pages. The algorithm selects a number of significant pages from each site and determines the similarity of the sites from the information in the selected pages. A genetic algorithm is used to select the significant pages. The algorithm was implemented, and the results were obtained, in Java.
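The abstract does not give the page representation, the fitness function, or the similarity formula, so the following is only a minimal Java sketch of the general idea under stated assumptions: each page is modelled as the set of HTML tag names it contains, a chromosome is a set of k distinct page indices, fitness is the number of distinct tags the chosen pages cover, and site similarity is the symmetrized average best-match Jaccard score between the selected pages. All class and method names (`SiteSimilaritySketch`, `selectPages`, `siteSimilarity`, and so on) are hypothetical and are not taken from the paper.

```java
import java.util.*;

// Illustrative sketch only: the page model, fitness function, and similarity
// measure are assumptions, not the method described in the paper.
class SiteSimilaritySketch {

    // Jaccard similarity of two tag sets (assumed page-level measure).
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) return 1.0;
        Set<String> inter = new HashSet<>(a);
        inter.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) inter.size() / union.size();
    }

    // Assumed fitness: number of distinct tags covered by the chosen pages.
    static int fitness(List<Set<String>> site, int[] chosen) {
        Set<String> covered = new HashSet<>();
        for (int i : chosen) covered.addAll(site.get(i));
        return covered.size();
    }

    // Random chromosome: k distinct page indices out of n.
    static int[] randomChromosome(int n, int k, Random rnd) {
        List<Integer> idx = new ArrayList<>();
        for (int i = 0; i < n; i++) idx.add(i);
        Collections.shuffle(idx, rnd);
        int[] c = new int[k];
        for (int i = 0; i < k; i++) c[i] = idx.get(i);
        return c;
    }

    // Crossover: draw k distinct genes from the union of both parents.
    static int[] crossover(int[] p1, int[] p2, Random rnd) {
        Set<Integer> pool = new LinkedHashSet<>();
        for (int g : p1) pool.add(g);
        for (int g : p2) pool.add(g);
        List<Integer> genes = new ArrayList<>(pool);
        Collections.shuffle(genes, rnd);
        int[] child = new int[p1.length];
        for (int i = 0; i < child.length; i++) child[i] = genes.get(i);
        return child;
    }

    // Mutation: replace one gene with a page index not already present.
    static int[] mutate(int[] parent, int n, Random rnd) {
        int[] child = parent.clone();
        if (child.length == 0 || child.length == n) return child;
        Set<Integer> used = new HashSet<>();
        for (int g : child) used.add(g);
        int candidate;
        do { candidate = rnd.nextInt(n); } while (used.contains(candidate));
        child[rnd.nextInt(child.length)] = candidate;
        return child;
    }

    // Genetic algorithm: keep the fitter half, refill with mutated offspring.
    static int[] selectPages(List<Set<String>> site, int k, int generations, long seed) {
        Random rnd = new Random(seed);
        int n = site.size();
        k = Math.min(k, n);
        int popSize = 20;
        List<int[]> pop = new ArrayList<>();
        for (int i = 0; i < popSize; i++) pop.add(randomChromosome(n, k, rnd));
        Comparator<int[]> byFitness =
            Comparator.comparingInt((int[] c) -> fitness(site, c)).reversed();
        for (int g = 0; g < generations; g++) {
            pop.sort(byFitness);
            List<int[]> next = new ArrayList<>(pop.subList(0, popSize / 2));
            while (next.size() < popSize) {
                int[] p1 = next.get(rnd.nextInt(popSize / 2));
                int[] p2 = next.get(rnd.nextInt(popSize / 2));
                next.add(mutate(crossover(p1, p2, rnd), n, rnd));
            }
            pop = next;
        }
        pop.sort(byFitness);
        return pop.get(0);
    }

    // Average, over the selected pages of a, of the best Jaccard match in b.
    static double bestMatchAverage(List<Set<String>> a, int[] selA,
                                   List<Set<String>> b, int[] selB) {
        double sum = 0.0;
        for (int i : selA) {
            double best = 0.0;
            for (int j : selB) best = Math.max(best, jaccard(a.get(i), b.get(j)));
            sum += best;
        }
        return sum / selA.length;
    }

    // Symmetrized site similarity in [0, 1], using only the selected pages.
    static double siteSimilarity(List<Set<String>> a, int[] selA,
                                 List<Set<String>> b, int[] selB) {
        if (selA.length == 0 || selB.length == 0) return 0.0;
        return (bestMatchAverage(a, selA, b, selB)
              + bestMatchAverage(b, selB, a, selA)) / 2.0;
    }
}
```

A caller would extract the tag set of every page of each site, call `selectPages` once per site, and pass both selections to `siteSimilarity`. Because only k pages per site are compared, the result approximates a full pairwise comparison at a fraction of its cost, which is the trade-off the abstract describes.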

Author information

Corresponding author

Correspondence to Doru Anastasiu Popescu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Popescu, D.A., Radulescu, D. (2015). Approximately Similarity Measurement of Web Sites. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science, vol 9492. Springer, Cham. https://doi.org/10.1007/978-3-319-26561-2_73

  • DOI: https://doi.org/10.1007/978-3-319-26561-2_73

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26560-5

  • Online ISBN: 978-3-319-26561-2

  • eBook Packages: Computer Science, Computer Science (R0)
