Employing Social Network Construction and Analysis in Web Structure Optimization

Xagi, Mohamad; Guerbas, Abdelghani; Kianmehr, Keivan; Karampelas, Panagiotis; Ridley, Mick; Alhajj, Reda; Rokne, Jon

doi:10.1007/978-3-7091-0294-7_2

Abstract

The World Wide Web is growing continuously and rapidly, and it is steadily moving the tasks of daily life online. This trend suggests that a time will come when everyone is forced to use the web for daily activities. Naive users are the major concern in such a shift, so it is necessary to have the web ready to serve them. We argue that this requires well-optimized websites in which users can quickly locate the information they are looking for. This becomes more and more important given the widespread reliance on the many services available on the Internet nowadays. It is true that search engines can facilitate the task of finding the information one is looking for. However, search engines complement, and will never replace, the optimization of a website’s internal structure based on previously recorded user behavior. In this chapter, we present a novel approach for identifying problematic structures in websites. The method consists of two phases. The first phase compares user behavior, derived via web log mining techniques, to a combined analysis of the website’s link structure. That analysis applies three methods, which leads to a more robust framework and hence to strong and consistent outcomes: (1) constructing and analyzing a social network of the pages constituting the website by considering both the structure and the usage information; (2) applying the Weighted PageRank algorithm; and (3) applying the Hypertext Induced Topic Selection (HITS) method. In the second phase, we use the term frequency-inverse document frequency (TFIDF) measure to investigate the correlation between the page that contains a link and the linked-to pages, in order to further support the findings of the first phase. We then show how to use these intermediate results to point out problematic website structures to the website owner.
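
The two phases can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the abstract gives no formulas, so the rank-gap comparison, the TFIDF weighting variant, the whitespace tokenizer, and every function name below are assumptions made for illustration. Only HITS is shown for the link analysis; the social network measures and Weighted PageRank are omitted.

```python
import math
from collections import Counter


def hits(links, iterations=50):
    """Plain HITS on a directed graph given as {page: [linked pages]}.

    Returns (hub, authority) score dictionaries, L2-normalized.
    """
    pages = set(links) | {q for targets in links.values() for q in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority of a page: sum of hub scores of the pages linking to it.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += hub[src]
        # Hub score of a page: sum of authority scores of the pages it links to.
        hub = {p: sum(auth[q] for q in links.get(p, [])) for p in pages}
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth


def rank_positions(scores):
    """Map each page to its position (0 = best) when sorted by score."""
    ordered = sorted(scores, key=lambda p: (-scores[p], p))
    return {p: i for i, p in enumerate(ordered)}


def rank_gaps(links, visits):
    """Assumed phase-one comparison: structural rank minus usage rank.

    A positive gap marks a page that visitors use more than its link
    structure suggests; a negative gap marks the opposite.
    """
    _, auth = hits(links)
    structural = rank_positions(auth)
    usage = rank_positions({p: visits.get(p, 0) for p in auth})
    return {p: structural[p] - usage[p] for p in auth}


def tfidf_vectors(docs):
    """TFIDF vectors for {page: text}, using tf = count / length and idf = log(N / df)."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    df = Counter(t for tokens in tokenized.values() for t in set(tokens))
    vectors = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        vectors[name] = {
            t: (c / len(tokens)) * math.log(n_docs / df[t])
            for t, c in counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


# Hypothetical toy site: link structure, visit counts from a log, page texts.
links = {"home": ["products", "about"], "products": ["item"],
         "about": ["home"], "item": []}
visits = {"home": 100, "products": 20, "about": 5, "item": 80}
texts = {"home": "welcome shop shoes", "products": "shoes catalog list",
         "about": "company history", "item": "red shoes price"}

gaps = rank_gaps(links, visits)
vectors = tfidf_vectors(texts)
similarity = cosine(vectors["products"], vectors["item"])
```

In this sketch, a page with a large positive gap whose incoming links also come from pages with low TFIDF similarity would be a candidate to report to the website owner; the thresholds for both signals are left open because the abstract does not specify them.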

Author information

Authors and Affiliations

Department of Computing, School of Computing Informatics & Media, University of Bradford, Bradford, UK
Mohamad Xagi & Mick Ridley
Department of Computer Science, University of Calgary, Calgary, Alberta, Canada
Abdelghani Guerbas, Keivan Kianmehr, Reda Alhajj & Jon Rokne
Department of Information Technology, Hellenic American University, Athens, Greece
Panagiotis Karampelas & Reda Alhajj
Department of Computer Science, Global University, Beirut, Lebanon
Reda Alhajj

Editor information

Editors and Affiliations

The Maersk Mc-Kinney Moller Institute, 5230, Odense, Denmark
Nasrullah Memon
Department of Computer Science, University of Calgary, Calgary, AB, Canada
Reda Alhajj

About this chapter

Cite this chapter

Xagi, M. et al. (2010). Employing Social Network Construction and Analysis in Web Structure Optimization. In: Memon, N., Alhajj, R. (eds) From Sociology to Computing in Social Networks. Springer, Vienna. https://doi.org/10.1007/978-3-7091-0294-7_2

DOI: https://doi.org/10.1007/978-3-7091-0294-7_2
Publisher Name: Springer, Vienna
Print ISBN: 978-3-7091-0293-0
Online ISBN: 978-3-7091-0294-7
eBook Packages: Computer Science, Computer Science (R0)
