Application of SEO Metrics to Determine the Quality of Wikipedia Articles and Their Sources

  • Conference paper
Information and Software Technologies (ICIST 2018)

Abstract

The leading online encyclopedia Wikipedia struggles with inconsistent article quality, a consequence of its collaborative editing model. While Wikipedia contains many helpful articles with consistent information, it also contains many questionable articles with unclear or unfinished information. The quality of each article may vary over time as different users repeatedly re-edit its content. Among the most important elements of Wikipedia articles are references, which allow readers to verify content and identify its source. Because most of these references are web pages, more information about their quality can be obtained with citation analysis tools. For science and practice, empirical proof of the quality of Wikipedia articles could also have a signal effect, since citing Wikipedia articles is not yet accepted, especially in scientific practice. This paper presents general results of an analysis of Wikipedia using metrics from the SISTRIX Toolbox, one of the leading providers of indicators for Search Engine Optimization (SEO). In addition to a preliminary analysis of Wikipedia articles as separate web pages, we extracted data from more than 30 million references in different language versions of Wikipedia and analyzed more than 180 thousand of the most popular hosts. We also compared the same sources from different geographical perspectives using country-specific visibility indices.

The original version of this chapter was replaced by an updated version. The correction to this chapter is available at https://doi.org/10.1007/978-3-319-99972-2_49

Change history

  • 18 October 2018

    A correction has been published.

Notes

  1. https://meta.wikimedia.org/wiki/List_of_Wikipedias.

  2. http://wiki.dbpedia.org.

  3. https://www.wikidata.org.

  4. https://en.wikipedia.org/wiki/Criticism_of_Wikipedia.

  5. https://www.alexa.com/siteinfo/wikipedia.org.

  6. https://en.wikipedia.org/wiki/Wikipedia:Featured_articles.

  7. http://wikirank.net.

  8. https://en.wikipedia.org/wiki/Help:Citation_Style_1.

  9. https://dumps.wikimedia.org.

  10. Extended results can be found under http://data.lewoniewski.info/bis2018seo/.

  11. http://infoboxes.net.

Author information

Corresponding author

Correspondence to Włodzimierz Lewoniewski.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Lewoniewski, W., Härting, RC., Węcel, K., Reichstein, C., Abramowicz, W. (2018). Application of SEO Metrics to Determine the Quality of Wikipedia Articles and Their Sources. In: Damaševičius, R., Vasiljevienė, G. (eds) Information and Software Technologies. ICIST 2018. Communications in Computer and Information Science, vol 920. Springer, Cham. https://doi.org/10.1007/978-3-319-99972-2_11

  • DOI: https://doi.org/10.1007/978-3-319-99972-2_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99971-5

  • Online ISBN: 978-3-319-99972-2

  • eBook Packages: Computer Science, Computer Science (R0)
