Companies in Multilingual Wikipedia: Articles Quality and Important Sources of Information

Conference paper in: Information Technology for Management: Approaches to Improving Business and Society (FedCSIS-AIST 2022, ISM 2022)

Abstract

In this paper, we provide a method for the identification and assessment of reliable internet sources about companies. We first identified 516,586 Wikipedia articles related to companies in 310 language versions, and then extracted and analyzed references contained in them using three different models for article quality assessment. As a result, we compiled a ranking of reliable sources. We found that there are several universal sources shared by many languages, but usually each language has its own specific sources. Our ranking of sources can be useful for Wikipedia editors looking for source material for their articles. Companies themselves can leverage this ranking for public relations activities. Moreover, our method can be used to automatically maintain a list of reliable internet sources.
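The pipeline described in the abstract can be illustrated with a minimal sketch: extract the domains of URLs cited in each article's wikitext, then aggregate them weighted by an article quality score. This is an assumption-laden simplification, not the authors' implementation: the URL pattern, the `[0, 1]` quality scores, and the once-per-article counting are hypothetical stand-ins for the paper's three quality models and its multilingual extraction pipeline.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Rough pattern for URLs embedded in wikitext references (illustrative only).
URL_RE = re.compile(r'https?://[^\s|<\]}]+')

def extract_domains(wikitext):
    """Pull the domains of all URLs cited in a page's wikitext."""
    domains = []
    for url in URL_RE.findall(wikitext):
        host = urlparse(url).netloc.lower()
        domains.append(host.removeprefix('www.'))
    return domains

def rank_sources(articles):
    """Aggregate quality-weighted domain counts over (wikitext, quality) pairs.

    Each domain is counted once per article, weighted by that article's
    (hypothetical) quality score in [0, 1]; higher totals suggest sources
    that reliable, high-quality articles tend to cite.
    """
    totals = defaultdict(float)
    for wikitext, quality in articles:
        for domain in set(extract_domains(wikitext)):
            totals[domain] += quality
    return sorted(totals.items(), key=lambda item: -item[1])
```

Run per language version, such an aggregation would surface both the universal sources shared across languages and the language-specific ones the paper reports.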



Acknowledgement

This research is supported by the project “OpenFact – artificial intelligence tools for verification of the veracity of information sources and fake news detection” (INFOSTRATEG-I/0035/2021-00), granted within the INFOSTRATEG I program of the National Center for Research and Development, under the topic: Verifying information sources and detecting fake news.

Author information

Correspondence to Włodzimierz Lewoniewski.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Lewoniewski, W., Węcel, K., Abramowicz, W. (2023). Companies in Multilingual Wikipedia: Articles Quality and Important Sources of Information. In: Ziemba, E., Chmielarz, W., Wątróbski, J. (eds) Information Technology for Management: Approaches to Improving Business and Society. FedCSIS-AIST 2022, ISM 2022. Lecture Notes in Business Information Processing, vol 471. Springer, Cham. https://doi.org/10.1007/978-3-031-29570-6_3

  • DOI: https://doi.org/10.1007/978-3-031-29570-6_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-29569-0

  • Online ISBN: 978-3-031-29570-6

  • eBook Packages: Computer Science (R0)
