Abstract
This paper presents a method for identifying and assessing reliable internet sources about companies. We first identified 516,586 Wikipedia articles about companies across 310 language versions, then extracted the references they contain and analyzed them using three different models for article quality assessment. The result is a ranking of reliable sources. We found that several sources are shared across many language versions, but each language usually has its own specific sources. The ranking can help Wikipedia editors looking for source material for their articles, and companies can use it in their public relations activities. Moreover, the method can be used to maintain a list of reliable internet sources automatically.
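The pipeline summarized in the abstract (collect references from company articles, weight them by article quality, rank the sources per language) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the three quality models, so a single numeric quality score per article is assumed here, a source is approximated by the host name of the reference URL, and the names `source_of` and `rank_sources` are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def source_of(url):
    """Return the host name of a reference URL without a leading 'www.'.

    Assumption: the host name stands in for the "source"; a real pipeline
    would likely need a more careful domain extraction.
    """
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def rank_sources(articles):
    """Rank sources per language by summed article quality.

    `articles` is an iterable of (language, quality, reference_urls) tuples,
    where `quality` is an assumed numeric score for the citing article.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for language, quality, urls in articles:
        # Count each source once per article so repeated citations
        # inside a single article do not inflate its score.
        for source in {source_of(u) for u in urls} - {""}:
            scores[language][source] += quality
    # Highest score first; ties are broken alphabetically by source name.
    return {
        language: sorted(by_source.items(), key=lambda kv: (-kv[1], kv[0]))
        for language, by_source in scores.items()
    }
```

Calling `rank_sources` on a list of such tuples returns one ordered list of `(source, score)` pairs per language, which mirrors the finding that rankings differ between language versions.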
Acknowledgement
This research is supported by the project “OpenFact – artificial intelligence tools for verification of the veracity of information sources and fake news detection” (INFOSTRATEG-I/0035/2021-00), granted within the INFOSTRATEG I program of the National Center for Research and Development, under the topic: Verifying information sources and detecting fake news.
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lewoniewski, W., Węcel, K., Abramowicz, W. (2023). Companies in Multilingual Wikipedia: Articles Quality and Important Sources of Information. In: Ziemba, E., Chmielarz, W., Wątróbski, J. (eds) Information Technology for Management: Approaches to Improving Business and Society. FedCSIS-AIST ISM 2022. Lecture Notes in Business Information Processing, vol 471. Springer, Cham. https://doi.org/10.1007/978-3-031-29570-6_3
DOI: https://doi.org/10.1007/978-3-031-29570-6_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-29569-0
Online ISBN: 978-3-031-29570-6
eBook Packages: Computer Science, Computer Science (R0)