Quantitative Analysis of the Top Ten Wikipedias

  • Conference paper
Software and Data Technologies (ICSOFT 2007, ENASE 2007)

Abstract

In just a few years, Wikipedia has become one of the information systems with the largest audience on the Internet. Built on a relatively simple architecture, it has proven capable of supporting the largest and most diverse community of collaborative authorship worldwide. Using a quantitative methodology based on the analysis of public Wikipedia databases, we describe the main characteristics of the 10 largest language editions and of the authors who work on them. The methodology is generic enough to be applied to the remaining editions, providing a convenient framework for developing a complete quantitative analysis of Wikipedia. Among other parameters, we study the evolution of the number of contributions and articles, their size, and the differences in contributions among authors, inferring some relationships between contribution patterns and content. These relationships reflect (and in part explain) the evolution of the different language editions so far, as well as their future trends.
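As an illustration of how "differences in contributions by different authors" might be quantified, the sketch below computes a Gini coefficient over per-author contribution counts. The abstract does not say which inequality measure the paper uses, so the choice of the Gini coefficient, the function name `gini`, and the sample counts are all assumptions made for illustration only.

```python
def gini(contributions):
    """Gini coefficient of non-negative per-author contribution counts.

    Returns 0.0 for perfectly equal contributions and approaches 1.0
    as contributions concentrate in a single author.
    Hypothetical helper: not taken from the paper.
    """
    values = sorted(contributions)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Weighted sum using 1-based ranks over the ascending-sorted values.
    weighted = sum(rank * v for rank, v in enumerate(values, start=1))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n


if __name__ == "__main__":
    # Made-up edit counts for four authors, for demonstration only.
    print(gini([5, 5, 5, 5]))
    print(gini([0, 0, 0, 10]))
```

A value near 0 would indicate that authors contribute roughly equally, while a value near 1 would indicate that a small core of authors produces most of the contributions.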

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ortega, F., Gonzalez-Barahona, J.M., Robles, G. (2008). Quantitative Analysis of the Top Ten Wikipedias. In: Filipe, J., Shishkov, B., Helfert, M., Maciaszek, L.A. (eds) Software and Data Technologies. ICSOFT 2007, ENASE 2007. Communications in Computer and Information Science, vol 22. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88655-6_19

  • DOI: https://doi.org/10.1007/978-3-540-88655-6_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-88654-9

  • Online ISBN: 978-3-540-88655-6

  • eBook Packages: Computer Science (R0)
