Identification of Writing Preferences in Wikipedia

  • Conference paper
Complex Networks & Their Applications XII (COMPLEX NETWORKS 2023)

Part of the book series: Studies in Computational Intelligence (SCI, volume 1144)

Abstract

In this paper, we investigate whether there is a standardized writing composition for articles in Wikipedia and, if so, what it entails. By employing a Neural Gas approximation to the topology of our dataset, we generate a graph that represents various prevalent textual compositions adopted by the texts in our dataset. Subsequently, we examine significantly attractive regions within our graph by tracking the evolution of articles over time. Our observations reveal the coexistence of different stable compositions and the emergence and disappearance of certain unstable compositions over time.

We thank the LABEX ASLAN (ANR-10-LABX-0081) of Université de Lyon for its financial support within the program “Investissements d’Avenir” (ANR-11-IDEX-0007) of the French government operated by the National Research Agency (ANR).

Author information

Corresponding author

Correspondence to Jean-Baptiste Chaudron.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Chaudron, JB., Magué, JP., Vigier, D. (2024). Identification of Writing Preferences in Wikipedia. In: Cherifi, H., Rocha, L.M., Cherifi, C., Donduran, M. (eds) Complex Networks & Their Applications XII. COMPLEX NETWORKS 2023. Studies in Computational Intelligence, vol 1144. Springer, Cham. https://doi.org/10.1007/978-3-031-53503-1_9

  • DOI: https://doi.org/10.1007/978-3-031-53503-1_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-53502-4

  • Online ISBN: 978-3-031-53503-1

  • eBook Packages: Engineering, Engineering (R0)
