DOI: 10.1145/3428658.3431756
Short paper

Automatic Content Quality Estimation Using Deep Neural Networks in Collaborative Encyclopedias on the Web

Published: 30 November 2020

Abstract

Wikipedia is based on user-generated content: anyone with internet access can write and edit its articles, and for this reason the quality of its information is often criticized. Assigning the correct quality class to Wikipedia articles is therefore crucial to the experience of both authors and readers of this large repository of information. In this paper, we present an approach to this problem based on deep neural networks. Our experiments tested three different models in search of the best possible result: the first uses a conventional deep learning architecture, while the other two use sets of semantically related quality indicators (views) to better exploit their different properties and thus improve the final prediction. Finally, we compared our results with the state-of-the-art method (Support Vector Regression with views) and achieved comparable results, with room for further improvement.
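
To make the multi-view idea from the abstract concrete, the sketch below shows one plausible way to build such a model with Keras: each group of semantically related quality indicators feeds a separate sub-network, and the branches are merged before the final quality-class prediction. Everything here is illustrative; the view names, feature counts, layer sizes, number of quality classes, and optimizer are assumptions, not details taken from the paper.

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical views: groups of semantically related quality indicators.
# The names and feature counts are placeholders, not the paper's feature sets.
VIEW_SIZES = {"structure": 20, "style": 15, "history": 10}
NUM_CLASSES = 6  # e.g. a Wikipedia quality scale from Stub to Featured Article

def build_multiview_model():
    inputs, branches = [], []
    for name, size in VIEW_SIZES.items():
        # One small sub-network per view, so each group of indicators is
        # transformed independently before the views are combined.
        view_input = keras.Input(shape=(size,), name=name)
        x = layers.Dense(32, activation="relu")(view_input)
        x = layers.Dense(16, activation="relu")(x)
        inputs.append(view_input)
        branches.append(x)
    merged = layers.concatenate(branches)
    merged = layers.Dense(32, activation="relu")(merged)
    output = layers.Dense(NUM_CLASSES, activation="softmax")(merged)
    model = keras.Model(inputs=inputs, outputs=output)
    # Adam is an assumed optimizer choice here, not necessarily the paper's.
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

if __name__ == "__main__":
    model = build_multiview_model()
    # Random placeholder data, only to illustrate the expected input format:
    # one feature matrix per view, keyed by the input layer names.
    features = {name: np.random.rand(8, size).astype("float32")
                for name, size in VIEW_SIZES.items()}
    labels = np.random.randint(0, NUM_CLASSES, size=(8,))
    model.fit(features, labels, epochs=1, verbose=0)
    model.summary()

This sketch frames quality assessment as classification into discrete quality classes; the Support Vector Regression baseline mentioned in the abstract instead treats the quality scale as a continuous target, so a regression variant would end with a single linear output unit and a mean-squared-error loss.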


Cited By

  • (2024) Leveraging Large Language Models and Deep Learning for Wikipedia Quality Assessment. 2024 6th International Conference on Advancements in Computing (ICAC), 558-563. https://doi.org/10.1109/ICAC64487.2024.10851135. Online publication date: 12 December 2024.


Published In

WebMedia '20: Proceedings of the Brazilian Symposium on Multimedia and the Web
November 2020
364 pages
ISBN:9781450381963
DOI:10.1145/3428658
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Sponsors

In-Cooperation

  • SBC: Brazilian Computer Society
  • CNPq: Conselho Nacional de Desenvolvimento Científico e Tecnológico
  • CGIBR: Comitê Gestor da Internet no Brasil
  • CAPES: Brazilian Higher Education Funding Council

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 November 2020


Author Tags

  1. Machine Learning
  2. Neural Networks
  3. Quality Assessment
  4. Wikipedia

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Funding Sources

  • Conselho Nacional de Desenvolvimento Científico e Tecnológico
  • Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
  • Fundação de Amparo à Pesquisa do Estado de Minas Gerais

Conference

WebMedia '20: Brazilian Symposium on Multimedia and the Web
November 30 - December 4, 2020
São Luís, Brazil

Acceptance Rates

WebMedia '20 Paper Acceptance Rate 34 of 87 submissions, 39%;
Overall Acceptance Rate 270 of 873 submissions, 31%


