Web Article Quality Assessment in Multi-dimensional Space

Han, Jingyu; Fu, Xiong; Chen, Kejia; Wang, Chuandong

doi:10.1007/978-3-642-23535-1_20

Jingyu Han²¹,
Xiong Fu²¹,
Kejia Chen²¹ &
…
Chuandong Wang²¹

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6897))

Included in the following conference series:

International Conference on Web-Age Information Management

1720 Accesses

Abstract

Nowadays user-generated content (UGC) such as Wikipedia, is emerging on the web at an explosive rate, but its data quality varies dramatically. How to effectively rate the article’s quality is the focus of research and industry communities. Considering that each quality class demonstrates its specific characteristics on different quality dimensions, we propose to learn the web quality corpus by taking different quality dimensions into consideration. Each article is regarded as an aggregation of sections and each section’s quality is modelled using Dynamic Bayesian Network(DBN) with reference to accuracy, completeness and consistency. Each quality class is represented by three dimension corpora, namely accuracy corpus, completeness corpus and consistency corpus. Finally we propose two schemes to compute quality ranking. Experiments show our approach performs well.

This work is fully supported by National Natural Science Foundation of China under grant No.61003040, 60903181.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 99.00; Price excludes VAT (USA)

Softcover Book: USD 129.00; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Hu, M., Lim, E.P., Sun, A.: Measuring article quality in wikipedia: Models and evaluation. In: Proc. of the sixteenth CIKM, pp. 243–252 (2007)
Google Scholar
Aebi, D., Perrochon, L.: Towards improving data quality. In: Proc. of the International Conference on Information Systems and Management of Data, pp. 273–281 (1993)
Google Scholar
Wang, R.Y., Kon, H.B., Madnick, S.E.: Data quality requirements analysis and modeling. In: Proc. of the Ninth International Conference on Data Engineering, pp. 670–677 (1993)
Google Scholar
Pernici, B., Scannapieco, M.: Data quality in web information systems. In: Spaccapietra, S., March, S.T., Kambayashi, Y. (eds.) ER 2002. LNCS, vol. 2503, pp. 397–413. Springer, Heidelberg (2002)
Chapter Google Scholar
Stvilia, B., Twidle, M.B., Smith, L.C.: Assessing information quality of a community-based encyclopedia. In: Proc. of the International Conference on Information Quality, pp. 442–454 (2005)
Google Scholar
Rassbach, L., Pincock, T., Mingus, B.: Exploring the feasibility of automatically rating online article quality (2008)
Google Scholar
Dalip, D.H., Cristo, M., Calado, P.: Automatic quality assessment of content created collaboratively by web communities: A case study of wikipedia. In: Proc. of JCDL 2009, pp. 295–304 (2009)
Google Scholar
Zeng, H., Alhossaini, M.A., Ding, L.: Computing trust from revision history. In: Proc. of the 2006 International Conference on Privacy, Security and Trust: Bridge the Gap Between PST Technologies and Business Services (2006)
Google Scholar
Zeng, H., Alhossaini, M.A., Fikes, R., McGuinness, D.L.: Mining revision history to assess trustworthiness of article fragments. In: Proc. of International Conference on Collaborative Computing: Networking, Applications and Worksharing, pp. 1–10 (2009)
Google Scholar
Pipino, L.L., Lee, Y.W., Wang, R.Y.: Data quality assessment. Communications of the ACM 45(4), 211–218 (2002)
Article Google Scholar
Batini, C., Cappiello, C., Francalanci, C., Maurino, A.: Methodologies for data quality assessment and improvement. ACM Computing Surveys 41(3), 1–52 (2009)
Article Google Scholar

Download references

Author information

Authors and Affiliations

School of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing, China, 210003
Jingyu Han, Xiong Fu, Kejia Chen & Chuandong Wang

Authors

Jingyu Han
View author publications
You can also search for this author in PubMed Google Scholar
Xiong Fu
View author publications
You can also search for this author in PubMed Google Scholar
Kejia Chen
View author publications
You can also search for this author in PubMed Google Scholar
Chuandong Wang
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Microsoft Research Asia, 5 Danling Rd., Haidian District, 100190, Beijing, China
Haixun Wang
Computer School, Wuhan University, 16 Luojiashan Road, 430072, Hubei, China
Shijun Li
Graduate School of Information Science and Technology, Hokkaido University, Kita 14, Nishi 9, Kita-ku, 060-0814, Hokkaido, Sapporo, Japan
Satoshi Oyama
College of Information Science and Technology, Drexel University, 19104, Philadelphia, PA, USA
Xiaohua Hu
State Key Laboratory of Software Engineering, Wuhan University, 16 Luojiashan Road, 430072, Wuhan, Hubei, China
Tieyun Qian

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Han, J., Fu, X., Chen, K., Wang, C. (2011). Web Article Quality Assessment in Multi-dimensional Space. In: Wang, H., Li, S., Oyama, S., Hu, X., Qian, T. (eds) Web-Age Information Management. WAIM 2011. Lecture Notes in Computer Science, vol 6897. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23535-1_20

Download citation

DOI: https://doi.org/10.1007/978-3-642-23535-1_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23534-4
Online ISBN: 978-3-642-23535-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics