Abstract
Defective metadata is a significant problem in digital libraries. So far, research has focused on automatic error detectors. However, recent public projects have shown that patrons are willing to invest time in reporting errors when they are asked to contribute. In this case study, we analyze the community contribution to error detection for DBLP, a public bibliographic collection. Our study is based on e-mails sent to the project between January 2007 and November 2010. We identify error reports both manually and automatically and analyze their contribution to corrections of the DBLP collection. We show that users frequently report certain types of defects while ignoring others. The detection of homonym-name inconsistencies in particular depends strongly on user input. We also discuss who sends the reports and which communities are particularly active in this matter.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Reitz, F., Hoffmann, O. (2011). Did They Notice? – A Case-Study on the Community Contribution to Data Quality in DBLP. In: Gradmann, S., Borri, F., Meghini, C., Schuldt, H. (eds) Research and Advanced Technology for Digital Libraries. TPDL 2011. Lecture Notes in Computer Science, vol 6966. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24469-8_22
DOI: https://doi.org/10.1007/978-3-642-24469-8_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24468-1
Online ISBN: 978-3-642-24469-8
eBook Packages: Computer Science (R0)