Abstract
The increasing volume of digital material available to the humanities creates clear potential for crowdsourcing. However, tasks in the digital humanities typically do not satisfy the standard requirement for decomposition into microtasks, each of which requires little expertise on the part of the worker and little context from the broader task. Instead, humanities tasks require scholarly knowledge to perform, and even where sub-tasks can be extracted, these often depend on the broader context of the document or corpus from which they are extracted. That is, the tasks are macrotasks, resisting simple decomposition. Building on a case study from musicology, the In Concert project, we will explore both the barriers to crowdsourcing in the creation of digital corpora and examples where elements of automatic processing or less-expert work are possible within a broader matrix that also includes expert microtasks and macrotasks. Crucially, we will see that the macrotask–microtask distinction is nuanced: it is often possible to create a partial decomposition into less-expert microtasks with residual expert macrotasks, and to do this in ways that preserve scholarly values.
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Dix, A., Cowgill, R., Bashford, C., McVeigh, S., Ridgewell, R. (2019). Crowdsourcing and Scholarly Culture: Understanding Expertise in an Age of Popularism. In: Khan, VJ., Papangelis, K., Lykourentzou, I., Markopoulos, P. (eds) Macrotask Crowdsourcing. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-030-12334-5_7
DOI: https://doi.org/10.1007/978-3-030-12334-5_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-12333-8
Online ISBN: 978-3-030-12334-5
eBook Packages: Computer Science, Computer Science (R0)