ABSTRACT
This paper describes how crowdsourcing can be incorporated as an integral part of a comprehensive technical workflow to identify, extract and validate data from large volumes of printed tabular statistics, and to transform them into operable digital datasets using current structural and descriptive standards. The recently completed digitisation project for the 1961 Census of England and Wales (commissioned by the UK's Office for National Statistics) is used to provide details on data processing, the crowdsourcing platform and tasks, crowd interaction, and validation of results. Because of the challenging nature of the source material, the multi-modal approach delivered far more complete and validated data than automated processes alone could produce.
Index Terms
- Crowdsourcing Historical Tabular Data: 1961 Census of England and Wales