ABSTRACT
Data annotation is central to many fields, including ubiquitous computing. Documenting the quality and extent of annotation is increasingly recognised as important for understanding the validity, biases and limitations of systems built using annotated data; hence, it is also relevant to regulatory and compliance needs and outcomes. However, the process of annotation often receives little attention, and is characterised in the literature as “under-described” and “invisible work”. In this tutorial, we bring together existing resources and methods to present a framework for the iterative development and evaluation of an annotation protocol, covering requirements gathering, scoping, development, documentation, piloting and evaluation, through to scaling up annotation for production use. We also explore the potential of semi-supervised approaches and state-of-the-art methods, such as the use of generative AI, to support annotation workflows, and consider how such approaches are validated and how their strengths and weaknesses are characterised. The tutorial is designed to suit people from a wide range of backgrounds, as annotation is a highly interdisciplinary task that often requires collaboration with subject matter experts from relevant fields. Participants will trial and evaluate a selection of annotation interfaces and walk through the process of evaluating the outcomes. By the end of the tutorial, participants will have developed a deeper understanding of the task of developing an annotation protocol, and of the aspects of the requirements and context that should be taken into account.
Presentations and code from this event will be shared openly in a GitHub repository.
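As an illustration of what evaluating the outcomes of a pilot annotation round can involve, the sketch below computes Cohen's kappa, a widely used chance-corrected agreement measure for two annotators. The abstract does not say which measures the tutorial uses, so the choice of kappa, the function name `cohen_kappa`, and the activity labels are assumptions made for illustration only.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators (Cohen's kappa).

    Hypothetical helper for illustration; not taken from the tutorial materials.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item set")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement if each annotator labelled at random according
    # to their own label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    all_labels = count_a.keys() | count_b.keys()
    expected = sum(count_a[label] * count_b[label] for label in all_labels) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # undefined here, so report perfect agreement by convention.
        return 1.0

    return (observed - expected) / (1 - expected)


# Made-up labels for six sensor segments from two annotators.
annotator_a = ["walk", "walk", "sit", "sit", "stand", "walk"]
annotator_b = ["walk", "sit", "sit", "sit", "stand", "walk"]

kappa = cohen_kappa(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.3f}")
```

A low value after a pilot round would typically suggest that the protocol's label definitions need revisiting before annotation is scaled up. For more than two annotators, or for data with missing labels, other measures such as Fleiss' kappa or Krippendorff's alpha may be more appropriate.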
Index Terms
- ARDUOUS: Tutorial on Annotation of useR Data for UbiquitOUs Systems - Developing a Data Annotation Protocol