Abstract
Prior work on automated annotation of human values has sought to train text classifiers to label text spans with specific human values such as freedom, justice, or safety. That framing conflates three tasks: (1) selecting the documents to be labeled, (2) selecting the text spans that express or reflect human values, and (3) assigning labels to those spans. This paper proposes a three-stage model in which a separate system can be trained and optimized for each stage. Experiments on the first stage, document selection, indicate that annotation diversity matters more than annotation quality, suggesting that when multiple annotators are available, the traditional practice of adjudicating conflicting annotations of the same documents is less cost-effective than having each annotator label different documents. Preliminary results for the second stage, selecting value sentences, indicate that high recall (94%) can be achieved with precision (above 80%) that seems suitable for use in a multi-stage annotation pipeline. The annotations created for these experiments are being made freely available, and the annotated content is available from commercial sources at modest cost.
Notes
1. Two documents in Round 9 were discovered to be duplicates and were removed.
2. We selected the following fastText parameters: -dim 50; -loss ns (negative sampling); -epoch 1000; -wordNgrams 5; -minCount 2.
3. We have also generated learning curves using all of the adjudicated annotations, again plotting only at even values of the number of original annotations. This yields similar results.
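As an illustration of Note 2, the reported parameters could be passed to the fastText command-line tool roughly as sketched below. This is a sketch under stated assumptions, not the authors' actual invocation: the notes do not say which fastText mode was used, so the `supervised` subcommand is assumed here, and the file names `train.txt` and `value_model` as well as the helper `build_fasttext_command` are hypothetical. The sketch only assembles the command; running it would require the fastText binary and a training file in fastText's `__label__` format.

```python
import shlex

# Hyperparameters as listed in Note 2 ("ns" is fastText's name for negative sampling).
FASTTEXT_PARAMS = {
    "dim": 50,
    "loss": "ns",
    "epoch": 1000,
    "wordNgrams": 5,
    "minCount": 2,
}


def build_fasttext_command(train_path, model_prefix, params=FASTTEXT_PARAMS):
    """Return the argument list for a fastText training run.

    Assumption: the classifier is trained with the `supervised` subcommand.
    `train_path` and `model_prefix` are hypothetical placeholders.
    """
    args = ["fasttext", "supervised", "-input", train_path, "-output", model_prefix]
    for name, value in params.items():
        # Each parameter becomes a "-name value" pair on the command line.
        args += [f"-{name}", str(value)]
    return args


command = build_fasttext_command("train.txt", "value_model")
# Print a shell-ready string; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```

Keeping the parameters in one dictionary makes it easy to vary a single setting (for example, the number of epochs) while leaving the rest of the reported configuration fixed.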
Acknowledgements
This work has been supported in part by JSPS KAKENHI Grant Number JP18H03495.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ishita, E. et al. (2019). Toward Three-Stage Automation of Annotation for Human Values. In: Taylor, N., Christian-Lamb, C., Martin, M., Nardi, B. (eds) Information in Contemporary Society. iConference 2019. Lecture Notes in Computer Science, vol 11420. Springer, Cham. https://doi.org/10.1007/978-3-030-15742-5_18
DOI: https://doi.org/10.1007/978-3-030-15742-5_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-15741-8
Online ISBN: 978-3-030-15742-5
eBook Packages: Computer Science, Computer Science (R0)