Abstract
Descriptions of images form the backbone of many intelligent systems, which assume that descriptions vary randomly in construction and content but that description content is homogeneous. This assumption becomes problematic when extended to descriptions of images of people [14], since people are known to show systematic biases in how they process others [19]. Therefore, this paper presents a novel approach for discovering exceptional subgroups of descriptions in which the content of those descriptions reliably differs from the general set of descriptions. We develop a novel interestingness measure for subgroup discovery appropriate for probability distributions across semantic representations. The proposed method is applied to a web-based experiment in which 500 raters describe images of 200 people. Our analysis identifies multiple exceptional subgroups and the attributes of the respective raters and images. We further discuss implications for intelligent systems.
Notes
- 1.
The difference between documents was calculated as the sum across all pairs of descriptions of the cosine similarity of the topic probability distributions. The number of topics per document was calculated as the sum across all descriptions of the conditional entropy of the topic probability distribution.
- 2.
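The two quantities described in Note 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the toy data are hypothetical, each description is assumed to be represented by a topic probability vector (for example, from a topic model), and the "conditional entropy" is read here as the Shannon entropy of each description's own topic distribution. Note that a larger cosine sum indicates more similar descriptions.

```python
import math
from itertools import combinations


def cosine_similarity(p, q):
    """Cosine similarity between two topic probability vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


def pairwise_cosine_sum(topic_dists):
    """Sum of cosine similarities over all unordered pairs of descriptions."""
    return sum(cosine_similarity(p, q) for p, q in combinations(topic_dists, 2))


def entropy(p):
    """Shannon entropy (in nats) of one distribution; 0 * log(0) counts as 0."""
    return -sum(a * math.log(a) for a in p if a > 0)


def entropy_sum(topic_dists):
    """Sum of per-description topic entropies (assumed reading of Note 1)."""
    return sum(entropy(p) for p in topic_dists)


# Hypothetical toy data: three descriptions over two topics.
example = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
print(pairwise_cosine_sum(example))
print(entropy_sum(example))
```

In the toy data, the first two descriptions each use a single topic and contribute zero entropy, so the entropy sum comes entirely from the third, mixed description.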
References
Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proceedings of VLDB, pp. 487–499. Morgan Kaufmann (1994)
Antol, S., et al.: VQA: visual question answering. In: Proceedings of IEEE ICCV, pp. 2425–2433 (2015)
Atzmueller, M.: Subgroup discovery. WIREs DMKD 5(1), 35–49 (2015)
Atzmueller, M., Lemmerich, F.: VIKAMINE – open-source subgroup discovery, pattern mining, and analytics. In: Flach, P.A., De Bie, T., Cristianini, N. (eds.) ECML PKDD 2012. LNCS (LNAI), vol. 7524, pp. 842–845. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33486-3_60
Atzmueller, M., Lemmerich, F.: Exploratory pattern mining on social media using geo-references and social tagging information. IJWS 2(1/2), 80–112 (2013)
Atzmueller, M., Puppe, F.: SD-Map – a fast algorithm for exhaustive subgroup discovery. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.) PKDD 2006. LNCS (LNAI), vol. 4213, pp. 6–17. Springer, Heidelberg (2006). https://doi.org/10.1007/11871637_6
Bayardo, R., Agrawal, R., Gunopulos, D.: Constraint-based rule mining in large, dense databases. Data Min. Knowl. Discov. 4, 217–240 (2000)
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. JMLR 3, 993–1022 (2003)
Borji, A., Cheng, M.M., Jiang, H., Li, J.: Salient object detection: a benchmark. IEEE Trans. Image Process. 24(12), 5706–5722 (2015)
Chrupała, G., Gelderloos, L., Alishahi, A.: Representations of language in a model of visually grounded speech signal. In: Proceedings of ACL, pp. 613–622 (2017)
Duivesteijn, W., Feelders, A.J., Knobbe, A.: Exceptional model mining. Data Min. Knowl. Discov. 30(1), 47–98 (2016)
Dwork, C., Hardt, M., Pitassi, T., Reingold, O., Zemel, R.S.: Fairness through awareness. CoRR abs/1104.3913 (2011)
Ganter, B., Wille, R.: Formal concept analysis. Wissenschaftliche Zeitschrift-Technischen Universitat Dresden 45, 8–13 (1996)
Gatt, A., et al.: Face2Text: collecting an annotated image description corpus for the generation of rich face descriptions. In: Proceedings of LREC (2018)
Herlitz, A., Lovén, J.: Sex differences and the own-gender bias in face recognition: a meta-analytic review. Visual Cogn. 21(9–10), 1306–1336 (2013)
Krishna, R., et al.: Visual genome: connecting language and vision using crowdsourced dense image annotations. Int. J. Comput. Vis. 123(1), 32–73 (2017)
Lemmerich, F., Atzmueller, M., Puppe, F.: Fast exhaustive subgroup discovery with numerical target concepts. Data Min. Knowl. Discov. 30, 711–762 (2016). https://doi.org/10.1007/s10618-015-0436-8
Lemmerich, F., Becker, M., Atzmueller, M.: Generic pattern trees for exhaustive exceptional model mining. In: Flach, P.A., De Bie, T., Cristianini, N. (eds.) ECML PKDD 2012. LNCS (LNAI), vol. 7524, pp. 277–292. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33486-3_18
Levin, D.T.: Race as a visual feature: using visual search and perceptual discrimination tasks to understand face categories and the cross-race recognition deficit. J. Exp. Psychol. Gen. 129(4), 559–574 (2000)
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Minka, T.: Estimating a Dirichlet distribution. Technical report, MIT (2000)
Sklar, M.: Fast MLE computation for the Dirichlet multinomial. arXiv:1405.0099
Torralba, A., Efros, A.A.: Unbiased look at dataset bias. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 1521–1528. IEEE (2011)
Acknowledgments
Funding for data collection was provided by a University of Adelaide Interdisciplinary Research Grant to C. Semmler, A. Hendrickson, R. Heyer, A. Dick and A. van den Hengel. This work was also partially supported by the German Research Council (DFG) under grant AT 88/4-1.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Hendrickson, A.T., Wang, J., Atzmueller, M. (2018). Identifying Exceptional Descriptions of People Using Topic Modeling and Subgroup Discovery. In: Ceci, M., Japkowicz, N., Liu, J., Papadopoulos, G., Raś, Z. (eds) Foundations of Intelligent Systems. ISMIS 2018. Lecture Notes in Computer Science, vol 11177. Springer, Cham. https://doi.org/10.1007/978-3-030-01851-1_44
DOI: https://doi.org/10.1007/978-3-030-01851-1_44
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01850-4
Online ISBN: 978-3-030-01851-1
eBook Packages: Computer Science, Computer Science (R0)