Possession identification in text

CARMEN BANEA; RADA MIHALCEA

doi:10.1017/S1351324918000062

Published online by Cambridge University Press: 04 April 2018

Affiliation (both authors): Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA
e-mail: carmennb@umich.edu, mihalcea@umich.edu

Abstract

Just as industrialization matured from mass production to customization and personalization, so has the Web migrated from generic content to public disclosures of one’s most intimately held thoughts, opinions, and beliefs. This relatively new type of data can represent finer and more narrowly defined demographic slices. Whereas researchers have so far focused primarily on leveraging personalized content to identify latent information such as gender, nationality, location, or age, this article seeks to establish a structured way of extracting possessions, or items that people own or are entitled to, as a way to ultimately provide insights into people’s behaviors and characteristics. We introduce the new task of ‘possession identification in text’ and release a novel dataset where possessions are marked at different confidence levels. We present experiments and results obtained when seeking to automatically identify and extract possessions from text.

Type: Article
Information: Natural Language Engineering, Volume 24, Issue 4, July 2018, pp. 589–610

DOI: https://doi.org/10.1017/S1351324918000062
Copyright: Copyright © Cambridge University Press 2018
