Abstract
Word-of-mouth marketing on social media has become more pressing as the number of users and posts grows, and estimating user attributes is important because most Twitter users do not reveal them. We propose new methods for estimating the attributes of a Twitter user from the user’s content (a profile document and tweets) and social neighbors, i.e., those whom the user has mentioned. This study makes three contributions to the task of user attribute estimation. First, we investigate a labeling method that finds users associated with a blog account and uses their blog profile attributes as true labels for training tweet data. We confirm that using the blog labels achieves higher accuracy than manual labeling and pattern matching methods with respect to four attributes (gender, age, occupation, and interests). Second, we determine the best way to combine bag-of-words features of profile documents and tweets. We evaluate nine combining methods and show that words in profile documents should be treated separately from those in tweets. Third, we reveal that adjusting the amount of information from social neighbors affects estimation accuracy. We experiment with three adjustment levels and show that our method, which uses the target user’s profile document and tweets and the neighbors’ profile documents (not their tweets), achieves the best accuracy. Overall, experiments on the estimation of the four attributes show that our method achieves higher accuracy than conventional methods that use manually labeled tweets.
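The idea of treating profile words separately from tweet words, while drawing only profile documents from neighbors, can be illustrated with a minimal sketch. The abstract does not say which of the nine combining methods performed best, so the prefix-based namespacing below is an assumption chosen for illustration, not the authors' implementation. The names `tokenize` and `build_features` and the sample texts are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for a real tokenizer)."""
    return text.lower().split()


def build_features(profile, tweets, neighbor_profiles):
    """Build a bag-of-words Counter with one namespace per text source.

    Assumption: prefixing each word with its source is one simple way to
    keep profile words distinct from tweet words.
    """
    features = Counter()
    # Words from the target user's profile document.
    features.update("profile:" + w for w in tokenize(profile))
    # Words from the target user's tweets, counted apart from profile words.
    for tweet in tweets:
        features.update("tweet:" + w for w in tokenize(tweet))
    # Words from neighbors' profile documents only; neighbors' tweets are not used.
    for doc in neighbor_profiles:
        features.update("neighbor_profile:" + w for w in tokenize(doc))
    return features


# Hypothetical example data.
features = build_features(
    profile="student in tokyo",
    tweets=["exam tomorrow", "tokyo is rainy"],
    neighbor_profiles=["university student", "tokyo engineer"],
)
```

With this layout the same surface word (here "tokyo") yields three separate features, one per source, so a downstream classifier could weight each source independently.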
Notes
- 6. A friend is another Twitter user whom you are following.
- 7. A Tweet by another user, forwarded to you by someone you follow.
- 10. The tendency of individuals to associate and bond with similar others.
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Ito, J., Nishida, K., Hoshide, T., Toda, H., Uchiyama, T. (2014). Demographic and Psychographic Estimation of Twitter Users Using Social Structures. In: Kawash, J. (eds) Online Social Media Analysis and Visualization. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-319-13590-8_2
DOI: https://doi.org/10.1007/978-3-319-13590-8_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13589-2
Online ISBN: 978-3-319-13590-8
eBook Packages: Computer Science (R0)