Abstract
This paper investigates personality estimation from Japanese weblog text. Among the various personality models, we focus on the Egogram, which is used in Transactional Analysis and is strongly related to the communicative behavior of individuals. Estimation is performed with a Multinomial Naïve Bayes classifier using feature words selected by information gain. The validity of this approach was evaluated on real weblog text from 551 subjects. The results show that our approach achieved a 12-25% improvement over the baseline. The feature words selected for the estimation are strongly correlated with the characteristics of the Egogram.
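As a rough illustration of the pipeline named in the abstract, the sketch below ranks words by information gain and trains a Multinomial Naïve Bayes classifier on the selected words. It is not the authors' implementation: the toy English tokens, the "high"/"low" labels for a single hypothetical ego state, the add-one smoothing, and all function names are assumptions made for this example, and real Japanese text would first need morphological analysis to produce the tokens.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(docs, labels, word):
    """Reduction in class entropy from knowing whether `word` occurs in a document."""
    present = [y for d, y in zip(docs, labels) if word in d]
    absent = [y for d, y in zip(docs, labels) if word not in d]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Return the k words with the highest information gain."""
    vocab = sorted({w for d in docs for w in d})
    ranked = sorted(vocab, key=lambda w: information_gain(docs, labels, w), reverse=True)
    return ranked[:k]


def train_mnb(docs, labels, features):
    """Multinomial Naive Bayes over the selected words, with add-one smoothing (an assumption)."""
    feats = set(features)
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for d, y in zip(docs, labels):
        word_counts[y].update(w for w in d if w in feats)
    model = {}
    for y, n_docs in class_counts.items():
        total = sum(word_counts[y].values())
        log_prior = math.log(n_docs / len(labels))
        log_like = {w: math.log((word_counts[y][w] + 1) / (total + len(feats))) for w in feats}
        model[y] = (log_prior, log_like)
    return model


def predict(model, doc):
    """Pick the class with the highest log posterior; words outside the feature set are ignored."""
    def score(y):
        log_prior, log_like = model[y]
        return log_prior + sum(log_like[w] for w in doc if w in log_like)
    return max(model, key=score)


# Toy, already-tokenized documents (hypothetical data, not from the paper).
docs = [
    ["must", "should", "rule"],
    ["should", "must", "late"],
    ["fun", "yay", "play"],
    ["play", "fun", "rule"],
]
labels = ["high", "high", "low", "low"]  # hypothetical level of one ego state

features = select_features(docs, labels, k=4)
model = train_mnb(docs, labels, features)
```

In this toy data the word "rule" appears equally often in both classes, so its information gain is zero and it is left out of the feature set. A full system would presumably train one such classifier per ego state.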
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Minamikawa, A., Yokoyama, H. (2011). Personality Estimation Based on Weblog Text Classification. In: Mehrotra, K.G., Mohan, C.K., Oh, J.C., Varshney, P.K., Ali, M. (eds) Modern Approaches in Applied Intelligence. IEA/AIE 2011. Lecture Notes in Computer Science, vol 6704. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21827-9_10
DOI: https://doi.org/10.1007/978-3-642-21827-9_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21826-2
Online ISBN: 978-3-642-21827-9
eBook Packages: Computer Science, Computer Science (R0)