Abstract
Cultural modeling (CM) is an emerging and promising research area in social computing. It aims to develop behavioral models of human groups and to analyze the impact of cultural factors on human group behavior using computational methods. Machine learning methods, in particular classification, play a critical role in such applications. Because culture-related data sets differ in their characteristics, it is important to understand computationally how various machine learning methods perform on them. In this paper, we investigate the performance of seven representative classification algorithms on a benchmark cultural modeling data set and analyze the experimental results with respect to group behavior forecasting.
Additional information
This work is supported in part by the National Natural Science Foundation of China under Grant Nos. 60621001, 60875028, 60875049, and 70890084, the Ministry of Science and Technology of China under Grant No. 2006AA010106, and the Chinese Academy of Sciences under Grant Nos. 2F05N01, 2F08N03 and 2F07C01.
Cite this article
Li, XC., Mao, WJ., Zeng, D. et al. Performance Evaluation of Machine Learning Methods in Cultural Modeling. J. Comput. Sci. Technol. 24, 1010–1017 (2009). https://doi.org/10.1007/s11390-009-9290-8
DOI: https://doi.org/10.1007/s11390-009-9290-8