First Place Solution for NLPCC 2017 Shared Task Social Media User Modeling

  • Conference paper
Natural Language Processing and Chinese Computing (NLPCC 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10619)

Abstract

With the popularity of the mobile Internet, many social networking applications let users share their personal information. Leveraging this information, such as tweets, preferences and locations, for user profiling is of high commercial value. The shared task comprises two user profiling subtasks. Subtask one is to predict the Point-of-Interest (POI) a user will check in at. We combined the results of multiple approaches, including user-based collaborative filtering (CF) and social-based CF, to predict the locations. Subtask two is to predict the users’ gender. We divided the users into two groups, depending on whether the user has posted or not, and treated the subtask as a classification task. Our results achieved first place in both subtasks.
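As a rough illustration of how user-based CF and social-based CF scores might be combined for POI prediction, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the similarity measure, the combination rule, or the data format, so the cosine similarity on binary check-in sets, the friend-count vote, the weight `alpha`, and the `checkins`/`friends` dictionaries are all assumptions made for this example.

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sets of visited POIs (binary vectors)."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(user, checkins, friends, alpha=0.5, top_k=3):
    """Rank POIs the user has not visited yet.

    checkins: dict mapping user id -> set of POI ids (hypothetical format)
    friends:  dict mapping user id -> set of friend ids (hypothetical format)
    alpha:    illustrative weight of the user-based score versus the social score
    """
    visited = checkins.get(user, set())
    my_friends = friends.get(user, set())
    user_scores = defaultdict(float)    # user-based CF: similar users vote
    social_scores = defaultdict(float)  # social-based CF: friends vote

    for other, pois in checkins.items():
        if other == user:
            continue
        sim = cosine(visited, pois)
        is_friend = other in my_friends
        for poi in pois - visited:
            user_scores[poi] += sim          # weighted by check-in similarity
            if is_friend:
                social_scores[poi] += 1.0    # one vote per friend who visited

    combined = {
        poi: alpha * user_scores[poi] + (1 - alpha) * social_scores[poi]
        for poi in user_scores
    }
    # Sort by score (descending), then by POI id for a deterministic order.
    ranked = sorted(combined, key=lambda poi: (-combined[poi], poi))
    return ranked[:top_k]
```

In this toy setup the two scores live on different scales (similarity sums versus friend counts), so a real system would likely normalise them or learn the weight before combining.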

Acknowledgments

This research is supported by the National Key Research and Development Program of China (No. 2016YFB1001103) and the Natural Science Foundation of China (No. 61572098, 61572102).

Author information

Correspondence to Jian Wang.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Qian, L., Wang, A., Wang, Y., Huang, Y., Wang, J., Lin, H. (2018). First Place Solution for NLPCC 2017 Shared Task Social Media User Modeling. In: Huang, X., Jiang, J., Zhao, D., Feng, Y., Hong, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2017. Lecture Notes in Computer Science, vol 10619. Springer, Cham. https://doi.org/10.1007/978-3-319-73618-1_6

  • DOI: https://doi.org/10.1007/978-3-319-73618-1_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-73617-4

  • Online ISBN: 978-3-319-73618-1

  • eBook Packages: Computer Science, Computer Science (R0)
