DOI: 10.1145/2487575.2487661
research-article

Collaborative boosting for activity classification in microblogs

Published: 11 August 2013

Abstract

Users' daily activities, such as dining and shopping, inherently reflect their habits, intents and preferences, and thus provide invaluable information for services such as personalized information recommendation and targeted advertising. Users' activity information, although ubiquitous on social media, has largely gone unexploited. This paper addresses the task of user activity classification in microblogs, where users publish short messages and maintain social networks online. We identify the importance of modeling a user's individuality and of exploiting the opinions of the user's friends for accurate activity classification. In this light, we propose a novel collaborative boosting framework comprising a text-to-activity classifier for each user and a mechanism for collaboration between the classifiers of socially connected users. Two collaborating classifiers exchange both their training instances and their dynamically changing labeling decisions. We propose an iterative learning procedure formulated as gradient descent in function space, in which opinion exchange between classifiers is implemented as weighted voting in each learning iteration. Experiments on real-world data from Sina Weibo show that our method outperforms existing off-the-shelf algorithms that do not take users' individuality or social connections into account.
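
The abstract gives no pseudocode, so the following is only a rough illustrative sketch of the general idea, not the authors' algorithm. It assumes a binary task (+1 for one activity, -1 for another), word-presence stumps as weak learners, an exponential loss, and a single hypothetical `trust` weight applied to both borrowed friend instances and friend votes. All function names and the toy data are invented for illustration.

```python
import math

def stump(word, polarity):
    # Weak learner: returns `polarity` if the word is present, else -polarity.
    return lambda text: polarity if word in text.split() else -polarity

def score(ensemble, text):
    # Additive model F(x) = sum_t alpha_t * h_t(x).
    return sum(alpha * h(text) for alpha, h in ensemble)

def vote(models, friends, user, text, trust=0.5):
    # Weighted vote: the user's own score plus down-weighted friend scores.
    total = score(models[user], text)
    for f in friends.get(user, []):
        total += trust * score(models[f], text)
    return total

def train(data, friends, rounds=5, trust=0.5):
    # data: {user: [(text, label in {-1, +1}), ...]}
    models = {u: [] for u in data}
    vocab = sorted({w for insts in data.values()
                    for t, _ in insts for w in t.split()})
    for _ in range(rounds):
        new_terms = {}
        for u in data:
            # Instance exchange: own instances at weight 1, friends' at `trust`.
            pool = [(t, y, 1.0) for t, y in data[u]]
            for f in friends.get(u, []):
                pool += [(t, y, trust) for t, y in data[f]]
            # Exponential-loss weights, computed from the current voted score
            # (the "dynamically changing" opinions of the user and friends).
            w = [c * math.exp(-y * vote(models, friends, u, t, trust))
                 for t, y, c in pool]
            z = sum(w)
            best = None
            for word in vocab:
                for pol in (1, -1):
                    h = stump(word, pol)
                    err = sum(wi for wi, (t, y, _) in zip(w, pool)
                              if h(t) != y) / z
                    if best is None or err < best[0]:
                        best = (err, h)
            err, h = best
            err = min(max(err, 1e-6), 1 - 1e-6)  # keep the step size finite
            new_terms[u] = (0.5 * math.log((1 - err) / err), h)
        # Update all users together so each round uses the same opinions.
        for u, term in new_terms.items():
            models[u].append(term)
    return models

# Toy data (invented): +1 = dining, -1 = shopping.
data = {
    "a": [("pizza dinner tonight", 1), ("bought new shoes", -1)],
    "b": [("lunch noodles", 1), ("mall sale shoes", -1)],
}
friends = {"a": ["b"], "b": ["a"]}
models = train(data, friends)
```

In this sketch each user keeps a separate ensemble (individuality), while friends influence both the training pool and the weights recomputed in every round (collaboration). How the paper actually weights friends and handles multiple activity classes is not stated in the abstract.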

    Published In

    KDD '13: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
    August 2013
    1534 pages
    ISBN:9781450321747
    DOI:10.1145/2487575

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. activity classification
    2. boosting
    3. collaborative classification
    4. social regularization

    Conference

    KDD '13

    Acceptance Rates

    KDD '13 Paper Acceptance Rate 125 of 726 submissions, 17%;
    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
