ABSTRACT
In this paper, we address the challenge of modeling the size, duration, and temporal dynamics of short-lived crowds that manifest in social media. Successful population modeling for crowds is critical for many services, including location recommendation, traffic prediction, and advertising. However, crowd modeling is challenging because 1) user-contributed data in social media is noisy and often incomplete, in the sense that users reveal through posts when they join a crowd but not when they depart; and 2) the size of short-lived crowds typically changes rapidly, growing and shrinking in sharp bursts. Toward robust population modeling, we first propose a duration model to predict the time users spend in a particular crowd. We then propose a time-evolving population model for estimating the number of people departing a crowd, which in turn enables prediction of the total population remaining in the crowd. Based on these population models, we further describe an approach for predicting the number of posts generated from a crowd. We validate the crowd models through extensive experiments over 22 million geolocation-based check-ins and 120,000 event-related tweets.
Index Terms
- How big is the crowd?: event and location based population modeling in social media