My City, My Voice: Listening to the Citizen Views from Web Sources

Authors:
Manjira Sinha, IIT Kharagpur
Satarupa Guha, Microsoft India (R&D)
Preethy Varma, Envestnet Yodlee
Tridib Mukherjee, Play Games 24X7
Sandya Mannarswamy, Conduent Labs India

CODS-COMAD '19: Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, January 2019, Pages 52–60
https://doi.org/10.1145/3297001.3297008

Published: 03 January 2019

ABSTRACT

To foster inclusive urban management, civic agencies need to listen to the voices of citizens on web sources such as social media, online blogs, and public forums. Because online data is vast and noisy, it is challenging, yet important, to mine the actionable issues that citizens of a city face firsthand, so that the administration can take timely measures to remedy them. In this work, we filter, analyze, and model web data on the urban civic issues of a city with respect to three modalities: semantic, spatial, and temporal. We propose a novel approach that captures context through dense distributed word embeddings and identifies latent issues through a generative model. Because geo-tagged posts are scarce and reporting is often delayed, we rely primarily on the textual content of the data for location mining and temporal resolution. We present a first-of-its-kind unified system named CUrb that introduces a novel pipeline to construct a long-term topology of issues across these three dimensions, aggregated over a variety of documents. Through extensive experimentation, we demonstrate the efficacy of our system both qualitatively and quantitatively. It achieves an improvement of up to 24% over the state-of-the-art technique.
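As a rough illustration of what relying on text alone for location mining and temporal resolution could look like, the sketch below matches locality names against a small gazetteer and resolves a relative time cue against the posting date. It is not the CUrb pipeline, whose details the abstract does not give. The gazetteer entries, the cue table, and the function names `mine_locations` and `resolve_date` are all hypothetical and chosen only for the example.

```python
import re
from datetime import date, timedelta

# Hypothetical gazetteer of locality names (an assumption, not from the paper).
GAZETTEER = {"koramangala", "indiranagar"}

# Hypothetical relative-time cues mapped to day offsets from the posting date.
RELATIVE_DAYS = {"today": 0, "yesterday": -1}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def mine_locations(text):
    """Return gazetteer localities mentioned in the text, in order of first appearance."""
    found = []
    for tok in tokenize(text):
        if tok in GAZETTEER and tok not in found:
            found.append(tok)
    return found


def resolve_date(text, posted_on):
    """Resolve the first relative-time cue against the posting date.

    Falls back to the posting date when the text carries no known cue.
    """
    for tok in tokenize(text):
        if tok in RELATIVE_DAYS:
            return posted_on + timedelta(days=RELATIVE_DAYS[tok])
    return posted_on


post = "Garbage has not been cleared in Koramangala since yesterday"
print(mine_locations(post))
print(resolve_date(post, date(2018, 6, 15)))
```

A real system would need multi-word place names, disambiguation between places that share a name, and a much richer set of temporal expressions. The sketch only shows that both signals can be recovered from text when a geo-tag or an accurate event timestamp is missing.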


Published in
CODS-COMAD '19: Proceedings of the ACM India Joint International Conference on Data Science and Management of Data
January 2019, 380 pages
ISBN: 9781450362078
DOI: 10.1145/3297001
General Chairs: Lipika Dey (TCS Innovation Labs), Surajit Chaudhury (Microsoft Research)
Program Chairs: Raghu Krishnapuram (Robert Bosch Center, IISc Bangalore), Parag Singla (IIT Delhi)
Publications Chair: Rishiraj Saha Roy (Max Planck Institute for Informatics)
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Issue Identification
Natural Language Processing
Urban Informatics
Web Data Mining
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
CODS-COMAD '19 Paper Acceptance Rate: 62 of 198 submissions, 31%
Overall Acceptance Rate: 197 of 680 submissions, 29%