Commonsense based text mining on urban policy

  • Original Paper
  • Published in: Language Resources and Evaluation

Abstract

Local laws on urban policy, i.e., ordinances, directly affect our daily lives in various ways (health, business, etc.), yet in practice many citizens find them opaque and complex. This article presents an approach to make urban policy more accessible and comprehensible to the general public and to government officials, while also addressing pertinent social media postings. Because of the intricacies of natural language, ranging from complex legalese in ordinances to informal lingo in tweets, it is practical to harness human judgment here. To this end, we mine ordinances and tweets via reasoning based on commonsense knowledge so as to better account for pragmatics and semantics in the text. Ours is pioneering work in ordinance mining, and thus no prior labeled training data is available for learning. This gap is filled by commonsense knowledge, a prudent choice when adequate training data is lacking. Ordinance mining can help the public fathom policies and help officials assess policy effectiveness based on public reactions. This work contributes to smart governance, promoting transparency in governing processes via public involvement. We focus largely on ordinances that contribute to smart cities; hence an important goal is to assess how well an urban region is progressing towards becoming a smart city, based on how its policies map to smart city characteristics and on the corresponding public satisfaction.
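To make the idea of mapping policies and public reactions to smart city characteristics more concrete, the following is a toy sketch only, not the authors' pipeline. It assumes a small hand-written table of everyday concepts per characteristic as a stand-in for a real commonsense knowledge base, and a tiny hypothetical polarity lexicon in place of a proper sentiment resource. All names (`CHARACTERISTIC_TERMS`, `map_to_characteristics`, `polarity`, `satisfaction_by_characteristic`) and all term lists are illustrative assumptions.

```python
import re
from collections import Counter

# Hypothetical stand-in for a commonsense knowledge base: each smart city
# characteristic is linked to a few everyday concepts (hand-written, illustrative).
CHARACTERISTIC_TERMS = {
    "smart mobility": {"traffic", "bus", "bicycle", "parking", "transit"},
    "smart environment": {"pollution", "recycling", "emissions", "waste", "energy"},
    "smart living": {"health", "housing", "safety", "smoking", "food"},
}

# Hypothetical polarity lexicon for tweets (illustrative, far from complete).
POSITIVE = {"great", "love", "finally", "good", "thanks"}
NEGATIVE = {"hate", "bad", "awful", "unfair", "worse"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (no stemming)."""
    return re.findall(r"[a-z]+", text.lower())


def map_to_characteristics(text):
    """Return the characteristics whose concept set overlaps the text's tokens."""
    tokens = set(tokenize(text))
    return sorted(name for name, terms in CHARACTERISTIC_TERMS.items() if tokens & terms)


def polarity(tweet):
    """Classify a tweet by counting positive and negative lexicon hits."""
    counts = Counter(tokenize(tweet))
    score = sum(counts[w] for w in POSITIVE) - sum(counts[w] for w in NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def satisfaction_by_characteristic(tweets):
    """Tally tweet polarities under every characteristic each tweet maps to."""
    summary = {}
    for tweet in tweets:
        label = polarity(tweet)
        for name in map_to_characteristics(tweet):
            summary.setdefault(name, Counter())[label] += 1
    return summary


if __name__ == "__main__":
    ordinance = ("A local law in relation to reducing emissions "
                 "and expanding bicycle parking")
    tweets = [
        "Love the new bicycle lanes, finally!",
        "Parking rules are unfair and traffic is worse",
    ]
    print(map_to_characteristics(ordinance))
    print(satisfaction_by_characteristic(tweets))
```

The same term table is used for both ordinances and tweets, so policies and public reactions end up indexed by the same characteristics. Exact token matching is a deliberate simplification here; a realistic system would need lemmatization and a much richer knowledge source.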

Notes

  1. https://github.com/mpuri14/urbanpolicymining.

Acknowledgements

Manish Puri was supported by a Graduate Teaching and Research Assistantship from the Computer Science (CS) department at Montclair State University (MSU) as an MS student in CS. Aparna Varde’s research is supported by grants from NSF (USA): Award Number 2018575 on MRI: Acquisition of a High-Performance GPU Cluster for Research and Education, and Award Number 2117308 on MRI: Acquisition of a Multimodal Collaborative Robot System (MCROS) to Support Cross-Disciplinary Human-Centered Research and Education at Montclair State University. She is a visiting researcher at the Max Planck Institute for Informatics, Saarbrücken, Germany, in the research group of Dr. Gerhard Weikum, during her sabbatical. Additionally, we thank Xu Du, Boxiang Dong, Anna Feldman and Matthew Kowalski from MSU for some early inputs on this work.

Author information

Corresponding author

Correspondence to Aparna S. Varde.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Puri, M., Varde, A.S. & de Melo, G. Commonsense based text mining on urban policy. Lang Resources & Evaluation 57, 733–763 (2023). https://doi.org/10.1007/s10579-022-09584-6

  • DOI: https://doi.org/10.1007/s10579-022-09584-6
