DOI: 10.1145/2487788.2487995

A graph-based approach to commonsense concept extraction and semantic similarity detection

Published: 13 May 2013


Commonsense knowledge representation and reasoning support a wide variety of potential applications in fields such as document auto-categorization, Web search enhancement, topic gisting, social process modeling, and concept-level opinion and sentiment analysis. Solutions to these problems, however, demand robust knowledge bases capable of supporting flexible, nuanced reasoning. Populating such knowledge bases is highly time-consuming, making it necessary to develop techniques for deconstructing natural language texts into commonsense concepts. In this work, we propose an approach for effective multi-word commonsense expression extraction from unrestricted English text, in addition to a semantic similarity detection technique allowing additional matches to be found for specific concepts not already present in knowledge bases.








Published In

WWW '13 Companion: Proceedings of the 22nd International Conference on World Wide Web
May 2013
1636 pages


  • NICBR: Núcleo de Informação e Coordenação do Ponto BR
  • CGIBR: Comitê Gestor da Internet no Brasil



Association for Computing Machinery

New York, NY, United States


Author Tags

  1. ai
  2. commonsense knowledge representation and reasoning
  3. natural language processing
  4. semantic similarity


  • Research-article


WWW '13
WWW '13: 22nd International World Wide Web Conference
May 13 - 17, 2013
Rio de Janeiro, Brazil

Acceptance Rates

WWW '13 Companion Paper Acceptance Rate 831 of 1,250 submissions, 66%;
Overall Acceptance Rate 1,899 of 8,196 submissions, 23%



Cited By

  • (2024) Integration of Symbolic and Implicit Knowledge Representation for Open Domain. 2024 Second International Conference on Intelligent Cyber Physical Systems and Internet of Things (ICoICI). DOI: 10.1109/ICoICI62503.2024.10696090 (1023-1027). Online publication date: 28-Aug-2024
  • (2024) Acquiring and Modeling Abstract Commonsense Knowledge via Conceptualization. Artificial Intelligence. DOI: 10.1016/j.artint.2024.104149 (104149). Online publication date: May-2024
  • (2024) Exploring New Horizons in Word Sense Disambiguation and Topic Modeling: Potential of Deep Learning Based Transformers Models. Digital Humanities Looking at the World. DOI: 10.1007/978-3-031-48941-9_26 (341-356). Online publication date: 20-Apr-2024
  • (2024) Substructure Discovery in Commonsense Relations Using Graph Representation Learning. Intelligent Systems and Applications. DOI: 10.1007/978-3-031-47721-8_48 (714-734). Online publication date: 10-Jan-2024
  • (2023) An experimental study measuring the generalization of fine-tuned language representation models across commonsense reasoning benchmarks. Expert Systems 40:5. DOI: 10.1111/exsy.13243. Online publication date: 10-Feb-2023
  • (2023) Knowledge Graph enhanced Aspect-Based Sentiment Analysis Incorporating External Knowledge. 2023 IEEE International Conference on Data Mining Workshops (ICDMW). DOI: 10.1109/ICDMW60847.2023.00107 (791-798). Online publication date: 4-Dec-2023
  • (2023) Video Sentiment Analysis for Child Safety. 2023 IEEE International Conference on Data Mining Workshops (ICDMW). DOI: 10.1109/ICDMW60847.2023.00106 (783-790). Online publication date: 4-Dec-2023
  • (2023) FinXABSA: Explainable Finance through Aspect-Based Sentiment Analysis. 2023 IEEE International Conference on Data Mining Workshops (ICDMW). DOI: 10.1109/ICDMW60847.2023.00105 (773-782). Online publication date: 4-Dec-2023
  • (2023) Sentiment Analysis of Primary Historical Sources. 2023 IEEE International Conference on Data Mining Workshops (ICDMW). DOI: 10.1109/ICDMW60847.2023.00104 (767-772). Online publication date: 4-Dec-2023
  • (2023) Non-Fungible Tokens: What Makes Them Valuable? 2023 IEEE International Conference on Data Mining Workshops (ICDMW). DOI: 10.1109/ICDMW60847.2023.00102 (750-756). Online publication date: 4-Dec-2023
