Web Semantics: Science, Services and Agents on the World Wide Web
Automatic construction of a large-scale situation ontology by mining how-to instructions from the web
Introduction
Ontological knowledge has become a main vehicle for semantically and conceptually oriented techniques and applications such as word sense disambiguation, searching, classification, question answering, entity resolution, and context/situation-aware reasoning for personalized services. However, currently available large-scale ontologies often fail to handle the diverse task situations that arise in the real world because they do not capture the dynamic nature of people's daily lives and the associated activities. For example, automatically built ontologies such as YAGO [12], which is derived from WordNet [40] and Wikipedia [39], do not have sufficient coverage of contextual instances to reason about situations and activities arising in different domains. They do not cover activities of daily living such as shopping, driving, and weddings, for which context variables like actions, location, and time should be made available. Without a situation ontology of this kind, it would not be possible to infer what activity the user is engaged in and what actions are likely to be taken from the current situation, which can be characterized by context variables such as the current location, objects used, and time.
As a novel solution to the problem, we attempt to build a very large situation knowledge base of human activities, which is essential for context/situation-aware services, by means of text mining techniques that exploit the structure of how-to descriptions. Action-level knowledge is extracted from eHow and wikiHow, freely accessible websites that currently store more than one million articles on how to do things step by step and that collectively cover almost every domain of daily life, including business, cars, computers, education, health, travel, and weddings. An article can be converted into an instance of a situation ontology model that consists of a goal, an action sequence, and contextual ingredients that include location, time, and objects. To organize such knowledge, we have defined a situation ontology specification that includes six ontology classes (topic, goal, action, object, time, and location) and six types of semantic relations (hasTopic, hasAction, hasNextAction, hasObject, hasTime, and hasLocation), all of which are derived from the eHow articles, as in Fig. 2.
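As an illustration of the model just described, the following minimal sketch represents one how-to article as a situation instance and serializes it with the six relation names. The class and relation names come from the specification above; which class each relation attaches to (for example, hasObject on an action rather than on a goal), the node naming scheme, and the sample article are assumptions made only for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Action:
    """One step of a how-to article: a verb plus its contextual ingredients."""
    verb: str
    objects: List[str] = field(default_factory=list)
    time: Optional[str] = None
    location: Optional[str] = None


@dataclass
class Situation:
    """A situation instance: a goal under a topic, with an ordered action sequence."""
    topic: str
    goal: str
    actions: List[Action] = field(default_factory=list)

    def to_triples(self) -> List[Triple]:
        # Assumption: hasTopic and hasAction hang off the goal; the
        # remaining relations hang off individual action nodes.
        triples: List[Triple] = [(self.goal, "hasTopic", self.topic)]
        for i, action in enumerate(self.actions):
            node = f"{self.goal}/action{i}:{action.verb}"
            triples.append((self.goal, "hasAction", node))
            if i + 1 < len(self.actions):
                nxt = f"{self.goal}/action{i + 1}:{self.actions[i + 1].verb}"
                triples.append((node, "hasNextAction", nxt))
            for obj in action.objects:
                triples.append((node, "hasObject", obj))
            if action.time is not None:
                triples.append((node, "hasTime", action.time))
            if action.location is not None:
                triples.append((node, "hasLocation", action.location))
        return triples


# Hypothetical article content, not taken from eHow or wikiHow.
example = Situation(
    topic="cars",
    goal="change a flat tire",
    actions=[
        Action("loosen", objects=["lug nut"], location="roadside"),
        Action("raise", objects=["car", "jack"]),
        Action("remove", objects=["tire"]),
    ],
)

for triple in example.to_triples():
    print(triple)
```

Keeping the action sequence as an ordered list and deriving hasNextAction from adjacency is one simple way to preserve step order; other encodings are equally possible.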
We crawled the entire set of articles from the eHow and wikiHow websites and applied natural language processing (NLP) techniques to obtain a highly refined situation ontology, which can help detect the current situation of a user in daily life and suggest a solution suitable for the problem at hand, if any. The task of the employed NLP techniques is to extract actions expressed in verb form, together with the associated contextual ingredient items, from the goal and the subsequent action sequences expressed in natural language in an article. In order to put the linguistic constituents in an ontological form, we designed four additional steps: goal normalization, action normalization, action transition probability calculation, and ingredient resolution.
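To make the action transition probability step more concrete, the sketch below estimates the probability that one normalized action is immediately followed by another across a set of action sequences. This excerpt does not state how the probabilities are computed, so the maximum-likelihood estimate over adjacent pairs used here is an assumption, as are the function name `transition_probabilities` and the toy sequences.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List


def transition_probabilities(
    sequences: Iterable[List[str]],
) -> Dict[str, Dict[str, float]]:
    """Estimate P(next action | current action) from adjacent action pairs."""
    pair_counts: Dict[str, Counter] = defaultdict(Counter)
    for seq in sequences:
        # zip(seq, seq[1:]) yields each action with its immediate successor.
        for current, following in zip(seq, seq[1:]):
            pair_counts[current][following] += 1

    probs: Dict[str, Dict[str, float]] = {}
    for current, followers in pair_counts.items():
        total = sum(followers.values())
        probs[current] = {nxt: n / total for nxt, n in followers.items()}
    return probs


# Hypothetical normalized action sequences from three articles.
sequences = [
    ["preheat", "mix", "pour", "bake"],
    ["mix", "pour", "bake"],
    ["mix", "knead", "bake"],
]
probs = transition_probabilities(sequences)
print(probs["mix"])
```

In this toy data, "mix" is followed by "pour" in two of three cases and by "knead" in one. Actions that only ever appear last in a sequence, such as "bake", have no outgoing entry.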
To assess the utility of the proposed method and its outcome, we measured the accuracy and coverage of the automatically constructed ontology. Accuracy was measured on a random sample of the situation instances converted from the corresponding articles: we checked whether those instances were unambiguous and well-formed. Coverage was assessed by comparing the verbs in the resulting ontology against existing large-scale ontology-like resources, WordNet and OMICS [27].
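One simple way to picture a verb-level coverage comparison is the fraction of a reference verb vocabulary that also appears among the ontology's action verbs. The measure actually used in the evaluation is not defined in this excerpt, so the overlap ratio below, the function name `verb_coverage`, and the toy vocabularies are assumptions for illustration only.

```python
from typing import Iterable, Set


def verb_coverage(ontology_verbs: Iterable[str], reference_verbs: Iterable[str]) -> float:
    """Return the share of reference verbs that also occur in the ontology."""
    ours: Set[str] = {v.lower() for v in ontology_verbs}
    ref: Set[str] = {v.lower() for v in reference_verbs}
    if not ref:
        raise ValueError("reference vocabulary is empty")
    return len(ours & ref) / len(ref)


# Hypothetical vocabularies; real ones would come from the ontology and
# from a resource such as WordNet or OMICS.
ontology_verbs = ["Bake", "mix", "drive", "knead"]
reference_verbs = ["bake", "drive", "park", "mix", "sew"]
coverage = verb_coverage(ontology_verbs, reference_verbs)
print(f"{coverage:.2f}")
```

Swapping the two arguments gives the complementary view: how much of the ontology's verb set is already present in the reference resource.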
In this paper, we present an automatic method for situation ontology construction based on action mining from the Web, with the aim of building a large-scale situation ontology that is required to reason about user intentions (or situations) and provide relevant recommendations in a given context. Its main contribution is to show that an automatic methodology can be employed to construct a large-scale situation ontology for the situation model with high precision. Given the dynamic nature of knowledge in people's daily activities, it is critical to devise an automatic method for constructing situation ontologies. Through the application scenarios, we also show that an ontology constructed in this way can be of practical value for context-aware applications. We argue that the high accuracy of the method and the sheer size and utility of the situation ontology lend themselves to further research and development in context-aware applications involving unconstrained daily lives.
Section 2 describes the main features and drawbacks of previous work concerning situation-awareness, situation ontology, and automatic ontology construction to set the stage for our work. In Section 3, we introduce our situation ontology model and the resources from which the current situation ontology is constructed. Section 4 explains the details of our situation ontology construction process, focusing on action mining and normalization. In Section 5, we present an evaluation of the constructed ontology in terms of its accuracy and a comparison with other ontology-like resources. Section 6 shows how the newly constructed situation ontology can be utilized in situation-aware recommendation and semantic web service composition. In Section 7, we give our conclusion and discuss future directions.
Related work
The notion of context-awareness in ubiquitous computing was proposed in the 1990s to address the interaction between computer systems and environments [5]. The term situation-awareness has also been used with the same meaning [13]. The notion has received a great deal of attention because it is a basis for improving the quality of decisions in a heterogeneous, highly dynamic environment [26]. The meaning of information about the perceived objects can be correctly determined when the situation or
Situation ontology
In this section, we introduce our situation model and situation ontology specification, which are driven by the content of the how-to knowledge sources, eHow and wikiHow. The model is intended to hold the action knowledge available in these resources, rather than taking a prescriptive approach for general purposes. In addition, the details of the knowledge sources are presented.
Situation ontology construction: goal-action mining
The goal of our ontology construction process is to derive an explicit specification of goals and associated actions from how-to instructions that people have created, so that they can serve as a conceptualization of situations that arise in daily lives. As depicted in Fig. 4, there are two main sub-processes. The how-to articles from the eHow and wikiHow sites are first processed with both a syntactic pattern-based method and a probabilistic method so that actions (in the form of verbs) and associated
Evaluation
In order to validate our effort to automatically construct a situation ontology, we first show the statistics of the result and then discuss the experiment we ran on extraction accuracy and its results for both the syntactic pattern-based and the CRF-based methods. To put the result in perspective, we compare the coverage of the resulting situation ontology with that of other ontology-like resources, WordNet and OMICS, in terms of actions covered.
Applications
To demonstrate the applicability of the situation ontology, we introduce two application scenarios where it can play a key role: situation-aware service recommendation and semantic web service composition. In the first application, the system attempts to infer the user's current situation by identifying the goal, which can be revealed by contextual information including the user's current location, the actions taken, and the objects used for the actions. Since the ontology currently contains about
Conclusion and future work
We presented an automatic approach to constructing a large-scale situation ontology by means of action mining from web resources. In particular, in order to aggregate situation knowledge from evolving web resources such as eHow.com and wikiHow.com, we have defined a situation ontology model consisting of user goals, action sequences, and their context information such as objects, locations, and times, all of which are derived from how-to instructions in natural language.
The ontology
Acknowledgements
Support for this research came from the Ministry of Knowledge Economy, Korea, under the Information Technology Research Center support program supervised by the National IT Industry Promotion Agency; NIPA-2009-(C1090-0903-0008). Financial support for this work also came from a grant from the strategic technology development program 2008-F-047-02 of the Ministry of Knowledge Economy.
References (40)
- et al., Automated generation of composite web services based on functional semantics, Journal of Web Semantics (2009)
- et al., HTN planning for web service composition using SHOP2, Journal of Web Semantics (2004)
- et al., YAGO: a large ontology from Wikipedia and WordNet, Journal of Web Semantics: Sci. Serv. Agents World Wide Web (2008)
- et al., Specification, decomposition and agent synthesis for situation-aware service-based systems, Journal of Systems and Software (2008)
- et al., Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures
- et al., Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons
- A.K. Dey, Providing architectural support for building context-aware applications, Ph.D. Thesis, College of Computing,...
- et al., Unsupervised resolution of objects and relations on the web
- et al., Context-aware computing applications
- et al., A core ontology for situation awareness
- Snowball: extracting relations from large plain-text collections
- Enriching very large ontologies using the WWW
- HESA: a human-centric evolvable situation-awareness model in smart homes
- Low-cost supervision for multiple-source attribute extraction
- Lightly-supervised attribute extraction
- Data Mining: A Knowledge Discovery Approach