
1 Introduction

As the number of Internet of Things (IoT) devices has grown, IoT has gradually been integrated with the World Wide Web (WWW), leading to the Web of Things (WoT) [1]. In the WoT, smart devices are blended into the fabric of the WWW [2], providing an interoperable infrastructure that enables communication among physical-world objects and access to their data, so that future IoT applications can be built with existing standardized Web technologies (e.g., HTML, JavaScript, Ajax) [2, 3]. The WoT adapts the Web’s representational state transfer (REST) architectural style to the physical world by embedding web servers within smart devices to expose data and functionality through Web protocols [4]. WoT appliances increasingly surround the environments we live in, from interactive smart homes to conference venue systems. If end users could exploit these devices, they would be able to compose the diverse behaviors exposed by their smart environments to fulfill their own requirements. However, research in this area has been devoted mainly to technical issues such as network connectivity of smart devices and device interoperability; relatively little attention has been paid to enabling end users to compose their smart environments according to their personal requirements. At present, programming WoT devices is largely reserved for professional developers, since it typically requires familiarity with various web technologies (HTTP, web services, WebSockets, etc.) and often with electronics and the underlying hardware.

To involve end users, the End User Development (EUD) paradigm is a potential solution, as it allows end users themselves to tailor their systems to satisfy their goals [5]. However, many of the EUD applications currently available for WoT devices cannot be used efficiently by non-technical users: the end user must write rules in the form of event-condition-action (ECA), which requires knowledge of the possible actions of the WoT devices. Moreover, the end user must not only learn the concrete syntax but also apply algorithmic thinking and abstraction to work with these tools. These high demands mean that fewer users adopt WoT technology, and those who do are slower and may not obtain the results they want; users often know what they want to achieve, but not how to achieve it. These limitations therefore increase both the development time and the learning effort required of the end user.

To take a step towards making end users active participants in the WoT and relieving them of constructing such rules, we propose GrOWTH, a goal-oriented approach that enables end users to model their smart environments in terms of desired goals (effects) rather than the concrete operations of the devices. GrOWTH uses semantic web ontologies for knowledge representation and planning techniques from artificial intelligence to dynamically generate plans at runtime, taking into account user goals, context, and the available WoT devices. The design principles of our approach are:

  1. Improve ease of use by enabling goal-oriented interaction and incorporating context-aware features.

  2. Reduce development time by providing a higher degree of automation through automatic planning of the required actions.

This article describes the design and implementation of such a goal-oriented framework, with particular emphasis on non-technical users. It is worth noting that we do not provide any new solutions for the planning itself; rather, our main objective is to provide an architecture that embeds existing planning techniques for interaction with WoT devices.

This article is structured as follows. Section 2 presents related work on WoT and EUD. Section 3 introduces the proposed solution and its implementation details. Finally, Sect. 4 concludes the paper and outlines future work.

2 Web of Things and End User Development

The opportunities provided by the WoT can be enhanced by offering new techniques that enable end users to control and shape their environment. This is the aim of End User Development (EUD), defined as “a set of methods, techniques, and tools that allow users of software systems, who are acting as non-professional software developers, at some point to create, modify, or extend a software artifact” [5]. In contrast to “traditional” software engineering, in EUD the end user plays the role of both developer and user of the software. The motivation is that end users are experts in their own domains and can develop applications that support their many different needs. Since the early work in this field, a broad range of research has been carried out, and EUD approaches are clearly visible in the Web context.

2.1 Related Work

There are popular industrial tools for EUD in the home automation domain, such as IFTTT, openHAB, Zapier, Node-RED, and StreamBase Studio. Among these, IFTTT (If This Then That) is an app-store-like service where developers build “recipes”, simple conditional statements that connect services or devices from different providers using a wizard-based paradigm. Zapier allows the composition of Web services and smart objects through a wizard-based approach that formulates a simple rule with only one event and one action. Although IFTTT and Zapier are suitable for non-technical users, they only support the construction of basic rules, making them unsuitable for many use cases that require multiple triggers or actions. Solutions such as openHAB, Node-RED, and StreamBase Studio employ a visual programming metaphor to create mashups. However, they are too difficult for end users without programming experience and significantly increase their learning curve and development time. For example, openHAB introduces the concept of “channels” and a new rule language, Node-RED requires scripting and networking knowledge, and both openHAB and StreamBase Studio require the user to learn a rule language that extends Java. These solutions are unsuitable for end users without programming experience, since they must learn new concepts and languages, and none of them takes context-aware features into account.

In addition to these industrial tools, academic research projects have also proposed solutions for EUD in the home automation domain. For example, AppsGate [6] introduces a special programming language with a more natural syntax than “traditional” programming languages; nevertheless, end users must still learn the syntax of that language. The authors of [7,8,9] employ Event Condition Action (ECA) rule-based solutions. A recent survey of end user requirements [10] concludes that rule-based solutions are suitable only for simple use cases and are inflexible for more complex goals. Moreover, rule-based interaction is not a natural way for humans to interact, since the end user is forced to define a tedious sequence of rules in terms of specific WoT device actions.

What is missing is a framework that enables end users to state a goal and have the corresponding actions executed automatically by various types of Web-capable devices, taking context-aware features into account. To fulfill end users’ desires and assure their comfort, a more natural and effective way of programming for non-technical users is therefore required. The next section presents our vision of enabling goal-oriented interaction within WoT environments.

3 Goal-Oriented Interaction with Web of Things Devices

In EUD for WoT devices, end users are forced to think of interaction in terms of the different “actions” these devices provide, such as “close”, “open”, “turn on”, or “play”. Different WoT devices offer different actions, and the same action may behave differently on different devices due to a lack of semantic interoperability [11]. Currently, when programming smart devices, we have to select a device, provide some parameters or conditions, and then execute a specific action on that device. The executed operation produces an effect, such as increased luminosity, a darker room, or a reduced temperature. Evidently, the end user is interested in defining the effect rather than inventing the sequence of actions that must be executed on different devices to achieve it. The main idea behind goal-oriented interaction is to let end users state the overall effect (goal), such as “reduce the energy consumption by 10 Euro per month” or “it’s too noisy here”, instead of forcing them to provide the list of actions that produces that effect.

To enable goal-oriented interaction, the framework must be able to map a user’s goal to the set of actions that fulfills it. For this, our solution incorporates (i) semantic ontologies that capture the effects of each device action and (ii) planning techniques that dynamically generate plans at runtime based on user goals, context, and the available WoT devices. In this way, different device actions may be performed for the same goal depending on the current environment state of the WoT devices, yielding a context-aware system. Since the framework is designed specifically for non-technical end users, we cannot assume familiarity with semantics and reasoning. We therefore provide an abstraction layer on top, so that end users only need to state the desired state of their smart environment through a web-based application, voice interaction, or a mobile application. The framework can easily be customized to fulfill new user requirements: end users themselves add goals through the user interface, which hides the underlying technicalities.

3.1 Conceptual Model

The goal planning problem is modeled as a transition system that progresses from the initial state to a goal state given a set of actions. More formally, a planning problem is a 5-tuple \( Pr = \langle S, T, s^{0}, A, s^{*} \rangle \), where \( S = \{ s_{0}, s_{1}, \ldots, s_{n} \} \) is a finite set of states including the initial state \( s^{0} \) and the goal state \( s^{*} \). Each state \( s \in S \) is represented by a finite set of parameters \( P_{s} = \{ p_{0}, p_{1}, \ldots, p_{k} \} \) expressing the facts known about this state. Each parameter \( p_{i} \in P_{s} \subset N \times O \times V \) represents an atomic fact, combining the name of the parameter \( n_{p_{i}} \in N \), an operator \( o_{p_{i}} \in O = \{ =, <, >, \neq, \le, \ge \} \), and a value \( v_{p_{i}} \in V \). Transitions \( T \subset S \times S \) between the states are associated with actions from \( A \) performed by the devices. An action \( a \in A \) defines its functionality as \( a = ( C_{a}, E_{a}, Act_{a} ) \), where \( C_{a} \), \( E_{a} \), and \( Act_{a} \) denote the preconditions, effects, and actuations of action \( a \), respectively. Preconditions are assertions that must hold before a transition can be used to reach the next state; effects are the results of the transition; and actuations are a set of HTTP requests \( r_{i} = \langle method, url, body \rangle \), specifying the HTTP method (GET, POST, PUT), the URL, and the body, respectively. Preconditions and effects have the same form as a state, i.e., they are represented as sets of parameters.
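To make the model concrete, the following minimal Python sketch encodes states as sets of parameters and actions as (preconditions, effects, actuations) triples. The class and field names, as well as the air-conditioner example, are illustrative assumptions and not part of the GrOWTH code base.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# A parameter is an atomic fact (name, operator, value), e.g. ("temp", "<=", 18).
Parameter = Tuple[str, str, object]

# A state is the set of parameters known to hold in it.
State = Set[Parameter]

@dataclass
class Request:
    """One actuation step: an HTTP request <method, url, body>."""
    method: str   # GET, POST or PUT
    url: str
    body: str = ""

@dataclass
class Action:
    """A device action a = (C_a, E_a, Act_a)."""
    name: str
    preconditions: State = field(default_factory=set)
    effects: State = field(default_factory=set)
    actuations: List[Request] = field(default_factory=list)

@dataclass
class PlanningProblem:
    """Pr = <S, T, s0, A, s*>; the transitions T are implied by the actions."""
    initial_state: State
    goal_state: State
    actions: List[Action]

# Example: a hypothetical air-conditioner action that lowers the temperature.
cool_down = Action(
    name="ac_cool",
    preconditions={("ac", "=", "off")},
    effects={("ac", "=", "on"), ("temp", "<=", 18)},
    actuations=[Request("PUT", "http://ac.local/api/state",
                        '{"power": "on", "target": 18}')],
)
```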

3.2 GrOWTH Approach

In this section, we describe the overall architecture of GrOWTH, our goal-oriented EUD system for the WoT. Figure 1 depicts the GrOWTH architecture, which comprises the following main components: Controller, Device Manager, User Interface, Goal Analyzer, Knowledge Repository, Context Sensing, Planner, and Execution Engine. In the following, we describe the functionality of each component.

Fig. 1. Overview of GrOWTH architecture

Controller:

The Controller coordinates the inputs and outputs of the different components and invokes the appropriate component at each step.

Device Manager:

The main responsibility of the Device Manager is to discover and interconnect WoT devices. Every WoT device is described by its actions, each specified as a set of preconditions, effects, and actuations published by the device provider itself. Once a device is discovered, it is automatically registered with the Device Manager (0). Upon registration, the Device Manager interacts with the Reasoner, and a device representation specifying its functionality in terms of actions, preconditions, effects, and actuations is added to the Triple Store. This component is based on an asynchronous publish-subscribe architecture, so that other components are informed when devices appear in or disappear from the network.
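As an illustration of such a self-description, the following Python dictionary sketches what a lamp could publish on registration; the field names and URL are assumptions chosen to mirror the conceptual model, not the exact payload format used by GrOWTH.

```python
# Hypothetical self-description a lamp could publish on registration;
# each action lists its preconditions, effects, and actuations.
lamp_description = {
    "device": "living_room_lamp",
    "actions": [
        {
            "name": "turn_off",
            "preconditions": [{"name": "lamp", "op": "=", "value": "on"}],
            "effects": [
                {"name": "lamp", "op": "=", "value": "off"},
                {"name": "luminosity", "op": "<", "value": 100},
            ],
            "actuations": [
                {"method": "PUT",
                 "url": "http://lamp.local/api/state",
                 "body": '{"on": false}'}
            ],
        }
    ],
}
```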

User Interface:

The User Interface provides the means for end users to interact with the system (1) and to issue goals to the WoT environment (e.g., “It’s too warm”). End users formulate goals through this interface using voice and gesture interaction. The component then passes the goal, in text form, to the Controller (2).

Goal Analyzer:

Given a high-level goal received from the Controller, the Goal Analyzer interacts with the User Preferences component to translate the stated goal into concrete target values according to the user’s preferences (e.g., temp = 18). The resulting goals are then passed back to the Controller (3).
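The following sketch illustrates this translation step; the preference keys, goal phrases, and threshold values are invented for the example and do not reflect the actual GrOWTH implementation.

```python
# Illustrative only: resolve a vague, high-level goal to a concrete goal state
# using per-user preferences stored by the User Preferences component.
USER_PREFERENCES = {
    "alice": {"comfort_temperature": 18, "sleep_luminosity": 0},
}

def analyze_goal(user: str, goal_text: str):
    prefs = USER_PREFERENCES[user]
    text = goal_text.lower()
    if "warm" in text:
        # "It's too warm" -> temperature at most the preferred comfort value
        return {("temp", "<=", prefs["comfort_temperature"])}
    if "sleep" in text:
        return {("luminosity", "<=", prefs["sleep_luminosity"]), ("tv", "=", "off")}
    raise ValueError(f"Unknown goal: {goal_text}")

print(analyze_goal("alice", "It's too warm"))  # {('temp', '<=', 18)}
```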

Knowledge Repository:

The Knowledge Repository represents the relevant aspects of the domain under consideration and their relations in a well-defined, machine-understandable syntax with unambiguous semantics. The knowledge in the Triple Store is stored as RDF/OWL. It keeps an instance of the descriptions of the device actions (each a set of preconditions, effects, and actuations) that are active at any given moment, together with the domain ontology. The device instance descriptions are kept up to date through the periodic notifications sent by the Device Manager about newly appearing and disappearing devices (0): when a new device is discovered, a new instance is added to the Knowledge Repository, and devices that disappear are removed accordingly. The domain ontology includes the device ontology from the WoT ontology, which describes different WoT sensors and actuators.
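Since the implementation uses RDFlib (see Sect. 3.3), registering a device instance could look roughly like the sketch below. The namespace and property names are placeholders for illustration and do not correspond to the actual GrOWTH ontology.

```python
from rdflib import Graph, Literal, Namespace, RDF

GROWTH = Namespace("http://example.org/growth#")  # placeholder namespace

g = Graph()
g.bind("growth", GROWTH)

# Register a discovered lamp and one of its actions in the triple store.
lamp = GROWTH["living_room_lamp"]
turn_off = GROWTH["living_room_lamp_turn_off"]

g.add((lamp, RDF.type, GROWTH.Device))
g.add((lamp, GROWTH.hasAction, turn_off))
g.add((turn_off, RDF.type, GROWTH.Action))
g.add((turn_off, GROWTH.hasEffect, Literal("lamp = off")))
g.add((turn_off, GROWTH.hasActuation,
       Literal('PUT http://lamp.local/api/state {"on": false}')))

print(g.serialize(format="turtle"))
```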

Context Sensing:

The Context Sensing component monitors the status of the active WoT devices found at runtime (e.g., lamp = off, shutter = open, TV = on) and, upon a state change, informs the subscribed components through a publish-subscribe mechanism (5).
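A minimal sketch of this publish-subscribe mechanism is shown below; the class name, device names, and states are examples, and the real component naturally reads device state over the network rather than through direct calls.

```python
# Minimal publish-subscribe sketch: subscribers are notified on every state change.
class ContextSensing:
    def __init__(self):
        self.state = {}          # e.g. {"lamp": "off", "shutter": "open", "tv": "on"}
        self.subscribers = []    # callbacks invoked whenever a device changes state

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def update(self, device, value):
        if self.state.get(device) != value:
            self.state[device] = value
            for notify in self.subscribers:
                notify(device, value)

context = ContextSensing()
context.subscribe(lambda device, value: print(f"{device} changed to {value}"))
context.update("lamp", "off")   # prints: lamp changed to off
```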

Planner:

The Planner receives the Goal, the Actions, and the Environment State as input from the Controller (6) and produces a plan for the given planning problem that achieves the desired goal (7). The plan is a sequence of steps (actions of the available devices). To produce this sequence we use a PDDL engine; the Planning Domain Definition Language (PDDL) is a standard syntax supported by most automatic planning tools [12]. A PDDL specification consists of two parts, the domain definition and the problem definition. The role of the PDDL Constructor is to formulate the planning problem in PDDL from the OWL input received in (6). The problem definition is built from the environment state (the initial state) and the user’s goal, while the domain definition is built by transforming the Actions into PDDL syntax. Given the domain and problem files, the PDDL Engine creates a planning graph with the existing states as vertices and the actions as edges. The Planner then searches through the device actions that affect one of the goal variables in order to reach the goal state, and finally outputs the list of steps required to reach the goal.
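As a rough illustration of what the PDDL Constructor emits, the sketch below assembles a problem file for the goal “lamp off and TV off”. The domain name, predicate encoding, and helper function are invented for the example and simplify the real constructor, which works on the OWL representations.

```python
def build_problem_file(initial_state, goal_state):
    """Assemble a minimal PDDL problem file from the environment state and the goal.

    Both states are given as sets of (name, value) pairs; each "name = value"
    fact is encoded as a propositional predicate, a simplification of the
    real encoding used by the PDDL Constructor.
    """
    def facts(state):
        return "\n    ".join(f"({name}_{value})" for name, value in sorted(state))

    return f"""(define (problem growth-problem)
  (:domain growth-domain)
  (:init
    {facts(initial_state)})
  (:goal (and
    {facts(goal_state)})))"""

print(build_problem_file(
    initial_state={("lamp", "on"), ("tv", "on")},
    goal_state={("lamp", "off"), ("tv", "off")},
))
```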

Execution Engine:

The plan generated by the Planner and the available actions are passed to the Execution Engine (8), which maps the steps of the plan to actions and actuates each action by executing the associated HTTP request (9). These requests are defined in the device description in terms of HTTP method, URL, and body, and they cause the physical WoT devices to produce the desired effects.
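A minimal sketch of this step using the requests library is given below; the plan is assumed to be a list of step names and the action descriptions follow the dictionary format sketched for the Device Manager, with error handling reduced to a status check.

```python
import requests

def execute_plan(plan, actions_by_name):
    """Execute each plan step by firing the HTTP requests of the matching action.

    `plan` is the ordered list of step names produced by the Planner, and
    `actions_by_name` maps each name to an action description whose
    "actuations" entries are <method, url, body> triples (see Sect. 3.1).
    """
    for step in plan:                               # e.g. ["lamp_turn_off", "tv_turn_off"]
        for actuation in actions_by_name[step]["actuations"]:
            response = requests.request(
                method=actuation["method"],         # GET, POST or PUT
                url=actuation["url"],
                data=actuation["body"],
                timeout=5,
            )
            response.raise_for_status()             # stop if the device rejected the request
```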

3.3 Implementation

To demonstrate the feasibility of GrOWTH, we apply the architecture presented in the previous section to the smart home domain. The prototype was implemented in Python and allows end users to control a set of smart devices by stating their goals through natural-language voice input.

The devices we use are a Samsung TV and Philips Hue lamps, both accessed through RESTful web services with JSON as the output format, together with a Raspberry Pi 3 and an Amazon Echo Dot. The Echo serves as the voice interface of our framework, handling voice audio processing via custom skill code. For instance, end users can state their goal to the Echo as “Alexa, tell GrOWTH I want to sleep”. The invocation name GrOWTH identifies the skill to which the user wants to direct the request. The Echo sends the request to the Alexa Service Platform, which performs speech recognition and turns the user’s speech into tokens. We define custom utterances that might be part of the goal the user wants to achieve, so that the goal can be matched to what the user has said.

The Goal Analyzer then invokes the RESTful API of our backend, sending the goal as JSON for processing and obtaining a suitable response. To fulfill the goal in our example, based on the current environment state (e.g., lamps and TV are on), the lamps and the TV are switched off, and a response containing a natural-language description of the actions triggered by the user goal is sent back; Alexa would tell the user “I have turned the lamp and the TV off”.

The Raspberry Pi runs a RESTful web service that handles the incoming goals and uses the Flask-Ask library to communicate with the Echo. The web service interface acts as the Controller component from Fig. 1 and invokes the following backend components: ontology, reasoner, and planner. Our ontology has been modeled with Protégé, RDFlib is used to work with RDF, and HermiT serves as the reasoner. The adopted planner is a STRIPS (Stanford Research Institute Problem Solver) planner, which finds a solution by starting from the goal state and working back to the initial state using backward state-space search.
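The Flask-Ask glue code could look roughly like the sketch below; the intent and slot names are assumptions, and the backend call is replaced by a stub that stands in for the goal analysis, planning, and execution pipeline of Fig. 1.

```python
from flask import Flask
from flask_ask import Ask, statement

app = Flask(__name__)
ask = Ask(app, "/")   # Alexa posts the recognized intent to this route

def fulfill_goal(goal_text):
    """Stub for the GrOWTH backend: goal analysis, planning, and execution."""
    # In the real system this call runs the full pipeline and returns a
    # natural-language description of the executed actions.
    return "I have turned the lamp and the TV off"

@ask.intent("StateGoalIntent", mapping={"goal_text": "Goal"})  # intent/slot names are assumptions
def handle_goal(goal_text):
    return statement(fulfill_goal(goal_text))

if __name__ == "__main__":
    app.run(port=5000)
```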

4 Conclusion and Future Work

Defining rules in a DSL-specific syntax is not a natural way of interaction for end users. We have therefore introduced a goal-oriented interface for end user development with WoT devices: the user only states a goal in natural language, and a planner controls and manipulates the devices, sparing the user from dealing with low-level technicalities. In the future, we plan to extend the Planner component to also consider emergent effects, since the application environment is dynamic and not all effects of a set of device actuations can be predicted. Furthermore, in its current version the steps of a plan are executed sequentially, which in some cases may lead to a state change that contradicts the current user goal. Instead, after each device action is executed, the environment state should be monitored again and, if necessary, re-planning should be performed before the next step, similar to the plan-do-check-adjust (PDCA) method used in business management. Moreover, we plan to conduct a thorough evaluation of our approach with end users to obtain a complete picture of end user requirements; we are especially interested in the correctness of the generated plans, the development time, and the usability of our approach.