1 Introduction

End-User Development (EUD) refers to a set of tools and techniques that empower non-programmer users, who are experts in a given domain, to develop, modify, and extend web applications. These applications assist end users in fulfilling specific goals in their domain of expertise [3]. Applying EUD techniques avoids the time-consuming and error-prone knowledge transfer from domain experts to programmers [4] and blurs the distinction between design-time and runtime development by supporting development by users at runtime [2].

One of the successful practices in the field of EUD is the mashup. According to the definition provided in [2], "a mashup is a composite application developed starting from reusable data, application logic, and/or user interfaces typically, but not mandatorily, sourced from the Web". Owing to their composite nature, mashups are good candidates for realizing the goal of the programmable web for end users.

Existing solutions fall short in terms of expressive power and flexibility. Flexibility is typically limited by narrowing the design effort to a specific domain and addressing the requirements of that domain only. At the same time, a tool should offer high expressive power to address the exact needs of its users.

In this paper, we bridge this gap by introducing a natural-language-based EUD tool that allows users to express their intentions in a natural way. Since Natural-Language Programming (NLP) techniques guarantee the highest level of expressive power [1], this approach lets users develop feature-rich applications. To achieve a higher level of flexibility, domain experts are assisted in the process of domain ontology generation; these ontologies are later used to map the user's intentions to specific domain entities.

2 Natural-Language-Based End-User Tool

The overall platform architecture is illustrated in Fig. 1. The first interaction with the platform happens when the end user employs the natural-language interface to express an intention, which can be retrieving knowledge on a particular topic or performing a task such as planning, data analysis, or visualization. Natural-language interfaces have gained popularity in recent years because they provide an intuitive, less complicated interaction with machines; Apple's Siri and Amazon's Alexa are among the popular natural-language-based personal assistants. To address the user's need efficiently, the first step is comprehending the meaning of the input. The NLP component was implemented with the Python Natural Language Toolkit (NLTK), which provides a suite of text-processing and semantic-reasoning libraries.
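As an illustration of this comprehension step, the short NLTK sketch below tokenizes an intention, tags parts of speech, and keeps nouns and verbs as candidate domain terms. The concrete pipeline and the example sentence are assumptions for illustration; the paper does not detail the component's internals.

import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

intention = "Book a flight from Milan to Berlin next Friday"  # hypothetical user input

tokens = nltk.word_tokenize(intention)  # split the sentence into word tokens
tagged = nltk.pos_tag(tokens)           # assign a part-of-speech tag to each token

# Keep nouns and verbs as candidate terms for the later ontology mapping.
candidates = [word for word, tag in tagged if tag.startswith(("NN", "VB"))]
print(candidates)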

Fig. 1. Platform architecture

2.1 User’s Domain Identification

As the first step, we generate a domain taxonomy as an aid for better understanding the user's intention. Identifying the user's domain eliminates ambiguities caused by different word senses. For instance, the word "book" refers to reserving a flight or a hotel room in the travel domain, but to a learning material in the education domain.

To address this issue, we use the concept of an ontology, which consists of a structured set of domain entities and the relations among them. Ontology extraction, also known as ontology learning, is a set of methods for constructing ontologies from domain-dependent natural-language texts. The source of data can be unstructured text, DBpedia, or the APIs of social networks. From these sources, the key concepts (classes) and the semantic relations among them (properties) are extracted; the classes can be derived from the taxonomy identified in the previous step. This step is a semi-automatic procedure, in that the domain expert can refine the extracted concepts and relations manually. A fragment of the resulting structure is sketched below.
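For concreteness, the following rdflib sketch builds one such fragment, with a class, an object property, and its domain and range; the travel vocabulary (Flight, City, hasDestination) and the namespace are illustrative assumptions rather than entities taken from the paper.

from rdflib import Graph, Namespace, RDF, RDFS
from rdflib.namespace import OWL

EX = Namespace("http://example.org/travel#")  # hypothetical domain namespace
g = Graph()
g.bind("ex", EX)

# A key concept (class) extracted from the corpus ...
g.add((EX.Flight, RDF.type, OWL.Class))
g.add((EX.City, RDF.type, OWL.Class))

# ... and a semantic relation (property) between two concepts.
g.add((EX.hasDestination, RDF.type, OWL.ObjectProperty))
g.add((EX.hasDestination, RDFS.domain, EX.Flight))
g.add((EX.hasDestination, RDFS.range, EX.City))

print(g.serialize(format="turtle"))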

To extract the ontology, we use word2vec, a two-layer neural-network algorithm that captures the semantics of words. The word2vec algorithm is trained on a large corpus of text and forms a vector-space representation of that corpus, onto which input words are mapped. Word2vec assigns a vector to each word in the corpus, which can later be used to compute the similarity between words and sentences. For the implementation we used the gensim Python library, which provides unsupervised semantic modeling.

Domain experts are asked to upload sample domain-related documents, such as scientific papers and plain text from HTML pages, to the repository; these arrive as unstructured text that requires preprocessing. The documents serve as the text corpus for training the word2vec model. The preprocessing step eliminates special characters and common stop words such as "for", "of", and "in", which increases the accuracy of the model produced by the word2vec algorithm. We tune the training parameters to control the speed and quality of the model: a better-trained model yields a more accurate taxonomy, which in turn leads to better intention identification. As shown in the following listing, the min_count parameter states that words occurring fewer than five times are ignored, because such rare words do not provide useful information. The size parameter controls the size of the neural-network layers; bigger values result in more accurate models but require larger training sets.

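A minimal gensim training sketch consistent with this description follows; the corpus file name and the exact parameter values are assumptions, and in gensim >= 4 the layer-size parameter is called vector_size (earlier releases name it size).

import re
from gensim.models import Word2Vec
from gensim.parsing.preprocessing import STOPWORDS

def preprocess(line):
    # Eliminate special characters and stop words such as "for", "of", "in".
    tokens = re.findall(r"[a-z]+", line.lower())
    return [t for t in tokens if t not in STOPWORDS]

with open("domain_corpus.txt", encoding="utf-8") as f:  # hypothetical corpus file
    sentences = [preprocess(line) for line in f if line.strip()]

model = Word2Vec(
    sentences,
    vector_size=100,  # size of the neural-network layers ("size" before gensim 4)
    min_count=5,      # words occurring fewer than 5 times are ignored
    workers=4,        # training threads, a speed/quality trade-off knob
)
model.save("domain.model")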

The trained model can be used to identify the key domain concepts and the relations among them. Once the ontology has been formed from the domain-related documents, the platform is able to analyze the user's input in the context of that domain.

Additionally, the model can be used to find similarities between variations of the user's input sentences. Based on the input in the intention field, the platform suggests a list of available (business or personal) goals in the domain. Later, at runtime, these terms are used to find suitable components, as sketched below.
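A sketch of this use of the trained model is shown here; the query terms and goal labels are hypothetical and must occur in the training vocabulary for the calls to succeed.

from gensim.models import Word2Vec

model = Word2Vec.load("domain.model")

# Nearest neighbours of a domain term, usable as goal suggestions.
print(model.wv.most_similar("flight", topn=3))

# Similarity between a preprocessed intention and a candidate goal label,
# computed over the mean of the word vectors on each side.
print(model.wv.n_similarity(["book", "cheap", "flight"], ["reserve", "flight"]))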

2.2 Ontology Mapping

After the domain has been identified from the input and from the data and models provided by the domain experts, the next step performs two mappings, over the goal and domain ontologies. First, a SPARQL query is formed and the input is mapped onto the Goal Ontology, which comprises the hierarchy of the user's business goals. In this hierarchy, the class Task produces a result that helps to realize a Goal, and each instance of Task is associated with an Action from the Domain Ontology. The purpose of this step is to identify the correct Action to map to the Task instances: a more specific query is formulated and the second mapping, onto the Domain Ontology, is performed. From these mappings, search parameters are produced and sent to the Component Browser, which executes the query on the Component Repository. This repository contains numerous component implementations together with their descriptors, which specify component characteristics such as events, operations, and data types, and how they can be invoked [2]. The Component Browser identifies and fetches suitable components, and the user then chooses the best-fitting ones from the list of candidates. To create the solution mashup, component instances are added to the runtime environment; they are created according to the instantiation model (stateful or stateless) and provide access to the component functionalities.
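The rdflib sketch below illustrates the first mapping as a SPARQL query over the Goal Ontology; the file name, namespace, and vocabulary (go:Goal, go:achievedBy) are assumptions about the ontology's structure, since the paper does not publish its schema.

from rdflib import Graph

goals = Graph().parse("goal_ontology.ttl", format="turtle")  # hypothetical file

query = """
PREFIX go:   <http://example.org/goal#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?goal ?task WHERE {
    ?goal a go:Goal ;
          rdfs:label ?label ;
          go:achievedBy ?task .
    FILTER(CONTAINS(LCASE(STR(?label)), "book flight"))
}
"""

# Each matched Task will later be associated with an Action from the Domain Ontology.
for row in goals.query(query):
    print(row.goal, row.task)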

3 Conclusion

In this demo paper, we presented a platform that assists end users in addressing their situational needs. End users can express their intentions in natural language. To increase the platform's flexibility across various domains, we proposed a technique that generates the domain ontology from data provided by the domain expert.