Semantic approach to service discovery in a Grid environment

doi:10.1016/j.websem.2005.04.001

Journal of Web Semantics

Volume 4, Issue 1, January 2006, Pages 1-13

https://doi.org/10.1016/j.websem.2005.04.001

Abstract

The fundamental problem that the Grid research and development community is seeking to solve is how to coordinate distributed resources amongst a dynamic set of individuals and organisations in order to achieve a common collaborative goal. The problem arises through the heterogeneity, distribution and sharing of the resources in different virtual organisations. Interoperability is a main issue for applications that need to function with the Grid. This paper proposes a matchmaking framework for service discovery in Grid environments based on three selection stages: context, semantic and registry selection. It improves the service discovery process by using semantic descriptions stored in ontologies which specify both the Grid services and the application knowledge. The framework permits Grid applications to specify the criteria against which a service request is matched and enables interoperability for the matchmaking process. A proof of concept is provided by a prototype implementation, and the matchmaking process is enhanced with a similarity metric which quantifies the quality of a match. A qualitative and quantitative evaluation of the prototype system is given, with an analysis and performance measurements to quantify the scalability of the prototype.

Introduction

In the mid-1990s Ian Foster and Carl Kesselman proposed a distributed computing infrastructure for advanced science and engineering which they called “The Grid”. The vision behind the Grid is to supply computing and data resources over the Internet seamlessly, transparently and dynamically when needed, much as the power grid supplies electricity to end users. The Grid originated from trying to solve the information and computational challenges of science [1].

Resource discovery, and consequently service discovery, is an important issue for the Grid. It answers the questions of how a service requester finds the resources/services needed to solve its particular problem and how a service provider makes potential service requesters aware of the computing resources it can offer. Service discovery is a key concept in a distributed Grid environment. It defines a process for locating service providers and retrieving service descriptions. The problem of service discovery in a Grid environment arises through the heterogeneity, distribution and sharing of the resources in different Virtual Organisations (VOs). The two different approaches implemented in the early stages of the Grid software (GLOBUS toolkit, GT [2]) were:

• Monitoring and Discovery Service (MDS),
• Grid Information Service (GIS).

Although these approaches deal only with resource discovery, service discovery can be seen as an extension of resource discovery.

The MDS [3] was initially designed as a centralised way to obtain Grid service information via an LDAP (Lightweight Directory Access Protocol) server. Later designs in MDS-2 have moved to a decentralised approach where Grid information is stored and indexed by index servers that communicate via a registration protocol [4]. Users can then query directory servers. The assignment of content to servers and the overlay topology of those servers are determined in an ad hoc fashion.

GIS is a service that allows storing information about the state of the Grid infrastructure [5]. One approach for describing the data is to use a hierarchical model. This is the approach which is currently in place as GISs have been built on top of directory services. The question arises whether these systems and the hierarchical model will provide sufficient performance and expressiveness. An alternative solution is to use a relational data model, which arguably is more difficult to implement and scale, but allows for more expressiveness with a relational query language.
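The trade-off between the two data models can be illustrated with a small sketch. The resource records, attribute names and function names below are hypothetical and are not taken from MDS or GIS; the sketch only contrasts path-based lookup in a hierarchy with a declarative query over a relational table (here an in-memory SQLite database).

```python
import sqlite3

# Hypothetical resource records (site, host, operating system, CPU count).
RESOURCES = [
    ("siteA", "node1", "linux", 8),
    ("siteA", "node2", "solaris", 4),
    ("siteB", "node3", "linux", 16),
]


def build_tree(rows):
    """Hierarchical model: site -> host -> attributes, as in a directory service."""
    tree = {}
    for site, host, os_name, cpus in rows:
        tree.setdefault(site, {})[host] = {"os": os_name, "cpus": cpus}
    return tree


def tree_lookup(tree, site, host):
    """Directory-style access: follow a fixed path from the root."""
    return tree[site][host]


def relational_query(rows, os_name, min_cpus):
    """Relational model: select on any combination of attributes."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE resource (site TEXT, host TEXT, os TEXT, cpus INTEGER)"
    )
    con.executemany("INSERT INTO resource VALUES (?, ?, ?, ?)", rows)
    cur = con.execute(
        "SELECT host FROM resource WHERE os = ? AND cpus >= ? ORDER BY host",
        (os_name, min_cpus),
    )
    hosts = [row[0] for row in cur.fetchall()]
    con.close()
    return hosts
```

In the hierarchical version a query that cuts across sites (for example, all Linux hosts with at least eight CPUs) requires walking the whole tree, whereas the relational version expresses it directly in the query language.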

Because Grid environments lacked expressive and efficient matchmaking, Condor [6] was used. Condor, which is used for high-throughput computing, is a matchmaking framework that was developed with classified advertisements (ClassAds) for solving resource allocation problems in a distributed environment with decentralised ownership of resources [7]. This framework provides a bilateral match where both resource providers and consumers specify their matching constraints, e.g. policy and requirements. A symmetric requirement is then evaluated for each request–resource pair to determine whether there is a match or not.
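The bilateral match described above can be sketched as follows. This is a minimal illustration and not Condor's ClassAd language or API: the `Ad` class, the attribute names and the predicates are assumptions made for the example. Each side carries its own attributes plus a requirement evaluated against the other side, and a match exists only if both requirements hold.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Ad:
    """Hypothetical stand-in for a classified advertisement."""
    attrs: Dict[str, Any]
    # Predicate evaluated against the attributes of the other party.
    requirements: Callable[[Dict[str, Any]], bool]


def bilateral_match(request: Ad, resource: Ad) -> bool:
    """Symmetric check: each side's requirements must accept the other side."""
    return request.requirements(resource.attrs) and resource.requirements(
        request.attrs
    )


# A job that needs a Linux machine with at least 512 MB of memory.
job = Ad(
    attrs={"owner": "alice", "memory_needed": 512},
    requirements=lambda other: other.get("os") == "linux"
    and other.get("memory", 0) >= 512,
)

# A machine whose owner policy only admits jobs from certain users.
machine = Ad(
    attrs={"os": "linux", "memory": 1024},
    requirements=lambda other: other.get("owner") in {"alice", "bob"},
)
```

With these definitions `bilateral_match(job, machine)` succeeds, while a job from an owner outside the machine's policy is rejected even though the machine satisfies the job's requirements.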

The Open Grid Services Infrastructure (OGSI) [8] defines a set of conventions and extensions on the use of Web Services Description Language (WSDL) and XML Schema to enable stateful Web services. It introduces the idea of stateful Web services and defines approaches for creating, naming, and managing the lifetime of instances of services; for declaring and inspecting service state data; for asynchronous notification of service state change; for representing and managing collections of service instances; and for common handling of service invocation faults. Recently, the WS-Resource Framework (WSRF) [9] was proposed as a refactoring and evolution of OGSI aimed at exploiting new Web services standards, specifically WS-Addressing, and at evolving OGSI based on early implementation and application experiences. WSRF retains essentially all of the functional capabilities present in OGSI, while changing some of the syntax (for example, to exploit WS-Addressing) and also adopting a different terminology in its presentation.

Until recently, research on Grids has focused on designing and building Grid middleware that addresses the core problems of Grids, namely resource management and services in a distributed environment. Such services include security and data management. Argonne National Laboratory (ANL) has developed an open-source Grid middleware called GLOBUS [2] which has become the de facto Grid middleware for research and possibly production purposes. The evolution of Grid software shows a move from a middleware approach, where many different tools were combined in a toolbox, to a service-based approach which focuses on application-level issues. The approach proposed in this paper follows this direction: it takes the service-based view and presents a framework developed at the application level. The approach applies semantics to Grid services and to the applications in order to achieve interoperability within Grid environments. Interactions such as service requests are matched semantically with services from the applications and the Grid. As there are many different Grid implementations, and many applications that want to make use of the Grid, semantics are needed to make them interoperable with each other. In order to connect applications such as the High Energy Physics (HEP) experiments to the Grid, two interoperability layers are necessary. One interoperability layer is attached to the application layer and the other to the collective layer. The first interoperability layer serves as a dictionary, allowing the different HEP applications to specify their service needs in their “own” application context. The second interoperability layer allows the definition of semantic service descriptions in order to allow a more flexible and dynamic service discovery process [10].

This paper is organised as follows. Section 2 summarises related efforts and discusses how they differ from the proposed approach. Section 3 gives an introduction to the background of semantics and ontologies. Section 4 presents the framework of the semantic service discovery approach for Grid environments with a detailed description of its components. Section 5 presents a portal prototype implementation and explains the tools used. Section 6 enhances the matchmaking process by means of a similarity metric. Section 7 presents a qualitative and quantitative evaluation of the system and, finally, Section 8 concludes this paper.

Section snippets

Related efforts

During the past few years considerable research effort has been invested in the field of resource matching; this work is described in the following paragraphs. The different approaches are based on resource matching, resource mapping and selection, and developing infrastructural middleware.

myGrid [11] is a multi-organisational project aiming to develop the necessary infrastructural middleware (e.g. provenance, service discovery, workflow enactment, change notification and personalisation) that

Background to semantics and ontologies

Ontologies contain categories, lexicons contain word senses, terminologies contain terms, directories contain addresses, catalogs contain part numbers, and databases contain numbers, character strings and BLOBs (Binary Large OBjects). All these lists, hierarchies and networks are tightly interconnected collections of signs. But the primary connections are not in the bits and bytes that encode the signs, but in the minds of the people who interpret them. The goal of various metadata proposals is

Semantic service discovery framework

This section describes the semantic service discovery framework for a Grid environment. It gives a description of the components of the framework and shows how the matchmaking process is done.
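The three selection stages named in the abstract (context, semantic and registry selection) can be pictured as a pipeline. The sketch below is an assumption-laden illustration, not the paper's implementation: plain dictionaries stand in for the application ontology, the Grid service ontology and the registry, and all concept names, function names and URLs are hypothetical. It assumes that context selection translates an application term into a service concept, that semantic selection collects concepts equal to or subsumed by that concept, and that registry selection returns the advertised endpoints.

```python
# Stand-in for the application ontology: application term -> service concept.
APP_DICTIONARY = {"reconstruct_events": "DataProcessingService"}

# Stand-in for the Grid service ontology: concept -> parent concept.
SERVICE_PARENT = {
    "DataProcessingService": "GridService",
    "BatchProcessingService": "DataProcessingService",
    "StorageService": "GridService",
}

# Stand-in for the registry: concept -> advertised service endpoints.
REGISTRY = {
    "BatchProcessingService": ["http://example.org/batch"],
    "StorageService": ["http://example.org/store"],
}


def context_selection(term):
    """Stage 1: map a term from the application context to a service concept."""
    return APP_DICTIONARY.get(term)


def is_subsumed_by(concept, ancestor):
    """True if `concept` equals `ancestor` or lies below it in the hierarchy."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SERVICE_PARENT.get(concept)
    return False


def semantic_selection(concept):
    """Stage 2: collect every concept equal to or subsumed by the request."""
    known = set(SERVICE_PARENT) | set(SERVICE_PARENT.values())
    return sorted(c for c in known if is_subsumed_by(c, concept))


def registry_selection(concepts):
    """Stage 3: look up advertised endpoints for the selected concepts."""
    return [url for c in concepts for url in REGISTRY.get(c, [])]


def discover(term):
    """Run the three stages in sequence; unknown terms yield no services."""
    concept = context_selection(term)
    if concept is None:
        return []
    return registry_selection(semantic_selection(concept))
```

Under these assumptions, a request phrased in the application's own vocabulary ("reconstruct_events") reaches the batch processing endpoint even though no service is registered under the exact concept the term maps to.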

Implementation of prototype

The Semantic Grid Service Discovery Portal is a portal for service discovery using an ontology-based matchmaking engine. The tool provides six menus: login, load ontologies, view ontology, search defined service, list all services and logout. The most common steps will be login, loading of ontologies, searching for a defined service and logout. The three matching modules, especially the semantic service discovery, lie behind the search for a defined service (Fig. 6). The user is asked

Enhancement of prototype by similarity metric

A drawback of performing flexible matches is that the matchmaking engine is open to exploitation by advertisements and requests that are too generic in an attempt to maximise the likelihood of matching. For instance, a service may advertise itself as a provider of everything, rather than being precise about what it does. Similarly, the requester may ask for any service, rather than specifying exactly what it expects. The matchmaking engine can reduce the efficiency of these
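One way to picture how a similarity score can discourage overly generic advertisements is to let the score fall as the advertised concept moves away from the requested one in the concept hierarchy. The sketch below is only an illustration under that assumption; it is not the similarity metric defined in the paper, and the hierarchy, concept names and the 1/(1+d) scoring rule are hypothetical.

```python
# Hypothetical concept hierarchy: concept -> parent (None marks the root).
PARENT = {
    "Thing": None,
    "ComputeService": "Thing",
    "BatchService": "ComputeService",
    "StorageService": "Thing",
}


def ancestors(concept):
    """Return the chain [concept, parent, grandparent, ...] up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT.get(concept)
    return chain


def similarity(requested, advertised):
    """Score 1/(1+d), where d is the number of subsumption steps between the
    two concepts; concepts on different branches score 0."""
    up = ancestors(requested)
    if advertised in up:
        return 1.0 / (1 + up.index(advertised))
    down = ancestors(advertised)
    if requested in down:
        return 1.0 / (1 + down.index(requested))
    return 0.0
```

With this scoring, a provider advertising itself as the root concept ("a provider of everything") still matches a request for a batch service, but with a lower score than a provider that advertises the precise concept, so ranking by score favours precise descriptions.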

Evaluation of prototype

The evaluation of the semantic matchmaking modules is done using a qualitative and a quantitative analysis. The qualitative analysis discusses the advantages and disadvantages and suggests the potential for further improvements. The quantitative analysis aims to show that the prototype implementation satisfies the performance requirements of real-world applications and, most importantly, to show the quality improvement of the matchmaking. Performance measurements were conducted to

Conclusion

The Semantic Grid Matchmaker achieves interoperability for service discovery by using a semantic matchmaking approach. The requirements which have driven the development were a high degree of flexibility and expressiveness, support for subsumption and datatypes, and a flexible and modular structure. This approach enables a more flexible and dynamic matchmaking mechanism based on semantic descriptions stored in ontologies. The separation of application and Grid service knowledge provides a modular,

References (36)

C. Goble, The Grid – from concept to reality in distributed computing, Bioinformatics World Article, 2003....
The GLOBUS Project....
S. Fitzgerald et al., A directory service for configuring high-performance distributed computations
K. Czajkowski et al., Grid information services for distributed resource sharing
G. von Laszewski, “Quickstart Guide: GIS”, 1999....
The Condor Project....
M. Solomon et al., Matchmaking distributed resource management for high throughput computing
I. Foster et al., The physiology of the Grid: an open Grid services architecture for distributed systems integration,...
The WS-Resource Framework....
S.A. Ludwig, A Grid service discovery matchmaker based on ontology description
The myGrid Project....
E. Deelman et al., Mapping abstract complex workflows onto Grid environments, J. Grid Comput. (2003)
Grid Interoperability Project....
H. Tangmunarunkit et al., Ontology-based resource matching in the Grid – the Grid meets the semantic web
The Portable Batch System....
The XSB Research Group....
The Legion Project....
NetSolve/GridSolve Project 2004....
