Abstract
With the ever-increasing number of Web services, discovering the Web service that best matches a user's request has become a vital yet challenging task. A scalable and efficient search engine is needed to handle the large volume of Web services. This work aims to provide a search engine that retrieves the most relevant Web services in a short time. The proposed Web service search engine (WSSE) integrates probabilistic topic modeling and clustering, which support each other in discovering the semantic meaning of Web services and reducing the search space. Latent Dirichlet allocation (LDA) is used to extract topics from Web service descriptions, and these topics are used to group similar Web services together. Each Web service description is represented as a topic vector, which makes the topic model an efficient way to reduce the dimensionality of word vectors and to uncover the semantic meaning hidden in the descriptions. Each description is also represented as a word vector to address the drawbacks of the keyword-based search system. The accuracy of the proposed WSSE is compared with that of a keyword-based search system, and precision and recall are used to evaluate both. The results show that the proposed WSSE based on LDA and clustering outperforms the keyword-based search system.
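The pipeline summarized in the abstract (word vectors, LDA topic vectors, clustering, then search within a reduced space) can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy are available. The toy corpus, the use of k-means as the clustering algorithm, cosine similarity for ranking, and all parameter values are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus standing in for Web service descriptions.
DESCRIPTIONS = [
    "weather forecast temperature by city",
    "current weather conditions and forecast",
    "currency exchange rate conversion",
    "convert currency using live exchange rates",
    "send sms text message to phone",
    "sms messaging and phone notification service",
]


def build_index(descriptions, n_topics=3, n_clusters=3, seed=0):
    """Word counts -> LDA topic vectors -> k-means clusters (assumed choices)."""
    vectorizer = CountVectorizer(stop_words="english")
    counts = vectorizer.fit_transform(descriptions)  # word vectors
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    topics = lda.fit_transform(counts)  # one topic distribution per service
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(topics)  # group services with similar topics
    return vectorizer, lda, kmeans, topics, labels


def search(query, index, top_k=2):
    """Rank only the services in the cluster closest to the query."""
    vectorizer, lda, kmeans, topics, labels = index
    q = lda.transform(vectorizer.transform([query]))[0]  # query topic vector
    cluster = kmeans.predict(q.reshape(1, -1))[0]
    members = np.flatnonzero(labels == cluster)  # reduced search space
    if members.size == 0:
        return []
    # Cosine similarity between the query and each service in the cluster.
    candidates = topics[members]
    sims = (candidates @ q) / (
        np.linalg.norm(candidates, axis=1) * np.linalg.norm(q)
    )
    order = np.argsort(-sims)[:top_k]
    return [(int(members[i]), float(sims[i])) for i in order]


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


index = build_index(DESCRIPTIONS)
results = search("weather forecast for my city", index)
for service_id, score in results:
    print(service_id, round(score, 3), DESCRIPTIONS[service_id])
```

Because the query is compared only against the services in its nearest cluster, the number of similarity computations depends on the cluster size rather than on the size of the whole collection. On a corpus this small the topics and clusters are not meaningful; the sketch only shows how the pieces fit together.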
Cite this article
Bukhari, A., Liu, X. A Web service search engine for large-scale Web service discovery based on the probabilistic topic modeling and clustering. SOCA 12, 169–182 (2018). https://doi.org/10.1007/s11761-018-0232-6
DOI: https://doi.org/10.1007/s11761-018-0232-6