On the Suitability of Applying WordNet to Privacy Measurement

Wireless Personal Communications

Abstract

Privacy protection is a fundamental issue in the era of big data. For personalized privacy protection, it is necessary to measure the amount or degree of privacy leakage. To facilitate such measurement, the semantic similarities and relationships among words must be determined, because the words may come from multiple sources and appear in a wide variety of forms, an intrinsic property of big data. WordNet has been widely used for measuring the semantic similarity of words. This paper analyzes the suitability of applying WordNet to measuring the semantic similarity or relatedness of words in the field of privacy. The analysis includes an experiment designed to obtain human rating scores as a benchmark dataset and a comprehensive comparison between the results of four WordNet-based measures and the human rating scores. The analysis concludes that current WordNet-based measures are not well suited to privacy measurement. The paper therefore also suggests possible ways of enhancing WordNet to improve the effectiveness of semantic similarity or relatedness measurement in the field of privacy, in support of more effective and personalized privacy protection.
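
The comparison described in the abstract, scoring word pairs with a taxonomy-based measure and checking agreement with human ratings, can be illustrated with a minimal sketch. The abstract does not name the four measures or the agreement statistic, so everything here is an assumption made for illustration: the toy hypernym tree, the word pairs, and the rating values are hypothetical (they are not WordNet data or the paper's benchmark), the measure shown is a simple edge-counting similarity, and Pearson correlation stands in for whatever comparison the authors used.

```python
from math import sqrt

# Hypothetical toy hypernym tree (child -> parent). Not WordNet, not from the paper.
TOY_HYPERNYMS = {
    "identifier": "information",
    "contact": "information",
    "email": "contact",
    "phone": "contact",
    "passport": "identifier",
}


def path_to_root(word):
    """Return the chain of nodes from `word` up to the root, inclusive."""
    path = [word]
    while path[-1] in TOY_HYPERNYMS:
        path.append(TOY_HYPERNYMS[path[-1]])
    return path


def path_similarity(a, b):
    """Edge-counting similarity: 1 / (1 + number of edges on the shortest path)."""
    up_b = {node: steps for steps, node in enumerate(path_to_root(b))}
    for steps_a, node in enumerate(path_to_root(a)):
        if node in up_b:  # first shared ancestor is the lowest one
            return 1.0 / (1 + steps_a + up_b[node])
    return 0.0  # no shared ancestor in the toy tree


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical word pairs and made-up human ratings on an arbitrary scale.
pairs = [("email", "phone"), ("email", "passport"), ("email", "email")]
human_ratings = [3.0, 1.5, 4.0]

measure_scores = [path_similarity(a, b) for a, b in pairs]
print("measure scores:", measure_scores)
print("correlation with human ratings:", pearson(measure_scores, human_ratings))
```

A low correlation on a real privacy vocabulary would be one way to express the kind of mismatch the abstract reports, though the paper's actual evaluation procedure may differ.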

Acknowledgements

The work in this paper was supported by the National Natural Science Foundation of China (No. 61602456) and the National High Technology Research and Development Program of China (863 Program) (No. 2015AA017204).

Author information

Corresponding author

Correspondence to Jingsha He.

Cite this article

Zhu, N., Wang, S., He, J. et al. On the Suitability of Applying WordNet to Privacy Measurement. Wireless Pers Commun 103, 359–378 (2018). https://doi.org/10.1007/s11277-018-5447-5

  • DOI: https://doi.org/10.1007/s11277-018-5447-5
