Outsourcing analyses on privacy-protected multivariate categorical data stored in untrusted clouds

Domingo-Ferrer, Josep; Sánchez, David; Ricci, Sara; Muñoz-Batista, Mónica

doi:10.1007/s10115-019-01424-4

Regular Paper
Published: 15 November 2019

Knowledge and Information Systems
Volume 62, pages 2301–2326 (2020)

Josep Domingo-Ferrer ORCID: orcid.org/0000-0001-7213-4962¹,
David Sánchez¹,
Sara Ricci² &
…
Mónica Muñoz-Batista¹

Abstract

Outsourcing data storage and computation to the cloud is appealing due to the cost savings it entails. However, when the data to be outsourced contain private information, appropriate protection mechanisms should be implemented by the data controller. Data splitting, which consists of fragmenting the data and storing them in separate clouds for the sake of privacy preservation, is an interesting alternative to encryption in terms of flexibility and efficiency. However, multivariate analyses on data split among various clouds are challenging, and they are even harder when data are nominal categorical (i.e., textual, non-ordinal), because the standard arithmetic operators cannot be used. In this article, we tackle the problem of outsourcing multivariate analyses on nominal data split over several honest-but-curious clouds. Specifically, we propose several secure protocols to outsource to multiple clouds the computation of a variety of multivariate analyses on nominal categorical data (frequency-based and semantic-based). Our protocols have been designed to outsource as much workload as possible to the clouds, in order to retain the cost-saving benefits of cloud computing while ensuring that the outsourced data stay split and hence privacy-protected versus the clouds. The experiments we report on the Amazon cloud service show that by using our protocols the controller can save nearly all the runtime because it can integrate partial results received from the clouds with very little computation.

References

1. Aggarwal G, Bawa M, Ganesan P, Garcia-Molina H, Kenthapadi K, Motwani R, Srivastava U, Thomas D, Xu Y (2005) Two can keep a secret: a distributed architecture for secure database services. CIDR 2005:186–199
2. Agresti A, Kateri M (2011) Categorical data analysis. Springer, Berlin
3. Amazon EC2 Instance Types. https://aws.amazon.com/ec2/instance-types/?nc1=h_ls
4. Armbrust M, Fox A, Griffith R, Joseph AD, Katz R, Konwinski A, Lee G, Patterson D, Rabkin A, Stoica I, Zaharia M (2010) A view of cloud computing. Commun ACM 53(4):50–58
5. Atallah MJ, Frikken KB (2010) Securely outsourcing linear algebra computations. In: 5th ACM symposium on information, computer and communications security—ASIACCS 2010, ACM, pp 48–59
6. Batet M, Harispe S, Ranwez S, Sánchez D, Ranwez V (2014) An information theoretic approach to improve semantic similarity assessments across multiple ontologies. Inf Sci 283:197–210
7. Batet M, Sánchez D (2015) A review on semantic similarity. In: Encyclopedia of information science and technology, 3rd edn. IGI Global, pp 7575–7583
8. California patient discharge data: California Office of Statewide Health Planning and Development (OSHPD), 2009. http://www.oshpd.ca.gov/HID/DataFlow/index.html
9. Calviño A, Ricci S, Domingo-Ferrer J (2015) Privacy-preserving distributed statistical computation to a semi-honest multi-cloud. In: IEEE conference on communications and network security (CNS 2015), IEEE, pp 506–514
10. Cimiano P (2006) Ontology learning and population from text: algorithms, evaluation and applications. Springer, Berlin
11. Ciriani V, De Capitani di Vimercati S, Foresti S, Jajodia S, Paraboschi S, Samarati P (2011) Selective data outsourcing for enforcing privacy. J Comput Secur 19(3):531–566
12. CLARUS—a Framework for user centred privacy and security in the cloud, H2020 project (2015–2017). http://www.clarussecure.eu
13. Clifton C, Kantarcioglu M, Vaidya J, Lin X, Zhu M (2002) Tools for privacy preserving distributed data mining. ACM SIGKDD Explor Newsl 4(2):28–34
14. Domingo-Ferrer J, Ricci S, Domingo-Enrich C (2018) Outsourcing scalar products and matrix products on privacy-protected unencrypted data stored in untrusted clouds. Inf Sci 436–437:320–342
15. Domingo-Ferrer J, Sánchez D, Rufian-Torrell G (2013) Anonymization of nominal data based on semantic marginality. Inf Sci 242:35–48
16. Domingo-Ferrer J, Torra V (2005) Ordinal, continuous and heterogeneous $k$-anonymity through microaggregation. Data Min Knowl Discov 11(2):195–212
17. Du W, Han Y, Chen S (2004) Privacy-preserving multivariate statistical analysis: linear regression and classification. In: SDM, vol 4. SIAM, pp 222–233
18. Dubovitskaya A, Urovi V, Vasirani M, Aberer K, Schumacher M (2015) A cloud-based eHealth architecture for privacy preserving data integration. In: ICT systems security and privacy protection, Springer, pp 585–598
19. Fu Z, Sun X, Ji S, Xie G (2016) Towards efficient content-aware search over encrypted outsourced data in cloud. In: Computer communications, IEEE INFOCOM 2016-the 35th annual IEEE international conference, IEEE, pp 1–9
20. General data protection regulation. European Union. http://www.gdpr-info.eu
21. Ghattas B, Michel P, Boyer L (2017) Clustering nominal data using unsupervised binary decision trees: comparisons with the state of the art methods. Pattern Recognit 67:177–185
22. Gelman A (2005) Analysis of variance—why it is more important than ever. Ann Stat 33(1):1–53
23. Goethals B, Laur S, Lipmaa H, Mielikäinen T (2005) On private scalar product computation for privacy-preserving data mining. In: Information security and cryptology—ICISC 2004, LNCS, vol 3506, Springer, pp 104–120
24. Hundepool A, Domingo-Ferrer J, Franconi L, Giessing S, Schulte Nordholt E, Spicer K, De Wolf P-P (2006) Statistical disclosure control. Wiley, Hoboken
25. Karr A, Lin X, Sanil A, Reiter J (2009) Privacy-preserving analysis of vertically partitioned data using secure matrix products. J Off Stat 25(1):125–138
26. Lei X, Liao X, Huang T, Li H, Hu C (2013) Outsourcing large matrix inversion computation to a public cloud. IEEE Trans Cloud Comput 1(1):78–87
27. Lei X, Liao X, Huang T, Heriniaina F (2014) Achieving security, robust cheating resistance, and high-efficiency for outsourcing large matrix multiplication computation to a malicious cloud. Inf Sci 280:205–217
28. Li H, Yang Y, Luan TH, Liang X, Zhou L, Shen XS (2016) Enabling fine-grained multi-keyword search supporting classified sub-dictionaries over encrypted cloud data. IEEE Trans Dependable Secur Comput 13(3):312–325
29. Li L, Lu R, Choo KK, Datta A, Shao J (2016) Privacy-preserving-outsourced association rule mining on vertically partitioned databases. IEEE Trans Inf Forensics Secur 11(8):1847–1861
30. Lin D (1998) An information-theoretic definition of similarity. In: Proceedings of the 15th international conference on machine learning, ICML 1998, pp 296–304
31. Nassar M, Erradi A, Sabry F, Malluhi QM (2014) Secure outsourcing of matrix operations as a service. In: IEEE CLOUD 2013, IEEE, pp 918–925
32. Paillier P (1999) Public-key cryptosystems based on composite degree residuosity classes. In: Advances in cryptology—EUROCRYPT ’99, LNCS, vol 1592, Springer, pp 223–238
33. Rada R, Mili H, Bicknell E, Blettner M (1989) Development and application of a metric on semantic nets. IEEE Trans Syst Man Cybern 19(1):17–30
34. Ren K, Wang C, Wang Q (2012) Security challenges for the public cloud. IEEE Internet Comput 16(1):69–73
35. Resnik P (1995) Using information content to evaluate semantic similarity in a taxonomy. In: Proceedings of the 14th international joint conference on artificial intelligence, IJCAI, vol 1, pp 448–453
36. Ricci S, Domingo-Ferrer J, Sánchez D (2016) Privacy-preserving cloud-based statistical analyses on sensitive categorical data. In: Modeling decisions for artificial intelligence, Springer, pp 227–238
37. Rodríguez-García M, Batet M, Sánchez D (2017) A semantic framework for noise addition with nominal data. Knowl Based Syst 112:103–118
38. Samarati P (2001) Protecting respondents’ identities in microdata release. IEEE Trans Knowl Data Eng 13(6):1010–1027
39. Sánchez D, Batet M (2017) Privacy-preserving data outsourcing in the cloud via semantic data splitting. Comput Commun 110:187–201
40. Sánchez D, Batet M, Isern D, Valls A (2012) Ontology-based semantic similarity: a new feature-based approach. Expert Syst Appl 39(9):7718–7728
41. Sánchez D, Batet M, Isern D (2011) Ontology-based information content computation. Knowl Based Syst 24(2):297–303
42. Sánchez D, Batet M, Martínez S, Domingo-Ferrer J (2015) Semantic variance: an intuitive measure for ontology accuracy evaluation. Eng Appl Artif Intell 39:89–99
43. SNOMED-CT Ontology. https://en.wikipedia.org/wiki/SNOMED_CT
44. Sun Y, Yu Y, Li X, Zhang K, Qian H, Zhou Y (2016) Batch verifiable computation with public verifiability for outsourcing polynomials and matrix computations. In: Australasian conference on information security and privacy—ACISP 2016, Lecture Notes in Computer Science, vol 9722, Springer, pp 293–309
45. Székely GJ, Rizzo ML (2009) Brownian distance covariance. Ann Appl Stat 3(4):1236–1265
46. Taha A, Hadi AS (2016) Pair-wise association measures for categorical and mixed data. Inf Sci 346:73–89
47. Tugrul B, Polat H (2014) Privacy-preserving kriging interpolation on partitioned data. Knowl Based Syst 62:38–46
48. U.S. Federal Trade Commission: Data Brokers, A Call for Transparency and Accountability (2014)
49. Wang I-C, Shen C-H, Hsu T-S, Liao C-C, Wang DW, Zhan J (2009) Towards empirical aspects of secure scalar product. IEEE Trans Syst Man Cybern Part C 39(4):440–447
50. Wu Z, Palmer M (1994) Verbs semantics and lexical selection. In: Proceedings of the annual meeting of the association for computational linguistics, pp 133–139
51. Xia Z, Wang X, Sun X, Wang Q (2016) A secure and dynamic multi-keyword ranked search scheme over encrypted cloud data. IEEE Trans Parallel Distrib Syst 27(2):340–352
52. Yang JJ, Li JQ, Niu Y (2015) A hybrid solution for privacy preserving medical data sharing in the cloud environment. Future Gener Comput Syst 43:74–86
53. Zhang X, Boscardin WJ, Belin TR, Wan X, He Y, Zhang K (2015) A Bayesian method for analyzing combinations of continuous, ordinal, and nominal categorical data with missing values. J Multivar Anal 135:43–58

Acknowledgements

Partial support to this work has been received from the European Commission (projects H2020-700540 “CANVAS” and H2020-644024 “CLARUS”), from the Government of Catalonia (ICREA Acadèmia Prize to J. Domingo-Ferrer and grant 2017 SGR 705), and from the Spanish Government (projects RTI2018-095094-B-C21 “CONSENT” and TIN2016-80250-R “Sec-MCloud”). The authors are with the UNESCO Chair in Data Privacy, but the views in this paper are the authors’ own and are not necessarily shared by UNESCO.

Author information

Authors and Affiliations

Department of Computer Science and Mathematics, UNESCO Chair in Data Privacy, CYBERCAT-Center for Cybersecurity Research of Catalonia, Universitat Rovira i Virgili, Av. Països Catalans 26, 43007, Tarragona, Catalonia
Josep Domingo-Ferrer, David Sánchez & Mónica Muñoz-Batista
Department of Telecommunications, Brno University of Technology, Technická 3058/10, 61600, Brno, Czech Republic
Sara Ricci

Corresponding author

Correspondence to Josep Domingo-Ferrer.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

A Semantic distance calculation

The semantic distance quantifies the difference between the meanings of two nominal values. Semantic similarity/distance measures rely on the semantic evidence gathered from knowledge bases, such as ontologies, which taxonomically structure the concepts of a domain of knowledge [7]. Formally, an ontology ${\mathcal {O}}$ is composed, at least, of a set of concepts or classes C organized in a directed acyclic graph (due to multiple inheritance) by means of is-a ($c_i < c_j$) relationships [10], as shown in Fig. 2.

Measuring the semantic distance in large ontologies can be costly. In this section, we discuss the computational cost of some well-known measures by relying on the concepts introduced in the following definition.

Definition 1

Let $S(\mathbf{X^a})$ be the set of subsumers (i.e., taxonomic ancestors) of the nominal values of attribute $\mathbf{X^a}$ mapped in an ontology ${\mathcal {O}}$. The least common subsumer of $\mathbf{X^a}$, denoted by $LCS(\mathbf{X^a})$, is the most specific concept in $S(\mathbf{X^a})$. Formally,

$$\begin{aligned} S(\mathbf{X^a})= & {} \{ c_i \in {\mathcal {O}} | \forall c_j \in \mathbf{X^a} : c_j \le c_i\}; \\ LCS(\mathbf{X^a})= & {} \{ c \in S(\mathbf{X^a}) | \forall c_i \in S(\mathbf{X^a}) : c \le c_i\}. \end{aligned}$$
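Definition 1 can be illustrated with a minimal Python sketch. The toy taxonomy `PARENTS` and the function names are hypothetical and are not taken from the paper; the order $\le$ is assumed to be reflexive, so that every concept subsumes itself.

```python
# Hypothetical toy ontology: each concept maps to its direct is-a parents.
# "pneumonia" has two parents, so the graph is a DAG rather than a tree.
PARENTS = {
    "entity": [],
    "disease": ["entity"],
    "infection": ["disease"],
    "respiratory_disease": ["disease"],
    "pneumonia": ["infection", "respiratory_disease"],
    "asthma": ["respiratory_disease"],
    "flu": ["infection"],
}


def ancestors(c, parents):
    """Return c together with all its taxonomic ancestors (reflexive closure)."""
    seen, stack = {c}, [c]
    while stack:
        node = stack.pop()
        for p in parents[node]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def common_subsumers(values, parents):
    """S(X^a): concepts c_i such that c_j <= c_i for every value c_j."""
    return set.intersection(*(ancestors(v, parents) for v in values))


def least_common_subsumers(values, parents):
    """LCS(X^a): elements c of S(X^a) with c <= c_i for every c_i in S(X^a)."""
    s = common_subsumers(values, parents)
    return {c for c in s if all(ci in ancestors(c, parents) for ci in s)}


S = common_subsumers(["pneumonia", "asthma"], PARENTS)
LCS = least_common_subsumers(["pneumonia", "asthma"], PARENTS)
```

The result is returned as a set, mirroring the set-builder form of the definition; in a DAG with multiple inheritance it may be empty when no single common subsumer lies below all the others.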

The semantic distance is defined as a function $d_s: {\mathcal {O}} \times {\mathcal {O}} \rightarrow {\mathbb {R}}$ mapping a pair of concepts (corresponding to nominal values) to a real number that quantifies the difference between their meanings. According to the calculation principle employed, ontology-based measures can be divided into three families:

1. Edge-counting measures.
2. Feature-based measures.
3. Measures based on information content.

A.1 Edge-counting measures

They estimate the semantic distance between concept pairs as a function of the length of the taxonomic path connecting the two concepts in the ontology [33].

A well-known edge-counting measure was proposed by Wu and Palmer [50]:

$$\begin{aligned} d_{\text {WP}} (c_1 , c_2) = 1 - \frac{2\times \text {depth}(LCS( c_1 , c_2 ))}{\mathrm{denominator}}, \end{aligned}$$

(10)

where ${\mathrm{denominator}} = 2\times \text {depth}(LCS( c_1 , c_2 )) + \text {path}(c_1, LCS( c_1 , c_2 )) + \text {path}(c_2, LCS( c_1 , c_2 ))$; $LCS(c_1 ,c_2)$ is the most specific subsumer of $c_1$ and $c_2$ in the ontology; $\text {depth}(LCS( c_1 , c_2 ))$ is the number of nodes in the longest taxonomic path between $LCS(c_1 ,c_2 )$ and the root node of the taxonomy; and $\text {path}(c_i, LCS( c_1, c_2 ))$ is the number of taxonomic edges in the shortest taxonomic path between $c_i$ and $LCS(c_1, c_2)$.

Simplicity is the main advantage of edge-counting measures. However, they present some shortcomings: (1) if they are applied to ontologies incorporating multiple taxonomical inheritance, several taxonomical paths are not taken into account, and (2) by considering only the paths (i.e., subsumers) between the concepts, much of the taxonomical knowledge explicitly modeled in the ontology is ignored.

Assuming that concepts in the ontology are linked with their ancestors through pointers, in the worst case (comparing the two most specific concepts in the ontology that have the root node as LCS), obtaining the $LCS(c_1 ,c_2)$ requires running through the longest path in the taxonomy, i.e., twice the taxonomy depth D. Therefore, it takes O(D) cost to compute Expression (10).
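As a rough illustration of Expression (10), the following Python sketch evaluates the Wu and Palmer distance on a hypothetical toy taxonomy. The taxonomy, the function names and the tie-breaking rule are assumptions made for the example: when multiple inheritance yields several common subsumers, the deepest one is taken as the LCS.

```python
# Hypothetical toy ontology: each concept maps to its direct is-a parents.
PARENTS = {
    "entity": [],
    "disease": ["entity"],
    "infection": ["disease"],
    "respiratory_disease": ["disease"],
    "pneumonia": ["infection", "respiratory_disease"],
    "asthma": ["respiratory_disease"],
    "flu": ["infection"],
}


def ancestors(c):
    """Return c together with all its taxonomic ancestors."""
    seen, stack = {c}, [c]
    while stack:
        for p in PARENTS[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def depth(c):
    """Number of nodes on the longest path from c up to the root."""
    if not PARENTS[c]:
        return 1
    return 1 + max(depth(p) for p in PARENTS[c])


def path_len(c, target):
    """Number of edges on the shortest upward path from c to target (BFS)."""
    frontier, seen, dist = [c], {c}, 0
    while frontier:
        if target in frontier:
            return dist
        nxt = []
        for node in frontier:
            for p in PARENTS[node]:
                if p not in seen:
                    seen.add(p)
                    nxt.append(p)
        frontier, dist = nxt, dist + 1
    raise ValueError(f"{target} does not subsume {c}")


def lcs(c1, c2):
    """Assumed rule: the deepest common subsumer (sorted for determinism)."""
    return max(sorted(ancestors(c1) & ancestors(c2)), key=depth)


def d_wp(c1, c2):
    """Wu and Palmer distance as written in Expression (10)."""
    l = lcs(c1, c2)
    numerator = 2 * depth(l)
    denominator = numerator + path_len(c1, l) + path_len(c2, l)
    return 1 - numerator / denominator
```

Each call walks only upward pointers from the two concepts, which is in line with the O(D) cost discussed above for a taxonomy without multiple inheritance.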

A.2 Feature-based measures

They consider the degree of overlap between the sets of ontological features of the concepts to be compared. In [40], the authors suggested measuring the semantic distance as a function of taxonomic features, i.e., as the ratio between the number of non-common taxonomic ancestors and the total number of ancestors of the two concepts:

$$\begin{aligned}&d_{\mathrm{log}\text {SC}} ( c_1 , c_2 ) \nonumber \\&\quad = \log _{2} \left( 1 + \frac{|S(c_1)\cup S(c_2)|-|S(c_1)\cap S(c_2)|}{|S(c_1)\cup S(c_2)|}\right) , \end{aligned}$$

(11)

where $S(c_i)$ is the set of taxonomic subsumers of the concept $c_i$, for $i = 1,2$. Due to the additional knowledge feature-based measures take into account (i.e., multiple direct ancestors in case of multiple inheritance), they tend to be more accurate than edge-counting measures [40].

If S is the maximum number of ancestors that a concept can have in the ontology, computing Expression (11) takes O(S) cost. Notice that, for ontologies without multiple inheritance, this cost is the same as that of edge-counting measures.
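Expression (11) reduces to set operations on the subsumer sets, as the following Python sketch suggests. The toy taxonomy and function names are hypothetical, and $S(c)$ is assumed to include $c$ itself.

```python
import math

# Hypothetical toy ontology: each concept maps to its direct is-a parents.
PARENTS = {
    "entity": [],
    "disease": ["entity"],
    "infection": ["disease"],
    "respiratory_disease": ["disease"],
    "pneumonia": ["infection", "respiratory_disease"],
    "asthma": ["respiratory_disease"],
    "flu": ["infection"],
}


def subsumers(c):
    """S(c): c plus all its taxonomic ancestors (assumed to include c)."""
    seen, stack = {c}, [c]
    while stack:
        for p in PARENTS[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_logsc(c1, c2):
    """Feature-based distance as written in Expression (11)."""
    s1, s2 = subsumers(c1), subsumers(c2)
    union, inter = s1 | s2, s1 & s2
    return math.log2(1 + (len(union) - len(inter)) / len(union))
```

For "pneumonia" and "asthma" the union has 6 concepts and the intersection 3, so the distance is $\log_2(1.5)$. Both parents of "pneumonia" contribute to the union, which is the extra evidence that edge-counting measures ignore.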

A.3 Measures based on information content

They measure the semantic distance between two concepts as the inverse of the amount of information they share in the ontology, which is represented by their LCS [35]. In particular, Lin [30] proposed a measure that, expressed as a distance, is the complement of the ratio between the information content of the LCS of the concepts and the sum of the information content of each concept:

$$\begin{aligned} d_{\text {lin}}(c_1,c_2) =1- \frac{IC(LCS(c_1,c_2))}{IC(c_1)+IC(c_2)}. \end{aligned}$$

(12)

In [41], IC(c) is intrinsically estimated within the ontology as the normalized ratio between the number of leaves (i.e., terminal hyponyms) under concept c in the taxonomy and the number of subsumers of c:

$$\begin{aligned} IC(c) = - \log \left( \frac{\frac{|\text {leaves}(c)|}{|S(c)|}+1}{|\text {max\_leaves}+1|}\right) . \end{aligned}$$

(13)

Because IC-based measures exploit the largest amount of ontological evidence (i.e., ancestors and leaves), they achieve better accuracy than edge-counting and feature-based measures [6].

Expression (12) requires computing the LCS of the two concepts, plus the ICs of the LCS and the concepts. As in edge-counting measures, computing the LCS has a worst-case complexity O(D). On the other hand, Expression (13) requires obtaining all the possible concepts connected to c, either subsumers or hyponyms; hence, in the worst case (i.e., when c is the root node, which subsumes all the concepts in the ontology), the IC computation takes O(C) cost, where C is the total number of concepts in the taxonomy. In conclusion, Expression (12) has $O(C+D)$ computational cost. Thus, IC-based measures are not only the most accurate but also the costliest.
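Expressions (12) and (13) can be combined in a short Python sketch. Several conventions below are assumptions rather than statements from the paper: the toy taxonomy is hypothetical, $S(c)$ includes $c$, $\text{leaves}(c)$ excludes $c$ (so a leaf has no leaves under it), max_leaves is the number of leaves under the root, and the LCS is taken as the common subsumer with the highest IC. The optional `factor` argument is also hypothetical: the default 1 follows Expression (12) as written, whereas Lin's original similarity carries a factor 2 in the numerator, which makes the distance from a concept to itself equal to 0.

```python
import math

# Hypothetical toy ontology: each concept maps to its direct is-a parents.
PARENTS = {
    "entity": [],
    "disease": ["entity"],
    "infection": ["disease"],
    "respiratory_disease": ["disease"],
    "pneumonia": ["infection", "respiratory_disease"],
    "asthma": ["respiratory_disease"],
    "flu": ["infection"],
}
# Inverted map: each concept to its direct hyponyms.
CHILDREN = {c: [] for c in PARENTS}
for child, ps in PARENTS.items():
    for p in ps:
        CHILDREN[p].append(child)


def closure(c, graph):
    """c plus everything reachable from c by following the edges of graph."""
    seen, stack = {c}, [c]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def subsumers(c):
    return closure(c, PARENTS)


def leaves(c):
    """Terminal hyponyms strictly below c."""
    return {d for d in closure(c, CHILDREN) if d != c and not CHILDREN[d]}


MAX_LEAVES = len(leaves("entity"))  # leaves under the root


def ic(c):
    """Intrinsic information content as written in Expression (13)."""
    return -math.log((len(leaves(c)) / len(subsumers(c)) + 1) / (MAX_LEAVES + 1))


def d_lin(c1, c2, factor=1.0):
    """Distance of Expression (12); factor=2 gives Lin's original scaling."""
    common = subsumers(c1) & subsumers(c2)
    lcs = max(sorted(common), key=ic)  # assumed: most informative subsumer
    return 1 - factor * ic(lcs) / (ic(c1) + ic(c2))
```

Computing `ic` for the root visits every concept in the taxonomy, which corresponds to the O(C) worst case noted above. The sketch does not guard against a zero denominator, which occurs only when both arguments are the root.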

B Security of the scalar product protocols used

B.1 Proof of Proposition 1

Charlie receives $\varvec{r}'_x$ from Alice. But $\varvec{r}'_x$ can be obtained as the difference between $\hat{\varvec{x}}'+{\varvec{k}}$ and ${\varvec{x}}+{\varvec{k}}$, where ${\varvec{k}}$ is an n-vector with all its components set to k and k is any real number. Hence, Charlie learns nothing about $\varvec{x}$. A similar argument shows that Charlie learns nothing about $\varvec{y}$.

Bob receives $\hat{\varvec{x}}'$ from Alice and ${\varvec{r}}_y$ from Charlie. Clearly, ${\varvec{r}}_y$ contains no information on ${\varvec{x}}$. On the other hand,

$$\begin{aligned} \hat{\varvec{x}}' = {{{\mathcal {P}}}}_x(\hat{\varvec{x}})= {{{\mathcal {P}}}}_x({\varvec{x}}+{\varvec{r}}_x). \end{aligned}$$

Since ${{{\mathcal {P}}}}_x$ is a random permutation, the probability of Bob’s learning $\hat{\varvec{x}}$ from $\hat{\varvec{x}}'$ is 1 over the number of distinct permutations of $\hat{\varvec{x}}$, that is

$$\begin{aligned} \frac{n^x_1! n^x_2! \ldots n^x_{d_x}!}{n!}, \end{aligned}$$

where $d_x$ is the number of different values among the n values of $\hat{\varvec{x}}$, and $n^x_i$ is the number of repetitions of the ith different value. Since $\hat{\varvec{x}}$ is the result of adding a random vector to ${\varvec{x}}$, it is highly unlikely that $\hat{\varvec{x}}$ contains repeated values, so the probability of Bob’s learning $\hat{\varvec{x}}$ is very low. Furthermore, Bob does not know ${\varvec{r}}_x$. Without knowledge of $\hat{\varvec{x}}$ and ${\varvec{r}}_x$, Bob cannot learn $\varvec{x}$.

The argument on the inability of Alice to learn $\varvec{y}$ is analogous.
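The probability used in the proof, $n^x_1! \cdots n^x_{d_x}!/n!$, can be evaluated exactly with a small Python helper. The function name and the sample vectors are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def prob_recover_order(permuted):
    """Probability of guessing the original order of a permuted vector:
    the product of the factorials of the multiplicities, divided by n!."""
    numerator = 1
    for count in Counter(permuted).values():
        numerator *= factorial(count)
    return Fraction(numerator, factorial(len(permuted)))


with_repeats = prob_recover_order([3.1, 4.7, 3.1, 9.2])   # multiplicities 2, 1, 1
all_distinct = prob_recover_order([1.5, 2.5, 3.5, 4.5])   # multiplicities 1, 1, 1, 1
```

With all values distinct the probability falls to $1/n!$, which is the case the proof argues is overwhelmingly likely after a random vector has been added.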

B.2 On the security of Protocol 2

Protocol 2 is a variation of a protocol proposed in [23]. The latter protocol takes place only between Alice and Bob and there is no CLARUS proxy. Thus it differs from Protocol 2 in the last three steps, which are as follows:

4. Bob generates a random plaintext $s_B$, a random number $r'$ and sends $\omega ' = \omega Enc_{p_k}(-s_B;r')$ to Alice.
5. Alice computes $s_A = Dec_{s_k}(\omega ') = \varvec{x}^T \varvec{y} - s_B$.
6. Alice and Bob simultaneously exchange the values $s_A$ and $s_B$, respectively, so that both can compute $s_A + s_B = \varvec{x}^T \varvec{y}$.

The authors of [23] prove that, if Paillier’s cryptosystem is secure, Alice cannot learn ${\varvec{y}}$ and Bob cannot learn ${\varvec{x}}$ in their protocol.

The only modification introduced by Protocol 2 is that Alice and Bob do not share their results $s_A$ and $s_B$, but they send these values to CLARUS. Since neither Alice nor Bob has more information than in the protocol of [23], the security of the latter protocol is preserved in Protocol 2.
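The message flow discussed in this appendix can be sketched in Python with a toy Paillier cryptosystem. This is an illustration under several assumptions, not the authors' implementation: the first three steps (Alice encrypts her components, Bob combines the ciphertexts homomorphically) are not listed above and are assumed to follow the usual Paillier-based scalar product pattern; the primes are far too small to be secure; `random.Random` is not a cryptographic generator; and all function names are hypothetical.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key pair with generator g = n + 1 (insecure key size)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse of lam, needs Python 3.8+
    return n, (lam, mu)


def random_unit(n, rng):
    """Random r in [1, n) that is coprime to n."""
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            return r


def enc(n, m, rng):
    """Enc(m; r) = (1 + m*n) * r^n mod n^2, with m reduced modulo n."""
    n2 = n * n
    return (1 + (m % n) * n) % n2 * pow(random_unit(n, rng), n, n2) % n2


def dec(n, sk, c):
    """Dec(c) = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) / n."""
    lam, mu = sk
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


def scalar_product_shares(x, y, rng):
    """Return the additive shares s_A and s_B of x . y, plus the modulus n."""
    n, sk = keygen()                              # Alice: key generation
    n2 = n * n
    ciphertexts = [enc(n, xi, rng) for xi in x]   # Alice -> Bob (assumed step)
    omega = 1
    for c, yi in zip(ciphertexts, y):             # Bob: omega encrypts x . y
        omega = omega * pow(c, yi, n2) % n2
    s_b = rng.randrange(n)                        # Step 4: Bob's random share
    omega_prime = omega * enc(n, -s_b, rng) % n2  # Step 4: Bob -> Alice
    s_a = dec(n, sk, omega_prime)                 # Step 5: Alice's share
    return s_a, s_b, n


rng = random.Random(0)
s_a, s_b, n = scalar_product_shares([1, 2, 3], [4, 5, 6], rng)
result = (s_a + s_b) % n  # Protocol 2: CLARUS adds the two shares
```

The final addition is the only work left to CLARUS in this sketch. Shares are combined modulo n, so the true scalar product must be smaller than n for the result to be exact.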

About this article

Cite this article

Domingo-Ferrer, J., Sánchez, D., Ricci, S. et al. Outsourcing analyses on privacy-protected multivariate categorical data stored in untrusted clouds. Knowl Inf Syst 62, 2301–2326 (2020). https://doi.org/10.1007/s10115-019-01424-4

Received: 22 February 2019
Revised: 02 November 2019
Accepted: 02 November 2019
Published: 15 November 2019
Issue Date: June 2020
DOI: https://doi.org/10.1007/s10115-019-01424-4
