ABSTRACT
Link discovery is the process of identifying associations among entities in a complex network structure. These associations may represent any interaction among entities, for example between people or even between bank accounts. The need for link discovery arises in many applications, including law enforcement, counter-terrorism, social network analysis, intrusion detection, and fraud detection. Given the sensitive nature of the information that link discovery can reveal, privacy is a major concern for both individuals and organizations. For example, in financial fraud detection, linking transactions may reveal sensitive information about individuals who are not involved in any fraud. In this paper, we propose an approach for privacy-preserving link discovery. We show how the problem can be reduced to finding the transitive closure of a graph. To find the transitive closure, we propose a secure split-matrix multiplication protocol based on secure scalar product computations. We analyze the performance and usability of the proposed approach.
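The reduction to transitive closure can be illustrated with a minimal plaintext sketch. It is not the protocol proposed in the paper and offers no privacy: it only shows how reachability between entities follows from repeated Boolean matrix multiplication, where every output entry is a scalar product of a row and a column. The function names (`scalar_product`, `bool_matmul`, `transitive_closure`) and the example graph are hypothetical, and `scalar_product` merely marks the place where a secure scalar product computation would be substituted.

```python
def scalar_product(u, v):
    """Plain dot product (assumption: stand-in for a secure scalar product)."""
    return sum(a * b for a, b in zip(u, v))


def bool_matmul(a, b):
    """Boolean product of two square 0/1 matrices, one scalar product per entry."""
    n = len(a)
    # Precompute the columns of b so each entry is a row-by-column scalar product.
    cols = [[b[k][j] for k in range(n)] for j in range(n)]
    return [
        [1 if scalar_product(a[i], cols[j]) > 0 else 0 for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(adj):
    """Reachability matrix of a directed graph given as a 0/1 adjacency matrix."""
    n = len(adj)
    # Adding self-loops means that (A + I)^k covers all paths of length <= k.
    reach = [[1 if i == j or adj[i][j] else 0 for j in range(n)] for i in range(n)]
    covered = 1  # path length currently covered by `reach`
    while covered < n:
        reach = bool_matmul(reach, reach)  # squaring doubles the covered length
        covered *= 2
    return reach


# Hypothetical example: entity 0 is linked to 1, 1 is linked to 2, 3 is isolated.
adj = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
closure = transitive_closure(adj)
print(closure[0][2])  # 1: entity 0 reaches entity 2 through entity 1
print(closure[0][3])  # 0: entity 3 is not linked to entity 0
```

Repeated squaring is used here because it needs only about log2(n) matrix products; in a privacy-preserving setting each product would be costly, so keeping the number of multiplications small is a natural design choice.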