DOI: 10.1145/1621995.1622041
research-article

Automatically identifying relations in privacy policies

Published: 05 October 2009

ABSTRACT

E-commerce privacy policies tend to contain many linguistic ambiguities that protect companies more than customers. The ambiguities found so far fall into four patterns: mitigation (downplaying frequency), enhancement (emphasizing nonessential qualities), obfuscation (hedging claims and obscuring causality), and omission (removing agents). A number of phrases have been identified as creating ambiguities within these four categories. When a customer accepts the terms and conditions of a privacy policy, mitigation words and phrases such as "occasionally" or "from time to time" actually give the e-commerce vendor permission to send as many spam email offers as it deems necessary. Our study uses techniques based on Latent Semantic Analysis to discover the underlying semantic relations between words in privacy policies. Additional potential ambiguities and other word relations are found automatically. Words are clustered by topic using principal directions, which yields a ranking of the most significant words within each clustered topic as well as a ranking of the privacy policy topics themselves. We also extract a signature that forms the basis of a typical privacy policy. These results lead to the design of Hermes, a system for analyzing privacy policies. Given an arbitrary privacy policy, Hermes provides a list of the potential ambiguities along with a score that represents its similarity to a typical privacy policy.
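The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' Hermes implementation: the toy corpus, the function names (`related_terms`, `signature_score`), the rank `K = 2`, and the choice of the first left singular vector as the "signature" are all assumptions made for illustration. The sketch builds a term-document matrix, applies a truncated singular value decomposition as in Latent Semantic Analysis, finds terms that lie close to a known mitigation word in the latent space, and scores a new policy by cosine similarity to the dominant principal direction.

```python
import numpy as np

# Hypothetical toy corpus; these snippets are illustrative, not from the paper.
POLICIES = [
    "we may occasionally send you email offers",
    "from time to time we may send promotional email",
    "we share your personal information with trusted partners",
    "your personal information may be shared with partners",
]

def build_term_document_matrix(docs):
    """Return (A, index) where A[i, j] counts term i in document j."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for w in doc.lower().split():
            A[index[w], j] += 1.0
    return A, index

def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(np.dot(u, v) / denom) if denom > 0 else 0.0

A, index = build_term_document_matrix(POLICIES)

# LSA step: SVD of the term-document matrix, truncated to K dimensions.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
K = 2  # assumed rank for this tiny corpus
term_vecs = U[:, :K] * s[:K]  # one row per term in the latent space

# Assumed "signature": the dominant principal direction over terms.
# The sign of a singular vector is arbitrary, so take absolute values.
signature = np.abs(U[:, 0])

def related_terms(word, top=3):
    """Terms closest to `word` in the latent space (candidate relations)."""
    q = term_vecs[index[word]]
    scored = [(cosine(q, term_vecs[i]), w) for w, i in index.items() if w != word]
    return [w for _, w in sorted(scored, reverse=True)[:top]]

def signature_score(policy_text):
    """Cosine similarity between a policy's term counts and the signature."""
    x = np.zeros(len(index))
    for w in policy_text.lower().split():
        if w in index:  # words outside the training vocabulary are ignored
            x[index[w]] += 1.0
    return cosine(x, signature)

print(related_terms("occasionally"))
print(round(signature_score("we may send you offers from time to time"), 3))
```

A realistic version would use a much larger corpus, stop-word removal, and a weighting scheme such as tf-idf before the decomposition; raw counts are used here only to keep the sketch short.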

          Published in

          SIGDOC '09: Proceedings of the 27th ACM international conference on Design of communication
          October 2009
          328 pages
          ISBN: 9781605585598
          DOI: 10.1145/1621995

          Copyright © 2009 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall Acceptance Rate: 355 of 582 submissions, 61%
