A roadmap for privacy-enhanced secure data provenance

Journal of Intelligent Information Systems

Abstract

The notion of data provenance was formally introduced a decade ago and has since been investigated, but mainly from a functional perspective. This follows the historical pattern of introducing new technologies with the expectation that security and privacy can be added later. Despite recent interest from the cyber security community in some specific aspects of data provenance, there is no long-haul, overarching, systematic framework for the security and privacy of provenance. The importance of secure provenance R&D has been emphasized in the recent report on Federal game-changing R&D for cyber security, especially with respect to the theme of Tailored Trustworthy Spaces. Secure data provenance can significantly enhance data trustworthiness, which is crucial to various decision-making processes. Moreover, data provenance can facilitate accountability and compliance (including compliance with the privacy preferences and policies of relevant users), can be an important factor in access control and usage control decisions, and can be valuable in data forensics. Along with these potential benefits, data provenance also poses a number of security and privacy challenges. For example, provenance sometimes needs to be confidential, so that it is visible only to properly authorized users, and the identity of entities in the provenance must also be protected from exposure. We thus need to achieve high assurance of provenance without compromising the privacy of those in the chain that produced the data. Moreover, if we expect voluntary large-scale participation in provenance-aware applications, we must ensure that the privacy of the individuals or organizations involved will be maintained. It is incumbent on the cyber security community to develop a technical and scientific framework that addresses these security and privacy challenges, so that our society can gain maximum benefit from this technology.
In this paper, we discuss a framework of theoretical foundations, models, mechanisms and architectures that allow applications to benefit from privacy-enhanced and secure use of provenance in a modular fashion. After introducing the main components of such a framework and the notion of provenance life cycle, we discuss in detail the research questions and issues concerning each component, together with related approaches.

Acknowledgments

The work reported in this paper has been partially supported by NSF under grant CNS-1111512.

Author information

Corresponding author

Correspondence to Elisa Bertino.

About this article

Cite this article

Bertino, E., Ghinita, G., Kantarcioglu, M. et al. A roadmap for privacy-enhanced secure data provenance. J Intell Inf Syst 43, 481–501 (2014). https://doi.org/10.1007/s10844-014-0322-7

  • DOI: https://doi.org/10.1007/s10844-014-0322-7
