Design and implementation of an access control processor for XML documents

https://doi.org/10.1016/S1389-1286(00)00053-0

Abstract

More and more information is distributed in XML format, both on corporate Intranets and on the global Net. This paper describes an access control system for XML that allows access restrictions to be defined and enforced directly on the structure and content of XML documents, thus giving users a simple and effective way to protect information at the same level of granularity provided by the language itself.

Introduction

As more and more information is made available in eXtensible Markup Language (XML) format, both on corporate Intranets and on the global Net, developers and end-users are raising concerns about XML security. Early research on XML was not directly related to access control and security, because XML was initially introduced as a data format for documents; many researchers therefore assumed that well-known techniques for securing documents could be applied straightforwardly to XML data. However, the way XML is being positioned has led some to question whether additional measures will be necessary.

For example, in the scenario of the upcoming FASTER (Flexible Access to Statistics, Tables, and Electronic Resources) project, end-users will be able to control their interaction with Web sites by pulling the information they are interested in out of dynamically generated XML documents. However, different users may well have different interests or access authorizations, and XML-enabled servers will need to know which data each user should get, at a finer level of granularity than whole documents. In other words, some FASTER applications will need to block or allow access to entire XML instances, while others will control access at the tag level. Tag-level control is particularly important in view of the wider use of the XLink and XPointer standards, which enable applications to retrieve portions of documents. Indeed, a clean model for dynamic access control with control over granularity is needed to allow XML documents to link to arbitrary XML chunks.

The same observation applies to authentication and encryption-based techniques, which naturally complement access control in our usage scenario. With authentication, the server will know what information can be sent to the user based on that user's identity or certified property (e.g., group membership), whereas encryption will only let users with adequate decryption keys see the message. Therefore, XML security should support the entire range from coarse-grained to fine-grained control. In the remainder of this section, we propose five basic requirements for standardizing XML access control at the tag level. Our requirements take into account the experience of other FASTER consortium partners, and are directed at large-scale knowledge management within organizations using XML, as well as at XML-based Internet applications.

  1. Support of authorizations at different organizational levels. Organizations may need to enforce security policies on huge document bases, often dynamically created from heterogeneous data sources; on the other hand, site administrators require full control over authorization specifications on single documents.

  2. Extension to existing Web server technology. XML documents are usually made available by means of Web sites, using a variety of HTTP-based protocols. XML access control must exploit current solutions in much the same way as cryptography-based services, without interfering with existing APIs and development tools.

  3. Fine-grained access control. Access control policies should be supported at all levels of granularity, including documents and individual XML elements.

  4. Transparency. The operation of the access control system should be as transparent as possible to requesters. A requester should not be aware of the information within a document that the access control system is hiding from them. The transparency of access control must be preserved by the presentation and rendering phases and may therefore impose constraints on the behavior of technologies such as CSS and XSL [18]. In particular, access control should preserve the validity of documents with respect to their DTDs.

  5. Seamless integration with existing technologies for user authentication (e.g., digital signatures). Access control should complement tag-level authentication based on digital signatures.

Fig. 1 depicts the conceptual architecture of our approach. A central authority uses a pool of XML DTDs to specify the format of information to be exchanged within the organization. XML documents that are instances of such DTDs are defined and maintained at each site, describing the site-specific information. The schema–instance relationship between XML documents and DTDs naturally supports the distinction between two levels of authorizations, both of them allowing for fine-grained specifications. Namely, we distinguish: (1) low-level authorizations, associated with XML documents, providing full control over authorizations on a document-by-document basis; (2) high-level authorizations, associated with XML DTDs, providing organization-wide and department-wide declarations of access permissions. Centrally specified DTD-level authorizations can be mandatory, stating requirements that the central authority imposes on the lower organizational levels where XML documents are created and managed, usually by means of a network of federated Web sites. This technique allows for easy, centralized modification of access permissions on large document sets, and provides a general, abstract way of specifying access authorizations. In other words, specifying authorizations at the DTD level cleanly separates access control specified via XML markup from access control policies defined for the individual data sources (e.g., relational databases vs. file systems), which differ from one another both in granularity and in abstraction level. Each departmental authority managing a Web site retains the right to define its own authorizations (again, at the granularity of XML tags) on individual documents, or on document sets by means of wild cards. In our model, local authorities can also define authorizations at the DTD level; however, such authorizations only apply to the documents of the local domain.
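The two levels of authorization described above can be pictured as plain records plus a small resolution function. The following Python sketch is illustrative only: the `Authorization` fields, the `decide` function, and the precedence order (mandatory DTD-level rules first, then document-level rules, then the remaining DTD-level rules, with denials winning among rules of equal rank) are assumptions made for this example and are not taken from the paper's specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Authorization:
    """Hypothetical record for one access rule (field names are illustrative)."""
    subject: str             # user or group the rule applies to
    target: str              # XML element tag protected by the rule
    permit: bool             # True grants read access, False denies it
    level: str               # "dtd" (schema-wide) or "document" (single instance)
    mandatory: bool = False  # only meaningful for centrally issued DTD-level rules


def rank(rule):
    """Assumed precedence: lower rank wins."""
    if rule.level == "dtd" and rule.mandatory:
        return 0  # imposed by the central authority
    if rule.level == "document":
        return 1  # local, document-by-document control
    return 2      # ordinary DTD-level declarations


def decide(rules, subject, tag, default=False):
    """Return True if `subject` may read elements named `tag`."""
    matching = [r for r in rules if r.subject == subject and r.target == tag]
    if not matching:
        return default  # assumption: closed policy when nothing applies
    best = min(rank(r) for r in matching)
    top = [r for r in matching if rank(r) == best]
    # Assumption: among equally ranked rules, a single denial is enough to deny.
    return all(r.permit for r in top)


rules = [
    Authorization("alice", "salary", True, "document"),
    Authorization("alice", "salary", False, "dtd", mandatory=True),
    Authorization("alice", "name", False, "dtd"),
    Authorization("alice", "name", True, "document"),
]
print(decide(rules, "alice", "salary"))  # mandatory DTD-level denial prevails
print(decide(rules, "alice", "name"))    # document-level rule overrides plain DTD rule
```

Keeping the precedence logic in a single `rank` function makes it easy to experiment with other conflict-resolution policies without touching the rule records.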

Section snippets

Authorization specification

The architectural framework depicted in Fig. 1 describes the basic components taking part in the specification of access and protection requirements. We now discuss their specification. Before introducing the form and semantics of the authorizations supported by our model, we describe the basic features that they need to provide to satisfy requirements 1 and 3 discussed in the introduction.

Authorizations

The list of features illustrated in the previous section outlines the form and semantics of the authorizations supported by our model. We can then summarize the discussion above and introduce our authorizations as follows:

  • Authorizations can be specified at the level of a DTD (schema) or specific documents (instance). DTD authorizations can be specified either at the global organization level or at the local site. Document authorizations can be specified at the local site.

  • Both DTD and XML

Authorization enforcement

For each possible requester (user connected from a certain location) and document, the authorizations on the document applicable to the requester describe what information can or cannot be returned to the requester. Hence, given the request from a subject to access a document, the joint application of the DTD-level and document-level authorizations applicable to the subject will produce a custom view on the document, including only the information that a particular requester is entitled to see.
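As a rough illustration of how a custom view might be produced, the sketch below prunes a document with Python's standard `xml.etree.ElementTree` module. It assumes, purely for the example, that the authorizations applicable to the requester have already been reduced to two sets of denied element tags (one from the DTD level, one from the document level) and that their joint application is a simple union; the function name `compute_view` and this simplification are hypothetical and do not reproduce the paper's enforcement algorithm.

```python
import xml.etree.ElementTree as ET


def compute_view(xml_text, dtd_denied, doc_denied):
    """Return the requester's view of the document as a string, or None.

    Every element whose tag is denied at either level is removed together
    with its subtree. Assumption: the two levels are combined by union.
    """
    denied = set(dtd_denied) | set(doc_denied)
    root = ET.fromstring(xml_text)
    if root.tag in denied:
        return None  # nothing in the document is visible to the requester

    def prune(elem):
        # Iterate over a copy, because children are removed while looping.
        for child in list(elem):
            if child.tag in denied:
                # Note: this also drops the child's tail text, which is fine
                # for data-oriented XML without mixed content.
                elem.remove(child)
            else:
                prune(child)

    prune(root)
    return ET.tostring(root, encoding="unicode")


doc = "<staff><person><name>Ann</name><salary>100</salary></person></staff>"
print(compute_view(doc, dtd_denied={"salary"}, doc_denied=set()))
```

A sketch like this does not check that the pruned document is still valid with respect to its DTD; the transparency requirement from the introduction would need an additional step for that.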

Design and implementation guidelines

First of all, architectural design will be briefly discussed. Two main architectural patterns are currently used for the design of XML/XSL systems: server side and client side XSL processing (see Section 6). The former technique is common in association with translation to HTML and provides limited interaction: XML documents are translated to HTML before sending them to the client, avoiding the need for the client browser to provide XML support. The latter technique requires an XSL processor to

Related work

Conventional HTML tagging is aimed at defining page rendering and is seldom if ever related to information granulation. For this reason, access control mechanisms currently available for Web sites tend to be coarse-grained. For instance, the Apache Web server allows the specification of access control lists via a configuration file (access.conf) containing a list of users, hosts (IP addresses), or host/user pairs, which must be allowed/forbidden connection to the server.

An example

We now illustrate an example of authorization specification and document transformation.

Conclusions

We have presented an access control system providing fine-grained access control for XML documents. The approach proposed is focused on enforcing and resolving fine-grained authorizations with respect to the data model and semantics. Although presented in association with a specific approach to authorization specification and subject identification, as supported in the current prototype, its operation is independent from such approaches and could then be applied in combination with different

Acknowledgements

The work presented in this paper has been supported by Esprit Project 'W3I3', Esprit Project 'FASTER', MURST Project 'Data-X' and by the HP Internet Philanthropic Initiative.

References (21)

  • AlphaWorks, XML Security Suite, April 1999,...
  • T. Berners-Lee, R. Fielding and L. Masinter, Uniform Resource Identifiers (URI): Generic Syntax, 1998,...
  • F. Buschmann, R. Meunier, H. Rohnert, P. Sommerlad and M. Stal, Pattern-Oriented Software Architecture — A System of...
  • S. Castano, M.G. Fugini, G. Martella and P. Samarati, Database Security, Addison-Wesley, Reading, MA,...
  • S. Ceri, S. Comai, E. Damiani, P. Fraternali, S. Paraboschi and L. Tanca, XML-GL: A graphical language for querying and...
  • E. Damiani, S. De Capitani di Vimercati, S. Paraboschi and P. Samarati, Securing XML documents, in: Proc. 2000...
  • B. Gladman, C. Ellison and N. Bohm, Digital signatures, certificates and electronic commerce,...
  • S. Jajodia, P. Samarati, V.S. Subramanian and E. Bertino, A unified framework for enforcing multiple access control...
  • J. Kahan, WDAI: a simple World Wide Web distributed authorization infrastructure, in: Proc. 8th International World...
  • S. Lewontin and M.E. Zurko, The DCE project: providing authorizations and other distributed services to the World Wide...
There are more references available in the full text version of this article.

Ernesto Damiani holds a laurea degree in ingegneria elettronica from the University of Pavia and a PhD degree in computer science from the University of Milano. He is currently an assistant professor at the campus located in Crema of the University of Milano. His research interests include distributed and object-oriented systems, semi-structured information processing and soft computing.

Sabrina De Capitani di Vimercati is an assistant professor at Dipartimento di Elettronica per l' Automazione of the University of Brescia. Her research interests are in the area of information security, databases, and information systems. She has been an international fellow in the Computer Science Laboratory at SRI, CA (USA). She is co-recipient of the ACM-PODS'99 Best Newcomer Paper Award.

Stefano Paraboschi is an associate professor at the Dipartimento di Elettronica e Informazione of Politecnico di Milano. He received the laurea degree in ingegneria elettronica in 1990, and a PhD in ingegneria informatica in 1994, both from Politecnico di Milano. His main research interests are in the area of databases, with a focus on active databases, data warehouses, and the construction of data-intensive Web sites. He is the author, together with Paolo Atzeni, Stefano Ceri, and Riccardo Torlone, of the book 'Database Systems: Concepts, Languages and Architectures' (McGraw-Hill, 1999).

Pierangela Samarati is an associate professor at the Department of Computer Science of the University of Milan. Her main research interests are in data and application security. She has been a computer scientist in the Computer Science Laboratory at SRI, CA (USA). She has been a visiting researcher at the Computer Science Department of Stanford University, CA (USA), and at the ISSE Department of George Mason University, VA (USA). She is co-author of the book 'Database Security', Addison-Wesley, 1995. She is co-recipient of the ACM-PODS'99 Best Newcomer Paper Award.

E-mail: {decapita,samarati}@dsi.unimi.it
