Bridging the gap between real world repositories and Scalable Preservation Environments


Abstract:

Integrating large-scale processing environments, such as Hadoop, with traditional repository systems, such as Fedora Commons 3, has long proved a daunting task. In this paper we show how this integration can be achieved using software developed in the SCAPE project. The SCAPE integration is based on four steps: retrieving the metadata records from the repository, reading the records and their references to data files, updating the records, and storing them back in the repository. This allows full use of the Hadoop system for massively distributed processing without causing excessive load on the repository.
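
The four steps named in the abstract can be illustrated with a minimal sketch. It is not the SCAPE software and does not use the Fedora Commons or Hadoop APIs: `Record`, `InMemoryRepository`, `characterise`, `process`, and `run` are hypothetical names, the repository is an in-memory stand-in, and the single bulk read and single bulk write are an assumption about how repository load might be kept low.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical metadata record: an id, file references, and properties."""
    record_id: str
    file_refs: list[str]
    properties: dict[str, str] = field(default_factory=dict)


class InMemoryRepository:
    """Stand-in for a repository; this is NOT the Fedora Commons API."""

    def __init__(self, records):
        self._records = {r.record_id: r for r in records}
        self.reads = 0   # counts bulk retrievals
        self.writes = 0  # counts bulk write-backs

    def retrieve_all(self):
        # Step 1: retrieve the metadata records (assumed: one bulk export).
        self.reads += 1
        return [Record(r.record_id, list(r.file_refs), dict(r.properties))
                for r in self._records.values()]

    def store_all(self, records):
        # Step 4: store the updated records back (assumed: one bulk write).
        self.writes += 1
        for r in records:
            self._records[r.record_id] = r

    def get(self, record_id):
        return self._records[record_id]


def characterise(file_ref):
    """Placeholder for per-file work; here it only extracts the extension."""
    return file_ref.rsplit(".", 1)[-1].lower()


def process(record):
    # Step 2: read the record and its references to data files.
    formats = sorted({characterise(ref) for ref in record.file_refs})
    # Step 3: update the record with the derived information.
    record.properties["formats"] = ",".join(formats)
    return record


def run(repo):
    records = repo.retrieve_all()
    # On a cluster this loop would be a distributed map over the records.
    updated = [process(r) for r in records]
    repo.store_all(updated)
    return updated
```

In this sketch the repository is touched only twice per run, however many records are processed; the per-record work in `process` is the part that would be distributed.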
Date of Conference: 08-12 September 2014
Date Added to IEEE Xplore: 04 December 2014
Electronic ISBN: 978-1-4799-5569-5
Conference Location: London, UK
