PrEV: Preservation Explorer and Vault for Web 2.0 User-Generated Content

  • Conference paper
Theory and Practice of Digital Libraries (TPDL 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7489)

Abstract

We present the Preservation Explorer and Vault (PrEV) system, a city-centric multilingual digital library that archives Web 2.0 resources and makes them available, with the aim of storing a comprehensive record of urban life. To match the current state of the digital environment, a key architectural design choice in PrEV is to archive, in a collaborative manner, not only Web 1.0 web pages but also multilingual Web 2.0 resources, including multimedia, real-time microblog content, and mobile application descriptions (e.g., iPhone apps). PrEV preserves these resources for posterity and makes them available both for programmatic retrieval by third-party agents and for exploration by scholars through its user interface.

This work was supported by the Natural Science Foundation (60903107, 61073071), the National High Technology Research and Development (863) Program (2011AA01A207), and the Research Fund for the Doctoral Program of Higher Education of China (20090002120005). This work was done at the NUS–Tsinghua EXtreme search centre (NExT).

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cui, A. et al. (2012). PrEV: Preservation Explorer and Vault for Web 2.0 User-Generated Content. In: Zaphiris, P., Buchanan, G., Rasmussen, E., Loizides, F. (eds) Theory and Practice of Digital Libraries. TPDL 2012. Lecture Notes in Computer Science, vol 7489. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33290-6_12

  • DOI: https://doi.org/10.1007/978-3-642-33290-6_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33289-0

  • Online ISBN: 978-3-642-33290-6

  • eBook Packages: Computer Science; Computer Science (R0)
