ABSTRACT
The Web has pervaded all walks of life and has become an important corpus for research in the humanities, the social sciences, computer science, and other disciplines. Web archives collect, preserve, and provide ongoing access to ephemeral Web pages and hence encode traces of human thought, activity, and history. This makes them a valuable resource for analysis and study. However, there have been only a few concerted efforts to bring together tools, platforms, storage, processing frameworks, and existing collections for mining and analysing Web archives.
Index Terms
- Exploring the past of the web: alexandria & archive-it hackathon