This document introduces the Archival Acid Test, which evaluates how well web archiving tools archive modern webpages that use advanced HTML, JavaScript, and other web technologies. The test is divided into basic tests, JavaScript tests, and advanced features tests to assess different areas. Results show that archiving tools perform well on basic tests but struggle with dynamic content, asynchronous JavaScript, iframes, and other complex features. The goal of the Archival Acid Test is to create a standardized, publicly available way to evaluate how completely archiving tools archive modern webpages and identify areas for improvement.
1. The Archival Acid Test
Evaluating Archive Performance on Advanced HTML and JavaScript
Mat Kelly, Michael L. Nelson, Michele C. Weigle
{mkelly, mln, mweigle}@cs.odu.edu
Old Dominion University
Web Science & Digital Libraries Research Group
http://ws-dl.cs.odu.edu
http://acid.matkelly.com
Digital Libraries 2014 London, England September 9, 2014
2. Preserving the Web
Web Archivists Use Software for Digital Preservation
Archiving Tools
Heritrix, WARCreate, GNU Wget
Archiving Websites
Each tool produces a different result
Beyond manual inspection, these tools have not been
comparatively evaluated
Digital Libraries (JCDL 2014) - London, England - September 9, 2014
archive.today
3. Evaluating Software Made for the Web
Acid1 Test (1998)
Cross-browser CSS 1 compatibility
Acid 2 Test (2005)
HTML, CSS 2, PNGs, display spacing
Acid 3 Test (2008)
JavaScript, advanced CSS
Evaluated Web Browsers
5. Horseshoes and Hand Grenades
archive.today
6. Evaluate Web Archiving Software
Represent the modern web
Highlight problematic areas for tools
Return a quantifiable result
[Figures: Acid3 reference rendering; Archival Acid Test reference rendering]
7. What the Archival Acid Test Tests
1. Basic Tests (6 tests)
Simple image, CSS representations
2. JavaScript Tests (8 tests)
Dynamic resource location, asynchronous
fetching, other Ajax features
3. Advanced Features Tests (4 tests)
HTML5 features, multimedia, state-of-the-art
web browser functionality
9. A Comparative Look
Basic, JavaScript, Advanced Features
10. A Comparative Look
Basic, JavaScript, Advanced Features
Code that loads content only after user interaction
11. A Comparative Look
Basic, JavaScript, Advanced Features
Content embedded in an iframe (external webpage)
12. Purpose of tests for archiving tools:
Identify problem points cf. web browsers
Evaluate performance
Create a means of general evaluation instead
of identifying the shortcomings of a particular
tool on an ad hoc basis
Publicly available for further archiving tools
and web browser testing
The Archival Acid Test
Evaluating Archive Performance on Advanced HTML and JavaScript
Contributions
http://acid.matkelly.com
Editor's Notes
#2: Intro
Here to present our work on a series of tests set up to evaluate tools created to preserve the web: the Archival Acid Test.
#3: First, a word about the tools evaluated.
We analyzed the results of 3 different software packages and 5 different websites on a variety of metrics relating to how accurately they preserve various features of HTML, JavaScript, and other web technologies.
Among these:
the Internet Archive's archival crawler, Heritrix
our own browser-based preservation tool, WARCreate
and GNU Wget, which recently implemented WARC output.
We also tested 5 websites that offer accessible means of user-driven web archiving:
Mummify, WebCitation, the Internet Archive's Save Page Now feature, Perma, and archive.today, formerly archive.is.
One of our main contributions is to compare the results of each of these tools in a way that can be evaluated, to better identify the shortcomings of each.
#4: First, a quick history of evaluating tools created for the web.
1998: the initial Acid Test was created to evaluate how well web browsers conformed to web standards. It supplied a web page and a reference rendering so that the two could be compared for correctness.
2005: the second Acid test did something similar but focused more on correct rendering of some of the newer features of HTML and CSS.
2008: the most recent Acid test, the Acid3 Test, went a step further and evaluated facets of web browser behavior, focusing on how accurately browsers execute JavaScript.
#5: To initially justify our test, and given that web archiving tools and web browsers are set up to consume the same medium (web pages), we directed each archiving tool and website to attempt to preserve the Acid3 test page, with a modern browser's rendering as the reference model. Any deviation in appearance from the reference model constitutes an imperfect score.
Here you can see that, whereas modern browsers all completely pass the Acid3 test, no archiving tool does.
#6: One particular note on archive.today's performance, which you can see here: despite the displayed score of 100 out of 100, archive.today did not pass the Acid3 test. Discrepancies such as the "YOU SHOULD NOT SEE THIS" text in the top left of the rendering and the lack of borders on the colored squares indicate an incomplete pass, which contradicts the number displayed.
We hoped to design our rudimentary test with a bit more clarity in the results.
#7: So, we created a test to evaluate not just how well archiving tools conform to web standards but how well they perform at their primary purpose, that is, archiving web pages. Our test of 18 features was built around areas where these tools have been known to do a sub-par job at preservation. The test also exercises certain features of web technologies and standards that browsers currently render correctly and web archiving tools likely do not.
Our reference rendering is in a similar vein to the standard Acid tests: each blue square represents a test passed. Any deviation from the reference rendering means the Archival Acid Test was not completely passed.
#8: To go into detail on each set of tests: the basic tests evaluated support for simple web features that we expected all tools to handle well, and acted as a filter to ensure the archiving tools work with basic web features. These include the preservation of images hosted on the target site and various remote images, all of which must be captured for the capture to be considered complete.
The second set tested various aspects of JavaScript on the web, with which archival crawlers have particular trouble. These include simple JavaScript execution to pull in remote resources, as well as resources that only become part of the page after certain triggers. This is an attempt to exercise the URI mining capability of each tool.
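As a rough illustration of why script-driven resources are hard to mine, the sketch below collects URIs from literal `src` and `href` attributes, the way a crawler that does not execute JavaScript might. The sample page, the `example.org` URLs, and the `StaticUriMiner` class are all hypothetical; none of them come from the Archival Acid Test itself.

```python
from html.parser import HTMLParser


class StaticUriMiner(HTMLParser):
    """Collects URIs that appear literally in src/href attributes."""

    def __init__(self):
        super().__init__()
        self.uris = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                self.uris.append(value)


# Hypothetical page: one image is referenced statically, the other is
# only requested when the script runs in a browser.
SAMPLE_PAGE = """
<html><body>
<img src="http://example.org/static.png">
<script>
  var img = new Image();
  img.src = "http://example.org/" + "dyn" + "amic.png";
</script>
</body></html>
"""


def mine_static_uris(html):
    """Return the URIs found without executing any script."""
    miner = StaticUriMiner()
    miner.feed(html)
    miner.close()
    return miner.uris


found = mine_static_uris(SAMPLE_PAGE)
print(found)
```

Only the statically referenced image is found. The second URI is assembled at run time, so it never appears as a literal string for a non-executing tool to discover.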
The third set of tests dealt with features that are common in browsers but are particularly novel, or a low priority for implementation, in archival tools. These tests include various HTML5 canvas drawing samples, interfacing with a browser's local storage, and a few other advanced features of web browsers.
#9: The performance of each of these tools can be seen here alongside the reference screenshot, as rendered in a modern version of Google Chrome. As predicted, all tools had a problem with at least one JavaScript test and, more interestingly, all had varying degrees of trouble with the advanced features tests.
Part of the importance of testing web browser technologies on archival web tools is to ensure consistency of replay through the integrity of the capture. For many of these tests, the tools were not able to discover the resources needed for replay and thus, when the archive was replayed, the resources needed were not available in the archives.
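The replay problem described above can be pictured as a set comparison between what the page needs and what the capture holds. This is a minimal sketch under that assumption; the function name `missing_for_replay` and the `example.org` URIs are hypothetical and do not reflect how any of the evaluated tools work internally.

```python
def missing_for_replay(required_uris, archived_uris):
    """Return the URIs the page needs that the capture does not contain."""
    archived = set(archived_uris)
    return [uri for uri in required_uris if uri not in archived]


# Made-up example: the capture missed one resource that is only
# requested after a script runs.
required = [
    "http://example.org/page.html",
    "http://example.org/style.css",
    "http://example.org/late.png",
]
archived = [
    "http://example.org/page.html",
    "http://example.org/style.css",
]
missing = missing_for_replay(required, archived)
print(missing)
```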
Other tools lacked support for JavaScript execution, as can be seen in the WebCitation crawler's results, which caused all tests dependent on JavaScript to fail.
Isolating features of each test like this allows the Archival Acid Test to not just indicate complete lack of support but also instances of partial support, as can be seen by other tools missing only a portion of the JavaScript squares.
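One way to picture full, partial, and absent support is a per-group tally. The sketch below uses the group sizes from the slides (6 basic, 8 JavaScript, 4 advanced); the function names and the counts for the example tool are made up for illustration and are not measured results.

```python
# Group sizes as listed on the slide: 6 basic, 8 JavaScript, 4 advanced.
TESTS_PER_GROUP = {"basic": 6, "javascript": 8, "advanced": 4}


def summarize(passed_by_group):
    """Label each group as 'full', 'partial', or 'none' support."""
    summary = {}
    for group, total in TESTS_PER_GROUP.items():
        passed = passed_by_group.get(group, 0)
        if not 0 <= passed <= total:
            raise ValueError(f"{group}: {passed} is outside 0..{total}")
        if passed == total:
            label = "full"
        elif passed == 0:
            label = "none"
        else:
            label = "partial"
        summary[group] = (passed, total, label)
    return summary


def passes_completely(passed_by_group):
    """True only if every one of the 18 tests is passed."""
    return all(
        passed_by_group.get(group, 0) == total
        for group, total in TESTS_PER_GROUP.items()
    )


# Made-up counts for a hypothetical tool, not measured results.
hypothetical_tool = {"basic": 6, "javascript": 5, "advanced": 0}
report = summarize(hypothetical_tool)
print(report)
print(passes_completely(hypothetical_tool))
```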
#10, #11, #12: To evaluate the general problem points for archiving tools as surfaced by the Archival Acid Test, we compared the performance of all tools side by side and took note of the particular features where all tools failed. These particular tests should be the focus of all tools in improving capture integrity, and a way to set themselves apart in archive capture completeness.
Even for the tests that only a few tools failed, any failure is indicative of a discrepancy in functionality between a web browser, the medium used to view the live web, and an archival crawler, which was created to allow the live web experience to be replicated once the live web page has been preserved.
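Finding the features that every tool fails amounts to intersecting the per-tool failure sets. The following is a minimal sketch of that comparison; the tool names and test labels are placeholders invented for illustration and do not correspond to the actual results.

```python
def failed_by_all(failures_by_tool):
    """Return the test labels that appear in every tool's failure list."""
    failure_sets = [set(failed) for failed in failures_by_tool.values()]
    if not failure_sets:
        return set()
    return set.intersection(*failure_sets)


# Made-up tool names and test labels, for illustration only.
hypothetical_failures = {
    "tool_a": ["test_3", "test_7", "test_9"],
    "tool_b": ["test_3", "test_7"],
    "tool_c": ["test_7", "test_3", "test_12"],
}
common = failed_by_all(hypothetical_failures)
print(sorted(common))
```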
#13: Finally, I would like to touch on the contributions of this work. Previously, comparative evaluation of these tools focused on a particular tool, so problems and shortcomings exhibited by all tools were never concretely identified, because every tool lacked the feature. The Archival Acid Test provides a basis for identifying issues with web archiving tools and, while not comprehensive of every problematic part of archival capture, is already effective in identifying problem areas.