
The Web of Sites: Creating Effective
Web Archiving Appraisal and
Collection Development Policies
Jennifer Wright
Archives and Information Management Team Leader
SAA 2013
Session 408
The Mission of Smithsonian Archives
 Appraise, acquire, and preserve
the records of the Smithsonian Institution
 Offer a range of research and
reference services
 Establish policy and provide
expert guidance on record
keeping practices
 Create and promote products
and services that broaden
understanding of the Smithsonian
Websites as Records
 Smithsonian's official definition of a record:
any official recorded information, regardless of
medium or characteristics, created, received,
and maintained by a Smithsonian museum,
office, or employee
Smithsonian Directive 950
Management of the Smithsonian Web
 Sets policies and procedures to ensure the integrity
of content, reliability of infrastructure, and usability
of websites while protecting privacy of visitors and
Smithsonian's reputation
 Requires Archives to provide dispositions for unit
websites, web applications, and online exhibits
 Requires Archives to maintain historical snapshots
of Smithsonian websites and related content
Smithsonian Directive 814
Social Media Policy
 Sets policy for opening and maintaining official
Smithsonian social media accounts
 Requires that units notify Archives when opening
and before closing a social media account
 Requires Archives to maintain registry of social
media accounts and to archive information
contained in the accounts according to current
standards and retention policies
Why Save?
 Websites and social media profiles are Smithsonian's
public face
 Similar to a publication
 May incorporate many types of materials
 May replace other formats
Sounds straightforward.
How complicated could
appraisal possibly be?
Smithsonian's Web Presence
 257 websites + 10 mobile websites
 89 blogs
 26 apps for various platforms
 578 social media accounts including:
 153 Facebook accounts
 105 Twitter accounts
 66 Flickr accounts
 66 YouTube accounts
Why Not Save Everything?
 Some content already transferred to Archives in
another format
 Some content is the responsibility of other units
 Some content is collections, not records
 Some content serves only as pointers to other
Smithsonian and non-Smithsonian content
Other Issues Affecting Appraisal
 Certain types of files and coding don't crawl well
 Flash, JavaScript, some video
 Organization and coding of site may make it impossible to
capture everything wanted and exclude everything unwanted
 Social media terms of service often do not allow crawling
 Users may consider social media interactions to be private
One policy doesn't fit all
Our Policies: Public Websites
 Permanent records but may exclude:
 Detailed collections information
 Large sections duplicated in another format
 Crawl annually, before and after redesign, and on
day of major event
Our Policies: Intranets
 Individually appraised based upon content
 Generally block crawlers; permanent records must
be transferred via FTP, server-to-server transfer, or
external drive
 Will be restricted as appropriate
Our Policies: Social Media Accounts
 Will capture most accounts one time to show they
existed and how they were used
 Will crawl, use export tool, take screenshots, or a
combo to best capture account
 Will not be made immediately available online to
mitigate violations of terms of service
Our Policies: Social Media Accounts
 Must include or link to Smithsonian's Terms of Use;
 no capture otherwise
Our Policies: Social Media Accounts
 After first capture, account will be appraised
annually - significant original content will be
captured again
Our Policies: Blogs
 Permanent records
 Crawl annually unless there is no link to
Smithsonian's Terms of Use
Jennifer Wright
Archives and Information
Management Team Leader
SAA 2013 Session 408
Original Smithsonian Home
Page, launched May 8, 1995

  • 1. The Web of Sites: Creating Effective Web Archiving Appraisal and Collection Development Policies Jennifer Wright Archives and Information Management Team Leader SAA 2013 Session 408
  • 2. The Mission of Smithsonian Archives Appraise, acquire, and preserve the records of the Smithsonian Institution Offer a range of research and reference services Establish policy and provide expert guidance on record keeping practices Create and promote products and services that broaden understanding of the Smithsonian
  • 3. Websites as Records Smithsonian's official definition of a record: any official recorded information, regardless of medium or characteristics, created, received, and maintained by a Smithsonian museum, office, or employee
  • 4. Smithsonian Directive 950 Management of the Smithsonian Web Sets policies and procedures to ensure the integrity of content, reliability of infrastructure, and usability of websites while protecting privacy of visitors and Smithsonian's reputation Requires Archives to provide dispositions for unit websites, web applications, and online exhibits Requires Archives to maintain historical snapshots of Smithsonian websites and related content
  • 5. Smithsonian Directive 814 Social Media Policy Sets policy for opening and maintaining official Smithsonian social media accounts Requires that units notify Archives when opening and before closing a social media account Requires Archives to maintain registry of social media accounts and to archive information contained in the accounts according to current standards and retention policies
  • 6. Why Save? Websites and social media profiles are Smithsonian's public face Similar to a publication May incorporate many types of materials May replace other formats
  • 7. Sounds straightforward. How complicated could appraisal possibly be?
  • 8. Smithsonian's Web Presence 257 websites + 10 mobile websites 89 blogs 26 apps for various platforms 578 social media accounts including: 153 Facebook accounts 105 Twitter accounts 66 Flickr accounts 66 YouTube accounts http://www.si.edu/Connect
  • 9. Why Not Save Everything? Some content already transferred to Archives in another format Some content is the responsibility of other units Some content is collections, not records Some content serves only as pointers to other Smithsonian and non-Smithsonian content
  • 10. Other Issues Affecting Appraisal Certain types of files and coding don't crawl well (Flash, JavaScript, some video) Organization and coding of site may make it impossible to capture everything wanted and exclude everything unwanted Social media terms of service often do not allow crawling Users may consider social media interactions to be private
  • 11. One policy doesn't fit all
  • 12. Our Policies: Public Websites Permanent records but may exclude: Detailed collections information Large sections duplicated in another format Crawl annually, before and after redesign, and on day of major event
  • 13. Our Policies: Intranets Individually appraised based upon content Generally block crawlers; permanent records must be transferred via FTP, server-to-server transfer, or external drive Will be restricted as appropriate
  • 14. Our Policies: Social Media Accounts Will capture most accounts one time to show they existed and how they were used Will crawl, use export tool, take screenshots, or a combo to best capture account Will not be made immediately available online to mitigate violations of terms of service
  • 15. Our Policies: Social Media Accounts Must include or link to Smithsonian's Terms of Use; no capture otherwise http://www.si.edu/Termsofuse
  • 16. Our Policies: Social Media Accounts After first capture, account will be appraised annually - significant original content will be captured again
  • 17. Our Policies: Blogs Permanent records Crawl annually unless there is no link to Smithsonian's Terms of Use
  • 18. Questions? Jennifer Wright Archives and Information Management Team Leader wrightjm@si.edu http://www.siarchives.si.edu/ SAA 2013 Session 408 Original Smithsonian Home Page, launched May 8, 1995
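
The policies on slides 12 through 17 read as a small set of decision rules. The sketch below is one possible way to write them down in Python; it is an illustration only, not a tool the Archives is known to use. All names (`Kind`, `WebPresence`, `capture_decision`) are hypothetical, the returned strings paraphrase the slides, and the "significant original content" flag stands in for an archivist's judgment. The slides say "most" social media accounts are captured once; the sketch simplifies that to every account that links to the Terms of Use.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    PUBLIC_WEBSITE = "public website"
    INTRANET = "intranet"
    SOCIAL_MEDIA = "social media account"
    BLOG = "blog"


@dataclass
class WebPresence:
    kind: Kind
    # Hypothetical flags; the slides do not define a data model.
    links_terms_of_use: bool = True            # includes or links to the Terms of Use
    already_captured: bool = False             # a first capture already exists
    significant_original_content: bool = False # result of the annual appraisal


def capture_decision(p: WebPresence) -> str:
    """Return a paraphrase of the slide policy that applies to p."""
    if p.kind is Kind.PUBLIC_WEBSITE:
        # Slide 12: permanent records, with some exclusions.
        return "crawl annually, before and after redesign, and on day of major event"
    if p.kind is Kind.INTRANET:
        # Slide 13: crawlers are generally blocked, so records are transferred.
        return "appraise individually; transfer permanent records directly"
    if p.kind is Kind.BLOG:
        # Slide 17: annual crawl unless the Terms of Use link is missing.
        return "crawl annually" if p.links_terms_of_use else "do not crawl"
    # Slides 14 to 16: social media accounts.
    if not p.links_terms_of_use:
        return "no capture"
    if not p.already_captured:
        return "capture once"
    if p.significant_original_content:
        return "capture again"
    return "no further capture this year"
```

For example, `capture_decision(WebPresence(Kind.SOCIAL_MEDIA, already_captured=True))` falls through to the last rule, which matches an account that only links to content published elsewhere.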

Editor's Notes

  • #4: By this definition, any official web presence maintained by Smithsonian units is considered a record and subject to appraisal by the Archives.
  • #5: The Smithsonian also has two directives governing its web presence that give the Archives specific responsibilities.
  • #8: An organization's web presence may be larger than you realize.
  • #9: Not to mention iTunes, Pinterest, UStream, FourSquare, Instagram, Tumblr, Google+, Wikis, Vine, Vimeo, and many others. That's a lot of data to be captured, preserved, and stored over the long haul. We need to make sure we're not capturing more than is necessary.
  • #11: There are also technical and legal issues affecting appraisal.
  • #12: We've found that one policy doesn't fit every situation, and we've developed general policies for different types of web presences.
  • #13: Annually is our goal, but we're still working up to that frequency.
  • #17: On the left is my favorite example of original content. On April 30, 2012, the National Zoo live-tweeted from the artificial insemination of our giant panda. On the right is an excerpt from the Smithsonian Magazine's Twitter feed. It simply tweets teasers and links to its blog posts and other web content. The account has immediate marketing value, but not long-term significance.