The Century Archive Project "CAP"
Technology-Independent Information Storage
Steven H. McCown & Michael Leonhardt
Storage Technology Corporation
4 April 2002
McCown & Leonhardt  4/4/2002
What is a Document?
 A document is:
 Letter, check, picture, plot, report, birth certificate, deed...
 A document is NOT:
 Database element
 Encoded record
 Encoded object
 Perhaps:
 ASCII record of transaction
 Image of database table
 etc.
Documents in a Paperless Environment
 4.4 M tons of paper printed in 1995, rising to 5.9 M in 2000
 790 B sheets from laser printers in 1996, rising to 1.2 T sheets in 2001
 810 B sheets from office copiers in 1996, rising to 1.1 T sheets in 2001
 21 Billion Letters Sent
 170 Billion Pages of Fiche
 60 Billion Checks Processed Each Year
 E-Mail has created 40% more (personal) printing
 +$100 M in corporate revenues adds 8.8 M sheets printed
"To ensure that the media will be readable far into the future, it may
be necessary to archive the system along with the media. For a
100-year life, recording systems and sufficient spare parts will
need to be archived along with the data storage media. Media with
life expectancies greater than 20 years are capable of out-surviving
existing recording system technologies."
-- John Van Bogart, NARA 11th Annual Preservation Conference,
Magnetic Tape Storage, 1996
Information Management
 Long-term storage
 Defined: in excess of 100 years
 Inherent to many domains such as genealogy
 Information Management strategies
 Usually based on frequent data migration
 Poor incorporation of long-term storage
 Problem:
 How to access today's archives in 100 years or more
Long-Term Storage Wish List
 Easy integration with data processing environments
 Easy data access
 Migration free
 Long-life media, no maintenance
 Reader technology independence
 Human readable data
 Low cost
Current Options
 Encode the data and record digitally
 Magnetic media
 Optical media
 Store unencoded, human readable images
 Microfilm
 Something new - CAP
Century Archive Project
 Features
 High density storage of human-readable document images
 Storage of digitally encoded documents
 Metadata ascription to aid retrieval
 Industry standard physical media form factor
 Patent on concepts and format filed
CAP Operations
 Scan document to create electronic (e.g. TIFF) file
 Write de-magnified image on optical tape with scanning laser
 Use WORM optical media
 Create analog record (human readable)
 Append new documents as needed
 Write digitally encoded document file
 Read
 View magnified image
Direct or CCD camera & monitor
 Recover digital file
[Figure: Tape Record Layout]
Features
 Record header with document index and metadata
 Updatable Table of Contents
 Digital record in addition to analog record
 Retrieve digital version if compatible reader available
 Include digital header and TOC
 Gray scale Documents
 Use half-toning technique
 Color Documents
 Store separate images for red, green and blue breakdown
 Requires three-beam optics for direct color viewing
 Stereoscopic images
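
The gray-scale and color bullets above can be illustrated with a small sketch. The slides say only that half-toning is used and that red, green and blue are stored as separate images; the 2x2 ordered-dither matrix and the function names `halftone` and `split_rgb` below are assumptions chosen for illustration, not the CAP design.

```python
# Illustrative only: ordered dithering is one common half-toning method.
# The slides do not say which method CAP uses.
BAYER_2X2 = ((0, 2), (3, 1))  # assumed threshold pattern


def halftone(gray):
    """Turn rows of 0..255 gray values into rows of 0/1 dots."""
    out = []
    for y, row in enumerate(gray):
        out_row = []
        for x, value in enumerate(row):
            # Scale the matrix entry to a threshold on the 0..255 range.
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) * 255 / 4
            out_row.append(1 if value > threshold else 0)
        out.append(out_row)
    return out


def split_rgb(pixels):
    """Split rows of (r, g, b) tuples into three gray planes."""
    red = [[p[0] for p in row] for row in pixels]
    green = [[p[1] for p in row] for row in pixels]
    blue = [[p[2] for p in row] for row in pixels]
    return red, green, blue


if __name__ == "__main__":
    mid_gray = [[128, 128], [128, 128]]
    print(halftone(mid_gray))  # half of the dots are set for mid-gray

    color = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (128, 128, 128)]]
    for plane in split_rgb(color):
        print(halftone(plane))  # one binary image per color channel
```

Each color plane becomes its own binary image, which matches the slide's note that direct color viewing needs three beams, one per stored image.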
Adjacent Digital File
 TIFF file is reformatted with ECC and bit encoding
 Image file compressed using lossless compression
 Digital record format on tape:
 Width same as analog record
 Track spacing doubled to reduce crosstalk on read-out
 Length 1.5x to 2x analog record
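
A minimal sketch of the "compress, then add ECC" idea follows. The slides do not name the compression scheme, the error-correcting code or the bit encoding, so everything here is an assumption: `zlib` stands in for the lossless compressor, a simple 3x repetition code with majority voting stands in for the ECC, and the bit-encoding (modulation) step is left out. The names `encode_record` and `decode_record` are hypothetical.

```python
import zlib


def encode_record(image_bytes: bytes) -> bytes:
    """Compress losslessly, then write every byte three times."""
    compressed = zlib.compress(image_bytes, 9)
    # Stand-in ECC: a 3x repetition code (assumed, not the CAP format).
    return bytes(b for byte in compressed for b in (byte, byte, byte))


def decode_record(record: bytes) -> bytes:
    """Majority-vote each byte triple, then decompress."""
    if len(record) % 3 != 0:
        raise ValueError("record length must be a multiple of 3")
    voted = bytearray()
    for i in range(0, len(record), 3):
        a, b, c = record[i:i + 3]
        # Bitwise majority: a bit is set if at least two copies agree.
        voted.append((a & b) | (a & c) | (b & c))
    return zlib.decompress(bytes(voted))


if __name__ == "__main__":
    original = b"scanned TIFF bytes would go here " * 20
    record = bytearray(encode_record(original))
    record[10] ^= 0xFF  # simulate a read error in one stored byte
    assert decode_record(bytes(record)) == original
    print("recovered", len(original), "bytes despite one corrupted byte")
```

A production format would use a far more space-efficient code than triple repetition; the sketch only shows why ECC must be applied after compression, since a single uncorrected error in a compressed stream can make the rest undecodable.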
[Figure: Retrieved Image]
Tape Format Example
 For 8 1/2" wide documents
 Scan images at 300 dpi
 Write images on tape at 25,400 dpi
85x reduction in size
 Document length parallel to tape
Accommodates different lengths
 5 document tracks across tape
 1/2" tape in 3480 cartridge
 200m of tape
 220,000 image documents
 80,000 documents with both image and digital records
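
The figures in this example can be cross-checked with a little arithmetic. The sketch below uses only numbers from the slide, plus two assumptions that the slide does not state: documents are spread evenly over the five tracks, and the quoted document counts fill the whole 200 m of tape. Variable names are illustrative.

```python
MM_PER_INCH = 25.4

scan_dpi = 300
write_dpi = 25_400
doc_width_in = 8.5
tape_width_mm = 0.5 * MM_PER_INCH
tape_length_mm = 200 * 1000
tracks = 5

# Linear reduction from scanned size to size on tape.
reduction = write_dpi / scan_dpi

# Width of one document image on tape.
image_width_mm = doc_width_in * MM_PER_INCH / reduction

# Tape length per document, assuming even use of all tracks (assumption).
track_length_total_mm = tape_length_mm * tracks
pitch_image_only_mm = track_length_total_mm / 220_000
pitch_image_plus_digital_mm = track_length_total_mm / 80_000

# Implied length of the digital record relative to the analog record.
digital_to_analog = (
    pitch_image_plus_digital_mm - pitch_image_only_mm
) / pitch_image_only_mm

if __name__ == "__main__":
    print(f"reduction: {reduction:.1f}x")
    print(f"image width on tape: {image_width_mm:.2f} mm")
    print(f"five images across: {tracks * image_width_mm:.2f} mm "
          f"of {tape_width_mm:.1f} mm tape")
    print(f"tape per image-only document: {pitch_image_only_mm:.2f} mm")
    print(f"tape per image+digital document: "
          f"{pitch_image_plus_digital_mm:.2f} mm")
    print(f"digital/analog length ratio: {digital_to_analog:.2f}")
```

Under these assumptions the reduction works out to about 84.7x, which the slide rounds to 85x, and the implied digital record is 1.75 times the length of the analog record, inside the 1.5x to 2x range given for the adjacent digital file. Five full-width images come to about 12.75 mm, marginally over the nominal 12.7 mm tape width, so the slide's figures are presumably rounded or assume the page margins are trimmed.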
Storage Costs ($/MB)
 Manually intensive
 Paper - $10.00
 Microfiche (volumetric improvement only) - $1.20
 Semi-automated (manually mounted media)
 Non-automated magnetic tape
 Microfilm - $.005 (media only, 16 mm)
 Full automation
 Magnetic tape - $.004
 Optical disk - $.03
 CAP - $.002
Century Archive Summary
 Provides alternative storage method for valued documents, images
 Direct optical viewing
 Eliminates drive, media technology migration
 Robust media options for relaxed environmental storage conditions
 Provides digital storage
 Faster availability
 Data integration
 Complements magnetic tape storage of bulk records
Questions?
