This document presents a case study from Artefactual Systems on open-source big data solutions for archives and libraries. It discusses Artefactual's role in developing open-source software such as Archivematica, ICA-AtoM, and Qubit for digital preservation. It profiles the company's clients and projects in archives and libraries. It outlines Artefactual's services around hosting, installation, integration, development, and support of open-source archival and library software. Finally, it provides examples of the scale of "big data" held by some Canadian archives and libraries.
Access2011 - Van Garderen: Occupy The Memory
1. Open-Source Big Data
for Archives and Libraries:
A Case Study
Peter Van Garderen,
President/Systems Archivist
MJ Suhonos,
Systems Librarian/Software Engineer
5. open-source software for archives and libraries
digital preservation consulting services
http://artefactual.com @pjvangarderen
Peter Van Garderen (MAS), President / Systems Archivist
Evelyn McLellan (MAS), Systems Archivist
Jessica Bushey (MAS), Systems Archivist
Courtney Mumma (MAS/MLIS), Systems Archivist
MJ Suhonos (MLIS), Systems Librarian / Software Engineer
David Juhasz, Software Engineer
Austin Trask, Software Engineer
Jesús García Crespo, Software Engineer
Joseph Perry, Software Engineer
7. Artefactual clients and project sponsors
● International Council on Archives
● Provincial Archives of Alberta
● UNESCO Memory of the World
● Alberta Government Services Ministry
● UNESCO Archives
● Insurance Corporation of British Columbia
● United Nations Archives and Records Management Section
● Archives Association of British Columbia
● The World Bank Group
● Archives Society of Alberta
● International Monetary Fund
● Archives Association of Ontario
● NATO Archives
● Association for Manitoba Archives
● International Records Management Trust
● University of British Columbia Library
● Rockefeller Archive Center
● Simon Fraser University Archives
● Library and Archives Canada
● Simon Fraser University Library
● Canadian Council of Archives
● University of Victoria Archives
● Canadiana
● University of Toronto iSchool Institute
● National Archives of the Netherlands
● University of Northern British Columbia Library and Archives
● Dutch Ministry of the Interior and Kingdom Relations
● University of Strathclyde Archives
● Dutch Institute for Archival Research and Education (Archiefschool)
● British Columbia Electronic Library Network
● British Commonwealth Secretariat
● University of British Columbia Irving K. Barber Learning Centre
● United Kingdom Department for International Development
● Diocese of New Westminster - Anglican Church of Canada Archives
● Direction des Archives de France
● City of Vancouver Archives
● United Arab Emirates Center for Documentation and Research
● City of Toronto Corporate Information Management Services
● Al-Dhakira Al-Arabiyya
● City of Rotterdam Archives
● Association of Brazilian Archivists
● City of Edmonton Archives
● Botswana National Archives and Records Service
● Squamish Public Library
● Caribbean Regional Branch of the International Council on Archives
● West Vancouver Museum and Archives
● American Institute of Architects
● Whistler Museum and Archives
● British Columbia Museum and Archives
● Langley Centennial Museum and National Exhibition Centre
● British Columbia Ministry of Management Services
● Stirling Council Archives
8. Archivists & Librarians:
Who are we?
Who are we in the face of Google, ebooks,
iTunes, Facebook, Flickr, Internet Archive,
Ancestry.com, History Channel, Sharepoint,
Twitter...
Who are we in the face of our traditional
services, our traditional identity? tight
budgets?
14. all creation is connected
in various ways
in a marvelous spatial balance.
Out of the formation of new entities
has emerged
information
resulting in communication
and memory
Hugh Taylor. “The Archivist, the Letter, and the Spirit”
Archivaria 43, Association of Canadian Archivists (1997), p. 6
http://journals.sfu.ca/archivar
15. [Diagram: what stands between a stored bitstream now and its use in the future]
contextualize, authenticate, relate / bind, find
file system, file format, codec, character encoding, fonts, packaging, decryption, error correction, operating system, compression, metadata
storage media, storage driver, input / output devices, bitstream, storage device, application software, user interface
now, future
Accessible? Usable? Authentic?
stored, conserved, protected
16. In your scope,
I am content
<metadata isa="love note to the future" />
Accessible? Usable? Authentic?
now: communication, memory
future: wisdom, consciousness
21. we're the 99%
● We the people, helped by our archivists & librarians, should be in charge of:
● the space
● the portals
● the Trusted Digital Repositories
● the code
● the information
22. we're the 99%
● We the people, helped by our archivists & librarians, should be in charge of:
● the space
● the portals
● the Trusted Digital Repositories
● the code
● the information
● the public record
● the social network
● personal archives
● big data
23. #occupy the memory
● We the people, helped by our archivists & librarians, should be in charge of:
● the space
● the portals
● the Trusted Digital Repositories
● the code
● the information
occupythememory.org
24. “They’ll never take our freedom!”
© 1995 Paramount Pictures & 20th Century Fox
See fair use rationale: http://en.wikipedia.org/wiki/File:Brave_mel.jpg
25. The open-source eco-system
[Diagram: Users, a Foundation or Steering Committee, and Service Providers surround the Open Source Software and its Community, exchanging code, time, money, and knowledge.]
Users (all users): bug reports, enhancement requests, funding, code patches, documentation, promotion
Foundation or Steering Committee (lead institutions): funding, development, governance, coordination, promotion
Service Providers: development, technical support, hosting, training, promotion
27. hosting
installation
integration
software development
tech support
training
system analysis
strategy
$125/hr
Annual maintenance program
Community Support
We will try to answer fairly straight-forward questions from the open source community about installing and configuring our software. When we think a particular query is beyond these free support parameters (too specific, in-depth, or time-consuming) we will inform the user that it may be necessary to address it as paid, commercial support.
Commercial Support
Our software is always free and open source, but with our optional hosting and support services, the Artefactual development team will assist a client with more in-depth questions to get the software installed and operating as required, whether on one of our servers or their own.
30. Big Data in Canadian
Library and Archives: How Big?
● MemoryBC.ca: <100,000 archival descriptions & authority
● Archeion.ca: <100,000 archival descriptions & authority
● Canadiana Portal: 1 million items, 4-5 million records
● Toronto Public Library: 3 million MARC records
● Library and Archives Canada: 3.5 million MARC records
● ArchivesCanada.ca: with LAC & BNQ? (<5 million?)
● City of Vancouver: >25TB of digital files from VANOC