Upscaling digitisation at the Wellcome Library: showcasing the Goobi workflow system
Christy Henshaw, Programme Manager, Wellcome Digital Library
3rd LIBER-EBLIDA Workshop on Digitisation of Library Material in Europe, Koninklijke Bibliotheek, 6 October 2011

The Wellcome Trust
  • A global charitable foundation
  • Achieving extraordinary improvements in human and animal health
  • Supporting the brightest minds in biomedical research and the medical humanities
  • Exploring medicine in historical and cultural contexts

The Wellcome Library
  • Major resource for the study of medical history
  • Collections of books, manuscripts, archives, films and pictures on the history of medicine from the earliest times to the present day
  • Provide insight and information to anyone seeking to understand medicine and its role in society, past and present
  • Provide access to a growing collection of contemporary biomedical information resources relating to consumer health, popular science, biomedical ethics and the public understanding of science

The Wellcome Library

The Wellcome Library
Digitisation - the story so far
  • Image library created from transparencies/prints, and on-demand photography - 300,000 images
  • Journal backfiles digitisation - (funder) Med. Hist., BMJ, etc. in PMC
  • Wellcome Film - 500+ titles (also Wellcome Film YouTube channel)
  • AIDS posters project - 3,000 posters
  • Arabic manuscripts - 500 manuscripts
  • 17th century recipe books - 74 manuscripts
  • Contributions to Europeana via the Europeana Libraries project, and World Digital Library

The Library Transformation Strategy 2009 - 2014
To provide global access to, and expert interpretation of, a world class collection that explores medicine in its cultural contexts
  • Targeted collecting - putting challenges in context
  • Expert interpretation - engaging (new) audiences
  • Strategic digitisation - online access to our collections

The Wellcome Digital Library pilot 2010-2013
Genetics and its Modern Foundations
A new online resource for everyone interested in the history of human and animal health.
Aims
  • build sustainable/expandable mechanism - foundation stone for WDL
  • digitise key library holdings - relating to a major Trust challenge area
  • digitise important third party content - linked to theme
  • use innovative content and tools - to encourage discovery and use
  • explore commercial partnerships - enhance access to non-theme material

Archival material - 900,000 images (Wellcome Library - 600,000 images; external - 300,000 images)
Books related to genetic research - 600,000 images
ProQuest, Early European Books - 5.5m images
Born digital material - initially small but growing

Digitisation strategy
  Then                         | Now
  Small projects (<10,000 pp)  | Large projects (>100,000 pp)
  Relatively ad-hoc            | Major strategic programme
  SMT & Project teams          | Programme Board, advisors
  Library-centric              | W. Trust, external stakeholders
  Entirely open access         | Commercial models encouraged
  Little impact on IT systems  | Requires major IT development
  Examples                     | Everything

Digitisation processes
  Then                         | Now
  Manual processes             | Automated processes
  100% QA                      | Sample QA, error minimization
  TIFF                         | JPEG 2000
  Bespoke tracking/monitoring  | Centralised tracking system
  Incremental storage growth   | New storage strategy
  Detailed, painstaking        | Streamlined, pragmatic

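The move from 100% QA to sample QA can be illustrated with a short sketch. This is not the Wellcome Library's actual procedure: the function name, the 10% default fraction and the image identifiers are all assumptions chosen for illustration.

```python
import random


def select_qa_sample(image_ids, fraction=0.1, seed=None):
    """Pick a random subset of a batch for QA instead of checking every image.

    `fraction` is a hypothetical sampling rate; the slides do not state one.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    image_ids = list(image_ids)
    if not image_ids:
        return []
    rng = random.Random(seed)
    # Always check at least one image in a non-empty batch.
    sample_size = max(1, round(len(image_ids) * fraction))
    return sorted(rng.sample(image_ids, sample_size))


# Hypothetical batch of 200 page images, sampled at 5%.
batch = [f"img_{i:04d}" for i in range(200)]
to_check = select_qa_sample(batch, fraction=0.05, seed=1)
```

With `fraction=1.0` the same function reproduces the older "100% QA" behaviour, which makes the trade-off between effort and coverage explicit.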
Streamlining digitisation
  • Staff dedicated to specific projects, or streams of work
  • Carry out sample workflow tests for new types of material
  • The right equipment for the right job - eliminate the "fiddly bits"
      Live-view monitors
      Easy-clean surfaces
      Foot-pedals
      Custom-made supports

Streamlining digitisation
  • Photographers do the photography...
      Prepare materials separately
      Leave loose pages and bindings as they are, they are easier to digitise that way!
  • Use existing staff as support - moving items to and from stack
  • Minimise movement
      Keep plenty of shelving, working space at hand
  • Find a preferred supplier for ad hoc support

Streamlining project management

Goobi: what is it?
  • Web-based workflow system
  • Open source (core system)
  • Used by many libraries in Germany, and half a dozen other European libraries
  • Intranda version, developed by Intranda to meet Wellcome Library-specific requirements

What does it do?
  • Task-focused, customisable workflows developed by Intranda
  • User-specific "dashboard"
  • Import/export and store metadata
  • Encode data as METS
  • Display progress of tasks, statistics on activities
  • Tracks projects, batches, and units (location, current activity)
  • "Command central" for 3rd party systems

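To give a sense of what "encode data as METS" involves, here is a minimal sketch that builds a skeletal METS document with Python's standard library. It is not Goobi's output: the object identifier, file names, `USE` value and `TYPE` values are hypothetical, and a production record would also carry descriptive and administrative metadata sections.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def _tag(name):
    """Return a namespace-qualified METS element name."""
    return f"{{{METS_NS}}}{name}"


def build_mets(object_id, image_files):
    """Build a skeletal METS record: a file section plus a physical structMap."""
    mets = ET.Element(_tag("mets"), {"OBJID": object_id})
    file_sec = ET.SubElement(mets, _tag("fileSec"))
    file_grp = ET.SubElement(file_sec, _tag("fileGrp"), {"USE": "master"})
    struct_map = ET.SubElement(mets, _tag("structMap"), {"TYPE": "physical"})
    volume = ET.SubElement(struct_map, _tag("div"), {"TYPE": "volume"})

    for order, name in enumerate(image_files, start=1):
        file_id = f"FILE_{order:04d}"
        file_el = ET.SubElement(
            file_grp, _tag("file"), {"ID": file_id, "MIMETYPE": "image/jp2"}
        )
        ET.SubElement(
            file_el, _tag("FLocat"),
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": name},
        )
        # One page-level div per image, pointing back at the file entry.
        page = ET.SubElement(volume, _tag("div"), {"TYPE": "page", "ORDER": str(order)})
        ET.SubElement(page, _tag("fptr"), {"FILEID": file_id})

    return ET.tostring(mets, encoding="unicode")


# Hypothetical object identifier and file names.
record = build_mets("example-0001", ["0001.jp2", "0002.jp2"])
```

The file section lists the image files once, and the structural map references them by `FILEID`, so the page order lives in the structure rather than in the file names.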
User tasks
User tasks
Project management tasks
Administrative tasks Example workflow steps
Digital asset management
Storage
  • Master files backed up offsite to WORM storage drive
  • WORM = Write Once Read Many - permanent storage
  • Self-healing of errors on main storage system from WORM
File conversion
  • Lightroom used to convert RAW to TIFF
  • LuraWave converts TIFF to JP2K
  • Validation of JP2K conversion coming soon - via Goobi
Ingest
  • Automated ingest workflow in the DAM (Safety Deposit Box - SDB) - via Goobi
  • One file serves as master and dissemination file
Preservation
  • DAM is a preservation system
  • Manages all preservation actions (characterisation, format migration)
  • API to allow 3rd party systems access to content

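The "self-healing" idea, repairing the main storage system from the WORM copy, can be sketched as a checksum comparison. The slides do not describe how the storage system actually does this, so the approach below (SHA-256 comparison, flat directories, the function names) is an assumption made purely for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def heal_from_backup(primary_dir, backup_dir):
    """Restore primary files that are missing or differ from the backup copy.

    The backup is treated as authoritative, as a write-once store would be.
    Returns the names of the files that were repaired.
    """
    primary_dir, backup_dir = Path(primary_dir), Path(backup_dir)
    repaired = []
    for backup_file in sorted(backup_dir.iterdir()):
        if not backup_file.is_file():
            continue
        primary_file = primary_dir / backup_file.name
        damaged = (
            not primary_file.exists()
            or sha256_of(primary_file) != sha256_of(backup_file)
        )
        if damaged:
            shutil.copy2(backup_file, primary_file)
            repaired.append(backup_file.name)
    return repaired
```

Because the backup is write-once, it can safely be treated as the reference copy: any mismatch is resolved in favour of the backup, never the other way round.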
[Workflow diagram. Labels: inputs In-house (RAW), External (TIFF) and External (JP2), each with a QA step; Lightroom - post-processing, convert to TIFF; temp storage and hotfolders between steps; LuraWave automatically converts files to JP2 and outputs to a folder; Goobi automatically triggers validation; person triggers ingest via Goobi; SDB ingests; Pillar (permanent); WORM backup (really permanent)]

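One reading of the workflow diagram is that material enters the pipeline at a different stage depending on the format in which it arrives. The sketch below encodes that reading; the stage order and entry points are inferred from the diagram labels rather than stated in the slides, and the names are hypothetical.

```python
# Stages in the order suggested by the diagram labels (an assumption).
PIPELINE = [
    "convert RAW to TIFF (Lightroom)",
    "convert TIFF to JP2 (LuraWave, via hotfolder)",
    "validate JP2 (triggered automatically by Goobi)",
    "ingest into SDB (triggered by a person via Goobi)",
    "back up to WORM storage",
]

# Assumed entry stage for each incoming format.
ENTRY_STAGE = {"RAW": 0, "TIFF": 1, "JP2": 2}


def remaining_steps(source_format):
    """Return the stages a file still has to pass through, given its format."""
    try:
        start = ENTRY_STAGE[source_format.upper()]
    except KeyError:
        raise ValueError(f"unsupported source format: {source_format!r}") from None
    return PIPELINE[start:]


# In-house RAW captures pass through every stage; external JP2 files skip conversion.
in_house = remaining_steps("RAW")
external_jp2 = remaining_steps("jp2")
```

Modelling the pipeline as an ordered list with per-format entry points keeps a single workflow definition for in-house and external material, which matches the "centralised tracking system" aim described earlier.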
Thank you! Christy Henshaw [email_address]
