Every datum counts! Capitalising on small contributions to the big dreams of mobilising biodiversity information. Vishwas Chavan, Eamonn O Tuama, Samy Gaiji, David Remsen and Nicholas King. 2008 Annual Conference of the Taxonomic Databases Working Group, 19-25 October 2008, Fremantle, Australia.
Both biodiversity and biodiversity data are unevenly distributed around the world: digital divide, content divide, lingual divide, knowledge divide; an emerging catastrophe. Diagram labels: Developing World, Biodiversity, Biodiversity Data, Developed World.
Uneven distribution of biodiversity
A large volume of biodiversity data and information is in languages other than English
Biodiversity Informatics activities are concentrated in the North
A few more reasons: investment in biodiversity information management goes towards large projects; research in biodiversity informatics is focused on large data publishers. Small Data Publishers: a neglected mass!
Biodiversity Knowledge Divide: Emerging Catastrophe
Good news! The Open Access movement can help mobilise data (a) from mega-biodiversity regions, and (b) by small data publishers.
Small Data Publishers: who are they? (1) Can't discover, access, and use their data; do not know how to manage data for reuse by others; lack skills, infrastructure, and support for interoperable data management; more interested in peer-reviewed publishing than data publishing, as the former brings recognition and funding.
Small Data Publishers: who are they? (2) PIs of small-scale projects; small and medium-sized R&D organisations and NGOs; citizen scientists, e.g. the People's Biodiversity Register. P. Bryan Heidorn's hypothesis: a disproportionate amount of dark data is in the tail of science. Small Data Publishers form the long tail as well as the droplets of the oceans of biodiversity data.
Small are BIG! Long-tail or dark data is economically and ecologically very critical. Most existing and future data will be held by Small Data Publishers. 80% of current investment is towards Small Data Publishers. Total awards: 9347; big awards: 1869; small awards: 7478. Source: "Curating the Dark Data in the Long Tail of Science" by P. Bryan Heidorn.
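The 80% figure can be checked against the award counts quoted on the slide. The short sketch below only redoes that arithmetic; it computes the share of the number of awards, and whether that equals the share of money invested is not something the slide's figures settle.

```python
# Award counts as quoted on the slide (source: P. Bryan Heidorn).
total_awards = 9347
big_awards = 1869
small_awards = 7478

# The two categories should account for every award.
counts_are_consistent = (big_awards + small_awards == total_awards)

# Share of awards by count, not by monetary value.
small_share = small_awards / total_awards
big_share = big_awards / total_awards

print(f"Counts consistent: {counts_are_consistent}")
print(f"Small awards: {small_share:.1%} of all awards")
print(f"Big awards:   {big_share:.1%} of all awards")
```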
Characteristics of SDP data: heterogeneous; distributed and isolated; manually generated; individual creation; not maintained for reuse by others; obscured or protected; uneven distribution as well as unequal access. It is a highly unorganised data sector.
Paudi village, Siwani, India: festive uses of bio-resources; census of trees; uses of plants; status and knowledge of medicinal plants; census of birds; bird signs for forecasting weather change; wild animals; burrowing or sub-soil fauna.
Need standards to discover and access such data! Domestic animals; social beliefs about biodiversity; citizen scientists; seed diversity. Millions of Ramsinghs across the world are busy generating biodiversity data.
What do we lack? A Data Publishing Framework; awareness of the current knowledge system; recognition for data publishing; data standards for the wide spectrum of biodiversity and associated data; a suite of standards for the data life cycle (generation to dissemination); standards addressing the data generation phase.
What do we lack? Tools for data capture at its source; metadata creation as close to the source of data as possible; multilingual tools and standards; hassle-free, skill-level-independent tools. Because adapting to standards is a time-consuming as well as costly exercise.
Data mobilisation is like moving mountains. Image label: Digital Biodiversity Data.
What can be done! Data Publishing Framework; proposed GBIF recommendation on Discovery and Publishing of Biodiversity Data; GUIDs for data sets and data records; expedite the process of standards development; standards development, ratification and uptake; hassle-free, skill-level-independent, easy-to-adapt standards; standards as an integral part of recording / monitoring devices; metadata creation as close to the source as possible.
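As a rough illustration of two points on this slide, GUIDs for data sets and data records and metadata creation close to the source, the sketch below stamps identifiers and minimal metadata onto observations at the moment they are captured. It is not the proposed GBIF recommendation or any TDWG standard: the function names, the field names, and the sample observations are all hypothetical, and random UUIDs are just one possible GUID scheme.

```python
import uuid
from datetime import datetime, timezone


def new_guid() -> str:
    """Return a random UUID written as a URN (one possible GUID scheme)."""
    return f"urn:uuid:{uuid.uuid4()}"


def capture_dataset(title, language, observations):
    """Attach GUIDs and minimal metadata at capture time.

    All field names here are hypothetical, chosen only for illustration.
    """
    dataset_id = new_guid()
    # Each record gets its own GUID plus a pointer back to its data set.
    records = [
        {"record_id": new_guid(), "dataset_id": dataset_id, **obs}
        for obs in observations
    ]
    # Metadata is written at the source, alongside the data itself.
    metadata = {
        "dataset_id": dataset_id,
        "title": title,
        "language": language,
        "created": datetime.now(timezone.utc).isoformat(),
        "record_count": len(records),
    }
    return metadata, records


# Made-up sample observations, not real survey data.
metadata, records = capture_dataset(
    title="Village bird census (example)",
    language="hi",
    observations=[
        {"local_name": "example bird A", "count": 3},
        {"local_name": "example bird B", "count": 1},
    ],
)
print(metadata["dataset_id"], metadata["record_count"])
```

Because the identifiers are generated where the data is recorded, a record can be cited or traced back to its data set without relying on a central registry being reachable at capture time.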
What can be done! Standards for interoperability and/or integration with non-biodiversity data; evaluation of authenticity, reliability, and data quality as close to the source as possible; outreach to national/regional/thematic standards-building initiatives; domain experts find it difficult to understand / adapt standards; cultural as well as lingual barriers; engagement of eastern, southern, mega-biodiversity communities in standards development processes.
What can be done! Internationalise standards; awareness in the mega-biodiversity world about standards; multilingual dissemination: talk the languages that people understand best; think globally, act locally; moving beyond the comfort zone; standards for the unorganised data sector; standards for citizen scientists; address concerns of data sensitivity through standards implementation ("Will standards help me in identification and protection of sensitive data?").
Krishna can move data mountains, if standards bodies act as Kamdhenus. TDWG, GBIF.
Because every datum counts!
