The Biological Data
Sustainability Paradox:
A Time To Think Differently
Terence Johnson & Philip E. Bourne
School of Data Science & Dept of Biomedical Engineering
University of Virginia
https://arxiv.org/abs/2311.05668
ISCB Fellows Meeting July 2024 1
Global Data Growth Over 10 Years 130 Zettabytes
The Paradox
With a fixed funding base, the maintenance of biomedical data takes money away from innovative research; however, those data are increasingly important in achieving innovation.
Today - Biological Data Sustainability (BDS) 1.0
• Funding of biological data is mostly with public money
• Public-private partnership is limited
• Scientific foundation models are limited
• The system is fragile (think USA, November 2024)
• Some gains in efficiency are possible through aggregation/centralization
• Politics, cultures and funding models work against aggregation/centralization
Funders Want To Do Something
https://globalbiodata.org/what-we-do/global-core-biodata-resources/
Tomorrow - Biological Data Sustainability (BDS) 2.0 - A data economy
Not for one moment do we think the community of funders, data providers and users will find this model palatable.
The intent is to provoke discussion amongst a broader group of innovative thinkers.
Biological Data Sustainability (BDS) 2.0
Cap and Trade Model - Premise
• Data have value in the marketplace and can be traded
• That value is in the form of currency: cash or data credits
• A global broker, perhaps a coalition of funders, distributes credits
• Services have value in the marketplace
• The private sector is an active, not passive, player in the model
• The era of entitlement by all stakeholders (researchers, data resources, funders) is over; whatever the model, it is driven by the economics of the marketplace
BDS 2.0 Mechanics
• Funders act collectively to support a global broker
• Every researcher gets some free data credits
• Data upload to a repository or data download expends credits
• Data download is paid for in credits by the downloader
• The data depositor receives credits upon that download, minus an operational fee
• Parties can solicit data credits by performing services
• Data credits can be exchanged for cash, for which the broker gets a fee
• The broker issues or removes credits in the system to maintain stability
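The credit flows listed above can be sketched as a toy ledger. This is a minimal illustration only, not the authors' mathematical model: the class name `Ledger`, every numeric value (free grant, upload cost, download cost, fee rate), and the choice to route upload costs to the broker are assumptions made for the example. Cash exchange, services, and the broker issuing or removing credits are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Toy credit ledger; all parameter values are hypothetical."""
    free_grant: float = 10.0     # assumed free credits per researcher
    upload_cost: float = 3.0     # assumed credits spent per deposit
    download_cost: float = 1.0   # assumed credits spent per download
    fee_rate: float = 0.2        # assumed operational fee kept by the broker
    balances: dict = field(default_factory=dict)  # researcher -> credits
    owners: dict = field(default_factory=dict)    # dataset id -> depositor
    broker: float = 0.0          # credits held by the broker

    def register(self, who: str) -> None:
        # Every researcher gets some free data credits.
        self.balances[who] = self.free_grant

    def _spend(self, who: str, amount: float) -> None:
        if self.balances[who] < amount:
            raise ValueError(f"{who} has too few credits")
        self.balances[who] -= amount

    def upload(self, who: str, dataset: str) -> None:
        # Upload expends credits; sending them to the broker is an assumption.
        self._spend(who, self.upload_cost)
        self.broker += self.upload_cost
        self.owners[dataset] = who

    def download(self, who: str, dataset: str) -> None:
        # The downloader pays; the depositor receives the payment minus the fee.
        self._spend(who, self.download_cost)
        fee = self.download_cost * self.fee_rate
        self.balances[self.owners[dataset]] += self.download_cost - fee
        self.broker += fee

    def total(self) -> float:
        # Credits in circulation, including those held by the broker.
        return sum(self.balances.values()) + self.broker


ledger = Ledger()
for name in ("alice", "bob"):
    ledger.register(name)
ledger.upload("alice", "ds1")
ledger.download("bob", "ds1")
print(ledger.balances, ledger.broker, ledger.total())
```

In this sketch uploads and downloads only move credits between parties, so the total stays fixed unless a broker step (not modelled here) issues or removes credits.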
BDS 2.0 Mechanics (diagram; image not reproduced in this transcript)
BDS 2.0 Implications
• Controls the amount of data in the system
• A single currency for evaluating the system
• A depositor dividend for producing widely used data
• Disincentivizes depositing low-quality data, as it costs credits but will not gain credits
• Creates a distributed service economy supporting curation
• Private sector actors who do not deposit pay cash for credits or perform services
• Access to the system is tied to data quality
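The depositor dividend and the disincentive for low-quality deposits can be made concrete with a break-even calculation. The sketch below is illustrative only: the function names and the numbers (an upload cost of 3 credits, a download price of 1 credit, a 20% operational fee) are assumptions, not values from the proposal.

```python
import math


def net_credits(downloads: int, upload_cost: float,
                download_price: float, fee_rate: float) -> float:
    """Depositor's net credits: download income after the fee, minus upload cost."""
    return downloads * download_price * (1.0 - fee_rate) - upload_cost


def break_even_downloads(upload_cost: float, download_price: float,
                         fee_rate: float) -> int:
    """Smallest number of downloads at which the upload cost is recovered."""
    earned_per_download = download_price * (1.0 - fee_rate)
    if earned_per_download <= 0:
        raise ValueError("depositor earns nothing per download")
    return math.ceil(upload_cost / earned_per_download)


# Hypothetical parameters: upload costs 3 credits, download costs 1, fee is 20%.
needed = break_even_downloads(3.0, 1.0, 0.2)
unused = net_credits(0, 3.0, 1.0, 0.2)        # dataset nobody downloads
popular = net_credits(needed, 3.0, 1.0, 0.2)  # dataset at break-even
print(needed, unused, popular)
```

Under these assumed numbers, a dataset that is never downloaded leaves its depositor at a net loss, while a dataset reaching the break-even count starts to return credits.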
BDS 2.0 Advantages
• Multiple brokers: a problem, but maybe it means it's working
• Anyone (e.g., in US parlance, R2s, NBCUs, community colleges) can participate
• Creates a bias for producing highly accessed datasets
• Potential for higher quality curation
Items shown in red on the original slide are both an advantage and a disadvantage.
BDS 2.0 Disadvantages
• It's an all-or-nothing approach and hard to launch
• Predatory data purveyors
• Multiple brokers: multiple inconsistent markets
• Creates a bias for producing highly accessed datasets
• Potential for lower quality curation
Items shown in red on the original slide are both an advantage and a disadvantage.
Is BDS 2.0 More Sustainable than BDS 1.0?
• 2.0 leverages the entire community of users
• A distributed service economy will raise overall awareness concerning data quality
Next Steps
• There is a GBC Funders meeting in September at which I will present and the model will be discussed. It would be helpful to have a view from the ISCB community (positive or negative) to add to the discussion.
• Publish a paper based on the arXiv version and all input received.
• We (well, mainly Terry Johnson) have developed a mathematical model to simulate the activities of the credits system. We will report back.