2. v0.5 API
Support data access to reads/variants over REST
Most existing applications using the API are interactive
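As a rough illustration of what reads/variants access over REST can look like from a client, the sketch below builds (but does not send) a JSON search request for variants in a region. The base URL, the variant set ID, the `/variants/search` path and the field names in the body are assumptions made for this example and should be checked against the v0.5 schema before use.

```python
import json
from urllib import request


def build_variant_search(base_url, variant_set_id, reference_name, start, end,
                         page_size=100):
    """Build (but do not send) a POST request asking for variants in a region.

    The endpoint path and body field names are assumptions for illustration.
    """
    body = {
        "variantSetIds": [variant_set_id],
        "referenceName": reference_name,
        "start": start,
        "end": end,
        "pageSize": page_size,
    }
    return request.Request(
        url=base_url.rstrip("/") + "/variants/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical server and variant set, used only to show the request shape.
req = build_variant_search(
    "https://example.org/ga4gh/v0.5", "hypothetical-set", "1", 10000, 20000
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

An interactive application would send such a request once per user action, which is why per-request overhead matters less there than in batch use.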
3. Batch Processing
Is REST the correct approach?
API is consistent for both local & remote data
But it adds overhead (performance + administration) for local data
Approaches moving forward:
Shims to current file formats
Native interface to Hadoop ecosystem?
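One way to picture the "shim" approach: keep a single query interface and back it directly with an existing file format, so local batch jobs skip the REST server entirely. The sketch below is a minimal illustration under stated assumptions. The `VariantSource` and `VcfShim` names are hypothetical, the parser reads only the first five VCF columns, and query coordinates are assumed to be 0-based and half-open.

```python
import io
from abc import ABC, abstractmethod
from typing import Iterator, NamedTuple


class Variant(NamedTuple):
    reference_name: str
    start: int  # 0-based position (assumed convention for this sketch)
    ref: str
    alt: str


class VariantSource(ABC):
    """Hypothetical common interface; a REST client could implement it too."""

    @abstractmethod
    def search(self, reference_name: str, start: int, end: int) -> Iterator[Variant]:
        ...


class VcfShim(VariantSource):
    """Answers the same queries straight from VCF text, with no server."""

    def __init__(self, handle):
        # Keep only data lines; skip headers and blank lines.
        self._lines = [l for l in handle if l.strip() and not l.startswith("#")]

    def search(self, reference_name, start, end):
        for line in self._lines:
            chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
            zero_based = int(pos) - 1  # VCF POS is 1-based
            if chrom == reference_name and start <= zero_based < end:
                yield Variant(chrom, zero_based, ref, alt)


# Tiny in-memory VCF fragment, made up for the demonstration.
demo_vcf = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "1\t10001\t.\tA\tG\n"
    "1\t30001\t.\tC\tT\n"
)
source = VcfShim(demo_vcf)
hits = list(source.search("1", 10000, 20000))
print(hits)
```

Code written against `VariantSource` would not need to know whether the data is local or remote, which preserves the consistency noted above while avoiding the server overhead for local files.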
4. Common Workflow Language
Pain point: how do we build reproducible pipelines of tasks?
A group has started building a common workflow description language for bioinformatics:
https://groups.google.com/forum/#!forum/common-workflow-language
Should the GA4GH take this task on?