IF THIS IS THE FUTURE,
WHERE IS MY TREE OF LIFE?
                  Karen Cranston
 National Evolutionary Synthesis Center (NESCent)

                   @kcranstn
        http://www.slideshare.net/kcranstn
Tree of life

 ~2million named
 species

 Millions
       more
 unnamed / undiscovered
Phylogeny papers, 1978 to 2008
[Figure: number of papers published per year, 1978 to 2008 (y-axis 0 to 12,000). Annotation: rapid increase in applications of phylogeny, beginning in early 1990s. Source: ISI Web of Science. Graph from David Hillis.]
What does it mean to have the tree of life?
Archiving sequence data is a community norm

~4% of all published phylogenetic trees are archived (Stoltzfus et al. 2012)
Publishing a tree = picture in a PDF
[Figure: page image of Fig. 1, "Combined molecular phylogenetic tree for Diptera", from Wiegmann et al., PNAS, 2011]
Lander et al. Nature 2001
Rod asks: Why do we need a database of trees?
[Figure: page 3 of 6 of Wiegmann et al., PNAS Early Edition, showing Fig. 1, the combined molecular phylogenetic tree for Diptera]
assembly
alignment
inference
expertise
time
$$$
[Figure: page 3 of 6 of Wiegmann et al., PNAS Early Edition, showing Fig. 1, the combined molecular phylogenetic tree for Diptera]
NSF IDEAS LAB
i. Pre-proposal / application
ii. 5 day highly facilitated workshop
iii. Self-assembly into groups
iv. Pitch high risk proposal ideas at end
v. NSF invited full proposals
 Community assembly of the tree of life (Open Tree of Life)
 Next generation Phenomics (PI O'Leary)
 Arbor: Comparative Analysis Workflows (PI Harmon)
Karen Cranston, lead PI (Duke)
Gordon Burleigh (Florida)
Keith Crandall (BYU)
Karl Gude (MSU)
David Hibbett (Clark)
Mark Holder (Kansas)
Laura Katz (Smith)
Rick Ree (FMNH)
Stephen Smith (Michigan)
Doug Soltis (Florida)
Tiffani Williams (TAMU)

opentreeoflife.org

AVAToL: Assembling, Visualizing and Analysis of the Tree of Life
1. Synthesize a complete draft tree of life from existing
   phylogenetic trees
2. Release with:
   a. ability to add annotations and upload new data sets
   b. areas of uncertainty / conflict
   c. links to source data and analysis methods
   d. utilities to download whole tree and subtrees
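
As a rough sketch of what a subtree download utility (item d) might do, the Python below cuts a subtree out of a child-to-parent map and writes it as a Newick string. The node names, the parent-map layout, and the function names are illustrative assumptions, not the Open Tree of Life software or its API.

```python
def children_map(parent_of):
    """Invert a child -> parent map into a parent -> [children] map."""
    kids = {}
    for child, parent in parent_of.items():
        kids.setdefault(parent, []).append(child)
    return kids


def to_newick(node, kids):
    """Serialize the subtree rooted at `node` (children in sorted order)."""
    sub = sorted(kids.get(node, []))
    if not sub:
        return node  # a tip is written as its own label
    return "(" + ",".join(to_newick(c, kids) for c in sub) + ")" + node


def subtree_newick(parent_of, root):
    """Return the Newick string for the subtree rooted at `root`."""
    return to_newick(root, children_map(parent_of)) + ";"


# Hypothetical toy tree: tips A, B, C; internal nodes n1 and root.
parent_of = {"A": "n1", "B": "n1", "n1": "root", "C": "root"}

print(subtree_newick(parent_of, "n1"))    # subtree only
print(subtree_newick(parent_of, "root"))  # whole tree
```

Any node can serve as the root of the export, so the same call covers both the whole tree and a subtree.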
Graph database holding thousands of input trees with millions of nodes
  filter / weight input trees
  build synthetic trees
  compare to alternate trees
  input new data sets
Dipsacales graph
taxonomy data (578 taxa) + Soltis et al APG III phylogeny (30 taxa)
Dipsacales graph
Synthesized tree (favouring phylogenetic branches); contains all 578 taxa
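
A toy sketch of the idea on these slides, under stated assumptions: every input tree is stored as source-tagged child-to-parent edges in one graph, and a synthetic tree is read off by taking, for each node, the edge from the highest-ranked source. The taxon labels, the `add_tree` / `synthesize` names, and the per-node ranking rule are all hypothetical simplifications, not the project's actual synthesis algorithm (which would also have to guard against conflicting edges that this sketch ignores).

```python
from collections import defaultdict


def add_tree(graph, source, parent_of):
    """Store each child -> parent edge of one input tree, tagged by source."""
    for child, parent in parent_of.items():
        graph[child][source] = parent


def synthesize(graph, ranking):
    """Pick one parent per node from the highest-ranked source that has one."""
    tree = {}
    for child, edges in graph.items():
        for source in ranking:
            if source in edges:
                tree[child] = edges[source]
                break
    return tree


def tips(tree):
    """Nodes that never appear as a parent."""
    parents = set(tree.values())
    return sorted(n for n in tree if n not in parents)


# Hypothetical inputs: a taxonomy covering every taxon, and a smaller
# phylogeny that resolves relationships among three of the families.
taxonomy = {"FamA": "Order", "FamB": "Order", "FamC": "Order",
            "g1": "FamA", "g2": "FamB", "g3": "FamC", "g4": "FamC"}
phylogeny = {"FamA": "Order", "clade1": "Order",
             "FamB": "clade1", "FamC": "clade1"}

graph = defaultdict(dict)
add_tree(graph, "taxonomy", taxonomy)
add_tree(graph, "phylogeny", phylogeny)

# Favour phylogenetic branches; fall back to taxonomy elsewhere.
synthetic = synthesize(graph, ranking=["phylogeny", "taxonomy"])
print(synthetic["FamB"])  # parent taken from the phylogeny
print(tips(synthetic))    # every taxonomy tip is still present
```

Because the taxonomy supplies an edge for every taxon, the synthetic tree keeps all tips even though the phylogeny covers only a few of them, which mirrors the 30-taxon phylogeny and 578-taxon result described above.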
AUTOMATIC UPDATING
  update trees with new sequence data
  detect and synthesize newly published trees
Open Data
  increasing availability of digital data associated with phylogeny publications
  synthetic tree open to community annotation and new data submission
  whole tree / subtrees available for download
Open Science
  project wiki: http://opentree.wikispaces.com/
  open source software: https://github.com/OpenTreeOfLife
  public mailing list, meeting notes, management tools
provide complete phylogenetic framework
link to biodiversity and systematics content
API for downloading subtrees to analysis tools
source / storage of underlying data
opentreeoflife.org

We've only just started (June 1 2012)
Open to input, feedback and participation:
  join the mailing list & wiki
  add publications to the Mendeley group
  vote / comment on plans on the development boards
  participate in virtual data curation sprint in 2013
