Betaspikes
The Beta distribution approach
PAULA TATARU
AARHUS
UNIVERSITY
Bioinformatics
Research Centre
Oxford, July 19th 2014
Modelling allele frequency data under the
Wright Fisher model of drift, mutation and selection
Joint work with Asger Hobolth and Thomas Bataillon
Allele frequencies: the Beta distribution approach
Paula Tataru paula@birc.au.dk
AARHUS
UNIVERSITY
Bioinformatics
Research Centre
Motivation
›Infer population parameters from DNA data
› mutation rates
› selection coefficients
› split times
› variable population size back in time
›Backward in time (coalescent)
›Forward in time (Wright Fisher)
The Wright Fisher model
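A minimal sketch of the discrete Wright Fisher transition matrix, assuming a diploid population of size N (2N gene copies), binomial resampling every generation, and linear mutation pressure. The rates `u` (mutation away from the allele) and `v` (mutation towards it) and the function names are illustrative assumptions, not taken from the slides; selection is left out.

```python
import numpy as np
from math import comb

def wf_transition_matrix(N, u=0.0, v=0.0):
    """P[i, j] = P(j copies next generation | i copies now), with 2N gene copies.

    Intended for small N only: the binomial coefficients are computed exactly.
    """
    n = 2 * N
    P = np.zeros((n + 1, n + 1))
    for i in range(n + 1):
        x = i / n
        g = (1.0 - u) * x + v * (1.0 - x)  # expected frequency after mutation
        for j in range(n + 1):
            P[i, j] = comb(n, j) * g**j * (1.0 - g) ** (n - j)
    return P

def wf_distribution(N, x0, t, u=0.0, v=0.0):
    """Distribution of the allele count after t generations, started at frequency x0."""
    n = 2 * N
    dist = np.zeros(n + 1)
    dist[round(x0 * n)] = 1.0  # all mass on the starting count
    P = wf_transition_matrix(N, u, v)
    for _ in range(t):
        dist = dist @ P
    return dist
```

Without mutation the states 0 and 2N are absorbing, which is where probability mass at loss and fixation accumulates over time.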
Approximations to the WF
›Diffusion
› Kimura 1964
› Gautier & Vitalis 2013
› Malaspinas et al. 2012
› Steinrücken et al. 2013
› Zhao et al. 2013
›Moment based
› Normal distribution
› Nicholson et al. 2002
› Pickrell & Pritchard 2012
› Beta distribution
› Balding & Nichols 1995
› Sirén et al. 2011
› Beta with spikes
The Beta approximation
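A moment-based Beta approximation picks the Beta distribution whose mean and variance equal those of the allele frequency. The sketch below shows only that matching step, using the standard Beta moment formulas; the function name is hypothetical and how the mean and variance are obtained is left open.

```python
def beta_from_moments(mean, var):
    """Shape parameters (alpha, beta) of the Beta with the given mean and variance."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    max_var = mean * (1.0 - mean)  # upper bound on the variance of any law on [0, 1]
    if not 0.0 < var < max_var:
        raise ValueError("variance must lie strictly between 0 and mean * (1 - mean)")
    s = max_var / var - 1.0  # equals alpha + beta
    return mean * s, (1.0 - mean) * s
```

For example, a mean of 0.3 and a variance of 0.01 give alpha = 6 and beta = 14.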
The Beta with spikes approximation
›The density of Xt
›Use recursive approach to calculate
› mean and variance
› loss and fixation probabilities
› mean and variance conditional on polymorphism
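One way to write such a density, assuming the spikes are point masses at the two boundaries and the interior part is a Beta matched to the mean and variance conditional on polymorphism. The symbols $\alpha_t^*$ and $\beta_t^*$ are illustrative notation, not copied from the slides.

```latex
f(x; t) = P(X_t = 0)\,\delta(x) + P(X_t = 1)\,\delta(1 - x)
        + P(X_t \notin \{0, 1\})\,
          \frac{x^{\alpha_t^* - 1}(1 - x)^{\beta_t^* - 1}}{B(\alpha_t^*, \beta_t^*)}
```

Here $\delta$ is the Dirac delta and $B$ is the Beta function, so the three weights sum to one.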
›Hellinger distance
› true vs approximated distributions
› between 0 and 1
›Stationary: Beta distribution
›Diffusion > Beta with spikes > Beta
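The Hellinger distance used for this comparison can be sketched as follows, assuming both distributions have already been placed on the same discrete support (for instance the allele counts 0..2N, with a continuous approximation integrated over bins). The function name is hypothetical.

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two discrete distributions on the same support.

    Returns a value in [0, 1]: 0 for identical distributions,
    1 for distributions with disjoint support.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))
```

The distance is symmetric in its arguments, so it does not matter which of the true and approximated distributions is passed first.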
The Beta with spikes: worst fit
Inference of split times
›Felsenstein’s peeling algorithm
›Numerically optimized likelihood
›5000 loci
›100 samples in each population
›40 data sets
Conclusions
›Beta with spikes: new approximation to the WF
› Quality of approximation
› Consistent
› Diffusion > Beta with spikes > Beta
› Inference of split times
› Beta with spikes ~ Kim Tree
› Diffusion ?
Future work
›Inference of
› mutation rates
› selection coefficients
› variable population size
The Beta approximation
Mean and variance
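A sketch of a recursive computation of the mean and variance, assuming drift with linear mutation only, so that the expected frequency after mutation is g(x) = (1 - u - v) x + v. The recursion follows from the laws of total expectation and total variance applied to the binomial sampling step; the rates `u` and `v` and the function name are illustrative assumptions. With selection g is no longer linear and this recursion would not close exactly.

```python
def moments_recursion(N, x0, t, u=0.0, v=0.0):
    """Mean and variance of X_t under drift plus linear mutation, started at x0."""
    n = 2 * N            # number of gene copies
    a = 1.0 - u - v      # slope of the linear map g(x) = a * x + v
    mean, var = x0, 0.0
    for _ in range(t):
        new_mean = a * mean + v
        # Var(X') = E[g(1 - g)] / n + Var(g), with Var(g) = a^2 * Var(X)
        var = new_mean * (1.0 - new_mean) / n + (1.0 - 1.0 / n) * a * a * var
        mean = new_mean
    return mean, var
```

For pure drift (u = v = 0) this reduces to the closed form with mean x0 and variance x0 (1 - x0) (1 - (1 - 1/(2N))^t).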
Loss and fixation probabilities
