Slideshows by User: allPowerde / http://www.slideshare.net/images/logo.gif Slideshows by User: allPowerde / Mon, 16 Dec 2019 00:48:35 GMT SlideShare feed for Slideshows by User: allPowerde Cloud-native machine learning - Transforming bioinformatics research /slideshow/cloudnative-machine-learning-transforming-bioinformatics-research/206131228 2019giw-191216004835
Cloud computing and artificial intelligence transform bioinformatics research. Denis Bauer, Transformational Bioinformatics Team. Genomic data is outpacing traditional Big Data disciplines, producing more information than astronomy, Twitter, and YouTube combined. As such, genomic research has leapfrogged to the forefront of Big Data and cloud solutions. We developed software platforms using the latest in cloud architecture, artificial intelligence and machine learning to support every aspect of genome medicine, from disease gene detection through to validation and personalized medicine. This talk outlines how we find disease genes for complex genetic diseases, such as ALS, using VariantSpark, a custom machine learning implementation capable of dealing with whole genome sequencing data of 80 million common and rare variants. To support disease gene validation, we created GT-Scan, an innovative web application that we think of as the search engine for the genome. It enables researchers to identify the optimal editing spot to create animal models efficiently. The talk concludes by demonstrating how cloud-based software distribution channels (digital marketplaces) can be harnessed to share bioinformatics tools internationally and make research more reproducible.

Mon, 16 Dec 2019 00:48:35 GMT /slideshow/cloudnative-machine-learning-transforming-bioinformatics-research/206131228
Cloud-native machine learning - Transforming bioinformatics research from Denis C. Bauer
Translating genomics into clinical practice - 2018 AWS summit keynote /allPowerde/translating-genomics-into-clinical-practice-2018-aws-summit-keynote 2018awssummitcanberrabauer-180914012148
CSIRO's part of the co-presented keynote at the AWS Public Sector Summit in Canberra on genomic health care. Three key messages: 1) we need a shift from treatment towards prevention; 2) once you go serverless, you never go back; 3) DevOps 2.0: hypothesis-driven architecture evolution.

Fri, 14 Sep 2018 01:21:48 GMT /allPowerde/translating-genomics-into-clinical-practice-2018-aws-summit-keynote
Translating genomics into clinical practice - 2018 AWS summit keynote from Denis C. Bauer
Going Server-less for Web-Services that need to Crunch Large Volumes of Data /slideshow/going-serverless-for-webservices-that-need-to-crunch-large-volumes-of-data/91485721 2018agileindiaserverless-180322013541
AgileIndia breakout session on serverless applications. This talk covers how AWS serverless infrastructure can be used for a wide range of applications, such as compute-intensive tasks (GT-Scan), tasks requiring continuous learning (CryptoBreeder), and data-intensive tasks (PhenGen Database).

Thu, 22 Mar 2018 01:35:41 GMT /slideshow/going-serverless-for-webservices-that-need-to-crunch-large-volumes-of-data/91485721
Going Server-less for Web-Services that need to Crunch Large Volumes of Data from Denis C. Bauer
How novel compute technology transforms life science research /slideshow/how-novel-compute-technology-transforms-life-science-research-91485276/91485276 2018agileindiakeynote-180322012552
AgileIndia 2018 keynote. This talk covers how Datafication will make data wider (more features describing a data point), which represents a paradigm shift for machine learning applications. It also covers serverless architecture, which can cater for even compute-intensive tasks. It concludes by stating that business and life-science research are not that different: so let's build a community together!

Thu, 22 Mar 2018 01:25:52 GMT /slideshow/how-novel-compute-technology-transforms-life-science-research-91485276/91485276
How novel compute technology transforms life science research from Denis C. Bauer
How novel compute technology transforms life science research /slideshow/how-novel-compute-technology-transforms-life-science-research/69972032 2016govforumclouderabauer-161209011718
Unprecedented data volumes and pressure on turnaround time, driven by commercial applications, require bioinformatics solutions to evolve to meet these new demands. New compute paradigms and cloud-based IT solutions enable this transition. Here I present two solutions capable of meeting these demands: VariantSpark for genomic variant analysis and GT-Scan2 for genome engineering applications. VariantSpark classifies 3,000 individuals with 80 million genomic variants each in under 30 minutes. This Hadoop/Spark solution for machine learning on genomic data is hence capable of scaling up to population-size cohorts. GT-Scan2 identifies CRISPR target sites by minimizing off-target effects and maximizing on-target efficiency. This optimization is powered by AWS Lambda functions, which offer an always-on web service that can instantaneously recruit enough compute resources to keep runtime stable, even for queries with several thousand potential target sites.
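To make the serverless pattern described above concrete, here is a minimal sketch of an AWS Lambda handler in Python: one stateless invocation scores one candidate target site, so thousands of sites can be fanned out in parallel while runtime per query stays stable. The payload shape and the scoring function are hypothetical placeholders, not GT-Scan2's actual code or model.

```python
# Minimal sketch, assuming a hypothetical payload {"site": "ACGT..."} and a
# placeholder scoring function -- not GT-Scan2's actual implementation.
import json

def score_target_site(sequence: str) -> float:
    """Hypothetical on-target efficiency score; a real predictive model would go here."""
    gc = sum(base in "GC" for base in sequence)
    return gc / max(len(sequence), 1)

def lambda_handler(event, context):
    # Standard AWS Lambda Python entry point: each invocation handles one candidate site.
    site = event.get("site", "")
    return {
        "statusCode": 200,
        "body": json.dumps({"site": site, "score": score_target_site(site)}),
    }
```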

Fri, 09 Dec 2016 01:17:17 GMT /slideshow/how-novel-compute-technology-transforms-life-science-research/69972032
How novel compute technology transforms life science research from Denis C. Bauer
VariantSpark: applying Spark-based machine learning methods to genomic information /slideshow/variantspark-applying-sparkbased-machine-learning-methods-to-genomic-information/55799998 2015bigdatavariantspark-151203221321-lva1-app6892
Genomic information is increasingly used in medical practice, giving rise to the need for efficient analysis methodology able to cope with thousands of individuals and millions of variants. Here we introduce VariantSpark, which utilizes Hadoop/Spark along with its machine learning library, MLlib, providing the means of parallelisation for population-scale bioinformatics tasks. VariantSpark provides the interface to the standard variant format (VCF), offers seamless genome-wide sampling of variants and provides a pipeline for visualising results. To demonstrate the capabilities of VariantSpark, we clustered more than 3,000 individuals with 80 million variants each to determine the population structure in the dataset. VariantSpark is 80% faster than ADAM, the Spark-based genome clustering approach, than the comparable implementation using Hadoop/Mahout, and than Admixture, a commonly used tool for determining individual ancestries. It is over 90% faster than traditional implementations using R and Python. These benefits in speed, resource consumption and scalability enable VariantSpark to open up the use of advanced, efficient machine learning algorithms for genomic data. The package is written in Scala and available at https://github.com/BauerLab/VariantSpark.
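For illustration of the clustering workflow described above, here is a minimal sketch of grouping individuals by their genotype vectors with Spark's MLlib KMeans in Python. It assumes a toy genotype matrix rather than a parsed VCF and only demonstrates the Spark/MLlib approach the abstract describes; it is not VariantSpark's actual Scala API.

```python
# Minimal sketch (not VariantSpark's API): cluster individuals by variant genotypes
# with Spark MLlib. Each row is one individual; each feature is a 0/1/2 genotype call.
from pyspark.sql import SparkSession
from pyspark.ml.clustering import KMeans
from pyspark.ml.linalg import Vectors

spark = SparkSession.builder.appName("variant-clustering-sketch").getOrCreate()

# Toy genotype matrix standing in for features extracted from a VCF.
rows = [
    ("sample1", Vectors.dense([0.0, 1.0, 2.0, 0.0])),
    ("sample2", Vectors.dense([0.0, 1.0, 2.0, 1.0])),
    ("sample3", Vectors.dense([2.0, 0.0, 0.0, 2.0])),
    ("sample4", Vectors.dense([2.0, 0.0, 1.0, 2.0])),
]
df = spark.createDataFrame(rows, ["sample", "features"])

# Cluster individuals into putative population groups.
model = KMeans(k=2, seed=42, featuresCol="features").fit(df)
model.transform(df).select("sample", "prediction").show()

spark.stop()
```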

Thu, 03 Dec 2015 22:13:21 GMT /slideshow/variantspark-applying-sparkbased-machine-learning-methods-to-genomic-information/55799998
VariantSpark: applying Spark-based machine learning methods to genomic information from Denis C. Bauer
Population-scale high-throughput sequencing data analysis /slideshow/2014-winderschool-brisbane/37260670 2014winderschoolbrisbane-140722192904-phpapp01
Unprecedented computational capabilities and high-throughput data collection methods promise a new era of personalised, evidence-based healthcare, utilising individual genomic profiles to tailor health management, as demonstrated by recent successes in rare genetic disorders or stratified cancer treatments. However, processing genomic information at a scale relevant for the health system remains challenging due to high demands on data reproducibility and data provenance. Furthermore, the necessary computational resources require a large investment in compute hardware and IT personnel, which is a barrier to entry for small laboratories and difficult to maintain at peak times for larger institutes. This hampers the creation of time-reliable production informatics environments for clinical genomics. Commercial cloud computing frameworks, like Amazon Web Services (AWS), provide an economical alternative to in-house compute clusters as they allow outsourcing of computation to third-party providers while retaining software and compute flexibility. To cater for this resource-hungry, fast-paced yet sensitive environment of personalized medicine, we developed NGSANE, a Linux-based, HPC-enabled framework that minimises the overhead of setting up and processing new projects yet maintains full flexibility of custom scripting and data provenance when processing raw sequencing data, either on a local cluster or Amazon's Elastic Compute Cloud (EC2).

Tue, 22 Jul 2014 19:29:04 GMT /slideshow/2014-winderschool-brisbane/37260670
Population-scale high-throughput sequencing data analysis from Denis C. Bauer
Trip Report Seattle /slideshow/trip-report-seattle/15252955 cmisseminar2012-121119150522-phpapp02
The primary goal of my trip to Seattle was to establish a collaboration with a world-leading group on data integration. But by having chosen Seattle, a hub for technology companies, I also learned about synergies between business and research: Ilya Shmulevich from the Institute for Systems Biology makes use of Amazon's "Random Forest" implementation and Google's 600,000-CPU cluster for cancer genomic association discovery. I also met with experts from the University of Washington and Microsoft Research to learn about technological advancements to tackle Big Data and commoditize parallelization. Finally, I observed a government-funded research agency invest in solutions geared towards its enterprise structure rather than adopt solutions designed for research institutes without an active computational community. In conclusion: CSIRO has unique properties and skill-sets that many collaborators would be interested in benefiting from; in return, such collaborations would propel CSIRO instantly to the forefront of technology, which could be very rewarding, particularly for the analysis of big, unstructured datasets.

Mon, 19 Nov 2012 15:05:21 GMT /slideshow/trip-report-seattle/15252955
Trip Report Seattle from Denis C. Bauer
Allelic Imbalance for Pre-capture Whole Exome Sequencing /slideshow/allelic-imbalance-for-precapture-whole-exome-sequencing/12183146 bugbauer20122-120327180204-phpapp02
Exome sequencing has emerged as an economical way of focusing DNA sequencing efforts on the most functionally understood regions of the genome. Pre-capture pooling, where one bait library is used to pull down the exonic regions of several pooled samples simultaneously, is a further financial improvement. However, rare alleles in the pool might not be able to attract baits at the same rate as reference-conforming sequences can, and may hence be underrepresented. We investigated this potential issue by sequencing a HapMap family (4 individuals) using the pre-capture protocols from Illumina and NimbleGen. We did not observe clear evidence that heterozygous variants are missed, but noted a trend for indels to be imbalanced. Our findings do not provide clear evidence to rule out allelic imbalance or bias having an impact on research findings; this may be especially critical for low-cellularity cancer tissue, where rare alleles are more ubiquitous.

Tue, 27 Mar 2012 18:02:02 GMT /slideshow/allelic-imbalance-for-precapture-whole-exome-sequencing/12183146
Allelic Imbalance for Pre-capture Whole Exome Sequencing from Denis C. Bauer
Centralizing sequence analysis /slideshow/centralizing-sequence-analysis/12027668 centralizingsequenceanalysis-120315213913-phpapp02
The first steps of analysing sequencing data (2GS, NGS) have entered a transitional period: on the one hand, most analysis steps can be automated and standardized (as a pipeline); on the other, constantly evolving protocols and software updates make maintaining these analysis pipelines labour-intensive. I propose a centralized system within CSIRO that is flexible enough to cater for different analyses while also being generic enough to efficiently distribute labour-intensive maintenance and extension amongst the user community.

Thu, 15 Mar 2012 21:39:11 GMT /slideshow/centralizing-sequence-analysis/12027668
Centralizing sequence analysis from Denis C. Bauer
Qbi Centre for Brain genomics (Informatics side) /allPowerde/qbi-centre-for-brain-genomics-informatics-side 09-08-qbi-cbg-the-informatics-side-110907205315-phpapp01
An overview of QBI's production informatics framework, with an emphasis on which services will be provided and how the resulting data is made available: from interactive quality control to integration with external data on the genome browser.

Wed, 07 Sep 2011 20:53:14 GMT /allPowerde/qbi-centre-for-brain-genomics-informatics-side
Qbi Centre for Brain genomics (Informatics side) from Denis C. Bauer
Differential gene expression /slideshow/differential-gene-expression/9169043 08-25-differential-gene-expression-110907205041-phpapp02
This session follows up from transcript quantification of RNAseq data and discusses statistical means of identifying differentially regulated transcripts and isoforms, contrasting these against microarray analysis approaches.

Wed, 07 Sep 2011 20:50:39 GMT /slideshow/differential-gene-expression/9169043
Differential gene expression from Denis C. Bauer
Transcript detection in RNAseq /slideshow/transcript-detection-in-rnaseq/8822867 08-11-rnaseq-110810202204-phpapp01
Abstract: This session focuses on the differences between standard DNA mapping and RNAseq-specific transcript mapping: identifying splice variants and isoforms. Transcript quantification and the genomic variants that can be identified from RNAseq data will also be discussed.

Wed, 10 Aug 2011 20:22:00 GMT /slideshow/transcript-detection-in-rnaseq/8822867
Transcript detection in RNAseq from Denis C. Bauer
Functionally annotate genomic variants /slideshow/functionally-annotate-genomic-variants/8776547 07-21-makingsense-110804203033-phpapp01
This seminar aims to answer the question of what to make of the identified variants, specifically how to evaluate their quality, prioritize them and functionally annotate them.

Thu, 04 Aug 2011 20:30:31 GMT /slideshow/functionally-annotate-genomic-variants/8776547
Functionally annotate genomic variants from Denis C. Bauer
Variant (SNPs/Indels) calling in DNA sequences, Part 2 /slideshow/variant-snpsindels-calling-in-dna-sequences-part-2/8590032 07-14-variant-2-2-110713204120-phpapp02
Abstract: This session will focus on the steps involved in identifying genomic variants after an initial mapping has been achieved: improving the mapping, SNP and indel calling, and variant filtering/recalibration will be introduced.

Wed, 13 Jul 2011 20:41:16 GMT /slideshow/variant-snpsindels-calling-in-dna-sequences-part-2/8590032
Variant (SNPs/Indels) calling in DNA sequences, Part 2 from Denis C. Bauer
Variant (SNPs/Indels) calling in DNA sequences, Part 1 /slideshow/variant-snpsindels-calling-in-dna-sequences-part-1/8463681 06-30-variant-1-2-110629201351-phpapp02
Abstract: This session will focus on the first steps involved in identifying SNPs from whole genome, exome capture or targeted resequencing data: the different approaches for mapping reads to a DNA reference sequence will be introduced and quality metrics discussed.

Wed, 29 Jun 2011 20:13:46 GMT /slideshow/variant-snpsindels-calling-in-dna-sequences-part-1/8463681
Variant (SNPs/Indels) calling in DNA sequences, Part 1 from Denis C. Bauer
Introduction to second generation sequencing /slideshow/introduction-to-second-generation-sequencing/8394573 06-23-introductionto2gs-110622202743-phpapp01
An introduction to second generation sequencing will be given, with a focus on basic production informatics: the approach to raw data conversion and quality control will be discussed.

Wed, 22 Jun 2011 20:27:40 GMT /slideshow/introduction-to-second-generation-sequencing/8394573
Introduction to second generation sequencing from Denis C. Bauer
Introduction to Bioinformatics /slideshow/introduction-to-bioinformatics-8322153/8322153 06-16-introduction-110615202903-phpapp01
An introduction to bioinformatics practices and aims will be given and contrasted against approaches from other fields. Most importantly, it will be discussed how bioinformatics fits into the discovery cycle for hypothesis-driven neuroscience research.

Wed, 15 Jun 2011 20:29:00 GMT /slideshow/introduction-to-bioinformatics-8322153/8322153
Introduction to Bioinformatics from Denis C. Bauer
The missing data issue for HiSeq runs /slideshow/the-missing-data-issue-for-hiseq-runs/6095087 missingdataissue-101209171223-phpapp02
Critical Run files can be missing or corrupt after the Run folder has been transferred from the HiSeq storage to the cluster storage. This presentation discusses the issue and suggests four workarounds.

Thu, 09 Dec 2010 17:12:16 GMT /slideshow/the-missing-data-issue-for-hiseq-runs/6095087
The missing data issue for HiSeq runs from Denis C. Bauer
Deciphering the regulatory code in the genome /slideshow/deciphering-the-regulatory-code-in-the-genome-1888933/1888933 presentationstandalone-090821004428-phpapp01
There are messages hidden within our genome, regulating when and for how long a gene is switched on. The presentation describes a method, STREAM, aimed at deciphering this regulatory code.

Fri, 21 Aug 2009 00:44:19 GMT /slideshow/deciphering-the-regulatory-code-in-the-genome-1888933/1888933
Deciphering the regulatory code in the genome from Denis C. Bauer
Dr Bauer, BSc(Hons), PhD, is an internationally accredited bioinformatics researcher and team leader at CSIRO. Her focus lies on high-performance compute systems for integrating large volumes of data to inform strategic interventions in human health. She has a BSc(Hons) and PhD (2010) in Bioinformatics (Germany, UQ) and Post-Docs in machine learning and genetics (IMB, QBI). She has published in high-impact-factor journals (Nature Genetics and Genome Research), and her 25 peer-reviewed publications (nine as first author, five as senior author) have been cited more than 500 times (h-index 10). She is involved in the three Australian genomics alliances, jointly funded with more than $100M, and ... www.allPower.de