This talk was presented at Supercomputing '09 in the Cloud Computing for Systems and Computational Biology workshop. It describes the proteomics analysis package we built using Amazon's cloud computing architecture. More information is available in our paper in J. Proteome Res.:
http://pubs.acs.org/doi/abs/10.1021/pr800970z
Virtual Proteomics Analysis Cluster in the Cloud
1. It's always sunny on top of the Cloud! An intro to Amazon Web Services. Simon Twigger, Ph.D., Medical College of Wisconsin, Milwaukee. ViPDAC, a stand-alone Proteomics Analysis Suite in the Cloud
2. "How the humble pipette tip helped us rethink our computing strategy..."
8. What would you do if there was only one tip? Wait in line to use it. Run fewer experiments (due to waiting in line). Do small-scale things (it's a small tip, pipetting 5 L takes all week!). Try fewer things (it's a real pain to keep washing it up). Not try anything weird (what happens if it gets permanently clogged!?).
9. OK, more computers might be better... but... We don't have the money! We don't have an IT guy/gal. We don't have a sysadmin. We don't know how to install a cluster. We won't use it all the time.
18. Observations: Sign up & start up is hard for biologists. http://www.directthought.com/ http://www.elasticpod.com/
19. Now what? No need to wait in line to use it. No need to run fewer analyses. No need to do small-scale things. No need to try fewer things. No need to not try anything weird. Molly's ViPDAC, Shama's ViPDAC, Brian's ViPDAC, Bassam's ViPDAC.
21. Clouds & Bioinformatics: Our observations so far. Use it as a software delivery method. Use it to provide computing to virtually anyone. Get fast access to large data files (Ensembl, GenBank, etc.). Use it to COMPLEMENT existing clusters/grids. AMIs/Apps not easy for non-informatics folks to get going. "Cloud-friendly" licensing structures for commercial software? "Grant-friendly" billing options. Data transfer for large datasets (NextGen sequencing?).
22. Acknowledgements: Joey Geiger, Brian Halligan and Andrew Vallejos; Molly Pellitteri-Hahn, Shama Mirsa; Mike Olivier, Andy Greene; NHLBI National Proteomics Center. "Low Cost, Scalable Proteomics Data Analysis Using Amazon's Cloud Computing Services and Open Source Search Algorithms." J. Proteome Res., 2009, 8 (6), pp 3148–3153
Editor's Notes
#21: Internally we now use a hybrid solution: Sequest and Mascot run on local clusters, while X!Tandem and OMSSA run on AWS. Raw data can be sent to any and all of these algorithms through an integrated workflow system.
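The routing idea in this note can be illustrated with a minimal Python sketch. It is not the actual workflow system, which the slides do not describe; the names `BACKENDS`, `SearchJob`, and `plan_jobs` are hypothetical, and the only detail taken from the note is which search engine runs where.

```python
from dataclasses import dataclass

# Engine-to-backend mapping taken from the note above.
# Everything else in this sketch is a hypothetical illustration.
BACKENDS = {
    "sequest": "local",
    "mascot": "local",
    "xtandem": "aws",
    "omssa": "aws",
}


@dataclass(frozen=True)
class SearchJob:
    raw_file: str   # path to the raw mass-spec data file
    engine: str     # normalized search engine name
    backend: str    # "local" cluster or "aws"


def plan_jobs(raw_file, engines=None):
    """Return one SearchJob per requested engine (all engines by default)."""
    if engines is None:
        engines = list(BACKENDS)
    jobs = []
    for engine in engines:
        # Normalize names such as "X!Tandem" to the keys used in BACKENDS.
        key = engine.lower().replace("!", "")
        if key not in BACKENDS:
            raise ValueError(f"unknown search engine: {engine}")
        jobs.append(SearchJob(raw_file, key, BACKENDS[key]))
    return jobs


if __name__ == "__main__":
    for job in plan_jobs("sample.raw"):
        print(f"{job.raw_file}: {job.engine} -> {job.backend}")
```

Calling `plan_jobs("sample.raw", ["X!Tandem", "Mascot"])` would plan one AWS job and one local-cluster job for the same raw file, which is the sense in which the same data can be sent to any and all of the algorithms.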