ºÝºÝߣshows by User: DanHalperin / http://www.slideshare.net/images/logo.gif ºÝºÝߣshows by User: DanHalperin / Thu, 08 Dec 2016 09:04:07 GMT ºÝºÝߣShare feed for ºÝºÝߣshows by User: DanHalperin Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel Efficiency /slideshow/introduction-to-apache-beam-no-shard-left-behind-apis-for-massive-parallel-efficiency/69944676 2016-12-07dhalperistrata-161208090407
Apache Beam (incubating) is a unified batch and streaming data processing programming model that is efficient and portable. Beam evolved from a decade of system-building at Google, and Beam pipelines run today on both open source (Apache Flink, Apache Spark) and proprietary (Google Cloud Dataflow) runners. This talk will focus on I/O and connectors in Apache Beam, specifically its APIs for efficient, parallel, adaptive I/O. Google will discuss how these APIs enable a Beam data processing pipeline runner to dynamically rebalance work at runtime, to work around stragglers, and to automatically scale up and down cluster size as a job’s workload changes. Together these APIs and techniques enable Apache Beam runners to efficiently use computing resources without compromising on performance or correctness. Practical examples and a demonstration of Beam will be included.]]>

Apache Beam (incubating) is a unified batch and streaming data processing programming model that is efficient and portable. Beam evolved from a decade of system-building at Google, and Beam pipelines run today on both open source (Apache Flink, Apache Spark) and proprietary (Google Cloud Dataflow) runners. This talk will focus on I/O and connectors in Apache Beam, specifically its APIs for efficient, parallel, adaptive I/O. Google will discuss how these APIs enable a Beam data processing pipeline runner to dynamically rebalance work at runtime, to work around stragglers, and to automatically scale up and down cluster size as a job’s workload changes. Together these APIs and techniques enable Apache Beam runners to efficiently use computing resources without compromising on performance or correctness. Practical examples and a demonstration of Beam will be included.]]>
Thu, 08 Dec 2016 09:04:07 GMT /slideshow/introduction-to-apache-beam-no-shard-left-behind-apis-for-massive-parallel-efficiency/69944676 DanHalperin@slideshare.net(DanHalperin) Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel Efficiency DanHalperin Apache Beam (incubating) is a unified batch and streaming data processing programming model that is efficient and portable. Beam evolved from a decade of system-building at Google, and Beam pipelines run today on both open source (Apache Flink, Apache Spark) and proprietary (Google Cloud Dataflow) runners. This talk will focus on I/O and connectors in Apache Beam, specifically its APIs for efficient, parallel, adaptive I/O. Google will discuss how these APIs enable a Beam data processing pipeline runner to dynamically rebalance work at runtime, to work around stragglers, and to automatically scale up and down cluster size as a job’s workload changes. Together these APIs and techniques enable Apache Beam runners to efficiently use computing resources without compromising on performance or correctness. Practical examples and a demonstration of Beam will be included. <img style="border:1px solid #C3E6D8;float:right;" alt="" src="https://cdn.slidesharecdn.com/ss_thumbnails/2016-12-07dhalperistrata-161208090407-thumbnail.jpg?width=120&amp;height=120&amp;fit=bounds" /><br> Apache Beam (incubating) is a unified batch and streaming data processing programming model that is efficient and portable. Beam evolved from a decade of system-building at Google, and Beam pipelines run today on both open source (Apache Flink, Apache Spark) and proprietary (Google Cloud Dataflow) runners. This talk will focus on I/O and connectors in Apache Beam, specifically its APIs for efficient, parallel, adaptive I/O. Google will discuss how these APIs enable a Beam data processing pipeline runner to dynamically rebalance work at runtime, to work around stragglers, and to automatically scale up and down cluster size as a job’s workload changes. Together these APIs and techniques enable Apache Beam runners to efficiently use computing resources without compromising on performance or correctness. Practical examples and a demonstration of Beam will be included.
Introduction to Apache Beam & No Shard Left Behind: APIs for Massive Parallel Efficiency from Dan Halperin
]]>
1440 3 https://cdn.slidesharecdn.com/ss_thumbnails/2016-12-07dhalperistrata-161208090407-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 0
https://public.slidesharecdn.com/v2/images/profile-picture.png