Slideshows by User: ydn / http://www.slideshare.net/images/logo.gif / Mon, 21 Oct 2019 22:04:46 GMT
SlideShare feed for Slideshows by User: ydn
Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media /slideshow/developing-mobile-apps-for-performance-swapnil-patel-verizon-media/184976203 developingmobileappsforperformance-final-191021220446
Presented at https://www.meetup.com/Mobile-Apps-Performance-SF-Events/events/257929211/.
Mon, 21 Oct 2019 22:04:46 GMT ydn@slideshare.net(ydn)
Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media from Yahoo Developer Network
844 3 https://cdn.slidesharecdn.com/ss_thumbnails/developingmobileappsforperformance-final-191021220446-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infrastructures - Tatsuya Yano, Yahoo Japan /slideshow/athenz-the-opensource-solution-to-provide-access-control-in-dynamic-infrastructures-tatsuya-yano-yahoo-japan/153060910 athenz-theopen-sourcesolutiontoprovideaccesscontrolindynamicinfrastructurestatsuyayano-190702033344
Learn more at http://www.athenz.io.
Tue, 02 Jul 2019 03:33:43 GMT ydn@slideshare.net(ydn)
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infrastructures - Tatsuya Yano, Yahoo Japan from Yahoo Developer Network
564 4 https://cdn.slidesharecdn.com/ss_thumbnails/athenz-theopen-sourcesolutiontoprovideaccesscontrolindynamicinfrastructurestatsuyayano-190702033344-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan /slideshow/athenz-spiffe-tatsuya-yano-yahoo-japan/153058082 athenzspiffe-190702032001
Presented at the SPIFFE Meetup in Tokyo. Athenz (www.athenz.io) is an open source platform for X.509 certificate-based service authentication and fine-grained access control in dynamic infrastructures.
Tue, 02 Jul 2019 03:20:01 GMT ydn@slideshare.net(ydn)
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan from Yahoo Developer Network
632 2 https://cdn.slidesharecdn.com/ss_thumbnails/athenzspiffe-190702032001-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tatsuya Yano, Yahoo Japan /slideshow/athenz-with-istio-single-access-control-model-in-cloud-infrastructures/153056092 athenzwithistio-singleaccesscontrolmodelincloudinfrastructures1-190702030459
Athenz (www.athenz.io) is an open source platform for X.509 certificate-based service authentication and fine-grained access control in dynamic infrastructures that provides options to run multi-environments with a single access control model.
Tue, 02 Jul 2019 03:04:59 GMT ydn@slideshare.net(ydn)
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tatsuya Yano, Yahoo Japan from Yahoo Developer Network
568 4 https://cdn.slidesharecdn.com/ss_thumbnails/athenzwithistio-singleaccesscontrolmodelincloudinfrastructures1-190702030459-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
CICD at Oath using Screwdriver /slideshow/cicd-at-oath-using-screwdriver/127656068 cicdatoathusingscrewdriver-190109212538
Jithin Emmanuel, Sr. Software Development Manager, Developer Platform Services, provides an overview of Screwdriver (http://www.screwdriver.cd), and shares how it’s used at scale for CI/CD at Oath. Jithin leads the product development and operations of Screwdriver, which is a flagship CI/CD product used at scale in Oath.
Wed, 09 Jan 2019 21:25:38 GMT ydn@slideshare.net(ydn)
CICD at Oath using Screwdriver from Yahoo Developer Network
890 5 https://cdn.slidesharecdn.com/ss_thumbnails/cicdatoathusingscrewdriver-190109212538-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath /slideshow/big-data-serving-with-vespa-jon-bratseth-distinguished-architect-oath/125333417 bigdataservingwithvespa1-181207225414
Offline and stream processing of big data sets can be done with tools such as Hadoop, Spark, and Storm, but what if you need to process big data at the time a user is making a request? Vespa (http://www.vespa.ai) allows you to search, organize, and evaluate machine-learned models from e.g. TensorFlow over large, evolving data sets with latencies in the tens of milliseconds. Vespa is behind the recommendation, ad targeting, and search at Yahoo, where it handles billions of daily queries over billions of documents.
Fri, 07 Dec 2018 22:54:14 GMT ydn@slideshare.net(ydn)
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath from Yahoo Developer Network
816 4 https://cdn.slidesharecdn.com/ss_thumbnails/bigdataservingwithvespa1-181207225414-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu /slideshow/how-twitter-hadoop-chose-google-cloud-joep-rottinghuis-lohit-vijayarenu/121612758 howtwitterhadoopchosegooglecloud-181102235452
Presented at the Hadoop Contributors Meetup, hosted by Oath. Explore career opportunities at Oath: https://www.oath.com/careers/search-jobs/.
Fri, 02 Nov 2018 23:54:51 GMT ydn@slideshare.net(ydn)
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu from Yahoo Developer Network
479 3 https://cdn.slidesharecdn.com/ss_thumbnails/howtwitterhadoopchosegooglecloud-181102235452-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool /slideshow/the-future-of-hadoop-in-an-ai-world-milind-bhandarkar-ceo-ampool/121069442 futureofhadoopinaiworld-181029134733
Presented at the Hadoop Contributors Meetup, hosted by Oath. Explore career opportunities at Oath: https://www.oath.com/careers/search-jobs/.
Mon, 29 Oct 2018 13:47:33 GMT ydn@slideshare.net(ydn)
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool from Yahoo Developer Network
484 3 https://cdn.slidesharecdn.com/ss_thumbnails/futureofhadoopinaiworld-181029134733-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara, Botong Huang /slideshow/apache-yarn-federation-and-tez-at-microsoft-anupam-upadhyay-adrian-nicoara-botong-huang/120489907 hadoopdevelopersummitsept2018-181023232330
Presented at the Hadoop Contributors Meetup, hosted by Oath. Explore career opportunities at Oath: https://www.oath.com/careers/search-jobs/.
Tue, 23 Oct 2018 23:23:30 GMT ydn@slideshare.net(ydn)
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara, Botong Huang from Yahoo Developer Network
440 2 https://cdn.slidesharecdn.com/ss_thumbnails/hadoopdevelopersummitsept2018-181023232330-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shane Kumpf, Hortonworks /slideshow/containerized-services-on-apache-hadoop-yarn-past-present-and-future-shane-kumpf-hortonworks/120337928 containerizedservicesonyarn-pastpresentandfuture-181022191632
Presented at the Hadoop Contributors Meetup, hosted by Oath. Explore career opportunities at Oath: https://www.oath.com/careers/search-jobs/.
Mon, 22 Oct 2018 19:16:32 GMT ydn@slideshare.net(ydn)
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shane Kumpf, Hortonworks from Yahoo Developer Network
317 3 https://cdn.slidesharecdn.com/ss_thumbnails/containerizedservicesonyarn-pastpresentandfuture-181022191632-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath /slideshow/hdfs-scalability-and-security-daryn-sharp-senior-engineer-oath-120063924/120063924 meetup20181-181020002219
Presented at the Hadoop Contributors Meetup, hosted by Oath. Explore career opportunities at Oath: https://www.oath.com/careers/search-jobs/.
Sat, 20 Oct 2018 00:22:19 GMT ydn@slideshare.net(ydn)
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath from Yahoo Developer Network
185 2 https://cdn.slidesharecdn.com/ss_thumbnails/meetup20181-181020002219-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda Tan, Hortonworks /slideshow/hadoop-submarine-project-running-deep-learning-workloads-on-yarn-wangda-tan-hortonworks/120043366 submarine-v0-181019173620
Presented at the Hadoop Contributors Meetup, hosted by Oath. Explore career opportunities at Oath: https://www.oath.com/careers/search-jobs/.
Fri, 19 Oct 2018 17:36:20 GMT ydn@slideshare.net(ydn)
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda Tan, Hortonworks from Yahoo Developer Network
1630 6 https://cdn.slidesharecdn.com/ss_thumbnails/submarine-v0-181019173620-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
Moving the Oath Grid to Docker, Eric Badger, Oath /slideshow/moving-the-oath-grid-to-docker-eric-badger-oath/119958388 movingtheoathgridtodocker-181019032139
Presented at the Hadoop Contributors Meetup, hosted by Oath. Explore career opportunities at Oath: https://www.oath.com/careers/search-jobs/.
Fri, 19 Oct 2018 03:21:39 GMT ydn@slideshare.net(ydn)
Moving the Oath Grid to Docker, Eric Badger, Oath from Yahoo Developer Network
295 2 https://cdn.slidesharecdn.com/ss_thumbnails/movingtheoathgridtodocker-181019032139-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
Architecting Petabyte Scale AI Applications /ydn/architecting-petabyte-scale-ai-applications petabytescaleaiplatformaisummit-181008200620
Presented at the AI Summit SF by Ganesh Harinath, VP of Engineering, Big Data and Artificial Intelligence, Oath.
Mon, 08 Oct 2018 20:06:20 GMT ydn@slideshare.net(ydn)
Architecting Petabyte Scale AI Applications from Yahoo Developer Network
540 3 https://cdn.slidesharecdn.com/ss_thumbnails/petabytescaleaiplatformaisummit-181008200620-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth, Distinguished Architect, Oath /slideshow/introduction-to-vespa-the-open-source-big-data-serving-engine/118017098 bigdataservingwithvespa-181003193658
Offline and stream processing of big data sets can be done with tools such as Hadoop, Spark, and Storm, but what if you need to process big data at the time a user is making a request? This presentation introduces Vespa (http://vespa.ai) – the open source big data serving engine. Vespa allows you to search, organize, and evaluate machine-learned models from e.g. TensorFlow over large, evolving data sets with latencies in the tens of milliseconds. Vespa is behind the recommendation, ad targeting, and search at Yahoo, where it handles billions of daily queries over billions of documents, and was recently open sourced at http://vespa.ai.
Wed, 03 Oct 2018 19:36:58 GMT ydn@slideshare.net(ydn)
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth, Distinguished Architect, Oath from Yahoo Developer Network
1849 5 https://cdn.slidesharecdn.com/ss_thumbnails/bigdataservingwithvespa-181003193658-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
Jun 2017 HUG: YARN Scheduling – A Step Beyond /slideshow/jun-2017-hug-yarn-scheduling-a-step-beyond/77189827 2017junapachehadoopyarn-capacityschedulerimprovements-170622221801
The YARN Capacity Scheduler has recently gained several critical features and undergone significant refactoring. Here is a quick look at some of the recent scheduler changes: global scheduling support; general placement support; a better preemption model to handle resource anomalies across and within queues; absolute resource configuration support; and priority support between queues and applications. In this talk, we take a deep dive into each of these new features to give a clearer picture of their usage and performance. We also give a brief overview of the ongoing efforts and how they can help solve some of the core issues we face today. Speakers: Sunil Govind (Hortonworks), Jian He (Hortonworks)
Thu, 22 Jun 2017 22:18:00 GMT ydn@slideshare.net(ydn)
Jun 2017 HUG: YARN Scheduling – A Step Beyond from Yahoo Developer Network
1068 2 https://cdn.slidesharecdn.com/ss_thumbnails/2017junapachehadoopyarn-capacityschedulerimprovements-170622221801-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies /slideshow/jun-2017-hug-largescale-machine-learning-use-cases-and-technologies/77189732 2017junmlusecasetechnology-170622221201
In recent years, Yahoo has brought the big data ecosystem and machine learning together to discover mathematical models for search ranking, online advertising, content recommendation, and mobile applications. We use distributed computing clusters with CPUs and GPUs to train these models from 100’s of petabytes of data.  A collection of distributed algorithms have been developed to achieve 10-1000x the scale and speed of alternative solutions. Our algorithms construct regression/classification models and semantic vectors within hours, even for billions of training examples and parameters. We have made our distributed deep learning solutions, CaffeOnSpark and TensorFlowOnSpark, available as open source.  In this talk, we highlight Yahoo use cases where big data and machine learning technologies are best exemplified. We explain algorithm/system challenges to scale ML algorithms for massive datasets. We provide a technical overview of CaffeOnSpark and TensorFlowOnSpark to jumpstart your journey of large-scale machine learning. Speakers: Andy Feng is a VP of Architecture at Yahoo, leading the architecture and design of big data and machine learning initiatives. He has architected large-scale systems for personalization, ad serving, NoSQL, and cloud infrastructure. Prior to Yahoo, he was a Chief Architect at Netscape/AOL, and Principal Scientist at Xerox. He received a Ph.D. degree in computer science from Osaka University, Japan. ]]>

Thu, 22 Jun 2017 22:12:01 GMT /slideshow/jun-2017-hug-largescale-machine-learning-use-cases-and-technologies/77189732 ydn@slideshare.net(ydn) Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies ydn
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies from Yahoo Developer Network
]]>
1041 7 https://cdn.slidesharecdn.com/ss_thumbnails/2017junmlusecasetechnology-170622221201-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Problems while Operationalizing Big Data Apps /slideshow/february-2017-hug-slow-stuck-or-runaway-apps-learn-how-to-quickly-fix-problems-while-operationalizing-big-data-apps/72285824 2017febunravelhugfinal-170217203345
Spark and SQL-on-Hadoop have made it easier than ever for enterprises to create or migrate apps to the big data stack. Thousands of apps are being generated every day in the form of ETL and modeling pipelines, business intelligence and data cubes, deep machine learning, graph analytics, and real-time data streaming. However, the task of reliably operationalizing these big data apps involves many pain points. Developers may not have the experience in distributed systems to tune apps for efficiency and performance. Diagnosing failures or unpredictable performance of apps can be a laborious process that involves multiple people. Apps may get stuck or steal resources and cause mission-critical apps to miss SLAs. This talk will introduce the audience to these problems and their common causes. We will also demonstrate how to find and fix these problems quickly, as well as prevent such problems from happening in the first place. Speakers: Dr. Shivnath Babu is a Co-founder and CTO of Unravel and Associate Professor of Computer Science at Duke University. With more than a decade of experience researching the ease of use and manageability of data-intensive systems, he leads the Starfish project at Duke, which pioneered the automation of Hadoop application tuning, problem diagnosis, and resource management. Shivnath has more than 80 peer-reviewed publications to his credit and has received the U.S. National Science Foundation CAREER Award, the HP Labs Innovation Award, and three IBM Faculty Awards. ]]>

Fri, 17 Feb 2017 20:33:45 GMT /slideshow/february-2017-hug-slow-stuck-or-runaway-apps-learn-how-to-quickly-fix-problems-while-operationalizing-big-data-apps/72285824 ydn@slideshare.net(ydn) February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Problems while Operationalizing Big Data Apps ydn
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Problems while Operationalizing Big Data Apps from Yahoo Developer Network
]]>
737 4 https://cdn.slidesharecdn.com/ss_thumbnails/2017febunravelhugfinal-170217203345-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex /slideshow/february-2017-hug-exactlyonce-endtoend-processing-with-apache-apex/72285662 2017febapacheapexendtoendexactlyonce-170217202847
Apache Apex (http://apex.apache.org/) is a stream processing platform that helps organizations build processing pipelines with fault tolerance and strong processing guarantees. It was built to support low processing latency, high throughput, scalability, interoperability, high availability, and security. The platform comes with the Malhar library, an extensive collection of processing operators and a wide range of input and output connectors for out-of-the-box integration with existing infrastructure. In this talk, I will describe how connectors, together with distributed checkpointing (a mechanism Apex uses to support fault tolerance and high availability), provide exactly-once end-to-end processing guarantees. Speakers: Vlad Rozov is an Apache Apex PMC member and a back-end engineer at DataTorrent, where he focuses on the buffer server, the Apex platform network layer, benchmarks, and optimizing the core components for low latency and high throughput. Prior to DataTorrent, Vlad worked on a distributed BI platform at Huawei and on a multi-dimensional database (OLAP) at Hyperion Solutions and Oracle. ]]>

Fri, 17 Feb 2017 20:28:47 GMT /slideshow/february-2017-hug-exactlyonce-endtoend-processing-with-apache-apex/72285662 ydn@slideshare.net(ydn) February 2017 HUG: Exactly-once end-to-end processing with Apache Apex ydn
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex from Yahoo Developer Network
]]>
1021 3 https://cdn.slidesharecdn.com/ss_thumbnails/2017febapacheapexendtoendexactlyonce-170217202847-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics /slideshow/february-2017-hug-data-sketches-a-required-toolkit-for-big-data-analytics/72285382 2017febnewdatasketcheshugfeb2017public-1-170217201912
In the analysis of big data there are problematic queries that don’t scale because they require huge compute resources and time to generate exact results. Examples include count distinct, quantiles, most frequent items, joins, matrix computations, and graph analysis. If approximate results are acceptable, there is a class of sub-linear, stochastic streaming algorithms, called "sketches", that can produce results orders of magnitude faster and with mathematically proven error bounds. For interactive queries there may not be other viable alternatives, and in the case of extracting results for these problem queries in real time, sketches are the only known solution. For any analysis system that requires these problematic queries from big data, sketches are a required toolkit that should be tightly integrated into the system's analysis capabilities. This technology has helped Yahoo successfully reduce data processing times from days to hours, or minutes to seconds, on a number of its internal platforms. This talk covers the current state of our open source DataSketches.github.io library, which includes adaptations and example code for Pig, Hive, Spark, and Druid, and gives architectural examples of use and a case study. Speakers: Jon Malkin is a scientist at Yahoo working to extend the DataSketches library. His previous roles have involved large-scale data processing for sponsored search, display advertising, user counting, ad targeting, and cross-device user identity modeling. Alexander Saydakov is a senior software engineer at Yahoo working on the open source DataSketches project. In his previous roles he has been involved in building large-scale back-end data processing systems and frameworks for data analytics and experimentation based on Torque, Hadoop, Pig, Hive, and Druid. Alexander’s educational background is in the field of applied mathematics. ]]>

Fri, 17 Feb 2017 20:19:12 GMT /slideshow/february-2017-hug-data-sketches-a-required-toolkit-for-big-data-analytics/72285382 ydn@slideshare.net(ydn) February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics ydn
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics from Yahoo Developer Network
]]>
1282 4 https://cdn.slidesharecdn.com/ss_thumbnails/2017febnewdatasketcheshugfeb2017public-1-170217201912-thumbnail.jpg?width=120&height=120&fit=bounds presentation Black http://activitystrea.ms/schema/1.0/post http://activitystrea.ms/schema/1.0/posted 1
https://cdn.slidesharecdn.com/profile-photo-ydn-48x48.jpg?cb=1592847992 The Yahoo Developer Network (YDN) is Yahoo!'s central resource for developers and partners. YDN offers developer tools, APIs, web services, and resources that help developers build rich web experiences, integrate key data sources, and drive traffic. developer.yahoo.com https://cdn.slidesharecdn.com/ss_thumbnails/developingmobileappsforperformance-final-191021220446-thumbnail.jpg?width=320&height=320&fit=bounds slideshow/developing-mobile-apps-for-performance-swapnil-patel-verizon-media/184976203 Developing Mobile Apps... https://cdn.slidesharecdn.com/ss_thumbnails/athenz-theopen-sourcesolutiontoprovideaccesscontrolindynamicinfrastructurestatsuyayano-190702033344-thumbnail.jpg?width=320&height=320&fit=bounds slideshow/athenz-the-opensource-solution-to-provide-access-control-in-dynamic-infrastructures-tatsuya-yano-yahoo-japan/153060910 Athenz - The Open-Sour... https://cdn.slidesharecdn.com/ss_thumbnails/athenzspiffe-190702032001-thumbnail.jpg?width=320&height=320&fit=bounds slideshow/athenz-spiffe-tatsuya-yano-yahoo-japan/153058082 Athenz & SPIFFE, Tatsu...