Slideshows by User: ZhijieShen
SlideShare feed for Slideshows by User: ZhijieShen
Sun, 08 Jun 2014 00:30:52 GMT

Hadoop Summit San Jose 2014 - Analyzing Historical Data of Applications on Hadoop YARN: for Fun and Profit
/slideshow/application-historyt-imeline/35612985
Apache Hadoop YARN is the default platform for running distributed apps: batch and interactive apps and long-running services. A YARN cluster may run many apps built on different frameworks and submitted by different users, groups and organizations. It is of significant value to monitor and visualize what has happened to these apps, i.e., the application history, to glean important insights: how their performance changes over time, how queues get utilized, how workload patterns change, etc. It is also useful to ensure that application history remains accessible whether apps have finished or have failed for some reason, such as a master restart, a crash or memory pressure. In this talk, we'll describe how YARN enables storage of all sorts of historical information, both generic and framework-specific, for any kind of app, and how YARN exposes the historical information and provides users with the tools to view it, conduct any analysis, and understand various dimensions of YARN clusters over time. We'll cover a number of technical highlights, such as persisting information into pluggable and reliable storage like HDFS, establishing a history server that users can easily access via command-line tools, web and REST interfaces in a secure manner, and enabling apps to define and publish framework-specific information. Moreover, the talk will also brief developers and administrators on how to make use of the new YARN feature.

Posted Sun, 08 Jun 2014 00:30:52 GMT by Zhijie Shen (ZhijieShen@slideshare.net)
ApacheCon North America 2014 - Apache Hadoop YARN: The Next-generation Distributed Operating System
/slideshow/apachecon14/35612823
For diverse organizations, Apache Hadoop has become the de facto place where data and computational resources are shared. This broad usage has stretched its design beyond its intended target. To address this, the Apache Hadoop community has come up with the next generation of Hadoop's compute platform: YARN. YARN, in a nutshell, is the distributed operating system of the big-data world. In this talk, we will introduce YARN, covering how the new architecture decouples the programming model from resource management, as well as its scheduling functions, the platform's fault tolerance and high availability, and tools for application tracing and analysis. We will then discuss the exciting ecosystem of Apache Software Foundation projects forming around YARN. We will conclude with coverage of the applications and services being built around the YARN platform, which lets users choose the programming model of their choice, all on the same data.

Posted Sun, 08 Jun 2014 00:13:01 GMT by Zhijie Shen (ZhijieShen@slideshare.net)
Zhijie now works on the Data Infra Team at Facebook, where he is responsible for optimizing data infrastructure efficiency. Previously, he was a Member of Technical Staff at Hortonworks, focusing on Hadoop YARN, the resource management system for various Hadoop ecosystem workloads. Besides YARN, Zhijie also worked actively on other open source projects, such as MapReduce, Pig and Samza, and became a committer and PMC member of Hadoop and Samza and a member of the Apache Software Foundation. Prior to Hortonworks, Zhijie was a Software Development Engineer at Microsoft, working on the cloud-based file synchronization service. Zhijie was awarded a PhD degree in Co...
http://www.comp.nus.edu.sg/~z-shen/