Slideshows by User: yzhou2110

Spark meetup v2.0.5 (Fri, 27 Mar 2015 01:15:45 GMT) /slideshow/spark-meetup-v205/46348935
In this talk, we’ll discuss the technical design for supporting HBase as a “native” data source for Spark SQL, aimed at both query and load performance and scalability: near-precise execution locality for queries and loading, fine-tuned partition pruning, predicate pushdown, plan execution through coprocessors, and an optimized, fully parallelized bulk loader. Point and range queries on dimensional attributes benefit particularly from these techniques. Preliminary test results against established SQL-on-HBase technologies will be provided. The speaker will also share future plans and real-world use cases, particularly in the telecom industry.

Spark meetup v2.0.5 from Yan Zhou


Apache Hadoop Pig committer with experience in Apache Hadoop (including HBase), distributed computing, data warehousing/OLAP, MDX/SQL, SQL query engines, multi-dimensional data modeling, and database techniques. Familiar with the big data field and data-driven analysis in that field.