SlideShare feed for Slideshows by User: yzhou2110 / Fri, 27 Mar 2015 01:15:45 GMT
Spark meetup v2.0.5 /slideshow/spark-meetup-v205/46348935
In this talk, we'll discuss the technical design for supporting HBase as a native data source for Spark SQL, aimed at both query and load performance and scalability: near-precise execution locality for queries and loading, fine-tuned partition pruning, predicate pushdown, plan execution through coprocessors, and an optimized, fully parallelized bulk loader. Point and range queries on dimensional attributes benefit particularly well from these techniques. Preliminary test results against established SQL-on-HBase technologies will be provided. The speaker will also share future plans and real-world use cases, particularly in the telecom industry.

yzhou2110@slideshare.net (yzhou2110)
Spark meetup v2.0.5 from Yan Zhou
Apache Hadoop Pig committer; experienced in Apache Hadoop including HBase, distributed computing, data warehousing/OLAP, MDX/SQL, SQL query engines, multi-dimensional data modeling, and database techniques. Familiar with the big data field and data-driven analysis in that field.