Introduction to Apache Tajo
Jaehwa Jung / Gruter
About me
 Gruter Corp / BigData Engineer
 Apache Tajo Committer
 jhjung@gruter.com
 Home Page: http://blrunner.com
 Twitter: @blrunner78
 The author of Hadoop book
Contents
1. Introduction to Tajo
2. Key features
3. Tajo vs Spark
4. Benchmark results
5. Roadmap
6. Use cases
1. Introduction to Tajo
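1.1 Tajo?
Hadoop-based big data warehouse system
Entered the Apache Incubator in 2013; became an Apache top-level project in 2014
ANSI SQL support
Key characteristics
- Its own high-performance distributed processing engine (not MapReduce)
- Applies a range of query optimization techniques and algorithms
- Supports ETL queries that run for several hours or longer
- Supports interactive queries that finish within a few hundred milliseconds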
1.2 Tajo system architecture
[Figure: architecture diagram]
- Client (JDBC, TSql, Web UI): submits a query to the Master Server
- Master Server (HA): TajoMaster instances; manages metadata through the CatalogStore (DBMS or HiveMetastore) and allocates each query
- Slave Server (three shown): each runs a TajoWorker with a QueryMaster, a Local Query Engine, and a StorageManager over HDFS or the local file system
- Arrow labels: Submit a Query, Manage metadata, Allocate a query, Send task & monitor
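1.3 Tajo's comparative advantages
ANSI SQL support
- Minimizes the learning curve and eases migration from existing systems
- For non-standard SQL, follows Oracle and PostgreSQL
Cluster scalability
- Scales to thousands of nodes
High-performance distributed processing
- Scan throughput: 100 MB/sec per physical disk (SATA)
- Processing 1 TB on 10 nodes:
  simple aggregation query: 30 seconds to 1 minute
  simple join query: 1 to 2 minutes
  complex join and distinct aggregation: several minutes to 10 minutes
2. Key features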
2.1 SQL compatibility
Runs queries on its own distributed processing engine
Compatible with existing data types and file formats
Supported distributed operations
- Inner join, and left/right/full outer join
- Group by, sort, multiple distinct aggregation
- Window functions
SQL data types
- CHAR, BOOL, INT, DOUBLE, TEXT, DATE, etc.
Various file formats
- Text file (CSV), SequenceFile, RCFile, ORC, Parquet, Avro
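2.2 Query optimization
Cost-based join optimization (greedy heuristic)
- Spares users the effort of guessing the best join order
Extensible rewrite rules
- Provides a rewrite rule interface and a set of utilities
Progressive query optimization
- Collects statistics at execution time
- For distributed sorts, adjusts range-partitioning boundaries and partition counts at runtime
- For distributed joins and group-bys, adjusts the number of partitions at runtime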
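To make the greedy heuristic on this slide concrete, here is a minimal sketch of one way such a join-ordering pass could work. It is not Tajo's optimizer code: the class name, the row counts, the single uniform selectivity, and the cost model (output rows = left rows x right rows x selectivity) are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy greedy join-ordering heuristic. Illustrative only; this is not Tajo's optimizer code.
final class GreedyJoinOrderSketch {

    // Hypothetical cost model: estimated output rows of joining two inputs.
    static double estimateJoinRows(double leftRows, double rightRows, double selectivity) {
        return leftRows * rightRows * selectivity;
    }

    // Returns table names in the order the greedy heuristic would join them.
    static List<String> order(Map<String, Double> tableRows, double selectivity) {
        Map<String, Double> remaining = new LinkedHashMap<>(tableRows);
        List<String> plan = new ArrayList<>();
        if (remaining.isEmpty()) {
            return plan;
        }

        // Step 1: start from the smallest table.
        String smallest = null;
        for (Map.Entry<String, Double> e : remaining.entrySet()) {
            if (smallest == null || e.getValue() < remaining.get(smallest)) {
                smallest = e.getKey();
            }
        }
        double currentRows = remaining.remove(smallest);
        plan.add(smallest);

        // Step 2: repeatedly add the table whose join gives the smallest estimated intermediate result.
        while (!remaining.isEmpty()) {
            String best = null;
            double bestRows = Double.POSITIVE_INFINITY;
            for (Map.Entry<String, Double> e : remaining.entrySet()) {
                double rows = estimateJoinRows(currentRows, e.getValue(), selectivity);
                if (best == null || rows < bestRows) {
                    best = e.getKey();
                    bestRows = rows;
                }
            }
            remaining.remove(best);
            plan.add(best);
            currentRows = bestRows;
        }
        return plan;
    }

    public static void main(String[] args) {
        // Made-up row counts, listed in the order a user might write the FROM clause.
        Map<String, Double> rows = new LinkedHashMap<>();
        rows.put("lineitem", 6_000_000.0);
        rows.put("orders", 1_500_000.0);
        rows.put("customer", 150_000.0);
        rows.put("nation", 25.0);
        System.out.println(order(rows, 0.001));
    }
}
```

With one uniform selectivity the heuristic reduces to joining tables from smallest to largest; a real optimizer would use per-predicate statistics, which is where the runtime statistics mentioned under progressive optimization would come in.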
2.3 Partitioned tables
Creating a partitioned table
CREATE TABLE student (
  id INT,
  name TEXT,
  grade TEXT
) USING PARQUET
PARTITION BY COLUMN (country TEXT, city TEXT);
Directory structure
/tajo/warehouse/student/country=KOREA/city=SEOUL/
/tajo/warehouse/student/country=KOREA/city=PUSAN/
/tajo/warehouse/student/country=KOREA/city=INCHEON/
/tajo/warehouse/student/country=USA/city=NEWYORK/
/tajo/warehouse/student/country=USA/city=BOSTON/
. . .
Supports Hive-compatible column-value table partitioning
Range partitioning support is planned
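The directory layout above follows a simple rule: one `column=value` path segment per partition column, in declaration order. The sketch below shows that rule in plain Java. The class and method names are hypothetical, and it only builds strings; it does not call any Tajo or HDFS API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper: derives a column-value partition directory from partition column values.
final class PartitionPathSketch {

    // Appends one "<column>=<value>" segment per partition column, in insertion order.
    static String partitionPath(String tableRoot, Map<String, String> partitionValues) {
        StringBuilder path = new StringBuilder(tableRoot);
        for (Map.Entry<String, String> e : partitionValues.entrySet()) {
            path.append('/').append(e.getKey()).append('=').append(e.getValue());
        }
        return path.append('/').toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the column order from PARTITION BY COLUMN (country, city).
        Map<String, String> values = new LinkedHashMap<>();
        values.put("country", "KOREA");
        values.put("city", "SEOUL");
        System.out.println(partitionPath("/tajo/warehouse/student", values));
    }
}
```

Because the partition values are encoded in the path, a query that filters on `country` or `city` only has to read the matching directories.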
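2.4 Query federation and tablespace support
Supports join and union queries across different data sources
Benefits
- Data migration: between RDBMS and Hadoop
- Join queries between existing RDBMS data and Hadoop data
- SQL over NoSQL and other storage (S3, Swift, HBase, ElasticSearch, Kafka)
- A standardized interface through SQL tools
[Figure: Tajo on top of HDFS, NoSQL, S3, and Swift]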
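2.5 Nested and JSON format support
Nested and JSON format files can be queried with SQL as-is, without separate preprocessing
[Figure: example with three panels: data, table definition, SQL statement]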
2.6 UDF support
> SELECT pow(col1, col2), col3, col4, ... FROM orders, ...
Code example 1
@Override
public Datum eval(Tuple params) {
  // Read the two arguments of pow(x, y)
  Datum value1Datum = params.get(0);
  Datum value2Datum = params.get(1);
  // SQL NULL in either argument yields NULL
  if (value1Datum instanceof NullDatum || value2Datum instanceof NullDatum) {
    return NullDatum.get();
  }
  return DatumFactory.createFloat8(
      Math.pow(value1Datum.asFloat8(), value2Datum.asFloat8()));
}
Code example 2
@ScalarFunction(name = "pow", returnType = FLOAT8, paramTypes = {FLOAT8, FLOAT8})
public static double pow(double x, double y) {
  return Math.pow(x, y);
}
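The first UDF style on this slide handles SQL NULL explicitly before computing the result. The following plain-Java stand-in shows that null-propagation rule on its own, without Tajo on the classpath. It is an assumption-laden sketch: `Double` and `null` stand in for Tajo's `Datum` and `NullDatum`, and the class name is hypothetical.

```java
// Plain-Java stand-in for the null handling in the slide's first UDF example.
// Double/null replace Tajo's Datum/NullDatum; this does not use any Tajo class.
final class NullSafePowSketch {

    // Returns null when either argument is null (SQL NULL semantics), otherwise x raised to y.
    static Double pow(Double x, Double y) {
        if (x == null || y == null) {
            return null;
        }
        return Math.pow(x, y);
    }

    public static void main(String[] args) {
        System.out.println(pow(2.0, 10.0));
        System.out.println(pow(null, 10.0));
    }
}
```

The second, annotation-based style on the slide takes primitive `double` parameters, so it has no place for a null check in the function body.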
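3. Tajo vs Spark
3.1 Basic feature comparison
Item | Tajo | Spark
Purpose | Data warehouse | Cluster data processing
Resource management | Built-in | Built-in or YARN
Storage | HDFS, S3, HBase, Swift | HDFS, S3, HBase
File formats | CSV, RC, ORC, Parquet, Avro | CSV, RC, ORC, Parquet, Avro
Implementation language | Java | Scala
Computing characteristics | Source data on disk; intermediate data uses both memory and disk | Data under analysis is loaded in memory
Query types | Long-running batch, interactive | Interactive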
3.2 SQL support comparison: Tajo vs Spark SQL
Item | Tajo | Spark SQL
Query standard | ANSI SQL | HiveQL: provided by HiveContext; ANSI SQL: provided by SQLContext, [remainder illegible]
SELECT queries | O | O
INSERT INTO and CREATE TABLE AS SELECT queries | O | O
Multiple distinct columns | O | X
Command-line interface | TSql: supports both local and cluster modes; no separate daemon required | Local mode: Spark SQL CLI; cluster mode: Thrift JDBC server + Beeline
Database connectivity | JDBC | JDBC, ODBC
4. Benchmark results
4.1 Optimized use of cluster resources
Overcomes the memory limits of in-memory technologies and handles large data volumes
1. EC2 instance: c3.4xlarge (vCPU: 16, memory: 30 GiB, SSD storage: 160 GB x 2)
2. Tajo setup: 0.9.1-SNAPSHOT, 1 master, 16 workers
3. Dataset: TPC-H 1 TB
AWS benchmark results
[Figure: bar chart of execution time in seconds (0 to 14,000) for TPC-H Q1 to Q22; series: hive, presto, spark, tajo]
Tajo is on average 4x faster than Hive and 1.5x faster than Presto.
Spark could not be tested because of memory constraints.
4.2 Performance scaling
Performance improves as data nodes are added
1. EC2 instance: c3.4xlarge (vCPU: 16, memory: 30 GiB, SSD storage: 160 GB x 2)
2. Tajo version: 0.9.1-SNAPSHOT
3. Dataset: TPC-H 1 TB
AWS benchmark results
[Figure: bar chart of execution time in seconds (0 to 6,000) for TPC-H Q1 to Q22; series: 16 workers, 8 workers, 4 workers]
Going from 4 to 8 workers: more than 1.6x faster
Going from 4 to 16 workers: more than 2.4x faster
In production use, clusters have scaled to around 500 nodes
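One way to read the two speedup figures on this slide is as parallel efficiency: observed speedup divided by the ideal speedup, where the ideal is simply the ratio of worker counts. The sketch below does that arithmetic. The class and method names are hypothetical, and the inputs are the slide's lower bounds (1.6x and 2.4x), so the results are lower bounds too.

```java
// Arithmetic sketch: parallel efficiency from the slide's reported speedups.
final class ScalingEfficiencySketch {

    // Efficiency = observed speedup / ideal speedup, with ideal = scaledWorkers / baseWorkers.
    static double efficiency(int baseWorkers, int scaledWorkers, double observedSpeedup) {
        double idealSpeedup = (double) scaledWorkers / baseWorkers;
        return observedSpeedup / idealSpeedup;
    }

    public static void main(String[] args) {
        // Speedups taken from the slide: 4 -> 8 workers (1.6x) and 4 -> 16 workers (2.4x).
        System.out.printf("4 -> 8 workers:  %.2f%n", efficiency(4, 8, 1.6));
        System.out.printf("4 -> 16 workers: %.2f%n", efficiency(4, 16, 2.4));
    }
}
```

Doubling the workers gives an efficiency of 1.6 / 2 = 0.8 and quadrupling gives 2.4 / 4 = 0.6, so by these figures the gain per added worker shrinks as the cluster grows, which is common for fixed-size workloads.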
5. Roadmap
5.1 Version 0.11.0
Release: September 2015
Key features
- Multi-tenancy scheduler
- Tablespaces
- Nested format support
- IN subquery support
- ALTER TABLE ADD/DROP PARTITION support
- ORC format support
- Python UDF/UDAF support
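5.2 Roadmap after 0.11.0
YARN support
Authentication support
Scalar and EXISTS subquery support
Stored procedure support based on Java scripting
Fault tolerance
Hive UDF compatibility
Map and Array type support
Schema-less data format support
WITH clause support
Python and C++ client support
6. Use cases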
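6.1 Replacing a commercial DW
SK Telecom replaced its existing commercial DW with Tajo and uses it as both the DW and the data mart
ETL processing: 120+ queries, ~4 TB read/day
OLAP processing: 500+ queries
[Figure: data warehouse overview; labels: Operational Systems (Marketing, Sales, ERP, SCM), Integration Layer (ODS, Staging Area), Data Warehouse (Data Vault), Data Mart (Strategic Marts)]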
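6.2 Startup cohort analysis case
Locket: developer of a smartphone lock-screen rewards advertising service
10 EC2 c3.2xlarge instances process GB-scale logs in roughly 40 seconds
Total hourly cost: $0.420 * 10 = $4.20 (KRW 4,898.5)
[Figure: Tajo cluster on Amazon EC2 (one TajoMaster, four TajoWorkers) with source data and Tajo tables in S3, and an RDS MySQL instance]
1. Run the batch job
2. Create external tables over the log files
3. Run the cohort analysis query
4. Save the query results
5. Load the statistics data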
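6.3 Desktop analysis support
No ETL step is needed, and Tajo's advantage grows with data size
- A single machine can run multiple workers for distributed processing
- MySQL needs time to load the data (ETL)
Query execution time comparison (TPC-H Q1), time in seconds by data size
TPC-H data size | Mysql-5.5 | Tajo-0.10
10m | 0.3 | 0.4
100m | 1.5 | 1.2
1g | 16.5 | 10.4
10g | 179.3 | 101.5
TPC-H benchmark environment
OS: CentOS release 5.7 (x86_64) / CPU: 3.40 GHz, 8 cores / Mem: 32 GB / Disk: SSD x 2
Data can be analyzed in real time
The larger the data, the greater Tajo's advantage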
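The claim that Tajo pulls ahead as data grows can be checked against the chart's own numbers for TPC-H Q1. The sketch below computes the MySQL-to-Tajo time ratio at each data size. It assumes the first data series in the chart is Mysql-5.5 and the second is Tajo-0.10, following the legend order; the class and method names are hypothetical.

```java
// Ratio of MySQL time to Tajo time for TPC-H Q1, using the values shown in the chart.
// Assumption: the first series is Mysql-5.5 and the second is Tajo-0.10 (legend order).
final class Q1SpeedupSketch {

    static final String[] DATA_SIZES = {"10m", "100m", "1g", "10g"};
    static final double[] MYSQL_SECONDS = {0.3, 1.5, 16.5, 179.3};
    static final double[] TAJO_SECONDS = {0.4, 1.2, 10.4, 101.5};

    // A ratio above 1 means Tajo was faster at that data size; below 1 means MySQL was faster.
    static double speedup(int index) {
        return MYSQL_SECONDS[index] / TAJO_SECONDS[index];
    }

    public static void main(String[] args) {
        for (int i = 0; i < DATA_SIZES.length; i++) {
            System.out.printf("%-5s MySQL/Tajo = %.2f%n", DATA_SIZES[i], speedup(i));
        }
    }
}
```

Under that reading, MySQL is slightly faster at the smallest size and the ratio rises steadily with data size, which matches the slide's conclusion.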
6.4 Results delivered through a range of analysis tools
JDBC-based commercial OLAP tools: Birst, Spotfire
JDBC-based workbench tools: DbVisualizer, SQL Workbench/J, Flamingo
Data science tool: Zeppelin
DB tool: Tadpole DB Hub
Welcome to Tajo
1. Homepage: http://tajo.apache.org
2. Korean Tajo user groups
- Google Group: https://groups.google.com/forum/#!forum/tajo-user-kr
- Facebook: https://www.facebook.com/groups/tajokorea/
3. Tajo Korean documentation: http://bit.ly/1Ir417T
4. Other reference sites
- http://www.gruter.com/blog/tag/apache-tajo
- http://teamblog.gruter.com/tag/apache-tajo
- http://blrunner.com/category/Development/Tajo
Q&A
GRUTER: YOUR PARTNER
IN THE BIG DATA REVOLUTION
Phone +82-2-508-5911
Fax +82-2-508-5912
E-mail contact@gruter.com
Web www.gruter.com
Editor's Notes

  • #18: Join 豌襴 Join 豕 讌, れ 覦 Join 讌 Join 豕 覩語, Hash Join 讌