Dataworks Summit 2018 蠍
SAN JOSE, USA JUNE 18-21
 蠍一螳覦
覦
OVERVIEW
語 蠍一  伎覲企 覦 伎手鍵襯 る螻 
( 螳 語 誤 伎 蠍壱螳 る 豢 覦 )
螳 殊 れ 語
 覿蠍
SUMMIT 襦蟇
語 螳
蠏 覦 覩語 
豐 覦 螳
 KEYNOTE
 YARN 3.x
 GoPro's Streaming Pipelines
 Materialized Views in Apache Hive
 Spark Configuration Tuning
 ING's Docker-based Pipelines
DAY 1
SCHEDULE
 KEYNOTE
 Cincinnati Insurance's Spark ETL
 Streaming Analytics in Apache Metron
 United Airlines' ETL
 HDFS Router-based Federation
DAY 2
SCHEDULE
 Apache Flink
 Geospatial Data Platform at Uber
DAY 3
SCHEDULE
谿語 語
2100+
32 螳蟲 23 譬
# Keynote  覦
谿語 語
讌襷,  觜 壱 
3殊姶 語 覈
谿語 語
2014 豈 螻糾 蟆 2018 豈 螻糾 蟆
# 2014 豈螻糾 讌 豢豌 : Hortonworks 豕譬 伎る (https://www.facebook.com/pudidic)
豈 螻糾 蟆 覲
ろ佒
れ襦 ろ佒 螳 譴
2017 2018
Hortonworks Yahoo!
Microsoft
Hewlett Packard
IBM
ORACLE
DELL EMC
Hortonworks
IBM
IMPETUS
TERADATA
IMPETUS Syncsort ATSCALE
Microsoft Hewlett Packard
Syncsort NetApp
# 蠍一覈 HOST, DIAMOND, PLATINUM ろ佒襷 蠍壱.
豐 37 蠍一 (覈炎) 豐 27 蠍一 (覈炎)
谿語 語
螻朱 るゴ蟆 蟲 谿語螳 襷 覲伎
蠏語譴 譴蟲語 讀螳螳 螳 蟆 讌
2017
覦煙 覦
2018
碁 譴蟲
THINKWARE
NAVER
LG CNS
SK Hynix SAMSUNG Elec
碁
DATAWORKS 2018
SLOGAN
一覈   觜讀 覈語 覲
PROCEDURAL PROCESSING CONNECTED COMMUNITIES
Enterprise Customer Product Supply Customer
Enterprise
Product
Supply
一危郁 讀螳, 蠏語 磯ジ 螳豺 
覓伎伎 覯豺 覃語梗 覯豺 覯譟一れ 覯豺
襷危襦豺 焔レ
2襷 2覦磯 讀螳も
ろ語 螳豺
蠏 伎  螻煙 觜襦.
企殊磯 貉危 螳蟆
3襷 覦 伎も
觜 螳豺
ROB BEARDEN, CEO, Hortonworks
HDP-3 伎 企襴 襦 Cloud GPU襯 讌 蟆企も
ROB THOMAS, General Manager, IBM
DATA Transformation 蠍一れ 螳豺 麹 蟆企
蠏  AI螳 . IBM  AI襯 所 襷壱蟆 磯襦  蟆企も
Algorithms can be bought. Not your data.
PRAVEEN KANKARIYA, CEO, Impetus
伎 一危磯ゼ 企至 豌襴讌 覲企 企至 讌螳 譴
Ideas,
Insights,
Innovation.
Dataworks 2017 Slogan - Transformation through Data
# GoPro Spark Streaming Pipeline
語 螳
Deep Learning 覲企 Spark Streaming, NiFi 煙 襷 瑚
蟲豢 襦襯 覲企 NiFi, Kafka, Spark Streaming 襦 豌襴 襦螳 襷
(Event)
(State)
語 螳
一危磯ゼ 譯手 覦  襦蠏語 襷血 譯朱 JSON 襷血 
Spark DataFrame レ 蠏豪
JSON Support, SQL Transformations, Parquet Support, Hive Support, Kafka Integration
# Read a JSON file into a DataFrame (PySpark; sqlContext and JSON_DATAFILE are defined elsewhere)
df = sqlContext.read.json(JSON_DATAFILE)
df.show()
# +------+----------+
# |action| timestamp|
# +------+----------+
# |create|1452121277|
# |  null|      null|
# +------+----------+

# Register the DataFrame as a temporary view and query it with SQL
df = sqlContext.read.json(JSON_DATAFILE)
df.createOrReplaceTempView("json_view")
df_new = sqlContext.sql("select * from json_view")
df_new.show()
# +------+----------+
# |action| timestamp|
# +------+----------+
# |create|1452121277|
# |  null|      null|
# +------+----------+
語 螳
ING Docker data science pipeline 語
GITLAB  CI-CD 蠍磯レ  Docker 企語襯 /覦壱/ろ  螻殊 る
GITLAB  CI-CD 蠍磯レ ,
 覦壱/ろ誤 覦覯   覲   蟆
 
 Gitlab 牛 覦壱襦 NEW SONA 麹伎 覯 蟯襴
 IX 糾 覈 一危 豺 覦壱
語 螳
Hortonworks Accelerating query processing with materialized views in Apache Hive 語
譟壱 貎朱Μ 磯 企 觀一 貎朱Μ螳 豕  Materialized View  螳
覦覲  貎朱Μ  觜襯 焔レ 蠍磯  
, 豌螳 襷れ   蟆朱 螳
語 螳
data Artisans Why and how to leverage the simplicity and power of SQL on Flink 語
SQL襦 Streaming/Batch 一危磯ゼ 譟壱   Apache Flink  螳
豺 SQL 伎 レ螻
Query 譬襯 磯 Streaming/Batch 蟆郁骸襯 螻
kafka+kibana襯 伎 Query 襦,
 麹螻 企Μ 讌 れ螳朱 豢 襯 覲伎譯殊
語 螳
Uber Geospatial data platform at Uber 語
GPS 譬 一危磯ゼ 觜襯願 Polygon 襷れ広蠍  譬 一危一 豕 螻殊 螳
Uber QuadTree 螻襴讀 覦
轟 譬 ADMCODE襯 觜襯願 谿城    蟆朱 蠍磯
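The slide above names GPS points, polygons, a QuadTree, and ADMCODE lookups, but gives no implementation details. As a rough illustration only, and not Uber's actual implementation, the Python sketch below indexes region polygons by bounding box in a quadtree and resolves a point to a region code with a ray-casting test. The `QuadTree` class, the region codes, and all coordinates are hypothetical and chosen only for this example.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]              # (x, y), e.g. (longitude, latitude)
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)
Polygon = List[Point]


def bbox(polygon: Polygon) -> Box:
    """Axis-aligned bounding box of a polygon."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def overlaps(a: Box, b: Box) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def contains(box: Box, pt: Point) -> bool:
    return box[0] <= pt[0] <= box[2] and box[1] <= pt[1] <= box[3]


def point_in_polygon(pt: Point, polygon: Polygon) -> bool:
    """Ray casting: toggle on each edge crossed by a ray going right from pt."""
    x, y = pt
    inside = False
    for i in range(len(polygon)):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % len(polygon)]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


class QuadTree:
    """Quadtree whose leaves hold the polygons overlapping their cell."""

    def __init__(self, box: Box, depth: int = 0,
                 max_depth: int = 6, capacity: int = 4):
        self.box = box
        self.depth = depth
        self.max_depth = max_depth
        self.capacity = capacity
        self.items: List[Tuple[str, Polygon, Box]] = []
        self.children: List["QuadTree"] = []

    def insert(self, code: str, polygon: Polygon,
               pbox: Optional[Box] = None) -> None:
        if pbox is None:
            pbox = bbox(polygon)
        if not overlaps(self.box, pbox):
            return
        if self.children:
            for child in self.children:
                child.insert(code, polygon, pbox)
            return
        self.items.append((code, polygon, pbox))
        if len(self.items) > self.capacity and self.depth < self.max_depth:
            self._split()

    def _split(self) -> None:
        """Split this leaf into four cells and push its polygons down."""
        x0, y0, x1, y1 = self.box
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        cells = [(x0, y0, mx, my), (mx, y0, x1, my),
                 (x0, my, mx, y1), (mx, my, x1, y1)]
        self.children = [QuadTree(c, self.depth + 1, self.max_depth,
                                  self.capacity) for c in cells]
        items, self.items = self.items, []
        for code, polygon, pbox in items:
            for child in self.children:
                child.insert(code, polygon, pbox)

    def lookup(self, pt: Point) -> Optional[str]:
        """Return the code of a region containing pt, or None."""
        if not contains(self.box, pt):
            return None
        if self.children:
            for child in self.children:
                code = child.lookup(pt)
                if code is not None:
                    return code
            return None
        for code, polygon, pbox in self.items:
            if contains(pbox, pt) and point_in_polygon(pt, polygon):
                return code
        return None


# Hypothetical regions on a 10 x 10 grid; capacity=1 forces a split.
index = QuadTree((0.0, 0.0, 10.0, 10.0), capacity=1)
index.insert("A01", [(0, 0), (4, 0), (4, 4), (0, 4)])
index.insert("B02", [(6, 6), (9, 6), (9, 9), (6, 9)])
index.insert("C03", [(6, 0), (9, 0), (9, 3)])
print(index.lookup((8.0, 1.0)))  # C03
print(index.lookup((7.0, 2.5)))  # None: in C03's bounding box, outside the triangle
```

The quadtree only narrows the search to the few polygons near the point, and the exact point-in-polygon test runs on those candidates alone, so a lookup avoids scanning every region.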
蠏 覦 覩語 蟆
覿 讓曙 Stress Free Zone 伎
蠍 覃覃基矩れ 襷讌覃 襷  
覿 讓曙 Ask Me Anything Lounge襯 伎
 螻褐 覓伎企 蟠蠍 蟇 企覃 企轟螳  る伎朱 レ
CAT!
Ask me about
蟲郁  螻褐 企蟆
豐 覦 螳
譴蟲蟇一QR貊襦蟲蟇誤襦
讌碁螳觜襯願殊企螻
豐 覦 螳
DevOps NoOps
蠍一れ 覯 讌願
豐 覦 螳
 Dataworks 2017 覦覓   
 襦螻 蟆 覲伎
  Dataworks 2018 覦覓   
蠍 磯Μ 觜訣?
磯Μ Data Transformation  炎概  蟆手?
豐 覦 螳
COST REDUCTION MODERNIZATION INSIGHT-DRIVEN TRANSFORMATION
Enterprise Value 2017 2018
6%
43%
45%
6% 5%
41%
42%
12%
The Data Maturity Curve
# 豢豌 : DELL EMC
螻 一危 
螳 覿
襦  谿曙
螳
# 殊企  伎 豢豌 : NounProject (https://thenounproject.com/)
