際際滷

際際滷Share a Scribd company logo
Hive: Data Warehousing for
          Hadoop

           Ben Lever
           @bmlever

    Big Data Analytics Meetup
         27 March 2012
Another Data Warehousing System?
? Problem:
  C Lots of data
? Partial solution:
  C Hadoop
? Another problem:
  C MapReduce can be hard
  C Schema information embedded in program C a lot
    of data is still structured
Solution: Hive
? A system for querying and managing
  structured data within Hadoop
  C MapReduce for execution
  C HDFS for storage
? Designed for end-users that know more SQL
  than Java
? Apache v2
? hive.apache.org
Working example: MovieLens
? Movie ratings
? 3 ^tables ̄:
      Users             Movies         Ratings
          id                id           user id
         age               title        movie id
       gender          release date   rating (1 C 5)
     occupation           action       timestamp
      zip code          adventure
                         romance
                            ...



                  www.grouplens.org
Demo
So far
? Hive shell
? Creating and loading tables
? Data model:
  C INT, BIGINT, TINYINT, STRING, etc
  C Also: FLOAT, DOUBLE, ARRAY, MAP, STRUCT
? Simple queries with filtering
? Table data is immutable
? Schema on readvsschema on write
Hive components
                      TABLE customer (
                      customer_id    BIGINT,
        Metastore       gender         STRING,
                        ...


 schema info
                     launch                 MapReduce
          Driver    MapReduc
                      e job

  Hive query
                                                 HDFS
   (SQL-like)
                               raw source data
                                (compressed)
SELECT *
FROM customers           CLI
WHERE gender = `M¨;
Metastore




            Hadoop C The Definitive Guide
Other SQL-like features

? Aggregation C COUNT, AVG
? JOIN
? GROUP BY
? SORT BY
? Sub queries
Demo
Built in functions
? Text mining:
  C ngrams()
  C context_ngrams()
  C sentences()
? Statistics + mathematics:
  C stddev()
  C histogram_numeric()
  C log
  C radians
User Defined Functions
? Written in Java
? User Defined Functions (UDFs):
  C Single row ? Single row
  C e.g. mathematical and string functions
? User Defined Aggregate Functions (UDAFs):
  C Multiple rows ? Single row
  C e.g. AVG
? User Defined Table Functions (UDTFs):
  C Single row ? Multiple rows
  C e.g. ^explode ̄
Hive Clients




               Hadoop C The Definitive Guide
Hive Server


JDBC         ODBC
Sqoop
  Move data between Hadoop
   and relational databases


RDBMS         Sqoop            Hadoop
                                                    Hive

                               Metastore
                  schema



  http://incubator.apache.org/projects/sqoop.html
Sqoop adapters
Conclusion
? Scales to handle much more data than traditional
  systems:
  C Leverages Hadoop HDFS and MapReduce
  C Relational/structured data
  C Schema on read vs schema on write
? Supports rapid iteration of ad-hoc queries
  C SQL-like querying language
  C Complex queries (joins, etc) with minimal code
? Is not a database replacement:
  C Treats data as immutable
  C No indexing

More Related Content

What's hot (20)

Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data AnalysisApache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Trieu Nguyen
?
Bigdata Nedir? Hadoop Nedir? MapReduce Nedir? Big Data.
Bigdata Nedir? Hadoop Nedir? MapReduce Nedir? Big Data.Bigdata Nedir? Hadoop Nedir? MapReduce Nedir? Big Data.
Bigdata Nedir? Hadoop Nedir? MapReduce Nedir? Big Data.
Zekeriya Besiroglu
?
Big Data and Hadoop Ecosystem
Big Data and Hadoop EcosystemBig Data and Hadoop Ecosystem
Big Data and Hadoop Ecosystem
Rajkumar Singh
?
Big Data Fundamentals in the Emerging New Data World
Big Data Fundamentals in the Emerging New Data WorldBig Data Fundamentals in the Emerging New Data World
Big Data Fundamentals in the Emerging New Data World
Jongwook Woo
?
Apache Hadoop at 10
Apache Hadoop at 10Apache Hadoop at 10
Apache Hadoop at 10
Cloudera, Inc.
?
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
菅c 川
?
The Evolution of the Hadoop Ecosystem
The Evolution of the Hadoop EcosystemThe Evolution of the Hadoop Ecosystem
The Evolution of the Hadoop Ecosystem
Cloudera, Inc.
?
Hive and data analysis using pandas
 Hive  and  data analysis  using pandas Hive  and  data analysis  using pandas
Hive and data analysis using pandas
Purna Chander K
?
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
MaharajothiP
?
Intro to Apache Hadoop
Intro to Apache HadoopIntro to Apache Hadoop
Intro to Apache Hadoop
Sufi Nawaz
?
Sep 2012 HUG: Apache Drill for Interactive Analysis
Sep 2012 HUG: Apache Drill for Interactive Analysis Sep 2012 HUG: Apache Drill for Interactive Analysis
Sep 2012 HUG: Apache Drill for Interactive Analysis
Yahoo Developer Network
?
Hadoop ecosystem J.AYEESHA PARVEEN II-M.SC.,COMPUTER SCIENCE, BON SECOURS CO...
Hadoop ecosystem  J.AYEESHA PARVEEN II-M.SC.,COMPUTER SCIENCE, BON SECOURS CO...Hadoop ecosystem  J.AYEESHA PARVEEN II-M.SC.,COMPUTER SCIENCE, BON SECOURS CO...
Hadoop ecosystem J.AYEESHA PARVEEN II-M.SC.,COMPUTER SCIENCE, BON SECOURS CO...
AyeeshaParveen
?
Big data and Hadoop
Big data and HadoopBig data and Hadoop
Big data and Hadoop
Rahul Agarwal
?
Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science Bon Secours...
Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science  Bon Secours...Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science  Bon Secours...
Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science Bon Secours...
AyeeshaParveen
?
BIG DATA: Apache Hadoop
BIG DATA: Apache HadoopBIG DATA: Apache Hadoop
BIG DATA: Apache Hadoop
Oleksiy Krotov
?
Hadoop hbase introduction
Hadoop hbase introductionHadoop hbase introduction
Hadoop hbase introduction
Jakub Stransky
?
WaterlooHiveTalk
WaterlooHiveTalkWaterlooHiveTalk
WaterlooHiveTalk
nzhang
?
Nextag talk
Nextag talkNextag talk
Nextag talk
Joydeep Sen Sarma
?
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY ppt
sravya raju
?
Introduction to the Hadoop Ecosystem (FrOSCon Edition)
Introduction to the Hadoop Ecosystem (FrOSCon Edition)Introduction to the Hadoop Ecosystem (FrOSCon Edition)
Introduction to the Hadoop Ecosystem (FrOSCon Edition)
Uwe Printz
?
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data AnalysisApache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Trieu Nguyen
?
Bigdata Nedir? Hadoop Nedir? MapReduce Nedir? Big Data.
Bigdata Nedir? Hadoop Nedir? MapReduce Nedir? Big Data.Bigdata Nedir? Hadoop Nedir? MapReduce Nedir? Big Data.
Bigdata Nedir? Hadoop Nedir? MapReduce Nedir? Big Data.
Zekeriya Besiroglu
?
Big Data and Hadoop Ecosystem
Big Data and Hadoop EcosystemBig Data and Hadoop Ecosystem
Big Data and Hadoop Ecosystem
Rajkumar Singh
?
Big Data Fundamentals in the Emerging New Data World
Big Data Fundamentals in the Emerging New Data WorldBig Data Fundamentals in the Emerging New Data World
Big Data Fundamentals in the Emerging New Data World
Jongwook Woo
?
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
菅c 川
?
The Evolution of the Hadoop Ecosystem
The Evolution of the Hadoop EcosystemThe Evolution of the Hadoop Ecosystem
The Evolution of the Hadoop Ecosystem
Cloudera, Inc.
?
Hive and data analysis using pandas
 Hive  and  data analysis  using pandas Hive  and  data analysis  using pandas
Hive and data analysis using pandas
Purna Chander K
?
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
MaharajothiP
?
Intro to Apache Hadoop
Intro to Apache HadoopIntro to Apache Hadoop
Intro to Apache Hadoop
Sufi Nawaz
?
Sep 2012 HUG: Apache Drill for Interactive Analysis
Sep 2012 HUG: Apache Drill for Interactive Analysis Sep 2012 HUG: Apache Drill for Interactive Analysis
Sep 2012 HUG: Apache Drill for Interactive Analysis
Yahoo Developer Network
?
Hadoop ecosystem J.AYEESHA PARVEEN II-M.SC.,COMPUTER SCIENCE, BON SECOURS CO...
Hadoop ecosystem  J.AYEESHA PARVEEN II-M.SC.,COMPUTER SCIENCE, BON SECOURS CO...Hadoop ecosystem  J.AYEESHA PARVEEN II-M.SC.,COMPUTER SCIENCE, BON SECOURS CO...
Hadoop ecosystem J.AYEESHA PARVEEN II-M.SC.,COMPUTER SCIENCE, BON SECOURS CO...
AyeeshaParveen
?
Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science Bon Secours...
Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science  Bon Secours...Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science  Bon Secours...
Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science Bon Secours...
AyeeshaParveen
?
Hadoop hbase introduction
Hadoop hbase introductionHadoop hbase introduction
Hadoop hbase introduction
Jakub Stransky
?
WaterlooHiveTalk
WaterlooHiveTalkWaterlooHiveTalk
WaterlooHiveTalk
nzhang
?
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY ppt
sravya raju
?
Introduction to the Hadoop Ecosystem (FrOSCon Edition)
Introduction to the Hadoop Ecosystem (FrOSCon Edition)Introduction to the Hadoop Ecosystem (FrOSCon Edition)
Introduction to the Hadoop Ecosystem (FrOSCon Edition)
Uwe Printz
?

Similar to Hive: Data Warehousing for Hadoop (20)

Apache Drill
Apache DrillApache Drill
Apache Drill
Ted Dunning
?
Apache Hadoop 1.1
Apache Hadoop 1.1Apache Hadoop 1.1
Apache Hadoop 1.1
Sperasoft
?
Microsoft's Hadoop Story
Microsoft's Hadoop StoryMicrosoft's Hadoop Story
Microsoft's Hadoop Story
Michael Rys
?
Hadoop on Azure, Blue elephants
Hadoop on Azure,  Blue elephantsHadoop on Azure,  Blue elephants
Hadoop on Azure, Blue elephants
Ovidiu Dimulescu
?
An introduction to apache drill presentation
An introduction to apache drill presentationAn introduction to apache drill presentation
An introduction to apache drill presentation
MapR Technologies
?
Real time hadoop + mapreduce intro
Real time hadoop + mapreduce introReal time hadoop + mapreduce intro
Real time hadoop + mapreduce intro
Geoff Hendrey
?
Drill njhug -19 feb2013
Drill njhug -19 feb2013Drill njhug -19 feb2013
Drill njhug -19 feb2013
MapR Technologies
?
Big data Hadoop
Big data  Hadoop   Big data  Hadoop
Big data Hadoop
Ayyappan Paramesh
?
Modern Big Data Analytics Tools: An Overview
Modern Big Data Analytics Tools: An OverviewModern Big Data Analytics Tools: An Overview
Modern Big Data Analytics Tools: An Overview
Great Wide Open
?
Etu L2 Training - Hadoop 二I喘恬
Etu L2 Training - Hadoop 二I喘恬Etu L2 Training - Hadoop 二I喘恬
Etu L2 Training - Hadoop 二I喘恬
James Chen
?
Big Data Analytics with Hadoop, MongoDB and SQL Server
Big Data Analytics with Hadoop, MongoDB and SQL ServerBig Data Analytics with Hadoop, MongoDB and SQL Server
Big Data Analytics with Hadoop, MongoDB and SQL Server
Mark Kromer
?
Big Data and Hadoop Training in Chandigarh
Big Data and Hadoop Training in ChandigarhBig Data and Hadoop Training in Chandigarh
Big Data and Hadoop Training in Chandigarh
Big Boxx Animation Academy
?
Apache Drill at ApacheCon2014
Apache Drill at ApacheCon2014Apache Drill at ApacheCon2014
Apache Drill at ApacheCon2014
Neeraja Rentachintala
?
2016-07-21-Godil-presentation.pptx
2016-07-21-Godil-presentation.pptx2016-07-21-Godil-presentation.pptx
2016-07-21-Godil-presentation.pptx
D21CE161GOSWAMIPARTH
?
Big Data Developers Moscow Meetup 1 - sql on hadoop
Big Data Developers Moscow Meetup 1  - sql on hadoopBig Data Developers Moscow Meetup 1  - sql on hadoop
Big Data Developers Moscow Meetup 1 - sql on hadoop
bddmoscow
?
Big Data Processing
Big Data ProcessingBig Data Processing
Big Data Processing
Michael Ming Lei
?
SQL on Hadoop for the Oracle Professional
SQL on Hadoop for the Oracle ProfessionalSQL on Hadoop for the Oracle Professional
SQL on Hadoop for the Oracle Professional
Michael Rainey
?
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming
Djamel Zouaoui
?
Big data hadoop ecosystem and nosql
Big data hadoop ecosystem and nosqlBig data hadoop ecosystem and nosql
Big data hadoop ecosystem and nosql
Khanderao Kand
?
02 data warehouse applications with hive
02 data warehouse applications with hive02 data warehouse applications with hive
02 data warehouse applications with hive
Subhas Kumar Ghosh
?
Apache Hadoop 1.1
Apache Hadoop 1.1Apache Hadoop 1.1
Apache Hadoop 1.1
Sperasoft
?
Microsoft's Hadoop Story
Microsoft's Hadoop StoryMicrosoft's Hadoop Story
Microsoft's Hadoop Story
Michael Rys
?
Hadoop on Azure, Blue elephants
Hadoop on Azure,  Blue elephantsHadoop on Azure,  Blue elephants
Hadoop on Azure, Blue elephants
Ovidiu Dimulescu
?
An introduction to apache drill presentation
An introduction to apache drill presentationAn introduction to apache drill presentation
An introduction to apache drill presentation
MapR Technologies
?
Real time hadoop + mapreduce intro
Real time hadoop + mapreduce introReal time hadoop + mapreduce intro
Real time hadoop + mapreduce intro
Geoff Hendrey
?
Modern Big Data Analytics Tools: An Overview
Modern Big Data Analytics Tools: An OverviewModern Big Data Analytics Tools: An Overview
Modern Big Data Analytics Tools: An Overview
Great Wide Open
?
Etu L2 Training - Hadoop 二I喘恬
Etu L2 Training - Hadoop 二I喘恬Etu L2 Training - Hadoop 二I喘恬
Etu L2 Training - Hadoop 二I喘恬
James Chen
?
Big Data Analytics with Hadoop, MongoDB and SQL Server
Big Data Analytics with Hadoop, MongoDB and SQL ServerBig Data Analytics with Hadoop, MongoDB and SQL Server
Big Data Analytics with Hadoop, MongoDB and SQL Server
Mark Kromer
?
Big Data Developers Moscow Meetup 1 - sql on hadoop
Big Data Developers Moscow Meetup 1  - sql on hadoopBig Data Developers Moscow Meetup 1  - sql on hadoop
Big Data Developers Moscow Meetup 1 - sql on hadoop
bddmoscow
?
SQL on Hadoop for the Oracle Professional
SQL on Hadoop for the Oracle ProfessionalSQL on Hadoop for the Oracle Professional
SQL on Hadoop for the Oracle Professional
Michael Rainey
?
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming
Djamel Zouaoui
?
Big data hadoop ecosystem and nosql
Big data hadoop ecosystem and nosqlBig data hadoop ecosystem and nosql
Big data hadoop ecosystem and nosql
Khanderao Kand
?
02 data warehouse applications with hive
02 data warehouse applications with hive02 data warehouse applications with hive
02 data warehouse applications with hive
Subhas Kumar Ghosh
?

Recently uploaded (20)

Future-Proof Your Career with AI Options
Future-Proof Your  Career with AI OptionsFuture-Proof Your  Career with AI Options
Future-Proof Your Career with AI Options
DianaGray10
?
[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps
[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps
[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps
Safe Software
?
BoxLang JVM Language : The Future is Dynamic
BoxLang JVM Language : The Future is DynamicBoxLang JVM Language : The Future is Dynamic
BoxLang JVM Language : The Future is Dynamic
Ortus Solutions, Corp
?
SMART SENTRY CYBER THREAT INTELLIGENCE IN IIOT
SMART SENTRY CYBER THREAT INTELLIGENCE IN IIOTSMART SENTRY CYBER THREAT INTELLIGENCE IN IIOT
SMART SENTRY CYBER THREAT INTELLIGENCE IN IIOT
TanmaiArni
?
1.1. Evolution-and-Scope-of-Business-Analytics.pptx
1.1. Evolution-and-Scope-of-Business-Analytics.pptx1.1. Evolution-and-Scope-of-Business-Analytics.pptx
1.1. Evolution-and-Scope-of-Business-Analytics.pptx
Jitendra Tomar
?
Revolutionizing-Government-Communication-The-OSWAN-Success-Story
Revolutionizing-Government-Communication-The-OSWAN-Success-StoryRevolutionizing-Government-Communication-The-OSWAN-Success-Story
Revolutionizing-Government-Communication-The-OSWAN-Success-Story
ssuser52ad5e
?
A Framework for Model-Driven Digital Twin Engineering
A Framework for Model-Driven Digital Twin EngineeringA Framework for Model-Driven Digital Twin Engineering
A Framework for Model-Driven Digital Twin Engineering
Daniel Lehner
?
What Makes "Deep Research"? A Dive into AI Agents
What Makes "Deep Research"? A Dive into AI AgentsWhat Makes "Deep Research"? A Dive into AI Agents
What Makes "Deep Research"? A Dive into AI Agents
Zilliz
?
Build with AI on Google Cloud Session #4
Build with AI on Google Cloud Session #4Build with AI on Google Cloud Session #4
Build with AI on Google Cloud Session #4
Margaret Maynard-Reid
?
Early Adopter's Guide to AI Moderation (Preview)
Early Adopter's Guide to AI Moderation (Preview)Early Adopter's Guide to AI Moderation (Preview)
Early Adopter's Guide to AI Moderation (Preview)
nick896721
?
Wondershare Dr.Fone Crack Free Download 2025
Wondershare Dr.Fone Crack Free Download 2025Wondershare Dr.Fone Crack Free Download 2025
Wondershare Dr.Fone Crack Free Download 2025
maharajput103
?
Understanding Traditional AI with Custom Vision & MuleSoft.pptx
Understanding Traditional AI with Custom Vision & MuleSoft.pptxUnderstanding Traditional AI with Custom Vision & MuleSoft.pptx
Understanding Traditional AI with Custom Vision & MuleSoft.pptx
shyamraj55
?
Cloud of everything Tech of the 21 century in Aviation
Cloud of everything Tech of the 21 century in AviationCloud of everything Tech of the 21 century in Aviation
Cloud of everything Tech of the 21 century in Aviation
Assem mousa
?
Stronger Together: Combining Data Quality and Governance for Confident AI & A...
Stronger Together: Combining Data Quality and Governance for Confident AI & A...Stronger Together: Combining Data Quality and Governance for Confident AI & A...
Stronger Together: Combining Data Quality and Governance for Confident AI & A...
Precisely
?
UiPath Document Understanding - Generative AI and Active learning capabilities
UiPath Document Understanding - Generative AI and Active learning capabilitiesUiPath Document Understanding - Generative AI and Active learning capabilities
UiPath Document Understanding - Generative AI and Active learning capabilities
DianaGray10
?
Unlocking DevOps Secuirty :Vault & Keylock
Unlocking DevOps Secuirty :Vault & KeylockUnlocking DevOps Secuirty :Vault & Keylock
Unlocking DevOps Secuirty :Vault & Keylock
HusseinMalikMammadli
?
How Discord Indexes Trillions of Messages: Scaling Search Infrastructure by V...
How Discord Indexes Trillions of Messages: Scaling Search Infrastructure by V...How Discord Indexes Trillions of Messages: Scaling Search Infrastructure by V...
How Discord Indexes Trillions of Messages: Scaling Search Infrastructure by V...
ScyllaDB
?
Q4_TLE-7-Lesson-6-Week-6.pptx 4th quarter
Q4_TLE-7-Lesson-6-Week-6.pptx 4th quarterQ4_TLE-7-Lesson-6-Week-6.pptx 4th quarter
Q4_TLE-7-Lesson-6-Week-6.pptx 4th quarter
MariaBarbaraPaglinaw
?
Wondershare Filmora Crack 14.3.2.11147 Latest
Wondershare Filmora Crack 14.3.2.11147 LatestWondershare Filmora Crack 14.3.2.11147 Latest
Wondershare Filmora Crack 14.3.2.11147 Latest
udkg888
?
TrustArc Webinar - Building your DPIA/PIA Program: Best Practices & Tips
TrustArc Webinar - Building your DPIA/PIA Program: Best Practices & TipsTrustArc Webinar - Building your DPIA/PIA Program: Best Practices & Tips
TrustArc Webinar - Building your DPIA/PIA Program: Best Practices & Tips
TrustArc
?
Future-Proof Your Career with AI Options
Future-Proof Your  Career with AI OptionsFuture-Proof Your  Career with AI Options
Future-Proof Your Career with AI Options
DianaGray10
?
[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps
[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps
[Webinar] Scaling Made Simple: Getting Started with No-Code Web Apps
Safe Software
?
BoxLang JVM Language : The Future is Dynamic
BoxLang JVM Language : The Future is DynamicBoxLang JVM Language : The Future is Dynamic
BoxLang JVM Language : The Future is Dynamic
Ortus Solutions, Corp
?
SMART SENTRY CYBER THREAT INTELLIGENCE IN IIOT
SMART SENTRY CYBER THREAT INTELLIGENCE IN IIOTSMART SENTRY CYBER THREAT INTELLIGENCE IN IIOT
SMART SENTRY CYBER THREAT INTELLIGENCE IN IIOT
TanmaiArni
?
1.1. Evolution-and-Scope-of-Business-Analytics.pptx
1.1. Evolution-and-Scope-of-Business-Analytics.pptx1.1. Evolution-and-Scope-of-Business-Analytics.pptx
1.1. Evolution-and-Scope-of-Business-Analytics.pptx
Jitendra Tomar
?
Revolutionizing-Government-Communication-The-OSWAN-Success-Story
Revolutionizing-Government-Communication-The-OSWAN-Success-StoryRevolutionizing-Government-Communication-The-OSWAN-Success-Story
Revolutionizing-Government-Communication-The-OSWAN-Success-Story
ssuser52ad5e
?
A Framework for Model-Driven Digital Twin Engineering
A Framework for Model-Driven Digital Twin EngineeringA Framework for Model-Driven Digital Twin Engineering
A Framework for Model-Driven Digital Twin Engineering
Daniel Lehner
?
What Makes "Deep Research"? A Dive into AI Agents
What Makes "Deep Research"? A Dive into AI AgentsWhat Makes "Deep Research"? A Dive into AI Agents
What Makes "Deep Research"? A Dive into AI Agents
Zilliz
?
Build with AI on Google Cloud Session #4
Build with AI on Google Cloud Session #4Build with AI on Google Cloud Session #4
Build with AI on Google Cloud Session #4
Margaret Maynard-Reid
?
Early Adopter's Guide to AI Moderation (Preview)
Early Adopter's Guide to AI Moderation (Preview)Early Adopter's Guide to AI Moderation (Preview)
Early Adopter's Guide to AI Moderation (Preview)
nick896721
?
Wondershare Dr.Fone Crack Free Download 2025
Wondershare Dr.Fone Crack Free Download 2025Wondershare Dr.Fone Crack Free Download 2025
Wondershare Dr.Fone Crack Free Download 2025
maharajput103
?
Understanding Traditional AI with Custom Vision & MuleSoft.pptx
Understanding Traditional AI with Custom Vision & MuleSoft.pptxUnderstanding Traditional AI with Custom Vision & MuleSoft.pptx
Understanding Traditional AI with Custom Vision & MuleSoft.pptx
shyamraj55
?
Cloud of everything Tech of the 21 century in Aviation
Cloud of everything Tech of the 21 century in AviationCloud of everything Tech of the 21 century in Aviation
Cloud of everything Tech of the 21 century in Aviation
Assem mousa
?
Stronger Together: Combining Data Quality and Governance for Confident AI & A...
Stronger Together: Combining Data Quality and Governance for Confident AI & A...Stronger Together: Combining Data Quality and Governance for Confident AI & A...
Stronger Together: Combining Data Quality and Governance for Confident AI & A...
Precisely
?
UiPath Document Understanding - Generative AI and Active learning capabilities
UiPath Document Understanding - Generative AI and Active learning capabilitiesUiPath Document Understanding - Generative AI and Active learning capabilities
UiPath Document Understanding - Generative AI and Active learning capabilities
DianaGray10
?
Unlocking DevOps Secuirty :Vault & Keylock
Unlocking DevOps Secuirty :Vault & KeylockUnlocking DevOps Secuirty :Vault & Keylock
Unlocking DevOps Secuirty :Vault & Keylock
HusseinMalikMammadli
?
How Discord Indexes Trillions of Messages: Scaling Search Infrastructure by V...
How Discord Indexes Trillions of Messages: Scaling Search Infrastructure by V...How Discord Indexes Trillions of Messages: Scaling Search Infrastructure by V...
How Discord Indexes Trillions of Messages: Scaling Search Infrastructure by V...
ScyllaDB
?
Q4_TLE-7-Lesson-6-Week-6.pptx 4th quarter
Q4_TLE-7-Lesson-6-Week-6.pptx 4th quarterQ4_TLE-7-Lesson-6-Week-6.pptx 4th quarter
Q4_TLE-7-Lesson-6-Week-6.pptx 4th quarter
MariaBarbaraPaglinaw
?
Wondershare Filmora Crack 14.3.2.11147 Latest
Wondershare Filmora Crack 14.3.2.11147 LatestWondershare Filmora Crack 14.3.2.11147 Latest
Wondershare Filmora Crack 14.3.2.11147 Latest
udkg888
?
TrustArc Webinar - Building your DPIA/PIA Program: Best Practices & Tips
TrustArc Webinar - Building your DPIA/PIA Program: Best Practices & TipsTrustArc Webinar - Building your DPIA/PIA Program: Best Practices & Tips
TrustArc Webinar - Building your DPIA/PIA Program: Best Practices & Tips
TrustArc
?

Hive: Data Warehousing for Hadoop

  • 1. Hive: Data Warehousing for Hadoop Ben Lever @bmlever Big Data Analytics Meetup 27 March 2012
  • 2. Another Data Warehousing System? ? Problem: C Lots of data ? Partial solution: C Hadoop ? Another problem: C MapReduce can be hard C Schema information embedded in program C a lot of data is still structured
  • 3. Solution: Hive ? A system for querying and managing structured data within Hadoop C MapReduce for execution C HDFS for storage ? Designed for end-users that know more SQL than Java ? Apache v2 ? hive.apache.org
  • 4. Working example: MovieLens ? Movie ratings ? 3 ^tables ̄: Users Movies Ratings id id user id age title movie id gender release date rating (1 C 5) occupation action timestamp zip code adventure romance ... www.grouplens.org
  • 6. So far ? Hive shell ? Creating and loading tables ? Data model: C INT, BIGINT, TINYINT, STRING, etc C Also: FLOAT, DOUBLE, ARRAY, MAP, STRUCT ? Simple queries with filtering ? Table data is immutable ? Schema on readvsschema on write
  • 7. Hive components TABLE customer ( customer_id BIGINT, Metastore gender STRING, ... schema info launch MapReduce Driver MapReduc e job Hive query HDFS (SQL-like) raw source data (compressed) SELECT * FROM customers CLI WHERE gender = `M¨;
  • 8. Metastore Hadoop C The Definitive Guide
  • 9. Other SQL-like features ? Aggregation C COUNT, AVG ? JOIN ? GROUP BY ? SORT BY ? Sub queries
  • 10. Demo
  • 11. Built in functions ? Text mining: C ngrams() C context_ngrams() C sentences() ? Statistics + mathematics: C stddev() C histogram_numeric() C log C radians
  • 12. User Defined Functions ? Written in Java ? User Defined Functions (UDFs): C Single row ? Single row C e.g. mathematical and string functions ? User Defined Aggregate Functions (UDAFs): C Multiple rows ? Single row C e.g. AVG ? User Defined Table Functions (UDTFs): C Single row ? Multiple rows C e.g. ^explode ̄
  • 13. Hive Clients Hadoop C The Definitive Guide
  • 15. Sqoop Move data between Hadoop and relational databases RDBMS Sqoop Hadoop Hive Metastore schema http://incubator.apache.org/projects/sqoop.html
  • 17. Conclusion ? Scales to handle much more data than traditional systems: C Leverages Hadoop HDFS and MapReduce C Relational/structured data C Schema on read vs schema on write ? Supports rapid iteration of ad-hoc queries C SQL-like querying language C Complex queries (joins, etc) with minimal code ? Is not a database replacement: C Treats data as immutable C No indexing

Editor's Notes

  • #5: # of users = 943# of movies = 1682# of ratings = 100,000
  • #8: ShellDriverCompilerExecution engineMetastore