Oozie 3: Improved Scheduling and Control of Workflows

Mohammad K Islam
kamrul@yahooinc.com
Introductions
• Who I am
    - Technical Lead at Yahoo!
• Oozie Team
• Architecture, Development, Management
    - Mayank Bansal
    - Angelo Huang
    - Mohammad Islam
    - Amol Kekre
    - Andreas Newman
    - Lei Zhang
• External contributors.
• QE
    - Marcy Chang
    - Michelle Chiang
Agenda
• Oozie Overview
• Oozie 3.0 features:
    - Bundle
    - Scalability
    - Usability
• Future Plan
• Q&A
Overview: Workflow
• Oozie executes a workflow defined as a DAG of jobs.
• Job types include MapReduce, Pipes, Streaming, Pig, custom Java code, etc.
• Introduced in Oozie 1.x.
[Figure: example workflow DAG. Nodes: start, M/R job, fork, M/R streaming job, Pig job, join, decision (branches MORE and ENOUGH), M/R job, Java, FS job, end.]
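The idea of a workflow as a DAG of jobs can be illustrated with a minimal Python sketch. It is not Oozie code: the job names and the dictionary layout are assumptions made for illustration, and the standard-library `graphlib` module (Python 3.9+) stands in for the real engine.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is a job, each value is the set of jobs
# that must finish before it may run.
workflow = {
    "mr_job": {"start"},
    "fork": {"mr_job"},
    "streaming_job": {"fork"},
    "pig_job": {"fork"},
    "join": {"streaming_job", "pig_job"},
    "end": {"join"},
}

def execution_order(dag):
    """Return one valid run order: every job comes after all its dependencies."""
    return list(TopologicalSorter(dag).static_order())

order = execution_order(workflow)
print(order)
```

The two jobs between `fork` and `join` have no dependency on each other, so a real engine is free to run them in parallel; the sketch only shows that `join` cannot start until both are done.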
Overview: Coordinator
• Oozie executes workflows based on:
    - Time Dependency (Frequency)
    - Data Dependency
• Introduced in Oozie 2.x.

[Figure: architecture diagram. Labels: Oozie Client, WS API, Oozie Server (Oozie Coordinator, Oozie Workflow), Check Data Availability, Hadoop.]
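The two coordinator triggers, time and data, can be sketched as follows. This is a hedged illustration rather than the Oozie implementation: the function names, the hourly path pattern, and the `available` set are all assumptions.

```python
from datetime import datetime, timedelta

def nominal_times(start, end, frequency):
    """Yield the times at which a workflow run is due (time dependency)."""
    t = start
    while t < end:
        yield t
        t += frequency

def ready_runs(start, end, frequency, input_path_for, available):
    """Return the due times whose input data is present (data dependency)."""
    return [t for t in nominal_times(start, end, frequency)
            if input_path_for(t) in available]

# Hypothetical hourly job that needs one input directory per hour.
def path_for(t):
    return t.strftime("/data/logs/%Y%m%d%H")

available = {"/data/logs/2011050100", "/data/logs/2011050102"}
runs = ready_runs(datetime(2011, 5, 1, 0), datetime(2011, 5, 1, 3),
                  timedelta(hours=1), path_for, available)
print(runs)
```

Here the 01:00 run is due but stays waiting, because its input directory has not arrived yet.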
Bundle
• What is a Bundle?
    - A new abstraction layer on top of Coordinator.
    - Users can define and execute a bunch of coordinator applications.
    - Introduced in Oozie 3.x.
• Why is it required?
    - Data pipeline: a set of interrelated coordinator applications required for large data processing.
    - Operational nightmare: these pipelines are hard for the Service Engineering team to maintain and control.
Bundle Cont.
• Users define the bundle through a new XML.
• Users can start, stop, suspend, resume, and rerun at the bundle level.
• Bundle is optional.

[Figure: architecture diagram. Labels: Oozie Client, WS API, Oozie Server (Bundle, Coordinator, Workflow), Check Data Availability, Hadoop.]
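Bundle-level control can be pictured as one operation fanned out to every coordinator in the group. The sketch below is an assumption-laden illustration, not the Oozie API: the class names, method names, and job names are hypothetical, and only the status names come from the slides.

```python
from dataclasses import dataclass, field

@dataclass
class Coordinator:
    name: str
    status: str = "PREP"

@dataclass
class Bundle:
    name: str
    coordinators: list = field(default_factory=list)

    def _apply(self, status):
        # One bundle-level command changes every member coordinator.
        for coord in self.coordinators:
            coord.status = status

    def start(self):
        self._apply("RUNNING")

    def suspend(self):
        self._apply("SUSPENDED")

    def resume(self):
        self._apply("RUNNING")

pipeline = Bundle("daily-pipeline",
                  [Coordinator("ingest"), Coordinator("aggregate")])
pipeline.start()
pipeline.suspend()
statuses = [c.status for c in pipeline.coordinators]
print(statuses)
```

Without the bundle, an operator would have to issue the same command once per coordinator, which is the operational burden the slides describe.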
Oozie Abstraction Layers
[Figure: three layers. Layer 1: Bundle. Layer 2: Coord Job 1 and Coord Job 2, each with Coord Action 1 and Coord Action 2; the actions of Coord Job 1 run WF Job 1 and the actions of Coord Job 2 run WF Job 2. Layer 3: the M/R, PIG, and FS jobs inside the workflows.]
Enhanced Stability and Scalability
• Issue: At very high load, Oozie becomes slow.
• Impact: 90% of all Oozie support incidents.
• Reason:
    - Lots of active but non-progressing jobs.
    - Non-progressing jobs consume a lot of resources.
    - The Oozie internal queue is full.
• Resolution:
    - Throttle the number of active jobs per coordinator.
    - Put the job into a timeout state.
    - Enforce uniqueness of Oozie queue elements.
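Two of the resolutions above, queue uniqueness and throttling, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the class, the function, and the command keys are hypothetical and do not reflect Oozie's internal queue.

```python
from collections import deque

class UniqueQueue:
    """FIFO queue that ignores an element that is already waiting."""

    def __init__(self):
        self._items = deque()
        self._keys = set()

    def put(self, key):
        if key in self._keys:
            return False           # duplicate: do not grow the queue
        self._keys.add(key)
        self._items.append(key)
        return True

    def get(self):
        key = self._items.popleft()
        self._keys.discard(key)    # the same key may be queued again later
        return key

    def __len__(self):
        return len(self._items)

def can_activate(active_count, max_active):
    """Throttle: allow a new active job only while below the cap."""
    return active_count < max_active

q = UniqueQueue()
q.put("check-input:job-1")
q.put("check-input:job-1")   # same command again, ignored
q.put("check-input:job-2")
print(len(q))
```

With uniqueness enforced, a job that is polled repeatedly while making no progress occupies one queue slot instead of many.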
Improved Usability
• Issue: Coordinator job status is not intuitive and confuses Oozie users.
• Impact: User confusion and the related Oozie support load.
• Reason:
    - Status SUCCEEDED doesn't mean the job is successful!
    - Status PREMATER is for Oozie internal use only, but it was exposed to users.
• Resolution:
    - Redesign Coordinator status
Coordinator Status Redesign
• Current: PREP, PREMATER, RUNNING, SUSPENDED, SUCCEEDED, KILLED, FAILED
• New: PREP, RUNNING, PAUSED, SUSPENDED, SUCCEEDED, DONE_WITH_ERROR, KILLED, FAILED
[Figure: state-transition diagrams for the current and new status sets.]
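One way to read the new DONE_WITH_ERROR status is as a summary of mixed action outcomes. The rule below is an assumption made for illustration; the slides do not state how the final status is derived, and the function name is hypothetical.

```python
def final_status(action_outcomes):
    """Summarise terminal action outcomes into one job status (assumed rule)."""
    outcomes = set(action_outcomes)
    if not outcomes:
        raise ValueError("no action outcomes to summarise")
    if outcomes == {"SUCCEEDED"}:
        return "SUCCEEDED"         # every action succeeded
    if outcomes == {"KILLED"}:
        return "KILLED"
    if outcomes == {"FAILED"}:
        return "FAILED"
    return "DONE_WITH_ERROR"       # finished, but with mixed results

print(final_status(["SUCCEEDED", "FAILED", "SUCCEEDED"]))
```

Under this reading, SUCCEEDED is reserved for jobs whose actions all succeeded, which addresses the confusion described on the previous slide.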
Future Plan
• Higher Scalability: Change the polling-based data dependency check to a push model through HCatalog and a Notification system.
• Adaptability: Graceful handling of Hadoop downtime:
    - If Hadoop is down, block submission.
    - When Hadoop becomes available:
        • Submit the blocked jobs.
        • Auto-resubmit the untraced jobs.
• Monitoring: Rich WS API for application Monitoring/Alerting.
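The planned handling of Hadoop downtime (block submissions, then release them on recovery) could look roughly like the sketch below. It is only an illustration of the idea: the `Submitter` class, its attributes, and the `submit_fn` callable are hypothetical, and resubmission of untraced jobs is not modelled.

```python
class Submitter:
    def __init__(self, submit_fn):
        self._submit = submit_fn    # hypothetical callable that sends a job to Hadoop
        self.hadoop_up = True
        self.blocked = []

    def submit(self, job):
        if self.hadoop_up:
            self._submit(job)
        else:
            self.blocked.append(job)   # hold the job instead of failing it

    def on_hadoop_available(self):
        self.hadoop_up = True
        pending, self.blocked = self.blocked, []
        for job in pending:            # release held jobs in arrival order
            self._submit(job)

sent = []
submitter = Submitter(sent.append)
submitter.hadoop_up = False
submitter.submit("wf-1")
submitter.submit("wf-2")
submitter.on_hadoop_available()
print(sent)
```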
Future Plan Cont.
• Automatic Failover: Using ZooKeeper.
• Load Balancing: Through server replication.
• Improved Usability:
    - Distcp action
    - Hive action
• Asynchronous data processing.
• Incremental data processing.
• Apache Migration: Work is initiated.
Q&A

• GitHub link: http://yahoo.github.com/oozie
• Mailing list: Oozie-users@yahoogroups.com

Mohammad K Islam
kamrul@yahooinc.com
Oozie Hug May 2011