Document name: Hadoop Version Selection




                   Hadoop Version Selection



Document version
 Document version                   Description
 V0.1
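1
    Based on the referenced .doc on HBase's use of hadoop, HBase's requirements on Hadoop fall mainly into 2 areas:
    Stability:
    (1) HDFS […]
    (2) Datanode […] should be transparent to hbase
    (3) Reliable […] between region server and datanode
    (4) Sync() support (sequence file flush support)

    Performance:
    (5) HDFS […]
    (6) Client side to datanode […]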


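2 Version selection

2.1 Candidate versions
  Among current hadoop versions, those that support append/sync include CDH3b2, 0.20-append and 0.21.0.
These 3 versions are taken as the candidate versions.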



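2.2 Selecting a version
How each version meets the hbase requirements:


                 1        2          3        4           5         6
CDH3b2
0.21.0
0.20.2-append
Note: each cell is marked as either supported or not supported.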

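Pros and cons of each version:
                  Append/sync support          Release                               hadoop0.20.2   hbase optimization
CDH3b2            HADOOP-1700 implementation   (recommended by facebook, cloudera)
0.21.0            HDFS-265 implementation      (no large-scale use; a .0 version     unknown
                                               is an unstable release)
0.20.2-append     HADOOP-1700 implementation   (recommended by facebook, cloudera)
Notes:
  HADOOP-1700 was committed on 2008-07-24 (2 […] bug fix).
  HDFS-265 was committed on 2010-05-21 and merged into 0.21.0.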

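      0.21.0 is an unstable release that has not seen large-scale use. In addition, with the data changes it introduces, there is no guarantee that this version can be used with hbase, so hadoop 0.21.0 is not selected for now.
    cdh3b2 was chosen for functional testing, which focused on the state of the append/sync implementation (see the Appendix). The conclusion is that the HADOOP-1700 append/sync implementation is usable.
    We then compared cdh3b2 with 0.20-append.
    In terms of backing, 0.20-append is driven mainly by facebook and CDH3b2 by cloudera; the two are roughly comparable in this respect.
    We also compared cdh3b2 and 0.20-append functionally. Their main features and the patches they apply are largely the same, with small differences. The table below lists the patches that differ between the two versions.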
Stability
  Cdh3:
    [HDFS-1056] - Multi-node RPC deadlocks during block recovery
    [HDFS-1122] - client block verification may result in blocks in DataBlockScanner prematurely
    [HDFS-1197] - Blocks are considered "complete" prematurely after commitBlockSynchronization or DN restart
    [HDFS-1186] - 0.20: DNs should interrupt writers at start of recovery
    [HDFS-1218] - 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization
    [HDFS-1260] - 0.20: Block lost when multiple DNs trying to recover it to different genstamps
    [HDFS-127] - DFSClient block read failures cause open DFSInputStream to become unusable
    [HDFS-686] - NullPointerException is thrown while merging edit log and image
    [HDFS-915] - Hung DN stalls write pipeline for far longer than its timeout
    [HADOOP-6269] - Missing synchronization for defaultResources in Configuration.addResource
    [HADOOP-6460] - Namenode runs of out of memory due to memory leak in ipc Server
    [HADOOP-6667] - RPC.waitForProxy should retry through NoRouteToHostException
    [HADOOP-6722] - NetUtils.connect should check that it hasn't connected a socket to itself
    [HADOOP-6723] - unchecked exceptions thrown in IPC Connection orphan clients
    [HADOOP-6724] - IPC doesn't properly handle IOEs thrown by socket factory
    [HADOOP-6762] - exception while doing RPC I/O closes channel
    [HADOOP-2366] - Space in the value for dfs.data.dir can cause great problems
    [HADOOP-4885] - Try to restore failed replicas of Name Node storage (at checkpoint time)
  0.20-append:
    HDFS-1258 Clearing namespace quota on "/" corrupts fs image.
    HDFS-955 New implementation of saveNamespace() to avoid loss of edits

  Cdh3:
    [HDFS-142] - In 0.20, move blocks being written into a blocksBeingWritten directory
    [HDFS-611] - Heartbeats times from Datanodes increase when there are plenty of blocks to delete
    [HDFS-895] - Allow hflush/sync to occur in parallel with new writes to the file
    [HDFS-877] - Client-driven block verification not functioning
    [HDFS-894] - DatanodeID.ipcPort is not updated when existing node re-registers

  Cdh3:
    [HADOOP-4655] - FileSystem.CACHE should be ref-counted
  0.20-append:
    HDFS-1041 DFSClient.getFileChecksum(..) should retry if connection to
    HDFS-927 DFSInputStream retries too many times for new block locations

Misc
  Cdh3:
    [HDFS-1161] - Make DN minimum valid volumes configurable
    [HDFS-1209] - Add conf dfs.client.block.recovery.retries to configure number of block recovery attempts
    [HDFS-455] - Make NN and DN handle in a intuitive way comma-separated configuration strings
    [HDFS-528] - Add ability for safemode to wait for a minimum number of live datanodes
    [HADOOP-1849] - IPC server max queue size should be configurable
    [HADOOP-4675] - Current Ganglia metrics implementation is incompatible with Ganglia 3.1
    [HADOOP-4829] - Allow FileSystem shutdown hook to be disabled
    [HADOOP-5257] - Export namenode/datanode functionality through a pluggable RPC layer
    [HADOOP-5450] - Add support for application-specific typecodes to typed bytes
    [HADOOP-5891] - If dfs.http.address is default, SecondaryNameNode can't find NameNode
  0.20-append:
    HADOOP-6637 Benchmark for establishing RPC session. (shv)
    HADOOP-6760 WebServer shouldn't increase port number in case of negative


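       From the table above it is easy to see that CDH3b2 carries more stability improvements and append bug fixes than 0.20-append. Some of the patches in 0.20-append also make important improvements and are recommended for adoption. Therefore cdh3b2 is chosen as the baseline version, and the 0.20-append patches will be merged in stages.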


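3
       All of the candidate versions support append/sync. Of these, CDH3b2 and 0.20-append are based on the relatively stable hadoop 0.20.2, whereas hadoop 0.21.0 is a new major version, and a .0 version is an unstable release that has not been widely used and tested. The choice therefore narrows to CDH3b2 and 0.20-append.
        Comparing CDH3b2 with 0.20-append, CDH3b2 has more improvements in system stability and append support. CDH3b2 is therefore chosen as the baseline version, and the 6 patches from 0.20-append will be merged in stages, in an order that begins with stability and ends with performance.
       CDH3b2 is chosen as the baseline version for hbase.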
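Appendix
Append tests
     The append tests use the officially provided append unit tests, run on a Mini cluster. To verify how append/sync behaves on a real cluster, append/sync functional tests were also run on a cluster.
The TestFileAppend cases passed.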
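FileAppend: simple tests of the sync and append functions
testComplexFlush()
     Create the file
     Write
     Sync
     Check the file
     Write
     Close
     Check the file
testSimpleFlush()
    Create the file
    Write to the file
    Sync
    Write to the file
    Sync
    Check the file
    Close
    Check the file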
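FileAppend2: simple tests of the sync and append functions
testComplexAppend()
     Simulates multiple threads performing append operations on multiple files
testSimpleAppend()
     Create the file
     Write data (size chosen relative to io.bytes.per.checksum)
     Close
    Write more data (size chosen relative to 2 x io.bytes.per.checksum)
    Close
    Write the remainder of the file
    Close
    Verify the file size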
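
As a rough illustration of the testSimpleAppend sequence (create, write, close, two further append-write-close rounds, then a size check), the sketch below replays the same steps against a local temporary file. It is only a stand-in: the real test goes through the HDFS client on a Mini cluster. The helper name `staged_append`, the 2048-byte payload and the cut points 186 and 607 are hypothetical values, chosen here so that the first write ends inside the first 512-byte checksum chunk and the second write ends inside the second chunk.

```python
import os
import tempfile

BYTES_PER_CHECKSUM = 512  # io.bytes.per.checksum default quoted in this appendix


def staged_append(path, payload, cut1, cut2):
    """Write payload in three stages, closing the file between stages."""
    with open(path, "wb") as f:        # create the file, write, close
        f.write(payload[:cut1])
    with open(path, "ab") as f:        # reopen in append mode, write, close
        f.write(payload[cut1:cut2])
    with open(path, "ab") as f:        # append the remainder, close
        f.write(payload[cut2:])
    return os.path.getsize(path)       # size check, as in the last step


# Hypothetical cut points: one inside the first chunk, one inside the second.
cut1, cut2 = 186, 607
assert cut1 < BYTES_PER_CHECKSUM < cut2 < 2 * BYTES_PER_CHECKSUM

payload = bytes(i % 251 for i in range(2048))   # hypothetical test data
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "simple_append.dat")
    final_size = staged_append(target, payload, cut1, cut2)
    with open(target, "rb") as f:
        round_trip = f.read()

print(final_size == len(payload), round_trip == payload)
```

A local file has no checksum chunks of its own, so this only shows the order of operations; the interesting behaviour in HDFS is what happens to the partial chunk at each reopen.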



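FileAppend3: further tests of the sync and append functions, mainly verifying correctness in scenarios that involve checksums.

testTC1()
     Create the file
   Write 1 block
   Close the file
    Open the file in append mode
   Write 0.5 block
   Close
   Read 1.5 blocks to verify the file

testTC2()
     Create the file
     Write 1.5 blocks to the file
     Close
     Append
     Write 1/4 block
     Close
      Read 1.75 blocks to verify the file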

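testTC5()
     clientA creates the file
     clientA writes
     clientA closes the file
     clientA appends to the file
     A client appends to the same file; an error is expected

testTC7()
     Create the file
     Write
     Close the file
     Corrupt the block on one of the datanodes
     Append to the file
     Close the file
     Verify the file size

testTC12()
     Create the file
     Write 25687 B
     Close the file
     Append to the file
     Write 5877 B
     Close the file
     Verify the file size
     [Note] hdfs generates one checksum for every io.bytes.per.checksum bytes (default 512). This case
verifies whether append works when the data does not end on a checksum boundary.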
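
The arithmetic behind testTC12 can be checked with a few lines. The sketch below assumes only the default of 512 bytes per checksum stated in the note; the function name `chunk_layout` is made up for this illustration.

```python
BYTES_PER_CHECKSUM = 512  # default io.bytes.per.checksum from the note above


def chunk_layout(length, bytes_per_checksum=BYTES_PER_CHECKSUM):
    """Return (full_chunks, bytes_in_trailing_partial_chunk) for a file length."""
    return divmod(length, bytes_per_checksum)


first_write = 25687
second_write = 5877

before_append = chunk_layout(first_write)
after_append = chunk_layout(first_write + second_write)

print(before_append)  # (50, 87)
print(after_append)   # (61, 332)
```

Neither write ends on a 512-byte boundary: the append starts 87 bytes into a chunk, so the checksum for that partial chunk has to be extended rather than started fresh, which is the situation the case exercises.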

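FileAppend4: fail-over tests of the sync and append functions
testAppendSyncBbw()
     clientA creates the file
     Write 500 B
     Sync
     Client A loses its lease
     Client B performs lease recovery
     Client B verifies the file
testAppendSyncBbwClusterRestart()
     clientA creates the file
     Write 500 B
     Sync
     Restart the cluster
     Client B performs lease recovery
     Client B verifies the file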
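testAppendSyncChecksum0()
     Create the file
     Write 1/2 block
     Sync
     Stop the cluster
     Corrupt the checksum of the corresponding block on the 1st datanode
     Start the cluster
     Verify the file
testAppendSyncChecksum1()
     Create the file
     Write 1/2 block
     Sync
     Stop the cluster
     Corrupt the checksum of the corresponding block on the 2nd datanode
     Start the cluster
     Verify the file
testAppendSyncChecksum2()
     Create the file
     Write 1/2 block
     Sync
     Stop the cluster
     Corrupt the checksum of the corresponding block on the 3rd datanode
     Start the cluster
     Verify the file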
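
The idea behind the testAppendSyncChecksum cases can be sketched in a simplified model: each replica of a block carries per-chunk checksums, one replica has its stored checksum damaged, and a reader should still obtain good data from another replica. The code below is an assumption-laden toy, not the HDFS read path: replicas are in-memory tuples, CRC32 comes from Python's `zlib`, and the names `chunk_checksums`, `first_valid_replica` and `cluster_with_bad_checksum` are invented for this example.

```python
import zlib

BYTES_PER_CHECKSUM = 512


def chunk_checksums(data, chunk=BYTES_PER_CHECKSUM):
    """CRC32 of each chunk, including a trailing partial chunk."""
    return [zlib.crc32(data[i:i + chunk]) for i in range(0, len(data), chunk)]


def first_valid_replica(replicas):
    """Return (index, data) of the first replica whose stored checksums match."""
    for index, (data, stored) in enumerate(replicas):
        if chunk_checksums(data) == stored:
            return index, data
    raise IOError("no replica passed checksum verification")


block = bytes(range(256)) * 3        # 768 bytes of hypothetical synced data
good = chunk_checksums(block)        # two entries: one full chunk, one partial


def cluster_with_bad_checksum(bad_index, copies=3):
    """Build `copies` replicas and damage the stored checksum of one of them."""
    replicas = []
    for i in range(copies):
        stored = list(good)
        if i == bad_index:
            stored[-1] ^= 0xFFFFFFFF  # flip every bit of the last stored CRC
        replicas.append((block, stored))
    return replicas


for bad in range(3):
    index, data = first_valid_replica(cluster_with_bad_checksum(bad))
    print(bad, index, data == block)
```

Whichever single replica is damaged, the reader in this model skips it and returns intact data from a healthy copy.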
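testAppendSyncReplication0()
     Create the file
     Write 1/2 block
     Sync
     Stop the 1st datanode
     Write 1/4 block
     Sync
     Restart the cluster
     Verify the file
testAppendSyncReplication1()
     Create the file
     Write 1/2 block
     Sync
     Stop the 2nd datanode
     Write 1/4 block
     Sync
     Restart the cluster
     Verify the file
testAppendSyncReplication2()
     Create the file
     Write 1/2 block
     Sync
     Stop the 3rd datanode
     Write 1/4 block
     Sync
     Restart the cluster
     Verify the file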
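testDnDeath0()
     After the write completes, stop the first datanode, perform lease recovery, then verify the file content.
testDnDeath1()
     After the write completes, stop the second datanode, perform lease recovery, then verify the file content.
testDnDeath2()
     After the write completes, stop the third datanode, perform lease recovery, then verify the file content.
testFullClusterPowerLoss()
     Write a block and sync, then simulate a power loss, then verify the file content (this concerns page cache contents being flushed to disk).
testHalfLengthPrimaryDN()
     Simulates a machine losing power while DFSClient is writing (the datanode has written only half of the data).
testRecoverFinalizedBlock()
     Verifies the case where the block has been finalized but the file has not been completed, checking lease recovery with respect to the
"Could not complete file" exception.
testTruncatedPrimaryDN()
     Simulates a machine losing power while DFSClient is writing (the file on the datanode was not written).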
