

Apache Avro (based on 1.5.4)

Origin of the name
 Avro: a British aircraft manufacturer
 1910~1963 (disappeared through M&A)
Download

 (To read the source, download
  http://ftp.daum.net/apache//avro/avro-1.5.4/avro-src-1.5.4.tar.gz)
How it works

 Genavro (IDL)  ->  Avpr (JSON)  ->  Java (Stub / Skeleton code)

 Genavro -> Avpr:
   java -jar avro-tools-1.5.4.jar idl helloworld.genavro helloworld.avpr

 Avpr -> Java:
   java -jar avro-tools-1.5.4.jar compile protocol helloworld.avpr ./gen-java
Genavro (helloworld.genavro)

@namespace("test")
protocol HelloService {

        string hello(string greeting);

        record Result {
                int cmdNumber;
                string return;
        }

        error CmdException {
                string message;
        }

        Result cmd(int cmdNumber, string param) throws CmdException;
}
Avpr (generated by: java -jar avro-tools-1.5.4.jar idl helloworld.genavro helloworld.avpr)

{
  "protocol" : "HelloService",
  "namespace" : "test",
  "types" : [ {
    "type" : "record",
    "name" : "Result",
    "fields" : [ {
      "name" : "cmdNumber",
      "type" : "int"
    }, {
      "name" : "return",
      "type" : "string"
    } ]
  }, {
    "type" : "error",
    "name" : "CmdException",
    "fields" : [ {
      "name" : "message",
      "type" : "string"
    } ]
  } ],
  "messages" : {
    "hello" : {
      "request" : [ {
        "name" : "greeting",
        "type" : "string"
      } ],
      "response" : "string"
    },
    "cmd" : {
      "request" : [ {
        "name" : "cmdNumber",
        "type" : "int"
      }, {
        "name" : "param",
        "type" : "string"
      } ],
      "response" : "Result",
      "errors" : [ "CmdException" ]
    }
  }
}
java -jar avro-tools-1.5.4.jar compile protocol helloworld.avpr ./gen-java
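HelloService.java

package test;

@SuppressWarnings("all")
public interface HelloService {
  // The helloworld.avpr JSON, embedded as an escaped string literal.
  public static final org.apache.avro.Protocol PROTOCOL =
      org.apache.avro.Protocol.parse("{\"protocol\":\"HelloService\",\"namespace\":\"test\",\"types\":[{\"type\":\"record\",\"name\":\"Result\",\"fields\":[{\"name\":\"cmdNumber\",\"type\":\"int\"},{\"name\":\"return\",\"type\":\"string\"}]},{\"type\":\"error\",\"name\":\"CmdException\",\"fields\":[{\"name\":\"message\",\"type\":\"string\"}]}],\"messages\":{\"hello\":{\"request\":[{\"name\":\"greeting\",\"type\":\"string\"}],\"response\":\"string\"},\"cmd\":{\"request\":[{\"name\":\"cmdNumber\",\"type\":\"int\"},{\"name\":\"param\",\"type\":\"string\"}],\"response\":\"Result\",\"errors\":[\"CmdException\"]}}}");

  // Synchronous calls.
  java.lang.CharSequence hello(java.lang.CharSequence greeting)
      throws org.apache.avro.AvroRemoteException;
  test.Result cmd(int cmdNumber, java.lang.CharSequence param)
      throws org.apache.avro.AvroRemoteException, test.CmdException;

  // Asynchronous variants that report the result through a callback.
  @SuppressWarnings("all")
  public interface Callback extends HelloService {
    public static final org.apache.avro.Protocol PROTOCOL = test.HelloService.PROTOCOL;
    void hello(java.lang.CharSequence greeting,
        org.apache.avro.ipc.Callback<java.lang.CharSequence> callback) throws java.io.IOException;
    void cmd(int cmdNumber, java.lang.CharSequence param,
        org.apache.avro.ipc.Callback<test.Result> callback) throws java.io.IOException;
  }
}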
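Eclipse-maven project
 The avpr is the JSON generated from the genavro file
 Java stub/skeleton code is generated through pom.xml
   Add the avro-maven-plugin plugin
 Place the avpr and genavro files in the src/main/avro directory
Maven project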
Exception raised

C:a>java -jar avro-tools-1.5.4.jar idl helloworld.genavro helloworld.avpr
Exception in thread "main" org.apache.avro.compiler.idl.ParseException: Undefined name '.String', at line 4, column 9
     at org.apache.avro.compiler.idl.Idl.error(Idl.java:48)
     at org.apache.avro.compiler.idl.Idl.ReferenceType(Idl.java:750)
     at org.apache.avro.compiler.idl.Idl.Type(Idl.java:670)
     at org.apache.avro.compiler.idl.Idl.ResultType(Idl.java:821)
     at org.apache.avro.compiler.idl.Idl.MessageDeclaration(Idl.java:565)
     at org.apache.avro.compiler.idl.Idl.ProtocolBody(Idl.java:342)
     at org.apache.avro.compiler.idl.Idl.ProtocolDeclaration(Idl.java:206)
     at org.apache.avro.compiler.idl.Idl.CompilationUnit(Idl.java:84)
     at org.apache.avro.tool.IdlTool.run(IdlTool.java:65)
     at org.apache.avro.tool.Main.run(Main.java:74)
     at org.apache.avro.tool.Main.main(Main.java:63)

 The trace passes through ResultType and ReferenceType, so the parser read the
 result type of the message on line 4 as a reference to an undefined named type.
 This suggests the IDL used "String" where the primitive "string" was meant.
Avro tool
 compile : Generates Java code for the given schema.
 fragtojson : Renders a binary-encoded Avro datum as JSON.
 fromjson : Reads JSON records and writes an Avro data file.
 genavro : Generates a JSON schema from a GenAvro file
 getschema : Prints out schema of an Avro data file.
 induce : Induce a schema/protocol from Java class/interface.
 jsontofrag : Renders a JSON-encoded Avro datum as binary.
 rpcreceive : Opens an HTTP RPC Server and listens for one message.
 rpcsend : Sends a single RPC message.
 tojson : Dumps an Avro data file as JSON, one record per line.
Features
 A data serialization system.
    Protocol Buffers and Thrift are communication tools between heterogeneous systems,
      whereas Avro is a communication tool focused on serialization.
 Features
      Schema language, cross language
      Rich data structures.
      A compact, fast, binary data format.
      A container file, to store persistent data.
      Remote procedure call (RPC).
          Supported since 1.3.0 (ipc added)
      Simple integration with dynamic languages.
 Language : C, C++, Java, PHP, Python, Ruby
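
The "compact, binary data format" point can be made concrete with the way the Avro specification encodes int and long values: zig-zag coding followed by a variable-length byte sequence. The following is a minimal stand-alone sketch of that encoding using only the JDK. The class and method names (ZigZagVarint, encodeLong, decodeLong, hex) are hypothetical and are not part of the Avro API.

```java
import java.io.ByteArrayOutputStream;

/** Sketch of zig-zag variable-length coding as used for Avro int/long values. */
final class ZigZagVarint {

    /** Encodes a signed long as zig-zag + base-128 varint bytes. */
    static byte[] encodeLong(long value) {
        // Zig-zag maps small magnitudes to small codes: 0->0, -1->1, 1->2, -2->3, ...
        long zigzag = (value << 1) ^ (value >> 63);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Varint: 7 payload bits per byte; the high bit is set while more bytes follow.
        while ((zigzag & ~0x7FL) != 0) {
            out.write((int) ((zigzag & 0x7F) | 0x80));
            zigzag >>>= 7;
        }
        out.write((int) zigzag);
        return out.toByteArray();
    }

    /** Decodes bytes produced by encodeLong back into the signed value. */
    static long decodeLong(byte[] bytes) {
        long zigzag = 0;
        int shift = 0;
        for (byte b : bytes) {
            zigzag |= (long) (b & 0x7F) << shift;
            shift += 7;
        }
        // Undo the zig-zag mapping.
        return (zigzag >>> 1) ^ -(zigzag & 1);
    }

    /** Formats bytes as space-separated lowercase hex for display. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (long v : new long[] {0, -1, 1, -2, 2, -64, 64}) {
            System.out.println(v + " -> " + hex(encodeLong(v)));
        }
    }
}
```

Small values such as 0, -1 and 1 take a single byte (00, 01, 02), and 64 takes two bytes (80 01), which is where much of the compactness comes from.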
 Created and mainly developed by Doug Cutting
 Doug Cutting said Avro will replace Hadoop's existing RPC
 Avro is replacing Thrift as the RPC client for interacting with Cassandra.
 Avro is a sub-project of the Apache Hadoop project
 It is a dynamic data serialization library that has an advantage over Thrift
  in that it does not require static code generation.
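Type
 Primitive
        string
        bytes : byte array
        int, long, float, double, boolean, null
 Complex
        record : a concept similar to a struct
                name : record name
                doc : description of the schema
                fields : name, doc, type, default
        array : items
        map : values
        union
        fixed (fixed length) : name, size
        enum : name, doc, symbols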
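
To show how two of the complex types above look on the wire, here is a small sketch that follows the binary encoding rules of the Avro specification. An array is written as a block (item count, then the items) terminated by a zero count. A union is written as the zero-based index of the chosen branch followed by the value. The class ComplexTypeSketch and its methods are hypothetical helpers written for this illustration; they are not the Avro library API.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Sketch of Avro binary encoding for array<int> and union ["null","string"]. */
final class ComplexTypeSketch {

    /** Writes a long using zig-zag + varint coding (used for int, long, counts, lengths). */
    static void writeLong(ByteArrayOutputStream out, long value) {
        long zigzag = (value << 1) ^ (value >> 63);
        while ((zigzag & ~0x7FL) != 0) {
            out.write((int) ((zigzag & 0x7F) | 0x80));
            zigzag >>>= 7;
        }
        out.write((int) zigzag);
    }

    /** Writes a string as its UTF-8 byte length followed by the UTF-8 bytes. */
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    /** Encodes array<int>: one block (count + items) if non-empty, then a 0 end marker. */
    static byte[] encodeIntArray(int[] items) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (items.length > 0) {
            writeLong(out, items.length);
            for (int item : items) {
                writeLong(out, item);
            }
        }
        writeLong(out, 0); // a block with count zero ends the array
        return out.toByteArray();
    }

    /** Encodes union ["null","string"]: branch index, then the value (null has no bytes). */
    static byte[] encodeNullableString(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (s == null) {
            writeLong(out, 0); // branch 0: null
        } else {
            writeLong(out, 1); // branch 1: string
            writeString(out, s);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(encodeIntArray(new int[] {1, 2, 3}).length + " bytes for [1, 2, 3]");
        System.out.println(encodeNullableString("a").length + " bytes for \"a\"");
    }
}
```

Under these rules the array [1, 2, 3] becomes the bytes 06 02 04 06 00, a null union value becomes 00, and the string "a" in the same union becomes 02 02 61.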
 Schema: the data structure is defined using JSON
    一危一 襷血 
 ′ ろる 一危磯ゼ 谿襦襦 ,
  レ  覿覿 JSON襦 ろる
  襯 襾殊 ??
 壱 , Schema  燕
[Diagram: Client and Server; Object container file; Schema (Protocol); Json data]
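Object Container File
 The secret of serialization
 Consists of the schema information for the data and the data itself
 Data is stored in compressed Blocks
 A Sync Marker is added between the compressed blocks
=> A structure similar to the Hadoop Sequence File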
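
The structure described above can be sketched byte by byte. Per the Avro specification, a container file starts with the magic bytes 'O', 'b', 'j', 1, then a metadata map that carries the schema (key avro.schema) and the codec (key avro.codec), then a 16-byte sync marker. Each data block holds an object count, the byte size of the serialized objects, the objects themselves, and the same sync marker. The class ContainerFileSketch below is a hypothetical illustration built on the JDK only. It uses the "null" (no compression) codec and takes the sync marker as a parameter, whereas a real writer would generate it randomly.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Sketch of the Avro object container file layout with a single data block. */
final class ContainerFileSketch {

    /** Writes a long using zig-zag + varint coding. */
    static void writeLong(ByteArrayOutputStream out, long value) {
        long zigzag = (value << 1) ^ (value >> 63);
        while ((zigzag & ~0x7FL) != 0) {
            out.write((int) ((zigzag & 0x7F) | 0x80));
            zigzag >>>= 7;
        }
        out.write((int) zigzag);
    }

    /** Writes a length-prefixed byte sequence (same layout for strings and bytes). */
    static void writeBytes(ByteArrayOutputStream out, byte[] bytes) {
        writeLong(out, bytes.length);
        out.write(bytes, 0, bytes.length);
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /**
     * Builds header + one block.
     * sync must be 16 bytes; serializedObjects is the already-encoded data.
     */
    static byte[] build(String schemaJson, byte[] sync, long objectCount,
                        byte[] serializedObjects) {
        if (sync.length != 16) {
            throw new IllegalArgumentException("sync marker must be 16 bytes");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // Header: magic bytes.
        out.write('O');
        out.write('b');
        out.write('j');
        out.write(1);

        // Header: metadata map, written as one block of 2 entries, then a 0 end marker.
        writeLong(out, 2);
        writeBytes(out, utf8("avro.schema"));
        writeBytes(out, utf8(schemaJson));
        writeBytes(out, utf8("avro.codec"));
        writeBytes(out, utf8("null"));
        writeLong(out, 0);

        // Header: sync marker.
        out.write(sync, 0, sync.length);

        // Data block: count, size in bytes, objects, sync marker.
        writeLong(out, objectCount);
        writeLong(out, serializedObjects.length);
        out.write(serializedObjects, 0, serializedObjects.length);
        out.write(sync, 0, sync.length);

        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] sync = new byte[16];
        for (int i = 0; i < sync.length; i++) {
            sync[i] = (byte) i; // fixed marker for the demo only
        }
        // Three ints (1, 2, 3) under the schema "int" encode to 02 04 06.
        byte[] file = build("\"int\"", sync, 3, new byte[] {2, 4, 6});
        System.out.println("container file sketch: " + file.length + " bytes");
    }
}
```

Because the sync marker repeats after every block, a reader can seek into the middle of a file and scan forward to the next marker, which is the property that makes the format resemble the Hadoop Sequence File.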
Object Container File
 [Figure: Object Container File]
Hadoop Sequence File
 [Figures: Hadoop sequence file; Hadoop sequence file Header]

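Hadoop Integration
 1. Data can be processed with tools such as Pig and Hive
 2. Overcomes the limits of the Java api; can be used from other languages
 3. A compressible structure
 4. Schema sort order ??
 Looking forward to Hadoop and Avro working together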

Apache avro api
 Transports - currently only HTTP
 Handshake (exchange/verify protocols)
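
The handshake bullet can be illustrated with a small simulation. In the Avro specification, client and server identify protocols by the MD5 hash of the protocol's JSON text, and the server answers with a match value of BOTH, CLIENT or NONE. The sketch below models only that decision logic. HandshakeSketch, md5Hex and handshake are hypothetical names, the hashes are kept as hex strings for readability (the real handshake carries 16-byte fixed values), and no network transport is involved.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Sketch of the server-side decision in an Avro-style protocol handshake. */
final class HandshakeSketch {

    enum Match { BOTH, CLIENT, NONE }

    private final String serverHash;
    // Client protocols the server has already seen, keyed by their hash.
    private final Map<String, String> knownClientProtocols = new HashMap<>();

    HandshakeSketch(String serverProtocolJson) {
        this.serverHash = md5Hex(serverProtocolJson);
    }

    /** MD5 of the protocol JSON text, as lowercase hex. */
    static String md5Hex(String protocolJson) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(protocolJson.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xFF));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /**
     * @param clientHash         hash of the client's protocol
     * @param clientProtocolJson the client's protocol text, or null if not sent
     * @param claimedServerHash  what the client believes the server's hash is
     */
    Match handshake(String clientHash, String clientProtocolJson, String claimedServerHash) {
        if (clientProtocolJson != null) {
            knownClientProtocols.put(clientHash, clientProtocolJson);
        }
        if (!knownClientProtocols.containsKey(clientHash)) {
            // Unknown client protocol: the client must retry and include its protocol text.
            return Match.NONE;
        }
        // Known client protocol: BOTH if the client also has the right server hash,
        // otherwise CLIENT (a real server would send back its protocol and hash).
        return serverHash.equals(claimedServerHash) ? Match.BOTH : Match.CLIENT;
    }

    public static void main(String[] args) {
        String protocol = "{\"protocol\":\"HelloService\",\"namespace\":\"test\"}";
        HandshakeSketch server = new HandshakeSketch(protocol);
        String hash = md5Hex(protocol);
        System.out.println(server.handshake(hash, null, hash));     // first contact
        System.out.println(server.handshake(hash, protocol, hash)); // retry with protocol text
    }
}
```

After one successful exchange both sides can cache the hashes, so later requests only need to carry the two hashes rather than the full protocol text.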
Java api
SocketServer Sample


 覯 覦 襷 覦覯 殊蟇
   java api螳 覦螻  譴. (覯 
 Json Schema襯 蠍 覓語 覲旧″
  一危 蟲譟  覦 企れ 
  Importing, polymorphism implementation,
   Nested extension..
  覯 覿 覦襦 螳 (http/json)
 RpcSendTool/RpcReceiveTool 螻 螳 蟆讀
  螳ロ api 螻
   /豌  蠍磯螳 ろ誤  
 襷 commiter襯 ′り 
  ) Tracing api
Comparison with Thrift and Protocol Buffers
 Dynamic typing
    Serialization/deserialization and communication are possible
    from the JSON schema alone
 Untagged data
    Data is kept compact
 No manually-assigned field IDs
    Thrift and Protocol Buffers require them (ordering matters)
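
The "untagged data" point can be shown with the Result record from the helloworld example. Under the Avro binary encoding rules, a record is just its field values written in schema order, with no field names, tags or IDs in the output. The sketch below encodes Result { int cmdNumber; string return; } by hand. UntaggedRecordSketch and encodeResult are hypothetical names for this illustration and are not generated Avro code.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Sketch: a record is encoded as its field values in schema order, without tags. */
final class UntaggedRecordSketch {

    /** Writes a long using zig-zag + varint coding (also used for int and lengths). */
    static void writeLong(ByteArrayOutputStream out, long value) {
        long zigzag = (value << 1) ^ (value >> 63);
        while ((zigzag & ~0x7FL) != 0) {
            out.write((int) ((zigzag & 0x7F) | 0x80));
            zigzag >>>= 7;
        }
        out.write((int) zigzag);
    }

    /** Writes a string as its UTF-8 byte length followed by the UTF-8 bytes. */
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    /** Encodes test.Result: cmdNumber first, then return, exactly as the schema lists them. */
    static byte[] encodeResult(int cmdNumber, String returnValue) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(out, cmdNumber);     // field "cmdNumber" : int
        writeString(out, returnValue); // field "return" : string
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] encoded = encodeResult(1, "ok");
        StringBuilder sb = new StringBuilder();
        for (byte b : encoded) {
            sb.append(String.format("%02x ", b & 0xFF));
        }
        System.out.println(sb.toString().trim()); // prints "02 04 6f 6b"
    }
}
```

The four output bytes carry only the values. A reader needs the writer's schema to know that the first value is cmdNumber and the second is return, which is why Avro always ships the schema with the data (in the container file header or through the RPC handshake).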

Apache avro

  • 1. 襷覲願鍵 2011.9.29 Apache Avro 1.5.4 蠍一 蟾 knight76.tistory.com
  • 4. Download: http://ftp.daum.net/apache//avro/avro-1.5.4/java/ (to read the source, download http://ftp.daum.net/apache//avro/avro-1.5.4/avro-src-1.5.4.tar.gz)
  • 6. Genavro @namespace("test") protocol HelloService { string hello(string greeting); record Result { int cmdNumber; string return; } error CmdException { string message; } Result cmd(int cmdNumber, string param) throws CmdException; }
  • 9. HelloService.java package test; @SuppressWarnings("all") public interface HelloService { public static final org.apache.avro.Protocol PROTOCOL = org.apache.avro.Protocol.parse("{"protocol":"HelloService","namespace":"test","types ":[{"type":"record","name":"Result","fields":[{"name":"cmdNumber","type ":"int"},{"name":"return","type":"string"}]},{"type":"error","name":"Cmd Exception","fields":[{"name":"message","type":"string"}]}],"messages":{"hello ":{"request":[{"name":"greeting","type":"string"}],"response":"string"},"cm d":{"request":[{"name":"cmdNumber","type":"int"},{"name":"param","type ":"string"}],"response":"Result","errors":["CmdException"]}}}"); java.lang.CharSequence hello( java.lang.CharSequence greeting) throws org.apache.avro.AvroRemoteException; test.Result cmd(int cmdNumber, java.lang.CharSequence param) throws org.apache.avro.AvroRemoteException, test.CmdException; @SuppressWarnings("all") public interface Callback extends HelloService { public static final org.apache.avro.Protocol PROTOCOL = test.HelloService.PROTOCOL; void hello( java.lang.CharSequence greeting, org.apache.avro.ipc.Callback<java.lang.CharSequence> callback) throws java.io.IOException; void cmd(int cmdNumber, java.lang.CharSequence param, org.apache.avro.ipc.Callback<test.Result> callback) throws java.io.IOException; }
  • 12. Maven project: pom.xml
        <plugin>
          <groupId>org.apache.avro</groupId>
          <artifactId>avro-maven-plugin</artifactId>
          <version>1.5.4</version>
          <executions>
            <execution>
              <id>schemas</id>
              <phase>generate-sources</phase>
              <goals>
                <goal>schema</goal>
                <goal>protocol</goal>
                <goal>idl-protocol</goal>
              </goals>
              <configuration>
                <sourceDirectory>src/main/avro/</sourceDirectory>
                <outputDirectory>src/main/java/</outputDirectory>
                <testSourceDirectory>src/test/avro/</testSourceDirectory>
                <testOutputDirectory>src/test/java/</testOutputDirectory>
              </configuration>
            </execution>
          </executions>
        </plugin>
  • 13. Exception 覦 C:a>java -jar avro-tools-1.5.4.jar idl helloworld.genavro helloworld.avpr Exception in thread "main" org.apache.avro.compiler.idl.ParseException: Undefine d name '.String', at line 4, column 9 at org.apache.avro.compiler.idl.Idl.error(Idl.java:48) at org.apache.avro.compiler.idl.Idl.ReferenceType(Idl.java:750) at org.apache.avro.compiler.idl.Idl.Type(Idl.java:670) at org.apache.avro.compiler.idl.Idl.ResultType(Idl.java:821) at org.apache.avro.compiler.idl.Idl.MessageDeclaration(Idl.java:565) at org.apache.avro.compiler.idl.Idl.ProtocolBody(Idl.java:342) at org.apache.avro.compiler.idl.Idl.ProtocolDeclaration(Idl.java:206) at org.apache.avro.compiler.idl.Idl.CompilationUnit(Idl.java:84) at org.apache.avro.tool.IdlTool.run(IdlTool.java:65) at org.apache.avro.tool.Main.run(Main.java:74) at org.apache.avro.tool.Main.main(Main.java:63)
  • 14. Avro tool compile : Generates Java code for the given schema. fragtojson : Renders a binary-encoded Avro datum as JSON. fromjson : Reads JSON records and writes an Avro data file. genavro : Generates a JSON schema from a GenAvro file getschema : Prints out schema of an Avro data file. induce : Induce a schema/protocol from Java class/interface. jsontofrag : Renders a JSON-encoded Avro datum as binary. rpcreceive : Opens an HTTP RPC Server and listens for one message. rpcsend : Sends a single RPC message. tojson : Dumps an Avro data file as JSON, one record per line.
  • 17. Features: created and mainly developed by Doug Cutting. Doug Cutting said Avro will replace Hadoop's existing RPC. http://www.cloudera.com/blog/2009/11/avro-a-new-format-for-data-interchange/
  • 18. Features: Avro is replacing Thrift as the RPC client for interacting with Cassandra. Avro is a sub-project of the Apache Hadoop project. It is a dynamic data serialization library that has an advantage over Thrift in that it does not require static code generation. org.apache.cassandra.thrift.CassandraServer => org.apache.cassandra.avro.CassandraServer
  • 19. Type Primitive string bytes : byte 覦一 int long float double Boolean null Complex record : struct 螳 螳 name : 貊 企 doc : ろる る fields : , name, doc, type, default array Items map value Union name, size fixed (螻蠍語) enum name, doc, symbols
  • 24. Hadoop Sequence File [figures: Hadoop sequence file; Hadoop sequence file Header] http://www.cloudera.com/blog/2011/01/hadoop-io-sequence-map-set-array-bloommap-files/
  • 26. RPC Transports - currently only HTTP Handshake (exchange/verify protocols) Asynchronous/Synchronous
  • 28. SocketServer Sample: https://github.com/phunt/avro-rpc-quickstart/blob/master/src/main/java/example/Main.java