襷覲願鍵




                             2011.9.29
       Apache Avro           1.5.4 蠍一




            蟾
      knight76.tistory.com
Origin of the name
 Avro: a British aircraft manufacturer
 1910~1963 (disappeared through M&A)
 企慨蠍
れ
 覦 覯  れ螳  .
  http://ftp.daum.net/apache//avro/avro-1.5.4/java/




   ( 覿 伎
  http://ftp.daum.net/apache//avro/avro-1.5.4/avro-src-1.5.4.tar.gz れ)
ろ 襴


Genavro (IDL)  ->  Avpr (JSON protocol)  ->  Java stub/skeleton code

Genavro -> Avpr:
  java -jar avro-tools-1.5.4.jar idl helloworld.genavro helloworld.avpr

Avpr -> Java:
  java -jar avro-tools-1.5.4.jar compile protocol helloworld.avpr ./gen-java
Genavro 
@namespace("test")
protocol HelloService {

        string hello(string greeting);

        record Result {
                int cmdNumber;
                string return;
        }

        error CmdException {
                string message;
        }

        Result cmd(int cmdNumber, string param) throws CmdException;
}
Avpr 
( java -jar avro-tools-1.5.4.jar idl helloworld.genavro helloworld.avpr )

{
  "protocol" : "HelloService",
  "namespace" : "test",
  "types" : [ {
    "type" : "record",
    "name" : "Result",
    "fields" : [ {
      "name" : "cmdNumber",
      "type" : "int"
    }, {
      "name" : "return",
      "type" : "string"
    } ]
  }, {
    "type" : "error",
    "name" : "CmdException",
    "fields" : [ {
      "name" : "message",
      "type" : "string"
    } ]
  } ],
  "messages" : {
    "hello" : {
      "request" : [ {
        "name" : "greeting",
        "type" : "string"
      } ],
      "response" : "string"
    },
    "cmd" : {
      "request" : [ {
        "name" : "cmdNumber",
        "type" : "int"
      }, {
        "name" : "param",
        "type" : "string"
      } ],
      "response" : "Result",
      "errors" : [ "CmdException" ]
    }
  }
}
java 
java -jar avro-tools-1.5.4.jar compile protocol helloworld.avpr ./gen-java
HelloService.java
package test;

@SuppressWarnings("all")
public interface HelloService {
  // The protocol definition is embedded by the code generator as an escaped JSON string.
  public static final org.apache.avro.Protocol PROTOCOL = org.apache.avro.Protocol.parse(
      "{\"protocol\":\"HelloService\",\"namespace\":\"test\",\"types\":["
      + "{\"type\":\"record\",\"name\":\"Result\",\"fields\":[{\"name\":\"cmdNumber\",\"type\":\"int\"},{\"name\":\"return\",\"type\":\"string\"}]},"
      + "{\"type\":\"error\",\"name\":\"CmdException\",\"fields\":[{\"name\":\"message\",\"type\":\"string\"}]}],"
      + "\"messages\":{\"hello\":{\"request\":[{\"name\":\"greeting\",\"type\":\"string\"}],\"response\":\"string\"},"
      + "\"cmd\":{\"request\":[{\"name\":\"cmdNumber\",\"type\":\"int\"},{\"name\":\"param\",\"type\":\"string\"}],\"response\":\"Result\",\"errors\":[\"CmdException\"]}}}");

  // Synchronous (blocking) calls, one method per protocol message.
  java.lang.CharSequence hello(java.lang.CharSequence greeting)
      throws org.apache.avro.AvroRemoteException;
  test.Result cmd(int cmdNumber, java.lang.CharSequence param)
      throws org.apache.avro.AvroRemoteException, test.CmdException;

  // Asynchronous variants: the result is delivered to the supplied callback.
  @SuppressWarnings("all")
  public interface Callback extends HelloService {
    public static final org.apache.avro.Protocol PROTOCOL = test.HelloService.PROTOCOL;
    void hello(java.lang.CharSequence greeting,
        org.apache.avro.ipc.Callback<java.lang.CharSequence> callback) throws java.io.IOException;
    void cmd(int cmdNumber, java.lang.CharSequence param,
        org.apache.avro.ipc.Callback<test.Result> callback) throws java.io.IOException;
  }
}
Eclipse-maven 襦語
 genavro json朱 焔 avpr  
  
 pom.xml 殊 java stub/skeleton 襷
   avro-maven-plugin plugin 豢螳
Eclipse
 src/main/avro 襴 avpr, genavro
   豺
Maven 襦
   pom.xml
<plugin>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-maven-plugin</artifactId>
  <version>1.5.4</version>
  <executions>
    <execution>
      <id>schemas</id>
      <phase>generate-sources</phase>
      <goals>
        <goal>schema</goal>
        <goal>protocol</goal>
        <goal>idl-protocol</goal>
      </goals>
      <configuration>
        <sourceDirectory>src/main/avro/</sourceDirectory>
        <outputDirectory>src/main/java/</outputDirectory>
        <testSourceDirectory>src/test/avro/</testSourceDirectory>
        <testOutputDirectory>src/test/java/</testOutputDirectory>
      </configuration>
    </execution>
  </executions>
</plugin>
Exception 覦

C:\a>java -jar avro-tools-1.5.4.jar idl helloworld.genavro helloworld.avpr
Exception in thread "main" org.apache.avro.compiler.idl.ParseException: Undefined name '.String', at line 4, column 9
     at org.apache.avro.compiler.idl.Idl.error(Idl.java:48)
     at org.apache.avro.compiler.idl.Idl.ReferenceType(Idl.java:750)
     at org.apache.avro.compiler.idl.Idl.Type(Idl.java:670)
     at org.apache.avro.compiler.idl.Idl.ResultType(Idl.java:821)
     at org.apache.avro.compiler.idl.Idl.MessageDeclaration(Idl.java:565)
     at org.apache.avro.compiler.idl.Idl.ProtocolBody(Idl.java:342)
     at org.apache.avro.compiler.idl.Idl.ProtocolDeclaration(Idl.java:206)
     at org.apache.avro.compiler.idl.Idl.CompilationUnit(Idl.java:84)
     at org.apache.avro.tool.IdlTool.run(IdlTool.java:65)
     at org.apache.avro.tool.Main.run(Main.java:74)
     at org.apache.avro.tool.Main.main(Main.java:63)
Avro tool
 compile : Generates Java code for the given schema.
 fragtojson : Renders a binary-encoded Avro datum as JSON.
 fromjson : Reads JSON records and writes an Avro data file.
 genavro : Generates a JSON schema from a GenAvro file
 getschema : Prints out schema of an Avro data file.
 induce : Induce a schema/protocol from Java class/interface.
 jsontofrag : Renders a JSON-encoded Avro datum as binary.
 rpcreceive : Opens an HTTP RPC Server and listens for one message.
 rpcsend : Sends a single RPC message.
 tojson : Dumps an Avro data file as JSON, one record per line.
Features
轟
 a data serialization system.
    Protocol buffer Thrift 願鍵譬螳 旧 襭
      覦, Avro serialization 讌譴 旧 襭.
 轟
      Schema language, cross language
      Rich data structures.
      A compact, fast, binary data format.
      A container file, to store persistent data.
      Remote procedure call (RPC).
          . 1.3.0覿 讌 (ipc れ 豢螳)
    Simple integration with dynamic languages.
 Language : C, C++, Java, PHP, Python, Ruby
轟
 Created and mainly developed by Doug Cutting
 Doug Cutting said Avro will replace Hadoop's existing RPC
   http://www.cloudera.com/blog/2009/11/avro-a-new-format-for-data-interchange/
轟
 Avro is replacing Thrift as the RPC client for interacting with Cassandra.
 Avro is a sub-project of the Apache Hadoop project.
 It is a dynamic data serialization library that has an advantage over Thrift in that it does not require static code generation.
 org.apache.cassandra.thrift.CassandraServer => org.apache.cassandra.avro.CassandraServer
Type
   Primitive
        string
        bytes : sequence of bytes
        int
        long
        float
        double
        boolean
        null
   Complex
        record : similar to a struct
                name : name of the record
                doc : documentation for the schema
                fields : list of fields, each with name, doc, type, default
        array
                items
        map
                values
        union
        fixed (fixed length)
                name, size
        enum
                name, doc, symbols
Schema
 JSON  伎伎 蟲譟磯ゼ  : Schema
    一危一 襷血 
 ′ ろる 一危磯ゼ 谿襦襦 ,
  レ  覿覿 JSON襦 ろる
  襯 襾殊 ??
 壱 , Schema  燕
一危 
           Object container file

         Schema  (Protocol)

               Json data




Client                             Server
Object Container File
 Serialization 觜覦
 一危 ろる 覲伎 一危磯 蟲
 一危磯 豢貅 Block 
 Sync Marker螳 豢  譴螳 豢螳
=> A structure similar to the Hadoop Sequence File
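The block and sync-marker layout named on this slide can be sketched with the JDK alone. The sketch below follows the container file layout in the Avro specification: a header made of the magic bytes, a metadata map holding the schema, and a 16-byte sync marker, followed by data blocks that each carry an object count, a byte size, the serialized objects, and the sync marker again. The class ContainerFileSketch and its methods are hypothetical names for illustration. It is not the Avro library's own writer, it uses a plain "string" schema, and it takes a caller-supplied sync marker where a real writer would generate a random one.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ContainerFileSketch {
    // Magic bytes that open every Avro object container file: 'O', 'b', 'j', 1.
    static final byte[] MAGIC = {'O', 'b', 'j', 1};

    // Avro long: zig-zag transform, then 7 bits per byte, low-order group first.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Avro bytes and strings: a length (as a long) followed by the raw bytes.
    static void writeBytes(ByteArrayOutputStream out, byte[] b) {
        writeLong(out, b.length);
        out.write(b, 0, b.length);
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Writes a header plus one data block of string records (illustrative only).
    static byte[] write(String schemaJson, List<String> records, byte[] sync) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // Header: magic, metadata map, sync marker.
        out.write(MAGIC, 0, MAGIC.length);
        Map<String, byte[]> meta = new LinkedHashMap<>();
        meta.put("avro.schema", utf8(schemaJson));
        meta.put("avro.codec", utf8("null"));
        writeLong(out, meta.size());           // one map block with meta.size() entries
        for (Map.Entry<String, byte[]> e : meta.entrySet()) {
            writeBytes(out, utf8(e.getKey()));
            writeBytes(out, e.getValue());
        }
        writeLong(out, 0);                     // a zero count ends the map
        out.write(sync, 0, 16);

        // Data block: object count, byte size, serialized objects, sync marker.
        ByteArrayOutputStream block = new ByteArrayOutputStream();
        for (String r : records) {
            writeBytes(block, utf8(r));
        }
        byte[] data = block.toByteArray();
        writeLong(out, records.size());
        writeLong(out, data.length);
        out.write(data, 0, data.length);
        out.write(sync, 0, 16);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] sync = new byte[16];
        Arrays.fill(sync, (byte) 0x2A);
        byte[] file = write("\"string\"", Arrays.asList("hello", "avro"), sync);
        System.out.println(file.length + " bytes, magic = "
                + new String(file, 0, 3, StandardCharsets.US_ASCII));
    }
}
```

Because the sync marker repeats after every block, a reader can seek to an arbitrary offset and scan forward to the next marker, which is the same idea the Hadoop Sequence File uses to make files splittable.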
Object Container File
Hadoop Sequence File
 [Figures: Hadoop sequence file layout; Hadoop sequence file header]
 Source: http://www.cloudera.com/blog/2011/01/hadoop-io-sequence-map-set-array-bloommap-files/
Hadoop Integration
1. Pig, Hive  螳 弰 焔 れ 一危磯ゼ
                 豌襴 螳ロ  
2. Java api 螻襯 蠏豪概, るジ 語伎 螻 蟲
                      襦 
           3. 豢螳ロ螻,    蟲譟
       4. Schema sort order襯    ??
       Hadoop螻 Avro螳 一  襦 蠍磯


Apache Avro API
RPC
 Transports - currently only HTTP
 Handshake (exchange/verify protocols)
 Asynchronous/Synchronous
Java API
 http://avro.apache.org/docs/1.5.4/api/java/index.html
SocketServer Sample
 https://github.com/phunt/avro-rpc-quickstart/blob/master/src/main/java/example/Main.java

 覯 覦 襷 覦覯 殊蟇
   java api螳 覦螻  譴. (覯 
    覦)
 Json Schema襯 蠍 覓語 覲旧″
  一危 蟲譟  覦 企れ 
   .
  Importing, polymorphism implementation,
   Nested extension..
蠍磯 
  覯 覿 覦襦 螳 (http/json)
 RpcSendTool/RpcReceiveTool 螻 螳 蟆讀
  螳ロ api 螻
   /豌  蠍磯螳 ろ誤  
 襷 commiter襯 ′り 
  ) Tracing api
Comparison with Thrift and Protocol Buffers
 Dynamic typing
   Json朱 る慨 serialization/deserialization
    伎 旧 螳
 Untagged data
   一危 語 compact蟆 蠍 伎 data
      覲願 
 No manually-assigned field IDs
   Thrift protocol buffer 蠎 . (ordering
    譴)
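To make "untagged data" and "no manually-assigned field IDs" concrete, here is a minimal JDK-only sketch of how the Result record from the HelloService protocol would be laid out under Avro's binary encoding rules: each field value is written in schema order with no tag or field number in front of it. The class UntaggedRecordSketch and its methods are hypothetical names for illustration. This is a hand-rolled encoder following the Avro specification, not the Avro library's own encoder.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class UntaggedRecordSketch {

    // Avro int/long: zig-zag transform so small magnitudes stay small,
    // then a variable-length encoding of 7 bits per byte.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Avro string: byte length (as a long) followed by the UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Result { int cmdNumber; string return; }
    // Fields are written back to back in schema order. Nothing in the output
    // identifies a field, so the reader needs the schema to decode it.
    static byte[] encodeResult(int cmdNumber, String ret) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(out, cmdNumber);
        writeString(out, ret);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : encodeResult(3, "ok")) {
            hex.append(String.format("%02x ", b));
        }
        System.out.println(hex.toString().trim()); // prints "06 04 6f 6b"
    }
}
```

The four bytes are the zig-zag value of 3, the string length 2, and the two characters. A Thrift or Protocol Buffers message would also carry a field identifier for each value, which is why those formats need field numbers assigned by hand while Avro relies on field order in the schema.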
