Google Protocol Buffers (Protobuf) is a compact, strongly typed, cross-language format for serializing structured data. Messages are defined in .proto files, which are then compiled to generate data access classes in various languages. Protobuf is more lightweight than alternatives such as XML, but it has some limitations, such as lacking map and set data structures. It aims to be a simple yet flexible format for uses such as message payloads, logging, and data storage.
3. Binary serialization - sucks
- Language-specific
- Not safe (see Effective Java)
- Not extensible
4. XML, JSON - back to 1999
- Too verbose
- Need to parse
- Slow performance
- Huge size if not compressed
- No strong types (int vs float)
- Need to store field names
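The size argument above can be made concrete with a minimal sketch. It hand-encodes one record using Protobuf's tag-plus-varint wire layout and compares the result with the same record as JSON. The record, its field numbers (1 = id, 2 = name), and the helper names are assumptions made for illustration; real code would use classes generated from a .proto file.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Tag (field number, wire type 0) followed by the varint value."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Tag (field number, wire type 2), then byte length, then UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


# Hypothetical record: field 1 = id, field 2 = name.
record = {"id": 150, "name": "alice"}
binary = encode_int_field(1, record["id"]) + encode_string_field(2, record["name"])
text = json.dumps(record).encode("utf-8")

print(binary.hex())            # field names never appear, only numeric tags
print(len(binary), len(text))  # binary form is 10 bytes; JSON is longer
```

The binary form carries numeric tags instead of field names, which is where most of the saving comes from for small records.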
7. Options
- Outer class
- Default value
- Deprecated
- Optimize for speed / size
- Custom options
- descriptor.proto
8. Our protobuf use cases:
- Java, C++, C#
- Payload for ZeroC ICE
- IBM MQ / Solace messages
- DB raw data
- Log messages to disk
- Compress using TAR.GZIP
- Show as XML / JSON
- exe utility associated with protobuf files
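A serialized Protobuf message does not mark its own end, so appending many messages to one log file needs some framing. The sketch below shows one possible scheme, a fixed 4-byte big-endian length prefix per record. The prefix format and the function names are assumptions for illustration, not something the slides specify, and the payloads are opaque byte strings standing in for serialized messages.

```python
import io
import struct


def write_record(stream, payload: bytes) -> None:
    """Append one record: 4-byte big-endian length, then the payload bytes."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)


def read_records(stream):
    """Yield payloads until the stream ends; fail loudly on a truncated record."""
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of log
        if len(header) < 4:
            raise EOFError("truncated length prefix")
        (size,) = struct.unpack(">I", header)
        payload = stream.read(size)
        if len(payload) < size:
            raise EOFError("truncated record")
        yield payload


# In-memory stand-in for a log file; the payloads are placeholder bytes.
log = io.BytesIO()
for message in (b"\x08\x01", b"\x08\x96\x01", b""):
    write_record(log, message)

log.seek(0)
restored = list(read_records(log))
print(restored)
```

The same functions work on a file opened in binary mode, and the resulting file can be compressed afterwards like any other.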
9. Disadvantages
- No Map<K, V> / Dictionary<K, V>
- No Set<T>
- No short / int16 / uint16
- No interning
- Generated classes are immutable
- Compiler and library versions are not backwards compatible
- descriptor.proto is not backwards compatible
- Few officially supported languages
- Enum is not extensible (unknown resets to 0)
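Two of the points above can be sketched in plain code: the enum fallback, and a common workaround for the missing map type (a repeated list of key/value entries). This is an illustration of the behavior the slide describes, not the Protobuf library itself; `Color`, `Entry`, and both helper functions are hypothetical names.

```python
from dataclasses import dataclass
from enum import IntEnum


class Color(IntEnum):
    """Hypothetical enum as an older client knows it."""
    UNKNOWN = 0
    RED = 1
    GREEN = 2


def decode_color(wire_value: int) -> Color:
    """Map a number from the wire to Color; unknown numbers fall back to 0."""
    try:
        return Color(wire_value)
    except ValueError:
        return Color(0)


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for a message with 'key' and 'value' fields."""
    key: str
    value: int


def entries_to_dict(entries):
    """Rebuild a dictionary from repeated entries; a later duplicate key wins."""
    return {entry.key: entry.value for entry in entries}


# A newer sender added BLUE = 3; the older client cannot represent it.
print(decode_color(2).name)  # GREEN
print(decode_color(3).name)  # UNKNOWN

limits = entries_to_dict([Entry("cpu", 4), Entry("mem", 16), Entry("cpu", 8)])
print(limits)  # {'cpu': 8, 'mem': 16}
```

The enum fallback is why the value 0 is usually best reserved for an explicit "unknown" member rather than a meaningful one.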
11. Apache Avro
- No field tags
- Schema is required
- The entire record is tagged by schema ID
- Fields are matched by name
- No optional values: union { null, long } is used instead
- Resolution rules are used to reconcile server and client schemas
15. Comparison

| Feature | Thrift | Protobuf |
|---|---|---|
| Language Bindings | Java, C++, Python, C#, Cocoa, Erlang, Haskell, OCaml, Perl, PHP, Ruby, Smalltalk | Java, C++, Python |
| Primitive Types | bool, byte, 16/32/64-bit integers, double, string, byte sequence, map<t1,t2>, list<t>, set<t> | bool, 32/64-bit integers, float, double, string, byte sequence, repeated properties act like lists |
| Enumerations | Yes | Yes |
| Constants | Yes | No |
| Composite Type | Struct | Message |
| Exception Handling | Yes | No |
| Documentation | Lacking | Good |
| License | Apache | BSD-style |
| Compiler | C++ | C++ |
| RPC Interfaces | Yes | Yes |
| RPC Implementation | Yes | No |
| Composite Type Extensions | No | Yes |
| Data Versioning | Yes | Yes |
Pros
Thrift:
- More languages supported out of the box
- Richer data structures than Protobuf (e.g.: Map and Set)
- Includes RPC implementation for services
Protobuf:
- Slightly faster than Thrift when using "optimize_for = SPEED"
- Serialized objects slightly smaller than Thrift due to more aggressive data compression
- Better documentation
- API a bit cleaner than Thrift
Cons
Thrift:
- Good examples are hard to find
- Missing/incomplete documentation
Protobuf:
- .proto can define services, but no RPC implementation is defined (although stubs are generated for you).