Protocol Buffer
www.tothenew.com
Serialization - Basic Concepts
➢Serialization is the encoding of objects, and the objects reachable in them, into a stream of bytes.
➢The concept is by no means unique to Java, but this presentation focuses on Java's serialization.
➢It is the basis for Java's built-in object persistence.
➢Versioning is handled through serialVersionUID.
➢Implementing the marker interface java.io.Serializable makes a class serializable.
➢Transient and static fields are not serialized.
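The points above can be sketched with a small round trip through a byte array. The Account class and its fields are hypothetical and exist only for illustration; the sketch assumes nothing beyond java.io.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class TransientDemo {
    // Hypothetical class used only for illustration.
    static final class Account implements Serializable {
        private static final long serialVersionUID = 1L;

        String owner;               // written by default serialization
        transient String password;  // skipped by default serialization

        Account(String owner, String password) {
            this.owner = owner;
            this.password = password;
        }
    }

    // Serializes the object to a byte array and reads it back.
    static Account roundTrip(Account original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = roundTrip(new Account("alice", "secret"));
        System.out.println(copy.owner);     // the owner survives the round trip
        System.out.println(copy.password);  // the transient field comes back as null
    }
}
```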
Serialization - Advantages
➢Provides ways to hook into the serialization process
○ by implementing readObject() and writeObject()
○ by implementing readExternal() and writeExternal() (Externalizable)
➢When you want to serialize a substitute for the object, for example only part of its state
○ implement writeReplace() to name the object that is written in its place and readResolve() to name the object that is returned when reading.
➢Object validation
○ implement validateObject() of the ObjectInputValidation interface and register it with ObjectInputStream.registerValidation() inside readObject(); it is then called automatically once the object graph has been deserialized
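As a minimal sketch of the validation hook, the following uses a hypothetical Range class whose invariant is low <= high. It assumes the callback is registered from inside readObject(), which is where ObjectInputStream.registerValidation() may be called.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectInputValidation;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class ValidationDemo {
    // Hypothetical value class: the invariant is low <= high.
    static final class Range implements Serializable, ObjectInputValidation {
        private static final long serialVersionUID = 1L;
        int low;
        int high;

        Range(int low, int high) {
            this.low = low;
            this.high = high;
        }

        // Hook: called by ObjectInputStream instead of the default field reading.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();          // read the fields as usual
            in.registerValidation(this, 0);  // ask for a callback after the graph is read
        }

        @Override
        public void validateObject() throws InvalidObjectException {
            if (low > high) {
                throw new InvalidObjectException("low > high");
            }
        }
    }

    // Returns true if the object deserializes, false if validateObject() rejects it.
    static boolean survivesRoundTrip(Range range) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(range);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
                return true;
            }
        } catch (InvalidObjectException e) {
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(survivesRoundTrip(new Range(1, 5)));  // valid range
        System.out.println(survivesRoundTrip(new Range(5, 1)));  // violates the invariant
    }
}
```

The constructor deliberately skips the check so that an invalid instance can be written; in real code the same invariant would normally be enforced there too.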
Serialization - Problem
➢Slow Processing
○ Serialization discovers which fields to write/read through reflection and Type Introspection, which is usually slow.
○ Serialization writes extra data to stream.
○ You can offset the cost of serialization, to some extent, by having application objects implement java.io.Externalizable, but there is still significant overhead in marshalling the class descriptor. To avoid that cost as well, call readExternal() and writeExternal() on the objects directly: for example, call obj.writeExternal(stream) rather than stream.writeObject(obj).
➢No proper handling of fields
○ Default serialization skips transient and static fields; if their values must survive, custom readObject() and writeObject() code has to write and restore them by hand.
○ When default handling is inefficient, use the Externalizable interface instead of Serializable.
○ That means writing readExternal() and writeExternal() yourself, which is a lot more work for simple serialization.
➢Not Secure
○ Because the format is fully documented, it's possible to read the contents of the serialized stream without the class being available
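➢No proper version handling; even serialVersionUID does not help much. Without it, the class cannot tolerate version changes: the default UID is computed from the class, so almost any change makes old streams unreadable (InvalidClassException). With it, changing the UID for a new version breaks every reader of previously serialized data.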
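The Externalizable advice above can be sketched as follows. Point is a hypothetical class; the sketch compares the number of bytes produced by stream.writeObject(obj) with those produced by calling obj.writeExternal(stream) directly, and reads the direct form back with readExternal().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

final class ExternalizableDemo {
    // Hypothetical class; Externalizable needs a public no-arg constructor.
    public static final class Point implements Externalizable {
        int x;
        int y;

        public Point() { }

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(x);
            out.writeInt(y);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            x = in.readInt();
            y = in.readInt();
        }
    }

    // stream.writeObject(p): writes the class descriptor plus the field data.
    static byte[] viaWriteObject(Point p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(p);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // p.writeExternal(stream): writes only what writeExternal() emits.
    static byte[] viaWriteExternal(Point p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                p.writeExternal(out);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads back data written by viaWriteExternal(); the caller must know the type.
    static Point readDirect(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            Point p = new Point();
            p.readExternal(in);
            return p;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point p = new Point(3, 4);
        System.out.println("writeObject bytes:   " + viaWriteObject(p).length);
        System.out.println("writeExternal bytes: " + viaWriteExternal(p).length);
    }
}
```

The direct call trades safety for size: the stream no longer records the class, so the reader has to know which type to instantiate, and shared references are not tracked.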
Protocol Buffer - Basic Concepts
➢Library for Serializing Messages
➢Protocol Buffer is a Serialization format with an interface description language developed by Google
➢Write a .proto file describing the structure of the data (the message format) and run it through the protocol compiler to generate Java classes
➢Each class has accessors for the fields defined
➢Methods for parsing and serializing the data in a compact and very fast format
➢Protocol buffers are strongly typed
➢Handles versioning automatically: fields are identified by number, so readers skip the fields they do not know
➢Generates classes for C++, Java, and Python
○ More languages are supported through external repositories (C#, Erlang, etc.)
➢Each generated class represents a single message
➢protoc generates code that depends on private APIs in libprotobuf.jar, which often change between versions. So use the same version in Maven as the compiler installed on the system.
.proto File
➢Defines a message format/class
➢Simple syntax for defining message
➢Fields in a message class must be identified by a numeric index
➢A field has a name, a type, and a descriptor stating, for example, whether it is required
➢Messages can import definitions from other .proto files and nest or embed other messages; there is no inheritance between messages
Sample .proto File
package java;
option java_package="com.shashi.protoc.generated";
option java_outer_classname="AddressBookProtos";

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber number = 4;
}

message AddressBook {
  repeated Person person = 1;
}
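To give a feel for how compact the encoding is, the sketch below hand-encodes the name and id fields of a Person in plain Java. It is an illustration of the wire format only: the class and method names are hypothetical, and real code would use the classes protoc generates (for example AddressBookProtos.Person) rather than writing bytes by hand.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class PersonWireSketch {
    // Base-128 varint: 7 payload bits per byte, high bit set on all but the last byte.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Encodes only "required string name = 1" and "required int32 id = 2".
    static byte[] encodePerson(String name, int id) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);

        writeVarint(out, (1 << 3) | 2);  // tag: field 1, wire type 2 (length-delimited)
        writeVarint(out, utf8.length);   // length prefix
        out.write(utf8, 0, utf8.length); // UTF-8 bytes of the name

        writeVarint(out, (2 << 3) | 0);  // tag: field 2, wire type 0 (varint)
        writeVarint(out, id);            // int32 value, sign-extended to 64 bits
        return out.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : encodePerson("Bob", 150)) {
            hex.append(String.format("%02X ", b));
        }
        System.out.println(hex.toString().trim());
    }
}
```

For the name "Bob" and the id 150 this produces the eight bytes 0A 03 42 6F 62 10 96 01: two one-byte tags, a one-byte length, three bytes of text, and a two-byte varint.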
import Command
➢Imports another .proto file
➢Allows separating different message classes into different files
➢The imported file should be in the same directory
○ it can be in another directory, in which case an additional argument (-I / --proto_path) has to be passed to the protoc compiler
package Command
➢In a message file, generates namespaces
➢In C++, package abc.def would mean
namespace abc {
  namespace def {
    . . .
  }
}
➢In Java, package has the same significance as in the Java language; it is used as the Java package unless option java_package is given.
message Command
➢Encloses a message class
➢The keyword "message" is followed by the name of the message, which becomes its Java class name
➢Message classes are encapsulated: fields are reached only through the generated accessors
enum Command
➢The keyword enum is followed by the name of the enumeration
➢Zero-based enumeration
➢Produces an actual Java enum
➢Simply defines an enumeration; it does not create a field in the message for that enumeration
Fields
➢Fields are members of the message class
➢Convention is [descriptor] type name = index
➢Index is 1-based
➢Indexes 1-15 perform better than 16+ because their tag fits in a single byte, so save 1-15 for the most frequently used fields
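The cost of a field index can be worked out directly. On the wire each field is preceded by a tag, (index << 3) | wire type, written as a varint; the three wire-type bits leave four bits for the index in the first byte, which is why indexes up to 15 fit in one byte. The helper below is a hypothetical sketch of that arithmetic, not part of any protobuf API.

```java
final class TagSize {
    // Number of bytes the varint-encoded tag occupies for a given field index.
    static int tagSize(int fieldNumber) {
        long tag = ((long) fieldNumber) << 3;  // the low 3 bits hold the wire type
        int size = 1;
        while ((tag >>>= 7) != 0) {            // each varint byte carries 7 bits
            size++;
        }
        return size;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 15, 16, 2047, 2048}) {
            System.out.println("field " + n + " -> " + tagSize(n) + " tag byte(s)");
        }
    }
}
```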
Descriptor
➢Describes the field
➢Required means that the field must be set before the message is written
➢Optional means that the field is not required to be set before writing
➢Repeated means that the field is a collection (dynamic array) of another type
○ For historical reasons, repeated fields of scalar numeric types aren't encoded as efficiently as they could be. New code should use the special option [packed=true] to get a more efficient encoding. The option applies only to scalar numeric types, not to message types such as Person, so the example uses a hypothetical numeric field:
message Measurements {
  repeated int32 sample = 1 [packed = true];
}
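The saving from packing can be estimated with a little arithmetic. Without packing, every element of a repeated numeric field carries its own tag; with packing, the field is written once as a tag, a length prefix, and the values back to back. The sketch below computes both sizes for a repeated int32 field; the class and method names are hypothetical.

```java
final class PackedSize {
    // Bytes needed to write a value as a base-128 varint.
    static int varintSize(long value) {
        int size = 1;
        while ((value >>>= 7) != 0) {
            size++;
        }
        return size;
    }

    // Unpacked: each element is written as tag + value.
    static int unpackedSize(int fieldNumber, int[] values) {
        int tag = varintSize(((long) fieldNumber) << 3);
        int total = 0;
        for (int v : values) {
            total += tag + varintSize(v);
        }
        return total;
    }

    // Packed: one tag, one length prefix, then all values with no per-element tag.
    static int packedSize(int fieldNumber, int[] values) {
        if (values.length == 0) {
            return 0;  // an empty packed field is not written at all
        }
        int payload = 0;
        for (int v : values) {
            payload += varintSize(v);
        }
        return varintSize(((long) fieldNumber) << 3) + varintSize(payload) + payload;
    }

    public static void main(String[] args) {
        int[] samples = {1, 2, 3, 300};
        System.out.println("unpacked: " + unpackedSize(4, samples) + " bytes");
        System.out.println("packed:   " + packedSize(4, samples) + " bytes");
    }
}
```

For the four sample values the unpacked form takes 9 bytes and the packed form 7; the gap grows by one tag per additional element.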
Types
➢The expected type of the field
➢There is a range of integer types (int32, int64, uint32, ...) as well as string and bytes types
➢Can be the name of an enumeration
➢Can be the name of another message class
Class Generation
➢Use the protoc compiler
➢protoc -I=$SRC_DIR --java_out=$DST_DIR $SRC_DIR/addressbook.proto
➢Use your classes via aggregation
○ DO NOT inherit from your message class
Advantage / Disadvantages
➢Advantages:
○ If you add new fields to the structure, old programs that do not know about them simply ignore these new fields.
○ If you remove an optional field, old programs just assume the default value for the deleted field.
➢Disadvantages
○ Required fields cannot be removed once added, so the schema has to be planned in advance.
■ It is suggested to add only optional fields and to make only fields such as id required.
○ Just a way to encode data, not an RPC mechanism
■ it is designed to be used with any RPC implementation
○ Not for unstructured text
○ Not great if your first priority is human readability (not good for debugging and the like)
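The first advantage, old programs ignoring fields they do not know, follows from the wire format: every field announces its wire type, so a reader can skip a value without understanding it. The sketch below is a hand-written stand-in for an "old" reader that knows only Person fields 1 (name) and 2 (id); the class is hypothetical and far simpler than the parser protoc generates.

```java
import java.nio.charset.StandardCharsets;

final class UnknownFieldSkipper {
    private final byte[] buf;
    private int pos;
    String name = "";  // field 1
    int id;            // field 2

    private UnknownFieldSkipper(byte[] buf) {
        this.buf = buf;
    }

    private long readVarint() {
        long result = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            byte b = buf[pos++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalArgumentException("malformed varint");
    }

    // Skips a value of the given wire type without interpreting it.
    private void skip(int wireType) {
        switch (wireType) {
            case 0: readVarint(); break;  // varint
            case 1: pos += 8; break;      // fixed 64-bit
            case 2:                       // length-delimited
                int length = (int) readVarint();
                pos += length;
                break;
            case 5: pos += 4; break;      // fixed 32-bit
            default:
                throw new IllegalArgumentException("unsupported wire type " + wireType);
        }
    }

    static UnknownFieldSkipper parse(byte[] data) {
        UnknownFieldSkipper p = new UnknownFieldSkipper(data);
        while (p.pos < data.length) {
            long tag = p.readVarint();
            int field = (int) (tag >>> 3);
            int wireType = (int) (tag & 7);
            if (field == 1 && wireType == 2) {
                int length = (int) p.readVarint();
                p.name = new String(data, p.pos, length, StandardCharsets.UTF_8);
                p.pos += length;
            } else if (field == 2 && wireType == 0) {
                p.id = (int) p.readVarint();
            } else {
                p.skip(wireType);  // a field this reader does not know about
            }
        }
        return p;
    }

    public static void main(String[] args) {
        // name = "Bob", an unknown field 3 containing "a@b", id = 150
        byte[] data = {0x0A, 3, 'B', 'o', 'b', 0x1A, 3, 'a', '@', 'b', 0x10, (byte) 0x96, 0x01};
        UnknownFieldSkipper person = parse(data);
        System.out.println(person.name + " " + person.id);
    }
}
```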
Alternatives
➢Apache Avro:
○ Essentially Protocol Buffers with an RPC facility; it is a data serialization and RPC framework used in Apache Hadoop
○ Dynamic typing: no code generation required, only a schema in JSON format
■ Can optionally use Avro IDL.
○ No static data types, which facilitates generic data-processing systems
➢Apache Thrift:
○ A code generation engine with an IDL and a binary communication protocol, developed by Facebook
○ Facilitates calls between different language platforms
○ Instead of writing a load of boilerplate code to serialize and transport your objects and invoke remote methods, you can get right down to business.
Thank you!
