http://avro.apache.org
                                            Apache Avro
                         More Than Just A Serialization Framework

                                                        Jim Scott
                                         Lead Engineer / Architect




                                                             A ValueClick Company
Agenda

     • History / Overview

     • Serialization Framework

              • Supported Languages

              • Performance

     • Implementing Avro (Including Code Examples)

     • Avro with Maven

     • RPC (Including Code Examples)

     • Resources

     • Questions?




2   Not to be distributed without prior consent. Confidential. Copyright © 2011, Dotomi a ValueClick Company. All rights reserved.
History / Overview

     Existing Serialization Frameworks

              • protobuf, thrift, avro, kryo, hessian, activemq-protobuf, scala, sbinary,
                google-gson, jackson/JSON, javolution, protostuff, woodstox, aalto,
                fast-infoset, xstream, java serialization, etc.

     Most popular frameworks

              • JAXB, Protocol Buffers, Thrift

     Avro

              Created by Doug Cutting, the Creator of Hadoop

              • Data is always accompanied by a schema:

                             Support for dynamic typing--code generation is not required
                             Supports schema evolution
                             The data is not tagged resulting in smaller serialization size
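The "not tagged" point can be made concrete with a small hand-rolled sketch. It follows the Avro specification's binary rules (zig-zag variable-length longs, length-prefixed UTF-8 strings, record fields concatenated in schema order) but does not use the Avro library. The record shape {id: long, name: string} and all class and method names are assumptions for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustrative sketch of Avro's binary encoding rules; not the Avro library itself.
class UntaggedEncodingSketch {

    // int and long values are written as zig-zag variable-length integers.
    static void writeLong(ByteArrayOutputStream out, long value) {
        long zigzag = (value << 1) ^ (value >> 63);      // small magnitudes map to small codes
        while ((zigzag & ~0x7FL) != 0) {
            out.write((int) ((zigzag & 0x7F) | 0x80));   // 7 payload bits plus a continuation bit
            zigzag >>>= 7;
        }
        out.write((int) zigzag);
    }

    // A string is its UTF-8 byte length (written as a long) followed by the bytes.
    static void writeString(ByteArrayOutputStream out, String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Hypothetical record {id: long, name: string}: field values are simply
    // concatenated in schema order, with no field names or tags in the output.
    static byte[] encodeRecord(long id, String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(out, id);
        writeString(out, name);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encodeRecord(64, "avro");
        System.out.println(bytes.length + " bytes");     // prints "7 bytes"
    }
}
```

The seven bytes are two for the id, one for the string length, and four for the characters. A reader can only make sense of them with the schema in hand, which is why the schema always travels with the data.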




Serialization Framework

     Avro Limitations

              • Map keys can only be Strings

     Avro Benefits

              • Interoperability

                            Can serialize into Avro/Binary or Avro/JSON
                            Supports reading and writing protobufs and thrift

              • Supports multiple languages

              • Rich data structures with a schema described via JSON

                            A compact, fast, binary data format.
                            A container file, to store persistent data (Schema ALWAYS available)
                            Remote procedure call (RPC).

              • Simple integration with dynamic languages (via the generic type)

                        Unlike other frameworks, an unknown schema is supported at runtime

              • Compressible and splittable by Hadoop MapReduce


Supported Languages

       Implementation   Core   Data file   Codec             RPC
       C                yes    yes         deflate           yes
       C++              yes    yes         ?                 yes
       C#               yes    no          n/a               no
       Java             yes    yes         deflate, snappy   yes
       Perl             yes    yes         deflate           no
       Python           yes    yes         deflate, snappy   yes
       Ruby             yes    yes         deflate           yes
       PHP              yes    yes         ?                 no


       Core: Parse JSON schema, read / write binary data
       Data file: Read / write avro data files
       RPC: Over HTTP

       Source: https://cwiki.apache.org/confluence/display/AVRO/Supported+Languages



Framework - Performance
     Comparison Metrics


     Time to Serialize / Deserialize

              • Avro is not the fastest, but is in the top half of all frameworks

     Object Creation

              • Avro falls to the bottom, because it always uses UTF-8 for Strings. In
                normal use cases this is not a problem, as this test was just to compare
                object creation, not object reuse.

     Size of Serialized Objects (Compressed w/ deflate or nothing)

              • Avro is only bested by Kryo by about 1 byte




     Source: http://code.google.com/p/thrift-protobuf-compare/wiki/BenchmarkingV2




Framework - Performance
         Comparison Charts

         [Charts: size of serialized data; total time to serialize data (Avro highlighted)]


    Source: http://code.google.com/p/thrift-protobuf-compare/wiki/BenchmarkingV2


Implementing Avro




Framework - Types

      Generic

               • All avro records are represented by a generic attribute/value data structure. This
                 style is most useful for systems which dynamically process datasets based on
                 user-provided scripts. For example, a program may be passed a data file whose
                 schema has not been previously seen by the program and told to sort it by the
                 field named "city".

      Specific

               • Each Avro record corresponds to a different kind of object in the programming
                 language. For example, in Java, C and C++, a specific API would generate a
                 distinct class or struct definition for each record definition. This style is used for
                 programs written to process a specific schema. RPC systems typically use this.

      Reflect

               • Avro schemas are generated via reflection to correspond to existing
                 programming language data structures. This may be useful when converting an
                 existing codebase to use Avro with minimal modifications.



      Source: https://cwiki.apache.org/confluence/display/AVRO/Glossary
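The generic style's "sort by the field named city" scenario can be sketched without any generated classes. The example below is only a stand-in: it uses java.util.Map as the attribute/value structure rather than Avro's own generic record type, and the class name, method name, and sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Stand-in sketch: a Map plays the role of a generic attribute/value record.
class GenericSortSketch {

    // Sorts records by a field whose name is only known at runtime.
    // Values are compared by their string form to keep the sketch short.
    static List<Map<String, Object>> sortByField(List<Map<String, Object>> records, String field) {
        List<Map<String, Object>> sorted = new ArrayList<>(records);   // leave the input untouched
        sorted.sort(Comparator.comparing(r -> String.valueOf(r.get(field))));
        return sorted;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> records = List.of(
                Map.<String, Object>of("city", "Chicago", "visits", 3),
                Map.<String, Object>of("city", "Austin", "visits", 7));
        System.out.println(sortByField(records, "city").get(0).get("city"));   // prints "Austin"
    }
}
```

The program never needs a compiled class for the record: the field name arrives as a plain string, which is the property that makes the generic style a good fit for script-driven processing.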



Using Reflect Type

      Class<SomeObject> type =
              SomeObject.class;

      Schema schema =
              ReflectData.AllowNull.get().getSchema(type);

      DataFileWriter<SomeObject> writer =
              new DataFileWriter<>(new ReflectDatumWriter<>(schema));




Using Specific Type

      Class<SomeObject> type =
              SomeObject.class;

      Schema schema =
              SpecificData.get().getSchema(type);

      DataFileWriter<SomeObject> writer =
              new DataFileWriter<>(new SpecificDatumWriter<>(schema));




Using the DataFileWriter

      One thing remains: tell the writer where to write...

           writer.create(schema, outputStream);

      What if you want to append to an existing file instead of creating a new
       one?

           writer.appendTo(new File("Some File That exists"));

      Time to write...

           writer.append(object);   // object is of type T




Don’t Forget About Reading
      Class<SomeObject> type =
              SomeObject.class;

      Schema schema =
              ReflectData.AllowNull.get().getSchema(type);   // reflect
              // or: SpecificData.get().getSchema(type);     // specific

      DatumReader<SomeObject> datumReader =
              new ReflectDatumReader<>(schema);              // reflect
              // or: new SpecificDatumReader<>(schema);      // specific

      DataFileStream<SomeObject> reader =
              new DataFileStream<>(inputStream, datumReader);

      reader.iterator();

      Compressed data files need no special handling: the reader decompresses them automatically.




Defining a Specific Schema

      Create an Enum type: serverstate.avsc (name is arbitrary, extension is not)
           {"type":"enum",
           "namespace":"com.yourcompany.avro",
           "name":"ServerState",
           "symbols":[
                     "STARTING",
                     "IDLE",
                     "ACTIVE",
                     "STOPPING",
                     "STOPPED"
           ]}

      Create an Exception type: wrongstate.avsc
           { "type":"error",
           "namespace":"com.yourcompany.avro",
           "name":"WrongServerStateException",
           "fields":[
                      {
                             "name":"message",
                             "type":"string"
                      }
           ]}
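Per the Avro specification, an enum value is written as an int holding the zero-based position of its symbol in the schema, so the symbol text never appears in the binary data. The sketch below illustrates this for the ServerState symbols. It is hand-rolled, does not use the Avro library, and its class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of how an enum symbol maps to its binary form; not the Avro library.
class EnumEncodingSketch {

    // Symbols in declaration order, as in serverstate.avsc.
    static final List<String> SYMBOLS =
            Arrays.asList("STARTING", "IDLE", "ACTIVE", "STOPPING", "STOPPED");

    // Zig-zag code of the symbol's position. For positions 0..63 the code is
    // 2 * position and fits in a single byte, which covers this enum.
    static int encode(String symbol) {
        int index = SYMBOLS.indexOf(symbol);
        if (index < 0) {
            throw new IllegalArgumentException("unknown symbol: " + symbol);
        }
        return index << 1;
    }

    // Reverses encode() for single-byte codes.
    static String decode(int encodedByte) {
        return SYMBOLS.get(encodedByte >>> 1);
    }

    public static void main(String[] args) {
        System.out.println(encode("IDLE"));   // prints "2"
        System.out.println(decode(4));        // prints "ACTIVE"
    }
}
```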



Defining a Specific Schema
      Create a regular data object: historical.avsc

      { "type":"record",
         "namespace":"com.yourcompany.avro",
         "name":"NewHistoricalMessage",
         "aliases": ["com.yourcompany.avro.datatypes.HistoricalMessage"],
         "fields":[ {
                   "name":"dataSource",
                   "type":[
                            "null",
                            "string"
                   ]}
         ]
      }

      Aliases allow for schema evolution.

      All generated data objects are defined in simple JSON, and the
        documentation is straightforward.

      Source: http://avro.apache.org/docs/current/spec.html
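The alias rule can be sketched as a name check. Per the specification, aliases work by treating the writer's schema as if it carried the reader's name whenever the writer's name appears among the reader's aliases. The code below is a simplified illustration of that one rule, not the Avro library's resolution logic. The class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.Set;

// Simplified sketch of alias-based name matching during schema resolution.
class AliasResolutionSketch {

    // Returns the name the reader should use for the writer's type,
    // or empty if the two named types are unrelated.
    static Optional<String> resolve(String writerName, String readerName, Set<String> readerAliases) {
        if (writerName.equals(readerName) || readerAliases.contains(writerName)) {
            return Optional.of(readerName);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String reader = "com.yourcompany.avro.NewHistoricalMessage";
        Set<String> aliases = Set.of("com.yourcompany.avro.datatypes.HistoricalMessage");

        // Data written under the old name is read as the new type.
        System.out.println(resolve("com.yourcompany.avro.datatypes.HistoricalMessage", reader, aliases));
    }
}
```

With the alias in place, files written before the record was renamed remain readable without rewriting them.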




Maven




Avro With Maven
     Maven Plugins

     • This plugin assists with the Maven build lifecycle (may not be necessary in all use cases)

       <plugin>
               <groupId>org.codehaus.mojo</groupId>
               <artifactId>build-helper-maven-plugin</artifactId>
       </plugin>

     • Compiles *.avdl, *.avpr, *.avsc, and *.genavro (define the goals accordingly)

       <plugin>
               <groupId>org.apache.avro</groupId>
               <artifactId>avro-maven-plugin</artifactId>
       </plugin>

     • Necessary for Avro to introspect generated RPC code (http://paranamer.codehaus.org/)

       <plugin>
               <groupId>com.thoughtworks.paranamer</groupId>
               <artifactId>paranamer-maven-plugin</artifactId>
       </plugin>
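As a sketch of what "define the goals accordingly" might look like for the avro-maven-plugin, the fragment below binds the schema, protocol, and idl-protocol goals (for *.avsc, *.avpr, and *.avdl files respectively) to the generate-sources phase. The ${avro.version} property and both directory paths are assumptions to be adapted to the project.

```xml
<plugin>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro-maven-plugin</artifactId>
        <!-- assumed property; define it in <properties> or use a literal version -->
        <version>${avro.version}</version>
        <executions>
                <execution>
                        <phase>generate-sources</phase>
                        <goals>
                                <goal>schema</goal>        <!-- *.avsc -->
                                <goal>protocol</goal>      <!-- *.avpr -->
                                <goal>idl-protocol</goal>  <!-- *.avdl -->
                        </goals>
                        <configuration>
                                <!-- assumed locations for the Avro sources and the generated Java -->
                                <sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
                                <outputDirectory>${project.build.directory}/generated-sources/avro/</outputDirectory>
                        </configuration>
                </execution>
        </executions>
</plugin>
```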




RPC

      How to use an Avro RPC Server

      • Define the protocol

      • Use specific types for the data passed via RPC

      • Implement the interface generated from the protocol

      • Create and start an instance of an Avro RPC Server in Java

      • Create a client based on the interface generated from the protocol




Define the Protocol

      • Create an AVDL file: historytracker.avdl (name is arbitrary, but the extension
        is not)

           @namespace("com.yourcompany.rpc")
           protocol HistoryTracker {
               import schema "historical.avsc";
               import schema "serverstate.avsc";
               import schema "wrongstate.avsc";

               void somethingHappened(
                      com.yourcompany.avro.NewHistoricalMessage Item) oneway;

               /**
                * You can add comments
                */
               com.yourcompany.avro.ServerState getState() throws
                      com.yourcompany.avro.WrongServerStateException;
           }





Create an RPC Server

      Creating a server is fast and easy…
           InetSocketAddress address =
                  new InetSocketAddress(hostname, port);
           Responder responder =
                  new SpecificResponder(HistoryTracker.class, new HistoryTrackerImpl());
           Server avroServer =
                  new NettyServer(responder, address);
           avroServer.start();


      • The HistoryTracker is the interface generated from the AVDL file

      • The HistoryTrackerImpl is an implementation of the HistoryTracker

      • There are other service implementations beyond Netty, e.g. HTTP




Create an RPC Client

      Creating a client is easier than creating a server…
           InetSocketAddress address =
                  new InetSocketAddress(hostname, port);
           Transceiver transceiver =
                  new NettyTransceiver(address);
           HistoryTracker client =
                  SpecificRequestor.getClient(HistoryTracker.class, transceiver);


      • The HistoryTracker is the interface generated from the AVDL file

      • There are other service implementations beyond Netty, e.g. HTTP




Resources

      References
      • Apache Website and Wiki
           http://avro.apache.org
           https://cwiki.apache.org/confluence/display/AVRO/Index

      • Benchmarking Serialization Frameworks
           http://code.google.com/p/thrift-protobuf-compare/wiki/BenchmarkingV2

      • An Introduction to Avro (Chris Cooper)
           http://files.meetup.com/1634302/CHUG-ApacheAvro.pdf

      Resources
      • Mailing List: user@avro.apache.org




Thanks for Attending
                                                                Questions?
                                                                              jscott@dotomi.com




Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 

Kürzlich hochgeladen (20)

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 

Avro - More Than Just a Serialization Framework - CHUG - 20120416

  • 1. Apache Avro: More Than Just A Serialization Framework. Jim Scott, Lead Engineer / Architect. A ValueClick Company. http://avro.apache.org
  • 2. Agenda
      • History / Overview
      • Serialization Framework
          • Supported Languages
          • Performance
      • Implementing Avro (Including Code Examples)
      • Avro with Maven
      • RPC (Including Code Examples)
      • Resources
      • Questions?
    Not to be distributed without prior consent. Confidential. Copyright © 2011, Dotomi a ValueClick Company. All rights reserved.
  • 3. History / Overview
  • 4. History / Overview
      Existing Serialization Frameworks
      • protobuf, thrift, avro, kryo, hessian, activemq-protobuf, scala, sbinary, google-gson, jackson/JSON, javolution, protostuff, woodstox, aalto, fast-infoset, xstream, java serialization, etc.
      Most popular frameworks
      • JAXB, Protocol Buffers, Thrift
      Avro
      Created by Doug Cutting, the creator of Hadoop
      • Data is always accompanied by a schema:
          • Supports dynamic typing; code generation is not required
          • Supports schema evolution
          • The data is not tagged, resulting in a smaller serialized size
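To make "supports schema evolution" more concrete, here is a minimal JDK-only sketch of the idea behind resolving written data against a newer reader schema: fields are matched by name, fields only the writer knew are dropped, and fields only the reader knows take the reader's default. It does not use the Avro library. The class name, the map-based record representation, and the field names `region` and `legacyId` are assumptions made for illustration. Real Avro resolution also covers aliases and type promotion, and fails when a reader-only field has no default.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative stand-in for schema resolution; not the Avro API. */
class SchemaResolutionSketch {

    /**
     * Projects a record written with an older schema onto a reader's schema.
     * readerDefaults maps every reader field name to its default value.
     */
    static Map<String, Object> resolve(Map<String, Object> written,
                                       Map<String, Object> readerDefaults) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : readerDefaults.entrySet()) {
            String name = field.getKey();
            // Keep the written value when present, otherwise use the reader's default.
            result.put(name, written.containsKey(name) ? written.get(name) : field.getValue());
        }
        return result; // fields known only to the writer are not copied
    }

    public static void main(String[] args) {
        Map<String, Object> written = new HashMap<>();
        written.put("dataSource", "web");
        written.put("legacyId", 7); // hypothetical field the reader no longer declares

        Map<String, Object> readerDefaults = new LinkedHashMap<>();
        readerDefaults.put("dataSource", null);
        readerDefaults.put("region", "unknown"); // hypothetical field added after the data was written

        System.out.println(resolve(written, readerDefaults));
    }
}
```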
  • 5. Serialization Framework
  • 6. Serialization Framework
      Avro Limitations
      • Map keys can only be Strings
      Avro Benefits
      • Interoperability
          • Can serialize into Avro/Binary or Avro/JSON
          • Supports reading and writing protobufs and thrift
      • Supports multiple languages
      • Rich data structures with a schema described via JSON
          • A compact, fast, binary data format
          • A container file to store persistent data (schema ALWAYS available)
          • Remote procedure call (RPC)
      • Simple integration with dynamic languages (via the generic type)
          • Unlike other frameworks, an unknown schema is supported at runtime
      • Compressible and splittable by Hadoop MapReduce
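Part of what makes the binary format compact is that int and long values are written with variable-length zig-zag coding, so small magnitudes take a single byte. The sketch below reimplements that coding with the JDK only, as an illustration of the wire format rather than a copy of the library's encoder; the class and method names are assumptions.

```java
import java.io.ByteArrayOutputStream;

/** Illustrative zig-zag varint coding as used for Avro int/long values. */
class ZigZagVarint {

    /** Zig-zag maps 0, -1, 1, -2 to 0, 1, 2, 3; the result is written 7 bits per byte. */
    static byte[] encodeLong(long value) {
        long zigzag = (value << 1) ^ (value >> 63);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((zigzag & ~0x7FL) != 0) {
            out.write((int) ((zigzag & 0x7F) | 0x80)); // low 7 bits, continuation bit set
            zigzag >>>= 7;
        }
        out.write((int) zigzag); // last byte, continuation bit clear
        return out.toByteArray();
    }

    /** Decodes a value produced by encodeLong. */
    static long decodeLong(byte[] bytes) {
        long zigzag = 0;
        int shift = 0;
        for (byte b : bytes) {
            zigzag |= (long) (b & 0x7F) << shift;
            shift += 7;
            if ((b & 0x80) == 0) {
                break; // continuation bit clear: this was the last byte
            }
        }
        return (zigzag >>> 1) ^ -(zigzag & 1);
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        for (long v : new long[] {0, -1, 1, -64, 64}) {
            System.out.println(v + " -> " + hex(encodeLong(v)));
        }
    }
}
```

Values from -64 to 63 fit in one byte; 64 is the first positive value that needs two (80 01).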
  • 7. Supported Languages

      Implementation | Core | Data file | Codec           | RPC
      C              | yes  | yes       | deflate         | yes
      C++            | yes  | yes       | ?               | yes
      C#             | yes  | no        | n/a             | no
      Java           | yes  | yes       | deflate, snappy | yes
      Perl           | yes  | yes       | deflate         | no
      Python         | yes  | yes       | deflate, snappy | yes
      Ruby           | yes  | yes       | deflate         | yes
      PHP            | yes  | yes       | ?               | no

      Core: Parse JSON schema, read / write binary schema
      Data file: Read / write Avro data files
      RPC: Over HTTP
      Source: https://cwiki.apache.org/confluence/display/AVRO/Supported+Languages
  • 8. Framework - Performance Comparison Metrics
      Time to Serialize / Deserialize
      • Avro is not the fastest, but is in the top half of all frameworks
      Object Creation
      • Avro falls to the bottom because it always uses UTF-8 for Strings. In normal use cases this is not a problem: the test compares object creation only, not object reuse.
      Size of Serialized Objects (compressed with deflate, or uncompressed)
      • Avro is bested only by Kryo, by about 1 byte
      Source: http://code.google.com/p/thrift-protobuf-compare/wiki/BenchmarkingV2
  • 9. Framework - Performance Comparison Charts
      [Charts: size of serialized data; total time to serialize data. Avro is highlighted.]
      Source: http://code.google.com/p/thrift-protobuf-compare/wiki/BenchmarkingV2
  • 10. Implementing Avro
  • 11. Framework - Types
      Generic
      • All Avro records are represented by a generic attribute/value data structure. This style is most useful for systems which dynamically process datasets based on user-provided scripts. For example, a program may be passed a data file whose schema has not been previously seen by the program and told to sort it by the field named "city".
      Specific
      • Each Avro record corresponds to a different kind of object in the programming language. For example, in Java, C and C++, a specific API would generate a distinct class or struct definition for each record definition. This style is used for programs written to process a specific schema. RPC systems typically use this.
      Reflect
      • Avro schemas are generated via reflection to correspond to existing programming language data structures. This may be useful when converting an existing codebase to use Avro with minimal modifications.
      Source: https://cwiki.apache.org/confluence/display/AVRO/Glossary
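The "sort it by the field named city" case from the Generic style can be sketched without any generated classes. In the JDK-only example below a plain Map stands in for a generic attribute/value record (in Avro's Java API that role is played by a generic record type); the class name, the helper methods, and the sample data are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative stand-in for processing generic records by a field chosen at runtime. */
class GenericSortSketch {

    /** Returns a sorted copy, ordered by the string form of the named field. */
    static List<Map<String, Object>> sortByField(List<Map<String, Object>> records,
                                                 String fieldName) {
        List<Map<String, Object>> sorted = new ArrayList<>(records);
        sorted.sort(Comparator.comparing(
                (Map<String, Object> r) -> String.valueOf(r.get(fieldName))));
        return sorted;
    }

    /** Builds a sample attribute/value record. */
    static Map<String, Object> record(String name, String city) {
        Map<String, Object> r = new HashMap<>();
        r.put("name", name);
        r.put("city", city);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> records = Arrays.asList(
                record("first", "Chicago"),
                record("second", "Austin"));
        // The field name arrives as data, so no schema-specific class is needed.
        for (Map<String, Object> r : sortByField(records, "city")) {
            System.out.println(r.get("city") + " " + r.get("name"));
        }
    }
}
```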
  • 12. Using Reflect Type
      // Classes come from org.apache.avro, org.apache.avro.file and org.apache.avro.reflect
      Class<SomeObject> type = SomeObject.class;
      // AllowNull makes every field nullable in the generated schema
      Schema schema = ReflectData.AllowNull.get().getSchema(type);
      DataFileWriter<SomeObject> writer = new DataFileWriter<>(new ReflectDatumWriter<>(schema));
  • 13. Using Specific Type
      // Classes come from org.apache.avro, org.apache.avro.file and org.apache.avro.specific
      Class<SomeObject> type = SomeObject.class;
      // SomeObject is a class generated from a schema
      Schema schema = SpecificData.get().getSchema(type);
      DataFileWriter<SomeObject> writer = new DataFileWriter<>(new SpecificDatumWriter<>(schema));
  • 14. Using the DataFileWriter
      Only one more thing to do: tell the writer where to write.
          writer.create(schema, outputStream);
      What if you want to append to an existing file instead of creating a new one?
          writer.appendTo(new File("Some File That exists"));
      Time to write...
          writer.append(object);  // object is of type T
          writer.close();         // flush buffered data when finished
  • 15. Don’t Forget About Reading
      Class<SomeObject> type = SomeObject.class;
      // Use the schema source and reader that match how the data was written
      Schema schema = ReflectData.AllowNull.get().getSchema(type);  // or: SpecificData.get().getSchema(type)
      DatumReader<SomeObject> datumReader = new ReflectDatumReader<>(schema);  // or: new SpecificDatumReader<>(schema)
      DataFileStream<SomeObject> reader = new DataFileStream<>(inputStream, datumReader);
      Iterator<SomeObject> records = reader.iterator();
      Remember that compressed data? The reader decompresses it automatically!
  • 16. Defining a Specific Schema
      Create an Enum type: serverstate.avsc (name is arbitrary, extension is not)
          {"type":"enum",
           "namespace":"com.yourcompany.avro",
           "name":"ServerState",
           "symbols":[ "STARTING", "IDLE", "ACTIVE", "STOPPING", "STOPPED" ]}
      Create an Exception type: wrongstate.avsc
          {"type":"error",
           "namespace":"com.yourcompany.avro",
           "name":"WrongServerStateException",
           "fields":[ { "name":"message", "type":"string" } ]}
  • 17. Defining a Specific Schema
      Create a regular data object: historical.avsc
          {"type":"record",
           "namespace":"com.yourcompany.avro",
           "name":"NewHistoricalMessage",
           "aliases":["com.yourcompany.avro.datatypes.HistoricalMessage"],
           "fields":[ { "name":"dataSource", "type":[ "null", "string" ] } ]}
      Aliases allow for schema evolution.
      All generated data objects are defined with simple JSON, and the documentation is very straightforward.
      Source: http://avro.apache.org/docs/current/spec.html
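The ["null", "string"] union used for dataSource has a simple binary form: the zero-based index of the chosen branch, then the value encoded per that branch. A null has no bytes of its own, and a string is its UTF-8 length followed by the UTF-8 bytes. The JDK-only sketch below illustrates that layout; it is not the library's encoder, and the class and method names are assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Illustrative binary layout of a ["null", "string"] union value. */
class NullableStringEncoding {

    /** Writes a long using variable-length zig-zag coding. */
    static void writeLong(ByteArrayOutputStream out, long value) {
        long zigzag = (value << 1) ^ (value >> 63);
        while ((zigzag & ~0x7FL) != 0) {
            out.write((int) ((zigzag & 0x7F) | 0x80));
            zigzag >>>= 7;
        }
        out.write((int) zigzag);
    }

    /** Encodes a value of the union ["null", "string"]. */
    static byte[] encodeNullableString(String value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (value == null) {
            writeLong(out, 0); // branch 0 is "null"; nothing follows
        } else {
            byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
            writeLong(out, 1);           // branch 1 is "string"
            writeLong(out, utf8.length); // byte length of the UTF-8 data
            out.write(utf8, 0, utf8.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(encodeNullableString(null).length + " byte(s) for null");
        System.out.println(encodeNullableString("foo").length + " byte(s) for \"foo\"");
    }
}
```

A null dataSource therefore costs one byte, and "foo" costs five: 02 for the branch, 06 for the length 3, then 66 6f 6f.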
  • 18. Maven
  • 19. Avro With Maven
      Maven Plugins
      • This plugin assists with the Maven build lifecycle (may not be necessary in all use cases)
          <plugin>
            <groupId>org.codehaus.mojo</groupId>
            <artifactId>build-helper-maven-plugin</artifactId>
          </plugin>
      • Compiles *.avdl, *.avpr, *.avsc, and *.genavro (define the goals accordingly)
          <plugin>
            <groupId>org.apache.avro</groupId>
            <artifactId>avro-maven-plugin</artifactId>
          </plugin>
      • Necessary for Avro to introspect generated RPC code (http://paranamer.codehaus.org/)
          <plugin>
            <groupId>com.thoughtworks.paranamer</groupId>
            <artifactId>paranamer-maven-plugin</artifactId>
          </plugin>
  • 20. RPC
  • 21. RPC
      How to use an Avro RPC server
      • Define the protocol
      • Use specific types for the datatypes passed via RPC
      • Implement the interface generated from the protocol
      • Create and start an instance of an Avro RPC server in Java
      • Create a client based on the interface generated from the protocol
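The relationship between the generated interface, its implementation, and a client built from that interface can be sketched with the JDK's dynamic proxies. This is a simplified in-process illustration, not Avro's transport or handshake: the Tracker interface is a hypothetical stand-in for a generated protocol interface, and the "transport" simply invokes the server object directly.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

/** Illustrative interface-based RPC wiring using JDK dynamic proxies. */
class RpcProxySketch {

    /** Hypothetical stand-in for the interface generated from a protocol. */
    interface Tracker {
        String getState();
    }

    /** Server side: an implementation of the interface. */
    static class TrackerImpl implements Tracker {
        @Override
        public String getState() {
            return "IDLE";
        }
    }

    /** Client side: a proxy of the interface that hands every call to the transport. */
    static <T> T clientFor(Class<T> iface, InvocationHandler transport) {
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, transport);
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        Tracker server = new TrackerImpl();
        // In-process "transport": call the same method on the server object.
        InvocationHandler inProcess =
                (proxy, method, methodArgs) -> method.invoke(server, methodArgs);
        Tracker client = clientFor(Tracker.class, inProcess);
        System.out.println(client.getState());
    }
}
```

The caller only ever sees the interface, so swapping the in-process handler for one that serializes the call and sends it over a network would not change client code.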
  • 22. Define the Protocol
      • Create an AVDL file: historytracker.avdl (name is arbitrary, but the extension is not)
          @namespace("com.yourcompany.rpc")
          protocol HistoryTracker {
            import schema "historical.avsc";
            import schema "serverstate.avsc";
            import schema "wrongstate.avsc";

            void somethingHappened(com.yourcompany.avro.NewHistoricalMessage Item) oneway;

            /**
             * You can add comments
             */
            com.yourcompany.avro.ServerState getState() throws com.yourcompany.avro.WrongServerStateException;
          }
  • 23. Create an RPC Server
      Creating a server is fast and easy…
          InetSocketAddress address = new InetSocketAddress(hostname, port);
          // The responder takes the generated interface and an instance of its implementation
          Responder responder = new SpecificResponder(HistoryTracker.class, new HistoryTrackerImpl());
          Server avroServer = new NettyServer(responder, address);
          avroServer.start();
      • HistoryTracker is the interface generated from the AVDL file
      • HistoryTrackerImpl is an implementation of HistoryTracker
      • There are other service implementations beyond Netty, e.g. HTTP
  • 24. Create an RPC Client
      Creating a client is easier than creating a server…
          InetSocketAddress address = new InetSocketAddress(hostname, port);
          Transceiver transceiver = new NettyTransceiver(address);
          // getClient returns an object that implements the generated interface
          HistoryTracker client = SpecificRequestor.getClient(HistoryTracker.class, transceiver);
      • HistoryTracker is the interface generated from the AVDL file
      • There are other service implementations beyond Netty, e.g. HTTP
  • 25. Resources
  • 26. Resources
      References
      • Apache Website and Wiki
          http://avro.apache.org
          https://cwiki.apache.org/confluence/display/AVRO/Index
      • Benchmarking Serialization Frameworks
          http://code.google.com/p/thrift-protobuf-compare/wiki/BenchmarkingV2
      • An Introduction to Avro (Chris Cooper)
          http://files.meetup.com/1634302/CHUG-ApacheAvro.pdf
      Resources
      • Mailing List: user@avro.apache.org
  • 27. Thanks for Attending. Questions? jscott@dotomi.com