SlideShare ist ein Scribd-Unternehmen logo
1 von 23
Apache Avro in LivePerson 
Collecting and saving data is easy 
keeping it consistent is tough 
Sandwich club, Sep 2014 
Amihay Zer-Kavod, Software Architect
Who am I? 
Amihay Zer-Kavod 
Software Architect 
Been in software Since 1989
LivePerson Echo System 
M/R
Communication & Meaning 
● Consistent but decoupled communication 
between services, such as: 
o Monitoring, Interaction 
o Predictive, Sentiment 
o RT Reporting & Analysis 
o Visitor History 
event 
evento 
事件 
घटना 
حدث 
ארוע 
событие 
● Consistent meaning over time 
o BigData Store (Hadoop) 
o Offline Reporting & Analysis
What shouldn’t we use? 
Don’t use Direct APIs! 
They are completely wrong for this subject: 
• They produce too much coupling between services 
• APIs are synchronous by nature 
• Adds irrelevant complexity to the called service
What is needed? 
The Message is the API! 
● A unified event model (schema) for all reported events 
● Management tools for the unified schema 
● Tools for sending events over the wire 
● Tools for reading/writing event in big data 
● Backward and forward compatibility
The Event model 
From generic to specific structure with: 
• Common header - all common data to all events 
• Logical Entities - common header to all logical entities 
(such as Visitor) 
• Dynamic Specific headers 
• Specific Event body
Apache Avro to the rescue 
● Avro - a schema based serialization/deserialization 
framework 
● Avro idl - schema definition language 
● Avro file - Hadoop integration 
● Avro schema resolution 
● Apache Avro created by Doug Cutting
Avro 101 - Data Structures 
● Rich data structures 
○ Primitives 
■ null, int, long, boolean, float, double, bytes, string 
○ Records 
○ Map (string, Schema) 
○ Arrays (Schema) 
○ Enums 
○ Unions
Avro 101 - JSON Schema 
{ 
"type": "record", 
"name": "Event", 
"namespace": "com.liveperson.example", 
"doc": "Example event", 
"fields":[ 
{ "name": "id", "type": "string", "default": "Unknown"}, 
{ "name": "time", "type": "long", "default": -1}, 
{ "name": "color", "type": 
{ "type": "enum", "name": "Color", 
"symbols": ["NO_COLOR", "BLUE", "BLACK", "WHITE", "PINK"] 
}, 
"default": "NO_COLOR" } 
] 
}
Avro 101 - Avro IDL Schema 
@namespace("com.liveperson.example") 
enum Color { NO_COLOR, BLUE, BLACK, WHITE, PINK } 
/** 
Example event 
*/ 
@namespace("com.liveperson.example") 
record Event { 
string id = “Unknown”; 
long time = -1; 
Color color = "NO_COLOR"; 
}
Avro 101 - Serialization 
● JSON Serialization 
● Binary serialization 
○ int, long - variable length, Zig-zag encoding 
○ float, double - 4,8 bytes respectively 
○ string - long followed by UTF-8 bytes 
○ map, array - unlimited size, use blocks 
○ Unions - long index of the type
Avro 101 - Generic vs. Specific 
● SpecificDatumReader/Writer <T> 
○ Static types 
○ Code Generation: Java, C, C++, C#, Python, Ruby... 
● GenericDatumReader/Writer <GenericRecord> 
○ Dynamic types & access
Avro 101 - Schema Resolution 
● Writer schema must be always provided for decoding 
● Reader can use its own schema 
● Allows the reader and writer schema to evolve 
independently
Avro vs... 
Technologies Protobuf Thrift Avro 
Created 2001 (2008) 2007 2009 
Creator / Maintainer Google / Google Facebook / Apache 
Doug cutting / 
Apache 
Schema evolution Field Tag Field Tag Schema 
Static/Dynamic Yes/No Yes/No Yes/Yes 
Hadoop support No No Yes 
RPC No Yes Yes 
Used by Google Facebook, Cassandra Hadoop, Liveperson 
Lang support Good Great Good
Backward & Forward Compatibility 
Avro schema evolution 
● Avro supports resolution between two schemes 
● Need to follow a set of rules: 
● Every field must have a default value 
● A field can be added (make sure to put a default value) 
● Field types can not be changed (add a new field 
instead) 
● enum symbols can be added but never removed
Avro IDL - LivePerson Event 
/** Base for all LivePerson Events 
*/ 
@namespace("com.liveperson.global") 
record LPEvent { 
/** Common Header of the event */ 
CommonHeader header = null; 
/** Logical entity details participating in this event - Visitor, Agent, etc... */ 
array<Participant> participants = null; 
/** Holding specific platform info as node name (machine) cluster Id etc... */ 
PlatformHeader platformSpecificHeader = null; 
/** Auditing Header, Optional - adds data for auditing of the events flow in the platform*/ 
union {null, AuditingHeader } auditingHeader = null; 
/** The event body */ 
EventBody eventBody = null; 
}
Wait there is (much) more! 
M/R 
Migdalor
How good does it work? 
● Cyber Monday 2013 (one day) 
o More than 320,000 events per second 
o 7 Storm topologies consuming the events seconds from 
real time 
o 2TB of data saved to Hadoop 
● 2014 preparation: 
o x2 number of events per second to ~640,000
So how did we do it? 
1. Use an event driven system, don’t use direct APIs 
2. Create a unified schema for all events 
3. Use Avro to implement the schema 
4. Add some supporting infrastructure
Questions 
???? 
event 
evento 
事件 
घटना 
حدث 
ארוע 
событие
Amihay Zer-Kavod 
You can contact me at: 
amihayz@liveperson.com 
LivePerson is hiring!
Thank You

Weitere ähnliche Inhalte

Was ist angesagt?

F# Type Provider for R Statistical Platform
F# Type Provider for R Statistical PlatformF# Type Provider for R Statistical Platform
F# Type Provider for R Statistical PlatformHoward Mansell
 
Serialization (Avro, Message Pack, Kryo)
Serialization (Avro, Message Pack, Kryo)Serialization (Avro, Message Pack, Kryo)
Serialization (Avro, Message Pack, Kryo)오석 한
 
The D Programming Language - Why I love it!
The D Programming Language - Why I love it!The D Programming Language - Why I love it!
The D Programming Language - Why I love it!ryutenchi
 
Extending the Xbase Typesystem
Extending the Xbase TypesystemExtending the Xbase Typesystem
Extending the Xbase TypesystemSebastian Zarnekow
 
Introduction to the rust programming language
Introduction to the rust programming languageIntroduction to the rust programming language
Introduction to the rust programming languageNikolay Denev
 
Go Programming Language (Golang)
Go Programming Language (Golang)Go Programming Language (Golang)
Go Programming Language (Golang)Ishin Vin
 
Introduction to D programming language at Weka.IO
Introduction to D programming language at Weka.IOIntroduction to D programming language at Weka.IO
Introduction to D programming language at Weka.IOLiran Zvibel
 
Dart the better Javascript 2015
Dart the better Javascript 2015Dart the better Javascript 2015
Dart the better Javascript 2015Jorg Janke
 
Serialization and performance by Sergey Morenets
Serialization and performance by Sergey MorenetsSerialization and performance by Sergey Morenets
Serialization and performance by Sergey MorenetsAlex Tumanoff
 
Introduction To Scala
Introduction To ScalaIntroduction To Scala
Introduction To ScalaPeter Maas
 
Scala : language of the future
Scala : language of the futureScala : language of the future
Scala : language of the futureAnsviaLab
 
Few simple-type-tricks in scala
Few simple-type-tricks in scalaFew simple-type-tricks in scala
Few simple-type-tricks in scalaRuslan Shevchenko
 
Experience protocol buffer on android
Experience protocol buffer on androidExperience protocol buffer on android
Experience protocol buffer on androidRichard Chang
 

Was ist angesagt? (20)

Google Protocol Buffers
Google Protocol BuffersGoogle Protocol Buffers
Google Protocol Buffers
 
F# Type Provider for R Statistical Platform
F# Type Provider for R Statistical PlatformF# Type Provider for R Statistical Platform
F# Type Provider for R Statistical Platform
 
Serialization (Avro, Message Pack, Kryo)
Serialization (Avro, Message Pack, Kryo)Serialization (Avro, Message Pack, Kryo)
Serialization (Avro, Message Pack, Kryo)
 
Dart programming language
Dart programming languageDart programming language
Dart programming language
 
Sax Dom Tutorial
Sax Dom TutorialSax Dom Tutorial
Sax Dom Tutorial
 
The D Programming Language - Why I love it!
The D Programming Language - Why I love it!The D Programming Language - Why I love it!
The D Programming Language - Why I love it!
 
Extending the Xbase Typesystem
Extending the Xbase TypesystemExtending the Xbase Typesystem
Extending the Xbase Typesystem
 
D programming language
D programming languageD programming language
D programming language
 
Introduction to the rust programming language
Introduction to the rust programming languageIntroduction to the rust programming language
Introduction to the rust programming language
 
Go Programming Language (Golang)
Go Programming Language (Golang)Go Programming Language (Golang)
Go Programming Language (Golang)
 
Introduction to D programming language at Weka.IO
Introduction to D programming language at Weka.IOIntroduction to D programming language at Weka.IO
Introduction to D programming language at Weka.IO
 
Dart the better Javascript 2015
Dart the better Javascript 2015Dart the better Javascript 2015
Dart the better Javascript 2015
 
Serialization and performance by Sergey Morenets
Serialization and performance by Sergey MorenetsSerialization and performance by Sergey Morenets
Serialization and performance by Sergey Morenets
 
Introduction To Scala
Introduction To ScalaIntroduction To Scala
Introduction To Scala
 
IO Streams, Files and Directories
IO Streams, Files and DirectoriesIO Streams, Files and Directories
IO Streams, Files and Directories
 
Dart ppt
Dart pptDart ppt
Dart ppt
 
Scala : language of the future
Scala : language of the futureScala : language of the future
Scala : language of the future
 
DSLs in JavaScript
DSLs in JavaScriptDSLs in JavaScript
DSLs in JavaScript
 
Few simple-type-tricks in scala
Few simple-type-tricks in scalaFew simple-type-tricks in scala
Few simple-type-tricks in scala
 
Experience protocol buffer on android
Experience protocol buffer on androidExperience protocol buffer on android
Experience protocol buffer on android
 

Andere mochten auch

Avro Data | Washington DC HUG
Avro Data | Washington DC HUGAvro Data | Washington DC HUG
Avro Data | Washington DC HUGCloudera, Inc.
 
Type safe, versioned, and rewindable stream processing with Apache {Avro, K...
Type safe, versioned, and rewindable stream processing  with  Apache {Avro, K...Type safe, versioned, and rewindable stream processing  with  Apache {Avro, K...
Type safe, versioned, and rewindable stream processing with Apache {Avro, K...Hisham Mardam-Bey
 
Apache Flume
Apache FlumeApache Flume
Apache FlumeGetInData
 
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...StampedeCon
 
Internet Filtering In South Korea
Internet Filtering In South KoreaInternet Filtering In South Korea
Internet Filtering In South Koreamichroeder
 
Stress Management!
Stress Management!Stress Management!
Stress Management!Andeel Ali
 
Brochure gjav.indd
Brochure gjav.inddBrochure gjav.indd
Brochure gjav.inddGJAV
 
D.condicion juridica procesal de los extranjeros
D.condicion juridica procesal de los extranjerosD.condicion juridica procesal de los extranjeros
D.condicion juridica procesal de los extranjerosUniversidad de Sonora
 
Fast Food/Casual QSR Restaurant Tour of Sydney Australia 2014
Fast Food/Casual QSR Restaurant Tour of Sydney Australia 2014Fast Food/Casual QSR Restaurant Tour of Sydney Australia 2014
Fast Food/Casual QSR Restaurant Tour of Sydney Australia 2014Food Technical Consulting
 
Screaming Brain Studio Art Samples set1
Screaming Brain Studio Art Samples set1Screaming Brain Studio Art Samples set1
Screaming Brain Studio Art Samples set1screamingbrain
 
Cis 512 Week 4 Assignment
Cis 512 Week 4 AssignmentCis 512 Week 4 Assignment
Cis 512 Week 4 AssignmentVwilliams621
 
Pm glassy blue_earth
Pm glassy blue_earthPm glassy blue_earth
Pm glassy blue_earthThomas Jensen
 

Andere mochten auch (20)

3 apache-avro
3 apache-avro3 apache-avro
3 apache-avro
 
Avro Data | Washington DC HUG
Avro Data | Washington DC HUGAvro Data | Washington DC HUG
Avro Data | Washington DC HUG
 
Type safe, versioned, and rewindable stream processing with Apache {Avro, K...
Type safe, versioned, and rewindable stream processing  with  Apache {Avro, K...Type safe, versioned, and rewindable stream processing  with  Apache {Avro, K...
Type safe, versioned, and rewindable stream processing with Apache {Avro, K...
 
Avro
AvroAvro
Avro
 
Avro introduction
Avro introductionAvro introduction
Avro introduction
 
Apache Avro and You
Apache Avro and YouApache Avro and You
Apache Avro and You
 
Apache Flume
Apache FlumeApache Flume
Apache Flume
 
Scala Days NYC 2016
Scala Days NYC 2016Scala Days NYC 2016
Scala Days NYC 2016
 
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
 
Internet Filtering In South Korea
Internet Filtering In South KoreaInternet Filtering In South Korea
Internet Filtering In South Korea
 
Jiuzhou
JiuzhouJiuzhou
Jiuzhou
 
Opensat
OpensatOpensat
Opensat
 
Stress Management!
Stress Management!Stress Management!
Stress Management!
 
Brochure gjav.indd
Brochure gjav.inddBrochure gjav.indd
Brochure gjav.indd
 
D.condicion juridica procesal de los extranjeros
D.condicion juridica procesal de los extranjerosD.condicion juridica procesal de los extranjeros
D.condicion juridica procesal de los extranjeros
 
Elnet
ElnetElnet
Elnet
 
Fast Food/Casual QSR Restaurant Tour of Sydney Australia 2014
Fast Food/Casual QSR Restaurant Tour of Sydney Australia 2014Fast Food/Casual QSR Restaurant Tour of Sydney Australia 2014
Fast Food/Casual QSR Restaurant Tour of Sydney Australia 2014
 
Screaming Brain Studio Art Samples set1
Screaming Brain Studio Art Samples set1Screaming Brain Studio Art Samples set1
Screaming Brain Studio Art Samples set1
 
Cis 512 Week 4 Assignment
Cis 512 Week 4 AssignmentCis 512 Week 4 Assignment
Cis 512 Week 4 Assignment
 
Pm glassy blue_earth
Pm glassy blue_earthPm glassy blue_earth
Pm glassy blue_earth
 

Ähnlich wie Apache Avro in LivePerson [Hebrew]

[ CNCF Q1 2024 ] Intro to Continuous Profiling and Grafana Pyroscope.pdf
[ CNCF Q1 2024 ] Intro to Continuous Profiling and Grafana Pyroscope.pdf[ CNCF Q1 2024 ] Intro to Continuous Profiling and Grafana Pyroscope.pdf
[ CNCF Q1 2024 ] Intro to Continuous Profiling and Grafana Pyroscope.pdfSteve Caron
 
Enforcing API Design Rules for High Quality Code Generation
Enforcing API Design Rules for High Quality Code GenerationEnforcing API Design Rules for High Quality Code Generation
Enforcing API Design Rules for High Quality Code GenerationTim Burks
 
Data engineering Stl Big Data IDEA user group
Data engineering   Stl Big Data IDEA user groupData engineering   Stl Big Data IDEA user group
Data engineering Stl Big Data IDEA user groupAdam Doyle
 
Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018Wes McKinney
 
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-FormatsBig Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-FormatsGuido Schmutz
 
SplunkLive! Frankfurt 2018 - Data Onboarding Overview
SplunkLive! Frankfurt 2018 - Data Onboarding OverviewSplunkLive! Frankfurt 2018 - Data Onboarding Overview
SplunkLive! Frankfurt 2018 - Data Onboarding OverviewSplunk
 
Standardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonStandardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonRalf Gommers
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixC4Media
 
SplunkLive! Munich 2018: Data Onboarding Overview
SplunkLive! Munich 2018: Data Onboarding OverviewSplunkLive! Munich 2018: Data Onboarding Overview
SplunkLive! Munich 2018: Data Onboarding OverviewSplunk
 
Runtime Environment Of .Net Divya Rathore
Runtime Environment Of .Net Divya RathoreRuntime Environment Of .Net Divya Rathore
Runtime Environment Of .Net Divya RathoreEsha Yadav
 
Hack Like It's 2013 (The Workshop)
Hack Like It's 2013 (The Workshop)Hack Like It's 2013 (The Workshop)
Hack Like It's 2013 (The Workshop)Itzik Kotler
 
MacSysAdmin Conference 2019 - Logging
MacSysAdmin Conference 2019 - Logging MacSysAdmin Conference 2019 - Logging
MacSysAdmin Conference 2019 - Logging Henry Stamerjohann
 
Ursa Labs and Apache Arrow in 2019
Ursa Labs and Apache Arrow in 2019Ursa Labs and Apache Arrow in 2019
Ursa Labs and Apache Arrow in 2019Wes McKinney
 
MongoDB Use Cases: Healthcare, CMS, Analytics
MongoDB Use Cases: Healthcare, CMS, AnalyticsMongoDB Use Cases: Healthcare, CMS, Analytics
MongoDB Use Cases: Healthcare, CMS, AnalyticsMongoDB
 
Apache Drill @ PJUG, Jan 15, 2013
Apache Drill @ PJUG, Jan 15, 2013Apache Drill @ PJUG, Jan 15, 2013
Apache Drill @ PJUG, Jan 15, 2013Gera Shegalov
 
The Data Lake Engine Data Microservices in Spark using Apache Arrow Flight
The Data Lake Engine Data Microservices in Spark using Apache Arrow FlightThe Data Lake Engine Data Microservices in Spark using Apache Arrow Flight
The Data Lake Engine Data Microservices in Spark using Apache Arrow FlightDatabricks
 
WSO2Con ASIA 2016: WSO2 Analytics Platform: The One Stop Shop for All Your Da...
WSO2Con ASIA 2016: WSO2 Analytics Platform: The One Stop Shop for All Your Da...WSO2Con ASIA 2016: WSO2 Analytics Platform: The One Stop Shop for All Your Da...
WSO2Con ASIA 2016: WSO2 Analytics Platform: The One Stop Shop for All Your Da...WSO2
 

Ähnlich wie Apache Avro in LivePerson [Hebrew] (20)

Handout: 'Open Source Tools & Resources'
Handout: 'Open Source Tools & Resources'Handout: 'Open Source Tools & Resources'
Handout: 'Open Source Tools & Resources'
 
Introduction to Apache Beam
Introduction to Apache BeamIntroduction to Apache Beam
Introduction to Apache Beam
 
[ CNCF Q1 2024 ] Intro to Continuous Profiling and Grafana Pyroscope.pdf
[ CNCF Q1 2024 ] Intro to Continuous Profiling and Grafana Pyroscope.pdf[ CNCF Q1 2024 ] Intro to Continuous Profiling and Grafana Pyroscope.pdf
[ CNCF Q1 2024 ] Intro to Continuous Profiling and Grafana Pyroscope.pdf
 
Enforcing API Design Rules for High Quality Code Generation
Enforcing API Design Rules for High Quality Code GenerationEnforcing API Design Rules for High Quality Code Generation
Enforcing API Design Rules for High Quality Code Generation
 
Data engineering Stl Big Data IDEA user group
Data engineering   Stl Big Data IDEA user groupData engineering   Stl Big Data IDEA user group
Data engineering Stl Big Data IDEA user group
 
Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018
 
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-FormatsBig Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
 
SplunkLive! Frankfurt 2018 - Data Onboarding Overview
SplunkLive! Frankfurt 2018 - Data Onboarding OverviewSplunkLive! Frankfurt 2018 - Data Onboarding Overview
SplunkLive! Frankfurt 2018 - Data Onboarding Overview
 
Standardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for PythonStandardizing on a single N-dimensional array API for Python
Standardizing on a single N-dimensional array API for Python
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFix
 
SplunkLive! Munich 2018: Data Onboarding Overview
SplunkLive! Munich 2018: Data Onboarding OverviewSplunkLive! Munich 2018: Data Onboarding Overview
SplunkLive! Munich 2018: Data Onboarding Overview
 
Runtime Environment Of .Net Divya Rathore
Runtime Environment Of .Net Divya RathoreRuntime Environment Of .Net Divya Rathore
Runtime Environment Of .Net Divya Rathore
 
Hack Like It's 2013 (The Workshop)
Hack Like It's 2013 (The Workshop)Hack Like It's 2013 (The Workshop)
Hack Like It's 2013 (The Workshop)
 
MacSysAdmin Conference 2019 - Logging
MacSysAdmin Conference 2019 - Logging MacSysAdmin Conference 2019 - Logging
MacSysAdmin Conference 2019 - Logging
 
Ursa Labs and Apache Arrow in 2019
Ursa Labs and Apache Arrow in 2019Ursa Labs and Apache Arrow in 2019
Ursa Labs and Apache Arrow in 2019
 
MongoDB Use Cases: Healthcare, CMS, Analytics
MongoDB Use Cases: Healthcare, CMS, AnalyticsMongoDB Use Cases: Healthcare, CMS, Analytics
MongoDB Use Cases: Healthcare, CMS, Analytics
 
Apache Drill @ PJUG, Jan 15, 2013
Apache Drill @ PJUG, Jan 15, 2013Apache Drill @ PJUG, Jan 15, 2013
Apache Drill @ PJUG, Jan 15, 2013
 
The Data Lake Engine Data Microservices in Spark using Apache Arrow Flight
The Data Lake Engine Data Microservices in Spark using Apache Arrow FlightThe Data Lake Engine Data Microservices in Spark using Apache Arrow Flight
The Data Lake Engine Data Microservices in Spark using Apache Arrow Flight
 
WSO2Con ASIA 2016: WSO2 Analytics Platform: The One Stop Shop for All Your Da...
WSO2Con ASIA 2016: WSO2 Analytics Platform: The One Stop Shop for All Your Da...WSO2Con ASIA 2016: WSO2 Analytics Platform: The One Stop Shop for All Your Da...
WSO2Con ASIA 2016: WSO2 Analytics Platform: The One Stop Shop for All Your Da...
 
HUG France - Apache Drill
HUG France - Apache DrillHUG France - Apache Drill
HUG France - Apache Drill
 

Mehr von LivePerson

Microservices on top of kafka
Microservices on top of kafkaMicroservices on top of kafka
Microservices on top of kafkaLivePerson
 
Graph QL Introduction
Graph QL IntroductionGraph QL Introduction
Graph QL IntroductionLivePerson
 
Kubernetes your tests! automation with docker on google cloud platform
Kubernetes your tests! automation with docker on google cloud platformKubernetes your tests! automation with docker on google cloud platform
Kubernetes your tests! automation with docker on google cloud platformLivePerson
 
Growing into a proactive Data Platform
Growing into a proactive Data PlatformGrowing into a proactive Data Platform
Growing into a proactive Data PlatformLivePerson
 
Measure() or die()
Measure() or die() Measure() or die()
Measure() or die() LivePerson
 
Resilience from Theory to Practice
Resilience from Theory to PracticeResilience from Theory to Practice
Resilience from Theory to PracticeLivePerson
 
System Revolution- How We Did It
System Revolution- How We Did It System Revolution- How We Did It
System Revolution- How We Did It LivePerson
 
Liveperson DLD 2015
Liveperson DLD 2015 Liveperson DLD 2015
Liveperson DLD 2015 LivePerson
 
Http 2: Should I care?
Http 2: Should I care?Http 2: Should I care?
Http 2: Should I care?LivePerson
 
Mobile app real-time content modifications using websockets
Mobile app real-time content modifications using websocketsMobile app real-time content modifications using websockets
Mobile app real-time content modifications using websocketsLivePerson
 
Mobile SDK: Considerations & Best Practices
Mobile SDK: Considerations & Best Practices Mobile SDK: Considerations & Best Practices
Mobile SDK: Considerations & Best Practices LivePerson
 
Functional programming with Java 8
Functional programming with Java 8Functional programming with Java 8
Functional programming with Java 8LivePerson
 
Data compression in Modern Application
Data compression in Modern ApplicationData compression in Modern Application
Data compression in Modern ApplicationLivePerson
 
Support Office Hour Webinar - LivePerson API
Support Office Hour Webinar - LivePerson API Support Office Hour Webinar - LivePerson API
Support Office Hour Webinar - LivePerson API LivePerson
 
SIP - Introduction to SIP Protocol
SIP - Introduction to SIP ProtocolSIP - Introduction to SIP Protocol
SIP - Introduction to SIP ProtocolLivePerson
 
Scalding: Reaching Efficient MapReduce
Scalding: Reaching Efficient MapReduceScalding: Reaching Efficient MapReduce
Scalding: Reaching Efficient MapReduceLivePerson
 
Building Enterprise Level End-To-End Monitor System with Open Source Solution...
Building Enterprise Level End-To-End Monitor System with Open Source Solution...Building Enterprise Level End-To-End Monitor System with Open Source Solution...
Building Enterprise Level End-To-End Monitor System with Open Source Solution...LivePerson
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceLivePerson
 
From a Kafkaesque Story to The Promised Land at LivePerson
From a Kafkaesque Story to The Promised Land at LivePersonFrom a Kafkaesque Story to The Promised Land at LivePerson
From a Kafkaesque Story to The Promised Land at LivePersonLivePerson
 
How can A/B testing go wrong?
How can A/B testing go wrong?How can A/B testing go wrong?
How can A/B testing go wrong?LivePerson
 

Mehr von LivePerson (20)

Microservices on top of kafka
Microservices on top of kafkaMicroservices on top of kafka
Microservices on top of kafka
 
Graph QL Introduction
Graph QL IntroductionGraph QL Introduction
Graph QL Introduction
 
Kubernetes your tests! automation with docker on google cloud platform
Kubernetes your tests! automation with docker on google cloud platformKubernetes your tests! automation with docker on google cloud platform
Kubernetes your tests! automation with docker on google cloud platform
 
Growing into a proactive Data Platform
Growing into a proactive Data PlatformGrowing into a proactive Data Platform
Growing into a proactive Data Platform
 
Measure() or die()
Measure() or die() Measure() or die()
Measure() or die()
 
Resilience from Theory to Practice
Resilience from Theory to PracticeResilience from Theory to Practice
Resilience from Theory to Practice
 
System Revolution- How We Did It
System Revolution- How We Did It System Revolution- How We Did It
System Revolution- How We Did It
 
Liveperson DLD 2015
Liveperson DLD 2015 Liveperson DLD 2015
Liveperson DLD 2015
 
Http 2: Should I care?
Http 2: Should I care?Http 2: Should I care?
Http 2: Should I care?
 
Mobile app real-time content modifications using websockets
Mobile app real-time content modifications using websocketsMobile app real-time content modifications using websockets
Mobile app real-time content modifications using websockets
 
Mobile SDK: Considerations & Best Practices
Mobile SDK: Considerations & Best Practices Mobile SDK: Considerations & Best Practices
Mobile SDK: Considerations & Best Practices
 
Functional programming with Java 8
Functional programming with Java 8Functional programming with Java 8
Functional programming with Java 8
 
Data compression in Modern Application
Data compression in Modern ApplicationData compression in Modern Application
Data compression in Modern Application
 
Support Office Hour Webinar - LivePerson API
Support Office Hour Webinar - LivePerson API Support Office Hour Webinar - LivePerson API
Support Office Hour Webinar - LivePerson API
 
SIP - Introduction to SIP Protocol
SIP - Introduction to SIP ProtocolSIP - Introduction to SIP Protocol
SIP - Introduction to SIP Protocol
 
Scalding: Reaching Efficient MapReduce
Scalding: Reaching Efficient MapReduceScalding: Reaching Efficient MapReduce
Scalding: Reaching Efficient MapReduce
 
Building Enterprise Level End-To-End Monitor System with Open Source Solution...
Building Enterprise Level End-To-End Monitor System with Open Source Solution...Building Enterprise Level End-To-End Monitor System with Open Source Solution...
Building Enterprise Level End-To-End Monitor System with Open Source Solution...
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
From a Kafkaesque Story to The Promised Land at LivePerson
From a Kafkaesque Story to The Promised Land at LivePersonFrom a Kafkaesque Story to The Promised Land at LivePerson
From a Kafkaesque Story to The Promised Land at LivePerson
 
How can A/B testing go wrong?
How can A/B testing go wrong?How can A/B testing go wrong?
How can A/B testing go wrong?
 

Kürzlich hochgeladen

Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 

Kürzlich hochgeladen (20)

Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 

Apache Avro in LivePerson [Hebrew]

  • 1. Apache Avro in LivePerson Collecting and saving data is easy keeping it consistent is tough Sandwich club, Sep 2014 Amihay Zer-Kavod, Software Architect
  • 2. Who am I? Amihay Zer-Kavod Software Architect Been in software Since 1989
  • 4. Communication & Meaning ● Consistent but decoupled communication between services, such as: o Monitoring, Interaction o Predictive, Sentiment o RT Reporting & Analysis o Visitor History event evento 事件 घटना حدث ארוע событие ● Consistent meaning over time o BigData Store (Hadoop) o Offline Reporting & Analysis
  • 5. What shouldn’t we use? Don’t use Direct APIs! They are completely wrong for this subject: • They produce too much coupling between services • APIs are synchronous by nature • Adds irrelevant complexity to the called service
  • 6. What is needed? The Message is the API! ● A unified event model (schema) for all reported events ● Management tools for the unified schema ● Tools for sending events over the wire ● Tools for reading/writing event in big data ● Backward and forward compatibility
  • 7. The Event model From generic to specific structure with: • Common header - all common data to all events • Logical Entities - common header to all logical entities (such as Visitor) • Dynamic Specific headers • Specific Event body
  • 8. Apache Avro to the rescue ● Avro - a schema based serialization/deserialization framework ● Avro idl - schema definition language ● Avro file - Hadoop integration ● Avro schema resolution ● Apache Avro created by Doug Cutting
  • 9. Avro 101 - Data Structures ● Rich data structures ○ Primitives ■ null, int, long, boolean, float, double, bytes, string ○ Records ○ Map (string, Schema) ○ Arrays (Schema) ○ Enums ○ Unions
  • 10. Avro 101 - JSON Schema { "type": "record", "name": "Event", "namespace": "com.liveperson.example", "doc": "Example event", "fields":[ { "name": "id", "type": "string", "default": "Unknown"}, { "name": "time", "type": "long", "default": -1}, { "name": "color", "type": { "type": "enum", "name": "Color", "symbols": ["NO_COLOR", "BLUE", "BLACK", "WHITE", "PINK"] }, "default": "NO_COLOR" } ] }
  • 11. Avro 101 - Avro IDL Schema @namespace("com.liveperson.example") enum Color { NO_COLOR, BLUE, BLACK, WHITE, PINK } /** Example event */ @namespace("com.liveperson.example") record Event { string id = “Unknown”; long time = -1; Color color = "NO_COLOR"; }
  • 12. Avro 101 - Serialization ● JSON Serialization ● Binary serialization ○ int, long - variable length, Zig-zag encoding ○ float, double - 4,8 bytes respectively ○ string - long followed by UTF-8 bytes ○ map, array - unlimited size, use blocks ○ Unions - long index of the type
  • 13. Avro 101 - Generic vs. Specific ● SpecificDatumReader/Writer <T> ○ Static types ○ Code Generation: Java, C, C++, C#, Python, Ruby... ● GenericDatumReader/Writer <GenericRecord> ○ Dynamic types & access
  • 14. Avro 101 - Schema Resolution ● Writer schema must be always provided for decoding ● Reader can use its own schema ● Allows the reader and writer schema to evolve independently
  • 15. Avro vs... Technologies Protobuf Thrift Avro Created 2001 (2008) 2007 2009 Creator / Maintainer Google / Google Facebook / Apache Doug cutting / Apache Schema evolution Field Tag Field Tag Schema Static/Dynamic Yes/No Yes/No Yes/Yes Hadoop support No No Yes RPC No Yes Yes Used by Google Facebook, Cassandra Hadoop, Liveperson Lang support Good Great Good
  • 16. Backward & Forward Compatibility Avro schema evolution ● Avro supports resolution between two schemes ● Need to follow a set of rules: ● Every field must have a default value ● A field can be added (make sure to put a default value) ● Field types can not be changed (add a new field instead) ● enum symbols can be added but never removed
  • 17. Avro IDL - LivePerson Event /** Base for all LivePerson Events */ @namespace("com.liveperson.global") record LPEvent { /** Common Header of the event */ CommonHeader header = null; /** Logical entity details participating in this event - Visitor, Agent, etc... */ array<Participant> participants = null; /** Holding specific platform info as node name (machine) cluster Id etc... */ PlatformHeader platformSpecificHeader = null; /** Auditing Header, Optional - adds data for auditing of the events flow in the platform*/ union {null, AuditingHeader } auditingHeader = null; /** The event body */ EventBody eventBody = null; }
  • 18. Wait there is (much) more! M/R Migdalor
  • 19. How good does it work? ● Cyber Monday 2013 (one day) o More than 320,000 events per second o 7 Storm topologies consuming the events seconds from real time o 2TB of data saved to Hadoop ● 2014 preparation: o x2 number of events per second to ~640,000
  • 20. So how did we do it? 1. Use an event driven system, don’t use direct APIs 2. Create a unified schema for all events 3. Use Avro to implement the schema 4. Add some supporting infrastructure
  • 21. Questions ???? event evento 事件 घटना حدث ארוע событие
  • 22. Amihay Zer-Kavod You can contact me at: amihayz@liveperson.com LivePerson is hiring!