Adding Data Schemas to Snowplow
Big Data Budapest Meetup - 5 June 2014
  
Agenda today
1.  Introduction to Snowplow
2.  Evolution of Snowplow
3.  The answer: schema all the things!
4.  Snowplow roadmap
5.  Questions
  
Introduction to Snowplow
  
Snowplow is an open-source web and event analytics platform, first version released in early 2012
•  Co-founders Alex Dean and Yali Sassoon met at OpenX, the open-source ad technology business, in 2008
•  After leaving OpenX, Alex and Yali set up Keplar, a niche digital product and analytics consultancy
•  We released Snowplow as a skunkworks prototype at start of 2012: github.com/snowplow/snowplow
•  We started working full time on Snowplow in summer 2013
  
We wanted to take a fresh approach to web analytics
•  Your own web event data -> in your own data warehouse
•  Your own event data model
•  Slice / dice and mine the data in highly bespoke ways to answer your specific business questions
•  Plug in the broadest possible set of analysis tools to drive value from your data
[Diagram labels: Data pipeline; Data warehouse; Analyse your data in any analysis tool]
  
By spring 2013 we had arrived at a relatively stable batch-based processing architecture
[Diagram: Snowplow Hadoop data pipeline. Components shown:]
•  Website / webapp
•  JavaScript event tracker
•  CloudFront-based event collector or Clojure-based event collector
•  Amazon S3
•  Scalding-based enrichment on Hadoop
•  Amazon Redshift / PostgreSQL
  
Evolution of Snowplow
  
Snowplow is evolving from a web analytics platform into a general event analytics platform
[Diagram labels: Collect event data from any connected device; Data warehouse]
  
Web analysts work with a small number of event types – outside of web, the number of possible event types is… infinite
Web events:
•  Page view
•  Order
•  Add to basket
•  Page activity
All events:
•  Game saved
•  Machine broke
•  Car started
•  Spellcheck run
•  Screenshot taken
•  Fridge empty
•  App crashed
•  Disk full
•  SMS sent
•  Screen viewed
•  Tweet drafted
•  Player died
•  Taxi arrived
•  Phonecall ended
•  Cluster started
•  Till opened
•  Product returned
•  ∞
  
There are two historic approaches to dealing with the explosion of possible event types
•  Web analytics vendors: Custom Variables
•  Mobile and app analytics vendors: Schema-less JSONs
  
Custom variables are very restrictive
1.  Take a standard web event, like a page view:
    Page View
2.  and add custom variables until it becomes something totally different:
    Page View + vehicle=taxi23 + status=arrived = a “taxi arrived” event, kind of!
  
Schema-less JSONs are better, but they have a different set of problems
Issues with the event name:
•  Separate from the event properties
•  Not versioned
•  Not unique – HBO video played versus Brightcove video played
Lots of unanswered questions about the properties:
•  Is length required, and is it always a number?
•  Is id required, and is it always a string?
•  What other optional properties are allowed for a video play?
Other issues:
•  What if the developer accidentally starts sending “len” instead of “length”? The data will end up split across two separate fields
•  Why does the analyst need to keep an implicit schema in their head to analyze video played events?
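The “len” versus “length” problem can be made concrete with a small Python sketch. The event payloads and the property_types helper below are hypothetical, invented for illustration; they are not taken from any vendor's API.

```python
from collections import defaultdict

# Hypothetical schema-less "video played" events (illustrative payloads only).
events = [
    {"event": "video played", "properties": {"id": "abc", "length": 213}},
    {"event": "video played", "properties": {"id": "def", "len": 95}},       # developer typo
    {"event": "video played", "properties": {"id": "ghi", "length": "47"}},  # string, not number
]


def property_types(events):
    """Collect the set of Python type names seen for each property name."""
    seen = defaultdict(set)
    for event in events:
        for name, value in event["properties"].items():
            seen[name].add(type(value).__name__)
    return dict(seen)


types_seen = property_types(events)
```

Nothing rejects the typo or the string-valued length, so the analyst only discovers both problems later, when inspecting what actually arrived.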
  
The answer: schema all the things!
  
When a developer or analyst defines a new event in JSON, let’s ask them to create a JSON Schema for that event
Callouts on the example schema:
•  Additional optional field we might not know about otherwise
•  No other fields allowed
•  Yes length should always be a number
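A minimal sketch of what such a schema could look like, written as a Python dict. The id and length properties come from the questions on the earlier slide; the optional quality field is a hypothetical stand-in for the "additional optional field". The validate function is a hand-rolled illustration that understands only the three keywords used here (required, type, additionalProperties); a real pipeline would rely on a full JSON Schema validator.

```python
# Assumed JSON Schema for a "video played" event. "quality" is hypothetical.
VIDEO_PLAYED_SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "length": {"type": "number"},
        "quality": {"type": "string"},
    },
    "required": ["id", "length"],
    "additionalProperties": False,
}

# Maps JSON Schema type names to Python types (only the two used above).
_TYPES = {"string": str, "number": (int, float)}


def validate(event, schema):
    """Return a list of error strings; an empty list means the event is valid."""
    errors = []
    for name in schema.get("required", []):
        if name not in event:
            errors.append(f"missing required property: {name}")
    for name, value in event.items():
        spec = schema["properties"].get(name)
        if spec is None:
            if schema.get("additionalProperties") is False:
                errors.append(f"unexpected property: {name}")
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, _TYPES[spec["type"]]):
            errors.append(f"wrong type for {name}: expected {spec['type']}")
    return errors
```

With this in place, an event that sends "len" instead of "length" fails twice: once for the missing required property and once for the unexpected one.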
  
But we need to let our event definitions evolve, so let’s add versioning – we’re calling this SchemaVer
MODEL-REVISION-ADDITION
•  Start versioning at 1-0-0 – so 1-0-0, 1-0-1, 1-0-2, 1-1-0 etc
•  Try to stick to backwards-compatible ADDITION upgrades as much as possible
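A short Python sketch of handling MODEL-REVISION-ADDITION strings. The SchemaVer class and is_addition_upgrade helper are hypothetical names chosen for this illustration, not part of any Snowplow library.

```python
from typing import NamedTuple


class SchemaVer(NamedTuple):
    model: int
    revision: int
    addition: int

    @classmethod
    def parse(cls, text: str) -> "SchemaVer":
        """Parse a MODEL-REVISION-ADDITION string such as "1-0-2"."""
        parts = text.split("-")
        if len(parts) != 3 or not all(p.isdigit() for p in parts):
            raise ValueError(f"not a SchemaVer: {text!r}")
        return cls(*(int(p) for p in parts))

    def __str__(self) -> str:
        return f"{self.model}-{self.revision}-{self.addition}"


def is_addition_upgrade(old: SchemaVer, new: SchemaVer) -> bool:
    """True when MODEL and REVISION are unchanged and only ADDITION increased."""
    return (old.model, old.revision) == (new.model, new.revision) and new.addition > old.addition
```

Because a NamedTuple compares field by field, sorting a list of parsed versions orders them by MODEL, then REVISION, then ADDITION.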
  
Where are our schemas going to live? We need a schema repository/registry
[Diagram: a schema repo {} supporting each stage of the pipeline]
Pipeline: Raw events in JSON format -> Enrichment Manager -> Enriched events in Thrift or Avro format -> Shredder -> Enriched events in TSV ready for loading into db
Uses of the schema repo:
1. Test instrumentation
2. Validate events
3. Define structure
4. Drive shredding
5. Define structure
  
We need to namespace our schemas properly to prevent clashes and confusion in our schema repository
iglu:com.channel2.vod/video_played/jsonschema/1-0-0
•  com.channel2.vod: the vendor of this event
•  video_played: event name
•  jsonschema: schema format
•  1-0-0: schema version
We are calling our schema methodology “Iglu”
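The four parts of the schema reference can be pulled apart mechanically. The sketch below assumes the layout shown on the slide (iglu: prefix, then vendor, event name, format and version separated by slashes); SchemaKey and parse_iglu_uri are hypothetical names for this illustration.

```python
from typing import NamedTuple


class SchemaKey(NamedTuple):
    vendor: str
    name: str
    format: str
    version: str


def parse_iglu_uri(uri: str) -> SchemaKey:
    """Split iglu:<vendor>/<name>/<format>/<version> into its four parts."""
    prefix = "iglu:"
    if not uri.startswith(prefix):
        raise ValueError(f"missing iglu: prefix: {uri!r}")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected vendor/name/format/version: {uri!r}")
    return SchemaKey(*parts)


key = parse_iglu_uri("iglu:com.channel2.vod/video_played/jsonschema/1-0-0")
```

Namespacing by vendor is what keeps one company's video_played from clashing with another's in the same repository.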
  
Bringing it all together, let’s now make the event JSONs self-describing, with a schema header and data body
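A sketch of what a self-describing event could look like in Python. The key names "schema" and "data" are assumptions that mirror the slide's "schema header and data body" wording; the property values are made up for illustration.

```python
import json


def make_self_describing(schema_uri: str, data: dict) -> dict:
    """Wrap an event body with a header naming the schema that describes it."""
    return {"schema": schema_uri, "data": data}


def split_self_describing(event: dict):
    """Return (schema_uri, data), raising if either part is missing."""
    if set(event) != {"schema", "data"}:
        raise ValueError("expected exactly the keys 'schema' and 'data'")
    return event["schema"], event["data"]


event = make_self_describing(
    "iglu:com.channel2.vod/video_played/jsonschema/1-0-0",
    {"id": "abc", "length": 213},  # illustrative values
)
encoded = json.dumps(event, sort_keys=True)
schema_uri, data = split_self_describing(json.loads(encoded))
```

Since the event carries its own schema reference, any downstream stage can look up the right schema without the analyst keeping it in their head.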
  
And for good measure, let’s add in our schema information into the JSON Schema itself
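One way to picture this is a schema that carries its own vendor, name, format and version. In the sketch below the "self" key and its four field names are assumptions that mirror the four parts of the iglu: reference; the slide's actual schema did not survive extraction. The iglu_uri helper is hypothetical.

```python
# Assumed shape of a self-describing JSON Schema; the "self" block is an assumption.
VIDEO_PLAYED_SCHEMA = {
    "self": {
        "vendor": "com.channel2.vod",
        "name": "video_played",
        "format": "jsonschema",
        "version": "1-0-0",
    },
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "length": {"type": "number"},
    },
    "required": ["id", "length"],
    "additionalProperties": False,
}


def iglu_uri(schema: dict) -> str:
    """Build the iglu: reference that events should use to point at this schema."""
    s = schema["self"]
    return f"iglu:{s['vendor']}/{s['name']}/{s['format']}/{s['version']}"


uri = iglu_uri(VIDEO_PLAYED_SCHEMA)
```

The schema file then identifies itself wherever it is stored, rather than depending on its path in the repository.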
  	
  
Snowplow roadmap
  
Self-describing JSON Schemas are coming in the next release of Snowplow
  
We are also starting to define third-party events for Snowplow integration, starting with Zendesk customer support events
  
Questions?
http://snowplowanalytics.com
https://github.com/snowplow/snowplow
@snowplowdata
To chat – @alexcrdean on Twitter or alex@snowplowanalytics.com
  

Weitere ähnliche Inhalte

Was ist angesagt?

Mô hình Trại nuôi dế
Mô hình Trại nuôi dếMô hình Trại nuôi dế
Mô hình Trại nuôi dếLong Khủng
 
thuyết minh dự án nhà máy phân bón hữu cơ
thuyết minh dự án nhà máy phân bón hữu cơthuyết minh dự án nhà máy phân bón hữu cơ
thuyết minh dự án nhà máy phân bón hữu cơLẬP DỰ ÁN VIỆT
 
Tư vấn lập dự án Xây dựng Khu sản xuất Nông nghiệp công nghệ cao trong nhà mà...
Tư vấn lập dự án Xây dựng Khu sản xuất Nông nghiệp công nghệ cao trong nhà mà...Tư vấn lập dự án Xây dựng Khu sản xuất Nông nghiệp công nghệ cao trong nhà mà...
Tư vấn lập dự án Xây dựng Khu sản xuất Nông nghiệp công nghệ cao trong nhà mà...CTY CP TƯ VẤN ĐẦU TƯ THẢO NGUYÊN XANH
 
Version Control with Git
Version Control with GitVersion Control with Git
Version Control with GitLuigi De Russis
 
Installing and Invoking Oracle Data Integrator (ODI) Public Web Services (whi...
Installing and Invoking Oracle Data Integrator (ODI) Public Web Services (whi...Installing and Invoking Oracle Data Integrator (ODI) Public Web Services (whi...
Installing and Invoking Oracle Data Integrator (ODI) Public Web Services (whi...Revelation Technologies
 
GitHub Actions in action
GitHub Actions in actionGitHub Actions in action
GitHub Actions in actionOleksii Holub
 
Thuyết minh dự án nhà máy sản xuất dầu thực vật
Thuyết minh dự án nhà máy sản xuất dầu thực vậtThuyết minh dự án nhà máy sản xuất dầu thực vật
Thuyết minh dự án nhà máy sản xuất dầu thực vậtLẬP DỰ ÁN VIỆT
 
Angular Web Programlama
Angular Web ProgramlamaAngular Web Programlama
Angular Web ProgramlamaCihan Özhan
 
Tìm hiểu quy trình sản xuất tinh bột sắn tại nhà máy fococev thừa thiên huế
Tìm hiểu quy trình  sản xuất tinh bột sắn tại nhà máy fococev thừa thiên huếTìm hiểu quy trình  sản xuất tinh bột sắn tại nhà máy fococev thừa thiên huế
Tìm hiểu quy trình sản xuất tinh bột sắn tại nhà máy fococev thừa thiên huếThanh Hoa
 
Advanced Schema Design Patterns
Advanced Schema Design Patterns Advanced Schema Design Patterns
Advanced Schema Design Patterns MongoDB
 
Cong nge say phun va ung dung trong san xuat thucpham _do an thuc pham
Cong nge say phun va ung dung trong san xuat thucpham _do an thuc phamCong nge say phun va ung dung trong san xuat thucpham _do an thuc pham
Cong nge say phun va ung dung trong san xuat thucpham _do an thuc phamLinh Linpine
 
Kiểm thử bảo mật web
Kiểm thử bảo mật webKiểm thử bảo mật web
Kiểm thử bảo mật webMinh Tri Nguyen
 
The Performance Engineer's Guide To HotSpot Just-in-Time Compilation
The Performance Engineer's Guide To HotSpot Just-in-Time CompilationThe Performance Engineer's Guide To HotSpot Just-in-Time Compilation
The Performance Engineer's Guide To HotSpot Just-in-Time CompilationMonica Beckwith
 
Dự án Nhà máy sản xuất sợi dệt kết hợp
Dự án Nhà máy sản xuất sợi dệt kết hợpDự án Nhà máy sản xuất sợi dệt kết hợp
Dự án Nhà máy sản xuất sợi dệt kết hợpLẬP DỰ ÁN VIỆT
 

Was ist angesagt? (20)

Dự án kho lạnh 0918755356
Dự án kho lạnh 0918755356Dự án kho lạnh 0918755356
Dự án kho lạnh 0918755356
 
Mô hình Trại nuôi dế
Mô hình Trại nuôi dếMô hình Trại nuôi dế
Mô hình Trại nuôi dế
 
thuyết minh dự án nhà máy phân bón hữu cơ
thuyết minh dự án nhà máy phân bón hữu cơthuyết minh dự án nhà máy phân bón hữu cơ
thuyết minh dự án nhà máy phân bón hữu cơ
 
Serving ML easily with FastAPI - meme version
Serving ML easily with FastAPI - meme versionServing ML easily with FastAPI - meme version
Serving ML easily with FastAPI - meme version
 
Tư vấn lập dự án Xây dựng Khu sản xuất Nông nghiệp công nghệ cao trong nhà mà...
Tư vấn lập dự án Xây dựng Khu sản xuất Nông nghiệp công nghệ cao trong nhà mà...Tư vấn lập dự án Xây dựng Khu sản xuất Nông nghiệp công nghệ cao trong nhà mà...
Tư vấn lập dự án Xây dựng Khu sản xuất Nông nghiệp công nghệ cao trong nhà mà...
 
BÀI MẪU Luận văn thạc sĩ khoa học máy tính, 9 ĐIỂM
BÀI MẪU Luận văn thạc sĩ khoa học máy tính, 9 ĐIỂMBÀI MẪU Luận văn thạc sĩ khoa học máy tính, 9 ĐIỂM
BÀI MẪU Luận văn thạc sĩ khoa học máy tính, 9 ĐIỂM
 
Version Control with Git
Version Control with GitVersion Control with Git
Version Control with Git
 
Installing and Invoking Oracle Data Integrator (ODI) Public Web Services (whi...
Installing and Invoking Oracle Data Integrator (ODI) Public Web Services (whi...Installing and Invoking Oracle Data Integrator (ODI) Public Web Services (whi...
Installing and Invoking Oracle Data Integrator (ODI) Public Web Services (whi...
 
Luận văn: Chiến lược thu hút đầu tư vào Khu công nghiệp dệt may
Luận văn: Chiến lược thu hút đầu tư vào Khu công nghiệp dệt mayLuận văn: Chiến lược thu hút đầu tư vào Khu công nghiệp dệt may
Luận văn: Chiến lược thu hút đầu tư vào Khu công nghiệp dệt may
 
GitHub Actions in action
GitHub Actions in actionGitHub Actions in action
GitHub Actions in action
 
Thuyết minh dự án đầu tư Trồng chè và chế biến chè Công nghệ Ô long tỉnh Lạng...
Thuyết minh dự án đầu tư Trồng chè và chế biến chè Công nghệ Ô long tỉnh Lạng...Thuyết minh dự án đầu tư Trồng chè và chế biến chè Công nghệ Ô long tỉnh Lạng...
Thuyết minh dự án đầu tư Trồng chè và chế biến chè Công nghệ Ô long tỉnh Lạng...
 
Thuyết minh dự án nhà máy sản xuất dầu thực vật
Thuyết minh dự án nhà máy sản xuất dầu thực vậtThuyết minh dự án nhà máy sản xuất dầu thực vật
Thuyết minh dự án nhà máy sản xuất dầu thực vật
 
Angular Web Programlama
Angular Web ProgramlamaAngular Web Programlama
Angular Web Programlama
 
Tìm hiểu quy trình sản xuất tinh bột sắn tại nhà máy fococev thừa thiên huế
Tìm hiểu quy trình  sản xuất tinh bột sắn tại nhà máy fococev thừa thiên huếTìm hiểu quy trình  sản xuất tinh bột sắn tại nhà máy fococev thừa thiên huế
Tìm hiểu quy trình sản xuất tinh bột sắn tại nhà máy fococev thừa thiên huế
 
Advanced Schema Design Patterns
Advanced Schema Design Patterns Advanced Schema Design Patterns
Advanced Schema Design Patterns
 
Cong nge say phun va ung dung trong san xuat thucpham _do an thuc pham
Cong nge say phun va ung dung trong san xuat thucpham _do an thuc phamCong nge say phun va ung dung trong san xuat thucpham _do an thuc pham
Cong nge say phun va ung dung trong san xuat thucpham _do an thuc pham
 
Kiểm thử bảo mật web
Kiểm thử bảo mật webKiểm thử bảo mật web
Kiểm thử bảo mật web
 
The Performance Engineer's Guide To HotSpot Just-in-Time Compilation
The Performance Engineer's Guide To HotSpot Just-in-Time CompilationThe Performance Engineer's Guide To HotSpot Just-in-Time Compilation
The Performance Engineer's Guide To HotSpot Just-in-Time Compilation
 
Dự án Nhà máy sản xuất sợi dệt kết hợp
Dự án Nhà máy sản xuất sợi dệt kết hợpDự án Nhà máy sản xuất sợi dệt kết hợp
Dự án Nhà máy sản xuất sợi dệt kết hợp
 
Công nghệ sản xuất bia va malt
Công nghệ sản xuất bia va maltCông nghệ sản xuất bia va malt
Công nghệ sản xuất bia va malt
 

Andere mochten auch

Snowplow at Sigfig
Snowplow at SigfigSnowplow at Sigfig
Snowplow at Sigfigyalisassoon
 
From Zero to Hadoop: a tutorial for getting started writing Hadoop jobs on Am...
From Zero to Hadoop: a tutorial for getting started writing Hadoop jobs on Am...From Zero to Hadoop: a tutorial for getting started writing Hadoop jobs on Am...
From Zero to Hadoop: a tutorial for getting started writing Hadoop jobs on Am...Alexander Dean
 
Implementing improved and consistent arbitrary event tracking company-wide us...
Implementing improved and consistent arbitrary event tracking company-wide us...Implementing improved and consistent arbitrary event tracking company-wide us...
Implementing improved and consistent arbitrary event tracking company-wide us...yalisassoon
 
Snowplow the evolving data pipeline
Snowplow   the evolving data pipelineSnowplow   the evolving data pipeline
Snowplow the evolving data pipelineyalisassoon
 
Using Snowplow for A/B testing and user journey analysis at CustomMade
Using Snowplow for A/B testing and user journey analysis at CustomMadeUsing Snowplow for A/B testing and user journey analysis at CustomMade
Using Snowplow for A/B testing and user journey analysis at CustomMadeyalisassoon
 
Yali presentation for snowplow amsterdam meetup number 2
Yali presentation for snowplow amsterdam meetup number 2Yali presentation for snowplow amsterdam meetup number 2
Yali presentation for snowplow amsterdam meetup number 2yalisassoon
 
Snowplow: where we came from and where we are going - March 2016
Snowplow: where we came from and where we are going - March 2016Snowplow: where we came from and where we are going - March 2016
Snowplow: where we came from and where we are going - March 2016yalisassoon
 
Snowplow: putting digital analysts at the heart of digital analytics - the fo...
Snowplow: putting digital analysts at the heart of digital analytics - the fo...Snowplow: putting digital analysts at the heart of digital analytics - the fo...
Snowplow: putting digital analysts at the heart of digital analytics - the fo...yalisassoon
 
Modeling event data
Modeling event dataModeling event data
Modeling event datayalisassoon
 
Lean Product Analytics by Dan Olsen
Lean Product Analytics by Dan OlsenLean Product Analytics by Dan Olsen
Lean Product Analytics by Dan OlsenDan Olsen
 
Understanding event data
Understanding event dataUnderstanding event data
Understanding event datayalisassoon
 
A KPI framework for startups
A KPI framework for startupsA KPI framework for startups
A KPI framework for startupsyalisassoon
 
File Format Benchmarks - Avro, JSON, ORC, & Parquet
File Format Benchmarks - Avro, JSON, ORC, & ParquetFile Format Benchmarks - Avro, JSON, ORC, & Parquet
File Format Benchmarks - Avro, JSON, ORC, & ParquetOwen O'Malley
 

Andere mochten auch (13)

Snowplow at Sigfig
Snowplow at SigfigSnowplow at Sigfig
Snowplow at Sigfig
 
From Zero to Hadoop: a tutorial for getting started writing Hadoop jobs on Am...
From Zero to Hadoop: a tutorial for getting started writing Hadoop jobs on Am...From Zero to Hadoop: a tutorial for getting started writing Hadoop jobs on Am...
From Zero to Hadoop: a tutorial for getting started writing Hadoop jobs on Am...
 
Implementing improved and consistent arbitrary event tracking company-wide us...
Implementing improved and consistent arbitrary event tracking company-wide us...Implementing improved and consistent arbitrary event tracking company-wide us...
Implementing improved and consistent arbitrary event tracking company-wide us...
 
Snowplow the evolving data pipeline
Snowplow   the evolving data pipelineSnowplow   the evolving data pipeline
Snowplow the evolving data pipeline
 
Using Snowplow for A/B testing and user journey analysis at CustomMade
Using Snowplow for A/B testing and user journey analysis at CustomMadeUsing Snowplow for A/B testing and user journey analysis at CustomMade
Using Snowplow for A/B testing and user journey analysis at CustomMade
 
Yali presentation for snowplow amsterdam meetup number 2
Yali presentation for snowplow amsterdam meetup number 2Yali presentation for snowplow amsterdam meetup number 2
Yali presentation for snowplow amsterdam meetup number 2
 
Snowplow: where we came from and where we are going - March 2016
Snowplow: where we came from and where we are going - March 2016Snowplow: where we came from and where we are going - March 2016
Snowplow: where we came from and where we are going - March 2016
 
Snowplow: putting digital analysts at the heart of digital analytics - the fo...
Snowplow: putting digital analysts at the heart of digital analytics - the fo...Snowplow: putting digital analysts at the heart of digital analytics - the fo...
Snowplow: putting digital analysts at the heart of digital analytics - the fo...
 
Modeling event data
Modeling event dataModeling event data
Modeling event data
 
Lean Product Analytics by Dan Olsen
Lean Product Analytics by Dan OlsenLean Product Analytics by Dan Olsen
Lean Product Analytics by Dan Olsen
 
Understanding event data
Understanding event dataUnderstanding event data
Understanding event data
 
A KPI framework for startups
A KPI framework for startupsA KPI framework for startups
A KPI framework for startups
 
File Format Benchmarks - Avro, JSON, ORC, & Parquet
File Format Benchmarks - Avro, JSON, ORC, & ParquetFile Format Benchmarks - Avro, JSON, ORC, & Parquet
File Format Benchmarks - Avro, JSON, ORC, & Parquet
 

Ähnlich wie Big data meetup budapest adding data schemas to snowplow

Big Data Beers - Introducing Snowplow
Big Data Beers - Introducing SnowplowBig Data Beers - Introducing Snowplow
Big Data Beers - Introducing SnowplowAlexander Dean
 
Scala eXchange: Building robust data pipelines in Scala
Scala eXchange: Building robust data pipelines in ScalaScala eXchange: Building robust data pipelines in Scala
Scala eXchange: Building robust data pipelines in ScalaAlexander Dean
 
[2C6]Everyplay_Big_Data
[2C6]Everyplay_Big_Data[2C6]Everyplay_Big_Data
[2C6]Everyplay_Big_DataNAVER D2
 
SpringOne 2016 in a nutshell
SpringOne 2016 in a nutshellSpringOne 2016 in a nutshell
SpringOne 2016 in a nutshellJeroen Resoort
 
Snowplow presentation for Amsterdam Meetup #3
Snowplow presentation for Amsterdam Meetup #3Snowplow presentation for Amsterdam Meetup #3
Snowplow presentation for Amsterdam Meetup #3Snowplow Analytics
 
ECS19 Elio Struyf - Setting Up Your SPFx CI/CD pipelines on Azure DevOps
ECS19 Elio Struyf - Setting Up Your SPFx CI/CD pipelines on Azure DevOpsECS19 Elio Struyf - Setting Up Your SPFx CI/CD pipelines on Azure DevOps
ECS19 Elio Struyf - Setting Up Your SPFx CI/CD pipelines on Azure DevOpsEuropean Collaboration Summit
 
Integrating Splunk into your Spring Applications
Integrating Splunk into your Spring ApplicationsIntegrating Splunk into your Spring Applications
Integrating Splunk into your Spring ApplicationsDamien Dallimore
 
WSO2Con EU 2015: An Introduction to the WSO2 Data Analytics Platform
WSO2Con EU 2015: An Introduction to the WSO2 Data Analytics PlatformWSO2Con EU 2015: An Introduction to the WSO2 Data Analytics Platform
WSO2Con EU 2015: An Introduction to the WSO2 Data Analytics PlatformWSO2
 
Snowplow: open source game analytics powered by AWS
Snowplow: open source game analytics powered by AWSSnowplow: open source game analytics powered by AWS
Snowplow: open source game analytics powered by AWSGiuseppe Gaviani
 
Make Cross-platform Mobile Apps Quickly - SIGGRAPH 2014
Make Cross-platform Mobile Apps Quickly - SIGGRAPH 2014Make Cross-platform Mobile Apps Quickly - SIGGRAPH 2014
Make Cross-platform Mobile Apps Quickly - SIGGRAPH 2014Gil Irizarry
 
AD113 Speed Up Your Applications w/ Nginx and PageSpeed
AD113  Speed Up Your Applications w/ Nginx and PageSpeedAD113  Speed Up Your Applications w/ Nginx and PageSpeed
AD113 Speed Up Your Applications w/ Nginx and PageSpeededm00se
 
Lambda Architectures in Practice
Lambda Architectures in PracticeLambda Architectures in Practice
Lambda Architectures in PracticeC4Media
 
DevOps on AWS: Accelerating Software Delivery with the AWS Developer Tools
DevOps on AWS: Accelerating Software Delivery with the AWS Developer ToolsDevOps on AWS: Accelerating Software Delivery with the AWS Developer Tools
DevOps on AWS: Accelerating Software Delivery with the AWS Developer ToolsAmazon Web Services
 
Cloud Big Data Architectures
Cloud Big Data ArchitecturesCloud Big Data Architectures
Cloud Big Data ArchitecturesLynn Langit
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogC4Media
 
German introduction to sp framework
German   introduction to sp frameworkGerman   introduction to sp framework
German introduction to sp frameworkBob German
 
How Open Source Embiggens Salesforce.com
How Open Source Embiggens Salesforce.comHow Open Source Embiggens Salesforce.com
How Open Source Embiggens Salesforce.comSalesforce Engineering
 
Develop modern apps using Spring ecosystem at time of BigData
Develop modern apps using Spring ecosystem at time of BigData Develop modern apps using Spring ecosystem at time of BigData
Develop modern apps using Spring ecosystem at time of BigData Oleg Tsal-Tsalko
 
Dev ops on aws deep dive on continuous delivery - Toronto
Dev ops on aws deep dive on continuous delivery - TorontoDev ops on aws deep dive on continuous delivery - Toronto
Dev ops on aws deep dive on continuous delivery - TorontoAmazon Web Services
 

Ähnlich wie Big data meetup budapest adding data schemas to snowplow (20)

Big Data Beers - Introducing Snowplow
Big Data Beers - Introducing SnowplowBig Data Beers - Introducing Snowplow
Big Data Beers - Introducing Snowplow
 
Scala eXchange: Building robust data pipelines in Scala
Scala eXchange: Building robust data pipelines in ScalaScala eXchange: Building robust data pipelines in Scala
Scala eXchange: Building robust data pipelines in Scala
 
[2C6]Everyplay_Big_Data
[2C6]Everyplay_Big_Data[2C6]Everyplay_Big_Data
[2C6]Everyplay_Big_Data
 
SpringOne 2016 in a nutshell
SpringOne 2016 in a nutshellSpringOne 2016 in a nutshell
SpringOne 2016 in a nutshell
 
Snowplow presentation for Amsterdam Meetup #3
Snowplow presentation for Amsterdam Meetup #3Snowplow presentation for Amsterdam Meetup #3
Snowplow presentation for Amsterdam Meetup #3
 
ECS19 Elio Struyf - Setting Up Your SPFx CI/CD pipelines on Azure DevOps
ECS19 Elio Struyf - Setting Up Your SPFx CI/CD pipelines on Azure DevOpsECS19 Elio Struyf - Setting Up Your SPFx CI/CD pipelines on Azure DevOps
ECS19 Elio Struyf - Setting Up Your SPFx CI/CD pipelines on Azure DevOps
 
Integrating Splunk into your Spring Applications
Integrating Splunk into your Spring ApplicationsIntegrating Splunk into your Spring Applications
Integrating Splunk into your Spring Applications
 
WSO2Con EU 2015: An Introduction to the WSO2 Data Analytics Platform
WSO2Con EU 2015: An Introduction to the WSO2 Data Analytics PlatformWSO2Con EU 2015: An Introduction to the WSO2 Data Analytics Platform
WSO2Con EU 2015: An Introduction to the WSO2 Data Analytics Platform
 
Snowplow: open source game analytics powered by AWS
Snowplow: open source game analytics powered by AWSSnowplow: open source game analytics powered by AWS
Snowplow: open source game analytics powered by AWS
 
Make Cross-platform Mobile Apps Quickly - SIGGRAPH 2014
Make Cross-platform Mobile Apps Quickly - SIGGRAPH 2014Make Cross-platform Mobile Apps Quickly - SIGGRAPH 2014
Make Cross-platform Mobile Apps Quickly - SIGGRAPH 2014
 
BDA311 Introduction to AWS Glue
BDA311 Introduction to AWS GlueBDA311 Introduction to AWS Glue
BDA311 Introduction to AWS Glue
 
AD113 Speed Up Your Applications w/ Nginx and PageSpeed
AD113  Speed Up Your Applications w/ Nginx and PageSpeedAD113  Speed Up Your Applications w/ Nginx and PageSpeed
AD113 Speed Up Your Applications w/ Nginx and PageSpeed
 
Lambda Architectures in Practice
Lambda Architectures in PracticeLambda Architectures in Practice
Lambda Architectures in Practice
 
DevOps on AWS: Accelerating Software Delivery with the AWS Developer Tools
DevOps on AWS: Accelerating Software Delivery with the AWS Developer ToolsDevOps on AWS: Accelerating Software Delivery with the AWS Developer Tools
DevOps on AWS: Accelerating Software Delivery with the AWS Developer Tools
 
Cloud Big Data Architectures
Cloud Big Data ArchitecturesCloud Big Data Architectures
Cloud Big Data Architectures
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @Datadog
 
German introduction to sp framework
German   introduction to sp frameworkGerman   introduction to sp framework
German introduction to sp framework
 
How Open Source Embiggens Salesforce.com
How Open Source Embiggens Salesforce.comHow Open Source Embiggens Salesforce.com
How Open Source Embiggens Salesforce.com
 
Develop modern apps using Spring ecosystem at time of BigData
Develop modern apps using Spring ecosystem at time of BigData Develop modern apps using Spring ecosystem at time of BigData
Develop modern apps using Spring ecosystem at time of BigData
 
Dev ops on aws deep dive on continuous delivery - Toronto
Dev ops on aws deep dive on continuous delivery - TorontoDev ops on aws deep dive on continuous delivery - Toronto
Dev ops on aws deep dive on continuous delivery - Toronto
 

Mehr von yalisassoon

Snowplow: evolve your analytics stack with your business
Snowplow: evolve your analytics stack with your businessSnowplow: evolve your analytics stack with your business
Snowplow: evolve your analytics stack with your businessyalisassoon
 
2016 09 measurecamp - event data modeling
2016 09 measurecamp - event data modeling2016 09 measurecamp - event data modeling
2016 09 measurecamp - event data modelingyalisassoon
 
Capturing online customer data to create better insights and targeted actions...
Capturing online customer data to create better insights and targeted actions...Capturing online customer data to create better insights and targeted actions...
Capturing online customer data to create better insights and targeted actions...yalisassoon
 
Snowplow at DA Hub emerging technology showcase
Snowplow at DA Hub emerging technology showcaseSnowplow at DA Hub emerging technology showcase
Snowplow at DA Hub emerging technology showcaseyalisassoon
 
Analytics at Carbonite: presentation to Snowplow Meetup Boston April 2016
Analytics at Carbonite: presentation to Snowplow Meetup Boston April 2016Analytics at Carbonite: presentation to Snowplow Meetup Boston April 2016
Analytics at Carbonite: presentation to Snowplow Meetup Boston April 2016yalisassoon
 
The analytics journey at Viewbix - how they came to use Snowplow and the setu...
The analytics journey at Viewbix - how they came to use Snowplow and the setu...The analytics journey at Viewbix - how they came to use Snowplow and the setu...
The analytics journey at Viewbix - how they came to use Snowplow and the setu...yalisassoon
 
Snowplow Analytics and Looker at Oyster.com
Snowplow Analytics and Looker at Oyster.comSnowplow Analytics and Looker at Oyster.com
Snowplow Analytics and Looker at Oyster.comyalisassoon
 
Snowplow is at the core of everything we do
Snowplow is at the core of everything we doSnowplow is at the core of everything we do
Snowplow is at the core of everything we doyalisassoon
 
Chefsfeed presentation to Snowplow Meetup San Francisco, Oct 2015
Chefsfeed presentation to Snowplow Meetup San Francisco, Oct 2015Chefsfeed presentation to Snowplow Meetup San Francisco, Oct 2015
Chefsfeed presentation to Snowplow Meetup San Francisco, Oct 2015yalisassoon
 
Modelling event data in look ml
Modelling event data in look mlModelling event data in look ml
Big data meetup budapest adding data schemas to snowplow

  • 1. Adding Data Schemas to Snowplow. Big Data Budapest Meetup, 5 June 2014
  • 2. Agenda today
      1. Introduction to Snowplow
      2. Evolution of Snowplow
      3. The answer: schema all the things!
      4. Snowplow roadmap
      5. Questions
  • 4. Snowplow is an open-source web and event analytics platform, first version released in early 2012
      • Co-founders Alex Dean and Yali Sassoon met at OpenX, the open-source ad technology business, in 2008
      • After leaving OpenX, Alex and Yali set up Keplar, a niche digital product and analytics consultancy
      • We released Snowplow as a skunkworks prototype at the start of 2012: github.com/snowplow/snowplow
      • We started working full time on Snowplow in summer 2013
  • 5. We wanted to take a fresh approach to web analytics
      • Your own web event data -> in your own data warehouse
      • Your own event data model
      • Slice / dice and mine the data in highly bespoke ways to answer your specific business questions
      • Plug in the broadest possible set of analysis tools to drive value from your data
      Diagram labels: Data pipeline; Data warehouse; Analyse your data in any analysis tool
  • 6. By spring 2013 we had arrived at a relatively stable batch-based processing architecture
      Diagram labels: Website / webapp; JavaScript event tracker; CloudFront-based event collector or Clojure-based event collector; Amazon S3; Scalding-based enrichment on Hadoop; Amazon Redshift / PostgreSQL; Snowplow Hadoop data pipeline
  • 8. Snowplow is evolving from a web analytics platform into a general event analytics platform
      Diagram labels: Collect event data from any connected device; Data warehouse
  • 9. Web analysts work with a small number of event types; outside of web, the number of possible event types is… infinite
      Web events:
      • Page view
      • Order
      • Add to basket
      • Page activity
      All events (∞):
      • Game saved
      • Machine broke
      • Car started
      • Spellcheck run
      • Screenshot taken
      • Fridge empty
      • App crashed
      • Disk full
      • SMS sent
      • Screen viewed
      • Tweet drafted
      • Player died
      • Taxi arrived
      • Phonecall ended
      • Cluster started
      • Till opened
      • Product returned
  • 10. There are two historic approaches to dealing with the explosion of possible event types
      • Web analytics vendors: Custom Variables
      • Mobile and app analytics vendors: Schema-less JSONs
  • 11. Custom variables are very restrictive
      1. Take a standard web event, like a page view: Page View
      2. and add custom variables until it becomes something totally different: Page View + vehicle=taxi23 + status=arrived = a "taxi arrived" event, kind of!
  • 12. Schema-less JSONs are better, but they have a different set of problems
      Issues with the event name:
      • Separate from the event properties
      • Not versioned
      • Not unique: HBO video played versus Brightcove video played
      Lots of unanswered questions about the properties:
      • Is length required, and is it always a number?
      • Is id required, and is it always a string?
      • What other optional properties are allowed for a video play?
      Other issues:
      • What if the developer accidentally starts sending "len" instead of "length"? The data will end up split across two separate fields
      • Why does the analyst need to keep an implicit schema in their head to analyze video played events?
  • 13. The answer: schema all the things!
  • 14. When a developer or analyst defines a new event in JSON, let's ask them to create a JSON Schema for that event
      Annotations on the example schema:
      • Yes, length should always be a number
      • Additional optional field we might not know about otherwise
      • No other fields allowed
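The example schema on this slide is an image and does not survive in the transcript. As a rough sketch of the idea, the Python below writes a schema for the video played event as a dict and checks events against it. The property names `id` and `length` come from the previous slide; the optional `quality` field, the choice to mark both `id` and `length` as required, and the `validate` helper are assumptions made for illustration. The helper handles only the three keywords used here and is not a full JSON Schema validator.

```python
# Hypothetical JSON Schema for a "video played" event, written as a Python dict.
VIDEO_PLAYED_SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "length": {"type": "number"},
        "quality": {"type": "string"},  # hypothetical optional field
    },
    "required": ["id", "length"],       # assumption: both are mandatory
    "additionalProperties": False,      # "no other fields allowed"
}

# Mapping from the JSON Schema type names used above to Python types.
_TYPES = {"string": str, "number": (int, float)}


def validate(event, schema):
    """Return a list of error messages (empty if the event is valid).

    Supports only "properties", "required" and "additionalProperties".
    """
    errors = []
    props = schema["properties"]
    for name in schema.get("required", []):
        if name not in event:
            errors.append(f"missing required property: {name}")
    for name, value in event.items():
        if name not in props:
            if schema.get("additionalProperties", True) is False:
                errors.append(f"unexpected property: {name}")
            continue
        type_name = props[name]["type"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, _TYPES[type_name]):
            errors.append(f"wrong type for {name}: expected {type_name}")
    return errors


if __name__ == "__main__":
    print(validate({"id": "v1", "length": 213.5}, VIDEO_PLAYED_SCHEMA))
    print(validate({"id": "v1", "len": 213}, VIDEO_PLAYED_SCHEMA))
```

With a schema like this, the "len" instead of "length" mistake from slide 12 is rejected at validation time instead of silently splitting the data across two fields.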
  • 15. But we need to let our event definitions evolve, so let's add versioning. We're calling this SchemaVer
      MODEL-REVISION-ADDITION
      • Start versioning at 1-0-0, so 1-0-0, 1-0-1, 1-0-2, 1-1-0 etc
      • Try to stick to backwards-compatible ADDITION upgrades as much as possible
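To make the MODEL-REVISION-ADDITION format concrete, here is a minimal Python sketch that parses and orders version strings. The `SchemaVer` class and the `is_addition_upgrade` helper are hypothetical names, not part of any Snowplow library. The only rules encoded are the ones stated on the slide: three hyphen-separated numbers, versioning starts at 1-0-0, and an ADDITION upgrade changes only the last component.

```python
import re
from typing import NamedTuple

_SCHEMAVER = re.compile(r"(\d+)-(\d+)-(\d+)", re.ASCII)


class SchemaVer(NamedTuple):
    """A MODEL-REVISION-ADDITION version; tuple ordering gives version ordering."""
    model: int
    revision: int
    addition: int

    @classmethod
    def parse(cls, text):
        match = _SCHEMAVER.fullmatch(text)
        if match is None:
            raise ValueError(f"not a SchemaVer string: {text!r}")
        model, revision, addition = (int(part) for part in match.groups())
        if model < 1:
            raise ValueError("versioning starts at 1-0-0")
        return cls(model, revision, addition)

    def __str__(self):
        return f"{self.model}-{self.revision}-{self.addition}"


def is_addition_upgrade(old, new):
    """True if only the ADDITION component increased (backwards-compatible)."""
    return (old.model, old.revision) == (new.model, new.revision) \
        and new.addition > old.addition


if __name__ == "__main__":
    versions = [SchemaVer.parse(v) for v in ["1-1-0", "1-0-2", "1-0-0", "1-0-1"]]
    print([str(v) for v in sorted(versions)])
    print(is_addition_upgrade(SchemaVer.parse("1-0-0"), SchemaVer.parse("1-0-1")))
```

Hyphens rather than dots keep these strings visually distinct from software version numbers, and the named tuple gives correct ordering without extra comparison code.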
  • 16. Where are our schemas going to live? We need a schema repository/registry
      Diagram labels: Schema repo {}; Enrichment Manager; Raw events in JSON format; Enriched events in Thrift or Avro format; Shredder; Enriched events in TSV ready for loading into db
      Uses of the schema repo shown in the diagram:
      1. Test instrumentation
      2. Validate events
      3. Define structure
      4. Drive shredding
      5. Define structure
  • 17. We need to namespace our schemas properly to prevent clashes and confusion in our schema repository. We are calling our schema methodology "Iglu"
      iglu:com.channel2.vod/video_played/jsonschema/1-0-0
      • com.channel2.vod: the vendor of this event
      • video_played: event name
      • jsonschema: schema format
      • 1-0-0: schema version
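The four parts of the URI can be pulled apart mechanically. The Python sketch below parses the example from the slide. The `SchemaKey` type, the `parse_iglu_uri` function and the exact regular expression are illustrative assumptions, not an official Iglu client API.

```python
import re
from typing import NamedTuple


class SchemaKey(NamedTuple):
    vendor: str   # who defined the event, e.g. com.channel2.vod
    name: str     # event name, e.g. video_played
    format: str   # schema format, e.g. jsonschema
    version: str  # SchemaVer string, e.g. 1-0-0


# Assumed shape: iglu:<vendor>/<name>/<format>/<MODEL-REVISION-ADDITION>
_IGLU_URI = re.compile(r"iglu:([^/]+)/([^/]+)/([^/]+)/(\d+-\d+-\d+)", re.ASCII)


def parse_iglu_uri(uri):
    """Split an Iglu schema URI into its four namespaced parts."""
    match = _IGLU_URI.fullmatch(uri)
    if match is None:
        raise ValueError(f"not a valid Iglu schema URI: {uri!r}")
    return SchemaKey(*match.groups())


if __name__ == "__main__":
    key = parse_iglu_uri("iglu:com.channel2.vod/video_played/jsonschema/1-0-0")
    print(key.vendor, key.name, key.format, key.version)
```

Because the vendor is part of the key, two vendors can each define their own video_played event without clashing in the same repository.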
  • 18. Bringing it all together, let's now make the event JSONs self-describing, with a schema header and data body
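The example JSON on this slide is an image and is missing from the transcript. As a hedged sketch of "schema header and data body", the Python below wraps an event's properties together with its schema URI and unwraps them again. The key names `schema` and `data`, both function names, and the sample property values are assumptions for illustration.

```python
import json


def make_self_describing(schema_uri, data):
    """Wrap event properties with a header naming the schema they follow."""
    return {"schema": schema_uri, "data": data}


def split_self_describing(payload):
    """Parse a self-describing JSON string into (schema_uri, data)."""
    event = json.loads(payload)
    if not isinstance(event, dict) or set(event) != {"schema", "data"}:
        raise ValueError("expected an object with exactly 'schema' and 'data'")
    return event["schema"], event["data"]


if __name__ == "__main__":
    wrapped = make_self_describing(
        "iglu:com.channel2.vod/video_played/jsonschema/1-0-0",
        {"id": "v1", "length": 213},  # hypothetical property values
    )
    payload = json.dumps(wrapped)
    schema_uri, data = split_self_describing(payload)
    print(schema_uri)
    print(data)
```

Since the event carries its own schema reference, a downstream consumer can look up the right schema and validate the data body without any out-of-band knowledge of the event type.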
  • 19. And for good measure, let's add our schema information into the JSON Schema itself
  • 21. Self-describing JSON Schemas are coming in the next release of Snowplow
  • 22. We are also starting to define third-party events for Snowplow integration, starting with Zendesk customer support events
  • 23. Questions?
      http://snowplowanalytics.com
      https://github.com/snowplow/snowplow
      @snowplowdata
      To chat: @alexcrdean on Twitter or alex@snowplowanalytics.com