SlideShare ist ein Scribd-Unternehmen logo
1 von 57
Downloaden Sie, um offline zu lesen
javier ramirez
@supercoco9
API analytics
with Redis
and BigQuery
javier ramirez @supercoco9 https://teowaki.com api days 14
REST API (Ruby on Rails)
+
Web on top (AngularJS)
Use a
hosted
solution
questions?
api days 14
javier ramirez @supercoco9 https://teowaki.com api days 14
data that’s an order of
magnitude greater than
data you’re accustomed
to
javier ramirez @supercoco9 https://teowaki.com api days2014
Doug Laney
VP Research, Business Analytics and Performance Management at Gartner
data that exceeds the
processing capacity of
conventional database
systems. The data is too big,
moves too fast, or doesn’t fit
the structures of your
database architectures.
Ed Dumbill
program chair for the O’Reilly Strata Conference
javier ramirez @supercoco9 https://teowaki.com api days2014
bigdata is doing a
fullscan to 330MM rows,
matching them against a
regexp, and getting the
result (223MM rows) in
just 5 seconds
javier ramirez @supercoco9 https://teowaki.com api days2014
Javier Ramirez
impresionable teowaki founder
1. non intrusive metrics
2. keep the history
3. avoid vendor lock-in
4. interactive queries
5. cheap
6. extra ball: real time
javier ramirez @supercoco9 https://teowaki.com api days 14
javier ramirez @supercoco9 https://teowaki.com api days2014
open source, BSD licensed, advanced
key-value store. It is often referred to as a
data structure server since keys can contain
strings, hashes, lists, sets, sorted sets and
hyperloglogs.
http://redis.io
started in 2009 by Salvatore Sanfilippo @antirez
112 contributors at https://github.com/antirez/redis
javier ramirez @supercoco9 https://teowaki.com api days2014
twitter
stackoverflow
pinterest
booking.com
World of Warcraft
YouPorn
HipChat
Snapchat
javier ramirez @supercoco9 https://teowaki.com api days 14
ntopng
LogStash
Intel(R) Xeon(R) CPU E5520 @ 2.27GHz (with pipelining)
$ ./redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -P 16 -q
SET: 552,028 requests per second
GET: 707,463 requests per second
LPUSH: 767,459 requests per second
LPOP: 770,119 requests per second
Intel(R) Xeon(R) CPU E5520 @ 2.27GHz (without pipelining)
$ ./redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -q
SET: 122,556 requests per second
GET: 123,601 requests per second
LPUSH: 136,752 requests per second
LPOP: 132,424 requests per second
javier ramirez @supercoco9 https://teowaki.com api days2014
javier ramirez @supercoco9 https://teowaki.com api days 14
Non intrusive metrics
Capture data really fast.
Then send the data on
the background
javier ramirez @supercoco9 https://teowaki.com api days2014
Redis keeps
everything in
memory
all the time
javier ramirez @supercoco9 https://teowaki.com api days2014
javier ramirez @supercoco9 https://teowaki.com api days 14
Gzip to
AWS S3/Glacier
or
Google Cloud Storage
javier ramirez @supercoco9 https://teowaki.com api days 14
javier ramirez @supercoco9 https://teowaki.com api days 14
Hadoop
Cassandra
Amazon Redshift
...
javier ramirez @supercoco9 https://teowaki.com api days 14
tools we considered:
but...
hard to set up and monitor
expensive cluster
not interactive enough
javier ramirez @supercoco9 https://teowaki.com api days 14
Our choice:
Google BigQuery
Data analysis as a service
http://developers.google.com/bigquery
javier ramirez @supercoco9 https://teowaki.com api days 14
Based on “Dremel”
Specifically designed for
interactive queries over
petabytes of real-time
data
javier ramirez @supercoco9 https://teowaki.com api days 14
loading data
You just send the data in
text (or JSON) format
javier ramirez @supercoco9 https://teowaki.com api days 14
SQL
javier ramirez @supercoco9 https://teowaki.com api days 14
select name from USERS order by date;
select count(*) from users;
select max(date) from USERS;
select sum(total) from ORDERS group by user;
specific extensions for
analytics
javier ramirez @supercoco9 https://teowaki.com api days 14
within
flatten
nest
stddev
top
first
last
nth
variance
var_pop
var_samp
covar_pop
covar_samp
quantiles
web console screenshot
javier ramirez @supercoco9 https://teowaki.com api days 14
javier ramirez @supercoco9 https://teowaki.com api days 14
window functions
javier ramirez @supercoco9 https://teowaki.com api days 14
our most active user
javier ramirez @supercoco9 https://teowaki.com api days 14
country segmented traffic
javier ramirez @supercoco9 https://teowaki.com api days 14
10 request we should be caching
correlations.
not to mistake with
causality
javier ramirez @supercoco9 https://teowaki.com api days 14
Things you always wanted to
try but were too scared to
javier ramirez @supercoco9 https://teowaki.com api days 14
select count(*) from
publicdata:samples.wikipedia
where REGEXP_MATCH(title, "[0-9]*")
AND wp_namespace = 0;
223,163,387
Query complete (5.6s elapsed, 9.13 GB processed, Cost: 32¢)
javier ramirez @supercoco9 http://teowaki.com api days2014
5 most created resources
select uri, count(*) total from
stats where method = 'POST'
group by URI;
javier ramirez @supercoco9 http://teowaki.com api days2014
...but
/users/javier/shouts
/users/rgo/shouts
/teams/javier-community/links
/teams/nosqlmatters-cgn/links
javier ramirez @supercoco9 http://teowaki.com api days2014
5 most created resources
new users per month
SELECT repository_name, repository_language,
repository_description, COUNT(repository_name) as cnt,
repository_url
FROM github.timeline
WHERE type="WatchEvent"
AND PARSE_UTC_USEC(created_at) >=
PARSE_UTC_USEC("#{yesterday} 20:00:00")
AND repository_url IN (
SELECT repository_url
FROM github.timeline
WHERE type="CreateEvent"
AND PARSE_UTC_USEC(repository_created_at) >=
PARSE_UTC_USEC('#{yesterday} 20:00:00')
AND repository_fork = "false"
AND payload_ref_type = "repository"
GROUP BY repository_url
)
GROUP BY repository_name, repository_language,
repository_description, repository_url
HAVING cnt >= 5
ORDER BY cnt DESC
LIMIT 25
Automation with Apps Script
Read from bigquery
Create a spreadsheet on Drive
E-mail it everyday as a PDF
javier ramirez @supercoco9 https://teowaki.com api days 14
bigquery pricing
$26 per stored TB
1000000 rows => $0.00416 / month
£0.00243 / month
$5 per processed TB
1 full scan = 160 MB
1 count = 0 MB
1 full scan over 1 column = 5.4 MB
100 GB => $0.05 / month £0.03javier ramirez @supercoco9 https://teowaki.com api days 14
£0.054307 / month*
per 1MM rows
*the 1st
1TB every month are free of charge
javier ramirez @supercoco9 https://teowaki.com api days 14
1. non intrusive metrics
2. keep the history
3. avoid vendor lock-in
4. interactive queries
5. cheap
6. extra ball: real time
javier ramirez @supercoco9 https://teowaki.com api days 14
Find related links at
https://teowaki.com/teams/javier-community/link-categories/bigquery-talk
Thanks!
Gr ciesà
Javier Ramírez
@supercoco9
api days 14

Weitere ähnliche Inhalte

Andere mochten auch

The Lincoln Institue - 10 Ways to Regenerate America's Legacy Cities
The Lincoln Institue - 10 Ways to Regenerate America's Legacy CitiesThe Lincoln Institue - 10 Ways to Regenerate America's Legacy Cities
The Lincoln Institue - 10 Ways to Regenerate America's Legacy CitiesCassidy Swanson
 
Enhance the browser_experience
Enhance the browser_experienceEnhance the browser_experience
Enhance the browser_experienceHTML5 Spain
 
I want to be an efficient developer - APIdays Barcelona version
I want to be an efficient developer - APIdays Barcelona versionI want to be an efficient developer - APIdays Barcelona version
I want to be an efficient developer - APIdays Barcelona versionQuentin Adam
 
APIfying the Web with import.io (at APIdays mediterranea)
APIfying the Web with import.io (at APIdays mediterranea)APIfying the Web with import.io (at APIdays mediterranea)
APIfying the Web with import.io (at APIdays mediterranea)Ignacio Elola Villar
 
AIL Platform APIDays Mediterranea
AIL Platform APIDays MediterraneaAIL Platform APIDays Mediterranea
AIL Platform APIDays MediterraneaJoan Protasio
 
The importance of /me
The importance of /meThe importance of /me
The importance of /meBruno Pedro
 
Battelfield REST, API Development from the trenches
Battelfield REST, API Development from the trenchesBattelfield REST, API Development from the trenches
Battelfield REST, API Development from the trenchesDaniel Cerecedo
 
8 . Valle de Ricote - Ricote - la sierra
8 . Valle de Ricote -  Ricote - la sierra8 . Valle de Ricote -  Ricote - la sierra
8 . Valle de Ricote - Ricote - la sierramaestriko
 
Fichas proyectos fsc
Fichas proyectos fscFichas proyectos fsc
Fichas proyectos fscForo Abierto
 
Erfolgsfaktoren zur Auswahl und Organisation von studentischen Projektstudien
Erfolgsfaktoren zur Auswahl und Organisation von studentischen ProjektstudienErfolgsfaktoren zur Auswahl und Organisation von studentischen Projektstudien
Erfolgsfaktoren zur Auswahl und Organisation von studentischen ProjektstudienMichael Groeschel
 
Mano a Mano - Charla sobre TICE para maestras
Mano a Mano - Charla sobre TICE para maestrasMano a Mano - Charla sobre TICE para maestras
Mano a Mano - Charla sobre TICE para maestrasFernando Cormenzana
 

Andere mochten auch (15)

Patent wars, Innovation, Roads
Patent wars, Innovation, RoadsPatent wars, Innovation, Roads
Patent wars, Innovation, Roads
 
The Lincoln Institue - 10 Ways to Regenerate America's Legacy Cities
The Lincoln Institue - 10 Ways to Regenerate America's Legacy CitiesThe Lincoln Institue - 10 Ways to Regenerate America's Legacy Cities
The Lincoln Institue - 10 Ways to Regenerate America's Legacy Cities
 
Enhance the browser_experience
Enhance the browser_experienceEnhance the browser_experience
Enhance the browser_experience
 
I want to be an efficient developer - APIdays Barcelona version
I want to be an efficient developer - APIdays Barcelona versionI want to be an efficient developer - APIdays Barcelona version
I want to be an efficient developer - APIdays Barcelona version
 
APIfying the Web with import.io (at APIdays mediterranea)
APIfying the Web with import.io (at APIdays mediterranea)APIfying the Web with import.io (at APIdays mediterranea)
APIfying the Web with import.io (at APIdays mediterranea)
 
AIL Platform APIDays Mediterranea
AIL Platform APIDays MediterraneaAIL Platform APIDays Mediterranea
AIL Platform APIDays Mediterranea
 
Build a Restfull app using drupal
Build a Restfull app using drupalBuild a Restfull app using drupal
Build a Restfull app using drupal
 
The importance of /me
The importance of /meThe importance of /me
The importance of /me
 
Battelfield REST, API Development from the trenches
Battelfield REST, API Development from the trenchesBattelfield REST, API Development from the trenches
Battelfield REST, API Development from the trenches
 
8 . Valle de Ricote - Ricote - la sierra
8 . Valle de Ricote -  Ricote - la sierra8 . Valle de Ricote -  Ricote - la sierra
8 . Valle de Ricote - Ricote - la sierra
 
Fichas proyectos fsc
Fichas proyectos fscFichas proyectos fsc
Fichas proyectos fsc
 
7 Reglas para una campaña de email marketing efectiva
7 Reglas para una campaña de email marketing efectiva7 Reglas para una campaña de email marketing efectiva
7 Reglas para una campaña de email marketing efectiva
 
Erfolgsfaktoren zur Auswahl und Organisation von studentischen Projektstudien
Erfolgsfaktoren zur Auswahl und Organisation von studentischen ProjektstudienErfolgsfaktoren zur Auswahl und Organisation von studentischen Projektstudien
Erfolgsfaktoren zur Auswahl und Organisation von studentischen Projektstudien
 
Mano a Mano - Charla sobre TICE para maestras
Mano a Mano - Charla sobre TICE para maestrasMano a Mano - Charla sobre TICE para maestras
Mano a Mano - Charla sobre TICE para maestras
 
Black metal
Black metalBlack metal
Black metal
 

Mehr von javier ramirez

¿Se puede vivir del open source? T3chfest
¿Se puede vivir del open source? T3chfest¿Se puede vivir del open source? T3chfest
¿Se puede vivir del open source? T3chfestjavier ramirez
 
QuestDB: The building blocks of a fast open-source time-series database
QuestDB: The building blocks of a fast open-source time-series databaseQuestDB: The building blocks of a fast open-source time-series database
QuestDB: The building blocks of a fast open-source time-series databasejavier ramirez
 
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...javier ramirez
 
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...javier ramirez
 
Deduplicating and analysing time-series data with Apache Beam and QuestDB
Deduplicating and analysing time-series data with Apache Beam and QuestDBDeduplicating and analysing time-series data with Apache Beam and QuestDB
Deduplicating and analysing time-series data with Apache Beam and QuestDBjavier ramirez
 
Your Database Cannot Do this (well)
Your Database Cannot Do this (well)Your Database Cannot Do this (well)
Your Database Cannot Do this (well)javier ramirez
 
Your Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic DatabaseYour Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic Databasejavier ramirez
 
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...javier ramirez
 
QuestDB-Community-Call-20220728
QuestDB-Community-Call-20220728QuestDB-Community-Call-20220728
QuestDB-Community-Call-20220728javier ramirez
 
Processing and analysing streaming data with Python. Pycon Italy 2022
Processing and analysing streaming  data with Python. Pycon Italy 2022Processing and analysing streaming  data with Python. Pycon Italy 2022
Processing and analysing streaming data with Python. Pycon Italy 2022javier ramirez
 
QuestDB: ingesting a million time series per second on a single instance. Big...
QuestDB: ingesting a million time series per second on a single instance. Big...QuestDB: ingesting a million time series per second on a single instance. Big...
QuestDB: ingesting a million time series per second on a single instance. Big...javier ramirez
 
Servicios e infraestructura de AWS y la próxima región en Aragón
Servicios e infraestructura de AWS y la próxima región en AragónServicios e infraestructura de AWS y la próxima región en Aragón
Servicios e infraestructura de AWS y la próxima región en Aragónjavier ramirez
 
Primeros pasos en desarrollo serverless
Primeros pasos en desarrollo serverlessPrimeros pasos en desarrollo serverless
Primeros pasos en desarrollo serverlessjavier ramirez
 
How AWS is reinventing the cloud
How AWS is reinventing the cloudHow AWS is reinventing the cloud
How AWS is reinventing the cloudjavier ramirez
 
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMAnalitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMjavier ramirez
 
Getting started with streaming analytics
Getting started with streaming analyticsGetting started with streaming analytics
Getting started with streaming analyticsjavier ramirez
 
Getting started with streaming analytics: Setting up a pipeline
Getting started with streaming analytics: Setting up a pipelineGetting started with streaming analytics: Setting up a pipeline
Getting started with streaming analytics: Setting up a pipelinejavier ramirez
 
Getting started with streaming analytics: Deep Dive
Getting started with streaming analytics: Deep DiveGetting started with streaming analytics: Deep Dive
Getting started with streaming analytics: Deep Divejavier ramirez
 
Getting started with streaming analytics: streaming basics (1 of 3)
Getting started with streaming analytics: streaming basics (1 of 3)Getting started with streaming analytics: streaming basics (1 of 3)
Getting started with streaming analytics: streaming basics (1 of 3)javier ramirez
 
Monitorización de seguridad y detección de amenazas con AWS
Monitorización de seguridad y detección de amenazas con AWSMonitorización de seguridad y detección de amenazas con AWS
Monitorización de seguridad y detección de amenazas con AWSjavier ramirez
 

Mehr von javier ramirez (20)

¿Se puede vivir del open source? T3chfest
¿Se puede vivir del open source? T3chfest¿Se puede vivir del open source? T3chfest
¿Se puede vivir del open source? T3chfest
 
QuestDB: The building blocks of a fast open-source time-series database
QuestDB: The building blocks of a fast open-source time-series databaseQuestDB: The building blocks of a fast open-source time-series database
QuestDB: The building blocks of a fast open-source time-series database
 
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
 
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
 
Deduplicating and analysing time-series data with Apache Beam and QuestDB
Deduplicating and analysing time-series data with Apache Beam and QuestDBDeduplicating and analysing time-series data with Apache Beam and QuestDB
Deduplicating and analysing time-series data with Apache Beam and QuestDB
 
Your Database Cannot Do this (well)
Your Database Cannot Do this (well)Your Database Cannot Do this (well)
Your Database Cannot Do this (well)
 
Your Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic DatabaseYour Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic Database
 
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
 
QuestDB-Community-Call-20220728
QuestDB-Community-Call-20220728QuestDB-Community-Call-20220728
QuestDB-Community-Call-20220728
 
Processing and analysing streaming data with Python. Pycon Italy 2022
Processing and analysing streaming  data with Python. Pycon Italy 2022Processing and analysing streaming  data with Python. Pycon Italy 2022
Processing and analysing streaming data with Python. Pycon Italy 2022
 
QuestDB: ingesting a million time series per second on a single instance. Big...
QuestDB: ingesting a million time series per second on a single instance. Big...QuestDB: ingesting a million time series per second on a single instance. Big...
QuestDB: ingesting a million time series per second on a single instance. Big...
 
Servicios e infraestructura de AWS y la próxima región en Aragón
Servicios e infraestructura de AWS y la próxima región en AragónServicios e infraestructura de AWS y la próxima región en Aragón
Servicios e infraestructura de AWS y la próxima región en Aragón
 
Primeros pasos en desarrollo serverless
Primeros pasos en desarrollo serverlessPrimeros pasos en desarrollo serverless
Primeros pasos en desarrollo serverless
 
How AWS is reinventing the cloud
How AWS is reinventing the cloudHow AWS is reinventing the cloud
How AWS is reinventing the cloud
 
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMAnalitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
 
Getting started with streaming analytics
Getting started with streaming analyticsGetting started with streaming analytics
Getting started with streaming analytics
 
Getting started with streaming analytics: Setting up a pipeline
Getting started with streaming analytics: Setting up a pipelineGetting started with streaming analytics: Setting up a pipeline
Getting started with streaming analytics: Setting up a pipeline
 
Getting started with streaming analytics: Deep Dive
Getting started with streaming analytics: Deep DiveGetting started with streaming analytics: Deep Dive
Getting started with streaming analytics: Deep Dive
 
Getting started with streaming analytics: streaming basics (1 of 3)
Getting started with streaming analytics: streaming basics (1 of 3)Getting started with streaming analytics: streaming basics (1 of 3)
Getting started with streaming analytics: streaming basics (1 of 3)
 
Monitorización de seguridad y detección de amenazas con AWS
Monitorización de seguridad y detección de amenazas con AWSMonitorización de seguridad y detección de amenazas con AWS
Monitorización de seguridad y detección de amenazas con AWS
 

Kürzlich hochgeladen

%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benonimasabamasaba
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension AidPhilip Schwarz
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...masabamasaba
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplatePresentation.STUDIO
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfonteinmasabamasaba
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...masabamasaba
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park masabamasaba
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfonteinmasabamasaba
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is insideshinachiaurasa2
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Bert Jan Schrijver
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...masabamasaba
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...chiefasafspells
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...masabamasaba
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisamasabamasaba
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2
 

Kürzlich hochgeladen (20)

%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
Abortion Pills In Pretoria ](+27832195400*)[ 🏥 Women's Abortion Clinic In Pre...
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go Platformless
 

api analytics using Redis, BigQuery and AppsScripts by Javier Ramirez from teowaki (Apidays Mediterranea)

  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7. javier ramirez @supercoco9 https://teowaki.com api days 14 REST API (Ruby on Rails) + Web on top (AngularJS)
  • 10. javier ramirez @supercoco9 https://teowaki.com api days 14
  • 11. data that’s an order of magnitude greater than data you’re accustomed to javier ramirez @supercoco9 https://teowaki.com api days2014 Doug Laney VP Research, Business Analytics and Performance Management at Gartner
  • 12. data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the structures of your database architectures. Ed Dumbill program chair for the O’Reilly Strata Conference javier ramirez @supercoco9 https://teowaki.com api days2014
  • 13. bigdata is doing a fullscan to 330MM rows, matching them against a regexp, and getting the result (223MM rows) in just 5 seconds javier ramirez @supercoco9 https://teowaki.com api days2014 Javier Ramirez impresionable teowaki founder
  • 14. 1. non intrusive metrics 2. keep the history 3. avoid vendor lock-in 4. interactive queries 5. cheap 6. extra ball: real time javier ramirez @supercoco9 https://teowaki.com api days 14
  • 15. javier ramirez @supercoco9 https://teowaki.com api days2014
  • 16. open source, BSD licensed, advanced key-value store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets, sorted sets and hyperloglogs. http://redis.io started in 2009 by Salvatore Sanfilippo @antirez 112 contributors at https://github.com/antirez/redis javier ramirez @supercoco9 https://teowaki.com api days2014
  • 17. twitter stackoverflow pinterest booking.com World of Warcraft YouPorn HipChat Snapchat javier ramirez @supercoco9 https://teowaki.com api days 14 ntopng LogStash
  • 18. Intel(R) Xeon(R) CPU E5520 @ 2.27GHz (with pipelining) $ ./redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -P 16 -q SET: 552,028 requests per second GET: 707,463 requests per second LPUSH: 767,459 requests per second LPOP: 770,119 requests per second Intel(R) Xeon(R) CPU E5520 @ 2.27GHz (without pipelining) $ ./redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -q SET: 122,556 requests per second GET: 123,601 requests per second LPUSH: 136,752 requests per second LPOP: 132,424 requests per second javier ramirez @supercoco9 https://teowaki.com api days2014
  • 19. javier ramirez @supercoco9 https://teowaki.com api days 14 Non intrusive metrics Capture data really fast. Then send the data on the background
  • 20. javier ramirez @supercoco9 https://teowaki.com api days2014
  • 21. Redis keeps everything in memory all the time javier ramirez @supercoco9 https://teowaki.com api days2014
  • 22. javier ramirez @supercoco9 https://teowaki.com api days 14
  • 23. Gzip to AWS S3/Glacier or Google Cloud Storage javier ramirez @supercoco9 https://teowaki.com api days 14
  • 24. javier ramirez @supercoco9 https://teowaki.com api days 14
  • 25. Hadoop Cassandra Amazon Redshift ... javier ramirez @supercoco9 https://teowaki.com api days 14 tools we considered:
  • 26. but... hard to set up and monitor expensive cluster not interactive enough javier ramirez @supercoco9 https://teowaki.com api days 14
  • 27. Our choice: Google BigQuery Data analysis as a service http://developers.google.com/bigquery javier ramirez @supercoco9 https://teowaki.com api days 14
  • 28. Based on “Dremel” Specifically designed for interactive queries over petabytes of real-time data javier ramirez @supercoco9 https://teowaki.com api days 14
  • 29. loading data You just send the data in text (or JSON) format javier ramirez @supercoco9 https://teowaki.com api days 14
  • 30. SQL javier ramirez @supercoco9 https://teowaki.com api days 14 select name from USERS order by date; select count(*) from users; select max(date) from USERS; select sum(total) from ORDERS group by user;
  • 31. specific extensions for analytics javier ramirez @supercoco9 https://teowaki.com api days 14 within flatten nest stddev top first last nth variance var_pop var_samp covar_pop covar_samp quantiles
  • 32. web console screenshot javier ramirez @supercoco9 https://teowaki.com api days 14
  • 33. javier ramirez @supercoco9 https://teowaki.com api days 14 window functions
  • 34. javier ramirez @supercoco9 https://teowaki.com api days 14 our most active user
  • 35. javier ramirez @supercoco9 https://teowaki.com api days 14 country segmented traffic
  • 36. javier ramirez @supercoco9 https://teowaki.com api days 14 10 request we should be caching
  • 37. correlations. not to mistake with causality javier ramirez @supercoco9 https://teowaki.com api days 14
  • 38. Things you always wanted to try but were too scared to javier ramirez @supercoco9 https://teowaki.com api days 14 select count(*) from publicdata:samples.wikipedia where REGEXP_MATCH(title, "[0-9]*") AND wp_namespace = 0; 223,163,387 Query complete (5.6s elapsed, 9.13 GB processed, Cost: 32¢)
  • 39. javier ramirez @supercoco9 http://teowaki.com api days2014 5 most created resources select uri, count(*) total from stats where method = 'POST' group by URI;
  • 40. javier ramirez @supercoco9 http://teowaki.com api days2014 ...but /users/javier/shouts /users/rgo/shouts /teams/javier-community/links /teams/nosqlmatters-cgn/links
  • 41. javier ramirez @supercoco9 http://teowaki.com api days2014 5 most created resources
  • 42. new users per month
  • 43. SELECT repository_name, repository_language, repository_description, COUNT(repository_name) as cnt, repository_url FROM github.timeline WHERE type="WatchEvent" AND PARSE_UTC_USEC(created_at) >= PARSE_UTC_USEC("#{yesterday} 20:00:00") AND repository_url IN ( SELECT repository_url FROM github.timeline WHERE type="CreateEvent" AND PARSE_UTC_USEC(repository_created_at) >= PARSE_UTC_USEC('#{yesterday} 20:00:00') AND repository_fork = "false" AND payload_ref_type = "repository" GROUP BY repository_url ) GROUP BY repository_name, repository_language, repository_description, repository_url HAVING cnt >= 5 ORDER BY cnt DESC LIMIT 25
  • 44.
  • 45.
  • 46.
  • 47. Automation with Apps Script Read from bigquery Create a spreadsheet on Drive E-mail it everyday as a PDF javier ramirez @supercoco9 https://teowaki.com api days 14
  • 48.
  • 49.
  • 50.
  • 51.
  • 52.
  • 53. bigquery pricing $26 per stored TB 1000000 rows => $0.00416 / month £0.00243 / month $5 per processed TB 1 full scan = 160 MB 1 count = 0 MB 1 full scan over 1 column = 5.4 MB 100 GB => $0.05 / month £0.03javier ramirez @supercoco9 https://teowaki.com api days 14
  • 54. £0.054307 / month* per 1MM rows *the 1st 1TB every month are free of charge javier ramirez @supercoco9 https://teowaki.com api days 14
  • 55. 1. non intrusive metrics 2. keep the history 3. avoid vendor lock-in 4. interactive queries 5. cheap 6. extra ball: real time javier ramirez @supercoco9 https://teowaki.com api days 14
  • 56.
  • 57. Find related links at https://teowaki.com/teams/javier-community/link-categories/bigquery-talk Thanks! Gr ciesà Javier Ramírez @supercoco9 api days 14