SlideShare ist ein Scribd-Unternehmen logo
1 von 63
Downloaden Sie, um offline zu lesen
javier ramirez
@supercoco9
API Analytics
with Redis
and
Google BigQuery
Cologne 2014
moral of the story
you can do big,
if you know how
javier ramirez
@supercoco9
API Analytics
with Redis
and
Google BigQuery
Cologne 2014
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2013
REST API
+
AngularJS web as
an API client
obvious solution:
use a ready-made service
(3scale, apigee...)
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
data that’s an order of
magnitude greater than
data you’re accustomed
to
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
Doug Laney
VP Research, Business Analytics and Performance Management at Gartner
data that exceeds the
processing capacity of
conventional database
systems. The data is too big,
moves too fast, or doesn’t fit
the structures of your
database architectures.
Ed Dumbill
program chair for the O’Reilly Strata Conference
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
bigdata is doing a
fullscan to 330MM rows,
matching them against a
regexp, and getting the
result (223MM rows) in
just 5 seconds
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
Javier Ramirez
impresionable teowaki founder
1. non intrusive metrics
2. keep the history
3. avoid vendor lock-in
4. interactive queries
5. cheap
6. extra ball: real time
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
Intel(R) Xeon(R) CPU E5520 @ 2.27GHz (with pipelining)
$ ./redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -P 16 -q
SET: 552,028 requests per second
GET: 707,463 requests per second
LPUSH: 767,459 requests per second
LPOP: 770,119 requests per second
Intel(R) Xeon(R) CPU E5520 @ 2.27GHz (without pipelining)
$ ./redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -q
SET: 122,556 requests per second
GET: 123,601 requests per second
LPUSH: 136,752 requests per second
LPOP: 132,424 requests per second
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
open source, BSD licensed, advanced
key-value store. It is often referred to as a
data structure server since keys can contain
strings, hashes, lists, sets, sorted sets and
hyperloglogs.
http://redis.io
started in 2009 by Salvatore Sanfilippo @antirez
111 contributors at
https://github.com/antirez/redis
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
what is it used for?
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
twitter
Every time line (800 tweets
per user) is stored in redis
5000 writes per second avg
300K reads per second
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
youporn
Most data is found in hashes with ordered sets used to
know what data to show.
zInterStore on: videos:filters:released,
Videos:filters:orientation:straight,Videos:filters:categorie
s:{category_id}, Videos:ordering:rating
Then perform a zRange to get the pages we want and
get the list of video_ids back.
Then start a pipeline and get all the videos from hashes.
javier ramirez @supercoco9 https://teowaki.com Codemotion 2014
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
Redis keeps
everything in
memory
all the time
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
easy:
store GZIPPED files into
S3/Glacier
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
* or google cloud storage
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
Hadoop (map/reduce)
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
http://hadoop.apache.org/
started in 2005 by Doug Cutting and Mike Cafarella
Cassandra
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
http://cassandra.apache.org/
released in 2008 by facebook.
other big data solutions:
Hadoop+Voldemort+Kafka
Hbase
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
http://engineering.linkedin.com/projects
http://hbase.apache.org/
Amazon Redshift
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
http://aws.amazon.com/redshift/
Our choice:
Google BigQuery
Data analysis as a service
http://developers.google.com/bigquery
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
Based on Dremel
Specifically designed for
interactive queries over
petabytes of real-time
data
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
Columnar storage
Easy to compress
Convenient for querying long
series over a single column
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
loading data
You can feed flat CSV-like
files or nested JSON
objects
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
bq cli
bq load --nosynchronous_mode
--encoding UTF-8
--field_delimiter 'tab'
--max_bad_records 100
--source_format CSV
api.stats
20131014T11-42-05Z.gz
web console screenshot
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
almost SQL
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
select
from
join
where
group by
having
order
limit
avg
count
max
min
sum
+
-
*
/
%
&
|
^
<<
>>
~
=
!=
<>
>
<
>= <=
IN
IS NULL
BETWEEN
AND
OR
NOT
Functions overview
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
current_date
current_time
now
datediff
day
day_of_week
day_of_year
hour
minute
quarter
year...
abs
acos
atan
ceil
floor
degrees
log
log2
log10
PI
SQRT...
concat
contains
left
length
lower
upper
lpad
rpad
right
substr
specific extensions for
analytics
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
within
flatten
nest
stddev
top
first
last
nth
variance
var_pop
var_samp
covar_pop
covar_samp
quantiles
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
window functions
correlations.
not to mistake with
causality
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
views.
JSON fields.
timestamped tables.
Things you always wanted to
try but were too scared to
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
select count(*) from
publicdata:samples.wikipedia
where REGEXP_MATCH(title, "[0-9]*")
AND wp_namespace = 0;
223,163,387
Query complete (5.6s elapsed, 9.13 GB processed, Cost: 32¢)
SELECT repository_name, repository_language,
repository_description, COUNT(repository_name) as cnt,
repository_url
FROM github.timeline
WHERE type="WatchEvent"
AND PARSE_UTC_USEC(created_at) >=
PARSE_UTC_USEC("#{yesterday} 20:00:00")
AND repository_url IN (
SELECT repository_url
FROM github.timeline
WHERE type="CreateEvent"
AND PARSE_UTC_USEC(repository_created_at) >=
PARSE_UTC_USEC('#{yesterday} 20:00:00')
AND repository_fork = "false"
AND payload_ref_type = "repository"
GROUP BY repository_url
)
GROUP BY repository_name, repository_language,
repository_description, repository_url
HAVING cnt >= 5
ORDER BY cnt DESC
LIMIT 25
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
country segmented traffic
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
our most active user
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
10 request we should be caching
javier ramirez @supercoco9 http://teowaki.com nosqlmatters 2014
5 most created resources
select uri, count(*) total from
stats where method = 'POST'
group by URI;
javier ramirez @supercoco9 http://teowaki.com nosqlmatters 2014
...but
/users/javier/shouts
/users/rgo/shouts
/teams/javier-community/links
/teams/nosqlmatters-cgn/links
javier ramirez @supercoco9 http://teowaki.com nosqlmatters 2014
5 most created resources
Automation with Apps Script
Read from bigquery
Create a spreadsheet on Drive
E-mail it everyday as a PDF
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
javier ramirez @supercoco9 http://teowaki.com nosqlmatters 2014
Do you
remember
the future?
redis pricing
2* machines (master/slave) at digital
ocean
$10 monthly
* we were already using these
instances for a lot of redis use cases
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
s3 pricing
$0.095 per GB
a gzipped 1.6 MB file stores 300K
rows
$0.0001541698 / monthly
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
glacier pricing
$0.01 per GB
$0.000016 / monthly
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
bigquery pricing
$80 per stored TB
300000 rows => $0.007629392 / month
$35 per processed TB
1 full scan = 84 MB
1 count = 0 MB
1 full scan over 1 column = 5.4 MB
10 GB => $0.35 / month
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
redis $10.0000000000
s3 storage $00.0001541698
s3 transfer $00.0050000000
glacier transfer $00.0500000000
glacier storage $00.0000160000
bigquery storage $00.0076293920
bigquery queries $00.3500000000
$10.41 / month
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
1. non intrusive metrics
2. keep the history
3. avoid vendor lock-in
4. interactive queries
5. cheap
6. extra ball: real time
javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
ig
Find related links at
https://teowaki.com/teams/javier-community/link-categories/bigquery-talk
Danke!
Javier Ramírez
@supercoco9
nosqlmatters 2014

Weitere ähnliche Inhalte

Was ist angesagt?

Dirty Little Secrets They Didn't Teach You In Pentest Class v2
Dirty Little Secrets They Didn't Teach You In Pentest Class v2Dirty Little Secrets They Didn't Teach You In Pentest Class v2
Dirty Little Secrets They Didn't Teach You In Pentest Class v2Rob Fuller
 
code.talks 2016 Hamburg - Plesk - AutoScaling WordPress with Docker & AWS - b...
code.talks 2016 Hamburg - Plesk - AutoScaling WordPress with Docker & AWS - b...code.talks 2016 Hamburg - Plesk - AutoScaling WordPress with Docker & AWS - b...
code.talks 2016 Hamburg - Plesk - AutoScaling WordPress with Docker & AWS - b...Jan Löffler
 
There's no place like 127.0.0.1 - Achieving "reliable" DNS rebinding in moder...
There's no place like 127.0.0.1 - Achieving "reliable" DNS rebinding in moder...There's no place like 127.0.0.1 - Achieving "reliable" DNS rebinding in moder...
There's no place like 127.0.0.1 - Achieving "reliable" DNS rebinding in moder...Luke Young
 
Four years of breaking HTTPS with BGP hijacking
Four years of breaking HTTPS with BGP hijackingFour years of breaking HTTPS with BGP hijacking
Four years of breaking HTTPS with BGP hijackingAPNIC
 
HTTP Caching in Web Application
HTTP Caching in Web ApplicationHTTP Caching in Web Application
HTTP Caching in Web ApplicationMartins Sipenko
 
AutoScaling WordPress with Docker & AWS - WordPress Meetup Karlsruhe - Plesk
AutoScaling WordPress with Docker & AWS - WordPress Meetup Karlsruhe - PleskAutoScaling WordPress with Docker & AWS - WordPress Meetup Karlsruhe - Plesk
AutoScaling WordPress with Docker & AWS - WordPress Meetup Karlsruhe - PleskJan Löffler
 
HTTP and Your Angry Dog
HTTP and Your Angry DogHTTP and Your Angry Dog
HTTP and Your Angry DogRoss Tuck
 
Aerospike DB and Storm for real-time analytics
Aerospike DB and Storm for real-time analyticsAerospike DB and Storm for real-time analytics
Aerospike DB and Storm for real-time analyticsAerospike
 
Building an Impenetrable ZooKeeper - Kathleen Ting
Building an Impenetrable ZooKeeper - Kathleen TingBuilding an Impenetrable ZooKeeper - Kathleen Ting
Building an Impenetrable ZooKeeper - Kathleen Tingjaxconf
 
NotaCon 2011 - Networking for Pentesters
NotaCon 2011 - Networking for PentestersNotaCon 2011 - Networking for Pentesters
NotaCon 2011 - Networking for PentestersRob Fuller
 
Integrated Cache on Netscaler
Integrated Cache on NetscalerIntegrated Cache on Netscaler
Integrated Cache on NetscalerMark Hillick
 
Going on an HTTP Diet: Front-End Web Performance
Going on an HTTP Diet: Front-End Web PerformanceGoing on an HTTP Diet: Front-End Web Performance
Going on an HTTP Diet: Front-End Web PerformanceAdam Norwood
 
Nagios Conference 2012 - Nathan Vonnahme - Writing Custom Nagios Plugins in Perl
Nagios Conference 2012 - Nathan Vonnahme - Writing Custom Nagios Plugins in PerlNagios Conference 2012 - Nathan Vonnahme - Writing Custom Nagios Plugins in Perl
Nagios Conference 2012 - Nathan Vonnahme - Writing Custom Nagios Plugins in PerlNagios
 
PHP conference Berlin 2015: running PHP on Nginx
PHP conference Berlin 2015: running PHP on NginxPHP conference Berlin 2015: running PHP on Nginx
PHP conference Berlin 2015: running PHP on NginxHarald Zeitlhofer
 
Rails Caching Secrets from the Edge
Rails Caching Secrets from the EdgeRails Caching Secrets from the Edge
Rails Caching Secrets from the EdgeMichael May
 
Advanced HTTP Caching
Advanced HTTP CachingAdvanced HTTP Caching
Advanced HTTP CachingMartin Breest
 

Was ist angesagt? (19)

Dirty Little Secrets They Didn't Teach You In Pentest Class v2
Dirty Little Secrets They Didn't Teach You In Pentest Class v2Dirty Little Secrets They Didn't Teach You In Pentest Class v2
Dirty Little Secrets They Didn't Teach You In Pentest Class v2
 
code.talks 2016 Hamburg - Plesk - AutoScaling WordPress with Docker & AWS - b...
code.talks 2016 Hamburg - Plesk - AutoScaling WordPress with Docker & AWS - b...code.talks 2016 Hamburg - Plesk - AutoScaling WordPress with Docker & AWS - b...
code.talks 2016 Hamburg - Plesk - AutoScaling WordPress with Docker & AWS - b...
 
There's no place like 127.0.0.1 - Achieving "reliable" DNS rebinding in moder...
There's no place like 127.0.0.1 - Achieving "reliable" DNS rebinding in moder...There's no place like 127.0.0.1 - Achieving "reliable" DNS rebinding in moder...
There's no place like 127.0.0.1 - Achieving "reliable" DNS rebinding in moder...
 
Four years of breaking HTTPS with BGP hijacking
Four years of breaking HTTPS with BGP hijackingFour years of breaking HTTPS with BGP hijacking
Four years of breaking HTTPS with BGP hijacking
 
HTTP Caching in Web Application
HTTP Caching in Web ApplicationHTTP Caching in Web Application
HTTP Caching in Web Application
 
Http caching basics
Http caching basicsHttp caching basics
Http caching basics
 
AutoScaling WordPress with Docker & AWS - WordPress Meetup Karlsruhe - Plesk
AutoScaling WordPress with Docker & AWS - WordPress Meetup Karlsruhe - PleskAutoScaling WordPress with Docker & AWS - WordPress Meetup Karlsruhe - Plesk
AutoScaling WordPress with Docker & AWS - WordPress Meetup Karlsruhe - Plesk
 
HTTP/3 for everyone
HTTP/3 for everyoneHTTP/3 for everyone
HTTP/3 for everyone
 
HTTP and Your Angry Dog
HTTP and Your Angry DogHTTP and Your Angry Dog
HTTP and Your Angry Dog
 
Aerospike DB and Storm for real-time analytics
Aerospike DB and Storm for real-time analyticsAerospike DB and Storm for real-time analytics
Aerospike DB and Storm for real-time analytics
 
Building an Impenetrable ZooKeeper - Kathleen Ting
Building an Impenetrable ZooKeeper - Kathleen TingBuilding an Impenetrable ZooKeeper - Kathleen Ting
Building an Impenetrable ZooKeeper - Kathleen Ting
 
Unity Makes Strength
Unity Makes StrengthUnity Makes Strength
Unity Makes Strength
 
NotaCon 2011 - Networking for Pentesters
NotaCon 2011 - Networking for PentestersNotaCon 2011 - Networking for Pentesters
NotaCon 2011 - Networking for Pentesters
 
Integrated Cache on Netscaler
Integrated Cache on NetscalerIntegrated Cache on Netscaler
Integrated Cache on Netscaler
 
Going on an HTTP Diet: Front-End Web Performance
Going on an HTTP Diet: Front-End Web PerformanceGoing on an HTTP Diet: Front-End Web Performance
Going on an HTTP Diet: Front-End Web Performance
 
Nagios Conference 2012 - Nathan Vonnahme - Writing Custom Nagios Plugins in Perl
Nagios Conference 2012 - Nathan Vonnahme - Writing Custom Nagios Plugins in PerlNagios Conference 2012 - Nathan Vonnahme - Writing Custom Nagios Plugins in Perl
Nagios Conference 2012 - Nathan Vonnahme - Writing Custom Nagios Plugins in Perl
 
PHP conference Berlin 2015: running PHP on Nginx
PHP conference Berlin 2015: running PHP on NginxPHP conference Berlin 2015: running PHP on Nginx
PHP conference Berlin 2015: running PHP on Nginx
 
Rails Caching Secrets from the Edge
Rails Caching Secrets from the EdgeRails Caching Secrets from the Edge
Rails Caching Secrets from the Edge
 
Advanced HTTP Caching
Advanced HTTP CachingAdvanced HTTP Caching
Advanced HTTP Caching
 

Ähnlich wie API Analytics with Redis and Bigquery. NoSQLmatters Cologne '14 edition. Javier Ramirez

API analytics with Redis and Google Bigquery. NoSQL matters edition
API analytics with Redis and Google Bigquery. NoSQL matters editionAPI analytics with Redis and Google Bigquery. NoSQL matters edition
API analytics with Redis and Google Bigquery. NoSQL matters editionjavier ramirez
 
Bigdata for small pockets, by Javier Ramirez from teowaki. RubyC Kiev 2014
Bigdata for small pockets, by Javier Ramirez from teowaki. RubyC Kiev 2014Bigdata for small pockets, by Javier Ramirez from teowaki. RubyC Kiev 2014
Bigdata for small pockets, by Javier Ramirez from teowaki. RubyC Kiev 2014javier ramirez
 
api analytics redis bigquery. Lrug
api analytics redis bigquery. Lrugapi analytics redis bigquery. Lrug
api analytics redis bigquery. Lrugjavier ramirez
 
Fun with ruby and redis, arrrrcamp edition, javier_ramirez, teowaki
Fun with ruby and redis, arrrrcamp edition, javier_ramirez, teowakiFun with ruby and redis, arrrrcamp edition, javier_ramirez, teowaki
Fun with ruby and redis, arrrrcamp edition, javier_ramirez, teowakijavier ramirez
 
How you can benefit from using Redis - Ramirez
How you can benefit from using Redis - RamirezHow you can benefit from using Redis - Ramirez
How you can benefit from using Redis - RamirezCodemotion
 
How we are using BigQuery and Apps Scripts at teowaki
How we are using BigQuery and Apps Scripts at teowakiHow we are using BigQuery and Apps Scripts at teowaki
How we are using BigQuery and Apps Scripts at teowakijavier ramirez
 
Fosdem how you can benefit from redis, javier ramirez @ teowaki
Fosdem how you can benefit from redis, javier ramirez @ teowakiFosdem how you can benefit from redis, javier ramirez @ teowaki
Fosdem how you can benefit from redis, javier ramirez @ teowakijavier ramirez
 
Big Data Analytics with Google BigQuery. By Javier Ramirez. All your base Co...
Big Data Analytics with Google BigQuery.  By Javier Ramirez. All your base Co...Big Data Analytics with Google BigQuery.  By Javier Ramirez. All your base Co...
Big Data Analytics with Google BigQuery. By Javier Ramirez. All your base Co...javier ramirez
 
How you can benefit from using Redis. Javier Ramirez, teowaki, at Codemotion ...
How you can benefit from using Redis. Javier Ramirez, teowaki, at Codemotion ...How you can benefit from using Redis. Javier Ramirez, teowaki, at Codemotion ...
How you can benefit from using Redis. Javier Ramirez, teowaki, at Codemotion ...javier ramirez
 
Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, ja...
Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, ja...Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, ja...
Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, ja...javier ramirez
 
Big Data Analytics with Google BigQuery. GDG Summit Spain 2014
Big Data Analytics with Google BigQuery. GDG Summit Spain 2014Big Data Analytics with Google BigQuery. GDG Summit Spain 2014
Big Data Analytics with Google BigQuery. GDG Summit Spain 2014javier ramirez
 
Por que deberias haberle pedido redis a los reyes magos
Por que deberias haberle pedido redis a los reyes magosPor que deberias haberle pedido redis a los reyes magos
Por que deberias haberle pedido redis a los reyes magosjavier ramirez
 
Spark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit
 
Spark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit
 
Use drupal 8 as a framework the romance recalibration
Use drupal 8 as a framework   the romance recalibrationUse drupal 8 as a framework   the romance recalibration
Use drupal 8 as a framework the romance recalibrationKevin Wenger
 
Fraud Detection with Hadoop
Fraud Detection with HadoopFraud Detection with Hadoop
Fraud Detection with Hadoopmarkgrover
 
Dok Talks #124 - Intro to Druid on Kubernetes
Dok Talks #124 - Intro to Druid on KubernetesDok Talks #124 - Intro to Druid on Kubernetes
Dok Talks #124 - Intro to Druid on KubernetesDoKC
 
Take a Groovy REST
Take a Groovy RESTTake a Groovy REST
Take a Groovy RESTRestlet
 
Hadoop Application Architectures - Fraud Detection
Hadoop Application Architectures - Fraud  DetectionHadoop Application Architectures - Fraud  Detection
Hadoop Application Architectures - Fraud Detectionhadooparchbook
 

Ähnlich wie API Analytics with Redis and Bigquery. NoSQLmatters Cologne '14 edition. Javier Ramirez (20)

API analytics with Redis and Google Bigquery. NoSQL matters edition
API analytics with Redis and Google Bigquery. NoSQL matters editionAPI analytics with Redis and Google Bigquery. NoSQL matters edition
API analytics with Redis and Google Bigquery. NoSQL matters edition
 
Bigdata for small pockets, by Javier Ramirez from teowaki. RubyC Kiev 2014
Bigdata for small pockets, by Javier Ramirez from teowaki. RubyC Kiev 2014Bigdata for small pockets, by Javier Ramirez from teowaki. RubyC Kiev 2014
Bigdata for small pockets, by Javier Ramirez from teowaki. RubyC Kiev 2014
 
api analytics redis bigquery. Lrug
api analytics redis bigquery. Lrugapi analytics redis bigquery. Lrug
api analytics redis bigquery. Lrug
 
Fun with ruby and redis, arrrrcamp edition, javier_ramirez, teowaki
Fun with ruby and redis, arrrrcamp edition, javier_ramirez, teowakiFun with ruby and redis, arrrrcamp edition, javier_ramirez, teowaki
Fun with ruby and redis, arrrrcamp edition, javier_ramirez, teowaki
 
How you can benefit from using Redis - Ramirez
How you can benefit from using Redis - RamirezHow you can benefit from using Redis - Ramirez
How you can benefit from using Redis - Ramirez
 
How we are using BigQuery and Apps Scripts at teowaki
How we are using BigQuery and Apps Scripts at teowakiHow we are using BigQuery and Apps Scripts at teowaki
How we are using BigQuery and Apps Scripts at teowaki
 
Fosdem how you can benefit from redis, javier ramirez @ teowaki
Fosdem how you can benefit from redis, javier ramirez @ teowakiFosdem how you can benefit from redis, javier ramirez @ teowaki
Fosdem how you can benefit from redis, javier ramirez @ teowaki
 
Big Data Analytics with Google BigQuery. By Javier Ramirez. All your base Co...
Big Data Analytics with Google BigQuery.  By Javier Ramirez. All your base Co...Big Data Analytics with Google BigQuery.  By Javier Ramirez. All your base Co...
Big Data Analytics with Google BigQuery. By Javier Ramirez. All your base Co...
 
How you can benefit from using Redis. Javier Ramirez, teowaki, at Codemotion ...
How you can benefit from using Redis. Javier Ramirez, teowaki, at Codemotion ...How you can benefit from using Redis. Javier Ramirez, teowaki, at Codemotion ...
How you can benefit from using Redis. Javier Ramirez, teowaki, at Codemotion ...
 
Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, ja...
Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, ja...Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, ja...
Big Data analytics with Nginx, Logstash, Redis, Google Bigquery and Neo4j, ja...
 
Big Data Analytics with Google BigQuery. GDG Summit Spain 2014
Big Data Analytics with Google BigQuery. GDG Summit Spain 2014Big Data Analytics with Google BigQuery. GDG Summit Spain 2014
Big Data Analytics with Google BigQuery. GDG Summit Spain 2014
 
Por que deberias haberle pedido redis a los reyes magos
Por que deberias haberle pedido redis a los reyes magosPor que deberias haberle pedido redis a los reyes magos
Por que deberias haberle pedido redis a los reyes magos
 
Spark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod Narasimha
 
Spark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod Narasimha
 
Devoxx 2014 monitoring
Devoxx 2014 monitoringDevoxx 2014 monitoring
Devoxx 2014 monitoring
 
Use drupal 8 as a framework the romance recalibration
Use drupal 8 as a framework   the romance recalibrationUse drupal 8 as a framework   the romance recalibration
Use drupal 8 as a framework the romance recalibration
 
Fraud Detection with Hadoop
Fraud Detection with HadoopFraud Detection with Hadoop
Fraud Detection with Hadoop
 
Dok Talks #124 - Intro to Druid on Kubernetes
Dok Talks #124 - Intro to Druid on KubernetesDok Talks #124 - Intro to Druid on Kubernetes
Dok Talks #124 - Intro to Druid on Kubernetes
 
Take a Groovy REST
Take a Groovy RESTTake a Groovy REST
Take a Groovy REST
 
Hadoop Application Architectures - Fraud Detection
Hadoop Application Architectures - Fraud  DetectionHadoop Application Architectures - Fraud  Detection
Hadoop Application Architectures - Fraud Detection
 

Mehr von javier ramirez

¿Se puede vivir del open source? T3chfest
¿Se puede vivir del open source? T3chfest¿Se puede vivir del open source? T3chfest
¿Se puede vivir del open source? T3chfestjavier ramirez
 
QuestDB: The building blocks of a fast open-source time-series database
QuestDB: The building blocks of a fast open-source time-series databaseQuestDB: The building blocks of a fast open-source time-series database
QuestDB: The building blocks of a fast open-source time-series databasejavier ramirez
 
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...javier ramirez
 
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...javier ramirez
 
Deduplicating and analysing time-series data with Apache Beam and QuestDB
Deduplicating and analysing time-series data with Apache Beam and QuestDBDeduplicating and analysing time-series data with Apache Beam and QuestDB
Deduplicating and analysing time-series data with Apache Beam and QuestDBjavier ramirez
 
Your Database Cannot Do this (well)
Your Database Cannot Do this (well)Your Database Cannot Do this (well)
Your Database Cannot Do this (well)javier ramirez
 
Your Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic DatabaseYour Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic Databasejavier ramirez
 
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...javier ramirez
 
QuestDB-Community-Call-20220728
QuestDB-Community-Call-20220728QuestDB-Community-Call-20220728
QuestDB-Community-Call-20220728javier ramirez
 
Processing and analysing streaming data with Python. Pycon Italy 2022
Processing and analysing streaming  data with Python. Pycon Italy 2022Processing and analysing streaming  data with Python. Pycon Italy 2022
Processing and analysing streaming data with Python. Pycon Italy 2022javier ramirez
 
QuestDB: ingesting a million time series per second on a single instance. Big...
QuestDB: ingesting a million time series per second on a single instance. Big...QuestDB: ingesting a million time series per second on a single instance. Big...
QuestDB: ingesting a million time series per second on a single instance. Big...javier ramirez
 
Servicios e infraestructura de AWS y la próxima región en Aragón
Servicios e infraestructura de AWS y la próxima región en AragónServicios e infraestructura de AWS y la próxima región en Aragón
Servicios e infraestructura de AWS y la próxima región en Aragónjavier ramirez
 
Primeros pasos en desarrollo serverless
Primeros pasos en desarrollo serverlessPrimeros pasos en desarrollo serverless
Primeros pasos en desarrollo serverlessjavier ramirez
 
How AWS is reinventing the cloud
How AWS is reinventing the cloudHow AWS is reinventing the cloud
How AWS is reinventing the cloudjavier ramirez
 
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMAnalitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMjavier ramirez
 
Getting started with streaming analytics
Getting started with streaming analyticsGetting started with streaming analytics
Getting started with streaming analyticsjavier ramirez
 
Getting started with streaming analytics: Setting up a pipeline
Getting started with streaming analytics: Setting up a pipelineGetting started with streaming analytics: Setting up a pipeline
Getting started with streaming analytics: Setting up a pipelinejavier ramirez
 
Getting started with streaming analytics: Deep Dive
Getting started with streaming analytics: Deep DiveGetting started with streaming analytics: Deep Dive
Getting started with streaming analytics: Deep Divejavier ramirez
 
Getting started with streaming analytics: streaming basics (1 of 3)
Getting started with streaming analytics: streaming basics (1 of 3)Getting started with streaming analytics: streaming basics (1 of 3)
Getting started with streaming analytics: streaming basics (1 of 3)javier ramirez
 
Monitorización de seguridad y detección de amenazas con AWS
Monitorización de seguridad y detección de amenazas con AWSMonitorización de seguridad y detección de amenazas con AWS
Monitorización de seguridad y detección de amenazas con AWSjavier ramirez
 

Mehr von javier ramirez (20)

¿Se puede vivir del open source? T3chfest
¿Se puede vivir del open source? T3chfest¿Se puede vivir del open source? T3chfest
¿Se puede vivir del open source? T3chfest
 
QuestDB: The building blocks of a fast open-source time-series database
QuestDB: The building blocks of a fast open-source time-series databaseQuestDB: The building blocks of a fast open-source time-series database
QuestDB: The building blocks of a fast open-source time-series database
 
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
Como creamos QuestDB Cloud, un SaaS basado en Kubernetes alrededor de QuestDB...
 
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
Ingesting Over Four Million Rows Per Second With QuestDB Timeseries Database ...
 
Deduplicating and analysing time-series data with Apache Beam and QuestDB
Deduplicating and analysing time-series data with Apache Beam and QuestDBDeduplicating and analysing time-series data with Apache Beam and QuestDB
Deduplicating and analysing time-series data with Apache Beam and QuestDB
 
Your Database Cannot Do this (well)
Your Database Cannot Do this (well)Your Database Cannot Do this (well)
Your Database Cannot Do this (well)
 
Your Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic DatabaseYour Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic Database
 
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir más de cuatro millones de ...
 
QuestDB-Community-Call-20220728
QuestDB-Community-Call-20220728QuestDB-Community-Call-20220728
QuestDB-Community-Call-20220728
 
Processing and analysing streaming data with Python. Pycon Italy 2022
Processing and analysing streaming  data with Python. Pycon Italy 2022Processing and analysing streaming  data with Python. Pycon Italy 2022
Processing and analysing streaming data with Python. Pycon Italy 2022
 
QuestDB: ingesting a million time series per second on a single instance. Big...
QuestDB: ingesting a million time series per second on a single instance. Big...QuestDB: ingesting a million time series per second on a single instance. Big...
QuestDB: ingesting a million time series per second on a single instance. Big...
 
Servicios e infraestructura de AWS y la próxima región en Aragón
Servicios e infraestructura de AWS y la próxima región en AragónServicios e infraestructura de AWS y la próxima región en Aragón
Servicios e infraestructura de AWS y la próxima región en Aragón
 
Primeros pasos en desarrollo serverless
Primeros pasos en desarrollo serverlessPrimeros pasos en desarrollo serverless
Primeros pasos en desarrollo serverless
 
How AWS is reinventing the cloud
How AWS is reinventing the cloudHow AWS is reinventing the cloud
How AWS is reinventing the cloud
 
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAMAnalitica de datos en tiempo real con Apache Flink y Apache BEAM
Analitica de datos en tiempo real con Apache Flink y Apache BEAM
 
Getting started with streaming analytics
Getting started with streaming analyticsGetting started with streaming analytics
Getting started with streaming analytics
 
Getting started with streaming analytics: Setting up a pipeline
Getting started with streaming analytics: Setting up a pipelineGetting started with streaming analytics: Setting up a pipeline
Getting started with streaming analytics: Setting up a pipeline
 
Getting started with streaming analytics: Deep Dive
Getting started with streaming analytics: Deep DiveGetting started with streaming analytics: Deep Dive
Getting started with streaming analytics: Deep Dive
 
Getting started with streaming analytics: streaming basics (1 of 3)
Getting started with streaming analytics: streaming basics (1 of 3)Getting started with streaming analytics: streaming basics (1 of 3)
Getting started with streaming analytics: streaming basics (1 of 3)
 
Monitorización de seguridad y detección de amenazas con AWS
Monitorización de seguridad y detección de amenazas con AWSMonitorización de seguridad y detección de amenazas con AWS
Monitorización de seguridad y detección de amenazas con AWS
 

Kürzlich hochgeladen

What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Hr365.us smith
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceBrainSell Technologies
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationBradBedford3
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Cizo Technology Services
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 

Kürzlich hochgeladen (20)

Advantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your BusinessAdvantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your Business
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 

API Analytics with Redis and Bigquery. NoSQLmatters Cologne '14 edition. Javier Ramirez

  • 1. javier ramirez @supercoco9 API Analytics with Redis and Google BigQuery Cologne 2014
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8. moral of the story you can do big, if you know how
  • 9. javier ramirez @supercoco9 API Analytics with Redis and Google BigQuery Cologne 2014
  • 10. javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2013 REST API + AngularJS web as an API client
  • 11.
  • 12. obvious solution: use a ready-made service (3scale, apigee...) javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 13. javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 14. data that’s an order of magnitude greater than data you’re accustomed to javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 Doug Laney VP Research, Business Analytics and Performance Management at Gartner
  • 15. data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the structures of your database architectures. Ed Dumbill program chair for the O’Reilly Strata Conference javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 16. bigdata is doing a fullscan to 330MM rows, matching them against a regexp, and getting the result (223MM rows) in just 5 seconds javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 Javier Ramirez impresionable teowaki founder
  • 17. 1. non intrusive metrics 2. keep the history 3. avoid vendor lock-in 4. interactive queries 5. cheap 6. extra ball: real time javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 18. javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 19. Intel(R) Xeon(R) CPU E5520 @ 2.27GHz (with pipelining) $ ./redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -P 16 -q SET: 552,028 requests per second GET: 707,463 requests per second LPUSH: 767,459 requests per second LPOP: 770,119 requests per second Intel(R) Xeon(R) CPU E5520 @ 2.27GHz (without pipelining) $ ./redis-benchmark -r 1000000 -n 2000000 -t get,set,lpush,lpop -q SET: 122,556 requests per second GET: 123,601 requests per second LPUSH: 136,752 requests per second LPOP: 132,424 requests per second javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 20. open source, BSD licensed, advanced key-value store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets, sorted sets and hyperloglogs. http://redis.io started in 2009 by Salvatore Sanfilippo @antirez 111 contributors at https://github.com/antirez/redis javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 21. what is it used for? javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 22. twitter Every time line (800 tweets per user) is stored in redis 5000 writes per second avg 300K reads per second javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 23. youporn Most data is found in hashes with ordered sets used to know what data to show. zInterStore on: videos:filters:released, Videos:filters:orientation:straight,Videos:filters:categorie s:{category_id}, Videos:ordering:rating Then perform a zRange to get the pages we want and get the list of video_ids back. Then start a pipeline and get all the videos from hashes. javier ramirez @supercoco9 https://teowaki.com Codemotion 2014
  • 24. javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 25. Redis keeps everything in memory all the time javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 26. javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 27. easy: store GZIPPED files into S3/Glacier javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 * or google cloud storage
  • 28. javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 29. Hadoop (map/reduce) javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 http://hadoop.apache.org/ started in 2005 by Doug Cutting and Mike Cafarella
  • 30. Cassandra javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 http://cassandra.apache.org/ released in 2008 by facebook.
  • 31. other big data solutions: Hadoop+Voldemort+Kafka Hbase javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 http://engineering.linkedin.com/projects http://hbase.apache.org/
  • 32. Amazon Redshift javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 http://aws.amazon.com/redshift/
  • 33. Our choice: Google BigQuery Data analysis as a service http://developers.google.com/bigquery javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 34. Based on Dremel Specifically designed for interactive queries over petabytes of real-time data javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 35. Columnar storage Easy to compress Convenient for querying long series over a single column javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 36. loading data You can feed flat CSV-like files or nested JSON objects javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 37. javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 bq cli bq load --nosynchronous_mode --encoding UTF-8 --field_delimiter 'tab' --max_bad_records 100 --source_format CSV api.stats 20131014T11-42-05Z.gz
  • 38. web console screenshot javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 39. almost SQL javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 select from join where group by having order limit avg count max min sum + - * / % & | ^ << >> ~ = != <> > < >= <= IN IS NULL BETWEEN AND OR NOT
  • 40. Functions overview javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 current_date current_time now datediff day day_of_week day_of_year hour minute quarter year... abs acos atan ceil floor degrees log log2 log10 PI SQRT... concat contains left length lower upper lpad rpad right substr
  • 41. specific extensions for analytics javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 within flatten nest stddev top first last nth variance var_pop var_samp covar_pop covar_samp quantiles
  • 42. javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 window functions
  • 43. correlations. not to mistake with causality javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 44. javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 views. JSON fields. timestamped tables.
  • 45. Things you always wanted to try but were too scared to javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 select count(*) from publicdata:samples.wikipedia where REGEXP_MATCH(title, "[0-9]*") AND wp_namespace = 0; 223,163,387 Query complete (5.6s elapsed, 9.13 GB processed, Cost: 32¢)
  • 46. SELECT repository_name, repository_language, repository_description, COUNT(repository_name) as cnt, repository_url FROM github.timeline WHERE type="WatchEvent" AND PARSE_UTC_USEC(created_at) >= PARSE_UTC_USEC("#{yesterday} 20:00:00") AND repository_url IN ( SELECT repository_url FROM github.timeline WHERE type="CreateEvent" AND PARSE_UTC_USEC(repository_created_at) >= PARSE_UTC_USEC('#{yesterday} 20:00:00') AND repository_fork = "false" AND payload_ref_type = "repository" GROUP BY repository_url ) GROUP BY repository_name, repository_language, repository_description, repository_url HAVING cnt >= 5 ORDER BY cnt DESC LIMIT 25
  • 47. javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 country segmented traffic
  • 48. javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 our most active user
  • 49. javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014 10 request we should be caching
  • 50. javier ramirez @supercoco9 http://teowaki.com nosqlmatters 2014 5 most created resources select uri, count(*) total from stats where method = 'POST' group by URI;
  • 51. javier ramirez @supercoco9 http://teowaki.com nosqlmatters 2014 ...but /users/javier/shouts /users/rgo/shouts /teams/javier-community/links /teams/nosqlmatters-cgn/links
  • 52. javier ramirez @supercoco9 http://teowaki.com nosqlmatters 2014 5 most created resources
  • 53. Automation with Apps Script Read from bigquery Create a spreadsheet on Drive E-mail it everyday as a PDF javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 54.
  • 55. javier ramirez @supercoco9 http://teowaki.com nosqlmatters 2014 Do you remember the future?
  • 56. redis pricing 2* machines (master/slave) at digital ocean $10 monthly * we were already using these instances for a lot of redis use cases javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 57. s3 pricing $0.095 per GB a gzipped 1.6 MB file stores 300K rows $0.0001541698 / monthly javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 58. glacier pricing $0.01 per GB $0.000016 / monthly javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 59. bigquery pricing $80 per stored TB 300000 rows => $0.007629392 / month $35 per processed TB 1 full scan = 84 MB 1 count = 0 MB 1 full scan over 1 column = 5.4 MB 10 GB => $0.35 / month javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 60. redis $10.0000000000 s3 storage $00.0001541698 s3 transfer $00.0050000000 glacier transfer $00.0500000000 glacier storage $00.0000160000 bigquery storage $00.0076293920 bigquery queries $00.3500000000 $10.41 / month javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 61. 1. non intrusive metrics 2. keep the history 3. avoid vendor lock-in 4. interactive queries 5. cheap 6. extra ball: real time javier ramirez @supercoco9 https://teowaki.com nosqlmatters 2014
  • 62. ig
  • 63. Find related links at https://teowaki.com/teams/javier-community/link-categories/bigquery-talk Danke! Javier Ramírez @supercoco9 nosqlmatters 2014