A talk about the integration of Google Analytics and BigQuery, delivered at the Dare2Data event (BBVACI). The video is available at https://www.youtube.com/watch?v=ZdMJf0btAbc
2. about me
19 years working on software: banking, e-commerce,
government, CMS, start-ups...
founder of
https://datawaki.com
https://teowaki.com
Google Developer Expert on
the Cloud Platform
mail: hello@datawaki.com twitter: @supercoco9
datawaki
5. javier ramirez @supercoco9 https://datawaki.com
Google Analytics is great but...
It lets you access aggregated data and reports, not
individual session/visit data.
Even premium accounts get sampled reports when
there is too much data (and not all reports can
be unsampled).
6. Google Analytics is great but...
If you need to manage many different segments, and
if you want to combine segments, it can get tricky.
Moreover, you can only segment or create reports
using the pre-defined filters, which might or might not be
enough for you*.
*even though segments have improved hugely
with Universal Analytics
7. Google Analytics is great but...
It's not easy to combine data in Analytics with data from
other sources (CRM, invoicing system...)
Now you can use Data Import in Universal
Analytics, but there are many constraints on what
you can do
8. Google Analytics is great but...
Good for knowing what's happening in your
application, but difficult for:
* business intelligence/big data (data mining,
find patterns...)
* machine learning (classify information,
predict future trends...)
9. big data, analysed and organised
into information, has big value
11. data that exceeds the
processing capacity of
conventional database
systems. The data is too big,
moves too fast, or doesn’t fit
the structures of your
database architectures.
Edd Dumbill
program chair for the O’Reilly Strata Conference
12. big data is cool but...
expensive cluster
hard to set up and monitor
not interactive enough
13. What if I could...
..query billions of rows in seconds..
..using a SQL-like web interface..
..on a fully managed cloud..
..paying only when I use it?
14. Designed to run analytics
over huge volumes of raw
data, and to integrate
with other data sources
Google BigQuery
15. Google BigQuery
Data analysis as a service
https://cloud.google.com/products/bigquery/
17. Google Analytics Premium
users get free daily
exports from GA to
BigQuery.
Google BigQuery + GA Premium
18. All your raw data.
Unsampled.
Use it however you want.
BOOM!
Google BigQuery + GA Premium
22. US Cellular Case Study
5th largest US telecommunications company
over 10.6 million customers
They didn't know how many offline (in-store +
telesales) sales originated from online media.
After combining GA data with other internal data,
they can more accurately attribute sales to
digital channels (website, social, search and display)
It helps them optimize their campaigns and forecast
sales
25. basic queries (metric/dimension)
-- transactions per traffic source
SELECT trafficSource.source, SUM(totals.transactions) AS total_transactions
FROM playground.ga_sessions_20140621
GROUP BY trafficSource.source
ORDER BY total_transactions;

-- page views for mobile vs. non-mobile devices
SELECT device.isMobile, SUM(totals.pageviews) AS total_pageviews
FROM playground.ga_sessions_20140621
GROUP BY device.isMobile
ORDER BY total_pageviews;
27. Average amount spent per visit
-- inner query: visits and revenue per purchasing visitor
-- outer query: total revenue divided by total visits
-- (the GA export stores totals.transactionRevenue multiplied by 10^6)
SELECT ( SUM(total_transactionrevenue_per_user) / SUM(total_visits_per_user) )
       AS avg_revenue_by_user_per_visit
FROM (
  SELECT SUM(totals.visits) AS total_visits_per_user,
         SUM(totals.transactionRevenue) AS total_transactionrevenue_per_user,
         visitorId
  FROM playground.ga_sessions_20140621
  WHERE totals.visits > 0
    AND totals.transactions >= 1
    AND totals.transactionRevenue IS NOT NULL
  GROUP BY visitorId );
29. Users who bought product A, also bought product B
-- other products bought by visitors who purchased a 'Light Helmet'
SELECT hits.item.productName AS other_purchased_products,
       COUNT(hits.item.productName) AS quantity
FROM playground.ga_sessions_20140621
WHERE fullVisitorId IN (
    SELECT fullVisitorId
    FROM playground.ga_sessions_20140621
    WHERE hits.item.productName CONTAINS 'Light Helmet'
      AND totals.transactions >= 1
    GROUP BY fullVisitorId )
  AND hits.item.productName IS NOT NULL
  AND hits.item.productName != 'Light Helmet'
GROUP BY other_purchased_products
ORDER BY quantity DESC;
30. Products that are purchased and lead to other products being purchased
SELECT prod_name, COUNT(*) AS transactions
FROM (
  SELECT fullVisitorId, MIN(date) AS date, visitId,
         hits.item.productName AS prod_name
  FROM (
    SELECT fullVisitorId, date, visitId,
           totals.transactions,
           hits.item.productName
    FROM (TABLE_DATE_RANGE([dataset.ga_sessions_],
                           TIMESTAMP('2014-06-01'),
                           TIMESTAMP('2014-06-14')))
  )
  -- keep only visitors with more than one transaction in the date range
  WHERE fullVisitorId IN (
    SELECT fullVisitorId
    FROM (TABLE_DATE_RANGE([dataset.ga_sessions_],
                           TIMESTAMP('2014-06-01'),
                           TIMESTAMP('2014-06-14')))
    GROUP BY fullVisitorId
    HAVING SUM(totals.transactions) > 1
  )
  AND hits.item.productName IS NOT NULL
  GROUP BY fullVisitorId, visitId, prod_name
  ORDER BY fullVisitorId DESC
)
GROUP BY prod_name
ORDER BY transactions DESC;
* example query from the LunaMetrics blog. Check them out for more awesomeness
31. Identify user path/user actions
SELECT fullVisitorId, visitId, visitNumber, hits.page.pagePath
FROM playground.ga_sessions_20140621
WHERE hits.type = 'PAGE'
ORDER BY fullVisitorId, visitId, hits.hitNumber ASC;
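Once the rows of a query like this are exported, they can be folded into one ordered page path per session. The following is only a sketch in Python: it assumes each row also carries hits.hitNumber (the query as shown does not select it), and the function name session_paths is made up for illustration.

```python
from collections import defaultdict


def session_paths(rows):
    """Group page hits into one ordered path per (visitor, visit).

    rows: iterable of (fullVisitorId, visitId, hitNumber, pagePath) tuples.
    The tuple layout is an assumption made for this sketch.
    """
    paths = defaultdict(list)
    # Sort by visitor, visit and hit number so pages come out in visit order.
    for visitor, visit, _hit_number, page in sorted(rows, key=lambda r: (r[0], r[1], r[2])):
        paths[(visitor, visit)].append(page)
    return dict(paths)


rows = [
    ("u1", 1, 2, "/cart"),
    ("u1", 1, 1, "/home"),
    ("u2", 7, 1, "/home"),
]
paths = session_paths(rows)
```

Each key is a (visitor, visit) pair, so the same visitor coming back in a later visit gets a separate path.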
32. individual user data is awesome
Cross-reference CRM data with individual users' actions to see
how your response to incidents affects your users.
Use the “frequently bought together” query and find
users who didn't buy the related products. Send an
e-mail campaign with an offer for those products.
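The campaign idea can be sketched in a few lines of Python, assuming you already have each user's purchased products (for example from a query over the GA export) and a list of related products from the “frequently bought together” query. The function name campaign_targets and the products "Gloves" and "Lock" are invented for the example.

```python
def campaign_targets(purchases, product, related):
    """Return {user: [related products they have not bought yet]}.

    purchases: dict mapping a user id to the set of product names bought.
    Only users who bought `product` are considered.
    """
    targets = {}
    for user, bought in purchases.items():
        if product in bought:
            missing = sorted(set(related) - set(bought))
            if missing:
                targets[user] = missing
    return targets


purchases = {
    "u1": {"Light Helmet", "Gloves"},
    "u2": {"Light Helmet", "Gloves", "Lock"},
    "u3": {"Lock"},
}
targets = campaign_targets(purchases, "Light Helmet", ["Gloves", "Lock"])
```

Here only u1 would receive an offer, for the one related product still missing from their purchases.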
33. integrating with external
data sources
* Connectors/REST API
* Export into GCS
* Import into BigQuery
35. BigQuery pricing
$20 per stored TB
$5 per processed TB
* the 1st TB every month is free of charge
** GA Premium users get $500 free credit monthly
36. for GA premium users
BigQuery is effectively
free
*unless you upload huge external data or run
huge queries
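To see why the bill tends to end up at zero, here is a rough cost sketch in Python using the prices from the pricing slide. It assumes the free monthly TB applies to processed (queried) data and that the $500 GA Premium credit is simply subtracted from the total; both are readings of the slide, not confirmed billing rules, and the function name monthly_cost is made up.

```python
STORAGE_USD_PER_TB = 20.0      # from the slide: $20 per stored TB
QUERY_USD_PER_TB = 5.0         # from the slide: $5 per processed TB
FREE_QUERY_TB = 1.0            # assumption: the free TB applies to processing
GA_PREMIUM_CREDIT_USD = 500.0  # from the slide: $500 monthly credit


def monthly_cost(stored_tb, processed_tb, ga_premium=False):
    """Estimate one month's bill in USD under the assumptions above."""
    storage = stored_tb * STORAGE_USD_PER_TB
    billable_tb = max(processed_tb - FREE_QUERY_TB, 0.0)
    total = storage + billable_tb * QUERY_USD_PER_TB
    if ga_premium:
        total = max(total - GA_PREMIUM_CREDIT_USD, 0.0)
    return total


regular = monthly_cost(stored_tb=2, processed_tb=11)                   # 40 + 50
premium = monthly_cost(stored_tb=2, processed_tb=11, ga_premium=True)
heavy = monthly_cost(stored_tb=10, processed_tb=101, ga_premium=True)  # 200 + 500 - 500
```

With 2 TB stored and 11 TB queried the credit covers everything; only the heavy case (10 TB stored, 101 TB queried) leaves something to pay.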
38. just send your own data
define a data structure that fits your needs
(or replicate the one GA provides)
use a JS snippet to send data to your server, then
to BigQuery
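As an illustration of "define a data structure", the sketch below models a single hit in Python and serialises hits as newline-delimited JSON, a format BigQuery can load. The field names loosely mirror the GA export fields used in the queries earlier in the talk, but this flat layout, the Hit class and the to_ndjson helper are assumptions for the example, not the real export schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Hit:
    # Hypothetical flat record, loosely modelled on GA export field names.
    fullVisitorId: str
    visitId: int
    hitNumber: int
    type: str
    pagePath: str


def to_ndjson(hits):
    """Serialise hits as newline-delimited JSON, one record per line."""
    return "\n".join(json.dumps(asdict(h), sort_keys=True) for h in hits)


payload = to_ndjson([
    Hit("u1", 1, 1, "PAGE", "/home"),
    Hit("u1", 1, 2, "PAGE", "/cart"),
])
```

Your server would collect records like these from the JS snippet and load them into a table whose schema matches the fields.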
39. or use datawaki
Just add an extra snippet to your GA
40. datawaki
send data from any other source (CRM, back-end,
sensors, mobile apps, log system, external tools...)
42. datawaki
* Get full access to your data
* Receive reports by e-mail
* Get individual or group alerts
  * if there is a purchase over $1000
  * if a user has visited a product page over
    20 times in one week and didn't buy
  * if a product is seen over 200 times in one hour
  * every time a product reaches 5000 views
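Alert rules like these boil down to simple checks over a stream of events. The Python sketch below covers two of them, the large purchase and the view-count milestone. It is not how datawaki implements alerts: the Event class, the alerts_for function and the reading of "reaches 5000 views" as "every multiple of 5000" are all assumptions for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Event:
    kind: str            # "purchase" or "product_view" (hypothetical event kinds)
    user: str
    product: str
    amount: float = 0.0  # purchase amount in USD, unused for views


def alerts_for(events, purchase_limit=1000.0, views_step=5000):
    """Return alert messages for large purchases and view-count milestones."""
    alerts = []
    views = Counter()
    for e in events:
        if e.kind == "purchase" and e.amount > purchase_limit:
            alerts.append(f"purchase over ${purchase_limit:.0f} by {e.user}: {e.product}")
        elif e.kind == "product_view":
            views[e.product] += 1
            # Assumption: alert at every multiple of views_step.
            if views[e.product] % views_step == 0:
                alerts.append(f"{e.product} reached {views[e.product]} views")
    return alerts


events = [Event("purchase", "u1", "bike", 1500.0), Event("purchase", "u2", "bell", 20.0)]
events += [Event("product_view", "u3", "helmet") for _ in range(3)]
alerts = alerts_for(events, views_step=3)  # small step just for the demo
```

The time-window rules (20 visits in a week, 200 views in an hour) would additionally need event timestamps, which this sketch leaves out.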
43. Want to know more?
https://cloud.google.com/products/bigquery/
Need help?
https://teowaki.com/services
Thanks!
Javier Ramírez
@supercoco9