SlideShare ist ein Scribd-Unternehmen logo
1 von 27
Downloaden Sie, um offline zu lesen
Google BigQuery
for the Everyday Developer
Márton Kodok
Senior Software Engineer at REEA
twitter: martonkodok stackoverflow: pentium10 github: @pentium10
IV. IT&C Innovation Conference - October 2016 - Sovata, Romania
1. Big Data Challenges
2. Infrastructure overview
3. Choosing a Strategy
4. BigQuery - Use Cases
5. Q & A
BigQuery for the Everyday Developer @martonkodok
Agenda
BigQuery for the Everyday Developer @martonkodok
The 5 V’s of Big Data
Every scientist who needs big data analytics to save millions of lives
should have that power
Legacy systems don’t provide the power.
The simple fact is that you are brilliant but your brilliant ideas require
complex analytics.
Traditional solutions are not applicable.
BigQuery for the Everyday Developer @martonkodok
Big Data movement
With more and more people realizing the value of realtime
analytics, the race to build realtime middleware has begun.
The Plan: have oversight over developments as they happen.
* More slides about Beanstalkd at slideshare.net/martonkodok
BigQuery for the Everyday Developer @martonkodok
Fast computing systems
BigQuery for the Everyday Developer @martonkodok
Simple legacy Website stack/infrastructure
BigQuery for the Everyday Developer @martonkodok
Infrastructure with Middleware
Task: Need backend/database for STORE, QUERY, EXTRACT deep analytics:
● Events (website events, app events, email events)
● Entities (user data, business entities, A/B split test data)
● Metrics (sales data, code commit history, profiler data)
● Streaming (sensor data, IoT data burst)
● 3rd party Analytics (Google Analytics, Mixpanel)
● Local generated data (log file entries, trace output, unstructured data...)
❏ Run Ad-Hoc queries (without Developer)
❏ Be realtime
BigQuery for the Everyday Developer @martonkodok
Requirements
● Terabyte scalable storage
● Real-time row ingestion
● Ask sophisticated queries
● Query-performance
● Low-maintenance
● Cost effective
● Wire them up easily
Goal: Store everything accessible by SQL immediately.
BigQuery for the Everyday Developer @martonkodok
Desired system/platform Equipment strategy
* people still required
Engines:
❏ MongoDB, Riak, Redis
❏ ELK Stack (Elastic-Logstash-Kibana)...
❏ Cassandra, Hive, Hadoop...
❏ Amazon RedShift, Google BigQuery...
In-House Hosted Managed
BigQuery for the Everyday Developer @martonkodok
Google BigQuery
● Analytics-as-a-Service - Data Warehouse in the Cloud
● Fully-Managed by Google (US or EU zone)
● Scales into Petabytes
● Ridiculously fast
● Decent pricing (queries $5/TB, storage: $20/TB) *October 2016 pricing
● 100.000 rows / sec Streaming API
● Open Interfaces (Web UI, BQ command line tool, REST, ODBC)
● Familiar DB Structure (table, views, record, nested, JSON)
● Convenience of SQL + Javascript UDF (User Defined Functions)
● Integrates with Google Sheets + Google Cloud Storage + Pub/Sub connectors
● Client libraries available in YFL (your favorite languages)
BigQuery for the Everyday Developer @martonkodok
What is BigQuery?
● ACL - row level locking (individual or group based)
● Columnar storage (max 10 000 columns in table)
● Rich SQL: JSON, IP, Math, RegExp, Window functions
● Data types: String, Integer, Float, Boolean, Timestamp,
Record, Nested, Struct, Array.
● Append-only tables prefered (DML syntax available)
● Batch load file size limits: 1TB (CSV or JSON)
● User Defined Functions in SQL or Javascript
● Date partitioned tables
BigQuery for the Everyday Developer @martonkodok
BigQuery: Convenience of SQL
* 1 Petabyte storage, 10 TB inserts, 100 TB queries => $22000
Queries Storage Ingestion
➔ 1 TB per month free
➔ 5 USD per TB
➔ only pay for the columns you use
in your query
➔ 20 USD per TB frequently accessed
data
➔ 10 USD per TB long term storage
90 days
➔ Batch load free (CSV/JSON)
➔ Exporting free
➔ Table copy free
➔ 50 USD per TB
Estimate 1
- Storage 5 TB
- Streaming Inserts 1 TB
- Queries 3 TB
Monthly total: $165
Estimate 2
- Storage 24 TB
- Streaming Inserts 1 TB
- Queries 50 TB
Monthly total: $788
BigQuery for the Everyday Developer @martonkodok
BigQuery Costs - October 2016
BigQuery for the Everyday Developer @martonkodok
+--------------------------+-----------+----------+
| order_id | INTEGER | REQUIRED |
| ... | | | Example:
| products | RECORD | REPEATED | ”products”:[p1,p2]
| products.name | STRING | NULLABLE |
| products.product_id | INTEGER | NULLABLE | ”products”:[{”name”:”p1”,
| products.attributes | STRING | REPEATED | ”product_id”:10,
| products.price | FLOAT | NULLABLE | ”attributes”:[”red”,”xl”]
| ... | | | ,”price”:9.99},
| common | RECORD | NULLABLE | {”name”:”p2”,
| common.insert_id | INTEGER | REQUIRED | ”product_id”:20,
| common.event | INTEGER | REQUIRED | ”attributes”:[“red”,”xl”]
| common.user_id | INTEGER | REQUIRED | ,”price”:9.99}]
| common.timestamp | TIMESTAMP | REQUIRED |
| common.utm | RECORD | NULLABLE |
| common.utm.source | STRING | NULLABLE |
| common.utm.medium | STRING | NULLABLE |
| common.utm.campaign | STRING | NULLABLE |
| common.utm.content | STRING | NULLABLE |
| common.utm.term | STRING | NULLABLE |
| meta | STRING | NULLABLE |
+--------------------------+-----------+----------+
Schema modelling / JSON
1. Use aggregation MIN/MAX on timestamp to find first/last and join back to the same table.
2. Use analytic functions FIRST_VALUE and LAST_VALUE when only one column is needed
SELECT LAST_VALUE(email) OVER(
PARTITION BY user_id
ORDER BY timestamp ASC) AS email_last ...
3. Using Window Functions for full row selection
SELECT email, firstname, lastname
FROM
(SELECT email, firstname, lastname
row_number() over (partition BY user_id
ORDER BY timestamp DESC) seqnum
FROM [profile_event]
)
WHERE seqnum=1
BigQuery for the Everyday Developer @martonkodok
Append only tables - Get last value
BigQuery for the Everyday Developer @martonkodok
Streaming insert time (ms) - last year
● Funnel Analysis
BigQuery for the Everyday Developer @martonkodok
Achievements
BigQuery for the Everyday Developer @martonkodok
Funnel analysis: Time on upsell pages
Example HITS chain:
● article1 -> page2 -> page3 -> page4 -> orderpage1 -> thankyoupage1
● page1 -> article2-> page3 -> orderpage2 -> ...
BigQuery for the Everyday Developer @martonkodok
Attribute credit to first article visited on purchase
● Funnel Analysis
● Email URL click heatmap
BigQuery for the Everyday Developer @martonkodok
Achievements
BigQuery for the Everyday Developer @martonkodok
Email URL clicks heat-map
● Funnel Analysis
● Email URL click heatmap
● Email Health Dashboard (SPAM, ISP deferral, content
A/B split tests, trends or low open rate campaigns)
● Advanced segmentation (all raw data stored)
● Behavioral analytics - engaged users etc...
BigQuery for the Everyday Developer @martonkodok
Achievements Continued
● no provisioning/deploy
● no running out of resources
● no more focus on large scale execution plan
● no need to re-implement tricky concepts
(time windows / join streams)
● pay only the columns we have in your queries
● run raw ad-hoc queries (either by analysts/sales or Devs)
● no more throwing away-, expiring-, aggregating old data.
BigQuery for the Everyday Developer @martonkodok
Our benefits
1. githubarchive.org: 20+ event types available since 2012
a. pull request latency
b. expressions, emotions in commit messages
c. etc...
2. httparchive.org
a. sites that use multiple popular JS frameworks
b. sites that use multiple jQuery versions
3. Export for Google Analytics
4. MLab - broadband connection performance (26B rows)
5. GSOD - samples of weather (rainfall, temp…)
6. US Natality - 1969 to 2008
7. Wikipedia edits
BigQuery for the Everyday Developer @martonkodok
BigQuery: Sample projects to try out
BigQuery for the Everyday Developer @martonkodok
HttpArchive - multiple JS frameworks
BigQuery for the Everyday Developer @martonkodok
HttpArchive - multiple jQuery versions
Questions?
Thank you.
Slides available soon on: slideshare.net/martonkodok

Weitere ähnliche Inhalte

Was ist angesagt?

bigquery.pptx
bigquery.pptxbigquery.pptx
bigquery.pptxHarissh16
 
BigQuery walk through.pptx
BigQuery walk through.pptxBigQuery walk through.pptx
BigQuery walk through.pptxVikRam S
 
Big Query Basics
Big Query BasicsBig Query Basics
Big Query BasicsIdo Green
 
Real-time Analytics with Apache Flink and Druid
Real-time Analytics with Apache Flink and DruidReal-time Analytics with Apache Flink and Druid
Real-time Analytics with Apache Flink and DruidJan Graßegger
 
An overview of BigQuery
An overview of BigQuery An overview of BigQuery
An overview of BigQuery GirdhareeSaran
 
Big Query - Utilizing Google Data Warehouse for Media Analytics
Big Query - Utilizing Google Data Warehouse for Media AnalyticsBig Query - Utilizing Google Data Warehouse for Media Analytics
Big Query - Utilizing Google Data Warehouse for Media Analyticshafeeznazri
 
BigQuery implementation
BigQuery implementationBigQuery implementation
BigQuery implementationSimon Su
 
data platform on kubernetes
data platform on kubernetesdata platform on kubernetes
data platform on kubernetes창언 정
 
BigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQLBigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQLMárton Kodok
 
What Is ELK Stack | ELK Tutorial For Beginners | Elasticsearch Kibana | ELK S...
What Is ELK Stack | ELK Tutorial For Beginners | Elasticsearch Kibana | ELK S...What Is ELK Stack | ELK Tutorial For Beginners | Elasticsearch Kibana | ELK S...
What Is ELK Stack | ELK Tutorial For Beginners | Elasticsearch Kibana | ELK S...Edureka!
 
Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearchpmanvi
 
Building a Data Lake on AWS
Building a Data Lake on AWSBuilding a Data Lake on AWS
Building a Data Lake on AWSGary Stafford
 
Apache Spark at Airbnb
Apache Spark at AirbnbApache Spark at Airbnb
Apache Spark at AirbnbDatabricks
 
Lessons Learned Developing and Managing Massive (300TB+) Apache Spark Pipelin...
Lessons Learned Developing and Managing Massive (300TB+) Apache Spark Pipelin...Lessons Learned Developing and Managing Massive (300TB+) Apache Spark Pipelin...
Lessons Learned Developing and Managing Massive (300TB+) Apache Spark Pipelin...Spark Summit
 
Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)Databricks
 
Google Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoop
Google Cloud Dataproc - Easier, faster, more cost-effective Spark and HadoopGoogle Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoop
Google Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoophuguk
 
Big Data Analytics with Google BigQuery. By Javier Ramirez. All your base Co...
Big Data Analytics with Google BigQuery.  By Javier Ramirez. All your base Co...Big Data Analytics with Google BigQuery.  By Javier Ramirez. All your base Co...
Big Data Analytics with Google BigQuery. By Javier Ramirez. All your base Co...javier ramirez
 

Was ist angesagt? (20)

bigquery.pptx
bigquery.pptxbigquery.pptx
bigquery.pptx
 
BigQuery walk through.pptx
BigQuery walk through.pptxBigQuery walk through.pptx
BigQuery walk through.pptx
 
Big Query Basics
Big Query BasicsBig Query Basics
Big Query Basics
 
Real-time Analytics with Apache Flink and Druid
Real-time Analytics with Apache Flink and DruidReal-time Analytics with Apache Flink and Druid
Real-time Analytics with Apache Flink and Druid
 
An overview of BigQuery
An overview of BigQuery An overview of BigQuery
An overview of BigQuery
 
Elk - An introduction
Elk - An introductionElk - An introduction
Elk - An introduction
 
Big Query - Utilizing Google Data Warehouse for Media Analytics
Big Query - Utilizing Google Data Warehouse for Media AnalyticsBig Query - Utilizing Google Data Warehouse for Media Analytics
Big Query - Utilizing Google Data Warehouse for Media Analytics
 
BigQuery implementation
BigQuery implementationBigQuery implementation
BigQuery implementation
 
data platform on kubernetes
data platform on kubernetesdata platform on kubernetes
data platform on kubernetes
 
BigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQLBigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQL
 
What Is ELK Stack | ELK Tutorial For Beginners | Elasticsearch Kibana | ELK S...
What Is ELK Stack | ELK Tutorial For Beginners | Elasticsearch Kibana | ELK S...What Is ELK Stack | ELK Tutorial For Beginners | Elasticsearch Kibana | ELK S...
What Is ELK Stack | ELK Tutorial For Beginners | Elasticsearch Kibana | ELK S...
 
Introduction to elasticsearch
Introduction to elasticsearchIntroduction to elasticsearch
Introduction to elasticsearch
 
Building a Data Lake on AWS
Building a Data Lake on AWSBuilding a Data Lake on AWS
Building a Data Lake on AWS
 
Apache Spark at Airbnb
Apache Spark at AirbnbApache Spark at Airbnb
Apache Spark at Airbnb
 
Lessons Learned Developing and Managing Massive (300TB+) Apache Spark Pipelin...
Lessons Learned Developing and Managing Massive (300TB+) Apache Spark Pipelin...Lessons Learned Developing and Managing Massive (300TB+) Apache Spark Pipelin...
Lessons Learned Developing and Managing Massive (300TB+) Apache Spark Pipelin...
 
Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)Unified Big Data Processing with Apache Spark (QCON 2014)
Unified Big Data Processing with Apache Spark (QCON 2014)
 
Google Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoop
Google Cloud Dataproc - Easier, faster, more cost-effective Spark and HadoopGoogle Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoop
Google Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoop
 
Google Cloud Dataflow
Google Cloud DataflowGoogle Cloud Dataflow
Google Cloud Dataflow
 
Big Data Analytics with Google BigQuery. By Javier Ramirez. All your base Co...
Big Data Analytics with Google BigQuery.  By Javier Ramirez. All your base Co...Big Data Analytics with Google BigQuery.  By Javier Ramirez. All your base Co...
Big Data Analytics with Google BigQuery. By Javier Ramirez. All your base Co...
 
ELK Stack
ELK StackELK Stack
ELK Stack
 

Ähnlich wie Google BigQuery for Everyday Developer

VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google Bi...
VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google Bi...VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google Bi...
VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google Bi...Márton Kodok
 
Complex realtime event analytics using BigQuery @Crunch Warmup
Complex realtime event analytics using BigQuery @Crunch WarmupComplex realtime event analytics using BigQuery @Crunch Warmup
Complex realtime event analytics using BigQuery @Crunch WarmupMárton Kodok
 
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryVoxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryMárton Kodok
 
DevTalks Keynote Powering interactive data analysis with Google BigQuery
DevTalks Keynote Powering interactive data analysis with Google BigQueryDevTalks Keynote Powering interactive data analysis with Google BigQuery
DevTalks Keynote Powering interactive data analysis with Google BigQueryMárton Kodok
 
GDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQuery
GDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQueryGDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQuery
GDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQueryMárton Kodok
 
Supercharge your data analytics with BigQuery
Supercharge your data analytics with BigQuerySupercharge your data analytics with BigQuery
Supercharge your data analytics with BigQueryMárton Kodok
 
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryCodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryMárton Kodok
 
Making advanced analytics accessible to more companies
Making advanced analytics accessible to more companiesMaking advanced analytics accessible to more companies
Making advanced analytics accessible to more companiesMárton Kodok
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for ExperimentationGleb Kanterov
 
2017 09-27 democratize data products with SQL
2017 09-27 democratize data products with SQL2017 09-27 democratize data products with SQL
2017 09-27 democratize data products with SQLYu Ishikawa
 
MongoDB Tick Data Presentation
MongoDB Tick Data PresentationMongoDB Tick Data Presentation
MongoDB Tick Data PresentationMongoDB
 
MongoDB@sfr.fr
MongoDB@sfr.frMongoDB@sfr.fr
MongoDB@sfr.frbeboutou
 
Executive Intro to BigQuery
Executive Intro to BigQueryExecutive Intro to BigQuery
Executive Intro to BigQueryWilliam M. Cohee
 
Building Data Products with BigQuery for PPC and SEO (SMX 2022)
Building Data Products with BigQuery for PPC and SEO (SMX 2022)Building Data Products with BigQuery for PPC and SEO (SMX 2022)
Building Data Products with BigQuery for PPC and SEO (SMX 2022)Christopher Gutknecht
 
IoT NY - Google Cloud Services for IoT
IoT NY - Google Cloud Services for IoTIoT NY - Google Cloud Services for IoT
IoT NY - Google Cloud Services for IoTJames Chittenden
 
Data Architecture at Vente-Exclusive.com - TOTM Exellys
Data Architecture at Vente-Exclusive.com - TOTM ExellysData Architecture at Vente-Exclusive.com - TOTM Exellys
Data Architecture at Vente-Exclusive.com - TOTM ExellysWout Scheepers
 
Exploring Google APIs with Python
Exploring Google APIs with PythonExploring Google APIs with Python
Exploring Google APIs with Pythonwesley chun
 
apidays LIVE Paris 2021 - Building an analytics API by David Wobrock, Botify
apidays LIVE Paris 2021 - Building an analytics API by David Wobrock, Botifyapidays LIVE Paris 2021 - Building an analytics API by David Wobrock, Botify
apidays LIVE Paris 2021 - Building an analytics API by David Wobrock, Botifyapidays
 
When it all GOes right
When it all GOes rightWhen it all GOes right
When it all GOes rightPavlo Golub
 

Ähnlich wie Google BigQuery for Everyday Developer (20)

VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google Bi...
VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google Bi...VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google Bi...
VoxxedDays Bucharest 2017 - Powering interactive data analysis with Google Bi...
 
Complex realtime event analytics using BigQuery @Crunch Warmup
Complex realtime event analytics using BigQuery @Crunch WarmupComplex realtime event analytics using BigQuery @Crunch Warmup
Complex realtime event analytics using BigQuery @Crunch Warmup
 
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQueryVoxxed Days Cluj - Powering interactive data analysis with Google BigQuery
Voxxed Days Cluj - Powering interactive data analysis with Google BigQuery
 
DevTalks Keynote Powering interactive data analysis with Google BigQuery
DevTalks Keynote Powering interactive data analysis with Google BigQueryDevTalks Keynote Powering interactive data analysis with Google BigQuery
DevTalks Keynote Powering interactive data analysis with Google BigQuery
 
GDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQuery
GDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQueryGDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQuery
GDG DevFest Ukraine - Powering Interactive Data Analysis with Google BigQuery
 
Supercharge your data analytics with BigQuery
Supercharge your data analytics with BigQuerySupercharge your data analytics with BigQuery
Supercharge your data analytics with BigQuery
 
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQueryCodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
CodeCamp Iasi - Creating serverless data analytics system on GCP using BigQuery
 
Making advanced analytics accessible to more companies
Making advanced analytics accessible to more companiesMaking advanced analytics accessible to more companies
Making advanced analytics accessible to more companies
 
Using ClickHouse for Experimentation
Using ClickHouse for ExperimentationUsing ClickHouse for Experimentation
Using ClickHouse for Experimentation
 
2017 09-27 democratize data products with SQL
2017 09-27 democratize data products with SQL2017 09-27 democratize data products with SQL
2017 09-27 democratize data products with SQL
 
MongoDB Tick Data Presentation
MongoDB Tick Data PresentationMongoDB Tick Data Presentation
MongoDB Tick Data Presentation
 
MongoDB@sfr.fr
MongoDB@sfr.frMongoDB@sfr.fr
MongoDB@sfr.fr
 
Executive Intro to BigQuery
Executive Intro to BigQueryExecutive Intro to BigQuery
Executive Intro to BigQuery
 
Building Data Products with BigQuery for PPC and SEO (SMX 2022)
Building Data Products with BigQuery for PPC and SEO (SMX 2022)Building Data Products with BigQuery for PPC and SEO (SMX 2022)
Building Data Products with BigQuery for PPC and SEO (SMX 2022)
 
IoT NY - Google Cloud Services for IoT
IoT NY - Google Cloud Services for IoTIoT NY - Google Cloud Services for IoT
IoT NY - Google Cloud Services for IoT
 
Data Architecture at Vente-Exclusive.com - TOTM Exellys
Data Architecture at Vente-Exclusive.com - TOTM ExellysData Architecture at Vente-Exclusive.com - TOTM Exellys
Data Architecture at Vente-Exclusive.com - TOTM Exellys
 
Exploring Google APIs with Python
Exploring Google APIs with PythonExploring Google APIs with Python
Exploring Google APIs with Python
 
apidays LIVE Paris 2021 - Building an analytics API by David Wobrock, Botify
apidays LIVE Paris 2021 - Building an analytics API by David Wobrock, Botifyapidays LIVE Paris 2021 - Building an analytics API by David Wobrock, Botify
apidays LIVE Paris 2021 - Building an analytics API by David Wobrock, Botify
 
When it all GOes right
When it all GOes rightWhen it all GOes right
When it all GOes right
 
Frappe Open Day - August 2018
Frappe Open Day - August 2018Frappe Open Day - August 2018
Frappe Open Day - August 2018
 

Mehr von Márton Kodok

Gen Apps on Google Cloud PaLM2 and Codey APIs in Action
Gen Apps on Google Cloud PaLM2 and Codey APIs in ActionGen Apps on Google Cloud PaLM2 and Codey APIs in Action
Gen Apps on Google Cloud PaLM2 and Codey APIs in ActionMárton Kodok
 
DevBCN Vertex AI - Pipelines for your MLOps workflows
DevBCN Vertex AI - Pipelines for your MLOps workflowsDevBCN Vertex AI - Pipelines for your MLOps workflows
DevBCN Vertex AI - Pipelines for your MLOps workflowsMárton Kodok
 
Discover BigQuery ML, build your own CREATE MODEL statement
Discover BigQuery ML, build your own CREATE MODEL statementDiscover BigQuery ML, build your own CREATE MODEL statement
Discover BigQuery ML, build your own CREATE MODEL statementMárton Kodok
 
Cloud Run - the rise of serverless and containerization
Cloud Run - the rise of serverless and containerizationCloud Run - the rise of serverless and containerization
Cloud Run - the rise of serverless and containerizationMárton Kodok
 
BigQuery best practices and recommendations to reduce costs with BI Engine, S...
BigQuery best practices and recommendations to reduce costs with BI Engine, S...BigQuery best practices and recommendations to reduce costs with BI Engine, S...
BigQuery best practices and recommendations to reduce costs with BI Engine, S...Márton Kodok
 
Vertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
Vertex AI - Unified ML Platform for the entire AI workflow on Google CloudVertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
Vertex AI - Unified ML Platform for the entire AI workflow on Google CloudMárton Kodok
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsMárton Kodok
 
Cloud Workflows What's new in serverless orchestration and automation
Cloud Workflows What's new in serverless orchestration and automationCloud Workflows What's new in serverless orchestration and automation
Cloud Workflows What's new in serverless orchestration and automationMárton Kodok
 
Serverless orchestration and automation with Cloud Workflows
Serverless orchestration and automation with Cloud WorkflowsServerless orchestration and automation with Cloud Workflows
Serverless orchestration and automation with Cloud WorkflowsMárton Kodok
 
Serverless orchestration and automation with Cloud Workflows
Serverless orchestration and automation with Cloud WorkflowsServerless orchestration and automation with Cloud Workflows
Serverless orchestration and automation with Cloud WorkflowsMárton Kodok
 
Serverless orchestration and automation with Cloud Workflows
Serverless orchestration and automation with Cloud WorkflowsServerless orchestration and automation with Cloud Workflows
Serverless orchestration and automation with Cloud WorkflowsMárton Kodok
 
BigdataConference Europe - BigQuery ML
BigdataConference Europe - BigQuery MLBigdataConference Europe - BigQuery ML
BigdataConference Europe - BigQuery MLMárton Kodok
 
DevFest Romania 2020 Keynote: Bringing the Cloud to you.
DevFest Romania 2020 Keynote: Bringing the Cloud to you.DevFest Romania 2020 Keynote: Bringing the Cloud to you.
DevFest Romania 2020 Keynote: Bringing the Cloud to you.Márton Kodok
 
Applying BigQuery ML on e-commerce data analytics
Applying BigQuery ML on e-commerce data analyticsApplying BigQuery ML on e-commerce data analytics
Applying BigQuery ML on e-commerce data analyticsMárton Kodok
 
Vibe Koli 2019 - Utazás az egyetem padjaitól a Google Developer Expertig
Vibe Koli 2019 - Utazás az egyetem padjaitól a Google Developer ExpertigVibe Koli 2019 - Utazás az egyetem padjaitól a Google Developer Expertig
Vibe Koli 2019 - Utazás az egyetem padjaitól a Google Developer ExpertigMárton Kodok
 
BigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQLBigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQLMárton Kodok
 
Google Cloud Platform Solutions for DevOps Engineers
Google Cloud Platform Solutions  for DevOps EngineersGoogle Cloud Platform Solutions  for DevOps Engineers
Google Cloud Platform Solutions for DevOps EngineersMárton Kodok
 
GDG DevFest Romania - Architecting for the Google Cloud Platform
GDG DevFest Romania - Architecting for the Google Cloud PlatformGDG DevFest Romania - Architecting for the Google Cloud Platform
GDG DevFest Romania - Architecting for the Google Cloud PlatformMárton Kodok
 
Next18 Extended Targu Mures - Bringing the Cloud to you
Next18 Extended Targu Mures - Bringing the Cloud to youNext18 Extended Targu Mures - Bringing the Cloud to you
Next18 Extended Targu Mures - Bringing the Cloud to youMárton Kodok
 
6. DISZ - Webalkalmazások skálázhatósága a Google Cloud Platformon
6. DISZ - Webalkalmazások skálázhatósága  a Google Cloud Platformon6. DISZ - Webalkalmazások skálázhatósága  a Google Cloud Platformon
6. DISZ - Webalkalmazások skálázhatósága a Google Cloud PlatformonMárton Kodok
 

Mehr von Márton Kodok (20)

Gen Apps on Google Cloud PaLM2 and Codey APIs in Action
Gen Apps on Google Cloud PaLM2 and Codey APIs in ActionGen Apps on Google Cloud PaLM2 and Codey APIs in Action
Gen Apps on Google Cloud PaLM2 and Codey APIs in Action
 
DevBCN Vertex AI - Pipelines for your MLOps workflows
DevBCN Vertex AI - Pipelines for your MLOps workflowsDevBCN Vertex AI - Pipelines for your MLOps workflows
DevBCN Vertex AI - Pipelines for your MLOps workflows
 
Discover BigQuery ML, build your own CREATE MODEL statement
Discover BigQuery ML, build your own CREATE MODEL statementDiscover BigQuery ML, build your own CREATE MODEL statement
Discover BigQuery ML, build your own CREATE MODEL statement
 
Cloud Run - the rise of serverless and containerization
Cloud Run - the rise of serverless and containerizationCloud Run - the rise of serverless and containerization
Cloud Run - the rise of serverless and containerization
 
BigQuery best practices and recommendations to reduce costs with BI Engine, S...
BigQuery best practices and recommendations to reduce costs with BI Engine, S...BigQuery best practices and recommendations to reduce costs with BI Engine, S...
BigQuery best practices and recommendations to reduce costs with BI Engine, S...
 
Vertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
Vertex AI - Unified ML Platform for the entire AI workflow on Google CloudVertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
Vertex AI - Unified ML Platform for the entire AI workflow on Google Cloud
 
Vertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflowsVertex AI: Pipelines for your MLOps workflows
Vertex AI: Pipelines for your MLOps workflows
 
Cloud Workflows What's new in serverless orchestration and automation
Cloud Workflows What's new in serverless orchestration and automationCloud Workflows What's new in serverless orchestration and automation
Cloud Workflows What's new in serverless orchestration and automation
 
Serverless orchestration and automation with Cloud Workflows
Serverless orchestration and automation with Cloud WorkflowsServerless orchestration and automation with Cloud Workflows
Serverless orchestration and automation with Cloud Workflows
 
Serverless orchestration and automation with Cloud Workflows
Serverless orchestration and automation with Cloud WorkflowsServerless orchestration and automation with Cloud Workflows
Serverless orchestration and automation with Cloud Workflows
 
Serverless orchestration and automation with Cloud Workflows
Serverless orchestration and automation with Cloud WorkflowsServerless orchestration and automation with Cloud Workflows
Serverless orchestration and automation with Cloud Workflows
 
BigdataConference Europe - BigQuery ML
BigdataConference Europe - BigQuery MLBigdataConference Europe - BigQuery ML
BigdataConference Europe - BigQuery ML
 
DevFest Romania 2020 Keynote: Bringing the Cloud to you.
DevFest Romania 2020 Keynote: Bringing the Cloud to you.DevFest Romania 2020 Keynote: Bringing the Cloud to you.
DevFest Romania 2020 Keynote: Bringing the Cloud to you.
 
Applying BigQuery ML on e-commerce data analytics
Applying BigQuery ML on e-commerce data analyticsApplying BigQuery ML on e-commerce data analytics
Applying BigQuery ML on e-commerce data analytics
 
Vibe Koli 2019 - Utazás az egyetem padjaitól a Google Developer Expertig
Vibe Koli 2019 - Utazás az egyetem padjaitól a Google Developer ExpertigVibe Koli 2019 - Utazás az egyetem padjaitól a Google Developer Expertig
Vibe Koli 2019 - Utazás az egyetem padjaitól a Google Developer Expertig
 
BigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQLBigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQL
 
Google Cloud Platform Solutions for DevOps Engineers
Google Cloud Platform Solutions  for DevOps EngineersGoogle Cloud Platform Solutions  for DevOps Engineers
Google Cloud Platform Solutions for DevOps Engineers
 
GDG DevFest Romania - Architecting for the Google Cloud Platform
GDG DevFest Romania - Architecting for the Google Cloud PlatformGDG DevFest Romania - Architecting for the Google Cloud Platform
GDG DevFest Romania - Architecting for the Google Cloud Platform
 
Next18 Extended Targu Mures - Bringing the Cloud to you
Next18 Extended Targu Mures - Bringing the Cloud to youNext18 Extended Targu Mures - Bringing the Cloud to you
Next18 Extended Targu Mures - Bringing the Cloud to you
 
6. DISZ - Webalkalmazások skálázhatósága a Google Cloud Platformon
6. DISZ - Webalkalmazások skálázhatósága  a Google Cloud Platformon6. DISZ - Webalkalmazások skálázhatósága  a Google Cloud Platformon
6. DISZ - Webalkalmazások skálázhatósága a Google Cloud Platformon
 

Kürzlich hochgeladen

(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendArshad QA
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️anilsa9823
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 

Kürzlich hochgeladen (20)

(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 
Exploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the ProcessExploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the Process
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 

Google BigQuery for Everyday Developer

  • 1. Google BigQuery for the Everyday Developer Márton Kodok Senior Software Engineer at REEA twitter: martonkodok stackoverflow: pentium10 github: @pentium10 IV. IT&C Innovation Conference - October 2016 - Sovata, Romania
  • 2. 1. Big Data Challenges 2. Infrastructure overview 3. Choosing a Strategy 4. BigQuery - Use Cases 5. Q & A BigQuery for the Everyday Developer @martonkodok Agenda
  • 3. BigQuery for the Everyday Developer @martonkodok The 5 V’s of Big Data
  • 4. Every scientist who needs big data analytics to save millions of lives should have that power Legacy systems don’t provide the power. The simple fact is that you are brilliant but your brilliant ideas require complex analytics. Traditional solutions are not applicable. BigQuery for the Everyday Developer @martonkodok Big Data movement
  • 5. With more and more people realizing the value of realtime analytics, the race to build realtime middleware has begun. The Plan: have oversight over developments as they happen. * More slides about Beanstalkd at slideshare.net/martonkodok BigQuery for the Everyday Developer @martonkodok Fast computing systems
  • 6. BigQuery for the Everyday Developer @martonkodok Simple legacy Website stack/infrastructure
  • 7. BigQuery for the Everyday Developer @martonkodok Infrastructure with Middleware
  • 8. Task: Need backend/database for STORE, QUERY, EXTRACT deep analytics: ● Events (website events, app events, email events) ● Entities (user data, business entities, A/B split test data) ● Metrics (sales data, code commit history, profiler data) ● Streaming (sensor data, IoT data burst) ● 3rd party Analytics (Google Analytics, Mixpanel) ● Local generated data (log file entries, trace output, unstructured data...) ❏ Run Ad-Hoc queries (without Developer) ❏ Be realtime BigQuery for the Everyday Developer @martonkodok Requirements
  • 9. ● Terabyte scalable storage ● Real-time row ingestion ● Ask sophisticated queries ● Query-performance ● Low-maintenance ● Cost effective ● Wire them up easily Goal: Store everything accessible by SQL immediately. BigQuery for the Everyday Developer @martonkodok Desired system/platform Equipment strategy * people still required Engines: ❏ MongoDB, Riak, Redis ❏ ELK Stack (Elastic-Logstash-Kibana)... ❏ Cassandra, Hive, Hadoop... ❏ Amazon RedShift, Google BigQuery... In-House Hosted Managed
  • 10. BigQuery for the Everyday Developer @martonkodok Google BigQuery
  • 11. ● Analytics-as-a-Service - Data Warehouse in the Cloud ● Fully-Managed by Google (US or EU zone) ● Scales into Petabytes ● Ridiculously fast ● Decent pricing (queries $5/TB, storage: $20/TB) *October 2016 pricing ● 100.000 rows / sec Streaming API ● Open Interfaces (Web UI, BQ command line tool, REST, ODBC) ● Familiar DB Structure (table, views, record, nested, JSON) ● Convenience of SQL + Javascript UDF (User Defined Functions) ● Integrates with Google Sheets + Google Cloud Storage + Pub/Sub connectors ● Client libraries available in YFL (your favorite languages) BigQuery for the Everyday Developer @martonkodok What is BigQuery?
  • 12. ● ACL - row level locking (individual or group based) ● Columnar storage (max 10 000 columns in table) ● Rich SQL: JSON, IP, Math, RegExp, Window functions ● Data types: String, Integer, Float, Boolean, Timestamp, Record, Nested, Struct, Array. ● Append-only tables prefered (DML syntax available) ● Batch load file size limits: 1TB (CSV or JSON) ● User Defined Functions in SQL or Javascript ● Date partitioned tables BigQuery for the Everyday Developer @martonkodok BigQuery: Convenience of SQL
  • 13. * 1 Petabyte storage, 10 TB inserts, 100 TB queries => $22000 Queries Storage Ingestion ➔ 1 TB per month free ➔ 5 USD per TB ➔ only pay for the columns you use in your query ➔ 20 USD per TB frequently accessed data ➔ 10 USD per TB long term storage 90 days ➔ Batch load free (CSV/JSON) ➔ Exporting free ➔ Table copy free ➔ 50 USD per TB Estimate 1 - Storage 5 TB - Streaming Inserts 1 TB - Queries 3 TB Monthly total: $165 Estimate 2 - Storage 24 TB - Streaming Inserts 1 TB - Queries 50 TB Monthly total: $788 BigQuery for the Everyday Developer @martonkodok BigQuery Costs - October 2016
  • 14. BigQuery for the Everyday Developer @martonkodok +--------------------------+-----------+----------+ | order_id | INTEGER | REQUIRED | | ... | | | Example: | products | RECORD | REPEATED | ”products”:[p1,p2] | products.name | STRING | NULLABLE | | products.product_id | INTEGER | NULLABLE | ”products”:[{”name”:”p1”, | products.attributes | STRING | REPEATED | ”product_id”:10, | products.price | FLOAT | NULLABLE | ”attributes”:[”red”,”xl”] | ... | | | ,”price”:9.99}, | common | RECORD | NULLABLE | {”name”:”p2”, | common.insert_id | INTEGER | REQUIRED | ”product_id”:20, | common.event | INTEGER | REQUIRED | ”attributes”:[“red”,”xl”] | common.user_id | INTEGER | REQUIRED | ,”price”:9.99}] | common.timestamp | TIMESTAMP | REQUIRED | | common.utm | RECORD | NULLABLE | | common.utm.source | STRING | NULLABLE | | common.utm.medium | STRING | NULLABLE | | common.utm.campaign | STRING | NULLABLE | | common.utm.content | STRING | NULLABLE | | common.utm.term | STRING | NULLABLE | | meta | STRING | NULLABLE | +--------------------------+-----------+----------+ Schema modelling / JSON
  • 15. 1. Use aggregation MIN/MAX on timestamp to find first/last and join back to the same table. 2. Use analytic functions FIRST_VALUE and LAST_VALUE when only one column is needed SELECT LAST_VALUE(email) OVER( PARTITION BY user_id ORDER BY timestamp ASC) AS email_last ... 3. Using Window Functions for full row selection SELECT email, firstname, lastname FROM (SELECT email, firstname, lastname row_number() over (partition BY user_id ORDER BY timestamp DESC) seqnum FROM [profile_event] ) WHERE seqnum=1 BigQuery for the Everyday Developer @martonkodok Append only tables - Get last value
  • 16. BigQuery for the Everyday Developer @martonkodok Streaming insert time (ms) - last year
  • 17. ● Funnel Analysis BigQuery for the Everyday Developer @martonkodok Achievements
  • 18. BigQuery for the Everyday Developer @martonkodok Funnel analysis: Time on upsell pages
  • 19. Example HITS chain: ● article1 -> page2 -> page3 -> page4 -> orderpage1 -> thankyoupage1 ● page1 -> article2-> page3 -> orderpage2 -> ... BigQuery for the Everyday Developer @martonkodok Attribute credit to first article visited on purchase
  • 20. ● Funnel Analysis ● Email URL click heatmap BigQuery for the Everyday Developer @martonkodok Achievements
  • 21. BigQuery for the Everyday Developer @martonkodok Email URL clicks heat-map
  • 22. ● Funnel Analysis ● Email URL click heatmap ● Email Health Dashboard (SPAM, ISP deferral, content A/B split tests, trends or low open rate campaigns) ● Advanced segmentation (all raw data stored) ● Behavioral analytics - engaged users etc... BigQuery for the Everyday Developer @martonkodok Achievements Continued
  • 23. ● no provisioning/deploy ● no running out of resources ● no more focus on large scale execution plan ● no need to re-implement tricky concepts (time windows / join streams) ● pay only the columns we have in your queries ● run raw ad-hoc queries (either by analysts/sales or Devs) ● no more throwing away-, expiring-, aggregating old data. BigQuery for the Everyday Developer @martonkodok Our benefits
  • 24. 1. githubarchive.org: 20+ event types available since 2012 a. pull request latency b. expressions, emotions in commit messages c. etc... 2. httparchive.org a. sites that use multiple popular JS frameworks b. sites that use multiple jQuery versions 3. Export for Google Analytics 4. MLab - broadband connection performance (26B rows) 5. GSOD - samples of weather (rainfall, temp…) 6. US Natality - 1969 to 2008 7. Wikipedia edits BigQuery for the Everyday Developer @martonkodok BigQuery: Sample projects to try out
  • 25. BigQuery for the Everyday Developer @martonkodok HttpArchive - multiple JS frameworks
  • 26. BigQuery for the Everyday Developer @martonkodok HttpArchive - multiple jQuery versions
  • 27. Questions? Thank you. Slides available soon on: slideshare.net/martonkodok