Ta Virot Chiraphadhanakul, PhD (@tvirot)
GDE in Machine Learning | Managing Director @ Skooldio
Data Science on Google Cloud Platform
Google Developers
Launchpad Build for Cloud Meetup, Bangkok
90% of the data in the world
today has been created in
the last two years alone.
— IBM
Turning data into …
Metrics
Insights
Data Products
Data Science Process

Collect Manipulate Analyze Model
|
Communicate
A lot of tools needed
Pig Airflow
Administrative and operational issues
• Deployment and configuration
• Managing scale and optimizing utilization
• Reliability
• Resource provisioning
• etc.
How about a serverless big
data stack that scales
automatically?
Serverless Data Processing
• Focus on insights, not administration
• Practically infinite scale, exactly when you need it
• Pay only for what you use
• Freedom to experiment, fail quickly, and iterate. Successful experiments are
ready to go live right away
Data Science on Google Cloud Platform
• Storage & Databases
• Big Data
• Machine Learning

Collect Manipulate Analyze Model
|
Communicate
Storage & Databases
Cloud Storage
A scalable object storage service
suitable for all kinds of
unstructured data
Cloud SQL
A fully-managed database service
that makes it easy to set up,
maintain, manage, and administer
your relational MySQL and
PostgreSQL databases in the cloud
Cloud Datastore
A highly-scalable NoSQL database
for your applications. Cloud
Datastore automatically handles
sharding and replication.
Cloud Bigtable
A massively scalable NoSQL
database suitable for low-latency
and high-throughput workloads. It
supports the open-source,
industry-standard HBase API
Cloud Pub/Sub
Fully managed real-time messaging service that allows you to send and receive messages between independent applications
• Connect Anything to Everything: Use Cloud Pub/Sub to publish and subscribe to data from multiple sources, reducing dependencies between components of distributed applications
• Highly Scalable: Any customer can send up to 10,000 messages per second, by default
• Guaranteed Delivery: Designed to provide "at least once" delivery
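"At least once" delivery means a subscriber can occasionally receive the same message more than once, so message handlers are usually written to be idempotent. The following is a minimal plain-Java sketch of that idea, assuming every message carries a unique ID. The `IdempotentHandler` class and its methods are hypothetical and are not part of the Cloud Pub/Sub client library.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical subscriber-side handler; not part of the Cloud Pub/Sub client library.
final class IdempotentHandler {
    // IDs of messages already handled. Kept in memory only for this sketch;
    // a real system would need to persist or expire them.
    private final Set<String> seenIds = new HashSet<>();
    private final List<String> processed = new ArrayList<>();

    // Returns true if the payload was processed, false if the ID was a redelivery.
    boolean handle(String messageId, String payload) {
        if (!seenIds.add(messageId)) {
            return false; // duplicate delivery: skip the side effect
        }
        processed.add(payload);
        return true;
    }

    List<String> processed() {
        return processed;
    }

    public static void main(String[] args) {
        IdempotentHandler handler = new IdempotentHandler();
        handler.handle("m1", "signup");
        handler.handle("m2", "purchase");
        handler.handle("m1", "signup"); // same message delivered again
        System.out.println(handler.processed()); // prints [signup, purchase]
    }
}
```

Deduplicating on the message ID keeps the side effect (here, appending to a list) from running twice when a message is redelivered.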

Collect Manipulate Analyze Model
|
Communicate
Cloud Dataflow
Fully managed data processing service, supporting both stream and batch execution of pipelines
• Fully Managed: Dynamically provision resources to minimize latency while maintaining high utilization efficiency
• Unified Programming Model: Express computational requirements regardless of data source
https://cloud.google.com/dataflow/examples/wordcount-example
// Create a Pipeline
Pipeline p = Pipeline.create(options);

// Read lines
p.apply(TextIO.Read.from("gs://dataflow-samples/shakespeare/kinglear.txt"))
  // Tokenize lines into words
  .apply(ParDo.named("ExtractWords").of(new DoFn<String, String>() {
    @Override
    public void processElement(ProcessContext c) {
      for (String word : c.element().split("[^a-zA-Z']+")) {
        if (!word.isEmpty()) {
          c.output(word);
        }
      }
    }
  }))
  // Count words
  .apply(Count.<String>perElement())
  // Format strings
  .apply(MapElements.via(new SimpleFunction<KV<String, Long>, String>() {
    @Override
    public String apply(KV<String, Long> element) {
      return element.getKey() + ": " + element.getValue();
    }
  }))
  // Write to file
  .apply(TextIO.Write.to("gs://my-bucket/counts.txt"));
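The tokenize, count, and format steps can be traced on a few lines of text without a cluster. The sketch below is plain Java and does not use the Dataflow SDK. It reuses the slide's regular expression and its "word: count" output format. The `LocalWordCount` class and its method names are hypothetical, and a `TreeMap` is used only to make the output order stable, an ordering the pipeline itself does not promise.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local stand-in for the pipeline steps; it does not use the Dataflow SDK.
final class LocalWordCount {
    // Tokenize lines into words, using the same regular expression as the slide.
    static List<String> extractWords(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.split("[^a-zA-Z']+")) {
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    // Count occurrences of each word; TreeMap keeps the keys sorted for stable output.
    static Map<String, Long> countWords(List<String> words) {
        Map<String, Long> counts = new TreeMap<>();
        for (String word : words) {
            counts.merge(word, 1L, Long::sum);
        }
        return counts;
    }

    // Format each entry as "word: count", mirroring the formatting step.
    static List<String> format(Map<String, Long> counts) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Long> entry : counts.entrySet()) {
            out.add(entry.getKey() + ": " + entry.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("Nothing will come of nothing", "speak again");
        for (String row : format(countWords(extractWords(lines)))) {
            System.out.println(row);
        }
    }
}
```

Matching is case-sensitive, so "Nothing" and "nothing" are counted as separate words, just as they would be with the slide's regular expression.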
Different Pipeline Shapes
• Multiple transforms + merging
• Joining multiple sources
https://cloud.google.com/dataflow/pipelines/design-principles
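The two shapes can be illustrated with ordinary in-memory collections. This is only a sketch of the data flow, not Dataflow SDK code. The `PipelineShapes` class, its method names, and the sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory illustration of the two shapes; it does not use the Dataflow SDK.
final class PipelineShapes {
    // Shape 1: apply two different transforms to the same input, then merge the branches.
    static List<String> branchAndMerge(List<String> names) {
        List<String> merged = new ArrayList<>();
        for (String name : names) {
            if (name.startsWith("A")) {
                merged.add(name.toUpperCase()); // branch 1
            }
        }
        for (String name : names) {
            if (name.startsWith("B")) {
                merged.add(name.toLowerCase()); // branch 2
            }
        }
        return merged;
    }

    // Shape 2: join two sources on a shared key, keeping only keys present in both.
    static List<String> joinByKey(Map<String, String> emails, Map<String, String> phones) {
        List<String> joined = new ArrayList<>();
        for (Map.Entry<String, String> entry : emails.entrySet()) {
            String phone = phones.get(entry.getKey());
            if (phone != null) {
                joined.add(entry.getKey() + ": " + entry.getValue() + ", " + phone);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        System.out.println(branchAndMerge(List.of("Ann", "Bob", "Cy")));

        Map<String, String> emails = new TreeMap<>();
        emails.put("ann", "ann@example.com");
        emails.put("bob", "bob@example.com");
        Map<String, String> phones = new TreeMap<>();
        phones.put("ann", "555-0100");
        System.out.println(joinByKey(emails, phones));
    }
}
```

In the first shape both branches read the same input and their outputs are concatenated. In the second, records from two independent sources are matched on a key.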
[Diagram: streaming sources (events, metrics) and batch sources (raw logs, databases, etc.) with Cloud Pub/Sub, Cloud Storage, and Cloud Dataflow]
Cloud Dataproc
Managed Spark and Hadoop service that is fast, easy to use, and low cost
• Fast & Scalable Data Processing: Create clusters in minutes and resize them at any time
• Affordable Pricing: Based on actual use, measured by the minute
• Open Source Ecosystem: Move existing projects or ETL pipelines without redevelopment
[Diagram: streaming sources (events, metrics) and batch sources (raw logs, databases, etc.) with Cloud Pub/Sub, Cloud Storage, Cloud Dataflow, and Cloud Dataproc (batch)]
Cloud Dataprep
An intelligent data service for visually exploring, cleaning, and preparing data
• Visually explore data
• Intelligent data manipulation
• Serverless and works at any scale
https://cloud.google.com/dataprep/
BigQuery
Google's fully managed, petabyte-scale, low-cost enterprise data warehouse for analytics
• Fully Managed: No infrastructure to manage, and you don't need a database administrator
• Speed & Scale: Scans terabytes in seconds and petabytes in minutes
• Convenience of SQL: Makes it more accessible
• Security & Reliability: Automatically encrypts and replicates data
• Flexible Data Ingestion: Load your data from Google Cloud Storage or Google Cloud Datastore, or stream it
• Fully Integrated: With other Google Cloud products and third-party applications
https://cloud.google.com/blog/big-data/2016/01/anatomy-of-a-bigquery-query
How fast is BigQuery really?
[Diagram: streaming sources (events, metrics) and batch sources (raw logs, databases, etc.) with Cloud Pub/Sub, Cloud Storage, Cloud Dataflow, Cloud Dataproc (batch), and BigQuery]

Collect Manipulate Analyze Model
|
Communicate
Cloud Datalab
An easy-to-use interactive tool for data exploration, analysis, visualization, and machine learning
• Integrated & Open Source: Built on Jupyter (formerly IPython). Enables analysis of your data on BigQuery, ML Engine, Compute Engine, and Cloud Storage
Photo: https://github.com/googledatalab/datalab
Cloud Data Studio
Turns your data into informative dashboards and reports that are easy to read, easy to share, and fully customizable
• Put all your data to work: Easily access all the data sources you need to understand your business and make better decisions
• Build engaging visualizations: Create beautiful charts and graphs that bring your data to life
• Leverage teamwork that works: Share and collaborate in real time. Work together quickly, from anywhere.
https://datastudio.google.com

Collect Manipulate Analyze Model
|
Communicate
Artificial Intelligence
is the new electricity.
— Andrew Ng
AlphaGo
The first computer program to
beat a professional human Go
player
Photo: Nature
Waymo
The Google self-driving car
project became Waymo with a
mission to make it easy and
safe for people and things to
move around
Photo: Waymo
Machine Learning Engine and APIs
• Custom ML models: Machine Learning Engine, TensorFlow
• Pre-trained ML models: Vision API, Translation API, Natural Language API, Speech API, Jobs API
Google Cloud Vision API
Understand the content of images
• Label Detection
• Optical Character Recognition
• Explicit Content Detection
• etc.
+
https://m.me/youpin.city | https://youpin.city/app
Google Cloud Speech API
Convert audio to text by applying powerful neural network models in an easy-to-use API
Cloud ML Engine
A managed service that enables you to easily build machine learning models that work on any type of data, of any size
• Scalable Service: Managed distributed training infrastructure that supports CPUs and GPUs
• HyperTune: Automatically tune your hyperparameters with HyperTune
• Deep Learning Capabilities: Supports any TensorFlow model
[Diagram: streaming sources (events, metrics) and batch sources (raw logs, databases, etc.) with Cloud Pub/Sub, Cloud Storage, Cloud Dataflow, Cloud Dataproc (batch), BigQuery, and Cloud Machine Learning Engine]
Large-Scale Deep Learning for Intelligent Computer Systems, Jeff Dean, WSDM 2016
http://playground.tensorflow.org/
TensorKart
Self-driving MarioKart with
TensorFlow
http://kevinhughes.ca/blog/tensor-kart
Cucumber Sorter
"Farmers want to focus and
spend their time on growing
delicious vegetables."
— Makoto Koike
Photos: Google Cloud Platform / Kaz Sato
https://codelabs.developers.google.com/codelabs/cloud-tensorflow-mnist
Serverless

Less ops and administration
No waiting

Queries that used to take hours or days
now take minutes or seconds
Machine Intelligence

Gives everyone access to the deep learning
systems
Thank you!
Ta Virot Chiraphadhanakul, PhD (@tvirot)
GDE in Machine Learning | Managing Director @ Skooldio


Data Science on Google Cloud Platform

  • 1. Ta Virot Chiraphadhanakul, PhD (@tvirot) GDE in Machine Learning | Managing Director @ Skooldio Data Science on Google Cloud Platform Google Developers Launchpad Build for Cloud Meetup, Bangkok
  • 2. 90% of the data in the world today has been created in the last two years alone. — IBM
  • 3. Turning data into … ÇĄ Metrics Insights Data Products @tvirot
  • 4. Data Science Process  Collect Manipulate Analyze Model | Communicate @tvirot
  • 5. A lot of tools needed Pig Airflow @tvirot
  • 6. Administrative and operational issues • Deployment and configuration • Managing scale and optimizing utilization • Reliability • Resource provisioning • etc.
  • 8. How about a serverless big data stack that scales automatically?
  • 9. Serverless Data Processing • Focus on insights, not administration • Practically infinite scale, exactly when you need it • Pay only for what you use • Freedom to experiment, fail quickly, and iterate. Successful experiments are ready to go live right away
  • 10. Storage & Databases Big Data Machine Learning Data Science on Google Cloud Platform
  • 11.  Collect Manipulate Analyze Model | Communicate
  • 12. Storage & Databases Cloud Storage A scalable object storage service suitable for all kinds of unstructured data Cloud SQL A fully-managed database service that makes it easy to set up, maintain, manage, and administer your relational MySQL and PostgreSQL databases in the cloud Cloud Datastore A highly-scalable NoSQL database for your applications. Cloud Datastore automatically handles sharding and replication. Cloud BigTable A massively scalable NoSQL database suitable for low-latency and high-throughput workloads. It supports the open-source, industry- standard HBase API
  • 13. Fully-managed real-time messaging service that allows you to send and receive messages between independent applications Connect Anything to Everything
 Use Cloud Pub/Sub to publish and subscribe to data from multiple sources, reducing dependencies between components of distributed applications Highly Scalable
 Any customer can send up to 10,000 messages per second, by default
 
 Guaranteed Delivery
 Designed to provide “at least once” delivery Cloud Pub/Sub
  • 14.  Collect Manipulate Analyze Model | Communicate
  • 15. Fully-managed data processing service, supporting both stream and batch execution of pipelines Fully Managed
 Dynamically provision resources to minimize latency while maintaining high utilization efficiency Unified Programming Model
 Express computational requirements regardless of data source Cloud Dataflow
  • 17. Pipeline p = Pipeline.create(options); p.apply(TextIO.Read.from(“gs://dataflow-samples/shakespeare/kinglear.txt”)) Create a Pipeline https://cloud.google.com/dataflow/examples/wordcount-example
  • 18. Pipeline p = Pipeline.create(options); p.apply(TextIO.Read.from(“gs://dataflow-samples/shakespeare/kinglear.txt”)) .apply(ParDo.named("ExtractWords").of(new DoFn<String, String>() {      @Override      public void processElement(ProcessContext c) {        for (String word : c.element().split("[^a-zA-Z']+")) {          if (!word.isEmpty()) {            c.output(word);          }        }      } })) Read lines Create a Pipeline https://cloud.google.com/dataflow/examples/wordcount-example
  • 19. Pipeline p = Pipeline.create(options); p.apply(TextIO.Read.from(“gs://dataflow-samples/shakespeare/kinglear.txt”)) .apply(ParDo.named("ExtractWords").of(new DoFn<String, String>() {      @Override      public void processElement(ProcessContext c) {        for (String word : c.element().split("[^a-zA-Z']+")) {          if (!word.isEmpty()) {            c.output(word);          }        }      } }))   .apply(Count.<String>perElement())   Read lines Create a Pipeline Tokenize lines into words https://cloud.google.com/dataflow/examples/wordcount-example
  • 20. Pipeline p = Pipeline.create(options); p.apply(TextIO.Read.from(“gs://dataflow-samples/shakespeare/kinglear.txt”)) .apply(ParDo.named("ExtractWords").of(new DoFn<String, String>() {      @Override      public void processElement(ProcessContext c) {        for (String word : c.element().split("[^a-zA-Z']+")) {          if (!word.isEmpty()) {            c.output(word);          }        }      } }))   .apply(Count.<String>perElement())   .apply(MapElements.via(new SimpleFunction<KV<String, Long>, String>() {      @Override      public String apply(KV<String, Long> element) {        return element.getKey() + ": " + element.getValue();      }   })) Read lines Create a Pipeline Tokenize lines into words Count words https://cloud.google.com/dataflow/examples/wordcount-example
  • 21. Pipeline p = Pipeline.create(options); p.apply(TextIO.Read.from(“gs://dataflow-samples/shakespeare/kinglear.txt”)) .apply(ParDo.named("ExtractWords").of(new DoFn<String, String>() {      @Override      public void processElement(ProcessContext c) {        for (String word : c.element().split("[^a-zA-Z']+")) {          if (!word.isEmpty()) {            c.output(word);          }        }      } }))   .apply(Count.<String>perElement())   .apply(MapElements.via(new SimpleFunction<KV<String, Long>, String>() {      @Override      public String apply(KV<String, Long> element) {        return element.getKey() + ": " + element.getValue();      }   })) .apply(TextIO.Write.to("gs://my-bucket/counts.txt")); Format strings Read lines Create a Pipeline Tokenize lines into words Count words https://cloud.google.com/dataflow/examples/wordcount-example
  • 22. Pipeline p = Pipeline.create(options); p.apply(TextIO.Read.from(“gs://dataflow-samples/shakespeare/kinglear.txt”)) .apply(ParDo.named("ExtractWords").of(new DoFn<String, String>() {      @Override      public void processElement(ProcessContext c) {        for (String word : c.element().split("[^a-zA-Z']+")) {          if (!word.isEmpty()) {            c.output(word);          }        }      } }))   .apply(Count.<String>perElement())   .apply(MapElements.via(new SimpleFunction<KV<String, Long>, String>() {      @Override      public String apply(KV<String, Long> element) {        return element.getKey() + ": " + element.getValue();      }   })) .apply(TextIO.Write.to("gs://my-bucket/counts.txt")); Format strings Read lines Create a Pipeline Tokenize lines into words Write to file Count words https://cloud.google.com/dataflow/examples/wordcount-example
  • 24. Diagram: Events, metrics (stream) and raw logs, databases, etc. (batch); Cloud Pub/Sub; Cloud Storage; Cloud Dataflow
  • 25. Cloud Dataproc: Managed Spark and Hadoop service which is fast, easy to use, and low cost
 Fast & Scalable Data Processing: Create a cluster in minutes and resize it at any time
 Affordable Pricing: Based on actual use, measured by the minute
 Open Source Ecosystem: Move existing projects or ETL pipelines without redevelopment
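To make "measured by the minute" concrete, here is a small arithmetic sketch comparing per-minute billing with billing rounded up to whole hours. The class `BillingSketch`, the rate of 120 cents per hour, and the 25-minute job are made-up placeholders, not Dataproc prices, and the sketch assumes no minimum charge.

```java
// Hypothetical illustration of per-minute vs. per-hour billing granularity.
final class BillingSketch {

    // Usage is rounded up to whole minutes, then charged at 1/60 of the hourly rate.
    static long perMinuteCents(long secondsUsed, long centsPerHour) {
        long minutes = (secondsUsed + 59) / 60;
        return Math.round(minutes * centsPerHour / 60.0);
    }

    // Usage is rounded up to whole hours before charging.
    static long perHourCents(long secondsUsed, long centsPerHour) {
        long hours = (secondsUsed + 3599) / 3600;
        return hours * centsPerHour;
    }

    public static void main(String[] args) {
        long jobSeconds = 25 * 60;   // a 25-minute job (placeholder)
        long rate = 120;             // placeholder rate in cents per hour
        System.out.println("per-minute: " + perMinuteCents(jobSeconds, rate) + " cents");
        System.out.println("per-hour:   " + perHourCents(jobSeconds, rate) + " cents");
    }
}
```

With these placeholder numbers the short job costs 50 cents under per-minute billing and 120 cents under hourly rounding, which is why fine-grained billing matters for clusters that are created, used briefly, and deleted.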
  • 27. Diagram: Events, metrics (stream) and raw logs, databases, etc. (batch); Cloud Pub/Sub; Cloud Storage; Cloud Dataflow; Cloud Dataproc (batch)
  • 28. Cloud Dataprep: An intelligent data service for visually exploring, cleaning, and preparing data. Visually explore data; intelligent data manipulation; serverless and works at any scale
  • 30. BigQuery: Google's fully managed, petabyte-scale, low-cost enterprise data warehouse for analytics
 Fully Managed: No infrastructure to manage, and you don't need a database administrator
 Speed & Scale: Scans TB in seconds and PB in minutes
 Convenience of SQL: Makes it more accessible
 Security & Reliability: Automatically encrypts and replicates data
  • 31. BigQuery (continued)
 Flexible Data Ingestion: Load your data from Google Cloud Storage or Google Cloud Datastore, or stream it
 Fully Integrated: With other Google Cloud products and third-party applications
  • 33. Diagram: Events, metrics (stream) and raw logs, databases, etc. (batch); Cloud Pub/Sub; Cloud Storage; Cloud Dataflow; Cloud Dataproc (batch); BigQuery
  • 34.  Collect Manipulate Analyze Model | Communicate
  • 35. Cloud Datalab: An easy-to-use interactive tool for data exploration, analysis, visualization, and machine learning
 Integrated & Open Source: Built on Jupyter (formerly IPython). Enables analysis of your data on BigQuery, ML Engine, Compute Engine, and Cloud Storage
  • 37. Cloud Data Studio: Turns your data into informative dashboards and reports that are easy to read, easy to share, and fully customizable
 Put all your data to work: Easily access all the data sources you need to understand your business and make better decisions
 Build engaging visualizations: Create beautiful charts and graphs that bring your data to life
 Leverage teamwork that works: Share and collaborate in real time. Work together quickly, from anywhere.
  • 40.  Collect Manipulate Analyze Model | Communicate
  • 41. Artificial Intelligence is the new electricity. — Andrew Ng
  • 42. AlphaGo The first computer program to beat a professional human Go player Photo: Nature
  • 43. Waymo The Google self-driving car project became Waymo with a mission to make it easy and safe for people and things to move around Photo: Waymo
  • 44. Machine Learning engine and APIs. Custom ML models: Machine Learning Engine, TensorFlow. Pre-trained ML models: Vision API, Translation API, Natural Language API, Speech API, Jobs API
  • 45. Google Cloud Vision API: Understand the content of images • Label Detection • Optical Character Recognition • Explicit Content Detection • etc. https://m.me/youpin.city | https://youpin.city/app
  • 46. Google Cloud Speech API: Convert audio to text by applying powerful neural network models in an easy-to-use API
  • 47. Cloud ML Engine: A managed service that enables you to easily build machine learning models that work on any type of data, of any size
 Scalable Service: Managed distributed training infrastructure that supports CPUs and GPUs
 HyperTune: Automatically tunes your hyperparameters
 Deep Learning Capabilities: Supports any TensorFlow model
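The slide does not say how HyperTune searches the hyperparameter space, so the sketch below does not attempt to reproduce it. It uses plain grid search as a stand-in to show what "tuning hyperparameters automatically" means in its simplest form: try each combination, score it, keep the best. The class `GridSearchSketch`, the two hyperparameters (learning rate and dropout), and the toy loss function are all assumptions for illustration; in practice the score would come from training and validating a model.

```java
import java.util.function.DoubleBinaryOperator;

// Hypothetical grid search over two hyperparameters; not HyperTune's algorithm.
final class GridSearchSketch {

    // Evaluates every (learningRate, dropout) pair and returns the pair with the
    // lowest loss as {learningRate, dropout}. Returns null if either grid is empty.
    static double[] bestParams(double[] learningRates, double[] dropouts,
                               DoubleBinaryOperator validationLoss) {
        double bestLoss = Double.POSITIVE_INFINITY;
        double[] best = null;
        for (double lr : learningRates) {
            for (double dropout : dropouts) {
                double loss = validationLoss.applyAsDouble(lr, dropout);
                if (loss < bestLoss) {
                    bestLoss = loss;
                    best = new double[] {lr, dropout};
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy loss with its minimum at lr = 0.01, dropout = 0.5 (a placeholder for
        // a real train-and-validate run).
        DoubleBinaryOperator toyLoss =
                (lr, d) -> (lr - 0.01) * (lr - 0.01) + (d - 0.5) * (d - 0.5);
        double[] best = bestParams(
                new double[] {0.001, 0.01, 0.1},
                new double[] {0.2, 0.5, 0.8},
                toyLoss);
        System.out.println("learning rate = " + best[0] + ", dropout = " + best[1]);
    }
}
```

Grid search costs one full evaluation per combination, which grows quickly with the number of hyperparameters; a managed tuning service is attractive precisely because it runs those trials on distributed infrastructure for you.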
  • 48. Diagram: Events, metrics (stream) and raw logs, databases, etc. (batch); Cloud Pub/Sub; Cloud Storage; Cloud Dataflow; Cloud Dataproc (batch); BigQuery; Cloud Machine Learning Engine
  • 49. Large-Scale Deep Learning for Intelligent Computer Systems, Jeff Dean, WSDM 2016
  • 54. Cucumber Sorter: "Farmers want to focus and spend their time on growing delicious vegetables." — Makoto Koike. Photos: Google Cloud Platform / Kaz Sato
  • 57. Serverless: Less ops and administration
 No waiting: Queries that used to take hours or days now take minutes or seconds
 Machine Intelligence: Gives everyone access to deep learning systems
  • 59. Thank you! Ta Virot Chiraphadhanakul, PhD (@tvirot) GDE in Machine Learning | Managing Director @ Skooldio