BigdataConference Europe - BigQuery ML

One of the hottest topics in database land these days is BigQuery ML: a new way to apply machine learning to tabular data directly on your tables, without leaving the query editor.
With BigQuery ML, you can build machine learning models and train them on massive datasets without leaving the database environment.
In this demo session, we demonstrate common marketing machine learning use cases and show how to build, train, evaluate, and predict with your own scalable machine learning models using SQL.
The audience will get first-hand experience writing CREATE MODEL SQL syntax to build machine learning models such as:
– Multiclass logistic regression for classification
– K-means clustering
– Matrix factorization
– ARIMA time series predictions
– Import TensorFlow models for prediction in BigQuery

Models are trained and accessed in BigQuery using SQL, a language data analysts know. This enables business decision-making through predictive analytics across the organization.

BigdataConference Europe - BigQuery ML

  1. Supercharge your data analytics with BigQuery ML. November 2020. Márton Kodok / @martonkodok, Google Developer Expert at REEA.net
  2. About me: ● Among the top 3 Romanians on Stack Overflow (175k reputation) ● Google Developer Expert on Cloud technologies ● Crafting Web/Mobile backends at REEA.net ● BigQuery + Redis database engine expert. Slideshare: martonkodok, Twitter: @martonkodok, StackOverflow: pentium10, GitHub: pentium10
  3. Agenda: 1. E-commerce workloads and data models 2. What is BigQuery? - Data warehouse in the Cloud 3. Introduction to BigQuery ML - execute ML models using SQL 4. Practical use cases 5. Predict, recommend and forecast with BigQuery ML 6. Conclusions
  4. E-commerce workloads and data models: ● Shop - products, tagging, features, attributes ● Users - profile, preferences, favorites, rating, engagement ● Customers - orders, re-orders, profile, associated products, survey, feedback, 360° ● Analytics - metrics, event data, page hits, email campaigns, A/B split tests ● Upsells - recommendations, price tags, strategy, discounts, vouchers ● Enriched data - SKU, sentiment analysis, image parsing, object recognition
  6. “Where to store all this raw data?”
  7. Diagram labels: On-Premises Servers, Application Events, Frontend, Metrics / Logs / Streaming, BigQuery, SQL
  8. What is BigQuery? ● Analytics-as-a-Service - Data Warehouse in the Cloud ● Familiar DB structure (table, columns, views, struct, nested, JSON) ● Decent pricing (storage: $20/TB, cold: $10/TB, queries: $5/TB) *Nov 2020 ● SQL 2011 + JavaScript UDF (User Defined Functions) ● BigQuery ML enables users to create machine learning models with SQL queries ● Scales into petabytes on managed infrastructure ● Integrates with Cloud SQL + Cloud Storage + Sheets + Pub/Sub connectors
  9. What is BigQuery’s Superpower?
  10. Loading Data into BigQuery: 1. Load from file - either local or from GCS (max 5TB each) 2. Streaming rows - event-driven approach - high throughput, 1M rows/sec 3. Functions - observer-trigger based (Google Cloud Functions) 4. Join with Cloud SQL - ability to join with MySQL, Postgres 5. Pipelines - flexibility to do ETL - FluentD, Kafka, Google Dataflow 6. Export from connected services - Firestore, Billing, AuditLogs, Stackdriver 7. Firebase - Analytics - Messaging - Crashlytics - Perf. Monitoring - Predictions
  11. “Capturing the data”
  12. Data Pipeline Integration at REEA.net. Diagram labels: Analytics Backend, BigQuery, On-Premises Servers, Pipelines, FluentD, Event Sourcing, Frontend, Platform Services, Metrics / Logs / Streaming, Development Team, Data Analysts, Report & Share, Business Analysis, Tools (Tableau, QlikView, Data Studio, Internal Dashboard), Database, SQL, Application Servers, Servers, Cloud Storage archive, Load, Export, Replay, Standard Devices, HTTPS
  13. “We have our app outside of GCP. We need to join with our SQL database.” Solution: EXTERNAL_QUERY
  14. Combine on-premise with Cloud. Diagram labels: App, Load Balancing, NGINX (Compute Engine, 10GB PD), Database Service (Master/Slave; Compute Engine, 10GB PD), BigQuery, Zone 1 us-east1-a, Replica, Cloud SQL, Cloud VPN Gateway, Execute combined queries, Report
  15. EXTERNAL_QUERY: run a query against a Cloud SQL database from within BigQuery
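The slide itself carries only a screenshot, so here is a minimal sketch of what such a federated query can look like. The connection ID `my-project.us.my-cloudsql-connection` and the tables `mydataset.orders` and `customers` are hypothetical names chosen for illustration, not taken from the talk.

```sql
-- Join a BigQuery table with rows fetched live from Cloud SQL.
-- EXTERNAL_QUERY takes a connection ID and a query string written in the
-- external database's own SQL dialect (MySQL or PostgreSQL).
SELECT
  o.order_id,
  o.total,
  c.email
FROM
  `mydataset.orders` AS o                        -- hypothetical BigQuery table
JOIN
  EXTERNAL_QUERY(
    'my-project.us.my-cloudsql-connection',      -- hypothetical connection ID
    'SELECT customer_id, email FROM customers'   -- executed inside Cloud SQL
  ) AS c
ON
  o.customer_id = c.customer_id;
```

The inner query runs in Cloud SQL and only its result set is returned to BigQuery, so it pays to filter and project as much as possible inside the quoted statement.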
  16. Our benefits
  17. What is BigQuery ML?
  18. BigQuery ML: 1. CREATE MODEL in SQL to increase development speed 2. Predict, recommend, forecast on tabular data with SQL 3. Automate common ML tasks and hyperparameter tuning by creating new models as easily as creating tables
  19. Supported models in BigQuery ML: ● Binary or multiclass logistic regression for classification (labels can have up to 50 unique values) ● K-means clustering for data segmentation (unsupervised learning - does not require labels/training) ● Recommend with matrix factorization ● Import TensorFlow models for prediction in BigQuery ● Time series forecasting with ARIMA - e.g. the sales of an item on a given day ● Boosted Tree for creating XGBoost | Deep Neural Network (DNN) models | AutoML Tables ● and others...
  20. E-commerce Use Cases: ● Conversion/Purchase prediction (MODEL: Logistic-Regression): predict whether a user "converts" or "purchases". It is in the company's interest that many users sign up for this membership, as it helps streamline ad conversion and supports recurring revenue. ● Customer Lifetime Value (LTV) prediction (MODEL: Logistic-Regression): used by organisations to identify and prioritize the significant customer segments that would be most valuable to the company. ● Customer Segmentation (MODEL: K-means clustering): dividing a client base into groups in specific ways relevant to marketing, such as interests and spending habits. Segmentation allows marketers to better customize their efforts to various audience groups.
  21. Getting started with BigQuery ML: create a MODEL that predicts whether a website visitor will make a transaction. ● The CREATE MODEL statement ● The ML.EVALUATE function to evaluate the ML model ● The ML.PREDICT function to make predictions using the ML model
  22. Create a binary logistic regression model. Screenshot callouts: CREATE MODEL syntax; create the training dataset using a label column; SELECT features
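The query on this slide did not survive as text. The sketch below follows the pattern the callouts describe, assuming the public Google Analytics sample dataset and a hypothetical model name `bqml_tutorial.sample_model`; the feature choice is illustrative rather than the speaker's.

```sql
-- Train a binary logistic regression model.
-- The column aliased as `label` is the training target; every other
-- selected column is treated as a feature.
CREATE OR REPLACE MODEL `bqml_tutorial.sample_model`   -- hypothetical name
OPTIONS (model_type = 'logistic_reg') AS
SELECT
  IF(totals.transactions IS NULL, 0, 1) AS label,      -- 1 = visitor purchased
  IFNULL(device.operatingSystem, '') AS os,
  device.isMobile AS is_mobile,
  IFNULL(geoNetwork.country, '') AS country,
  IFNULL(totals.pageviews, 0) AS pageviews
FROM
  `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20160801' AND '20170630';     -- training window
```

Restricting `_TABLE_SUFFIX` to a date range keeps later sessions out of training, so they can serve as held-out data for evaluation.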
  23. Evaluate your model
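A possible evaluation query, assuming a trained logistic regression model named `bqml_tutorial.sample_model` (a hypothetical name) whose features were derived from the public Google Analytics sample dataset. The evaluation rows must expose the same label and feature columns that were used for training.

```sql
-- Score the model on sessions from a later, held-out date range.
-- For logistic regression the result includes precision, recall,
-- accuracy, f1_score, log_loss and roc_auc.
SELECT
  *
FROM
  ML.EVALUATE(MODEL `bqml_tutorial.sample_model`, (
    SELECT
      IF(totals.transactions IS NULL, 0, 1) AS label,
      IFNULL(device.operatingSystem, '') AS os,
      device.isMobile AS is_mobile,
      IFNULL(geoNetwork.country, '') AS country,
      IFNULL(totals.pageviews, 0) AS pageviews
    FROM
      `bigquery-public-data.google_analytics_sample.ga_sessions_*`
    WHERE
      _TABLE_SUFFIX BETWEEN '20170701' AND '20170801'));
```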
  24. Predict
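A sketch of a prediction query under the same assumptions: a hypothetical model `bqml_tutorial.sample_model` trained with a target column named `label`, which is why the output column is called `predicted_label`.

```sql
-- Predict purchases per country for unseen sessions.
-- ML.PREDICT passes the input columns through and adds predicted_label.
SELECT
  country,
  SUM(predicted_label) AS total_predicted_purchases
FROM
  ML.PREDICT(MODEL `bqml_tutorial.sample_model`, (
    SELECT
      IFNULL(device.operatingSystem, '') AS os,
      device.isMobile AS is_mobile,
      IFNULL(totals.pageviews, 0) AS pageviews,
      IFNULL(geoNetwork.country, '') AS country
    FROM
      `bigquery-public-data.google_analytics_sample.ga_sessions_*`
    WHERE
      _TABLE_SUFFIX BETWEEN '20170701' AND '20170801'))
GROUP BY
  country
ORDER BY
  total_predicted_purchases DESC
LIMIT 10;
```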
  25. K-means clustering. Use cases: ● Customer segmentation ● Data quality. Options and defaults: ● Number of clusters: default log10(num_rows) clusters ● Distance type: Euclidean (default), Cosine ● Supports all major SQL data types including GIS. Syntax: CREATE MODEL yourmodel OPTIONS (model_type = 'kmeans') AS SELECT.. FROM. Functions: ML.PREDICT maps rows to closest clusters; ML.CENTROIDS for cluster centroids; ML.EVALUATE; ML.TRAINING_INFO; ML.FEATURE_INFO
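To make the options above concrete, here is a hedged sketch of a customer segmentation model. The dataset `mydataset`, the table `customer_features` and its columns are hypothetical; only the option names and ML functions come from BigQuery ML.

```sql
-- Train a k-means model with an explicit cluster count and distance type.
-- customer_id is deliberately left out so it is not used as a feature.
CREATE OR REPLACE MODEL `mydataset.customer_segments`
OPTIONS (
  model_type = 'kmeans',
  num_clusters = 4,               -- overrides the log10(num_rows) default
  distance_type = 'EUCLIDEAN'     -- or 'COSINE'
) AS
SELECT
  total_orders,
  avg_order_value,
  days_since_last_order
FROM
  `mydataset.customer_features`;  -- hypothetical table

-- Assign each customer to its closest cluster.
SELECT
  customer_id,
  centroid_id
FROM
  ML.PREDICT(MODEL `mydataset.customer_segments`, (
    SELECT customer_id, total_orders, avg_order_value, days_since_last_order
    FROM `mydataset.customer_features`));

-- Inspect the centroid values per feature.
SELECT
  *
FROM
  ML.CENTROIDS(MODEL `mydataset.customer_segments`);
```

Rows far from every centroid are candidates for the data-quality review mentioned under use cases.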
  26. K-means clustering: Problem definition. Available data: ● Encode yes/no features (e.g. has a microwave, has a kitchen, has a TV, has a bathroom) ● Can apply clustering on the encoded data
  27. K-means clustering: Problem definition. Premise: we can identify oddities (potential data quality issues) by grouping things together and separating outliers.
  28. Matrix Factorization. Use cases: ● Product recommendation ● Marketing campaign target optimization tool. Options and defaults: ● Input: User, Item, Rating ● Can use L2 regularization ● Specify training-test split (default: random 80-20). Syntax: CREATE MODEL yourmodel OPTIONS (model_type = 'matrix_factorization') AS SELECT.. FROM. Functions: ML.RECOMMEND for the full user-item matrix; ML.EVALUATE; ML.WEIGHTS; ML.TRAINING_INFO; ML.FEATURE_INFO
  29. Matrix Factorization: Problem definition. Available data: ● User ● Item ● Rating. Problem: ● assigning values to previously unknown entries (zeros in our case)
  30. BigQuery ML - Matrix Factorization. Step 1: Create a model from a dataset. CREATE MODEL wr_temp.purchases_mf_model OPTIONS (model_type = 'matrix_factorization') AS SELECT user, item, rating FROM `wr_temp.purchases`; Step 2: To view the rating associated with a given user-item pair, use ML.RECOMMEND with the model name. SELECT * FROM ML.RECOMMEND(MODEL wr_temp.purchases_mf_model); The output returns a rating for each user-item pair.
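The full user-item matrix returned in step 2 is rarely consumed as is. One way to reduce it to a short list per user is sketched below. It assumes the model from the slide was trained with explicit feedback on a column named `rating`, so that ML.RECOMMEND exposes the score as `predicted_rating`; verify that column name against your own model's output.

```sql
-- Keep only the five highest-scored items for each user.
SELECT
  user,
  ARRAY_AGG(
    STRUCT(item, predicted_rating)
    ORDER BY predicted_rating DESC
    LIMIT 5
  ) AS top_items
FROM
  ML.RECOMMEND(MODEL `wr_temp.purchases_mf_model`)
GROUP BY
  user;
```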
  31. Time series forecasting with ARIMA model. Use cases: ● All sorts of time series forecasting ● Marketing campaign target optimization tool. Options and defaults: ● Holiday effect adjustments by region ● Seasonal and trend decomposition ● Automatic data frequency detection. Syntax: CREATE MODEL yourmodel OPTIONS (model_type = 'ARIMA') AS SELECT.. Functions: ML.FORECAST, to be used with a horizon; ML.EVALUATE; ML.ARIMA_COEFFICIENTS
  32. Time series forecasting with ARIMA model. Available data: ● Past timestamp ● Past value. Problem: ● Forecasts for the next X slots (called the horizon). SELECT forecast_timestamp, forecast_value FROM ML.FORECAST(MODEL bqml_tutorial.nyc_citibike_arima_model, STRUCT(300 AS horizon, 0.8 AS confidence_level))
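The forecast query above presupposes a trained model. A sketch of how a model with that name could be created is shown below, assuming the public `bigquery-public-data.new_york.citibike_trips` table as the source, which the slide does not state. The model type `'ARIMA'` matches the talk's date; newer BigQuery releases also offer `'ARIMA_PLUS'`.

```sql
-- Train a time series model on daily trip counts.
-- The two *_col options tell BigQuery ML which column holds the time
-- points and which holds the values to forecast.
CREATE OR REPLACE MODEL `bqml_tutorial.nyc_citibike_arima_model`
OPTIONS (
  model_type = 'ARIMA',
  time_series_timestamp_col = 'date',
  time_series_data_col = 'num_trips'
) AS
SELECT
  EXTRACT(DATE FROM starttime) AS date,
  COUNT(*) AS num_trips
FROM
  `bigquery-public-data.new_york.citibike_trips`   -- assumed source table
GROUP BY
  date;
```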
  33. Import TensorFlow models for prediction. Use cases: ● Easily add TensorFlow predictions to BigQuery ● Build unstructured data models in TensorFlow, predict in BigQuery. Key restrictions: ● Model size limit of 250MB. Syntax: CREATE MODEL yourmodel OPTIONS (model_type = 'tensorflow', model_path = 'gs://'), then ML.PREDICT(). DEMO: search 'QueryIt Smart' on GitHub to learn more.
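As a rough illustration of the import flow: the bucket path, model name, table and column names below are all hypothetical placeholders (the slide elides the actual Cloud Storage path), and the input column alias must match the input name defined in your SavedModel's serving signature.

```sql
-- Register a TensorFlow SavedModel stored in Cloud Storage as a BigQuery ML model.
CREATE OR REPLACE MODEL `mydataset.imported_tf_model`     -- hypothetical name
OPTIONS (
  model_type = 'TENSORFLOW',
  model_path = 'gs://your-bucket/your-saved-model-dir/*'  -- placeholder path
);

-- Run batch predictions with the imported model.
-- `input` is an assumed signature input name; adjust it to your model.
SELECT
  *
FROM
  ML.PREDICT(MODEL `mydataset.imported_tf_model`, (
    SELECT review_text AS input
    FROM `mydataset.reviews`));                           -- hypothetical table
```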
  34. Google Drive - Colaboratory - Jupyter Notebook
  35. New on BigQuery UI - Evaluation charts
  36. Conclusions
  37. Considerations. Automation: ● Run the process daily ● Determine hyperparameters ● Surface the results and route them somewhere for inspection and improvement. Testing: ● A/B test the impact of data quality on conversion and customer NPS (net promoter score). Improvements: ● Determine and explore outliers ● Repeat, automate
  38. Benefits of BigQuery ML: ● Democratizes the use of ML by empowering data analysts to build and run models using existing business intelligence tools and spreadsheets ● Generalist team: models are trained using SQL, so there is no need to program an ML solution in Python or Java ● Increases the innovation and speed of model development by removing the need to export data from the data warehouse ● A model serves a purpose and is easy to change/recycle
  39. The possibilities are endless. Marketing: predict customer value; predict funnel conversion; personalize ads, email, webpage content. Retail: optimize inventory; forecast revenue; enable product recommendations; optimize staff promotions. Industrial and IoT: forecast demand for parking, traffic utilities, personnel; prevent equipment downtime; predict maintenance needs. Media/gaming: personalize content; predict game difficulty; predict player lifetime value.
  40. Thank you. Slides available on: slideshare.net/martonkodok. Reea.net - Integrated web solutions driven by creativity to deliver projects.