Copyright © 2015 KNIME.com AG
Big Data Science is just a Click Away!
Rosaria Silipo
KNIME.com
Variety, Volume, Velocity
Variety:
• integrating heterogeneous data (and tools)
Volume:
• from small files...
• ...to distributed data repositories (Hadoop)
• bring the tools to the data
Velocity:
• from distributing computationally heavy workloads...
• ...to real-time scoring of millions of records/sec.
Every Minute…
IoT
The Challenge
Energy Usage Prediction from Smart Meters Data
• Read Smart Meter Energy Data (176 million rows)
• Clean Up and Aggregate total Energy Usage by hour, day, week, month, year
• Calculate Behavioral Measures for each Smart Meter
• Cluster Smart Meters with Similar Behavior (k-Means)
• Predict Energy Usage in Clustered Smart Meters (Auto-Regressive Time Series Prediction)
Workflow 1
Workflow 2
Workflow 3
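The first steps of the challenge (aggregate, derive a behavioral measure, cluster) can be pictured with a small Python sketch. This is not the KNIME workflow itself, which is built from nodes; the toy readings, the `morning_share` measure, and all function names are hypothetical, and the auto-regressive prediction step is left out.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw readings: (meter_id, ISO timestamp, kWh in that interval).
READINGS = [
    ("m1", "2015-01-05T08:00", 0.4), ("m1", "2015-01-05T08:30", 0.6),
    ("m1", "2015-01-05T20:00", 0.1), ("m1", "2015-01-05T20:30", 0.1),
    ("m2", "2015-01-05T08:00", 0.1), ("m2", "2015-01-05T08:30", 0.1),
    ("m2", "2015-01-05T20:00", 0.7), ("m2", "2015-01-05T20:30", 0.9),
    ("m3", "2015-01-05T08:00", 0.5), ("m3", "2015-01-05T08:30", 0.7),
    ("m3", "2015-01-05T20:00", 0.2), ("m3", "2015-01-05T20:30", 0.0),
]

def hourly_totals(readings):
    """Sum energy per (meter, hour): the 'aggregate by hour' step."""
    totals = defaultdict(float)
    for meter, stamp, kwh in readings:
        hour = datetime.fromisoformat(stamp).replace(minute=0)
        totals[(meter, hour)] += kwh
    return dict(totals)

def morning_share(totals):
    """One toy behavioral measure: share of usage before noon, per meter."""
    morning, overall = defaultdict(float), defaultdict(float)
    for (meter, hour), kwh in totals.items():
        overall[meter] += kwh
        if hour.hour < 12:
            morning[meter] += kwh
    return {m: morning[m] / overall[m] for m in overall}

def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on scalars, starting from the given centers."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # An empty group keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

totals = hourly_totals(READINGS)
shares = morning_share(totals)
centers = kmeans_1d(list(shares.values()), centers=[0.0, 1.0])
# Assign each meter to its nearest cluster center.
labels = {m: min(range(len(centers)), key=lambda i: abs(s - centers[i]))
          for m, s in shares.items()}
print(labels)
```

On this toy data the two morning-heavy meters end up in one cluster and the evening-heavy meter in the other.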
Workflow 1: PrepareData
~ 2 days
Big Data
Big Data Support
• KNIME Big Data Access Nodes
– preconfigured connectors
– in-database processing
• Big Data Platforms
– HDFS, Hive, Impala, HP Vertica, Hortonworks, ParStream, Actian, any big data platform really!
• Spark MLlib integration (coming soon)
• Streaming Executor (coming soon)
Hadoop Sandboxes
• Hortonworks:
http://hortonworks.com/products/hortonworks-sandbox/
• Cloudera:
http://www.cloudera.com/content/cloudera/en/downloads/quickstart_vms.html
• VirtualBox:
https://www.virtualbox.org/
• VMware Player:
http://www.vmware.com/
… as easy as 1, 2, 3, … 4
1. Access Big Data
2. Select Table
3. In-DB Processing
4. Into KNIME
1. Database Connector
Generic Database Connector
– Can connect to any JDBC source
– Register new JDBC driver via preferences page
1. Register JDBC Driver
Open KNIME and go to File -> Preferences
Increase the connection timeout for long-running retrieval operations
1. Dedicated Connectors
Dedicated pre-configured connectors
– Bundling necessary JDBC drivers
– Easy to use
– DB specific behavior/capability
Some dedicated connectors are part of the open source KNIME Analytics Platform, some belong to the commercial KNIME Big Data Extension
works for most Hadoop Hive installations, including Hortonworks
free
2. Data Table Selection
3. In-Database Processing
• Filter rows and columns
• Join tables/queries
• Sort your data
• Write your own query
• Aggregate* your data
Similar settings as GroupBy node
Similar settings as Joiner node
* Database GroupBy node exposes DB specific aggregation methods
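To picture what in-database processing means, here is a minimal Python sketch that uses the standard-library `sqlite3` module as a stand-in for a big data platform. The tables, columns, and values are hypothetical. Wrapping each step around the previous query is only an analogy for chaining database nodes, not a description of KNIME internals. The point is that the database does the filtering, joining, aggregating, and sorting, and only the small final result is fetched.

```python
import sqlite3

# In-memory SQLite database with hypothetical smart meter tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (meter_id TEXT, day TEXT, kwh REAL);
    CREATE TABLE meters (meter_id TEXT, tariff TEXT);
    INSERT INTO readings VALUES
        ('m1', '2015-01-05', 1.5), ('m1', '2015-01-06', 2.5),
        ('m2', '2015-01-05', 4.0), ('m2', '2015-01-06', 6.0),
        ('m3', '2015-01-05', 0.0);
    INSERT INTO meters VALUES ('m1', 'day'), ('m2', 'night'), ('m3', 'day');
""")

# Each step wraps the previous query; nothing is fetched until the end.
query = "SELECT meter_id, kwh FROM readings"               # select table
query = f"SELECT * FROM ({query}) WHERE kwh > 0"           # filter rows
query = (f"SELECT r.meter_id, m.tariff, r.kwh FROM ({query}) AS r "
         "JOIN meters AS m ON m.meter_id = r.meter_id")    # join
query = (f"SELECT tariff, AVG(kwh) AS avg_kwh FROM ({query}) "
         "GROUP BY tariff")                                # aggregate
query = f"SELECT * FROM ({query}) ORDER BY avg_kwh DESC"   # sort

# Only the aggregated rows leave the database.
rows = conn.execute(query).fetchall()
print(rows)
```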
3. Queries for average Measures
3. Average Monthly Values
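The query shown on this slide is not part of the text, so the following is only a guess at its general shape: a monthly average per smart meter, computed inside the database. The sketch uses Python's `sqlite3` as a stand-in, with a hypothetical table and hypothetical values. `strftime` is SQLite's date function, and other platforms such as Hive or Impala use different date functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (meter_id TEXT, ts TEXT, kwh REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", [
    ("m1", "2015-01-10 08:00:00", 1.0),
    ("m1", "2015-01-20 08:00:00", 3.0),
    ("m1", "2015-02-03 08:00:00", 5.0),
    ("m2", "2015-01-15 08:00:00", 2.0),
])

# Group by meter and calendar month; the database computes the averages.
MONTHLY_AVG = """
    SELECT meter_id, strftime('%Y-%m', ts) AS month, AVG(kwh) AS avg_kwh
    FROM readings
    GROUP BY meter_id, month
    ORDER BY meter_id, month
"""
monthly = conn.execute(MONTHLY_AVG).fetchall()
print(monthly)
```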
4. Import Data from Database
< 30 min
New Big Data Platform?
No problem!
Just change the connector node!
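The same idea can be sketched in Python, where database drivers share the DB-API (PEP 249) interface: downstream code depends only on the connection object, so the connection can be swapped without touching the rest. Here `sqlite_connector` and `total_usage_per_meter` are hypothetical names, and two in-memory SQLite databases stand in for an old and a new platform. SQL dialect differences between real platforms are ignored.

```python
import sqlite3

def sqlite_connector(rows):
    """Hypothetical connector: an in-memory SQLite database standing in
    for whichever platform the workflow currently targets."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE meter_data (meter_id TEXT, kwh REAL)")
    cur.executemany("INSERT INTO meter_data VALUES (?, ?)", rows)
    conn.commit()
    return conn

def total_usage_per_meter(conn):
    """Downstream step: uses only DB-API calls, so it does not care
    which connector produced `conn`."""
    cur = conn.cursor()
    cur.execute(
        "SELECT meter_id, SUM(kwh) FROM meter_data "
        "GROUP BY meter_id ORDER BY meter_id"
    )
    return cur.fetchall()

old_platform = sqlite_connector([("m1", 1.0), ("m1", 2.0), ("m2", 0.5)])
new_platform = sqlite_connector([("m1", 4.0), ("m2", 1.5), ("m2", 1.5)])

# The processing code is identical; only the connection differs.
for conn in (old_platform, new_platform):
    print(total_usage_per_meter(conn))
```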
Other Useful Database Nodes
• Drop table
– missing table handling
– cascade option
• Execute any SQL statement
• Manipulate existing queries
Executes several queries separated by a semicolon (;) and a new line
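As an illustration of that separator rule, the Python sketch below splits a script on a semicolon followed by a newline and runs each statement in turn. It is an assumption-laden analogy, not the node's implementation: `split_statements` and `run_script` are hypothetical helpers, the table is made up, and `sqlite3` stands in for the target database.

```python
import sqlite3

# Hypothetical script: statements separated by ';' plus a newline.
SCRIPT = (
    "DROP TABLE IF EXISTS usage_summary;\n"
    "CREATE TABLE usage_summary (meter_id TEXT, total_kwh REAL);\n"
    "INSERT INTO usage_summary VALUES ('m1', 3.5);\n"
    "INSERT INTO usage_summary VALUES ('m2', 1.25)"
)

def split_statements(script):
    """Split on ';' followed by a newline; a ';' inside a line is left alone."""
    parts = script.split(";\n")
    return [p.strip().rstrip(";") for p in parts if p.strip()]

def run_script(conn, script):
    """Execute every statement in order and return how many were run."""
    statements = split_statements(script)
    cur = conn.cursor()
    for statement in statements:
        cur.execute(statement)
    conn.commit()
    return len(statements)

conn = sqlite3.connect(":memory:")
executed = run_script(conn, SCRIPT)
row_count = conn.execute("SELECT COUNT(*) FROM usage_summary").fetchone()[0]
print(executed, row_count)
```

`DROP TABLE IF EXISTS` in the script plays the role of the missing table handling option: the script does not fail when the table is absent.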
KNIME Big Data Extension
• KNIME Big Data Access Nodes
– preconfigured connectors
– HDFS File Handling
– Hive/Impala Loader
• Big Data Platforms
– HDFS, Hive, Impala, HP Vertica, Hortonworks, ParStream, Actian, SAP HANA (planned), …
• Spark MLlib integration (coming soon)
• Streaming Executor (coming soon)
HDFS File Handling
• KNIME & Extensions -> KNIME File Handling Nodes
• HDFS Connection and HDFS File Permission nodes
Hive/Impala Loader
• Upload a KNIME data table to Hive/Impala
KNIME Big Data Extension: Download and Install
KNIME.com Extension Store
License Required!
Installation Instructions
http://tech.knime.org/installation-instructions
Product Description
http://www.knime.org/knime-big-data-extension
License on KNIME Store
http://tech.knime.org/knime-store
30-day trial license available with special Promotion Code
education@knime.com
References
• Whitepaper “KNIME opens the Doors to Big Data”
http://www.knime.org/files/big_data_in_knime_1.pdf
• Blog Post “Integrating Big data is as Easy as 1,2,3, … 4”
http://www.knime.org/blog/integrating-big-data-is-as-easy-as-1-2-3-4
• The Big Data Extension Product Description
http://www.knime.org/knime-big-data-extension
Thank You!
• education@knime.com
• Twitter: @KNIME
• LinkedIn Group: KNIME
• KNIME Blog: http://www.knime.org/blog
Big Data with KNIME is as easy as 1, 2, 3, ...4!
