© Hitachi America, Ltd. 2017. All rights reserved.
Hands-on demo of PDI using webSpoon
Hiromu Hota, PhD
Researcher at Hitachi America, Ltd.
@HiromuHota, hiromu.hota@hal.hitachi.com
4/27/2017
Get started with webSpoon
How to get started with webSpoon
1. Visit
https://HighlyAvailable-env.i8gkiqhycy.us-west-2.elasticbeanstalk.com
(will be deleted after the meetup)
2. Log in with
User: user
Password: password
3. From the top menu, click File > New > Transformation
Basic Concepts of PDI
• Transformations
– are data flows, which typically start from data sources, go through some
processing, and end at a target database table.
– consist of steps and hops.
– are saved as .ktr (Kettle) files or to a repository.
• Steps and Hops
– Each step is designed for a specific task such as input, output, or scripting.
– Hops are directed data pathways that connect steps.
(Diagram: steps connected by a hop; the transformation is saved as Trans.ktr or to a repository.)
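The relationship between transformations, steps, and hops can be sketched as a small directed graph. This is only a toy model for illustration, not how PDI itself is implemented; the class names `Step` and `Transformation` and their methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    kind: str  # illustrative category, e.g. "input", "flow", "output"


@dataclass
class Transformation:
    """Toy model: a transformation is a set of steps joined by directed hops."""
    steps: dict = field(default_factory=dict)
    hops: list = field(default_factory=list)  # (from_name, to_name) pairs

    def add_step(self, step):
        self.steps[step.name] = step

    def add_hop(self, source, target):
        # A hop may only connect steps that already exist.
        if source not in self.steps or target not in self.steps:
            raise KeyError("both ends of a hop must be existing steps")
        self.hops.append((source, target))

    def downstream(self, name):
        """Names of the steps that receive rows directly from `name`."""
        return [dst for src, dst in self.hops if src == name]


trans = Transformation()
trans.add_step(Step("Generate random credit card numbers", "input"))
trans.add_step(Step("Dummy (do nothing)", "flow"))
trans.add_hop("Generate random credit card numbers", "Dummy (do nothing)")
print(trans.downstream("Generate random credit card numbers"))
# prints ['Dummy (do nothing)']
```

Because hops are directed, the Dummy step has no downstream steps in this model even though it is connected to the input step.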
How to operate webSpoon
• Drawing Steps
1. Under the Design tab, expand the Input node, then click and drag a
Generate random credit card numbers step onto the canvas.
2. Expand the Flow node; click and drag a Dummy (do nothing) step onto the
canvas.
• Drawing Hops (similar to how it is done in Spoon)
1. Press and hold the <SHIFT> key.
2. Click and hold the Generate random credit card numbers step.
3. Move the mouse cursor to the Dummy (do nothing) step.
4. Release the mouse button and the key.
Example demo
Demo story
• Background
– Ichiro Hitachi works for a travel agency based in San Francisco.
– He wants to offer an additional benefit to the tourists who are his customers.
– He personally likes to visit filming locations when visiting a new place,
so he strongly believes that such information is useful to them too.
• Movie location notifier
– When his customers come close to a filming location, they receive a
notification giving the title, year, short plot, actor, and address.
(Figure: cropped map of San Francisco by Ryan Holliday / CC-BY-SA 4.0, with two example notifications.)
• Godzilla (2014): "He attacked GGB" (Golden Gate Bridge)
• Forrest Gump (1994): "He has accidentally been present at many historic moments" (3301 Lyon Street)
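The "come close to a filming location" check can be sketched with the haversine great-circle distance. The slides do not say how the distance is actually computed, so this is one possible approach under assumptions: the coordinates are rough illustrative values, and the 0.5 km radius and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby_locations(user, locations, radius_km=0.5):
    """Return the locations within radius_km of the user's (lat, lon)."""
    lat, lon = user
    return [loc for loc in locations
            if haversine_km(lat, lon, loc["lat"], loc["lon"]) <= radius_km]


# Approximate, illustrative coordinates (not taken from the slides).
LOCATIONS = [
    {"title": "Godzilla", "year": 2014, "address": "Golden Gate Bridge",
     "lat": 37.8199, "lon": -122.4783},
    {"title": "Forrest Gump", "year": 1994, "address": "3301 Lyon Street",
     "lat": 37.8029, "lon": -122.4484},
]

# A hypothetical tourist standing close to 3301 Lyon Street.
hits = nearby_locations((37.8030, -122.4480), LOCATIONS)
for hit in hits:
    print(f"{hit['title']} ({hit['year']}) was filmed near you: {hit['address']}")
```

With these sample values only the Forrest Gump entry falls inside the radius; the Golden Gate Bridge entry is well over a kilometre away.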
Source data: “Film Locations in San Francisco”
• Source data
– Available on SF OpenData (https://data.sfgov.org/).
– A list of filming locations of movies shot in San Francisco.
• Web APIs to retrieve missing information
– OMDb (Open Movie Database) API
• Short plot of the movie
– Google Maps API
• Formatted (normalized) address (e.g., Palace of Fine Arts -> 3301 Lyon Street)
• Latitude & Longitude of the location, to calculate the distance from each user
Title         Year  Locations            Actor1  ...
Godzilla      2014  Kearney & Pine St.
Forrest Gump  1994  Palace of Fine Arts
...
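The two enrichment lookups can be sketched as plain HTTP query URLs built from one source row. The slides name only the services, so the endpoints and parameter names below are assumptions based on the public OMDb API and Google Geocoding API and should be checked against their current documentation; `KEY` is a placeholder and the helper names are hypothetical. No request is sent.

```python
from urllib.parse import urlencode

# Assumed public endpoints (not stated in the slides).
OMDB_ENDPOINT = "https://www.omdbapi.com/"
GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"


def omdb_url(title, year, api_key):
    """URL asking OMDb for a movie by title and year, with the short plot."""
    query = {"t": title, "y": year, "plot": "short", "apikey": api_key}
    return OMDB_ENDPOINT + "?" + urlencode(query)


def geocode_url(location, api_key):
    """URL asking the geocoder for a formatted address and latitude/longitude."""
    # Appending the city is an illustrative way to disambiguate free-text names.
    query = {"address": location + ", San Francisco, CA", "key": api_key}
    return GEOCODE_ENDPOINT + "?" + urlencode(query)


row = {"Title": "Forrest Gump", "Year": 1994, "Locations": "Palace of Fine Arts"}
print(omdb_url(row["Title"], row["Year"], "KEY"))
print(geocode_url(row["Locations"], "KEY"))
```

Building the URLs with `urlencode` keeps spaces, commas, and ampersands in titles and location names (such as "Kearney & Pine St.") from breaking the query string.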
High-level demo system architecture
(Diagram. Components: webSpoon, SF OpenData, Database, Google Maps API, OMDb API, Organizer, Participants. Data-flow labels: Raw data, Operations, Enriched data, Specific location data, Geo data, Movie data. Part of the diagram is marked "Not covered today".)
Exercise (step 1)
1. Open an example file and save it under a different name
1. Click File > Open, select example2, then click OK
2. Click File > Save as, change the Transformation name to something unique
(so that others do not overwrite it), then click OK
Exercise (step 2)
2. Run
1. Click the Run button, or choose Action > Run from the menu
2. Click the Run button at the bottom right
Exercise (step 3)
3. Preview the result
1. Click on the “Dummy (do nothing)” step
2. Click on the “Preview data” tab in the “Execution Results” pane at the bottom
3. Look at the other steps in the same way
Exercise (step 4)
4. Complete the data flow by enabling the disabled hop
1. Click on the hop between “Dummy (do nothing)” and “Filter out rows...”
2. Save, Run, and preview the result
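Because a transformation saved as a .ktr file is XML, the state of each hop can also be inspected outside the editor. The sketch below assumes a simplified layout in which each hop is a `<hop>` element with `<from>`, `<to>`, and `<enabled>` (Y or N) children; the fragment is hand-written for illustration, the first step name is hypothetical, and a real file contains much more.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified fragment; "Some input step" is a hypothetical name.
KTR_FRAGMENT = """
<transformation>
  <order>
    <hop><from>Some input step</from><to>Dummy (do nothing)</to><enabled>Y</enabled></hop>
    <hop><from>Dummy (do nothing)</from><to>Filter out rows...</to><enabled>N</enabled></hop>
  </order>
</transformation>
"""


def disabled_hops(root):
    """Return (from, to) pairs for every hop whose enabled flag is N."""
    return [(hop.findtext("from"), hop.findtext("to"))
            for hop in root.iter("hop")
            if hop.findtext("enabled") == "N"]


def enable_all_hops(root):
    """Set the enabled flag of every hop to Y, in place."""
    for hop in root.iter("hop"):
        hop.find("enabled").text = "Y"


root = ET.fromstring(KTR_FRAGMENT.strip())
print(disabled_hops(root))  # the hop into "Filter out rows..." is listed
enable_all_hops(root)
print(disabled_hops(root))  # nothing is listed any more
```

In the exercise the same change is made by clicking the hop in webSpoon; the script only illustrates what that toggle corresponds to under the assumed file layout.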
Exercise (step 5)
5. Explore the rest yourself; for example,
– Click on each step and see how it is configured
– Explore what kinds of steps are available
– Design the exact same flow yourself
– Download and deploy webSpoon
• Docker image: https://hub.docker.com/r/hiromuhota/webspoon/
• WAR file: https://github.com/HiromuHota/pentaho-kettle/releases
– Download and install Pentaho Data Integration (including Spoon)
• http://www.pentaho.com/download (Enterprise Edition)
• http://community.pentaho.com/ (Community Edition)
Trademarks and copyrights
• Pentaho is a registered trademark of Pentaho, Inc.
• AWS, Amazon Elastic Beanstalk, and any other AWS Marks and
Services are trademarks of Amazon Web Services, Inc.
• The use of AWS Simple Icons is permitted by Amazon Web Services,
Inc.
• Godzilla is a registered trademark of Toho Co., Ltd.
• Google Maps is a trademark of Google Inc.
• All content via OMDb API is licensed by Brian Fritz under CC BY-NC 4.0.
Demo system architecture
(Diagram. Labels: webSpoon, Classic Load Balancer, Auto Scaling group, Elastic Beanstalk, AWS cloud, SF OpenData, Organizer, Participants, Database, Geo data, Movie data.)
