SlideShare ist ein Scribd-Unternehmen logo
1 von 7
Downloaden Sie, um offline zu lesen
DMDW - ETL-PROJECT
                              THE GROOVY-WAY
                           BY MAXIMILIAN BUTTERER




Samstag, 14. Mai 2011
WHAT WAS THE JOB


                    EXTRACT THE DATA FROM ROOM-PLAN-FILE (EXCEL)

                    TRANSFORM IT INTO NEW STRUCTURE (DATATYPES)

                    LOAD IT INTO NEW TARGET (E.G. DATABASE-SYSTEM)

                    CREATE KIND OF DOCUMENTATION (HOW-TO) OR

                    PRESENT IT TO YOU




Samstag, 14. Mai 2011
#1: EXPORT THE DATA



                    EXPORTING DATA FROM EXCEL IS PRETTY EASY

                    SO I EXPORTED HOLE FILE AS CSV-FILE

                    1ST PROBLEM: COMMA IS SEMICOLON

                    2ND PROBLEM: ENCODING IS NOT UTF-8




Samstag, 14. Mai 2011
#2: A                    HELPER


                    BECAUSE CSV IS PLAINTEXT IT‘S EASY TO PARSE

                    I CREATED A GROOVY-SCRIPT FOR CONVERTING

                    MY SOLUTION IS NOT THE CLEANEST WAY BUT SOME
                    KIND THE EASIEST

                    3RD PROBLEM: DATA IS NOT CONSISTENT / THERE ARE
                    CORRUPTED DATA-SETS




Samstag, 14. Mai 2011
#3: PUT IT INTO



                    HAVING ALL THE OBJECTS PARSED AND CONVERTED

                    JUST PUT ALL THE STUFF INTO THE DATABASE

                    4TH PROBLEM: HOW?




Samstag, 14. Mai 2011
HOW I SOLVED THE PROBLEMS


                    PARSING A FILE WITH GROOVY IS EASY AS 1,2,3



                    EACH LINE I HAD TO SPLIT UP BECAUSE OF THE
                    SEMICOLONS

                    TO CONVERT THE DATE-STRING INTO A REAL DATE WE
                    HAVE TO TRICK




Samstag, 14. Mai 2011
LET‘S HAVE A LOOK BEHIND



                    LOOK INTO THE CODE

                    HOW DOES THE CODE WORK

                    QUESTIONS




Samstag, 14. Mai 2011

Weitere ähnliche Inhalte

Mehr von Johannes Hoppe

2012-10-16 - WebTechCon 2012: HTML5 & WebGL
2012-10-16 - WebTechCon 2012: HTML5 & WebGL2012-10-16 - WebTechCon 2012: HTML5 & WebGL
2012-10-16 - WebTechCon 2012: HTML5 & WebGL
Johannes Hoppe
 
2012-10-12 - NoSQL in .NET - mit Redis und Mongodb
2012-10-12 - NoSQL in .NET - mit Redis und Mongodb2012-10-12 - NoSQL in .NET - mit Redis und Mongodb
2012-10-12 - NoSQL in .NET - mit Redis und Mongodb
Johannes Hoppe
 
2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB
2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB
2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB
Johannes Hoppe
 
2012-04-12 - AOP .NET UserGroup Niederrhein
2012-04-12 - AOP .NET UserGroup Niederrhein2012-04-12 - AOP .NET UserGroup Niederrhein
2012-04-12 - AOP .NET UserGroup Niederrhein
Johannes Hoppe
 
2011-06-27 - AOP - .NET User Group Rhein Neckar
2011-06-27 - AOP - .NET User Group Rhein Neckar2011-06-27 - AOP - .NET User Group Rhein Neckar
2011-06-27 - AOP - .NET User Group Rhein Neckar
Johannes Hoppe
 
DMDW 5. Student Presentation - Pentaho Data Integration (Kettle)
DMDW 5. Student Presentation - Pentaho Data Integration (Kettle)DMDW 5. Student Presentation - Pentaho Data Integration (Kettle)
DMDW 5. Student Presentation - Pentaho Data Integration (Kettle)
Johannes Hoppe
 

Mehr von Johannes Hoppe (20)

Einführung in Angular 2
Einführung in Angular 2Einführung in Angular 2
Einführung in Angular 2
 
MDC kompakt 2014: Hybride Apps mit Cordova, AngularJS und Ionic
MDC kompakt 2014: Hybride Apps mit Cordova, AngularJS und IonicMDC kompakt 2014: Hybride Apps mit Cordova, AngularJS und Ionic
MDC kompakt 2014: Hybride Apps mit Cordova, AngularJS und Ionic
 
2015 02-09 - NoSQL Vorlesung Mosbach
2015 02-09 - NoSQL Vorlesung Mosbach2015 02-09 - NoSQL Vorlesung Mosbach
2015 02-09 - NoSQL Vorlesung Mosbach
 
2012-06-25 - MapReduce auf Azure
2012-06-25 - MapReduce auf Azure2012-06-25 - MapReduce auf Azure
2012-06-25 - MapReduce auf Azure
 
2013-06-25 - HTML5 & JavaScript Security
2013-06-25 - HTML5 & JavaScript Security2013-06-25 - HTML5 & JavaScript Security
2013-06-25 - HTML5 & JavaScript Security
 
2013-06-15 - Software Craftsmanship mit JavaScript
2013-06-15 - Software Craftsmanship mit JavaScript2013-06-15 - Software Craftsmanship mit JavaScript
2013-06-15 - Software Craftsmanship mit JavaScript
 
2013 02-26 - Software Tests with Mongo db
2013 02-26 - Software Tests with Mongo db2013 02-26 - Software Tests with Mongo db
2013 02-26 - Software Tests with Mongo db
 
2013-02-21 - .NET UG Rhein-Neckar: JavaScript Best Practices
2013-02-21 - .NET UG Rhein-Neckar: JavaScript Best Practices2013-02-21 - .NET UG Rhein-Neckar: JavaScript Best Practices
2013-02-21 - .NET UG Rhein-Neckar: JavaScript Best Practices
 
2012-10-16 - WebTechCon 2012: HTML5 & WebGL
2012-10-16 - WebTechCon 2012: HTML5 & WebGL2012-10-16 - WebTechCon 2012: HTML5 & WebGL
2012-10-16 - WebTechCon 2012: HTML5 & WebGL
 
2012-10-12 - NoSQL in .NET - mit Redis und Mongodb
2012-10-12 - NoSQL in .NET - mit Redis und Mongodb2012-10-12 - NoSQL in .NET - mit Redis und Mongodb
2012-10-12 - NoSQL in .NET - mit Redis und Mongodb
 
2012-09-18 - HTML5 & WebGL
2012-09-18 - HTML5 & WebGL2012-09-18 - HTML5 & WebGL
2012-09-18 - HTML5 & WebGL
 
2012-09-17 - WDC12: Node.js & MongoDB
2012-09-17 - WDC12: Node.js & MongoDB2012-09-17 - WDC12: Node.js & MongoDB
2012-09-17 - WDC12: Node.js & MongoDB
 
2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB für .NET Entwickler)
2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB für .NET Entwickler)2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB für .NET Entwickler)
2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB für .NET Entwickler)
 
2012-05-14 NoSQL in .NET - mit Redis und MongoDB
2012-05-14 NoSQL in .NET - mit Redis und MongoDB2012-05-14 NoSQL in .NET - mit Redis und MongoDB
2012-05-14 NoSQL in .NET - mit Redis und MongoDB
 
2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB
2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB
2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB
 
2012-04-12 - AOP .NET UserGroup Niederrhein
2012-04-12 - AOP .NET UserGroup Niederrhein2012-04-12 - AOP .NET UserGroup Niederrhein
2012-04-12 - AOP .NET UserGroup Niederrhein
 
2012-03-20 - Getting started with Node.js and MongoDB on MS Azure
2012-03-20 - Getting started with Node.js and MongoDB on MS Azure2012-03-20 - Getting started with Node.js and MongoDB on MS Azure
2012-03-20 - Getting started with Node.js and MongoDB on MS Azure
 
2012-01-31 NoSQL in .NET
2012-01-31 NoSQL in .NET2012-01-31 NoSQL in .NET
2012-01-31 NoSQL in .NET
 
2011-06-27 - AOP - .NET User Group Rhein Neckar
2011-06-27 - AOP - .NET User Group Rhein Neckar2011-06-27 - AOP - .NET User Group Rhein Neckar
2011-06-27 - AOP - .NET User Group Rhein Neckar
 
DMDW 5. Student Presentation - Pentaho Data Integration (Kettle)
DMDW 5. Student Presentation - Pentaho Data Integration (Kettle)DMDW 5. Student Presentation - Pentaho Data Integration (Kettle)
DMDW 5. Student Presentation - Pentaho Data Integration (Kettle)
 

Kürzlich hochgeladen

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Kürzlich hochgeladen (20)

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 

DMDW 8. Student Presentation - Groovy to MongoDB

  • 1. DMDW - ETL-PROJECT THE GROOVY-WAY BY MAXIMILIAN BUTTERER Samstag, 14. Mai 2011
  • 2. WHAT WAS THE JOB EXTRACT THE DATA FROM ROOM-PLAN-FILE (EXCEL) TRANSFORM IT INTO NEW STRUCTURE (DATATYPES) LOAD IT INTO NEW TARGET (E.G. DATABASE-SYSTEM) CREATE KIND OF DOCUMENTATION (HOW-TO) OR PRESENT IT TO YOU Samstag, 14. Mai 2011
  • 3. #1: EXPORT THE DATA EXPORTING DATA FROM EXCEL IS PRETTY EASY SO I EXPORTED HOLE FILE AS CSV-FILE 1ST PROBLEM: COMMA IS SEMICOLON 2ND PROBLEM: ENCODING IS NOT UTF-8 Samstag, 14. Mai 2011
  • 4. #2: A HELPER BECAUSE CSV IS PLAINTEXT IT‘S EASY TO PARSE I CREATED A GROOVY-SCRIPT FOR CONVERTING MY SOLUTION IS NOT THE CLEANEST WAY BUT SOME KIND THE EASIEST 3RD PROBLEM: DATA IS NOT CONSISTENT / THERE ARE CORRUPTED DATA-SETS Samstag, 14. Mai 2011
  • 5. #3: PUT IT INTO HAVING ALL THE OBJECTS PARSED AND CONVERTED JUST PUT ALL THE STUFF INTO THE DATABASE 4TH PROBLEM: HOW? Samstag, 14. Mai 2011
  • 6. HOW I SOLVED THE PROBLEMS PARSING A FILE WITH GROOVY IS EASY AS 1,2,3 EACH LINE I HAD TO SPLIT UP BECAUSE OF THE SEMICOLONS TO CONVERT THE DATE-STRING INTO A REAL DATE WE HAVE TO TRICK Samstag, 14. Mai 2011
  • 7. LET‘S HAVE A LOOK BEHIND LOOK INTO THE CODE HOW DOES THE CODE WORK QUESTIONS Samstag, 14. Mai 2011