BigDataCamp 2011

     Chris K Wensel
Concurrent, Inc.

• Founded in Spring of 2008
• Cascading core development
• Support, Training, & OEM Licensing
So What is Cascading?
In a Nutshell
[Diagram: Processing API, Integration API, Scheduler API; Physical Planner; Scheduler]

An alternative Java API to MapReduce, with a built-in Processing Planner and Workload Scheduler
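
As a rough illustration of what a processing planner does, the sketch below splits a linear chain of operations into MapReduce-style jobs, starting a new job whenever a second grouping would be needed. Every name here (`ToyPlanner`, `Op`, `Kind`, `plan`) is hypothetical and is not Cascading's API; the one-grouping-per-job rule is a simplifying assumption about how such a planner might work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy illustration only: hypothetical types, NOT Cascading's API.
class ToyPlanner {
    // Kinds of logical operation; GROUP_BY is the one that needs a shuffle.
    enum Kind { EACH, GROUP_BY, EVERY }

    static final class Op {
        final Kind kind;
        final String name;

        Op(Kind kind, String name) {
            this.kind = kind;
            this.name = name;
        }
    }

    // Assumption: one job can hold at most one GROUP_BY,
    // so a second GROUP_BY closes the current job and opens a new one.
    static List<List<Op>> plan(List<Op> pipeline) {
        List<List<Op>> jobs = new ArrayList<>();
        List<Op> current = new ArrayList<>();
        boolean grouped = false;
        for (Op op : pipeline) {
            if (op.kind == Kind.GROUP_BY && grouped) {
                jobs.add(current);
                current = new ArrayList<>();
                grouped = false;
            }
            if (op.kind == Kind.GROUP_BY) {
                grouped = true;
            }
            current.add(op);
        }
        if (!current.isEmpty()) {
            jobs.add(current);
        }
        return jobs;
    }

    public static void main(String[] args) {
        List<Op> pipeline = Arrays.asList(
            new Op(Kind.EACH, "split line into words"),
            new Op(Kind.GROUP_BY, "group by word"),
            new Op(Kind.EVERY, "count"),
            new Op(Kind.GROUP_BY, "group by count"),
            new Op(Kind.EVERY, "collect words"));
        List<List<Op>> jobs = plan(pipeline);
        for (int i = 0; i < jobs.size(); i++) {
            // prints "job 1: 3 ops" then "job 2: 2 ops"
            System.out.println("job " + (i + 1) + ": " + jobs.get(i).size() + " ops");
        }
    }
}
```

The developer describes the chain once; deciding how many jobs it takes is left to the planner.
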
On Many Platforms
[Diagram: Processing API, Integration API, Scheduler API; Physical Planner; Scheduler; Platform]

• Apache Hadoop
• Amazon Elastic MapReduce
• MapR
• EMC/GreenPlum
• and more**
But How is Cascading Used?
RazorFish/BestBuy
[Diagram: Java [unit, regression, & integration testing] over Processing API, Integration API, Scheduler API; Physical Planner; Scheduler; Platform]

• E-Commerce visitor/customer behavior classification
• Rule processing against proprietary logs
• Backend system integration
FlightCaster
[Diagram: JVM Language/DSL [scripting, ad-hoc queries, etc] over a Logical Planner, over Processing API, Integration API, Scheduler API; Physical Planner; Scheduler; Platform]


• They predict flight delays 6 hrs in advance
• Created own API/DSL in Clojure
• Used to build predictive models
Etsy
[Diagram: JVM Language/DSL [scripting, ad-hoc queries, etc] over a Logical Planner, over Processing API, Integration API, Scheduler API; Physical Planner; Scheduler; Platform]



• Online retailer
• Forked its own API/DSL in JRuby
 • Cascading.JRuby, available on GitHub
What
• User behavior on site
• Data driven site features
 • Taste Test
 • Facebook gift recommender
 • Suggested Shops
 • Top Query List
 • plus many more on the way
BackType
[Diagram: JVM Language/DSL [scripting, ad-hoc queries, etc] over a Logical Planner, over Processing API, Integration API, Scheduler API; Physical Planner; Scheduler; Platform]


• Marketing intelligence
• Created Cascalog
 • an API/DSL in Clojure, available on GitHub
Ion Flux
[Diagram: Java [unit, regression, & integration testing] over Processing API, Integration API, Scheduler API; Physical Planner; Scheduler; Platform]

• Gene sequencing
Who Else?




http://concurrentinc.com/casestudies/
How is Cascading Different?
Pig/Hive
[Diagram: Query Syntax and Extension API over a Logical Planner, over Processing API, Integration API, Scheduler API; Physical Planner; Scheduler; Platform]

Great for ad-hoc queries, but hard to operationalize
Oozie/Azkaban
[Diagram: Scheduler Syntax over Processing API, Integration API, Scheduler API; Physical Planner; Scheduler; Platform]

• Great for gluing command line apps together
• A JVM scripting language plus Cascading is less brittle and offers more degrees of freedom
But They are Complementary
• No reason Oozie (or Talend) can’t be used to drive Cascading apps
• No reason Cascading can’t drive raw MR/Pig/Hive processes (see Riffle)
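
One way to picture a scheduler driving mixed workloads is as a dependency-ordered run list. The sketch below is a toy under that assumption: `ToyScheduler`, `add`, and `order` are hypothetical names, not part of Cascading, Riffle, or Oozie, and the process names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only: hypothetical names, not Cascading's or Riffle's API.
class ToyScheduler {
    // Process name -> names of the processes that must finish first.
    private final Map<String, List<String>> deps = new LinkedHashMap<>();

    void add(String name, String... dependsOn) {
        deps.put(name, Arrays.asList(dependsOn));
    }

    // Return a run order in which every process comes after its dependencies.
    List<String> order() {
        List<String> done = new ArrayList<>();
        while (done.size() < deps.size()) {
            boolean progressed = false;
            for (Map.Entry<String, List<String>> e : deps.entrySet()) {
                if (!done.contains(e.getKey()) && done.containsAll(e.getValue())) {
                    done.add(e.getKey());
                    progressed = true;
                }
            }
            if (!progressed) {
                throw new IllegalStateException("cycle or missing dependency");
            }
        }
        return done;
    }

    public static void main(String[] args) {
        ToyScheduler s = new ToyScheduler();
        s.add("report", "hive-aggregate");
        s.add("hive-aggregate", "cascading-flow");
        s.add("cascading-flow", "raw-mr-import");
        s.add("raw-mr-import");
        // prints [raw-mr-import, cascading-flow, hive-aggregate, report]
        System.out.println(s.order());
    }
}
```

The scheduler only needs each process's name and dependencies, so it does not matter whether a given step is a raw MapReduce job, a Pig or Hive script, or a Cascading flow.
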
Architecture isn’t Innovation
[Diagram: event, data, signal, info, knowledge; stages: collection, cleansing, processing, delivery; also normalization, scoring, mining]

The point of computing systems is to make data more valuable.
Everything else is an implementation detail.

Copyright Concurrent, Inc. 2011. All rights reserved.
Cascading 2.0

• Removed dependencies on Hadoop
• Improved Processing Planner architecture
• Improved integration APIs

To Do
• Support more platforms, including in-memory stream processing
• Make Planner more intelligent and leverage more complex data flow topologies
• Integrate with more systems and applications

We are Hiring
http://www.concurrentinc.com/careers/
