SlideShare ist ein Scribd-Unternehmen logo
1 von 21
Downloaden Sie, um offline zu lesen
Data Discovery Tool
        BigSheets
MapReduce with No Coding?
  p                     g
Atsushi Tsuchiya (eAtsuhsi@JP.ibm.com)
Atsushi Tsuchiya (eAtsuhsi@JP.ibm.com)
          Big Data Tiger Team
             IBM Software
             IBM Software
Looking at Data
              Looking at Data
• What would you do with Big data? 
    h      ld    d ih i d ?
• How to make use of it?
• It is difficult! – too vague.
   • No specific problem that needs to be solved.
            p       p
   • No specific question that needs to be answered.
• Only you know is to improve the business.
       yy                   p
• But you have *data*
• So what would you do first?
  So, what would you do first?
                Looking at Data!
                      g
IBM with Hadoop
            IBM with Hadoop
• IBM has been working with Open source 
           y           g
  community for the long time.
  – Eclipse, Hadoop and so on …

• BigInsights include Hadoop
BigInsights
• BigInsihgts i
   i    ih is IBM Hadoop product for Big data 
                    d       d    f    i d
  analytics.
  – Basic Edition (up to 10TB) – Free   無償で使えます!
  – Enterprise Edition 
         p

• Next version BigInsights ‐ coming soon
  Next version BigInsights coming soon.
  – v1.2 available.

• And many more
BigInsights Componetns
         BigInsights Componetns
• BigInsihgts i l d
   i    ih includes:
  –   IBM Java
  –   JAQL           - IBMが開発した言語(オープンソース)
  –   IBM Distribution of Hadoop
  –   BigSheets      - データ探索ツール
  –   FLEX scheduler for Adaptive MapReduce 
  –   Orchestrator (Workflow Engine)
  –   SystemT (Text Analytics), SystemML (Machine Learning)
  –   LDAP
  –   Web Console / Developer Studio
BigInsights – Basic Edition
                BigInsights – Basic Edition
                                                                      Version
                                                                  Will be Update     Basic    Enterprise
Function                                                             in Nov         Edition
                                                                                    Editi      Edition
                                                                                               Editi
                                                                     release.

Integrated Install                                                                 Inc        Inc
Open Source components:
Hadoop (including common utilities, HDFS, MapReduce framework)    0.20.2           Inc        Inc
Jaql (programming / query language)                               0.5.2            Inc        Inc
Pig (programming / query language)                                0.7              Inc        Inc
Flume (data collection/aggregation)                               0.9.1            Inc        Inc
Hive (data summarization/querying)                                0.5              Inc        Inc
Lucene (text search)                                              3.0.2
                                                                  302              Inc        Inc
Zookeeper (process coordination)                                  3.2.2            Inc        Inc
Avro (data serialization)                                         1.3.0            Inc        Inc
HBase (
      (real time read/write)
                     /     )                                      0.20.6
                                                                  0 20 6           Inc        Inc
Oozie (workflow/ job orchestration)                               2.2.2            Inc        Inc
Online documentation                                                               Inc        Inc
Capability to integrate with DB2, InfoSphere Warehouse                             Inc        Inc
 Two DB2 UDFs to submit jobs, and read results from BigInsights
BigInsights – Enterprise Edition
                     Enterprise Edition
                                                                        Basic    Enterprise
Function                                                               Edition    Edition
R Connector
 Jaql module to invoke R statistical capabilities from BigInsights   n/a         Inc
Netezza C
N t     Connector
                t
 Jaql modules to read/write data from/to Netezza                     n/a         Inc
LDAP                                                                 n/a         Inc
Web Console                                                          n/a         Inc
Workflow Engine                                                      n/a         Inc
Scheduler (Orchestrator)                                             n/a         Inc
Text Analytics Module (System T)                                     n/a         Inc
Eclipse support (for System T)*                                      n/a         Inc
BigSheets – Data Discovery Tool                                      n/a         Inc
IBM Optim Development Studio V2.2.1.0                                n/a         Inc
Support by IBM
  pp     y                                                           n/a         Inc
BigSheets
• A data exploring tool for Hadoop
• Only comes with BigInsights Enterprise edition
  Only comes with BigInsights Enterprise edition
BigSheets Concept Model
                     Concept Model
                           Enrich   Inspect


                                               Explore
Internet                                                   No Coding is Required!
            Gather
                             BigSheets


Intranet

                 Publish                      Get/
                                              Manipulate
 Logs       Gather


                           Massive Results
 Other                      in BigInsights

                                                    Explore & 
                                                    Analyze
It s like a spreadsheets.
It’s like a spreadsheets

                    Looks very familiar ?!?
Visualizations
• Predefined visualization
• Customer Plug‐in
  Customer Plug in




                  A number of coffee shops in North America for each States.
DEMO
Internet
                                                                     BigSheets

                                                          Intranet




                           Gather                         Logs


                                                          Other
                                                                     BigInsight
                                                                          s




• BigInsights can gather data from
   i    i h          h d f
  – Predefined formats :
     •   BigSheets data reader
     •   Basic crawler data reader
     •   Basic crawler data reader (binary support)
         Basic crawler data reader (binary support)
     •   Character‐delimited data reader
     •   Tab Separated Value (TSV) data reader
                p             (    )
     •   JavaScript Object Notation (JSON) array reader
     •   Comma Separated Value (CSV) data reader

  – Customer BigSheets Reader 
Internet
                                                  BigSheets

                                       Intranet




                      Gather           Logs


                                       Other
                                                  BigInsight
                                                       s




• BigInsights can import structured and 
   i    i h       i               d d
  unstructured data
  – CSV
  – Files
  – Network
     • http
          p
     • hdfs
     • AWS (S3n/S3)
  – Other
     • Customer Importer
Internet
                                                    BigSheets

                                         Intranet




       Collection                         Logs


                                          Other
                                                    BigInsight
                                                         s




A complete list of MacDonald s in North America.
A complete list of MacDonald's in North America
Internet
                                                                         BigSheets

                                                              Intranet


                                                              Logs

                                                                         BigInsight
                                                              Other           s




                                                  Calculate



               Reformat

Import



         A complete list of MacDonald's in North America.
Internet
                                     BigSheets

                          Intranet


                          Logs

                                     BigInsight
                          Other           s




Column chart




               Heat map
BigSheets in Action
                    in Action
              映 売  げ
• Blockbuster 映画売り上げ予測
 – ABC Newsより
Blockbuster – 映画の売り上げ予測
    IBM BigInsights/BigSheets
                 ①週末につぶやかれたTweets 
                 ①週末につぶやかれたTweets
                 (約200,000)フィードを受けて、




                 ②数時間以内に、
                 (今までは、月曜の朝になってから)
                  売り上げ予測チャ ト作成
                 ‐売り上げ予測チャート作成
                 ‐センチメント分析
                 例えば、今年の夏は、
                      がどれよりも人気があ た(
                 X‐manがどれよりも人気があった(つ
                 ぶやかれた)→宣伝、上映戦略など
                 をこまめに修正
Conclusion


• We all need to improve the business.

• S
  So, where would you start with Big data?
       h       ld      t t ith Bi d t ?

 Data Discovery is a key to start improving 
              YOUR Business!
              YOUR Business!
Thank you!
Thank you!

Weitere ähnliche Inhalte

Was ist angesagt?

Overview of Big data, Hadoop and Microsoft BI - version1
Overview of Big data, Hadoop and Microsoft BI - version1Overview of Big data, Hadoop and Microsoft BI - version1
Overview of Big data, Hadoop and Microsoft BI - version1Thanh Nguyen
 
Hadoop as Data Refinery - Steve Loughran
Hadoop as Data Refinery - Steve LoughranHadoop as Data Refinery - Steve Loughran
Hadoop as Data Refinery - Steve LoughranJAX London
 
Hadoop as data refinery
Hadoop as data refineryHadoop as data refinery
Hadoop as data refinerySteve Loughran
 
Jan 2013 HUG: Cloud-Friendly Hadoop and Hive
Jan 2013 HUG: Cloud-Friendly Hadoop and HiveJan 2013 HUG: Cloud-Friendly Hadoop and Hive
Jan 2013 HUG: Cloud-Friendly Hadoop and HiveYahoo Developer Network
 
Using hadoop to expand data warehousing
Using hadoop to expand data warehousingUsing hadoop to expand data warehousing
Using hadoop to expand data warehousingDataWorks Summit
 
Emergent Distributed Data Storage
Emergent Distributed Data StorageEmergent Distributed Data Storage
Emergent Distributed Data Storagehybrid cloud
 
HugeTable:Application-Oriented Structure Data Storage System
HugeTable:Application-Oriented Structure Data Storage SystemHugeTable:Application-Oriented Structure Data Storage System
HugeTable:Application-Oriented Structure Data Storage Systemqlw5
 
Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011
Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011
Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011Jonathan Seidman
 
Introduction to Hortonworks Data Platform for Windows
Introduction to Hortonworks Data Platform for WindowsIntroduction to Hortonworks Data Platform for Windows
Introduction to Hortonworks Data Platform for WindowsHortonworks
 
Alex Wade, Digital Library Interoperability
Alex Wade, Digital Library InteroperabilityAlex Wade, Digital Library Interoperability
Alex Wade, Digital Library Interoperabilityparker01
 
Db tech show - hivemall
Db tech show - hivemallDb tech show - hivemall
Db tech show - hivemallMakoto Yui
 
First Step for Big Data with Apache Hadoop
First Step for Big Data with Apache HadoopFirst Step for Big Data with Apache Hadoop
First Step for Big Data with Apache HadoopBorn2Learn Co., Ltd
 
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...Mahantesh Angadi
 
Etu L2 Training - Hadoop 企業應用實作
Etu L2 Training - Hadoop 企業應用實作Etu L2 Training - Hadoop 企業應用實作
Etu L2 Training - Hadoop 企業應用實作James Chen
 
Big Data Warehousing: Pig vs. Hive Comparison
Big Data Warehousing: Pig vs. Hive ComparisonBig Data Warehousing: Pig vs. Hive Comparison
Big Data Warehousing: Pig vs. Hive ComparisonCaserta
 
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012Jonathan Seidman
 

Was ist angesagt? (19)

Overview of Big data, Hadoop and Microsoft BI - version1
Overview of Big data, Hadoop and Microsoft BI - version1Overview of Big data, Hadoop and Microsoft BI - version1
Overview of Big data, Hadoop and Microsoft BI - version1
 
Hadoop
HadoopHadoop
Hadoop
 
Hadoop as Data Refinery - Steve Loughran
Hadoop as Data Refinery - Steve LoughranHadoop as Data Refinery - Steve Loughran
Hadoop as Data Refinery - Steve Loughran
 
Hadoop as data refinery
Hadoop as data refineryHadoop as data refinery
Hadoop as data refinery
 
Jan 2013 HUG: Cloud-Friendly Hadoop and Hive
Jan 2013 HUG: Cloud-Friendly Hadoop and HiveJan 2013 HUG: Cloud-Friendly Hadoop and Hive
Jan 2013 HUG: Cloud-Friendly Hadoop and Hive
 
Using hadoop to expand data warehousing
Using hadoop to expand data warehousingUsing hadoop to expand data warehousing
Using hadoop to expand data warehousing
 
Emergent Distributed Data Storage
Emergent Distributed Data StorageEmergent Distributed Data Storage
Emergent Distributed Data Storage
 
HugeTable:Application-Oriented Structure Data Storage System
HugeTable:Application-Oriented Structure Data Storage SystemHugeTable:Application-Oriented Structure Data Storage System
HugeTable:Application-Oriented Structure Data Storage System
 
Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011
Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011
Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011
 
Introduction to Hortonworks Data Platform for Windows
Introduction to Hortonworks Data Platform for WindowsIntroduction to Hortonworks Data Platform for Windows
Introduction to Hortonworks Data Platform for Windows
 
SQL in Hadoop
SQL in HadoopSQL in Hadoop
SQL in Hadoop
 
Alex Wade, Digital Library Interoperability
Alex Wade, Digital Library InteroperabilityAlex Wade, Digital Library Interoperability
Alex Wade, Digital Library Interoperability
 
Steve Watt Presentation
Steve Watt PresentationSteve Watt Presentation
Steve Watt Presentation
 
Db tech show - hivemall
Db tech show - hivemallDb tech show - hivemall
Db tech show - hivemall
 
First Step for Big Data with Apache Hadoop
First Step for Big Data with Apache HadoopFirst Step for Big Data with Apache Hadoop
First Step for Big Data with Apache Hadoop
 
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
 
Etu L2 Training - Hadoop 企業應用實作
Etu L2 Training - Hadoop 企業應用實作Etu L2 Training - Hadoop 企業應用實作
Etu L2 Training - Hadoop 企業應用實作
 
Big Data Warehousing: Pig vs. Hive Comparison
Big Data Warehousing: Pig vs. Hive ComparisonBig Data Warehousing: Pig vs. Hive Comparison
Big Data Warehousing: Pig vs. Hive Comparison
 
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
 

Ähnlich wie Hadoop Summit Japan 2011 Fall - LT by IBM

Big Data 視覺化分析解決方案
Big Data 視覺化分析解決方案Big Data 視覺化分析解決方案
Big Data 視覺化分析解決方案Etu Solution
 
Big Data: Technical Introduction to BigSheets for InfoSphere BigInsights
Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsightsBig Data:  Technical Introduction to BigSheets for InfoSphere BigInsights
Big Data: Technical Introduction to BigSheets for InfoSphere BigInsightsCynthia Saracco
 
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...Chris Baglieri
 
Webinar: Open Source Business Intelligence Intro
Webinar: Open Source Business Intelligence IntroWebinar: Open Source Business Intelligence Intro
Webinar: Open Source Business Intelligence IntroSpagoWorld
 
Intel IT OpenStack Journey - OpenStack Fall 2012 Summit.pdf
Intel IT OpenStack Journey - OpenStack Fall 2012 Summit.pdfIntel IT OpenStack Journey - OpenStack Fall 2012 Summit.pdf
Intel IT OpenStack Journey - OpenStack Fall 2012 Summit.pdfOpenStack Foundation
 
Dynamic Cubes Deep Dive IBM Cognos 10.2
Dynamic Cubes Deep Dive IBM Cognos 10.2Dynamic Cubes Deep Dive IBM Cognos 10.2
Dynamic Cubes Deep Dive IBM Cognos 10.2Senturus
 
The sensor data challenge - Innovations (not only) for the Internet of Things
The sensor data challenge - Innovations (not only) for the Internet of ThingsThe sensor data challenge - Innovations (not only) for the Internet of Things
The sensor data challenge - Innovations (not only) for the Internet of ThingsStephan Reimann
 
Avoiding 10 common SharePoint Administration mistakes
Avoiding 10 common SharePoint Administration mistakesAvoiding 10 common SharePoint Administration mistakes
Avoiding 10 common SharePoint Administration mistakesBenjamin Athawes
 
Tableau 7.0 prsentation
Tableau 7.0 prsentationTableau 7.0 prsentation
Tableau 7.0 prsentationinam_slides
 
Big data and hadoop introduction
Big data and hadoop introductionBig data and hadoop introduction
Big data and hadoop introductionAjay Mittal
 
Know thy logos
Know thy logosKnow thy logos
Know thy logosVishal V
 
Impact of in-memory technology and SAP HANA on your business, IT, and career
Impact of in-memory technology and SAP HANA on your business, IT, and careerImpact of in-memory technology and SAP HANA on your business, IT, and career
Impact of in-memory technology and SAP HANA on your business, IT, and careerVitaliy Rudnytskiy
 
Big data-at-detik
Big data-at-detikBig data-at-detik
Big data-at-detikk4ndar
 
sones company presentation
sones company presentationsones company presentation
sones company presentationsones GmbH
 
01 necto introduction_ready
01 necto introduction_ready01 necto introduction_ready
01 necto introduction_readywww.panorama.com
 
hari_duche_updated
hari_duche_updatedhari_duche_updated
hari_duche_updatedHari Duche
 
All Grown Up: Maturation of Analytics in the Cloud
All Grown Up: Maturation of Analytics in the CloudAll Grown Up: Maturation of Analytics in the Cloud
All Grown Up: Maturation of Analytics in the CloudInside Analysis
 
From open data to API-driven business
From open data to API-driven businessFrom open data to API-driven business
From open data to API-driven businessOpenDataSoft
 

Ähnlich wie Hadoop Summit Japan 2011 Fall - LT by IBM (20)

Iotbds v1.0
Iotbds v1.0Iotbds v1.0
Iotbds v1.0
 
Big Data 視覺化分析解決方案
Big Data 視覺化分析解決方案Big Data 視覺化分析解決方案
Big Data 視覺化分析解決方案
 
Big Data: Technical Introduction to BigSheets for InfoSphere BigInsights
Big Data:  Technical Introduction to BigSheets for InfoSphere BigInsightsBig Data:  Technical Introduction to BigSheets for InfoSphere BigInsights
Big Data: Technical Introduction to BigSheets for InfoSphere BigInsights
 
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
Finding the needles in the haystack. An Overview of Analyzing Big Data with H...
 
Webinar: Open Source Business Intelligence Intro
Webinar: Open Source Business Intelligence IntroWebinar: Open Source Business Intelligence Intro
Webinar: Open Source Business Intelligence Intro
 
Intel IT OpenStack Journey - OpenStack Fall 2012 Summit.pdf
Intel IT OpenStack Journey - OpenStack Fall 2012 Summit.pdfIntel IT OpenStack Journey - OpenStack Fall 2012 Summit.pdf
Intel IT OpenStack Journey - OpenStack Fall 2012 Summit.pdf
 
Dynamic Cubes Deep Dive IBM Cognos 10.2
Dynamic Cubes Deep Dive IBM Cognos 10.2Dynamic Cubes Deep Dive IBM Cognos 10.2
Dynamic Cubes Deep Dive IBM Cognos 10.2
 
The sensor data challenge - Innovations (not only) for the Internet of Things
The sensor data challenge - Innovations (not only) for the Internet of ThingsThe sensor data challenge - Innovations (not only) for the Internet of Things
The sensor data challenge - Innovations (not only) for the Internet of Things
 
Avoiding 10 common SharePoint Administration mistakes
Avoiding 10 common SharePoint Administration mistakesAvoiding 10 common SharePoint Administration mistakes
Avoiding 10 common SharePoint Administration mistakes
 
Tableau 7.0 prsentation
Tableau 7.0 prsentationTableau 7.0 prsentation
Tableau 7.0 prsentation
 
Big data and hadoop introduction
Big data and hadoop introductionBig data and hadoop introduction
Big data and hadoop introduction
 
Know thy logos
Know thy logosKnow thy logos
Know thy logos
 
Impact of in-memory technology and SAP HANA on your business, IT, and career
Impact of in-memory technology and SAP HANA on your business, IT, and careerImpact of in-memory technology and SAP HANA on your business, IT, and career
Impact of in-memory technology and SAP HANA on your business, IT, and career
 
Ofm msft-interop-v5c-132827
Ofm msft-interop-v5c-132827Ofm msft-interop-v5c-132827
Ofm msft-interop-v5c-132827
 
Big data-at-detik
Big data-at-detikBig data-at-detik
Big data-at-detik
 
sones company presentation
sones company presentationsones company presentation
sones company presentation
 
01 necto introduction_ready
01 necto introduction_ready01 necto introduction_ready
01 necto introduction_ready
 
hari_duche_updated
hari_duche_updatedhari_duche_updated
hari_duche_updated
 
All Grown Up: Maturation of Analytics in the Cloud
All Grown Up: Maturation of Analytics in the CloudAll Grown Up: Maturation of Analytics in the Cloud
All Grown Up: Maturation of Analytics in the Cloud
 
From open data to API-driven business
From open data to API-driven businessFrom open data to API-driven business
From open data to API-driven business
 

Kürzlich hochgeladen

Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 

Kürzlich hochgeladen (20)

Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 

Hadoop Summit Japan 2011 Fall - LT by IBM

  • 1. Data Discovery Tool BigSheets MapReduce with No Coding? p g Atsushi Tsuchiya (eAtsuhsi@JP.ibm.com) Atsushi Tsuchiya (eAtsuhsi@JP.ibm.com) Big Data Tiger Team IBM Software IBM Software
  • 2. Looking at Data Looking at Data • What would you do with Big data?  h ld d ih i d ? • How to make use of it? • It is difficult! – too vague. • No specific problem that needs to be solved. p p • No specific question that needs to be answered. • Only you know is to improve the business. yy p • But you have *data* • So what would you do first? So, what would you do first? Looking at Data! g
  • 3. IBM with Hadoop IBM with Hadoop • IBM has been working with Open source  y g community for the long time. – Eclipse, Hadoop and so on … • BigInsights include Hadoop
  • 4. BigInsights • BigInsihgts i i ih is IBM Hadoop product for Big data  d d f i d analytics. – Basic Edition (up to 10TB) – Free 無償で使えます! – Enterprise Edition  p • Next version BigInsights ‐ coming soon Next version BigInsights coming soon. – v1.2 available. • And many more
  • 5. BigInsights Componetns BigInsights Componetns • BigInsihgts i l d i ih includes: – IBM Java – JAQL - IBMが開発した言語(オープンソース) – IBM Distribution of Hadoop – BigSheets - データ探索ツール – FLEX scheduler for Adaptive MapReduce  – Orchestrator (Workflow Engine) – SystemT (Text Analytics), SystemML (Machine Learning) – LDAP – Web Console / Developer Studio
  • 6. BigInsights – Basic Edition BigInsights – Basic Edition Version Will be Update Basic Enterprise Function in Nov Edition Editi Edition Editi release. Integrated Install Inc Inc Open Source components: Hadoop (including common utilities, HDFS, MapReduce framework) 0.20.2 Inc Inc Jaql (programming / query language) 0.5.2 Inc Inc Pig (programming / query language) 0.7 Inc Inc Flume (data collection/aggregation) 0.9.1 Inc Inc Hive (data summarization/querying) 0.5 Inc Inc Lucene (text search) 3.0.2 302 Inc Inc Zookeeper (process coordination) 3.2.2 Inc Inc Avro (data serialization) 1.3.0 Inc Inc HBase ( (real time read/write) / ) 0.20.6 0 20 6 Inc Inc Oozie (workflow/ job orchestration) 2.2.2 Inc Inc Online documentation Inc Inc Capability to integrate with DB2, InfoSphere Warehouse Inc Inc Two DB2 UDFs to submit jobs, and read results from BigInsights
  • 7. BigInsights – Enterprise Edition Enterprise Edition Basic Enterprise Function Edition Edition R Connector Jaql module to invoke R statistical capabilities from BigInsights n/a Inc Netezza C N t Connector t Jaql modules to read/write data from/to Netezza n/a Inc LDAP n/a Inc Web Console n/a Inc Workflow Engine n/a Inc Scheduler (Orchestrator) n/a Inc Text Analytics Module (System T) n/a Inc Eclipse support (for System T)* n/a Inc BigSheets – Data Discovery Tool n/a Inc IBM Optim Development Studio V2.2.1.0 n/a Inc Support by IBM pp y n/a Inc
  • 8. BigSheets • A data exploring tool for Hadoop • Only comes with BigInsights Enterprise edition Only comes with BigInsights Enterprise edition
  • 9. BigSheets Concept Model Concept Model Enrich Inspect Explore Internet No Coding is Required! Gather BigSheets Intranet Publish Get/ Manipulate Logs Gather Massive Results Other in BigInsights Explore &  Analyze
  • 10. It s like a spreadsheets. It’s like a spreadsheets Looks very familiar ?!?
  • 11. Visualizations • Predefined visualization • Customer Plug‐in Customer Plug in A number of coffee shops in North America for each States.
  • 12. DEMO
  • 13. Internet BigSheets Intranet Gather Logs Other BigInsight s • BigInsights can gather data from i i h h d f – Predefined formats : • BigSheets data reader • Basic crawler data reader • Basic crawler data reader (binary support) Basic crawler data reader (binary support) • Character‐delimited data reader • Tab Separated Value (TSV) data reader p ( ) • JavaScript Object Notation (JSON) array reader • Comma Separated Value (CSV) data reader – Customer BigSheets Reader 
  • 14. Internet BigSheets Intranet Gather Logs Other BigInsight s • BigInsights can import structured and  i i h i d d unstructured data – CSV – Files – Network • http p • hdfs • AWS (S3n/S3) – Other • Customer Importer
  • 15. Internet BigSheets Intranet Collection Logs Other BigInsight s A complete list of MacDonald s in North America. A complete list of MacDonald's in North America
  • 16. Internet BigSheets Intranet Logs BigInsight Other s Calculate Reformat Import A complete list of MacDonald's in North America.
  • 17. Internet BigSheets Intranet Logs BigInsight Other s Column chart Heat map
  • 18. BigSheets in Action in Action 映 売 げ • Blockbuster 映画売り上げ予測 – ABC Newsより
  • 19. Blockbuster – 映画の売り上げ予測 IBM BigInsights/BigSheets ①週末につぶやかれたTweets  ①週末につぶやかれたTweets (約200,000)フィードを受けて、 ②数時間以内に、 (今までは、月曜の朝になってから) 売り上げ予測チャ ト作成 ‐売り上げ予測チャート作成 ‐センチメント分析 例えば、今年の夏は、 がどれよりも人気があ た( X‐manがどれよりも人気があった(つ ぶやかれた)→宣伝、上映戦略など をこまめに修正
  • 20. Conclusion • We all need to improve the business. • S So, where would you start with Big data? h ld t t ith Bi d t ? Data Discovery is a key to start improving  YOUR Business! YOUR Business!