Information Management and Analytics
AKA Discussion Papers
February 2012

Challenges and opportunities in gaining advantage
and leverage through data
  • Companies today are evolving into virtual networks of permanent and
    transient teams of people.
     - Enterprises can gain a competitive operating advantage by using
       social, local and mobile technology to generate leverage through
       individuals.
     - This leverage comes from applying targeted, specific data at the
       point and time of informational advantage.
  • Commonly used information architectures do not treat the delivery,
    collaboration and interchange of ALL types of information across
    networks of people as a core principle.
     - Creating, analyzing, managing, deciding on, evaluating and
       synthesizing information of all types is the dominant activity of
       knowledge workers throughout the enterprise.
  • Solving the Right Problems: companies must address two fundamental
    activities that intersect their daily routine:
     - Collaboration, communication and information sharing
     - Making sense of information: filtering the noise out of the constant
       stream

Big Data Volume Statistics and Predictions

[Chart: Digital Storage Acquisition in zettabytes. Source: IDC, Universal
Digital Data Explosion Study. X-axis: 1990, 2005, 2010, 2015. Labeled
values: 0.13 zb, 1.8 zb, 8 zb. Callout: a year's worth of data generated in
the 90's is created within 1 minute in 2011.]

Gartner: Unstructured data alone will explode to 650% of its present volume by 2017.

  Are you positioned to take advantage of the big data predictions?

What is Big Data? Where Does it Come From?

  • Big Data includes both internal AND external content. Not all data must
    reside internally for analysis.
  • Data is organized and managed by its type of structure.

    Type of Data        Structured               Semi-Structured          Unstructured

    Short Definition    Strictly meets its       Has a structure, but     Has little to no
                        object definition        it may differ greatly    structure and is not
                                                 between files            easily read by a
                                                                          machine

    Examples            Relational, flat file,   Excel, Word, XML,        PDF, X-ray, legal
                        web services, …          HTML, tweets,            documents, video,
                                                 email, …                 IM

  • Big Data is everywhere: search engines, instant messaging, social media,
    legal documents and contracts, medical records and test/scan outcomes,
    digital media, internal unstructured documents, stock tickers, press
    releases, etc.

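The search challenge with unstructured data:
Data Science

[Figure: quadrant chart. Source: Brewster Kahle.
 Y-axis: % of relevant data that are returned.
 X-axis: % of returned data that are relevant.
 Quadrants: Inefficient (upper left), Optimal (upper right),
 Worst (lower left), Incomplete (lower right).]
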
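The two axes of the search-challenge chart correspond to what information retrieval calls recall (% of relevant data that are returned) and precision (% of returned data that are relevant). A minimal sketch of how a single search result could be placed in one of the four quadrants follows. The document ids, the function names and the 0.5 cut-off between quadrants are illustrative assumptions, not part of the original slide.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for collections of document ids."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    # Precision: share of returned documents that are relevant (x-axis).
    precision = len(hits) / len(returned) if returned else 0.0
    # Recall: share of relevant documents that were returned (y-axis).
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


def quadrant(precision, recall, threshold=0.5):
    """Map a (precision, recall) pair to the slide's quadrant labels.

    The 0.5 threshold is an arbitrary assumption for illustration.
    """
    if recall >= threshold:
        return "Optimal" if precision >= threshold else "Inefficient"
    return "Incomplete" if precision >= threshold else "Worst"


# Hypothetical example: 4 relevant documents exist, the search returns 8.
relevant_docs = {"d1", "d2", "d3", "d4"}
search_hits = ["d1", "d2", "d3", "d7", "d8", "d9", "d10", "d11"]

p, r = precision_recall(search_hits, relevant_docs)
print(p, r, quadrant(p, r))
```

In this example the search finds most of the relevant documents but buries them among irrelevant ones, which is the "Inefficient" case; a search that returns only relevant documents but misses most of them would land in "Incomplete".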
How to Reveal the Content in Big Data and Determine
its Relevance and Confidence.
  • Sentiment analysis, a form of text analytics, provides the ability to
    filter big data to determine its relevance (social media, search
    engines, etc.).

    [Diagram: Capture tweets on Brand X -> Sentiment Analysis ->
     Happy / Unhappy / Need Help]
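The tweet-routing flow above can be sketched with a simple keyword lookup. This is a toy illustration only: the keyword lists, the `classify` function and the sample tweets are hypothetical, and a production system would more likely rely on a trained model or a curated sentiment lexicon.

```python
import re

# Hypothetical keyword lists, checked in this order (dict order sets priority).
LEXICON = {
    "Need Help": {"help", "broken", "support", "refund", "how"},
    "Unhappy": {"hate", "terrible", "worst", "awful", "disappointed"},
    "Happy": {"love", "great", "awesome", "thanks", "best"},
}


def classify(tweet):
    """Return the first bucket whose keywords appear in the tweet."""
    words = set(re.findall(r"[a-z']+", tweet.lower()))
    for label, keywords in LEXICON.items():
        if words & keywords:
            return label
    return "Unclassified"


tweets = [
    "I love Brand X, best purchase this year",
    "Brand X is the worst, so disappointed",
    "My Brand X is broken, how do I get support?",
    "Saw a Brand X ad today",
]

# Group the captured tweets by bucket.
buckets = {}
for t in tweets:
    buckets.setdefault(classify(t), []).append(t)

for label, items in buckets.items():
    print(label, len(items))
```

Checking "Need Help" first is a design choice in this sketch: a tweet that both complains and asks for support is routed to the team that can act on it.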
  • Textual ETL breaks content down into granular information using
    taxonomies and ontologies (PDF, DOC, SWIFT, etc.).

For Unstructured:                    For Semi-structured:
  - stop word processing               - textual structure mapping
  - stemming                           - variable pattern recognition
  - alternate spelling                 - variable symbol recognition
  - synonym concatenation              - multiple index type support
  - homograph resolution               - utilities including:
  - spell checking                        - raw data hidden character display
  - word and phrase proximity             - multiple path processing
                                          - final index trimming

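A few of the unstructured-text steps listed above (stop word processing, alternate spelling, stemming and synonym concatenation) can be sketched as a small pipeline. The lookup tables and the suffix-stripping stemmer here are simplified assumptions made for illustration; an actual textual ETL tool would draw on full taxonomies and ontologies and a proper stemming algorithm.

```python
import re

STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "for", "were", "by"}

# Hypothetical lookup tables standing in for a taxonomy or ontology.
ALTERNATE_SPELLINGS = {"colour": "color", "organisation": "organization"}
SYNONYMS = {"invoice": "bill", "statement": "bill"}

# Naive stemmer: strips one suffix. This is NOT the Porter algorithm.
SUFFIXES = ("ing", "ed", "s")


def stem(word):
    """Strip one common suffix, keeping at least three leading characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def textual_etl(text):
    """Tokenize, drop stop words, normalize spelling, stem, map synonyms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    terms = []
    for token in tokens:
        if token in STOP_WORDS:
            continue                                   # stop word processing
        token = ALTERNATE_SPELLINGS.get(token, token)  # alternate spelling
        token = stem(token)                            # stemming
        terms.append(SYNONYMS.get(token, token))       # synonym concatenation
    return terms


print(textual_etl("The invoices for the organisation were mailed"))
```

The output is a list of normalized terms that can be indexed or loaded into a structured store, which is the point of the "ETL" framing: free text goes in, consistent terms come out.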
The Value of Big Data
    Data Science: To Support or To Drive?
  • Perform analysis and exploration of Big Data.
  • Analyze RAW and/or integrated data, remove 'noise', mine for peaks and
    valleys, determine relevance and exploit the data for predictive analysis.

[Pyramid diagram labeled "ROI" and "Big Data Utilization"; its three tiers
are listed below from top to bottom.]

  • Top Level: Integrate and enrich with external data.
     - Reports: Predictive Analysis & Exploration Reports
     - Pyramid tier: Predictive Analysis: Drive the Business
     - Data: Integrated and RAW Internal & External Data
  • Mid Level: Integrate and enhance proprietary data.
     - Reports: BI Reports
     - Pyramid tier: Informed Decisions/Insights: Enhanced Support
     - Data: Integrated Internal Data & Purchased External Data
  • Bottom Level: Support operational systems.
     - Reports: Operational Reports
     - Pyramid tier: Operate & Support Business
     - Data: Internal Proprietary Data

Big Data Architecture
  • Non-relational distributed file system. Can augment existing systems.
  • Provides the ability to internalize Optimal big data while continuing to
    access and report on external data to position for predictive analysis.
  • Can use open source (Hadoop, Clojure, Storm, etc.) and/or an
    enterprise-level vendor to manage, monitor and support it, such as
    Teradata, Greenplum, Netezza, Exadata, etc.
  • Scalable and extensible solution.
  • MPP (Massively Parallel Processing) reduces query response and
    acquisition time.
  • Capable of handling RAW data.

  • Additional benefits:
     - Increased IT agility in meeting business requirements
     - Softens the brittleness of the data models
     - Ability to perform real-time analysis
     - Positions BI for next-generation architecture

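The idea behind MPP, splitting the data into shards, aggregating each shard independently and merging the partial results, can be sketched in a few lines. This is only an illustration of the scatter/gather shape: the sales rows and function names are hypothetical, and threads inside one Python process stand in for what an MPP system would do across many separate nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n):
    """Split rows into n shards, round-robin."""
    return [rows[i::n] for i in range(n)]


def local_aggregate(shard):
    """Per-worker step: total amount per region within one shard."""
    totals = Counter()
    for region, amount in shard:
        totals[region] += amount
    return totals


def parallel_total(rows, workers=4):
    """Scatter shards to workers, then gather and merge the partial totals."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(local_aggregate, partition(rows, workers))
        merged = Counter()
        for part in partials:
            merged.update(part)  # Counter.update adds the counts together
    return dict(merged)


# Hypothetical (region, amount) rows.
rows = [("east", 10), ("west", 5), ("east", 7), ("north", 3), ("west", 1)]
print(parallel_total(rows))
```

Because each shard is aggregated without reference to the others, adding workers (or, in a real deployment, nodes) shortens the per-worker scan, which is where the reduction in query response time comes from.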
Big Data Management
  • As with all forms of data, following data management best practices is
    critical to getting value out of big data.
  • Data management practices include:
     - Data quality and discovery
     - Relationship or linking algorithms
     - Data governance
     - Confidence levels and status codes
     - Metadata management
  • Information available about the data should include:
     - Where did the data point come from?
     - What type of cleansing, linkage or modification was performed?
     - When did this data arrive?
     - What is the temperature of the data (hot, warm or cold: how often is
       it accessed)?
     - Who are the consumers of the data?
     - When is the data required?
     - What is the value of the data?
     - What is it linked to?
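One way to make sure these questions can be answered is to carry a small metadata record alongside each data point. The sketch below is an assumption about how such a record might look; the class name, field names, status codes and sample values are all hypothetical and not taken from the original deck.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class DataPointMetadata:
    source: str                       # where did the data point come from?
    arrived_at: datetime              # when did this data arrive?
    temperature: str                  # "hot", "warm" or "cold"
    consumers: List[str]              # who are the consumers of the data?
    value: str                        # what is the value of the data?
    required_by: Optional[datetime] = None                     # when is it required?
    transformations: List[str] = field(default_factory=list)   # cleansing/linkage steps
    linked_to: List[str] = field(default_factory=list)         # what is it linked to?
    confidence: float = 1.0           # confidence level
    status_code: str = "RAW"          # hypothetical status code

    def record(self, step: str, status_code: str, confidence: float) -> None:
        """Log a cleansing, linkage or modification step and update status."""
        self.transformations.append(step)
        self.status_code = status_code
        self.confidence = confidence


meta = DataPointMetadata(
    source="twitter-feed",
    arrived_at=datetime(2012, 2, 1, 9, 30),
    temperature="hot",
    consumers=["marketing"],
    value="high",
)
meta.record("deduplicated", status_code="CLEANSED", confidence=0.9)
meta.linked_to.append("customer:1234")
print(meta)
```

Keeping the transformation log and confidence level on the record itself means a consumer can judge how far to trust a value without tracing it back through the pipeline.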
