Anexinet Big Data

Solutions for Big Data Analytics
Big Data Defined

Volume
• Datasets that grow too large to easily manage in a traditional RDBMS
• TBs, PBs, ZBs

Velocity
• Large-volume streaming data that can overwhelm traditional BI & ETL processes

Variety
• Data sources extraneous to traditional business systems that can be unstructured and require text analytics

Value
• Big Data can have a transformational effect on business when the proper systems and processes are put in place

Big Data vs. Classic BI

• What is the difference between classic DW/BI and Big Data Analytics?
  * Businesses today treat data warehousing & business intelligence as a must-have reporting and operational capability
  * Businesses that are not fully mature in the BI lifecycle may struggle with Big Data
• Big Data projects look for untapped analytics, not BI dashboards
• SCALE: Think Volume, Variety and Velocity
  * Yahoo! uses Microsoft SQL Server & Analysis Services with Hadoop, Oracle & Tableau
  * 38,000 machines distributed across 20 different clusters
  * 2-petabyte Hadoop cluster that feeds 1.2 terabytes of raw data each day into Oracle RAC
  * Data is compressed, and 135 gigabytes per day are sent to a SQL Server 2008 R2 Analysis Services cube
  * Cube produces 24 terabytes of data each quarter
  * http://www.microsoft.com/casestudies/Case_Study_Detail.aspx?CaseStudyID=710000001707
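
The Yahoo! figures can be sanity-checked with a little arithmetic. The sketch below is only an illustration: it assumes decimal units (1 TB = 1000 GB) and a 91-day quarter, neither of which is stated on the slide, and the variable names are made up for this example.

```python
# Assumptions (not from the slide): decimal units and a 91-day quarter.
GB_PER_TB = 1000
DAYS_PER_QUARTER = 91

# Figures quoted on the slide.
raw_gb_per_day = 1.2 * GB_PER_TB      # raw data fed from Hadoop into Oracle RAC
cube_feed_gb_per_day = 135            # compressed data sent to the SSAS cube
cube_output_tb_per_quarter = 24       # data produced by the cube

# How much smaller the daily cube feed is than the daily raw feed.
reduction_ratio = raw_gb_per_day / cube_feed_gb_per_day

# Total volume sent to the cube over one quarter, in TB.
cube_feed_tb_per_quarter = cube_feed_gb_per_day * DAYS_PER_QUARTER / GB_PER_TB

# Cube output relative to what was loaded into it.
expansion = cube_output_tb_per_quarter / cube_feed_tb_per_quarter

print(f"raw-to-cube reduction: about {reduction_ratio:.1f}x")
print(f"cube feed per quarter: about {cube_feed_tb_per_quarter:.1f} TB")
print(f"cube output vs. feed:  about {expansion:.1f}x")
```

Under these assumptions the daily feed shrinks roughly ninefold before it reaches the cube, and the cube then produces about twice the volume it was loaded with each quarter.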
Scalable Big Data Platform Architecture

[Diagram: Hadoop (HDFS Cluster, MapReduce Framework); Data Warehouse (MPP Database, Star Schemas); Analytics (In-memory cubes, Analytical Columnstore Tables); End User Reporting (Advanced in-memory analytics, Ad-hoc data discovery)]

© Copyright 2013 Anexinet Corp.
Go Beyond Dashboards. Provide Advanced Analytics.

• A large number of data points adds new business value
• Big Data advanced analytics requires a tool that can sample complex data sources
• Must provide quick aggregations of large data sets that are easily consumed by the human eye
• Must provide “data discovery” for ad-hoc analysis

[Screenshots: Tableau, Microsoft Power View, QlikView]
Marketing Samples

• Enhance marketing campaigns with Big Data
• Social analytics, customer analytics, targeted marketing, brand sentiment
• Big Data has proven transformational for marketing organizations (Razorfish, Yahoo!, NBC, [x+1])

[Screenshot: Web Analytics from Google Analytics]
Anexinet Big Data Offerings

Strategy Engagement
• Customer stakeholder interviews & interactive sessions
• Define Big Data Requirements
• Design Big Data Strategy
• Deliver Strategy & Roadmap Documents

Starter Solution
• Let Anexinet handle the hardest parts of a Big Data solution
  * Getting started
  * Collecting & processing data
  * Uncovering business value from Big Data

Big Data Project Engagement
• End-to-end Big Data project
  * Big Data Discovery
  * Big Data Platform
  * Big Data Analytics
  * Big Data Visualizations
Partnerships

Big Data Platforms
• EMC Greenplum
• Hortonworks (OSS, MSFT, HP)
• Cloudera (OSS, Oracle, HP)

Big Data Databases
• HP Vertica
• EMC Greenplum
• Microsoft PDW
• Oracle Exalytics
• Oracle Big Data Appliance

Big Data Visualizations
• QlikView
• Tableau
• Microsoft PowerPivot
• Microsoft Power View
A Credible Partner to Deploy Big Data Solutions

Security
• Ensure privacy of PII
• Conform Big Data solution to your enterprise security standards

Integration
• ETL / ELT
• Integrate Hadoop into your DW & Analytics environments
• Integrate Big Data into your IT investments

Configuration
• Configure the Big Data environment to maximize throughput, performance and analytics to meet your stated SLA goals

Governance
• Ensure Data Quality
• MDM
• Process Governance
Top Impediments to Successful Big Data Analytics
Big Data Buzzword Glossary
• Big Data: Think 3 V's, unstructured data, data that is not currently managed in the DW. This is the data that companies need to do game-changing analytics.
• Big Data Analytics: Business insights gained from mining Big Data to transform business processes.
• Columnar: Column-oriented databases that are used in Big Data scenarios because of their speed and compression capabilities, e.g. HP Vertica, HBase.
• Hadoop: Apache open-source framework for Big Data processing. Made up of multiple components. The leading Big Data platform. Commercial distributions are marketed by Cloudera & Hortonworks.
• In-memory DB: A database that resides fully in memory, eliminating disk I/O bottlenecks. Very important in Big Data Analytics systems, e.g. Microsoft PowerPivot, SSAS 2012, SAP HANA.
• MapReduce: Distributed data programming and processing framework. A key aspect of processing Big Data is using a MapReduce framework across distributed clusters of commodity servers. Available as open source in the Hadoop framework and in various Hadoop distribution flavors.
• MPP: Massively Parallel Processing database engine, mostly used for data warehouse & BI workloads, e.g. SQL Server PDW, IBM Netezza, Teradata.
• NoSQL: Non-relational data store (key-value, document or wide-column), typically schemaless and built for fast writes, often with eventual consistency rather than full ACID guarantees. Big Data systems use these to store data coming in from sources that dump large amounts of data quickly, e.g. Cassandra, MongoDB.
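
To make the MapReduce entry concrete, here is a minimal single-process sketch of the map, shuffle and reduce steps, using word counting as the example. It is an illustration only: it does not use Hadoop or any cluster, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical names chosen for this sketch.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle and reduce in sequence over a list of text records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = word_count(["big data", "Big Data analytics"])
print(counts)
```

In a real MapReduce framework the map and reduce calls run in parallel on different servers and the shuffle moves data between them over the network; here everything runs in one process so the data flow is easy to follow.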
