© 2016 WIPRO LTD | WWW.WIPRO.COM | CONFIDENTIAL1
Big Data Ready Enterprise
Sri Harsha Boda – Wipro Technologies
Agenda
1. About BDRE
2. Typical use cases around Big Data Platform
3. Common challenges when implementing at scale
4. How BDRE addresses the needs across the lifecycle
5. FastTrack Implementation using BDRE
6. Demo
7. Typical enterprise deployment view with BDRE
8. Metadata Management in depth
About BDRE
• BDRE is an open source project released under the Apache License 2.0. The code is available on GitHub.
• Wipro's largest open source contribution to date.
• Community Choice winner in the Modern Data Applications track at Hadoop Summit San Jose, 2016.
Typical use cases organizations are embarking on with Big Data Analytics
• Information Delivery
• Enterprise Data Hub / Lake
• Information, Integration & Governance
• Batch Data Processing
• Event Stream & Micro-batch Processing
• Enterprise Data Provisioning Platform
• Low Latency Store
• Complex multistep pipeline transformation
• Migration of EDW workloads
• Data as a Service
• Enterprise Analytical Platform
Common challenges when implementing these use cases at scale
Skilled resources, shorter implementation cycles
Rapid ingestion of data
Rework across several complex multi-step processes
Robust application deployment support
Support flexible operations & SLA management
Robust operational metadata across technologies
How BDRE addresses the needs across the lifecycle
• Pluggable Architecture
• Community Driven
• Distribution Compatible

Without BDRE
• Basic Hadoop at the base
• Operational functions you would like to build
• Development effort from scratch

With BDRE
• Basic Hadoop at the base
• "Pre-built operational functions" brought in by BDRE
• Hadoop applications: minimal development effort through customization of BDRE components

Supporting Operational Functions (value-add through BDRE)
• OPERATIONAL METADATA
• RAPID INGESTION
• VISUAL DATA PIPELINE
• AUTOMATED WORKFLOW
• ONE TOUCH DEPLOYMENT
• SLA MANAGEMENT
• RICH VISUALIZATION

Implementation Jumpstart: Big Data Ready Enterprise
FastTrack Implementation using BDRE
Key features that can be rapidly implemented using the product
Data Ingestion via Multiple Sources
• Abstraction layer: component to ingest a variety of data (CPY, XML, DB, Mainframes)
• Streaming Data Ingest: 16 sources, including Twitter, Flume, logs, and message queues
• File Monitoring: component to check the validity of incoming data at file and record level
• Cluster-to-cluster Hive table migration
Job Automation & Security Integration
• UI-based Workflow Designer
• Supports Hive, Pig, MapReduce, Spark, R, Python
• Automated Workflow Generator: Oozie/Airflow
• Authentication: integration with Kerberos & JAAS
Data Quality and Data Profiling
• Enforce data quality and data processing rules (during ingestion or post ingestion)
• DQ analysis, integrity & failure handling
• Data Loading: test data generation
One Touch Deployment
• Automated central deployment and application management
• Registry of all workflow processes / templates
• Automated process flow planner
Operational Metadata & Lineage
• Job registry
• Configuration management
• Dependency management (pipelining)
• Batch management/tracking
• Real-time execution status
• Ingestion registry
• Job monitoring and proactive/reactive alerting
• Restartability
Analytics & Visualization
• Support for executing models: R, Python, Spark
• Zero-coding, UI-based configuration for common use cases
• UI-based metadata interaction & search
• Data exploration integration with notebooks
• Visual representation of workflow
DEMO
Typical enterprise deployment view with BDRE
[Deployment diagram. Labelled components: Browser; App Server; BDRE UI App; BDRE REST API; JAAS; Edge Node; Eventing Framework; Espresso; Email; SLA notification; Oozie Workflow Generator; Data Quality Workflow; Ingestion Workflow; Semantic Workflow; Bulk Data Generation Workflow; Non-Hadoop Workflows; Job Deploy Scripts; Operational Metadata; RDBMS Metastore; Rule Engine (for DQ); Hadoop Cluster (NN, RM, Jobs); Proactive Reporting; App Store (Git Repo); Job Export/Import.]
BDRE Metadata Management system
Intra and Inter Process Dependency
Pid   Enq id   Parent id
300   Null     Null
301   100      300
302   Null     300
303   Null     300
304   200      300

[Diagram: parent processes 100, 200, 300, and 400, with sub-processes 101 to 103, 201 to 205, 301 to 304, and 401 to 402 respectively.]
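
The table above can be read as a small data structure. The sketch below is only an illustration in Python, not BDRE code: the `ProcessRow` class and both helper functions are hypothetical names, and it assumes that "Enq id" names the upstream process whose batches a sub-process consumes, which the slide does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProcessRow:
    pid: int
    enq_id: Optional[int]     # assumed: upstream process whose batches this step consumes
    parent_id: Optional[int]  # None marks a parent (workflow-level) process


# Rows copied from the table on the slide.
ROWS = [
    ProcessRow(300, None, None),
    ProcessRow(301, 100, 300),
    ProcessRow(302, None, 300),
    ProcessRow(303, None, 300),
    ProcessRow(304, 200, 300),
]


def sub_processes(rows, parent_pid):
    """Intra-process dependency: steps that belong to one parent process."""
    return [r.pid for r in rows if r.parent_id == parent_pid]


def upstream_parents(rows, parent_pid):
    """Inter-process dependency: upstream processes a parent waits on."""
    return sorted({r.enq_id for r in rows
                   if r.parent_id == parent_pid and r.enq_id is not None})


print(sub_processes(ROWS, 300))     # [301, 302, 303, 304]
print(upstream_parents(ROWS, 300))  # [100, 200]
```

Under that reading, process 300 groups four sub-processes and depends on the output of processes 100 and 200.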
Job Status Management
Job-level APIs: InitJob, HaltJob (success), TermJob (failure)
Step-level APIs: InitStep, HaltStep (success), TermStep (failure)
[Diagram labels: BDRE Operational Metadata; MQ with a fail queue and a success queue; Consumer; JIRA.]
• The HaltJob and TermJob APIs can send a message to an MQ for proactive alerting.
• Alternatively, BDRE could connect directly to any alerting or ticket management system, skipping the MQ.
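
A minimal sketch of this lifecycle, written in Python purely for illustration. The `JobTracker` class, its method names, and the in-memory lists standing in for the MQ queues are all assumptions; the real BDRE APIs write to the operational metadata database and their signatures are not shown on the slide.

```python
from enum import Enum


class Status(Enum):
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"


class JobTracker:
    """In-memory stand-in for the operational metadata store (hypothetical)."""

    def __init__(self):
        self.status = {}         # process id -> Status
        self.success_queue = []  # stand-ins for the MQ queues on the slide
        self.fail_queue = []

    def init_job(self, pid):
        self.status[pid] = Status.RUNNING

    def halt_job(self, pid):
        self._finish(pid, Status.SUCCESS, self.success_queue)

    def term_job(self, pid, reason=""):
        self._finish(pid, Status.FAILED, self.fail_queue, reason)

    def _finish(self, pid, status, queue, reason=""):
        if self.status.get(pid) is not Status.RUNNING:
            raise ValueError(f"process {pid} is not running")
        self.status[pid] = status
        queue.append({"pid": pid, "status": status.value, "reason": reason})


def run(tracker, pid, work):
    """Wrap a unit of work in InitJob / HaltJob / TermJob style calls."""
    tracker.init_job(pid)
    try:
        work()
    except Exception as exc:
        tracker.term_job(pid, reason=str(exc))
    else:
        tracker.halt_job(pid)


tracker = JobTracker()
run(tracker, 300, lambda: None)   # succeeds
run(tracker, 301, lambda: 1 / 0)  # fails with ZeroDivisionError
print(tracker.status[300].value, tracker.status[301].value)  # success failed
```

A consumer that opens tickets would read from `fail_queue`; in the direct-connection variant mentioned above, `term_job` would call the ticketing system itself instead of appending to a queue.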
Batch Management
[Diagram: logical pipeline between the processes. Workflow 100 (process 100 with sub-processes 101 to 103), Workflow 200 (process 200 with 201 to 205), Workflow 300 (process 300 with 301 to 304), and Workflow 400 (process 400 with 401 and 402) are linked by batch queues.]
• A row is added to the queue table for every downstream process upon each successful execution of an upstream process.
• The downstream process looks up the queue and processes all pending batches enqueued by the upstream process.
• Multiple source batches consumed = one target batch produced.
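
The queue behaviour described on this slide can be sketched as follows. This is an illustrative Python model, not the BDRE schema: the `BatchQueue` class and its methods are hypothetical, and the topology used in the example (100 and 200 feed 300, 300 feeds 400) is an assumption based on the dependency table shown earlier.

```python
import itertools


class BatchQueue:
    """In-memory stand-in for the queue table (hypothetical schema)."""

    def __init__(self, downstream_of):
        # downstream_of: upstream pid -> list of downstream pids
        self.downstream_of = downstream_of
        self.rows = []  # (consumer pid, source batch id)
        self._next_batch = itertools.count(1)

    def on_success(self, upstream_pid):
        """Upstream finished: produce one batch and enqueue it for each downstream."""
        batch_id = next(self._next_batch)
        for consumer in self.downstream_of.get(upstream_pid, []):
            self.rows.append((consumer, batch_id))
        return batch_id

    def consume(self, pid):
        """Downstream run: take every pending batch, then emit one target batch."""
        pending = [b for consumer, b in self.rows if consumer == pid]
        if not pending:
            return None, []
        self.rows = [(c, b) for c, b in self.rows if c != pid]
        return self.on_success(pid), pending


queue = BatchQueue({100: [300], 200: [300], 300: [400]})
queue.on_success(100)  # batch 1
queue.on_success(100)  # batch 2
queue.on_success(200)  # batch 3
target, sources = queue.consume(300)
print(target, sources)  # 4 [1, 2, 3]
```

Three source batches are consumed in a single run of process 300, which produces exactly one target batch (batch 4) that is in turn enqueued for process 400.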
Data Quality Component
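[Diagram: the original file with all records is read on Hadoop by a map-only MR job (Mapper 1, Mapper 2, ... Mapper n). Rule definitions are created in the rule engine UI and reach the mappers through the Guvnor API. The output is split into good records and bad records.]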
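
The core idea of the component, judging each record independently against a rule set and routing it to a good or bad output, can be sketched in plain Python. This does not use Hadoop or a rule engine; the two rules, the record layout, and the function names are hypothetical and chosen only to make the example runnable.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, str]
Rule = Tuple[str, Callable[[Record], bool]]

# Hypothetical rules; the slide does not list any.
RULES: List[Rule] = [
    ("id is numeric", lambda r: r.get("id", "").isdigit()),
    ("email has @", lambda r: "@" in r.get("email", "")),
]


def failed_rules(record: Record, rules: List[Rule]) -> List[str]:
    """Return the names of the rules this record violates."""
    return [name for name, check in rules if not check(record)]


def split_records(records: Iterable[Record], rules: List[Rule]):
    """Map-only style pass: each record is judged on its own, with no reduce step."""
    good, bad = [], []
    for record in records:
        failures = failed_rules(record, rules)
        if failures:
            bad.append((record, failures))
        else:
            good.append(record)
    return good, bad


records = [
    {"id": "1", "email": "a@example.com"},
    {"id": "x2", "email": "b@example.com"},
    {"id": "3", "email": "missing"},
]
good, bad = split_records(records, RULES)
print(len(good), [f for _, f in bad])  # 1 [['id is numeric'], ['email has @']]
```

Because no record depends on any other, the same check can run in parallel across mappers, which is why a map-only job fits this kind of validation.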
Important Links
BDRE GitHub repo: https://github.com/WiproOpenSourcePractice/openbdre
Contains source code, setup instructions, and demo videos.
To contribute, please sign up at the BDRE Jira: https://openbdre.atlassian.net/
Please join the community: https://groups.google.com/forum/#!forum/bdre
If you have any questions or suggestions, please email bdre-queries@googlegroups.com.
Sri Harsha Boda
Thank You
sri.boda@wipro.com
