Big data is huge! With billions and billions of data sets, analyzing them and applying the results to real-life problem solving is a challenge. Are traditional methods successful in solving big data problems?
Let's take a look at the current state of big data and ask whether traditional methodologies are providing the necessary answers quickly enough. Is Agile/Scrum a good fit for big data?
– big data in any industry
– high data availability, real-time analytics, data warehousing
– agile spectrum and where do my projects fall?
– big data complexity and empirical process control theory
– current industry trends
– metrics
Is Scrum a good fit for solving big data challenges?
1.
Is Scrum a good fit for
solving big data challenges?
Speaker – Raj Kasturi
September 19th, 2017
10:00 to 11:00 AM EST,
7:30 PM to 8:30 PM IST
Special Thanks to:
2.
• 25+ years of IT experience, with eight-plus years of enterprise-level Agile experience
• Agile experience as an Agile Coach, Scrum Trainer, and Scrum Master
• Leading and helping large-scale Agile project transitions
• Adjunct faculty at Pennsylvania State University, Pennsylvania, USA
• 18+ years of teaching experience in Technology and Project Management, and 8 years teaching Scrum and Agile courses
• Started my career as a programmer; worked as an Application Development Manager
• Speaker and volunteer at agile conferences and user groups
• Servant Leader – Agile World, User Group, Scrum Alliance
My Website/Blog: http://agilekingdom.com/
@AgileRaj
https://www.linkedin.com/in/rajkasturi/
Raj Kasturi, MBA
3.
Agenda
What is big data?
The three V’s of big data
Big Data Trends of 2017
Agile Spectrum
Big data complexity and empirical process control theory
Scrum and Big Data
Summary
4.
What is big data?
▪ The term big data was coined in the late 1990s
▪ Big data is different from regular data
▪ Billions of data sets and their interactions
▪ A traditional RDBMS is designed for regular data
▪ An RDBMS cannot handle big data
▪ Big data requires a new technological approach to handling and processing
▪ New data platforms are needed to meet its storage and performance requirements
5.
The 3 V’s of big data
Volume
Velocity
Variety
Are these three factors required to drive the need?
6.
Add Value
▪ Do we have a fourth V? Value
▪ Aggregate to provide value
7.
Google’s flu tracker
▪ Knowing the what, rather than the why, was good enough
▪ 2009 H1N1 flu epidemic
▪ Real-time flu tracker: "Google Flu Trends"
▪ Flu sufferers google their symptoms before visiting a clinic
▪ Search queries yielded optimized, accurate, real-time data
▪ The data was far more effective than the CDC's reporting – size mattered
▪ 3 billion searches a day
▪ Large servers and clever algorithms to sort the data
8.
Who uses it?
▪ Financial Services
▪ Telecommunications
▪ Energy
▪ Government
▪ Retail
▪ And many more…
12.
Defined Process
[Diagram: Inputs → Defined Process → Outputs; composition known, characteristics well defined, may have internal processes]
• Sequential/Series of steps
• Underlying process well understood
• Results repeatable/predictable
• Command & Control approach
• Pre-defined variations are acceptable
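A tiny illustration of my own (not from the deck): a defined process is a fixed sequence of steps whose output is fully determined by its input, so rerunning it repeats the result exactly.

```python
# Defined process: a fixed, well-understood sequence of steps.
def defined_process(readings):
    cleaned = [r for r in readings if r >= 0]    # step 1: drop invalid readings
    scaled = [r * 1.8 + 32 for r in cleaned]     # step 2: convert Celsius to Fahrenheit
    return sum(scaled) / len(scaled)             # step 3: average

data = [20.0, 22.5, -999.0, 21.0]  # -999.0 is a sensor error code
assert defined_process(data) == defined_process(data)  # repeatable and predictable
print(f"average = {defined_process(data):.1f} F")
```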
14.
Empirical Process
[Diagram: Inputs → Black Box → Outputs]
• Transparency – significant aspects of the process must be visible to those responsible for the outcome
• Inspection – frequently inspect and remove any unacceptable variations; needs frequent measurement
• Adaptation – adjust and control the process; improve
• The problem cannot be fully understood or defined
• The solution evolves as information becomes known
• Protect the black box by not adding anything new!
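A toy sketch of my own (not from the deck) of what empirical process control looks like in code, assuming a noisy black-box step we can observe but not fully define: run the process, inspect the output against a tolerance, and adapt the one input we control before the next cycle.

```python
import random

random.seed(42)  # reproducible noise for the demo

target, tolerance = 100.0, 5.0
setting = 80.0  # the controllable input we are allowed to adapt

for cycle in range(1, 6):
    # Black box: we can observe its output but not fully define its internals.
    output = setting + random.gauss(0, 3)

    # Inspect: measure the variation from the desired outcome.
    error = target - output
    print(f"cycle {cycle}: output={output:.1f}, error={error:+.1f}")

    # Adapt: nudge the controllable input toward the target if the
    # variation is unacceptable; otherwise leave the black box alone.
    if abs(error) > tolerance:
        setting += 0.5 * error
```

Transparency is the printed measurement, inspection is the error check, and adaptation is the adjustment to the input rather than to the black box itself.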
17.
Hadoop’s distributed file system (HDFS)
Source: Managing Big Data Workflows For Dummies
MapReduce: think of it as a framework that processes and reduces raw big data into regular-size, tagged datasets that are much easier to work with.
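As a concrete illustration (my own minimal sketch, not from the deck or the Dummies book), the MapReduce pattern can be mimicked in plain Python: a map step emits key-value pairs from raw input, a shuffle step groups them by key, and a reduce step collapses each group into a small, regular result.

```python
from collections import defaultdict

# Map: emit a (word, 1) pair for every word in every raw input line.
def map_phase(lines):
    for line in lines:
        for word in line.lower().split():
            yield word, 1

# Shuffle: group all emitted values by key.
def shuffle_phase(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

# Reduce: collapse each group into a single (word, count) result.
def reduce_phase(groups):
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data is huge", "big data needs new tools"]
print(reduce_phase(shuffle_phase(map_phase(lines))))
# {'big': 2, 'data': 2, 'is': 1, 'huge': 1, 'needs': 1, 'new': 1, 'tools': 1}
```

In a real Hadoop cluster the same three phases run in parallel across many nodes over HDFS blocks; the single-process version above only shows the shape of the computation.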
19.
Scrum and Big Data
➢ Scrum's ability to measure work output: velocity
➢ Knowledge is based on the ability to measure a given phenomenon
➢ Once we can measure it, we can start to manipulate the input and determine from the resulting output whether we have improved something: the inspect-and-adapt concept
➢ As discussed, Scrum is based on empirical process control
➢ Continuous improvement
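To make the velocity point concrete, here is a toy sketch of my own (the numbers are hypothetical, not from the talk): velocity is the sum of story points completed in a sprint, and inspect-and-adapt amounts to comparing the latest sprint against the baseline of earlier ones.

```python
# Story points completed in each sprint (hypothetical numbers).
completed = [
    [3, 5, 8],     # Sprint 1
    [5, 8, 8, 3],  # Sprint 2
    [8, 13, 5],    # Sprint 3
]

# Velocity: total points completed per sprint.
velocities = [sum(sprint) for sprint in completed]

# Inspect: compare the latest sprint to the average of earlier sprints.
baseline = sum(velocities[:-1]) / len(velocities[:-1])
latest = velocities[-1]
print(f"velocities={velocities}, baseline={baseline:.1f}, latest={latest}")

# Adapt: if a process change lowered velocity, revisit it; if it helped, keep it.
print("improved" if latest >= baseline else "inspect the process and adapt")
```

Once output is measured this way, any change to the input (team practices, tooling, backlog shape) can be judged by what it does to the trend, which is the inspect-and-adapt loop in miniature.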
20.
Top 10 Big Data Trends 2017
1. Big data becomes fast and approachable:
Options expand to speed up Hadoop
2. Big data no longer just Hadoop:
Purpose-built tools for Hadoop become obsolete
3. Organizations leverage data lakes from the get-go
to drive value
4. Architectures mature to reject one-size-fits-all frameworks
5. Variety, not volume or velocity, drives big-data
investments
21.
Top 10 Big Data Trends 2017
6. Spark and machine learning light up big data
7. The convergence of IoT, cloud, and big data creates new opportunities for self-service analytics
8. Self-service data prep becomes mainstream as end users
begin to shape big data
9. Big data grows up: Hadoop adds to enterprise standards
10. Rise of metadata catalogs helps people find analysis-worthy big data
22.
Summary
Scrum is a good fit for work that:
Involves a fair degree of complexity
Requires innovation
Requires invention
Calls for product differentiation
Demands productivity
Needs a faster launch to market
I say that big data needs all of the above.
23.
Attributions
1. Scrum Guide 2016: http://www.scrumguides.org/scrum-guide.html
2. JJ Sutherland, "Scrum & Big Data": https://www.scruminc.com/scrum-big-data-2/
3. Joe Goldberg and Lillian Pierson, PE, Managing Big Data Workflows For Dummies (BMC Software special edition)
4. Tableau, "Top 10 Big Data Trends for 2017"