1. The Big Cloudy Continuity
Successful Disaster Recovery and Business Continuity
In the world of Big Data and Cloud services
Bozhidar Spirovski
spirovski.b@clearmorning.net
spirovski.b@gmail.com
2. The Big Cloudy Continuity
• Perceived provider preference and the continuity abilities of the cloud
• Big data going into the cloud
• Adjusting Business Continuity to Big Data
• A new paradigm of Business Continuity
9. The Continuous Cloud
• Generic services are replicated
▫ S3 Storage
▫ Google Cloud Storage
▫ Azure Storage
▫ But would HDFS data actually replicate correctly?
▫ It should be possible, but early testing is required (a minimal test is sketched below)
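A minimal sketch of such an early test, assuming a Hadoop client with the S3A connector (hadoop-aws) on the classpath; the namenode address, bucket, and paths are hypothetical. Checksums are not directly comparable between HDFS and S3, so this sketch falls back to comparing file presence and size:

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical early test: verify that every file under an HDFS
// directory arrived in the S3 replica with the same size.
public class ReplicaCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem hdfs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
        FileSystem s3 = FileSystem.get(URI.create("s3a://backup-bucket"), conf);

        // Record name and size of every file in the HDFS source directory.
        Map<String, Long> sourceSizes = new HashMap<>();
        for (FileStatus f : hdfs.listStatus(new Path("/data/events"))) {
            sourceSizes.put(f.getPath().getName(), f.getLen());
        }

        // Check that each file exists in the replica with a matching size.
        int mismatches = 0;
        for (Map.Entry<String, Long> e : sourceSizes.entrySet()) {
            Path replica = new Path("/events/" + e.getKey());
            if (!s3.exists(replica) || s3.getFileStatus(replica).getLen() != e.getValue()) {
                System.out.println("MISMATCH: " + e.getKey());
                mismatches++;
            }
        }
        System.out.println(mismatches == 0 ? "Replica consistent" : mismatches + " mismatches");
    }
}
```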
10. Some big data definitions
• "Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications." (Mike 2.0, 2013)
• "Big data is one of those new, shiny labels, like SDN, DevOps and cloud computing, that is both hard to ignore and hard to understand." (Hardman, 2013)
11. Big data source
[Diagram: a bicycle with its components labeled: saddle, seat rails, seatpost and clamp, handlebars, shift and brake levers, fork, hub, pedal, crankset, chainring, chain, chainstay, seatstay, seatpost, cassette, spokes, dropouts, derailleurs, wheel, rim, tire, brakes, cables, frame tubes]
12. Big data source
[Diagram: the same bicycle annotated with the measurements each component can yield: weight, vibration, torsion, friction, temperature, pressure, tension, clamp grip, force and vector of pull, shift speed, chain alignment, brake friction, tire twist]
14. The Big Cloud (public flavor)
• Amazon EC2 supports Hadoop operation
• Amazon Elastic MapReduce
• Azure offers HDInsight (still in preview)
• Azure is looking for academic big data projects (September 2013)
• HP offers Hadoop, but also Cassandra
• Google offers Compute Engine IaaS and BigQuery SaaS (Dremel based)
• …
15. Plug into the Big cloud
• Initial data delivery
• Continuous data delivery
• Data integrity
• Status monitoring of compute nodes
• Performance (requested vs. delivered), regardless of SLAs (a simple probe is sketched below)
• Data access and pullout (the ultimate lock-in preventer)
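On the performance point: a minimal sketch of measuring delivered latency yourself instead of trusting the SLA, assuming the provider exposes a small probe object; the URL is hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical probe: time repeated fetches of a small object to
// track delivered (not promised) latency of the cloud service.
public class LatencyProbe {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://example-bucket.s3.amazonaws.com/probe.txt"))
                .GET()
                .build();

        for (int i = 0; i < 10; i++) {
            long start = System.nanoTime();
            HttpResponse<String> resp =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println("status=" + resp.statusCode() + " latency=" + millis + "ms");
            Thread.sleep(1_000); // one sample per second; tune for real monitoring
        }
    }
}
```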
16. The BIA
• Assign the perceived business impact of losing each business element for a given amount of time
• Assign the perceived/required information availability
• Calculate the cost of making each business element resilient
• Decide whether that cost is lower than the impact (a worked comparison is sketched below)
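A minimal sketch of that final comparison, with entirely hypothetical figures: annualize the expected outage impact and weigh it against the annual cost of resilience.

```java
// Hypothetical BIA figures for a single business element.
public class BiaCompare {
    public static void main(String[] args) {
        double hourlyLoss = 12_000.0;            // assumed revenue + penalty loss per hour down
        double expectedOutageHours = 8.0;        // assumed outage duration per incident
        double incidentsPerYear = 0.5;           // assumed likelihood: one incident every two years
        double resilienceCostPerYear = 30_000.0; // assumed annual cost of redundancy

        double annualImpact = hourlyLoss * expectedOutageHours * incidentsPerYear;
        System.out.printf("Annual expected impact: %.0f%n", annualImpact);      // 48000
        System.out.printf("Annual resilience cost: %.0f%n", resilienceCostPerYear);
        System.out.println(resilienceCostPerYear < annualImpact
                ? "Resilience pays for itself"
                : "Accept the risk or find a cheaper control");
    }
}
```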
17. The Big BIA
• Is your Big Data insight real-time or trending?
• Is it connected to current revenue?
• Can you survive/operate/recover with a smaller subset
of data (recovered/recreated)?
• Do you have insights on other locations (have the currently relevant questions already been asked, with answers available)?
• Are the persons asking the questions available?
18. The Usual Business continuity
• BCP ensures that critical functions of an organization remain available to stakeholders
• Business Continuity is (at best) the job of the CISO or CIO, who delegate it to a trusted expert team
• The team prepares documentation and systems, and runs tests. They prepare reports. Auditors are happy
• Bonuses get distributed
19. The Usual Business continuity
• A Somebody Else's Problem, or SEP, can run almost indefinitely… because it utilizes a person's natural tendency to ignore things they don't easily accept…
• Any object around which a SEP is applied will cease to be noticed, because any problems one may have understanding it become Somebody Else's
Source: Douglas Adams
20. The New Business continuity
• Business continuity requires a broader audience
▫ Line management responsibility
▫ Insight from the lowest tier upwards
▫ A culture of resiliency and crisis awareness on a company scale
▫ Reacting to a possible crisis is embedded
▫ Employees avoid crises
▫ Bonuses are not distributed
▫ But we have a company to work at
▫ We all buy a spare tire for OUR car; well, it's OUR company
Source: Herbane, Elliott and Swartz
21. The Big Test
• Is everyone testing their BCP?
• Tests are loaded with assumptions
• Script based
• The best people are available
• Support teams are on call/on site
• We ticked the box, and we continue as usual
• Who fixed what during the test?
22. The Big Test
• “Plans are worthless, but planning is everything” (Dwight Eisenhower)
• Develop deep plans for as many scenarios as possible, but keep them open to changes
• The categorization, reporting and alerting sequence may become moot under the circumstances of the actual incident
• Improvisation should be part of the process
• Note improvisations and corrections; communicate them back into the resiliency culture
• Test EARLY, test often
Source: Alesi, Elliott
23. Take away
Big data supports decisions and the business to varying degrees. Evaluate it accordingly, reevaluate often, and understand your data!
Utilize the commodity of cloud services, but understand them well before embarking on the journey.
Empower the company's line structure, and hold it accountable, to understand and think about the survival of the organization.
When testing a BCP, test early and deep, but allow for options. Feed the experience back into the culture.
25. Thank you
Successful Disaster Recovery and Business Continuity
In the world of Big Data and Cloud services
Bozhidar Spirovski
spirovski.b@clearmorning.net
spirovski.b@gmail.com
Editor's notes
There are 39 elements of a bicycle that this diagram marks. You can go even more detailed.
60 data points, each containing 8 bytes of data, equals roughly 16 megabytes per rider per 5-hour ride (sampled discretely at half-second intervals). But it's not discrete; it's continuous. And with 30 riders in a pro team, training 6 months for an event like the Tour de France will generate 77 gigabytes of data, without external conditions and material information. All of this data is processed by one computer in real time.
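A minimal sketch of that back-of-envelope arithmetic; the 77 GB total works out if each rider logs about 160 five-hour rides over the six months, which is an assumed training schedule:

```java
// Back-of-envelope data volume for the bicycle telemetry example.
public class RideVolume {
    public static void main(String[] args) {
        int sensors = 60;                  // data points per sample
        int bytesPerPoint = 8;
        int samplesPerRide = 5 * 3600 * 2; // 5 hours, sampled every half second

        long bytesPerRide = (long) sensors * bytesPerPoint * samplesPerRide;
        System.out.printf("Per rider per ride: %.1f MB%n",
                bytesPerRide / (1024.0 * 1024));          // ~16.5 MB

        int riders = 30;
        int ridesPerRider = 160;           // assumed training days over six months
        double teamBytes = (double) bytesPerRide * riders * ridesPerRider;
        System.out.printf("Team over six months: %.1f GB%n",
                teamBytes / (1024.0 * 1024 * 1024));      // ~77 GB
    }
}
```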
It is possible to run Hadoop on Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3). As an example, The New York Times used 100 Amazon EC2 instances and a Hadoop application to process 4 TB of raw TIFF image data (stored in S3) into 11 million finished PDFs in the space of 24 hours, at a computation cost of about $240 (not including bandwidth). There is support for the S3 file system in Hadoop distributions, and the Hadoop team generates EC2 machine images after every release. From a pure performance perspective, Hadoop on S3/EC2 is inefficient, as the S3 file system is remote and delays returning from every write operation until the data is guaranteed not to be lost. This removes the locality advantages of Hadoop, which schedules work near data to save on network load.
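For illustration, a minimal sketch of reading S3-resident data through Hadoop's FileSystem abstraction via the s3a connector; the bucket and path are hypothetical, and in practice credentials would come from IAM roles rather than code:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical example: read an object from S3 through the Hadoop
// FileSystem API, the same abstraction a Hadoop job would use.
public class S3Read {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Real s3a configuration keys; prefer IAM roles or env-based auth in practice.
        conf.set("fs.s3a.access.key", System.getenv("AWS_ACCESS_KEY_ID"));
        conf.set("fs.s3a.secret.key", System.getenv("AWS_SECRET_ACCESS_KEY"));

        FileSystem s3 = FileSystem.get(URI.create("s3a://example-bucket"), conf);
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(s3.open(new Path("/input/part-00000"))))) {
            System.out.println(r.readLine()); // first line of the remote object
        }
    }
}
```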
Why the extra effort? So that, as an organization, you understand what you are doing and what value it delivers.