Page1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
The Elephant Meets Scrum
Rommel Garcia
Agenda
• Introductions
• Why Scrum?
• Scrum Basic Concepts
• Scrum Team
• Scrum Framework
• Hadoop Meets Scrum
• Scrum Exercise
• Open Forum
Introductions
What’s your name?
What’s your role?
Why are you here?
Why Scrum?
Nobody wants to fail too big….too co$tly…on projects.
Monolithic SDLC
• A small change impacts everything
• The cost of failure is extremely high
• Slow, unpredictable progress
• Hard to prioritize
• Not business friendly
Scrum..
• Produces immediate results
• Makes the development team nimble and adaptable
• Gives full visibility into the development process
• Is a perfect fit for Hadoop
• Hadoop provides isolation of data and processing (HDFS and YARN respectively)
• Failure in Hadoop is cheap
• Complete traceability of apps deployed, run, tested by whom, when, where
Scrum Concepts
Agile. Iterative. Adaptive. Fast results.
Scrum is..
• A framework within which people can address complex adaptive
problems, while productively and creatively delivering products of the
highest possible value
• A framework to employ various processes and techniques
• Lightweight
• Simple to understand
• Difficult to master….if RULES are not followed religiously
Success of SCRUM depends on..
• Transparency
• Common language must be shared by all team members
• What does “Done” mean??
• Inspection
• Frequently check Scrum artifacts and progress
• But be careful not to overdo it or it gets in the way of work
• Adaptation
• Adjust promptly and appropriately when the process deviates outside acceptable limits
• Adjust immediately to prevent further deviation
Scrum Formal Events
1. Sprint Planning
2. Daily Scrum
3. Sprint Review
4. Sprint Retrospective
Scrum consists of..
• Team
• Roles
• Events
• Artifacts
• Rules
Scrum Team
Committed or Involved.
The ‘Ham-n-Eggs’ Paradigm
The Team
• Product Owner
• Development Team
• Scrum Master
Product Owner
• Mainly responsible for Product Story management (the Product Story is what the Scrum Guide calls the Product Backlog)
• Clearly defines Product Story items
• Effectively orders items in the Product Story
• Ensures Product Story is visible, transparent, and clear to all, and
shows what the Scrum Team will work on next
• Validates with Scrum team that they understand the items in the
Product Story
• In the real world, this could be the Project Manager, Program
Manager, Development Manager, or Product Manager
Development Team
• Self-organizing
• They decide how to produce and release incremental releasable functionality
• Scrum Master has no influence on how the team develops functionality
• Cross-functional
• Pig, Hive, HDFS, YARN, and more
• Develop and release features faster
• Accountability belongs to the Development Team as a whole
• Team size: >=3 but <=9
• Normally composed of Hadoop Developer, Hadoop Architect, Data
Scientist, Data Analyst, QA.
Scrum Master
• Ensures Scrum theory, practices, and rules are enacted
• Servant-leader for the Scrum Team
• Coach Development Team in self-organization and cross-functionality
• Remove impediments to Development team’s progress
• Serves the Product Owner
• Find techniques for effective Product Story management
• Help with clear, concise definition of Product Story items
• Ensures Product Owner knows how to arrange Product Story to maximize value
• Facilitate Scrum events as requested/needed
• Serves the Organization
– Leading Scrum adoption
– Work with other Scrum Masters to increase effectiveness of Scrum application in the organization
Scrum Framework
Fail fast in Hadoop. Move fast with Scrum.
The Sprint
• It is the heart of Scrum
• Time-boxed at 1 month or less. 2 weeks is pretty common.
• New Sprint starts immediately after conclusion of previous Sprint
• Consists of
• Sprint Planning
• Daily Scrums
• Development Work
• Sprint Review
• Sprint Retrospective
During the Sprint
• No changes are made that would compromise the Sprint Goal
• Quality goals do NOT decrease
• Scope may be clarified and re-negotiated between Product Owner and
Development team as more is learned
• ONLY Product Owner has the authority to cancel a Sprint
Sprint Planning
• Time-boxed
• 8 hours of planning for a 1-month Sprint, or 2 hours of planning for a 2-week Sprint
• Answers the questions:
• What can be done this Sprint?
– Development Team forecasts what Product Story items it will deliver
– Output is Sprint Goal
• How will the chosen work get done?
– Development Team determines how to deliver the increments
– Output is the Sprint Story (the Sprint Backlog in Scrum Guide terms)
Daily Scrum
• Driven by Scrum Master
• Time-boxed at 15 mins
• Synchronize activities and plan for the next 24 hours
• Each Development Team member will be asked the questions:
– What was done yesterday?
– What needs to be done today?
– What issues got in the way of making progress?
• Highlights and promotes quick decision-making
• Improves communication and eliminates other meetings
Sprint Review
• Time-boxed
• 4-hour review for a 1-month Sprint, or 2-hour review for a 2-week Sprint
• Scrum Team and Stakeholders collaborate on what was done in the
Sprint.
• Informal meeting, NOT a status meeting. A demo of product is
presented
• Scrum Team discusses
• What went well during the sprint
• What were the issues faced
• What could be improved
• Output is a revised set of Product Story items for the next Sprint
Sprint Retrospective
• Time-boxed
• 3-hour meeting for a 1-month Sprint, or a meeting of under 2 hours for a 2-week Sprint
• Main purpose
• Review how the previous Sprint went with respect to people, relationships, process, and tools
• Identify and order the major items that went well and potential improvements
• Create a plan for implementing improvements to the way the Scrum Team does its work
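The time-boxes quoted on the Sprint Planning, Daily Scrum, Sprint Review, and Sprint Retrospective slides can be added up to see how much of a Sprint goes to Scrum events. The sketch below is only an illustration: the figures are the ones stated on the slides, the "<2 hours" retrospective is treated as an upper bound of 2 hours, and the number of working days per Sprint is an assumption.

```python
# Time-boxes as stated on the slides (hours). The 2-week retrospective is
# given as "<2 hours", so 2.0 is used here as an upper bound.
TIMEBOX_HOURS = {
    "sprint_planning": {"1_month": 8.0, "2_weeks": 2.0},
    "sprint_review": {"1_month": 4.0, "2_weeks": 2.0},
    "sprint_retrospective": {"1_month": 3.0, "2_weeks": 2.0},
}
DAILY_SCRUM_HOURS = 0.25  # 15 minutes


def ceremony_hours(sprint_length: str, working_days: int) -> float:
    """Upper bound on hours spent in Scrum events during one Sprint."""
    fixed = sum(event[sprint_length] for event in TIMEBOX_HOURS.values())
    return fixed + DAILY_SCRUM_HOURS * working_days


# Assumption: a 2-week Sprint has 10 working days, a 1-month Sprint has 20.
two_week_total = ceremony_hours("2_weeks", working_days=10)
one_month_total = ceremony_hours("1_month", working_days=20)
print(two_week_total, one_month_total)
```

Under these assumptions a 2-week Sprint spends at most 8.5 hours per person in Scrum events, and a 1-month Sprint at most 20 hours.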
Scrum Timelines
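[Timeline diagram: Product Story, Sprint Planning, Sprint, Sprint Review, Sprint Retrospective]
• Product Story: business input, driven by Product Owner, Stakeholders, Scrum Master
• Sprint Planning: 4 hours for a 2-week Sprint
• Sprint: 2 weeks, with Daily Scrum
• Sprint Review: 2 hours
• Sprint Retrospective: <2 hours
• Transitions between phases are immediate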
Hadoop Meets Scrum
Scrum Tools
Scrum Tools: Go modern or Archaic
• Agile software is available, e.g. www.rallydev.com
• LCD Projector
• Whiteboard and colored markers
• Long, contiguous wall
• Clustered cubicles
• Index cards
Hadoop Meets Scrum
Supporting Scrum in Hadoop Development
What is needed in Hadoop to support Scrum?
• Multi-tenancy is critical
• Set up security: LDAP/AD, Ranger, Kerberos, Knox
• Set up an HDFS quota for each Scrum Team
• Set up a Capacity Scheduler queue for each Scrum Team or member
• High Availability is important but not critical
• Set up NN HA
• Set up RM HA
• Set up HiveServer2 HA
• Set up Hive Metastore HA
• Set up multiple HBase Masters
• Set up a multi-node Knox cluster
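The per-team HDFS quota and Capacity Scheduler queue can be sketched as a small generator of commands and properties. This is a minimal illustration, not a provisioning tool: the `/teams/<name>` directory layout, the team names, and the capacity split are all hypothetical. `hdfs dfsadmin -setSpaceQuota` and the `yarn.scheduler.capacity.root.*` property names are standard Hadoop, and the generated properties would normally be entered through the cluster's management UI rather than edited by hand.

```python
def hdfs_quota_commands(team: str, space_quota: str) -> list[str]:
    """Shell commands that create a team directory and cap its HDFS space."""
    path = f"/teams/{team}"  # hypothetical directory layout
    return [
        f"hdfs dfs -mkdir -p {path}",
        # The space quota counts replicated bytes, so size it accordingly.
        f"hdfs dfsadmin -setSpaceQuota {space_quota} {path}",
    ]


def capacity_queue_properties(shares: dict[str, int]) -> dict[str, str]:
    """Capacity Scheduler properties giving each team its own queue under root."""
    if sum(shares.values()) != 100:
        raise ValueError("queue capacities under root must add up to 100")
    props = {"yarn.scheduler.capacity.root.queues": ",".join(shares)}
    for team, percent in shares.items():
        props[f"yarn.scheduler.capacity.root.{team}.capacity"] = str(percent)
    return props


# Hypothetical teams and shares, for illustration only.
for command in hdfs_quota_commands("alpha", "1t"):
    print(command)
for key, value in capacity_queue_properties({"alpha": 60, "beta": 40}).items():
    print(f"{key}={value}")
```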
What is needed in Hadoop to support Scrum?
• Establish a habit of regular, disciplined Hadoop performance tuning
• YARN, Hive, Tez, Spark, Kafka, Storm, Flume, HBase, Solr, MapReduce, etc.
• Truncate logs regularly
• All Hadoop component logs
• Truncate when at 80% disk utilization
• Logs are a gold mine. Learn to interpret them correctly.
• Troubleshooting purposes
• Understanding how components operate and interoperate
• Turn off Hadoop services that are not needed
• Save CPU, memory, disk space
• Do not forget to turn on maintenance mode. Ask your Hadoop Admin why.
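The "truncate at 80% disk utilization" rule can be expressed as a simple check. The sketch below only decides whether truncation is due; it does not delete anything. The log directory passed in is whatever path your cluster uses (an assumption left to the caller), and `shutil.disk_usage` is the standard-library call used to read the volume's usage.

```python
import shutil

THRESHOLD = 0.80  # the 80% disk utilization mark from the slide


def needs_truncation(used_bytes: int, total_bytes: int,
                     threshold: float = THRESHOLD) -> bool:
    """True once the volume's utilization reaches the threshold."""
    return used_bytes / total_bytes >= threshold


def log_volume_needs_truncation(log_dir: str) -> bool:
    """Check the volume that holds log_dir (path supplied by the caller)."""
    usage = shutil.disk_usage(log_dir)  # named tuple: total, used, free
    return needs_truncation(usage.used, usage.total)
```

A scheduled job could call `log_volume_needs_truncation` for each component's log directory and trigger the site's own truncation or rotation procedure when it returns `True`.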
What is needed in Hadoop to support Scrum?
• Know your tools
Component               Best used for
Sqoop                   Ingesting RDBMS tables into HDFS and/or Hive
Flume                   Ingesting flat files from network file systems or file servers. Capped at 400,000 records/sec
NFS                     Ingesting flat files from NFS based file servers. ONLY ingest less than 1GB per file
Kafka, Storm, HBase     Realtime, Streaming and Online processing. Perfect for IoT, CEP. They all go together in realtime systems.
Slider                  Deploying custom long running applications. i.e. Tomcat Apps, etc.
Spark                   Data science (Spark ML), Micro-batch Streaming (Spark Streaming)
MapReduce               Only use it when Pig and Hive can’t do the job
Pig                     Perfect for ETL processing. Data mining and statistics (Apache DataFu)
Hive                    Reporting and Analytics. Data warehousing. Always use ORC!
Tez                     Never turn it off. Enable both for Pig and Hive for fast data processing
Falcon                  Process orchestration and data lineage
Knox, Kerberos, Ranger  AuthN, AuthZ, Audit. Preventing impersonation.
Ambari                  Do NOT update config files manually. Use Ambari UI to make config changes in Hadoop.
Let’s Scrum!
Putting Hadoop and Scrum to the test
The Project – HVAC Sensor Analytics
• Business wants to understand how the buildings are consuming energy
and wants to start with HVAC. They want to determine which HVAC
systems are working harder and prioritize for maintenance or
replacement.
• Determine which HVAC products have the highest temperature
deviation and order them by age.
• Identify which buildings are likely to have the poorest maintenance
practices
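Before writing the Hive queries, it can help to prototype the ranking logic on a handful of rows. The sketch below is one possible reading of the second requirement, not the exercise's reference solution: the record layout, product names, and values are hypothetical sample data, deviation is taken as the mean absolute difference between actual and target temperature, and ties are broken by system age (oldest first).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample readings:
# (product, system_age_years, target_temp, actual_temp)
READINGS = [
    ("ModelA", 12, 70, 78),
    ("ModelA", 12, 70, 74),
    ("ModelB", 25, 68, 69),
    ("ModelB", 25, 68, 67),
    ("ModelC", 3, 72, 60),
]


def rank_products(readings):
    """Mean absolute deviation per product, worst first, then oldest first."""
    deviations = defaultdict(list)
    ages = {}
    for product, age, target, actual in readings:
        deviations[product].append(abs(actual - target))
        ages[product] = max(age, ages.get(product, 0))
    summary = [(p, mean(d), ages[p]) for p, d in deviations.items()]
    return sorted(summary, key=lambda row: (-row[1], -row[2]))


ranking = rank_products(READINGS)
for product, deviation, age in ranking:
    print(product, deviation, age)
```

The same grouping and ordering would map onto a `GROUP BY` with an `ORDER BY` on the aggregated deviation and age once the real table schema is known.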
TODO
• Apply SCRUM principles and rules
• Properly size your team
• Break down the requirements into Product Story
• Determine Sprint Goal
• Generate one Sprint Story
• Develop the app in Hive
• Any performance tuning of your tables and CREATE statements is a big plus
Rules
• Spend 15 minutes on Sprint Planning
• We will do a 2-hour Sprint
• We will hold the Daily Scrum meeting (just once) in the middle of the 2-hour
Sprint
• Spend 15 minutes on Sprint Review
Q&A…
Discussion

Hadoop Meets Scrum

  • 1. Page1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved The Elephant Meets Scrum Rommel Garcia
  • 2. Page2 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Agenda Control access into system Flexibility in defining policies • Introductions • Why Scrum? • Scrum Basic Concepts • Scrum Team • Scrum Framework • Hadoop Meets Scrum • Scrum Exercise • Open Forum
  • 3. Page3 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Introductions What’s your name? What’s your role? Why are you here?
  • 4. Page4 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Why Scrum? Nobody wants to fail too big….too co$tly…on projects.
  • 5. Page5 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Monolithic SDLC • Small change, impacts everything • Cost of failure, extremely big • Slow, unpredictable progress • Hard to prioritize • Not business friendly
  • 6. Page6 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Scrum.. • Produces immediate results • Makes the development team nimble and adaptable • Full visibility on development process • Is a perfect fit for Hadoop • Hadoop provides isolation of data and processing (HDFS and YARN respectively) • Failure in Hadoop is cheap • Complete traceability of apps deployed, run, tested by whom, when, where
  • 7. Page7 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Scrum Concepts Agile. Iterative. Adaptive. Fast results.
  • 8. Page8 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Scrum is.. • A framework within which people can address complex adaptive problems, while productively and creatively delivering products of the highest possible value • A framework to employ various processes and techniques • Lightweight • Simple to understand • Difficult to master….if RULES are not followed religiously
  • 9. Page9 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Success of SCRUM depends on.. • Transparency • Common language must be shared by all team members • What does “Done” mean?? • Inspection • Frequent Scrum artifacts progress check • But be careful not to overdo it or it gets in the way of work • Adaptation • Adjust properly and timely when process deviates outside of acceptable limits • Adjust immediately to prevent further deviation
  • 10. Page10 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Scrum Formal Events 1. Sprint Planning 2. Daily Scrum 3. Sprint Review 4. Sprint Retrospective
  • 11. Page11 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Scrum consists of.. • Team • Roles • Events • Artifacts • Rules
  • 12. Page12 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Scrum Team Committed or Involved.
  • 13. Page13 © Hortonworks Inc. 2011 – 2014. All Rights Reserved The ‘Ham-n-Eggs’ Paradigm
  • 14. Page14 © Hortonworks Inc. 2011 – 2014. All Rights Reserved The Team • Product Owner • Development Team • Scrum Master
  • 15. Page15 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Product Owner • Mainly responsible for Product Story management • Clearly defines Product Story items • Effectively order items in Product Story • Ensures Product Story is visible, transparent, and clear to all, and shows what the Scrum Team will work on next • Validates with Scrum team that they understand the items in the Product Story • In real world, this could be either the Project Manager, Program Manager, Development Manager, or Product Manager
  • 16. Page16 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Development Team • Self-organizing • They decide how to produce and release incremental releasable functionality • Scrum Master has no influence on how the team develop functionality • Cross-functional • Pig, Hive, HDFS, YARN, and more • Develop and release features faster • Accountability belongs to the Development Team as a whole • Team size: >=3 but <=9 • Normally composed of Hadoop Developer, Hadoop Architect, Data Scientist, Data Analyst, QA.
  • 17. Scrum Master
      • Ensures Scrum theory, practices, and rules are enacted
      • Servant-leader for the Scrum Team
          – Coaches the Development Team in self-organization and cross-functionality
          – Removes impediments to the Development Team’s progress
      • Serves the Product Owner
          – Finds techniques for effective Product Story management
          – Helps with clear, concise definition of Product Story items
          – Ensures the Product Owner knows how to arrange the Product Story to maximize value
          – Facilitates Scrum events as requested or needed
      • Serves the Organization
          – Leads Scrum adoption
          – Works with other Scrum Masters to increase the effectiveness of Scrum application in the organization
  • 18. Scrum Framework: Fail fast in Hadoop. Move fast with Scrum.
  • 19. The Sprint
      • It is the heart of Scrum
      • Time-boxed at 1 month or less; 2 weeks is pretty common
      • A new Sprint starts immediately after the previous Sprint concludes
      • Consists of
          – Sprint Planning
          – Daily Scrums
          – Development Work
          – Sprint Review
          – Sprint Retrospective
  • 20. During the Sprint
      • No changes are made that would compromise the Sprint Goal
      • Quality goals do NOT decrease
      • Scope may be clarified and re-negotiated between the Product Owner and the Development Team as more is learned
      • ONLY the Product Owner has the authority to cancel a Sprint
  • 21. Sprint Planning
      • Time-boxed
          – 8 hours of planning for a 1-month Sprint, or 2 hours of planning for a 2-week Sprint
      • Answers the questions:
          – What can be done this Sprint?
              • The Development Team forecasts which Product Story items it will deliver
              • Output is the Sprint Goal
          – How will the chosen work get done?
              • The Development Team determines how to deliver the increments
              • Output is the Sprint Story
  • 22. Daily Scrum
      • Driven by the Scrum Master
      • Time-boxed at 15 minutes
      • Synchronizes activities and plans for the next 24 hours
      • Each Development Team member is asked:
          – What was done yesterday?
          – What needs to be done today?
          – What issues prevented incremental progress on the work?
      • Highlights and promotes quick decision-making
      • Improves communication and eliminates other meetings
  • 23. Sprint Review
      • Time-boxed
          – 4-hour review for a 1-month Sprint, or 2-hour review for a 2-week Sprint
      • The Scrum Team and Stakeholders collaborate on what was done in the Sprint
      • Informal meeting, NOT a status meeting. A demo of the product is presented
      • The Scrum Team discusses
          – What went well during the Sprint
          – What issues were faced
          – What could be improved
      • Output is a revised set of Product Story items for the next Sprint
  • 24. Sprint Retrospective
      • Time-boxed
          – 3-hour meeting for a 1-month Sprint, or a meeting of under 2 hours for a 2-week Sprint
      • Main purpose
          – Review how the previous Sprint went with respect to people, relationships, process, and tools
          – Identify and order the major items that went well and the potential improvements
          – Create a plan for implementing improvements to the way the Scrum Team does its work
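The time-boxes quoted for Sprint Planning, Sprint Review, and Sprint Retrospective can be sketched as a small calculation. This is only an illustration under two assumptions that are not in the slides: a one-month Sprint is treated as 4 weeks, and shorter Sprints scale linearly. The function name `timebox_hours` and the dictionary keys are hypothetical. The deck's own two-week figures are not always proportional (Sprint Planning is quoted as 2 hours on one slide and 4 hours on the Scrum Timelines slide), so treat the output as a rule of thumb.

```python
# Maximum time-boxes for a one-month Sprint, in hours, as quoted on the slides.
ONE_MONTH_TIMEBOX_HOURS = {
    "sprint_planning": 8.0,
    "sprint_review": 4.0,
    "sprint_retrospective": 3.0,
}

WEEKS_PER_MONTH = 4  # assumption: a one-month Sprint is treated as 4 weeks


def timebox_hours(event: str, sprint_weeks: float) -> float:
    """Scale the one-month time-box linearly to a shorter Sprint (assumed rule)."""
    if not 0 < sprint_weeks <= WEEKS_PER_MONTH:
        raise ValueError("Sprint length must be more than 0 and at most 4 weeks")
    return ONE_MONTH_TIMEBOX_HOURS[event] * sprint_weeks / WEEKS_PER_MONTH


if __name__ == "__main__":
    for event in ONE_MONTH_TIMEBOX_HOURS:
        print(f"{event}: {timebox_hours(event, 2):.1f} h for a 2-week Sprint")
```

Under linear scaling, the 2-week Retrospective comes out at 1.5 hours, which is in line with the "<2 hours" on the slide.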
  • 25. Scrum Timelines [Figure: timeline running Product Story → Sprint Planning → Sprint → Sprint Review → Sprint Retrospective. Labels recovered from the figure: Business Input; driven by Product Owner, Stakeholders, Scrum Master; 4 hours for 2 wk Sprint; 2 weeks, with Daily Scrum; 2 hours; <2 hours; transitions between stages marked “Immediate”.]
  • 26. Hadoop Meets Scrum: Scrum Tools
  • 27. Scrum Tools: Go modern or Archaic
      • Agile software is available, e.g. www.rallydev.com
      • LCD projector
      • Whiteboard and colored markers
      • Long, contiguous wall
      • Clustered cubicles
      • Index cards
  • 28. Hadoop Meets Scrum: Supporting Scrum in Hadoop Development
  • 29. What is needed in Hadoop to support Scrum?
      • Multi-tenancy is critical
          – Set up security: LDAP/AD, Ranger, Kerberos, Knox
          – Set up an HDFS quota for each Scrum Team
          – Set up a Capacity Scheduler queue for each Scrum Team or member
      • High Availability is important but not critical
          – Set up NN HA
          – Set up RM HA
          – Set up HiveServer2 HA
          – Set up Hive Metastore HA
          – Set up multiple HBase Masters
          – Set up a multi-node Knox cluster
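As a rough sketch of the per-team HDFS quota and Capacity Scheduler queue items, the snippet below generates the `hdfs dfsadmin -setSpaceQuota` commands and the `yarn.scheduler.capacity.*` properties for a set of teams. The team names, quota sizes, capacity shares, and the `/teams` base directory are all hypothetical, as are the helper function names. It only builds strings and does not touch a cluster, and it assumes the team queues replace the `default` queue under `root`, so their capacities must add up to 100.

```python
from typing import Dict, List, Tuple

# Hypothetical Scrum Teams: name -> (HDFS space quota, share of cluster capacity in %).
TEAMS: Dict[str, Tuple[str, int]] = {
    "team_alpha": ("500g", 60),
    "team_beta": ("250g", 40),
}


def hdfs_quota_commands(teams: Dict[str, Tuple[str, int]],
                        base_dir: str = "/teams") -> List[str]:
    """Build one space-quota command per team directory (base_dir is an assumption)."""
    return [
        f"hdfs dfsadmin -setSpaceQuota {quota} {base_dir}/{name}"
        for name, (quota, _share) in teams.items()
    ]


def capacity_scheduler_properties(teams: Dict[str, Tuple[str, int]]) -> Dict[str, str]:
    """Build Capacity Scheduler properties with one queue per team under root."""
    total = sum(share for _quota, share in teams.values())
    if total != 100:
        raise ValueError(f"Queue capacities must sum to 100, got {total}")
    props = {"yarn.scheduler.capacity.root.queues": ",".join(teams)}
    for name, (_quota, share) in teams.items():
        props[f"yarn.scheduler.capacity.root.{name}.capacity"] = str(share)
    return props


if __name__ == "__main__":
    for command in hdfs_quota_commands(TEAMS):
        print(command)
    for key, value in capacity_scheduler_properties(TEAMS).items():
        print(f"{key}={value}")
```

The generated property values are meant to be entered through Ambari rather than written into config files by hand, in keeping with the deck's advice on Ambari.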
  • 30. What is needed in Hadoop to support Scrum?
      • Establish a habit of regular, disciplined performance tuning of Hadoop
          – YARN, Hive, Tez, Spark, Kafka, Storm, Flume, HBase, Solr, MapReduce, etc.
      • Truncate logs regularly
          – All Hadoop component logs
          – Truncate when at 80% disk utilization
      • Logs are a gold mine. Learn to interpret them correctly.
          – For troubleshooting
          – For understanding how components operate and interoperate
      • Turn off Hadoop services that are not needed
          – Saves CPU, memory, and disk space
          – Do not forget to turn on maintenance mode. Ask your Hadoop Admin why.
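The "truncate when at 80% disk utilization" rule can be expressed as a simple check. The sketch below uses Python's standard `shutil.disk_usage`; the function names are hypothetical, and the path passed in is a placeholder to be replaced with the volume that actually holds the Hadoop component logs. It only reports whether the threshold has been reached and does not delete or truncate anything.

```python
import shutil

TRUNCATE_THRESHOLD = 0.80  # from the slide: truncate when at 80% disk utilization


def utilization(used_bytes: int, total_bytes: int) -> float:
    """Return the fraction of the volume that is in use."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes


def should_truncate(used_bytes: int, total_bytes: int,
                    threshold: float = TRUNCATE_THRESHOLD) -> bool:
    """True once utilization reaches the threshold."""
    return utilization(used_bytes, total_bytes) >= threshold


def log_volume_needs_truncation(path: str) -> bool:
    """Check the volume containing `path` (placeholder for the log directory)."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free
    return should_truncate(usage.used, usage.total)


if __name__ == "__main__":
    # "." is a stand-in; point this at the log mount on a real node.
    print("Truncate logs:", log_volume_needs_truncation("."))
```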
  • 31. What is needed in Hadoop to support Scrum?
      • Know your tools (component: best used for)
          – Sqoop: Ingesting RDBMS tables into HDFS and/or Hive
          – Flume: Ingesting flat files from network file systems or file servers. Capped at 400,000 records/sec
          – NFS: Ingesting flat files from NFS-based file servers. ONLY ingest files smaller than 1 GB
          – Kafka, Storm, HBase: Realtime, streaming, and online processing. Perfect for IoT and CEP. They all go together in realtime systems.
          – Slider: Deploying custom long-running applications, e.g. Tomcat apps
          – Spark: Data science (Spark ML), micro-batch streaming (Spark Streaming)
          – MapReduce: Only use it when Pig and Hive can’t do the job
          – Pig: Perfect for ETL processing. Data mining and statistics (Apache DataFu)
          – Hive: Reporting and analytics. Data warehousing. Always use ORC!
          – Tez: Never turn it off. Enable it for both Pig and Hive for fast data processing
          – Falcon: Process orchestration and data lineage
          – Knox, Kerberos, Ranger: AuthN, AuthZ, Audit. Preventing impersonation.
          – Ambari: Do NOT update config files manually. Use the Ambari UI to make config changes in Hadoop.
  • 32. Let’s Scrum! Putting Hadoop and Scrum to the test
  • 33. The Project: HVAC Sensor Analytics
      • The business wants to understand how its buildings consume energy and wants to start with HVAC. It wants to determine which HVAC systems are working harder and prioritize them for maintenance or replacement.
      • Determine which HVAC products have the highest temperature deviation and order them by age.
      • Identify which buildings are likely to have the poorest maintenance practices
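One possible reading of the second requirement (highest temperature deviation, ordered by age) is sketched below in Python as a stand-in for the logic a Hive query would express. Everything here is an assumption made for illustration: the `Reading` fields, the made-up sample rows, the choice of maximum absolute deviation per product, and the `top_n` cut-off. The exercise itself calls for Hive and supplies its own data.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Reading(NamedTuple):
    """One sensor reading; field names are hypothetical."""
    building_id: int
    product: str
    system_age_years: int
    target_temp: float
    actual_temp: float


# Made-up rows, purely to exercise the function.
SAMPLE: List[Reading] = [
    Reading(1, "model_a", 20, 70.0, 78.0),
    Reading(2, "model_b", 5, 68.0, 80.0),
    Reading(3, "model_c", 12, 69.0, 70.0),
    Reading(1, "model_a", 20, 70.0, 65.0),
]


def rank_products(readings: List[Reading], top_n: int = 3) -> List[Tuple[str, float, int]]:
    """Pick the top_n products by worst deviation, then order them oldest first."""
    worst_dev: Dict[str, float] = defaultdict(float)
    oldest: Dict[str, int] = defaultdict(int)
    for r in readings:
        deviation = abs(r.actual_temp - r.target_temp)
        worst_dev[r.product] = max(worst_dev[r.product], deviation)
        oldest[r.product] = max(oldest[r.product], r.system_age_years)
    top = sorted(worst_dev, key=lambda p: worst_dev[p], reverse=True)[:top_n]
    by_age = sorted(top, key=lambda p: oldest[p], reverse=True)
    return [(p, worst_dev[p], oldest[p]) for p in by_age]


if __name__ == "__main__":
    for product, deviation, age in rank_products(SAMPLE, top_n=2):
        print(f"{product}: deviation {deviation:.1f}, age {age} years")
```

In Hive, the same shape would typically be a `GROUP BY` on product with an aggregate over the deviation, followed by an `ORDER BY` on age. A team might reasonably choose average deviation instead of the maximum.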
  • 34. TODO
      • Apply SCRUM principles and rules
      • Properly size your team
      • Break down the requirements into a Product Story
      • Determine the Sprint Goal
      • Generate one Sprint Story
      • Develop the app in Hive
      • Any performance tuning of your tables and CREATE statements is a big plus
  • 35. Rules
      • Spend 15 minutes on Sprint Planning
      • We will do a 2-hour Sprint
      • We will hold the Daily Scrum just once, in the middle of the 2-hour Sprint
      • Spend 15 minutes on the Sprint Review
  • 36. Q&A… Discussion