SlideShare ist ein Scribd-Unternehmen logo
1 von 50
Page1
Applying Testing Techniques to
Hadoop Development
Mark Johnson
markfjohnson@gmail.com
Page2
Who Am I?
• Too many years of programming experience
• Created lots of bugs in my career…but hate for them
to be found by others
Mark Johnson
Regional Director Services -
Hortonworks
Page3
Who are You?
• Technologist
• Programmer
• Want to
produce low
defect or
defect free
code in
Hadoop
Page4
Good Hadoop Development is
Hard!
Page5
Lots of Data !
Page6
• Volume
• Velocity
• Variety
Page7
Fast Release Pressure!
Page8
BUGS!
Page9
Will anyone really notice that
little problem which affects only
1% of the rows?
Page10
YES!
1% of the financial transactions on
Wall Street is a lot of money!
Big Data can amplify small
problems!
Page11
Problem #1: Speed of Light
Page12
Problem #2: Data Permutations
Page13
Problem #3: Processing distributed
across multiple nodes
Page14
How are you currently
preventing BUGS?
Page15
Consequences
The Longer you wait to find
and fix a problem the more
time it will take to fix it!
Page16
Verify “DONE”
Page17
What 4 things do we need to
Verify?
Page18
1.Does it run?
Page19
2.) Functional Correctness
to Requirements
• Data Filters
• Loops and branches
• Calculations
• Output
Page20
3. Resources Correctly
Referenced
• Missing files
• Bad file names
• Etc.
Page21
4. Exceptions and errors
properly handled
• System left in a safe state
• User / SysAdmin informed of the failure
• Idempotency
Page22
Materiality Rule
Materiality Rule: Start with High value tests (easy to test
AND valued by business)
Light & Fast : Don’t create an overly heavy test process
Positive Tests:
• Tests should contain one general data condition
• Individual tests only for those determined by specific
data conditions
Negative Tests:
• Data not present
Page23
Data
1Sample
Uncollected data
Page24
Traditional Sampling Problems
• Sample error reflects the risk that, purely by chance, a
randomly chosen sample of opinions does not reflect the
true views of the population. The “margin of error”
reported in opinion polls reflects this risk and the larger
the sample, the smaller the margin of error
• sampling bias is a bias in which a sample is collected in
such a way that some members of the intended
population are less likely to be included than others.
(wikipedia)
Page25
PIG SAMPLE
Page26
Manual Sampling
• Build the sample dataset based on your program’s
logic.
• Does not need to be more than a few rows for each
test.
• Can get embedded within your test program to
facilitate test management.
Page27
• MapReduce testing framework
• Accepts a small record sample for each test
• Test one thing per test
• Does not reference HDFS directly
• Tests Map and Reduce methods separetly
Page28
Example: Word Count Program -
Mapper
• Does the tokenize work properly
• Does the Mapper properly filter ‘stop’ words
• Is the counter properly initialized
Page29
Example: Word Count - Reducer
• Keys counter values grouped properly
• Counter values are properly aggregated
Page30
Setup MRUnit test suite
• MapDriver:
• ReduceDriver:
• MapReduceDriver:
Page31
MRUnit Mapper test
Page32
Test Results
Page33
MRUnit Reducer Test
Page34
Test Results
Page35
Traditional Pig Functional Testing tools
DUMP
ILLUSTRATE
DESCRIBE
EXPLAIN
Page36
PigUnit to test Pig scripts
• Only one LOAD operation
per test script.
• PigUnit overrides STORE,
LOAD and DUMP
Page37
PigUnit – Setup
• PigTest references
your unmodified pig
script for testing.
• Input:
• Include just the rows required
to test logic
Page38
PigUnit: Assert
Page39
Page40
PigUnit: Output Job Order
Page41
BeeTest
Developed by: Adam Kawa
GitHub Project: https://github.com/kawaa/Beetest.git
Beetest provides a simple ‘unit’ test capability on a given
Hadoop Hive script which runs on a small cluster to
validate a Hive script
Page42
Beetest: Inputs and Outputs
Directory containing the following files defining the
test process:
setup.hql – The HQL script to setup the environment
select.hql – The HQL script to test
Input.tsv – input data
Expected.txt – the script’s expected output
variables.properties – the Beetest properties
Page43
Setting up BeeTest
Page44
BeeTest: Executing the test
• ${table} – value defined in variables.properties
Page45
BeeTest – Test Results
Page46
BeeTest: Test failures
Page47
Getting started with your test initiative
1. Start Simple
2. Materiality rule: Focus on Highest value and easiest
tests first
3. Data Sampling: Use the smallest datasets possible
4. Keep tests “light and fast”
5. Keep Hadoop code and tests in a Source Code
Management system
6. Use Automated test environment (Jenkins, AntHill
Pro, etc.)
7. Maintain and publicly publish historical test results
Page48
Hadoop Test Tool Wrap Up
• Tools and Techniques
– Data Sampling
– MRUnit
– PigUnit
– Beetest
Testing environment still imature but good enough to start
using now
Page49
Mark Johnson
Regional Director Services
Hortonworks
markfjohnson@gmail.com
mjohnson@hortonworks.com
Linkedin:markfjohnson
Twitter: markfjohnson
Source:
https://github.com/mfjohnson/HadoopTesting.
git
Page50

Weitere ähnliche Inhalte

Was ist angesagt?

Testing Big Data: Automated ETL Testing of Hadoop
Testing Big Data: Automated ETL Testing of HadoopTesting Big Data: Automated ETL Testing of Hadoop
Testing Big Data: Automated ETL Testing of HadoopRTTS
 
Big Data Testing: Ensuring MongoDB Data Quality
Big Data Testing: Ensuring MongoDB Data QualityBig Data Testing: Ensuring MongoDB Data Quality
Big Data Testing: Ensuring MongoDB Data QualityRTTS
 
Completing the Data Equation: Test Data + Data Validation = Success
Completing the Data Equation: Test Data + Data Validation = SuccessCompleting the Data Equation: Test Data + Data Validation = Success
Completing the Data Equation: Test Data + Data Validation = SuccessRTTS
 
Testistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex Black
Testistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex BlackTestistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex Black
Testistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex BlackTurkish Testing Board
 
Whitepaper: Volume Testing Thick Clients and Databases
Whitepaper:  Volume Testing Thick Clients and DatabasesWhitepaper:  Volume Testing Thick Clients and Databases
Whitepaper: Volume Testing Thick Clients and DatabasesRTTS
 
Test Automation for NoSQL Databases
Test Automation for NoSQL DatabasesTest Automation for NoSQL Databases
Test Automation for NoSQL DatabasesTobias Trelle
 
Testing Big Data: Automated Testing of Hadoop with QuerySurge
Testing Big Data: Automated  Testing of Hadoop with QuerySurgeTesting Big Data: Automated  Testing of Hadoop with QuerySurge
Testing Big Data: Automated Testing of Hadoop with QuerySurgeRTTS
 
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...RTTS
 
the Data World Distilled
the Data World Distilledthe Data World Distilled
the Data World DistilledRTTS
 
Data Warehouse Testing in the Pharmaceutical Industry
Data Warehouse Testing in the Pharmaceutical IndustryData Warehouse Testing in the Pharmaceutical Industry
Data Warehouse Testing in the Pharmaceutical IndustryRTTS
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarRTTS
 
Machine learning in production
Machine learning in productionMachine learning in production
Machine learning in productionTuri, Inc.
 
Continuous Integration & Continuous Delivery
Continuous Integration & Continuous DeliveryContinuous Integration & Continuous Delivery
Continuous Integration & Continuous DeliveryDatabricks
 
DataEngConf SF16 - Methods for Content Relevance at LinkedIn
DataEngConf SF16 - Methods for Content Relevance at LinkedInDataEngConf SF16 - Methods for Content Relevance at LinkedIn
DataEngConf SF16 - Methods for Content Relevance at LinkedInHakka Labs
 
Radical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the HoodRadical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the HoodDatabricks
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...Data Con LA
 
Semantic Image Logging Using Approximate Statistics & MLflow
Semantic Image Logging Using Approximate Statistics & MLflowSemantic Image Logging Using Approximate Statistics & MLflow
Semantic Image Logging Using Approximate Statistics & MLflowDatabricks
 
Building Data Science into Organizations: Field Experience
Building Data Science into Organizations: Field ExperienceBuilding Data Science into Organizations: Field Experience
Building Data Science into Organizations: Field ExperienceDatabricks
 
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....Databricks
 
Data Security and Protection in DevOps
Data Security and Protection in DevOps Data Security and Protection in DevOps
Data Security and Protection in DevOps Karen Lopez
 

Was ist angesagt? (20)

Testing Big Data: Automated ETL Testing of Hadoop
Testing Big Data: Automated ETL Testing of HadoopTesting Big Data: Automated ETL Testing of Hadoop
Testing Big Data: Automated ETL Testing of Hadoop
 
Big Data Testing: Ensuring MongoDB Data Quality
Big Data Testing: Ensuring MongoDB Data QualityBig Data Testing: Ensuring MongoDB Data Quality
Big Data Testing: Ensuring MongoDB Data Quality
 
Completing the Data Equation: Test Data + Data Validation = Success
Completing the Data Equation: Test Data + Data Validation = SuccessCompleting the Data Equation: Test Data + Data Validation = Success
Completing the Data Equation: Test Data + Data Validation = Success
 
Testistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex Black
Testistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex BlackTestistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex Black
Testistanbul 2016 - Keynote: "Enterprise Challenges of Test Data" by Rex Black
 
Whitepaper: Volume Testing Thick Clients and Databases
Whitepaper:  Volume Testing Thick Clients and DatabasesWhitepaper:  Volume Testing Thick Clients and Databases
Whitepaper: Volume Testing Thick Clients and Databases
 
Test Automation for NoSQL Databases
Test Automation for NoSQL DatabasesTest Automation for NoSQL Databases
Test Automation for NoSQL Databases
 
Testing Big Data: Automated Testing of Hadoop with QuerySurge
Testing Big Data: Automated  Testing of Hadoop with QuerySurgeTesting Big Data: Automated  Testing of Hadoop with QuerySurge
Testing Big Data: Automated Testing of Hadoop with QuerySurge
 
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
 
the Data World Distilled
the Data World Distilledthe Data World Distilled
the Data World Distilled
 
Data Warehouse Testing in the Pharmaceutical Industry
Data Warehouse Testing in the Pharmaceutical IndustryData Warehouse Testing in the Pharmaceutical Industry
Data Warehouse Testing in the Pharmaceutical Industry
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing Webinar
 
Machine learning in production
Machine learning in productionMachine learning in production
Machine learning in production
 
Continuous Integration & Continuous Delivery
Continuous Integration & Continuous DeliveryContinuous Integration & Continuous Delivery
Continuous Integration & Continuous Delivery
 
DataEngConf SF16 - Methods for Content Relevance at LinkedIn
DataEngConf SF16 - Methods for Content Relevance at LinkedInDataEngConf SF16 - Methods for Content Relevance at LinkedIn
DataEngConf SF16 - Methods for Content Relevance at LinkedIn
 
Radical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the HoodRadical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the Hood
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Building an Event-oriented...
 
Semantic Image Logging Using Approximate Statistics & MLflow
Semantic Image Logging Using Approximate Statistics & MLflowSemantic Image Logging Using Approximate Statistics & MLflow
Semantic Image Logging Using Approximate Statistics & MLflow
 
Building Data Science into Organizations: Field Experience
Building Data Science into Organizations: Field ExperienceBuilding Data Science into Organizations: Field Experience
Building Data Science into Organizations: Field Experience
 
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
How to Productionize Your Machine Learning Models Using Apache Spark MLlib 2....
 
Data Security and Protection in DevOps
Data Security and Protection in DevOps Data Security and Protection in DevOps
Data Security and Protection in DevOps
 

Ähnlich wie Applying Testing Techniques for Big Data and Hadoop

Small is Beautiful- Fully Automate your Test Case Design
Small is Beautiful- Fully Automate your Test Case DesignSmall is Beautiful- Fully Automate your Test Case Design
Small is Beautiful- Fully Automate your Test Case DesignGeorgina Tilby
 
Performance tuning Grails applications SpringOne 2GX 2014
Performance tuning Grails applications SpringOne 2GX 2014Performance tuning Grails applications SpringOne 2GX 2014
Performance tuning Grails applications SpringOne 2GX 2014Lari Hotari
 
Can we induce change with what we measure?
Can we induce change with what we measure?Can we induce change with what we measure?
Can we induce change with what we measure?Michaela Greiler
 
Bugday bkk-2014 nitisak-auto_perf
Bugday bkk-2014 nitisak-auto_perfBugday bkk-2014 nitisak-auto_perf
Bugday bkk-2014 nitisak-auto_perfNitisak Mooltreesri
 
Verification Bug Metrics: A Different Approach
Verification Bug Metrics: A Different ApproachVerification Bug Metrics: A Different Approach
Verification Bug Metrics: A Different ApproachDVClub
 
MyHeritage - End 2 End testing Infra
MyHeritage - End 2 End testing InfraMyHeritage - End 2 End testing Infra
MyHeritage - End 2 End testing InfraMatanGoren
 
Continuous Deployment To The Cloud
Continuous Deployment To The CloudContinuous Deployment To The Cloud
Continuous Deployment To The CloudMarcin Grzejszczak
 
TLC2018 Thomas Haver: The Automation Firehose - Be Strategic and Tactical
TLC2018 Thomas Haver: The Automation Firehose - Be Strategic and TacticalTLC2018 Thomas Haver: The Automation Firehose - Be Strategic and Tactical
TLC2018 Thomas Haver: The Automation Firehose - Be Strategic and TacticalAnna Royzman
 
Test workload otochkin_ppt
Test workload otochkin_pptTest workload otochkin_ppt
Test workload otochkin_pptGleb Otochkin
 
Testing in Production - presentation & webinar by Amber Race
Testing in Production - presentation & webinar by Amber RaceTesting in Production - presentation & webinar by Amber Race
Testing in Production - presentation & webinar by Amber RaceApplitools
 
An Agile Approach to Machine Learning
An Agile Approach to Machine LearningAn Agile Approach to Machine Learning
An Agile Approach to Machine LearningRandy Shoup
 
Load testing with Visual Studio and Azure - Andrew Siemer
Load testing with Visual Studio and Azure - Andrew SiemerLoad testing with Visual Studio and Azure - Andrew Siemer
Load testing with Visual Studio and Azure - Andrew SiemerAndrew Siemer
 
Application Quality Gates in Continuous Delivery: Deliver Better Software Fas...
Application Quality Gates in Continuous Delivery: Deliver Better Software Fas...Application Quality Gates in Continuous Delivery: Deliver Better Software Fas...
Application Quality Gates in Continuous Delivery: Deliver Better Software Fas...Andreas Grabner
 
Tool Up Your LAMP Stack
Tool Up Your LAMP StackTool Up Your LAMP Stack
Tool Up Your LAMP StackLorna Mitchell
 
Driving application development through behavior driven development
Driving application development through behavior driven developmentDriving application development through behavior driven development
Driving application development through behavior driven developmentEinar Ingebrigtsen
 
Lecture #6. automation testing (andrey oleynik)
Lecture #6. automation testing (andrey oleynik)Lecture #6. automation testing (andrey oleynik)
Lecture #6. automation testing (andrey oleynik)Andrey Oleynik
 
The Fifth Elephant 2016: Self-Serve Performance Tuning for Hadoop and Spark
The Fifth Elephant 2016: Self-Serve Performance Tuning for Hadoop and SparkThe Fifth Elephant 2016: Self-Serve Performance Tuning for Hadoop and Spark
The Fifth Elephant 2016: Self-Serve Performance Tuning for Hadoop and SparkAkshay Rai
 
BTV PHP - Building Fast Websites
BTV PHP - Building Fast WebsitesBTV PHP - Building Fast Websites
BTV PHP - Building Fast WebsitesJonathan Klein
 

Ähnlich wie Applying Testing Techniques for Big Data and Hadoop (20)

Small is Beautiful- Fully Automate your Test Case Design
Small is Beautiful- Fully Automate your Test Case DesignSmall is Beautiful- Fully Automate your Test Case Design
Small is Beautiful- Fully Automate your Test Case Design
 
Performance tuning Grails applications SpringOne 2GX 2014
Performance tuning Grails applications SpringOne 2GX 2014Performance tuning Grails applications SpringOne 2GX 2014
Performance tuning Grails applications SpringOne 2GX 2014
 
Can we induce change with what we measure?
Can we induce change with what we measure?Can we induce change with what we measure?
Can we induce change with what we measure?
 
Bugday bkk-2014 nitisak-auto_perf
Bugday bkk-2014 nitisak-auto_perfBugday bkk-2014 nitisak-auto_perf
Bugday bkk-2014 nitisak-auto_perf
 
Verification Bug Metrics: A Different Approach
Verification Bug Metrics: A Different ApproachVerification Bug Metrics: A Different Approach
Verification Bug Metrics: A Different Approach
 
MyHeritage - End 2 End testing Infra
MyHeritage - End 2 End testing InfraMyHeritage - End 2 End testing Infra
MyHeritage - End 2 End testing Infra
 
Continuous Deployment To The Cloud
Continuous Deployment To The CloudContinuous Deployment To The Cloud
Continuous Deployment To The Cloud
 
TLC2018 Thomas Haver: The Automation Firehose - Be Strategic and Tactical
TLC2018 Thomas Haver: The Automation Firehose - Be Strategic and TacticalTLC2018 Thomas Haver: The Automation Firehose - Be Strategic and Tactical
TLC2018 Thomas Haver: The Automation Firehose - Be Strategic and Tactical
 
Test workload otochkin_ppt
Test workload otochkin_pptTest workload otochkin_ppt
Test workload otochkin_ppt
 
Testing in Production - presentation & webinar by Amber Race
Testing in Production - presentation & webinar by Amber RaceTesting in Production - presentation & webinar by Amber Race
Testing in Production - presentation & webinar by Amber Race
 
An Agile Approach to Machine Learning
An Agile Approach to Machine LearningAn Agile Approach to Machine Learning
An Agile Approach to Machine Learning
 
Load testing with Visual Studio and Azure - Andrew Siemer
Load testing with Visual Studio and Azure - Andrew SiemerLoad testing with Visual Studio and Azure - Andrew Siemer
Load testing with Visual Studio and Azure - Andrew Siemer
 
Application Quality Gates in Continuous Delivery: Deliver Better Software Fas...
Application Quality Gates in Continuous Delivery: Deliver Better Software Fas...Application Quality Gates in Continuous Delivery: Deliver Better Software Fas...
Application Quality Gates in Continuous Delivery: Deliver Better Software Fas...
 
Tool up your lamp stack
Tool up your lamp stackTool up your lamp stack
Tool up your lamp stack
 
Tool Up Your LAMP Stack
Tool Up Your LAMP StackTool Up Your LAMP Stack
Tool Up Your LAMP Stack
 
Driving application development through behavior driven development
Driving application development through behavior driven developmentDriving application development through behavior driven development
Driving application development through behavior driven development
 
Introduction to Unit Tests and TDD
Introduction to Unit Tests and TDDIntroduction to Unit Tests and TDD
Introduction to Unit Tests and TDD
 
Lecture #6. automation testing (andrey oleynik)
Lecture #6. automation testing (andrey oleynik)Lecture #6. automation testing (andrey oleynik)
Lecture #6. automation testing (andrey oleynik)
 
The Fifth Elephant 2016: Self-Serve Performance Tuning for Hadoop and Spark
The Fifth Elephant 2016: Self-Serve Performance Tuning for Hadoop and SparkThe Fifth Elephant 2016: Self-Serve Performance Tuning for Hadoop and Spark
The Fifth Elephant 2016: Self-Serve Performance Tuning for Hadoop and Spark
 
BTV PHP - Building Fast Websites
BTV PHP - Building Fast WebsitesBTV PHP - Building Fast Websites
BTV PHP - Building Fast Websites
 

Kürzlich hochgeladen

Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 

Kürzlich hochgeladen (20)

Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 

Applying Testing Techniques for Big Data and Hadoop

Hinweis der Redaktion

  1. Example:23 mb/sec File Size: 500mb -> 22 seconds File Size: 1,000 -> 43 seconds
  2. More data often means more possible data permutations.
  3. Define standard sample problems…then transition to our relevant sample errors
  4. Random dataset…Only parameter is % of the dataset Testing though sometime needs to reference the data datasets!