Welcome to London Jaspersoft Community User
Group Thursday 16th June 2016
Introductions and update themes for next event
KETL: Why DQ is important for BI
Implementation case study: Andy Fenn and Alexander
McGuire from Workplace Systems
Break
Ernesto: Complex Report Designs with Jaspersoft Studio
http://www.jiem.org/index.php/jiem/article/view/232/130
by 2017, 33% of Fortune 100
organisations will experience an
information crisis, due to their
inability to effectively value,
govern and trust their enterprise
information.
Gartner
www.ketl.co.uk
Impact of poor DQ
Estimates vary on the impact of bad
data on revenue (10 to 30%!). Audit
your own revenue losses from poor
data. Factor in opportunity costs
too.
Measuring the cost of poor DQ
http://www.jiem.org/index.php/jiem/article/view/232/130
Impact of poor DQ in a BI environment
Make DQ part of your BI PoC. It is much harder to go in after
the event to address data quality issues.
DQ and the resulting ETL issues will likely slow down your BI
reporting and put extra strain on your data stores.
Who owns data quality for your BI source systems? This
needs to be established and ideally it should be the BI project
team that takes responsibility for ensuring the data that they
are providing in their reports is accurate and consistent.
Get involved in data governance and implement DQ as a KPI
for the BI team.
http://www.jiem.org/index.php/jiem/article/view/232/130
How is ‘bad’ data
entering our systems?
People. Poorly designed data entry
fields. Duplicate entries. Multiple
data sources. Self-service user
entry.
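
Duplicate entries from multiple sources often differ only in case or spacing. A minimal sketch of one way to flag them, assuming hypothetical customer records with `id`, `name` and `email` fields (the field names, sample values and the `find_duplicates` helper are illustrative, not taken from the talk):

```python
from collections import defaultdict


def normalise(value):
    """Lower-case, trim, and collapse internal whitespace."""
    return " ".join(value.lower().split())


def find_duplicates(rows, key_fields):
    """Group record ids by their normalised key; keep groups with more than one id."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(normalise(row[field]) for field in key_fields)
        groups[key].append(row["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical sample data: records 1 and 2 are the same person entered twice.
customers = [
    {"id": 1, "name": "Jane Smith", "email": "jane@example.com"},
    {"id": 2, "name": "jane  smith ", "email": "JANE@example.com"},
    {"id": 3, "name": "John Doe", "email": "john@example.com"},
]

dupes = find_duplicates(customers, ["name", "email"])
print(dupes)
```

Exact matching on normalised keys only catches the simplest cases; misspellings or swapped name parts would need fuzzy matching, which this sketch does not attempt.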
Data profiling measures
1. Accuracy
2. Completeness
3. Timeliness
4. Validity
5. Consistency
6. Uniqueness
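
Four of these measures can be scored directly from the data itself. The sketch below is illustrative only: the sample records, field names, email pattern and 365-day freshness window are all assumptions, and accuracy and consistency are left out because they need a trusted reference source or a second system to compare against.

```python
import re
from datetime import date

# Hypothetical sample records; field names and values are illustrative only.
records = [
    {"id": 1, "email": "ann@example.com", "updated": date(2016, 6, 1)},
    {"id": 2, "email": "", "updated": date(2016, 5, 20)},
    {"id": 3, "email": "bob@example", "updated": date(2014, 1, 15)},
    {"id": 4, "email": "ann@example.com", "updated": date(2016, 6, 10)},
]

# Deliberately loose format check (assumption): something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(rows, field):
    """Share of rows where the field is present and non-empty."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r.get(field)) / len(rows)


def validity(rows, field, pattern):
    """Share of non-empty values that match the expected format."""
    values = [r[field] for r in rows if r.get(field)]
    if not values:
        return 0.0
    return sum(1 for v in values if pattern.match(v)) / len(values)


def uniqueness(rows, field):
    """Share of non-empty values that are distinct."""
    values = [r[field] for r in rows if r.get(field)]
    if not values:
        return 0.0
    return len(set(values)) / len(values)


def timeliness(rows, field, as_of, max_age_days):
    """Share of rows whose date field is no older than max_age_days."""
    if not rows:
        return 0.0
    fresh = sum(1 for r in rows if (as_of - r[field]).days <= max_age_days)
    return fresh / len(rows)


as_of = date(2016, 6, 16)
print("completeness:", completeness(records, "email"))
print("validity:    ", validity(records, "email", EMAIL_PATTERN))
print("uniqueness:  ", uniqueness(records, "email"))
print("timeliness:  ", timeliness(records, "updated", as_of, 365))
```

Scores like these, expressed as fractions between 0 and 1, are easy to track over time and could feed a DQ KPI for the BI team.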
Experian survey on data accuracy
Getting better data.
Don’t try a ‘big bang’ approach – too
daunting. Profile your data. Use
familiar datasets that you know you
can improve easily. Quick gains.
You have to start with a very
basic idea: data is super
messy, and data cleanup will
always be literally 80 percent of
the work. In other words, data
is the problem.
DJ Patil, Chief Data Scientist of the White House
www.ketl.co.uk
13-14 Orchard Street, Bristol BS1 5EH
+44 (0)117 905 5323
info@ketl.co.uk @KETL_BI
Get in touch
For further information or help with
your data project speak to Helen to
see how we can help >
Helen Woodcock
LinkedIn: /in/helenwoodcock
email: helen@ketl.co.uk
References and Further Reading
Data disasters
http://blogs.mazars.com/the-model-auditor/files/2014/01/12-Modelling-Horror-Stories-and-Spreadsheet-Disasters-Mazars-UK.pdf
https://www.sas.com/content/dam/SAS/en_us/doc/whitepaper1/bad-data-good-companies-106465.pdf
Research on corporate data quality
https://www.edq.com/globalassets/uk/papers/global-research-2015_20pp-ext-apr15.pdf
https://www.gartner.com/doc/2636315/state-data-quality-current-practices
https://www.edq.com/uk/resources/infographics/data-machine/
Cost of data quality
http://betanews.com/2015/02/17/why-data-quality-is-essential-to-your-analytics-strategy/
http://www.itbusinessedge.com/interviews/how-to-measure-the-cost-of-data-quality-problems.html
http://www.itbusinessedge.com/blogs/integration/what-does-bad-data-cost.html
http://techcrunch.com/2015/07/01/enterprises-dont-have-big-data-they-just-have-bad-data/
https://www.experian.com/assets/decision-analytics/white-papers/the%20state%20of%20data%20quality.pdf
Data quality in the BI environment
http://searchdatamanagement.techtarget.com/tip/Data-quality-management-for-business-intelligence-projects
http://www.quistor.com/en/blog/entry/why-has-my-bi-become-slow

London Jaspersoft Community User Group Event 2 KETL presentation


Editor's notes

  1. Customers’ perception of you as a brand is key, and it’s easy for people to go elsewhere, so DQ is paramount. Each company has to decide just how important it is for its brand by measuring the impact.
  2. Impact: don’t forget to consider the opportunity costs. There is also the ‘weariness’ factor in staff: why bother to craft yet another campaign that will reach fewer than half of the recipients because of incorrect or outdated email addresses? Then there are the reputational costs of getting things badly wrong, customer service issues, and being unable to segment properly (not knowing which customers are high cost, low value and which are low cost, high value).
  3. Garbage in, garbage out still holds true. Especially significant for marketers. Company reputation. Often the first contact point that customers have with a business.
  4. Some areas of DQ in your BI reporting are going to be more important than others. Take financial forecasts, for example: you want to know how far from your target projections you are each week. Strategic decisions may be influenced by even small margins of error.
  5. Use some examples here. No gender assigned. Mr Charge Dodger. Need to incentivise good data handling/entry. Improve data entry field design. Automate data cleansing routines. Establish KPIs against data quality.
  6. These are the 6 main tenets of DQ.
  7. What is easily achievable in DQ, and how and why using KPIs to measure DQ will improve customer insight and add value. Technology has improved a great deal in the last few years, and marketers need to know what they can do within their own team and what they will need to get IT to help with. We will use some demonstrations of quick data verification checks to explore what is possible, either as batch reporting or as near real-time, web-integrated data verification look-ups. Depending on the scale and resources of your company, you can decide what is achievable within your own team and/or within your company.
  8. Any campaign, any software upgrade project, any new product launch: all will be impacted if you have poor data quality. There is no point investing in data analytics if you can’t be sure of addressing your customer by the right name when you send out an email campaign (Mr Charge Dodger). Reputation: Age UK – tidal wave of abuse and drop in income with data protection issues – lack of data cleansing.