Data Catalogues - Architecting for Collaboration & Self-Service

Executive Editor at DATAVERSITY
Sep 27, 2019
Data Catalogues - Architecting for Collaboration & Self-Service

  1. Copyright Global Data Strategy, Ltd. 2019 Data Catalogues: Architecting for Collaboration & Self-Service Donna Burbank Global Data Strategy, Ltd. August 26th, 2019 Follow on Twitter @donnaburbank Twitter Event hashtag: #DAStrategies
  2. Donna Burbank
     Donna is a recognised industry expert in information management with over 20 years of experience in data strategy, information management, data modeling, metadata management, and enterprise architecture. Her background is multi-faceted across consulting, product development, product management, brand strategy, marketing, and business leadership. She is currently the Managing Director at Global Data Strategy, Ltd., an international information management consulting company that specializes in the alignment of business drivers with data-centric technology. In past roles, she has served in key brand strategy and product management roles at CA Technologies and Embarcadero Technologies for several of the leading data management products in the market.
     As an active contributor to the data management community, she is a long-time DAMA International member, Past President and Advisor to the DAMA Rocky Mountain chapter, and received the Excellence in Data Management Award from DAMA International in 2016. Donna is also an analyst at the Boulder BI Brain Trust (BBBT), where she provides advice and gains insight on the latest BI and Analytics software in the market. She has served on several review committees of the Object Management Group for key information management and process modeling notations. She has worked with dozens of Fortune 500 companies worldwide in the Americas, Europe, Asia, and Africa and speaks regularly at industry conferences. She has co-authored two books, Data Modeling for the Business and Data Modeling Made Simple with ERwin Data Modeler, and is a regular contributor to industry publications.
     She can be reached at donna.burbank@globaldatastrategy.com. Donna is based in Boulder, Colorado, USA.
  3. DATAVERSITY Data Architecture Strategies: This Year’s Lineup
     • January 24 (on demand): Emerging Trends in Data Architecture – What’s the Next Big Thing?
     • February 18 (on demand): Building a Data Strategy - Practical Steps for Aligning with Business Goals
     • March 28 (on demand): Data Modeling at the Environment Agency of England - Case Study
     • April 25 (on demand): Data Governance - Combining Data Management with Organizational Change
     • May 23 (on demand): Master Data Management - Aligning Data, Process, and Governance
     • June 27 (on demand): Enterprise Architecture vs. Data Architecture
     • July 25 (on demand): Metadata Management: Technical Architecture & Business Techniques
     • August 22 (on demand): Data Quality Best Practices (w/ guest Nigel Turner)
     • Sept 26: Data Catalogues: Architecting for Collaboration & Self-Service
     • October 24: Data Modeling Best Practices: Business and Technical Approaches
     • December 3: Building a Future-State Data Architecture Plan: Where to Begin?
  4. Today’s Topic
     Interest in Data Catalogs is growing as more business & technical users look to gain insight from data using a self-service approach. Architectural techniques for Data Provisioning and Metadata Cataloging have evolved to cater to these new audiences and ways of working. This webinar provides concrete methods of architecting your Self-service BI & Analytics environment to foster collaboration while at the same time maintaining Data Quality and reducing risk.
  5. What is a Data Catalog?
     “A data catalog creates and maintains an inventory of data assets through the discovery, description and organization of distributed datasets. The data catalog provides context to enable data stewards, data/business analysts, data engineers, data scientists and other line of business (LOB) data consumers to find and understand relevant datasets for the purpose of extracting business value. Modern machine-learning-augmented data catalogs automate various tedious tasks involved in data cataloging, including metadata discovery, ingestion, translation, enrichment and the creation of semantic relationships between metadata.”
     Source: Gartner, 12 September 2019 - ID G00394570
  6. Data Catalog or Metadata Catalog? (Image: M.C. Escher, from Wikimedia Commons)
  7. Data Catalog or Metadata Repository?
     • There are functional differences between full metadata repositories and data catalogs.
     • As with any tool category, functionality is a continuum, and the overlap depends on the vendor. Consider your use cases carefully before purchasing any tool.
     Metadata Repository:
     • Automated technical metadata discovery
     • Search capability
     • Data lineage
     • Impact analysis
     • Standards enforcement
     • Business rule alignment
     • Semantic Framework
     Data Catalog:
     • Automated metadata discovery
     • Intuitive user search
     • Collaboration & User Ranking
     • “Light touch” standards enforcement
     Diagram labels: Vendor Functionality Spectrum; Encyclopedia
  8. Product Catalog
  9. Product Catalog
     • Easily Search & Discover Key Items of Interest
     • Collaborate with other Users
     • Understand Relevance & Ranking
     • View Related Items
     • Organize by Subject Area or Department
     • Easily Obtain / Purchase Items of Interest
     • View Product Details and Specifications
  10. Product Data Management
      • Product Master Data, to align common SKUs, Product Names, Descriptions, Prices, etc.
      • PIM and/or Doc/Image Mgt., to align common Images, Branding, etc.
      • Operational Data, to track Customer Purchase Activity
      • Reference Data, to track Common Departments, Regions, Brands, etc.
      • NoSQL and/or Graph Database, to track Recommendations, Usage Ranking, etc.
      • Semantic Layer: data models, taxonomies, hierarchies, etc. to track Product hierarchy, Organizational structure, Brand structure, etc.
  11. Data Catalog
      Search: Customer
      Business Areas: Marketing, Development, Sales
      Data Assets: Tables, Views, Dashboards
      Table: Customer
      Description: The Customer Table provides a list of de-duplicated individuals who have purchased one or more products within the past 18 months.
      Columns (Name, Data Type: Description):
      • First Name, Char(20): Given name of customer
      • Last Name, Char(50): Family name(s) of customer
      • Gender, Varchar(1): Biological gender
      • Member Since, Date: Date of joining the loyalty program
      Views: Customer_Demographics, Customer_Address
      Related Dashboards: Customer Segmentation, Top Customers by Region
      Usage Ranking
      Discussion Forum:
      • JoeD: “This doesn’t include lapsed customers – where do I find that?”
      • MaryK: “Does anyone have an NPS query I could use?”
  12. Data Catalog
      The catalog page from slide 11 (the Customer table with its description, columns, views, related dashboards, usage ranking, and discussion forum) is shown again, annotated with the capabilities carried over from the product catalog:
      • Easily Search & Discover Key Items of Interest
      • Collaborate with other Users
      • Understand Relevance & Ranking
      • View Related Items
      • Organize by Subject Area or Department
      • Easily Obtain / “Purchase” Items of Interest
      • View Details and Specifications
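
To make the catalog page on slides 11 and 12 more concrete, here is a minimal Python sketch of how such an entry could be represented and searched. The class names, the `usage_count` field, the usage figures, and the `Customer_Demographics` description are illustrative assumptions and are not taken from any particular catalog product; only the Customer table's contents come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    data_type: str
    description: str


@dataclass
class CatalogEntry:
    """One data asset as a catalog page might describe it (hypothetical schema)."""
    name: str
    asset_type: str                                # "Table", "View" or "Dashboard"
    description: str
    columns: list = field(default_factory=list)    # list of Column
    related: list = field(default_factory=list)    # names of related views/dashboards
    comments: list = field(default_factory=list)   # (user, text) pairs from the forum
    usage_count: int = 0                           # assumed stand-in for "Usage Ranking"


def search(entries, term):
    """Return entries whose name, description or columns mention `term`, most used first."""
    needle = term.lower()
    hits = [
        e for e in entries
        if needle in e.name.lower()
        or needle in e.description.lower()
        or any(needle in c.name.lower() or needle in c.description.lower()
               for c in e.columns)
    ]
    return sorted(hits, key=lambda e: e.usage_count, reverse=True)


customer = CatalogEntry(
    name="Customer",
    asset_type="Table",
    description=("De-duplicated individuals who have purchased one or more "
                 "products within the past 18 months."),
    columns=[
        Column("First Name", "Char(20)", "Given name of customer"),
        Column("Last Name", "Char(50)", "Family name(s) of customer"),
        Column("Gender", "Varchar(1)", "Biological gender"),
        Column("Member Since", "Date", "Date joining loyalty program"),
    ],
    related=["Customer_Demographics", "Customer_Address",
             "Customer Segmentation", "Top Customers by Region"],
    comments=[("JoeD", "This doesn't include lapsed customers - where do I find that?")],
    usage_count=42,  # made-up figure for illustration
)
demographics = CatalogEntry(
    name="Customer_Demographics",
    asset_type="View",
    description="Demographic attributes per individual (hypothetical description).",
    usage_count=17,  # made-up figure for illustration
)
catalog = [demographics, customer]

for entry in search(catalog, "customer"):
    print(entry.asset_type, entry.name, entry.usage_count)
```

Sorting hits by usage is one simple way to approximate "Understand Relevance & Ranking"; a real catalog would likely combine several signals such as ratings, recency, and certification status.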
  13. Metadata Repository
      • Metadata Storage, Integration & Publication
      • Data Lineage & Impact Analysis
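
Data lineage and impact analysis, as named on slide 13, can be thought of as two directions of traversal over the same dependency graph. The sketch below assumes a tiny, hand-written lineage map; the source system name `crm.customer_raw` and the specific view-to-dashboard edges are hypothetical, since the slides do not state which dashboard is built on which view.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets derived directly from it.
LINEAGE = {
    "crm.customer_raw": ["Customer"],
    "Customer": ["Customer_Demographics", "Customer_Address"],
    "Customer_Demographics": ["Customer Segmentation"],
    "Customer_Address": ["Top Customers by Region"],
}


def impacted_by(asset, lineage):
    """Impact analysis: every asset downstream of `asset`, in breadth-first order."""
    found = []
    visited = {asset}
    queue = deque([asset])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in visited:
                visited.add(child)
                found.append(child)
                queue.append(child)
    return found


def upstream_of(asset, lineage):
    """Data lineage: every asset that `asset` is derived from, nearest first."""
    reverse = {}
    for parent, children in lineage.items():
        for child in children:
            reverse.setdefault(child, []).append(parent)
    # Walking the reversed graph downstream is the same as walking the original upstream.
    return impacted_by(asset, reverse)


print(impacted_by("Customer", LINEAGE))
print(upstream_of("Customer Segmentation", LINEAGE))
```

In practice the edges would be harvested from ETL jobs, view definitions, and BI tools rather than typed in by hand.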
  14. Machine Learning & Metadata Discovery
      • Machine Learning offers ways to automate tedious tasks that may previously have been done manually, e.g. Data Mapping:
        • SSN -> Field1_SSN
        • SSN -> Soc_Num
        • Etc.
      • Machine Learning Pattern Matching:
        • NNN-NN-NNNN -> Field_X follows this pattern, so it must be an SSN
      • There is a place for both methods:
        • Sometimes you want to define specific mapping rules.
        • Sometimes you want a pattern-matching, discovery-style approach.
      Image source: kdnuggets.com
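
The two approaches on slide 14 can be contrasted in a few lines of Python. This is a sketch under stated assumptions: a plain regular expression stands in for the machine-learning pattern matcher the slide refers to, the 0.9 match threshold is arbitrary, and the function names are invented for illustration. The sample values are made up and are not real identifiers.

```python
import re

# Rule-based mapping: explicit, curated rules from a source field name to a standard term.
MAPPING_RULES = {"Field1_SSN": "SSN", "Soc_Num": "SSN"}

# Discovery-style matching: the NNN-NN-NNNN shape from the slide, as a regular expression.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def classify_by_rule(field_name):
    """Return the standard term for a known field name, or None if no rule exists."""
    return MAPPING_RULES.get(field_name)


def classify_by_pattern(sample_values, threshold=0.9):
    """Guess 'SSN' when at least `threshold` of the non-empty samples match the pattern."""
    values = [v for v in sample_values if v]
    if not values:
        return None
    matches = sum(1 for v in values if SSN_PATTERN.fullmatch(v))
    return "SSN" if matches / len(values) >= threshold else None


def classify(field_name, sample_values):
    """Prefer an explicit rule; fall back to pattern discovery on the data itself."""
    return classify_by_rule(field_name) or classify_by_pattern(sample_values)


print(classify("Soc_Num", []))                                # known by rule
print(classify("Field_X", ["123-45-6789", "987-65-4321"]))    # discovered by pattern
print(classify("Field_X", ["Boulder", "123-45-6789"]))        # too few matches
```

Trying the explicit rule first reflects the slide's point that both methods have a place: curated rules give certainty where they exist, and pattern matching covers fields nobody has mapped yet.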
  15. Collaboration to Support the Self-Service User
      Today’s self-service data preparation & reporting user makes use of a wide variety of tools & technologies.
      • “If there are standardized data sets, I’d love to use them!” e.g. Master Data, Data Warehouse
      • “Published documentation, metadata, & standard definitions are super-helpful!” e.g. Glossaries, data models, etc.
      • “I want to integrate these data sets with my own exploratory data for analysis & modeling!” e.g. Self-Service Data Prep & Analysis Tools
      • “How can I leverage what other people have done, and see what is most relevant?” e.g. Data Cataloguing & Crowdsourcing
  16. Integration with Data Governance is Key
      To use data catalogues effectively for business success, clear processes and procedures need to be in place for the governance of, and interaction between, these different data landscapes. Examples include:
      • Data stewardship roles for curated data sets
      • Automated feedback loops to encourage collaborative input for business definitions and rules
      • Review cycles for standard data sets, reports, analytical models, etc.
      • Publication and distribution mechanisms for shared data sets
      • Processes for data promotion between data discovery and enterprise use
      • Data lifecycle and workflow
  17. Implement “Just Enough” Data Governance
      • Know what to manage closely and what to leave alone.
      • The more the data is shared across & beyond the organization, the more formal governance needs to be.
      Master & Reference Data
      • Common data elements used by multiple stakeholders across functional areas, applications, etc.
      • Highly governed
      • Highly published & shared
      • Examples: Reference Data (Department Codes, Country Codes, etc.); Master Data (Customer, Product, Student, Supplier, etc.)
      Core Enterprise Data
      • Common data elements used by multiple stakeholders, departments, etc. (e.g. DW)
      • Highly governed
      • Highly published & shared
      • Examples: Common Financial Metrics for Financial & Regulatory Reporting; Common Attributes, i.e. core attributes reused across multiple areas (e.g. Customer name, Address, etc.)
      Functional & Operational Data
      • Lightly modeled & prepared data for limited sharing & reuse
      • Collaboration-based governance
      • May be future candidates for core data
      • Examples: Operational Reporting; Non-productionized analytical model data; Ad hoc reporting & discovery
      Exploratory Data
      • Raw or lightly prepped data for exploratory analysis
      • Mainly ad hoc, one-off analysis
      • Light touch governance
      • Examples: Raw data sets for exploratory analytics; External & Open data sources
      Flows between the tiers:
      • Publish: Exploratory analysis uses core data sets when applicable.
      • Promote: Derived variables of value can be fed into Core Enterprise, or even Master Data.
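
The tiers on slide 17 and the "Promote" flow between them could be modelled along the following lines. This is only a sketch: treating the four tiers as a strict one-step ladder, and requiring a named steward plus a passed review before promotion, are assumptions made for illustration (loosely inspired by the stewardship roles and review cycles on slide 16), not rules stated in the deck. The data set names are hypothetical.

```python
from dataclasses import dataclass, replace
from enum import IntEnum
from typing import Optional


class Tier(IntEnum):
    EXPLORATORY = 1
    FUNCTIONAL_OPERATIONAL = 2
    CORE_ENTERPRISE = 3
    MASTER_REFERENCE = 4


# Governance style per tier, as described on the slide.
GOVERNANCE = {
    Tier.EXPLORATORY: "light touch",
    Tier.FUNCTIONAL_OPERATIONAL: "collaboration-based",
    Tier.CORE_ENTERPRISE: "highly governed",
    Tier.MASTER_REFERENCE: "highly governed",
}


@dataclass(frozen=True)
class DataSet:
    name: str
    tier: Tier
    steward: Optional[str] = None


def promote(data_set, review_passed):
    """Move a data set up one tier (assumed rule: needs a steward and a passed review)."""
    if data_set.tier is Tier.MASTER_REFERENCE:
        raise ValueError("already at the most governed tier")
    if data_set.steward is None or not review_passed:
        raise ValueError("promotion needs a named steward and a passed review")
    return replace(data_set, tier=Tier(data_set.tier + 1))


churn = DataSet("churn_score", Tier.FUNCTIONAL_OPERATIONAL, steward="MaryK")
promoted = promote(churn, review_passed=True)
print(promoted.tier.name, "->", GOVERNANCE[promoted.tier])
```

The point of the sketch is that governance effort is attached to the tier rather than to each data set individually, so promoting a data set automatically raises the level of control applied to it.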
  18. Summary
      • Data Catalogues provide an intuitive way to access and discover core enterprise data.
      • Collaboration and feedback loops are critical to success.
      • Integration with Data Governance is important to maintain the Data Catalogue effectively in the long term.
      • Understand your use case before choosing a tool, e.g. do you need rigorous standards and lineage, or a looser, collaborative approach?
  19. DATAVERSITY Data Architecture Strategies: Join Us Next Month
      The January through August sessions listed on slide 3 remain available on demand.
      • Sept 26 (soon on demand): Data Catalogues: Architecting for Collaboration & Self-Service
      • October 24: Data Modeling Best Practices: Business and Technical Approaches
      • December 3: Building a Future-State Data Architecture Plan: Where to Begin?
  20. About Global Data Strategy, Ltd
      Data-Driven Business Transformation: Business Strategy Aligned With Data Strategy
      • Global Data Strategy is an international information management consulting company that specializes in the alignment of business drivers with data-centric technology.
      • Our passion is data, and helping organizations enrich their business opportunities through data and information.
      • Our core values center around providing solutions that are:
        • Business-Driven: We put the needs of your business first, before we look at any technology solution.
        • Clear & Relevant: We provide clear explanations using real-world examples.
        • Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s size, corporate culture, and geography.
        • High Quality & Technically Precise: We pride ourselves in excellence of execution, with years of technical expertise in the industry.
      Visit www.globaldatastrategy.com for more information.
  21. Questions? Thoughts? Ideas?