Graph databases are growing in popularity, with their ability to quickly discover and integrate key relationship between enterprise data sets. Business use cases such as recommendation engines, master data management, social networks, enterprise knowledge graphs and more provide valuable ways to leverage graph databases in your organization. This webinar provides an overview of graph database technologies, and how they can be used for practical applications to drive business value.
Data Architecture Strategies: The Rise of the Graph Database
1. The Rise of the Graph Database
Donna Burbank, Managing Director
Global Data Strategy, Ltd.
April 26th, 2018
Follow on Twitter @donnaburbank
Twitter Event hashtag: #DAStrategies
2. Global Data Strategy, Ltd. 2018
Donna Burbank
Donna is a recognised industry expert in
information management with over 20 years
of experience in data strategy, information
management, data modeling, metadata
management, and enterprise architecture.
Her background is multi-faceted across
consulting, product development, product
management, brand strategy, marketing,
and business leadership.
She is currently the Managing Director at
Global Data Strategy, Ltd., an international
information management consulting
company that specializes in the alignment of
business drivers with data-centric
technology. In past roles, she has served in
key brand strategy and product
management roles at CA Technologies and
Embarcadero Technologies for several of the
leading data management products in the
market.
As an active contributor to the data
management community, she is a long time
DAMA International member, Past President
and Advisor to the DAMA Rocky Mountain
chapter, and was recently awarded the
Excellence in Data Management Award from
DAMA International in 2016.
Donna is also an analyst at the Boulder BI
Train Trust (BBBT) where she provides advice
and gains insight on the latest BI and
Analytics software in the market. She was on
several review committees for the Object
Management Group’s for key information
management and process modeling
notations.
She has worked with dozens of Fortune 500
companies worldwide in the Americas,
Europe, Asia, and Africa and speaks regularly
at industry conferences. She has co-
authored two books: Data Modeling for the
Business and Data Modeling Made Simple
with ERwin Data Modeler and is a regular
contributor to industry publications. She can
be reached at
donna.burbank@globaldatastrategy.com
Donna is based in Boulder, Colorado, USA.
2
Follow on Twitter @donnaburbank
Twitter Event hashtag: #DAStrategies
3. Global Data Strategy, Ltd. 2018
DATAVERSITY Data Architecture Strategies
• January - on demand Panel: Emerging Trends in Data Architecture – What’s the Next Big Thing?
• February - on demand Building an Enterprise Data Strategy – Where to Start?
• March - on demand Modern Metadata Strategies
• April The Rise of the Graph Database: Practical Use Cases & Approaches to Benefit your Business
• May Data Architecture Best Practices for Today’s Rapidly Changing Data Landscape
• June Artificial Intelligence: Real-World Applications for Your Organization
• July Panel: Data as a Profit Driver – Emerging Techniques to Monetize Data as a Strategic Asset
• August Data Lake Architecture – Modern Strategies & Approaches
• Sept Master Data Management: Practical Strategies for Integrating into Your Data Architecture
• October Business-Centric Data Modeling: Strategies for Maximizing Business Benefit
• December Panel: Self-Service Reporting and Data Prep – Benefits & Risks
3
This Year’s Line Up for 2018
4. Global Data Strategy, Ltd. 2018
What We’ll Cover Today
• Graph databases are growing in popularity, with their ability to quickly discover and integrate key
relationship between enterprise data sets.
• Business use cases such as recommendation engines, master data management, social networks,
enterprise knowledge graphs and more provide valuable ways to leverage graph databases in your
organization.
• This webinar provides an overview of graph database technologies, and how they can be used for
practical applications to drive business value.
4
5. Global Data Strategy, Ltd. 2018
What is a Graph Database?
• A graph database uses a set of nodes, edges, and
properties to represent and store data.
• With graph databases, the relationships between data
points often matter more than the individual points
themselves. In order to leverage those data relationships,
your organization needs a database technology that stores
• These relationships can help you discover new insights
from your data.
5
7. Global Data Strategy, Ltd. 2018
Graph Database = Thing Relates to Thing
7
Node
Vertice
Edge
Relationship
The more formal way of referring to “thing relates to thing” is
“Nodes & Edges”, “Vertices & Relationships”, etc.
8. Global Data Strategy, Ltd. 2018
Graph Databases Mirror the Way We Think
8
Squirrel!
I should go
visit Mary
I wonder how her
brother John is doing?
Is he still dating
Stephanie?
…In the mind, as in data,
there are always random
data points…
Do they still have that
house at the Lake?
Riding their boats on the lake was great.
Remember when John crashed the boat?
Like my toy
as a child.
Graph databases can be intuitive to many, since they mirror the way the human brain
typically thinks – through Association.
9. Global Data Strategy, Ltd. 2018
“Traditional” way of Looking at the World: Hierarchies
• Carolus Linnaeus in 1735 established a hierarchy/taxonomy for organizing and identifying
biological systems.
Kingdom
Phylum
Class
Order
Family
Genus
Species
10. Global Data Strategy, Ltd. 2018
“New” Way of Looking at the World - Emergence
In philosophy, systems theory, science, and art, emergence is
the way complex systems and patterns arise out of a
multiplicity of relatively simple interactions.
- Wikipedia
11. Global Data Strategy, Ltd. 2018
Graph Databases Combine Flexibility w/ Structure & Meaning
• In many ways, graph databases provide the “best of both worlds”.
11
Flexibility of the “New World”
of Discovery & “Emergence”
Structure & Meaning of the “Old
World” through Ontologies+
12. Global Data Strategy, Ltd. 2018
It’s All About Relationships
• In graph databases, relationships are first class constructs.
• Rather ironically, relational databases lack relationships.
• In relational databases, relationships are enforced through joins and constraints.
• NoSQL (e.g. Key Value) databases are also weak at supporting relationships.
12
“A relational database isn’t about relationships, it’s about constraints.”
– Karen Lopez
Customer Account
Is Owner Of
<Customer> <Owner Of> <Account>
13. Global Data Strategy, Ltd. 2018
Data Modeling for Graph Databases
• There are several dominant ways to model graph databases. Two popular ones include:
• Resource Description Language (RDF) Triples
• Labeled Property Graph
13
Labeled Property Graph
• Made up of nodes, relationships, properties & labels
• Sample Query language: Cypher
• Sample Vendor: Neo4J
Resource Description Language (RDF) Triples
• Made up of subject, predicate object triples
• Sample Query: SPARQL
• Sample Vendor: Stardog
• Both have a close affinity between logical & physical models
• i.e. We already think in “thing relates to thing”
14. Global Data Strategy, Ltd. 2018
Ontologies help Define Queries
14
People have Names
People can own kinds of things
Pets can be owned
A dog is a pet
Dogs can have names
Ontology
Show me all of the People who Own Dogs
Query
Thing
“Person”
Relates to
“Owns”
“Loves?”
“Was Bitten By?
Thing
“Dogs”
15. Global Data Strategy, Ltd. 2018
Graph Databases – Current Usage
Only 12.7% of respondents are
currently using a graph database.
15
“Which of the following data sources or platforms are you currently using?
[Select all that apply]
Relational Databases
are still clearly the
leader.
Spreadsheets are
ubiquitous
From Trends in Data Architecture, 2017, DATAVERSITY, by Donna Burbank and Charles Roe
16. Global Data Strategy, Ltd. 2018
Graph Databases – Future Usage
16
“Which of the following do you plan to use in the future that you are not using
currently? [Select all that Apply]”
22.6% of respondents are
planning to use Graph databases in
the future.
From Trends in Data Architecture, 2017, DATAVERSITY, by Donna Burbank and Charles Roe
Interestingly, Forrester had predicted that graph
databases will have a 25% adoption rate by 2017
17. Global Data Strategy, Ltd. 2018 17Source: Gartner (September 2017)
Gartner’s Take
Gartner places Graph DBMSs in the “Peak of
Inflated Expectations” phase, with an
estimated plateau of 2 to 5 years.
18. Global Data Strategy, Ltd. 2018
Survey: How do you compare?
• Are you currently using a graph database?
• Yes/No
• Do you have future plans to use a graph database?
• Yes/No
• For what Use Cases are you currently implementing graph databases? (Select ALL that apply)
• Social Networks
• Fraud Detection
• Recommendation Engines
• Master Data Management (MDM)
• Enterprise Knowledge Graph
• Semantic Web
• Metadata Management
• Other
18
Live Survey Feedback
20. Global Data Strategy, Ltd. 2018
Social Networks
20
Donna
Sad, Lonely Person who
doesn’t like data
Who are the cool kids?
i.e. People linked with Donna
21. Global Data Strategy, Ltd. 2018
X Degrees of Separation – “The Bacon Number”
• What’s Audrey Hepburn’s “Bacon Number”? i.e. degrees of separation/relation to actor Kevin Bacon
• As always, metadata and data quality are important., i.e Which Audrey Hepburn?
21Courtesy of oracleofbacon.org
22. Global Data Strategy, Ltd. 2018
Fraud Detection in Online Transactions
• Online transactions typically have certain identifiers, e.g. User ID, IP address, geo location, tracking cookie, credit card number, etc.
• Graph patterns can help detect fraud, e.g.
• The more interconnections exist among identifiers, the greater the cause for concern.
• Typically they would be 1:1.
• Some variations may occur, e.g. Multiple credit cards with one person. Families using same machine, etc.
• Large and tightly-knit graphs are very strong indicators that fraud is taking place.
• Triggers can be put into place so that these patterns are uncovered before they cause damage.
22
IP1 IP1 IP1 IP1 IP1 IP1 IP1 IP1 IP1 IP1
CC1 CC2 CC3 CC4 CC5 CC6 CC7 CC8 CC9 CC10 CC11 CC12 CC13 CC14 CC15 CC16 CC17
Fraud? FamilyPersonal & Business Card
23. Global Data Strategy, Ltd. 2018
Recommendation Engines
• Recommendation Engines are familiar to most of us who do any online shopping.
• These engines can be powered by a graph database, e.g.
• Capture a customer’s browsing behavior and demographics
• Combine those with their buying history to provide relevant recommendations
23
24. Global Data Strategy, Ltd. 2018
Data Quality & Volume Matters
• Recommendation engines are based on evaluating data sets. If those data sets are faulty or of
poor quality, your results will be flawed.
• Especially if the data sets are small
24
25. Global Data Strategy, Ltd. 2018
Data Warehousing & Enterprise Knowledge Graph
25
Data Warehouse
…Show me Total Sales by Region and by
Customer each month in 2017
Enterprise Knowledge Graph
Relational & Dimensional data model Graph data model
…Who are my most influential
customers. (with the most connections)
26. Global Data Strategy, Ltd. 2018
An Enterprise Knowledge Graph Provides a Holistic View of
the Organization through Relationships
26
Customer Data
Data Quality & Semantics are important
for core enterprise data assets.
Name: Audrey Hepburn
DOB: May 4, 1929
Current Customer: No
But the true value is in the
interrelationships between data assets.
Mother of
Name: Luca Dotti
DOB: February 8, 1970
Current
Customer: Yes
Purchased Yacht Insurance
Purchased Home
Insurance
Filed a Claim
27. Global Data Strategy, Ltd. 2018
Master Data Management (MDM)
• Master Data Management (MDM) is the practice of identifying, cleansing, storing & governance
core data assets of the organization (e.g. customer, product, etc.)
• There are many architectural approaches to MDM. Two are the following:
27
Centralized -- Commonly Relational Virtualized/Registry – Commonly Graph
MDM
Virtualization Layer
• Core data stored in
a common schema
in a centralized
“hub”.
• Used as a common
reference for
operational systems,
DW, etc.
• Data remains in
source systems.
• Referenced through
a common
virtualization layer.
BOTH require the same core foundation of data quality, parsing & matching, semantic meaning,
data governance, etc. in order to be successful… and that’s usually the hardest stuff.
28. Global Data Strategy, Ltd. 2018
The Semantic Web & RDF
28
• The RDF (Resource Description Framework) model from the World Wide Web Consortium (W3C) provides a way to link resources on
the web (people, places, things). It provides a common framework for applications to share information without losing meaning.
• Search Engines
• Exchanging data between datasets
• Sharing information with applications / APIs
• Building social networks
• Etc.
• The goal is to move from a web of documents to a web of data.
• The Framework is a simple way to express relationships between resources.
• IRIs (International Resource Identifiers) (e.g. URI) identify resources
• Simple triples relate objects together in the format: <subject> <predicate> <object>
• These relationships create a connected Graph
• There are several serialization formats, with RDF XML being a common one. For example:
• Turtle is a human-friendly format
• RDF/XML
• JSON-LD
• Schemas define the vocabularies used to describe the objects
• E.g. Dublin Core and Schema.org
Subject Object
Predicate
ACME
Publishing
RDF is
Easy
Is Publisher Of
29. Global Data Strategy, Ltd. 2018
Creating a Web of Data
29
@type: Place
Sheraton San Diego Hotel & Marina
1380 Harbor Island Drive
San Diego, California 92101 USA
"@context": "http://schema.org",
“location": {
"@type": "Place",
"name": "Sheraton San Diego Hotel & Marina",
"address": {
"@type": "PostalAddress",
"streetAddress": "1380 Harbor Island Drive",
"addressLocality": "San Diego",
"addressRegion": "CA",
"postalCode": "92101"
},
"telephone" : "+1-877-734-2726",
"image":
"http://edw2016.dataversity.net/uploads/ConfSiteAssets/72/im
age/sheraton.jpg",
"url":"http://edw2016.dataversity.net/travel.cfm"
},
"@context": "http://schema.org",
"location": {
"@type": "Place",
"name": "Sheraton San Diego Hotel & Marina",
"address": {
"@type": "PostalAddress",
"streetAddress": "1380 Harbor Island Drive",
"addressLocality": "San Diego",
"addressRegion": "CA",
"postalCode": "92101"
},
"telephone" : "+1-877-734-2726",
"image": “http://mysite.com/edw16photo.jpg",
"url":“http://mysite.com/myphotos"
},
* Script provided by: Eric Franzon, eric@smartdataconsultants.com
*
30. Global Data Strategy, Ltd. 2018
Summary
• Graph Databases provide powerful enterprise-wide association using simple constructs
• “Thing Relates to Thing”
• Relationships are first class constructs
• Usage of Graph databases is on the rise
• While adoption is smaller than traditional data sources (e.g. relational), usage is on the rise
• As with any new technology, there is a risk of “inflated expectations”
• Enterprise use cases are best suited to those that focus on interrelationships between data points
• Social Networks
• Fraud Detection
• Recommendation Engines
• Enterprise Knowledge Graph
• MDM
• Semantic Web
• Etc.
31. Global Data Strategy, Ltd. 2018
About Global Data Strategy, Ltd.
• Global Data Strategy is an international information management consulting company that specializes
in the alignment of business drivers with data-centric technology.
• Our passion is data, and helping organizations enrich their business opportunities through data and
information.
• Our core values center around providing solutions that are:
• Business-Driven: We put the needs of your business first, before we look at any technological solution.
• Clear & Relevant: We provide clear explanations using real-world examples, not technical jargon.
• Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s
size, corporate culture, and geography.
• High Quality & Technically Precise: We pride ourselves in excellence of execution, and we attract high-
quality professionals with years of technical expertise in the industry.
31
Data-Driven Business Transformation
Business Strategy
Aligned With
Data Strategy
www.globaldatastrategy.com
32. Global Data Strategy, Ltd. 2018
White Paper: Trends in Data Architecture
32
Free Download
• Download from
www.globaldatastrategy.com
• Under ‘Resources/Whitepapers’
33. Global Data Strategy, Ltd. 2018
DATAVERSITY Data Architecture Strategies
• January Panel: Emerging Trends in Data Architecture – What’s the Next Big Thing?
• February Building an Enterprise Data Strategy – Where to Start?
• March Modern Metadata Strategies
• April The Rise of the Graph Database: Practical Use Cases & Approaches to Benefit your Business
• May Data Architecture Best Practices for Today’s Rapidly Changing Data Landscape
• June Artificial Intelligence: Real-World Applications for Your Organization
• July Panel: Data as a Profit Driver – Emerging Techniques to Monetize Data as a Strategic Asset
• August Data Lake Architecture – Modern Strategies & Approaches
• Sept Master Data Management: Practical Strategies for Integrating into Your Data Architecture
• October Business-Centric Data Modeling: Strategies for Maximizing Business Benefit
• December Panel: Self-Service Reporting and Data Prep – Benefits & Risks
33
This Year’s Line Up for 2018 – Join Us Next Month