Graph databases are well-suited for tracking carbon emissions across complex supply chains. Building a graph digital twin allows organizations to:
1. Model the entities and relationships in their value chain to calculate scope 3 emissions.
2. Ingest multiple data sources and map emission factors to estimate upstream and downstream carbon impacts.
3. Continuously improve data quality and refine carbon accounting as more information is added to the unified graph model.
Actionable Carbon Tracking and Analysis with the Neo4j Graph Data Platform
1. Neo4j, Inc. All rights reserved 2021
Actionable Carbon Tracking and Analysis
with the Neo4j Graph Data Platform
Michael D. Moore, Ph.D.
Principal, Partner Solutions & Technology
michael.moore@neo4j.com
Thursday, March 30 2023 3:15pm
2. Agenda
• Introduction to Graphs
• Graphs for Carbon Tracking and Reporting
• Building the Graph Digital Twin
• Summary
3. "By 2025, graph technologies will be used in 80% of data and analytics innovations..."
Top 10 Trends in Data and Analytics, 11 May 2020, Rita Sallam et al.
6. What is a Graph? What is a Graph Database?
Graphs accurately represent complex, connected networks of things, routes or processes
Nodes
• Can have Labels to classify nodes
• Labels have native indexes
Relationships
• Relate nodes by type and direction
Properties
• Attributes of Nodes & Relationships
• Stored as Name/Value pairs
• Can have indexes and composite indexes
• Visibility security by user/role
[Figure: example property graph with COMPRESSOR, WELLHEAD and PAD nodes joined by CONNECTED_TO, FLOWS_TO and LOCATED_ON relationships. Example properties shown: id: "X47T-190", failures: 3; id: "WX0-29-B", service: Dec 5, 2016; since: Jan 10, 2011; id: "University9B", latitude: 37.5629, longitude: -122.32553; rate: 32.7]
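The node, relationship and property concepts on this slide can be sketched in a few lines of plain Python. This is only an illustration of the property graph model, not Neo4j's storage format or API. The `Node` and `Relationship` classes and the `neighbors` helper are hypothetical names, and the pairing of each property with a particular node is an assumption, because the layout of the original figure was lost in extraction.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set[str]                                  # labels classify the node
    properties: dict = field(default_factory=dict)    # name/value pairs


@dataclass
class Relationship:
    start: Node                                       # relationships are directed
    type: str                                         # and typed
    end: Node
    properties: dict = field(default_factory=dict)    # they can carry properties too


# Assumed assignment of the example properties to nodes (illustrative only).
compressor = Node({"Compressor"}, {"id": "X47T-190", "failures": 3})
wellhead = Node({"Wellhead"}, {"id": "WX0-29-B"})
pad = Node({"Pad"}, {"id": "University9B", "latitude": 37.5629, "longitude": -122.32553})

relationships = [
    Relationship(wellhead, "FLOWS_TO", compressor, {"rate": 32.7}),
    Relationship(compressor, "LOCATED_ON", pad),
    Relationship(wellhead, "LOCATED_ON", pad),
]


def neighbors(node, rel_type):
    """Follow outgoing relationships of one type from a node."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]
```

For example, `neighbors(wellhead, "FLOWS_TO")` returns the compressor node, which is the kind of single-hop lookup a graph database answers by following stored relationships rather than joining tables.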
7. Graphs have low complexity and high fidelity
[Figure: a SQL RDBMS ER diagram compared with a graph ("whiteboard") model, with NODES and RELATIONSHIPS labeled]
8. Neo4j 5 Graph Data Platform
Neo4j Database
User Tools
• Developer Tools (Desktop, Browser, Data Importer)
• Graph Visualization (Bloom)
• Administration (Neo4j Ops Manager)
Language Drivers & Connectors
• Language Drivers (Java, JavaScript, .NET, Python, Go)
• Spring Data & GraphQL Frameworks
• Kafka (Streaming), Spark, BI Connectors
Neo4j Aura
• Cloud Database-as-a-Service
Graph Data Science
• Enhanced Analytics and Graph-Native ML
Language Standards
• GQL, openCypher
9. Rich Tooling For Rapid Development
• Local database for rapid dev
• Visualize and explore your data
• API-driven intelligent applications
• Query editor and results visualizer
• Data Importer: code-free data loader
• Ops Manager: centralized management
10. Robust Graph Algorithms & ML methods
• Compute metrics about the topology and connectivity
• Build predictive models to enhance your graph
• Highly parallelized; scales to tens of billions of nodes
Neo4j Graph Data Science
[Figure: Native Graph Store, Computational Graph, Mutable In-Memory Workspace]
Efficient & Flexible Analytics Workspace
• Automatically reshapes transactional graphs into an in-memory analytics graph
• Optimized for global traversals and aggregation
• Create workflows and layer algorithms
• Store and manage predictive models in the model catalog
11. Fueling Discovery and Innovation in Every Field
• Real-Time Recommendations
• Fraud Detection
• Network & IT Operations
• Master Data Management
• Identity & Access Management
• Risk & Compliance
12. Common Graph Use Cases In Oil & Gas
• Carbon Tracking and Monitoring
• Digital Twins / Predictive Maintenance
• Supply Chain Visibility
• Capital Projects
• Opportunity Life Cycle
13. Graph Digital Twins for Carbon Tracking & Reporting
14. GHG Reporting Requirements
GHG Reporting Timelines
• March 21, 2022: SEC released new proposals for climate-related risk disclosures.
• February 2024: First disclosures on Scope 1 and 2 for large organizations.
• February 2025: Disclosures on Scope 3 emissions and emissions intensity required for large organizations.
15. Scope 3 Requires Upstream and Downstream Reporting
https://www.epa.gov/climateleadership/scope-3-inventory-guidance
16. Formidable Data Collection Requirements
(Upstream Value Chain Data x Emission Factors)
+ (Downstream Value Chain Data x Emission Factors)
+ Existing Scope 1 and Scope 2 Estimates
= Total Carbon Estimate
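The sum on this slide can be written out as a short calculation. The following is a minimal sketch, assuming each value chain record is a (source, quantity) pair and each source has one emission factor. The function names, source names, factor values and the kg CO2e unit are all hypothetical and chosen only so the arithmetic is easy to follow.

```python
def value_chain_emissions(activities, emission_factors):
    """Sum quantity x emission factor over (source, quantity) records."""
    return sum(qty * emission_factors[source] for source, qty in activities)


def total_carbon_estimate(upstream, downstream, emission_factors, scope_1_2):
    """Upstream + downstream value chain emissions + existing Scope 1 and 2 estimate."""
    return (
        value_chain_emissions(upstream, emission_factors)
        + value_chain_emissions(downstream, emission_factors)
        + scope_1_2
    )


# Hypothetical emission factors in kg CO2e per unit of activity.
factors = {"steel_t": 2.0, "truck_tkm": 0.25, "product_use_kwh": 0.5}
upstream = [("steel_t", 100.0), ("truck_tkm", 5000.0)]      # 200 + 1250
downstream = [("product_use_kwh", 2000.0)]                  # 1000
total = total_carbon_estimate(upstream, downstream, factors, scope_1_2=300.0)
```

With these made-up inputs the total is 2750.0 kg CO2e. In practice the hard part is not the sum but collecting the activity records and choosing the right factor for each one.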
17. Scope 3 Upstream Data Types
Scope 3 Category | Primary Data Source | Secondary Data Source
Category 1: Purchased goods and services
  Primary: Product-level cradle-to-gate GHG data from suppliers calculated using site-specific data; site-specific energy use or emissions data from suppliers
  Secondary: Industry average emission factors per material consumed from life cycle inventory databases
Category 2: Capital goods
  Primary: Product-level cradle-to-gate GHG data from suppliers calculated using site-specific data; site-specific energy use or emissions data from capital goods suppliers
  Secondary: Industry average emission factors per material consumed from life cycle inventory databases
Category 3: Fuel- and energy-related activities (not incl in scope 1 or scope 2)
  Primary: Company-specific data on upstream emissions (extraction of fuels); grid-specific T&D loss rate; company-specific power purchase data and generator-specific emission rate for purchased power
  Secondary: National average data on upstream emissions (e.g. from life cycle inventory database); national average T&D loss rate; national average power purchase data
Category 4: Upstream transportation and distribution
  Primary: Activity-specific energy use or emissions data from third-party transportation and distribution suppliers; actual distance traveled; carrier-specific emission factors
  Secondary: Estimated distance traveled by mode based on industry-average data
Category 5: Waste generated in operations
  Primary: Site-specific emissions data from waste management companies; company-specific metric tons of waste generated; company-specific emission factors
  Secondary: Estimated metric tons of waste generated based on industry-average data; industry average emission factors
Category 6: Business travel
  Primary: Activity-specific data from transportation suppliers (e.g., airlines); carrier-specific emission factors
  Secondary: Estimated distance traveled based on industry-average data
Category 7: Employee commuting
  Primary: Specific distance traveled and mode of transport collected from employees
  Secondary: Estimated distance traveled based on industry-average data
Category 8: Upstream leased assets
  Primary: Site-specific energy use data collected by utility bills or meters
  Secondary: Estimated emissions based on industry-average data (e.g. energy use per floor space by building type)
18. Scope 3 Downstream Data Types
Category | Primary Data Examples | Secondary Data Examples
Category 9: Transportation and distribution of sold products
  Primary: Activity-specific energy use or emissions data from third-party transportation and distribution partners; activity-specific distance traveled; company-specific emission factors (e.g., per metric ton-km)
  Secondary: Estimated distance traveled based on industry-average data; national average emission factors
Category 10: Processing of sold products
  Primary: Site-specific energy use or emissions from downstream value chain partners
  Secondary: Estimated energy use based on industry-average data
Category 11: Use of sold products
  Primary: Specific data collected from consumers
  Secondary: Estimated energy used based on national average statistics on product use
Category 12: End-of-life treatment of sold products
  Primary: Specific data collected from consumers on disposal rates; specific data collected from waste management providers on emissions rates or energy use
  Secondary: Estimated disposal rates based on national average statistics; estimated emissions or energy use based on national average statistics
Category 13: Downstream leased assets
  Primary: Site-specific energy use data collected by utility bills or meters
  Secondary: Estimated emissions based on industry-average data (e.g., energy use per floor space by building type)
Category 14: Franchises
  Primary: Site-specific energy use data collected by utility bills or meters
  Secondary: Estimated emissions based on industry-average data (e.g., energy use per floor space by building type)
Category 15: Investments
  Primary: Site-specific energy use or emissions data
  Secondary: Estimated emissions based on industry-average data
19. Granular Emission Factors for all GHG sources
4700+ Scope 3 Emission Factors
• Upstream (WTT)
• Downstream
• Freight Modality
• Carrier Type & Size
• Fuel Type
• UoM
• GHG Unit
https://www.gov.uk/government/publications/greenhouse-gas-reporting-conversion-factors-2022
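The dimensions listed above suggest a lookup keyed on direction, freight modality, carrier, fuel and unit of measure. The sketch below assumes such a composite key. The `FactorKey` type, the table entries and the numeric values are hypothetical placeholders and are not taken from the published conversion factors.

```python
from typing import NamedTuple


class FactorKey(NamedTuple):
    direction: str   # "upstream" or "downstream"
    modality: str    # freight modality, e.g. road or sea
    carrier: str     # carrier type and size
    fuel: str        # fuel type
    uom: str         # unit of measure the factor applies to


# Hypothetical factors in kg CO2e per unit of measure (not real published values).
EMISSION_FACTORS = {
    FactorKey("upstream", "road", "rigid_truck", "diesel", "tonne.km"): 0.5,
    FactorKey("upstream", "sea", "container_ship", "fuel_oil", "tonne.km"): 0.125,
}


def freight_emissions(key, quantity):
    """Multiply an activity quantity by the factor mapped to its key."""
    try:
        factor = EMISSION_FACTORS[key]
    except KeyError:
        # An unmapped activity is a data gap to resolve, not a zero.
        raise LookupError(f"no emission factor mapped for {key}") from None
    return quantity * factor


sea_leg = FactorKey("upstream", "sea", "container_ship", "fuel_oil", "tonne.km")
sea_emissions = freight_emissions(sea_leg, 800.0)
```

Raising on a missing key, rather than defaulting to zero, keeps unmapped activities visible so that gaps in factor coverage are not silently under-reported.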
20. Value Chain Complexity
Calculation Complexity
21. Effective Carbon Tracking Requires Graphs
• Neo4j Graph Data Science Library: Inference & Predictions
• Neo4j Database: Graph Digital Twin
• Neo4j Bloom: Visualization & Investigation
22. Graphs Naturally Support Carbon Tracking at Scale
• Wide Data: One-to-Many Relationships Across Many Entities
• Complex Data: Many-to-Many Relationships
• Hierarchical & Recursive Data: Nested Tree Structures, Recursion (Self-Joins), Deep Hierarchies
• Hidden Data: Link Inference (If C relates to A and A relates to E, then C must relate to E), Node Similarity
[Figure labels: Legacy Data (Legacy SQL Systems), Frozen Data (Data Lake Fact Tables), Hidden Data (Graph Data Science - Machine Reasoning); example nodes A, B, C, D, E]
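The link inference rule on this slide (if C relates to A and A relates to E, then C relates to E) is a transitive closure over directed links. The sketch below is a plain Python illustration of that rule only. It is not how the Neo4j Graph Data Science library infers links, and `infer_links` is a hypothetical helper name.

```python
def infer_links(edges):
    """Return links implied by chaining known links, excluding the known ones."""
    known = set(edges)
    closure = set(edges)
    changed = True
    while changed:                      # repeat until no new link can be derived
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a -> b and b -> d imply a -> d (self-links are skipped)
                if b == c and a != d and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure - known


edges = [("C", "A"), ("A", "E")]
inferred = infer_links(edges)
```

Here `inferred` contains the single pair `("C", "E")`. Whether a chained link is really valid depends on the relationship type, so a rule like this only makes sense for relationships that are genuinely transitive.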
23. Infrastructure Digital Twins
• Neo4j's flexible graph data model easily handles complex relationships and addition of new data sources
• Provides holistic "360°" view of assets, processes & related data with full spatial support
• Quickly traverse the network to understand dependencies, flows, co-location, operations, and alerts
• Scales to billions of nodes and relationships
Nodes: Regions, Sites, Leases, Well Pads, Well Heads, Tanks, Compressors, Heater Treaters, Free Water Knockouts, Sensors
Relationships: LOCATED_IN, LOCATED_ON, CONNECTED_TO, MONITORED_BY
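Traversing the infrastructure network to find dependencies can be illustrated with a breadth-first walk that follows only selected relationship types. This is a minimal in-memory sketch. The equipment names and the `reachable` helper are hypothetical, and in Neo4j the same question would normally be expressed as a Cypher pattern rather than hand-written traversal code.

```python
from collections import deque

# Hypothetical infrastructure graph as (source, relationship type, target) triples.
EDGES = [
    ("WellHead-1", "CONNECTED_TO", "Tank-1"),
    ("Tank-1", "CONNECTED_TO", "Compressor-1"),
    ("WellHead-1", "LOCATED_ON", "WellPad-1"),
    ("WellPad-1", "LOCATED_IN", "Site-A"),
    ("Compressor-1", "MONITORED_BY", "Sensor-7"),
]


def reachable(start, rel_types):
    """Breadth-first traversal that follows only the given relationship types."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        for src, rel, dst in EDGES:
            if src == node and rel in rel_types and dst not in seen:
                seen.add(dst)
                order.append(dst)
                queue.append(dst)
    return order


downstream_equipment = reachable("WellHead-1", {"CONNECTED_TO"})
location_chain = reachable("WellHead-1", {"LOCATED_ON", "LOCATED_IN"})
```

Changing the set of relationship types changes the question being asked: the first call finds equipment physically downstream of the well head, and the second finds where it sits in the location hierarchy.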
24. Supply Chain Digital Twins
Digital Supply Chain Twins - Conceptual Clarification, Use Cases and Benefits
Benno Gerlach, Simon Zarnitz, Benjamin Nitsche and Frank Straube
Logistics 2021, 5, 86. https://doi.org/10.3390/logistics5040086
[Figure levels: Network Level, Site Level]
Organizations' supply chains often account for more than 90% of their greenhouse gas (GHG) emissions, when taking into account their overall climate impacts.
25. Customer Examples of Digital Twins
OrbitMI: Maritime Routing
• Digital twin of global maritime routes
• Subsecond route planning
• Global carbon emissions reduced by 60,000 tons annually
• $12-16M ROI for OrbitMI customers
U.S. Army: Force Readiness
• Digital twin PLM system with full BoM for all Army equipment, including costs, armaments, force posture and readiness.
• Complex analysis is 7.5 X faster
• Rapid "What-If" analysis enables more agile response to global scenarios
Caterpillar: AI for Maintenance
• Knowledge graph of 27 Million warranty & service documents
• Graph AI learns failure mode "prime examples" to anticipate maintenance
• Improves equipment lifespan and customer satisfaction
26. Esri ArcGIS Knowledge https://www.esri.com/en-us/arcgis/products/arcgis-knowledge
27. ENX IoT Platform https://enxchange.co/platform/iot
28. Building the Graph Digital Twin
29. How do you actually enable change?
• Given that we agree that tracking and reporting on Carbon Emissions is important, how do you actually enable change?
• Understanding the root cause is critical. Where and Why?
• How do you impact change at the source? Where is the Source?
30. Infrastructure Digital Twin Graph with IoT Sensor Data
31. Digital Twin Graph Schema
32. Digital Twin Graph Dashboard
[Figure: infrastructure graph dashboard with callouts "Methane trending up" and "Sensor Alert on Battery"]
33. Digital Twin Infrastructure View
[Figure callouts: Methane Emissions above SOL; GC Pressure below SOL; Infer GC is leaking Methane]
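The inference on this slide combines two sensor signals on the same piece of equipment: methane above its limit and pressure below its limit together suggest a leak. The sketch below shows that rule under several assumptions: SOL is read as an operating limit with a low and a high bound, each reading already knows which equipment it monitors (the role a MONITORED_BY relationship would play in the graph), and the metric names, equipment ids and values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor: str
    equipment: str   # equipment the sensor monitors (assumed known from the graph)
    metric: str
    value: float
    low: float       # assumed lower operating limit
    high: float      # assumed upper operating limit


def suspected_leaks(readings):
    """Equipment with methane above its limit AND pressure below its limit."""
    methane_high = {
        r.equipment for r in readings
        if r.metric == "methane_ppm" and r.value > r.high
    }
    pressure_low = {
        r.equipment for r in readings
        if r.metric == "pressure_psi" and r.value < r.low
    }
    return sorted(methane_high & pressure_low)


readings = [
    Reading("S1", "GC-1", "methane_ppm", 12.0, 0.0, 5.0),
    Reading("S2", "GC-1", "pressure_psi", 80.0, 100.0, 150.0),
    Reading("S3", "GC-2", "methane_ppm", 2.0, 0.0, 5.0),
    Reading("S4", "GC-2", "pressure_psi", 120.0, 100.0, 150.0),
]
leaks = suspected_leaks(readings)
```

Only `GC-1` is flagged, because it is the only unit where both conditions hold. Requiring both signals is what separates a likely leak from an isolated sensor excursion.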
34. Example Real Time Digital Twin Architecture
[Figure: architecture diagram. Labeled components: Event Sources, Azure Device Provisioning, Azure IoT Hub, Raw Stream, Sensor Alerts & Sampled Stream, Hot Path, Warm Path, Azure Stream Analytics, Notification Services, Azure TS Insights, Azure Blob Store, Azure Cosmos DB, Azure Data Warehouse, Azure SQL, ETL Pipeline, OLTP, Neo4j Digital Twin Graph, Neo4j Bloom Visualization, Neo4j Desktop IDE (neo4j.com/download), Neo4j Secure BOLT Driver, Web Apps / GraphQL API, Neo4j ODBC Connector, Power BI Server, PowerBI Reports. Input data: Unstructured Data, JSON Documents, Structured Data]
Enriching data sources
Neo4j Digital Twin integrates a wide variety of data sources (beyond BOM + Sensor Data) to add analytical context to the graph.
• Vendors
• Costs
• Compliance
• Schematics
• Service Records
35. Building the Graph Digital Twin for Carbon Tracking & Reporting
Step 1: Scope Boundary & Data Domains (result: Manageable Use Case)
Determine the Scope 3 boundary requirements for the business use case. Prioritize data collection based on level of effort and potential carbon impact. Don't boil the ocean.
Step 2: Digital Twin Graph Data Model (result: Unified Data Model)
Design the initial Digital Twin graph model to depict the end-to-end value chain for the use case. Identify main entities, relationships, hierarchies, and key dependencies. Implement consistent semantics.
Step 3: Ingest Value Chain Data, map EFs (result: MVP Graph Digital Twin)
Start with snapshots of data sources, and populate the Digital Twin graph in Neo4j database. Map emission factors to upstream & downstream sources. Visualize the value chain with Neo4j Bloom.
Step 4: Allocations, Analytics, Insights (result: Initial Estimates)
Implement calculation logic as Neo4j Cypher queries. Test drive the carbon allocations and troubleshoot against industry standards. Use Neo4j Graph Data Science to make inferences and fill in data gaps.
Step 5: Refine & Improve, Extend the Graph (result: Auditable Reporting)
Improve data quality as the graph becomes built out and adjacent use cases emerge. Add IoT streams and data ingestion pipelines for real-time analytics, APIs/drivers for applications and reporting.
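The allocation step in this workflow amounts to rolling emissions up a supplier graph. The slide proposes implementing that logic as Cypher queries. The sketch below is a plain Python stand-in for the same idea, assuming each supplier's total footprint is allocated to a buyer by a fixed share and that the supply graph has no cycles. The entity names, emissions figures, shares and function names are all hypothetical.

```python
from functools import lru_cache

# Hypothetical direct emissions per entity, in tonnes CO2e.
OWN_EMISSIONS = {"Mine": 400.0, "Smelter": 200.0, "Fabricator": 100.0, "OurCompany": 50.0}

# (supplier, buyer) -> share of the supplier's total footprint allocated to the buyer.
SUPPLIES = {
    ("Mine", "Smelter"): 0.5,
    ("Smelter", "Fabricator"): 0.25,
    ("Fabricator", "OurCompany"): 0.5,
}


@lru_cache(maxsize=None)
def footprint(entity):
    """Own emissions plus the allocated share of every supplier's footprint.

    Assumes the supply graph is acyclic; a cycle would recurse without end.
    """
    allocated = sum(
        share * footprint(supplier)
        for (supplier, buyer), share in SUPPLIES.items()
        if buyer == entity
    )
    return OWN_EMISSIONS[entity] + allocated


def upstream_scope3(entity):
    """Emissions inherited from suppliers, excluding the entity's own emissions."""
    return footprint(entity) - OWN_EMISSIONS[entity]


our_upstream = upstream_scope3("OurCompany")
```

With these made-up numbers the footprints are 400 for the mine, 400 for the smelter, 200 for the fabricator and 150 for the company, so `our_upstream` is 100.0. Each intermediate figure can be inspected on its own, which is useful when troubleshooting allocations against a standard.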
36. Advantages of Graphs
• FAST: Relationships (and nodes) are stored in memory for real-time access
• ELEGANT: Complex business processes are simply and faithfully represented
• EFFICIENT: Queries traverse locally-linked objects with consistent performance
• UNIFYING: Creates a flexible, connected view across disparate data domains
• INSIGHTFUL: Builds up context, enabling reasoning, inference and predictions
37. Summary: Key Points
• Carbon management is difficult to operationalize and influence without infrastructure digital twins and graph databases.
• Complex carbon capture data are naturally and easily modeled as a graph.
• Carbon data graphs can become very large, with potentially millions of connected data elements that require frequent near-real-time updates.
• Neo4j's in-memory graph database provides the flexibility, performance and analytical capabilities needed to build, manage and query digital twins at enterprise scale.
• Graph technology should be included as part of the carbon management strategy because it offers the analytical power to meet compliance needs and release business value.
38. Thank You
Michael D. Moore, Ph.D.
Principal, Partner Solutions & Technology
michael.moore@neo4j.com
Mike Welch
Account Executive Energy Practice Lead
mike.welch@neo4j.com
sales@neo4j.com
39. Neo4j at Scale: LDBC Trillion Entity Graph
LDBC social forum data set - 3 Billion users, 1110 forums
• 1128 forum shards (250GB each), 1 person shard (850GB), 3 Neo4j Fabric processors
• Each forum shard contains 900 million relationships and 182 million nodes
• Person shard contains 3 billion people and 16 billion relationships between them
• Full dataset is 280 TB with 1 trillion relationships
• Query response times range from 12-66ms
https://github.com/neo4j/trillion-graph
https://ldbcouncil.org/
40. Neo4j Kafka Connector
https://www.confluent.io/partner/neo4j/
[Figure: streaming pipeline with components Sensors, IoT Gateway, TimeSeries DB, Kafka Topic, Kafka Neo4j Connect. Stage captions: Grid Controller, ENX - SOL Event Filter, SOL Excursion Events, Stream Events to Graph, Graph Analytics]
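The "SOL Event Filter" stage in this pipeline can be pictured as a filter that passes only out-of-limit readings on to the graph. The sketch below uses a Python generator over an in-memory list instead of a real Kafka topic, so it shows the filtering idea only. The metric names, the limits and the event shape are hypothetical, and SOL is assumed to mean a low and high operating limit per metric.

```python
def excursion_events(stream, limits):
    """Yield an event for each reading that falls outside its operating limits."""
    for reading in stream:
        low, high = limits[reading["metric"]]
        if reading["value"] < low:
            yield {**reading, "excursion": "below"}
        elif reading["value"] > high:
            yield {**reading, "excursion": "above"}
        # In-limit readings are dropped so only excursions reach the graph.


# Hypothetical limits per metric: (low, high).
limits = {"methane_ppm": (0.0, 5.0), "pressure_psi": (100.0, 150.0)}
stream = [
    {"sensor": "S1", "metric": "methane_ppm", "value": 2.0},
    {"sensor": "S1", "metric": "methane_ppm", "value": 9.0},
    {"sensor": "S2", "metric": "pressure_psi", "value": 80.0},
]
events = list(excursion_events(stream, limits))
```

Of the three readings, two become events: the second methane reading (above) and the pressure reading (below). Filtering before the graph keeps the digital twin focused on alerts and sampled data rather than the full raw stream.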
41. Cheap Memory makes Graphs Compelling
SQL RDBMS workarounds to conserve memory = hidden technical debt
• Depict the business as a graph
• Squash the graph into tables
• Jam in foreign keys to relate the records, populate global index
1979: Oracle v2.0 Released (yes, 43 years ago!)
[Figure: memory price per MB over time] https://jcmit.net/memoryprice.htm
42. Carbon Tracking Requires Scale
1000x Performance @Unlimited Scale
[Figure: Response Time plotted against Connectedness and Size of Data Set, with a curve labeled "Relational and Other NoSQL Databases". The horizontal axis runs from "0 to 2 hops, 0 to 3 degrees of separation, thousands of connections" to "tens to hundreds of hops, thousands of degrees, billions of connections". Callout: 1000x Advantage at scale, "Minutes to milliseconds"]
43. Neo4j is the Undisputed Leader in Graph Databases
#1 Most Popular Graph Database with Developers
• 250k+ Developers
• 72k+ Meetup Members Globally
• 50k+ Members with LinkedIn Skills
44. Neo4j: Enabling the world to make sense of their data
• 160M+ Downloads
• 250K+ Devs & Data Scientists
• $390M Series F (June 2021): Largest investment in Database history
The Most Trusted Graph Data Platform
• 7 of the World's Top 10 Retailers
• 3 of the Top 5 Aircraft Manufacturers
• All of North America's Top 20 Banks
• 7 of the Top 10 Telcos
Graph market leader; 1000s of deployments around the globe