16. Amazon DocumentDB: Modern cloud-native architecture
What would you do to improve scalability and availability?
1. Decouple compute and storage
2. Distribute data in smaller partitions
3. Increase the replication of data (6x)
49. Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology Siemens AG 2019
Knowledge graphs become especially powerful for managing complex queries and heterogeneous data
Why Knowledge Graphs?
• Graphs are a natural way to represent entities and their relationships
• Graphs can capture a broad spectrum of data (structured / unstructured)
• Graphs can be managed efficiently
Intuitive domain modelling
Flexibility & performance
Low up-front investment
Robust data quality assurance
Game-changing data integration
50. Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology Siemens AG 2019
What are Graphs? A knowledge representation formalism: semantic descriptions of entities and their relationships
Objects: real-world objects (things, places, people) and abstract concepts (genres, religions, professions)
Relationships: logical connections between two objects, e.g. Joe Kaeser is born in Arnbruck
Semantic descriptions: a semantic description indicates the meaning of an object or relation, e.g. Joe Kaeser is a person
Rules make it possible to add further expert knowledge, e.g. "Siemens has to be a company, as a person is working there"
[Example graph: Joe Kaeser is a Person, is born in Arnbruck, is born on June 23, 1957, has pictures, is married to Roesemarie Kaeser, and is working at Siemens, where he was promoted on August 1, 2013; Siemens is a Company, is headquartered in Munich and Berlin, and was founded by Werner v. Siemens and Friedrich Halske.]
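For readers who want to see this in code, here is a minimal sketch of the example graph above written down as machine-readable RDF triples using the open-source rdflib library; the namespace and property names (ex:bornIn, ex:worksAt, ...) are illustrative and not part of the Siemens model.

```python
# Minimal sketch: the example graph expressed as RDF triples with rdflib.
# Namespace and property names are illustrative, not the Siemens schema.
from rdflib import Graph, Namespace, Literal, RDF

EX = Namespace("http://example.org/")
g = Graph()

g.add((EX.JoeKaeser, RDF.type, EX.Person))       # semantic description: Joe Kaeser is a person
g.add((EX.JoeKaeser, EX.bornIn, EX.Arnbruck))    # relationship: logical connection between two objects
g.add((EX.JoeKaeser, EX.bornOn, Literal("1957-06-23")))
g.add((EX.JoeKaeser, EX.worksAt, EX.Siemens))
g.add((EX.Siemens, RDF.type, EX.Company))        # could also be inferred by a rule:
                                                 # "Siemens has to be a company, as a person works there"
g.add((EX.Siemens, EX.headquarteredIn, EX.Munich))

print(g.serialize(format="turtle"))
```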
51. Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology Siemens AG 2019
Knowledge graphs become a powerful addition to traditional data
warehouses for managing heterogeneous data with complex relations
Four dimensions indicate which approach fits (the low end points to traditional data warehouses, the high end to knowledge graphs):
• Number and complexity of data relations: limited relations across the data (e.g. time-series data) point to a data warehouse; a high number of complex relations (e.g. social networks) points to a knowledge graph
• Variety of questions to be answered: "We have a clear scope of user questions" (a fixed set of dashboards is needed) points to a data warehouse; "We do not yet know all the different questions users will ask" (a chatbot) points to a knowledge graph
• Data heterogeneity: only a few data types and data sources (e.g. an invoicing database in ERP) point to a data warehouse; analysis that needs to combine heterogeneous data (e.g. sensor data, text data, product data) points to a knowledge graph
• Quality of data: ensured consistency of the stored data points to a data warehouse; the need to handle imperfect, incomplete, or inconsistent data points to a knowledge graph
52. Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology Siemens AG 2019
Use cases for knowledge graphs can be clustered into
five categories – overview and use case examples
• Data quality: improving data availability and quality by combining and comparing data from various sources to fill in missing data sets or to identify potentially wrong data and data duplicates
• Digital companion: enhancing features of existing products or services with digital companions that are able to understand and process user questions and provide the needed data insights
• Data access & dashboarding: maintaining up-to-date metadata, creating transparency on all available data, and making it accessible to users via queries
• Recommender system: providing users with high-quality recommendations by identifying similarities in historical data
• Constraints & planning: enabling autonomous systems to understand data and its dependencies and take their own decisions, such as autonomous planning of production processes
The categories are arranged along a degree-of-complexity axis.
53. Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology Siemens AG 2019
Example: gas-turbine maintenance planning
[Diagram: source systems (databases, documents, CSV/XML/Excel files) holding maintenance, repair, time-series, and power-turbine configuration data feed the Siemens Knowledge Graph, which supports fleet browsing and search, analytics for R&D, and maintenance planning.]
• Cross-life-cycle data integration at PS DO: gas turbines, including maintenance, repair, monitoring, and configuration data
• Holistic engineering-centric domain model
• Intuitive graph queries independent of source data schemata
• Virtual integration of time-series data
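To illustrate "intuitive graph queries independent of source data schemata", here is a hypothetical sketch of how a maintenance-planning question could be phrased as a single SPARQL query against such a knowledge graph; the class and property names are made up and are not the actual Siemens domain model.

```python
# Hypothetical SPARQL query ("which gas turbines had a repair event after 2018?").
# Class and property names (ex:GasTurbine, ex:hasEvent, ...) are illustrative only.
# The query string could be run with rdflib or sent to any SPARQL endpoint.
query = """
PREFIX ex:  <http://example.org/turbine#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?turbine ?repairDate
WHERE {
  ?turbine a ex:GasTurbine ;
           ex:hasEvent ?event .
  ?event   a ex:RepairEvent ;
           ex:date ?repairDate .
  FILTER (?repairDate > "2018-12-31"^^xsd:date)
}
ORDER BY ?repairDate
"""
print(query)
```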
54. Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology Siemens AG 2019
A semantic data model enables flexible linking and an integrated,
intuitive API for applications
[Diagram: the Digital Lifecycle Platform with the BT Knowledge Graph and its Semantic Data Model at the core. It links data sources such as customer data, weather data, public energy data, MindSphere, the building structure (BIM), product data, and web data, and exposes them through a Building API, Product API, and Data API to applications such as planning tools, a simulation tool, Building Operations, Security as a Service, Building Performance, the Service Portal, Location Based Services, and drive applications.]
55. Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology Siemens AG 2019
Creating perfect places based on Services –
a user-centric holistic approach to the modern workplace …
Customer interests: optimizing CAPEX and OPEX; energy and asset efficiency; space efficiency; individual efficiency and comfort
Relevant KPIs: cost per space unit; revenue per space unit; workplace utilization; vacancy rate; employee satisfaction; employee productivity; CO2 emissions; asset performance / useful life
56. Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology Siemens AG 2019
Industrializing Knowledge Graphs
R&D areas and relevant technologies
Decision Making
• R&D areas: explanation of AI decisions; data access via semantic search; machine learning on graphs for recommendations, quality, etc.
• Relevant technologies: reasoning and constraint solving; machine/deep learning; question answering
Storage and Integration
• R&D areas: reusable semantic modelling and knowledge graphs; data integration and cleaning (e.g. entity reconciliation)
• Relevant technologies: graph/NoSQL databases; constraints and rules; probabilistic programming; ontologies
Generation
• R&D areas: extraction from unstructured data (including text, audio, and images); automatic semantic annotation of structured data; learning of domain-specific rules/patterns
• Relevant technologies: NLP/text understanding; machine/deep learning; computer vision; sound recognition; virtual data integration; information retrieval; …
[Diagram: the Industrial Knowledge Graph spans Generation, Storage and Integration, and Decision Making, automating the flow of knowledge between humans and machines, with building blocks such as ML for graphs, an ontology library, ML for automated annotation, and simple RDF + PGM.]
57. Dr. Sebastian Brandt, Dr. Steffen Lamparter, Siemens Corporate Technology Siemens AG 2019
Get in touch!
Dr. Sebastian Brandt
Senior Key Expert – Knowledge Graph and Data Management
CT RDA BAM SMR-DE
Dr. Steffen Lamparter
Head of Research Group Semantics & Reasoning
CT RDA BAM SMR-DE
Siemens AG
Corporate Technology
Otto-Hahn-Ring 6
81739 München, Germany
E-mail: steffen.lamparter@siemens.com
Intranet: intranet.siemens.com/ct
Let’s take a look back at how database technology has evolved over the years.
In the 1970s and 1980s, we had relational databases and then in the 1990s we added open source options like MySQL and PostgreSQL.
Then, from around the mid-2000s to the current day, we have seen significant growth of specialized databases that differ from the relational model. I don’t think it is a coincidence that these new databases emerged at the same time the cloud was taking off, as customers were beginning to build internet-scale apps that demanded functionality, performance, and scale for many different use cases in the same application.
Modern apps create new requirements
For example, if you think of some of the largest cloud applications today
you will notice some common characteristics
Millions of users
Located all over the world
All expecting instant experiences (e.g., millisecond-microsecond latencies)
These systems need to scale on the fly
The one-size-fits-all approach of the past no longer works; it is very easy to overburden a single database
Today, developers are doing what they do best
They are breaking down large applications into smaller parts and picking the right tool for the right job
So they never have to trade off functionality, performance, or scale
Speaking of right tool for the right job
Instead of listing hundreds, let’s pivot and first think about common categories
Then consider what the purpose of a tool is within a category
And common use cases we hear from customers
For example
If you were building a health insurance application
that required a strict schema, data accuracy, and consistency, relational is a great choice
If you were building a massive online game, with millions of players coming and going, requiring high-throughput reads and writes with endless scale, key-value is a great choice
If part of your application needed to make a product recommendation based on highly connected data, graph is a great choice
If you look at our offerings and how they align to these categories, our database strategy is fairly simple: we want to ensure that you as developers have the very best purpose-built databases in each of these categories, so that you never have to sacrifice scale, performance, or functionality.
Today we’re going to talk about Amazon DocumentDB, a fully managed, MongoDB-compatible database service designed from the ground up to be fast, scalable, and highly available
If you have legacy apps you want to migrate
Interested in improving scale and performance
Want to free up resources to innovate
Customers running commercial databases often choose Amazon RDS
Run your choice of database engines – open source, Oracle, SQL Server
Amazon RDS automates time-consuming administration tasks
Hardware provisioning, database setup, patching, and backups
Customers can spend time innovating and building new applications
This is why we built Amazon Aurora
Aurora = performance and availability of commercial with cost-effectiveness of open source
5x performance of standard MySQL
3x performance of standard PostgreSQL
With the security, availability, and reliability of commercial-grade databases
At 1/10th the cost
The first consideration that needs to be made when selecting a database is the characteristics of the data you are looking to leverage. If the data has a simple tabular structure, like an accounting spreadsheet, then the relational model could be adequate. Data such as geo-spatial, engineering parts, or molecular modeling, on the other hand, tends to be very complex.
Let’s talk about one of the types of application use cases: Internet-scale.
Amazon DynamoDB is a fast and flexible NoSQL database service for all applications that need consistent, single-digit millisecond latency at any scale. It is a fully managed cloud database and supports both document and key-value store models. Its flexible data model, reliable performance, and automatic scaling of throughput capacity, makes it a great fit for mobile, web, gaming, ad tech, IoT, and many other applications.
Backup Notes
1/Performance at scale - Consistent, single-digit millisecond response times at any scale. Build applications with virtually unlimited throughput and storage, backed by a service level agreement for reliability.
2/Serverless - No hardware provisioning, software patching, or upgrades. Scales up or down automatically to accommodate your performance needs. Optimize costs by paying for only the resources you use. Protect your data with on-demand and continuous backups with no downtime.
3/Comprehensive Security- Encrypts all data by default and fully integrates with AWS Identity and Access Management for robust security. Get oversight of your tables by using integrated monitoring on audit logs with AWS CloudTrail, and network isolation with Amazon Virtual Private Cloud.
4/Modern applications - A database for serverless applications that includes AWS Lambda integration. Supports ACID transactions for business-critical applications. Build global applications with fast access to local data by easily replicating tables across multiple AWS Regions.
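As a concrete illustration of the key-value/document model and single-digit-millisecond access pattern described above, here is a minimal sketch using the boto3 SDK; the table name, key schema, and attribute names are made up for illustration.

```python
# Minimal sketch: key-value access with DynamoDB via boto3.
# Table name, key schema, and attributes are illustrative only.
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("GamePlayers")  # assumes a table with partition key "player_id"

# Write an item (key-value / document style, no fixed schema beyond the key)
table.put_item(Item={
    "player_id": "p-12345",
    "level": 7,
    "inventory": ["sword", "shield"],   # nested, document-style attributes are allowed
})

# Read it back by primary key
resp = table.get_item(Key={"player_id": "p-12345"})
print(resp.get("Item"))
```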
Again, working backwards from our customers, and taking our collective knowledge at AWS of building distributed systems and database services, what could we do differently here to improve these scenarios?
1/ Decouple storage and compute
2/ Break up data into smaller partitions, spread the data across multiple AZs, and make use of cell-based architectures
3/ Distribute those partitions six ways across three AZs
We did all this to reduce single points of failure and to reduce blast radius; remember, things fail all the time, whether it is hardware, software, or networking (see the toy sketch below)
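To picture points 2 and 3, here is a toy sketch, not the actual storage implementation, of spreading a partition's six copies across three AZs so that losing any one AZ still leaves most copies intact.

```python
# Toy illustration only -- not the real DocumentDB storage layer.
# Each partition gets 6 copies, 2 per Availability Zone, so one AZ failure
# still leaves 4 of 6 copies and keeps the blast radius small.
AZS = ["az-1", "az-2", "az-3"]

def place_replicas(partition_id, copies=6):
    """Spread `copies` replicas of a partition evenly across the AZs."""
    return [(AZS[i % len(AZS)], f"{partition_id}-replica-{i}") for i in range(copies)]

placement = {p: place_replicas(p) for p in ["part-0", "part-1", "part-2"]}

# Simulate losing one AZ and count surviving copies per partition
failed_az = "az-2"
for part, replicas in placement.items():
    surviving = [r for az, r in replicas if az != failed_az]
    print(part, "surviving copies:", len(surviving))  # -> 4 of 6
```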
Working backwards from our customers and these challenges, we built Amazon DocumentDB to be a fast, scalable, fully-managed, and MongoDB-compatible AWS database service.
Data is stored in JSON-like documents and JSON documents are first-class objects within the database – documents are not a data type or a value, they are the key design point of the database
Document databases have flexible schemas and make the representation of hierarchical and semi-structured data easy; they also enable powerful indexing to make querying such documents fast
Documents map naturally to object-oriented programming, which makes the flow of data between your app and the persistence layer easier
Expressive query languages built for documents enable ad hoc queries and aggregations across documents
Together, these capabilities help developers build applications faster and iterate quickly
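Because DocumentDB is MongoDB-compatible, existing MongoDB drivers work against it; the sketch below uses pymongo with a placeholder connection string to show the document model, an index on a nested field, and an ad hoc query.

```python
# Sketch: working with JSON-like documents via pymongo.
# The connection string is a placeholder; a DocumentDB cluster would normally
# also need TLS options and credentials.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")  # replace with your cluster endpoint
profiles = client["app"]["profiles"]

# Documents are first-class objects: nested, semi-structured, no fixed schema
profiles.insert_one({
    "user": "joe",
    "preferences": {"theme": "dark", "locale": "de-DE"},
    "orders": [{"sku": "A-1", "qty": 2}],
})

# Index a nested field, then run an ad hoc query against it
profiles.create_index("preferences.locale")
for doc in profiles.find({"preferences.locale": "de-DE"}):
    print(doc["user"])
```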
Amazon ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory data store and cache in the cloud. The service improves the performance of web applications by allowing you to retrieve information from fast, managed, in-memory data stores, instead of relying entirely on slower disk-based databases. Amazon ElastiCache supports two open-source in-memory engines:
Redis - a fast, open source, in-memory data store and cache. Amazon ElastiCache for Redis is a Redis-compatible in-memory service that delivers the ease-of-use and power of Redis along with the availability, reliability and performance suitable for the most demanding applications. Both single-node and up to 15-shard clusters are available, enabling scalability to up to 6.1 TiB of in-memory data. ElastiCache for Redis is fully managed, scalable, and secure - making it an ideal candidate to power high-performance use cases such as Web, Mobile Apps, Gaming, Ad-Tech, and IoT.
Memcached - a widely adopted memory object caching system. Amazon ElastiCache for Memcached is protocol compliant with Memcached, so popular tools that you use today with existing Memcached environments will work seamlessly with the service. ElastiCache for Memcached is suitable for caching use cases where performance and concurrency are important.
Key benefits of Amazon ElastiCache include:
Redis and Memcached Compatible
With Amazon ElastiCache, you get native access to Redis or Memcached in-memory environments. This enables compatibility with your existing tools and applications.
Extreme Performance
Amazon ElastiCache works as an in-memory data store and cache to support the most demanding applications requiring sub-millisecond response times. By utilizing an end-to-end optimized stack running on customer-dedicated nodes, Amazon ElastiCache provides you with secure, blazing-fast performance.
Fully Managed
You no longer need to perform management tasks such as hardware provisioning, software patching, setup, configuration, monitoring, failure recovery, and backups. ElastiCache continuously monitors your clusters to keep your workloads up and running so that you can focus on higher value application development.
Easily Scalable
Amazon ElastiCache can scale out, scale in, and scale up to meet fluctuating application demands. Write and memory scaling is supported with sharding. Replicas provide read scaling.
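Here is a minimal cache-aside sketch with the open-source redis-py client, assuming a hypothetical ElastiCache for Redis endpoint and a placeholder database loader.

```python
# Cache-aside sketch with redis-py; the endpoint and loader function are placeholders.
import json
import redis

r = redis.Redis(host="localhost", port=6379)  # replace with your ElastiCache endpoint

def load_from_database(user_id):
    # Placeholder for the slower, disk-based lookup mentioned above
    return {"id": user_id, "name": "example"}

def get_user(user_id, ttl_seconds=300):
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)               # fast, in-memory hit
    user = load_from_database(user_id)          # slow path
    r.setex(key, ttl_seconds, json.dumps(user)) # populate cache with a TTL
    return user

print(get_user(42))
```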
Amazon Neptune is a fast, reliable, fully-managed graph database service that makes it easy to build and run applications that work with highly connected datasets. The core of Amazon Neptune is a purpose-built, high-performance graph database engine optimized for storing billions of relationships and querying the graph with milliseconds latency. Amazon Neptune supports popular graph models Property Graph and W3C's RDF, and their respective query languages Apache TinkerPop Gremlin and SPARQL, allowing you to easily build queries that efficiently navigate highly connected datasets. Neptune powers graph use cases such as recommendation engines, fraud detection, knowledge graphs, drug discovery, and network security.
Neptune supports both leading graph models, the Property Graph model and W3C’s Resource Description Framework (RDF), which means customers can choose the best model for their application needs. Developers like Property Graphs because the model is somewhat familiar from relational databases, and they like the open-source Apache TinkerPop Gremlin traversal language since it provides a way to quickly traverse property graphs. Developers also like RDF because it provides flexibility for modeling complex information, and there are many existing public-domain data sets available in RDF, including Wikidata and PubChem, a database of chemical molecules.
So with Neptune being fast and scalable, Developers can create graph applications that store billions of relationships and query the graph with milliseconds latency.
Neptune is reliable, offering greater than 99.99 percent availability. We store six copies of your data across three Availability Zones and back up your data to Amazon S3. And if there is an instance failover, it typically takes less than 30 seconds.
By using graph query languages like Gremlin or SPARQL, developers can easily execute powerful queries that are easy to write and perform well on connected data, instead of having to write complex SQL queries that are difficult to tune for performance.
So for diverse use cases like social networks to fraud detection to knowledge graphs to drug discovery, with Neptune you can select the best model for your application.
There are two primary graph models that are widely-used. A property graph is a common name for an attributed, multi-relational graph, for those who paid attention in math class. The leading Property Graph API is the open standard Apache TinkerPop™ project. It provides an imperative traversal language, called Gremlin, that can be used to write traversals on property graphs, and it is supported by a number of open source and vendor implementations. Customers like Property Graphs as they are familiar to developers that are used to relational models, and they like the open source Apache TinkerPop Gremlin traversal language as it provides a way to quickly traverse property graphs.
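As a sketch of what a property-graph traversal looks like in practice, here is a hypothetical "customers who bought this also bought" query written with gremlinpython; the Neptune endpoint, vertex labels, and property names are placeholders.

```python
# Sketch: a recommendation-style traversal with Apache TinkerPop Gremlin
# (gremlinpython). The endpoint and the labels ("person", "purchased", ...)
# are placeholders, not a real schema.
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

conn = DriverRemoteConnection("wss://your-neptune-endpoint:8182/gremlin", "g")
g = traversal().withRemote(conn)

# "People who bought what Joe bought also bought..."
recommendations = (
    g.V().has("person", "name", "joe")
        .out("purchased")   # products Joe bought
        .in_("purchased")   # other buyers of those products
        .out("purchased")   # what those buyers purchased
        .dedup()
        .values("name")
        .limit(5)
        .toList()
)
print(recommendations)
conn.close()
```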
The second is the Resource Description Framework, or RDF, standardized by the W3C in a set of standards collectively known as the Semantic Web. The SPARQL query language for RDF allows users to express declarative graph queries against RDF graph models. The RDF model is also a labeled, directed multi-graph, but it uses the concept of triples, subject, predicate, and object, to encode the graph. Customers like RDF as it provides flexibility for modeling complex information domains and there are a number of existing public domain data sets available in RDF including Wikidata and PubChem, a database of chemical molecules.
RDF data is provided in “triples”
Amazon Timestream is a purpose-built, time-series database designed specifically for collecting, storing, and analyzing time series data.
At the very core of Amazon Timestream, time isn’t just an attribute, but rather the single primary axis of the data model. This allows for simplification and specialization across the database.
1/ 1,000X faster, and at 1/10th the cost of relational databases - Amazon Timestream can collect fast moving time-series data from multiple sources at the rate of millions of inserts per second (10M/second). Amazon Timestream organizes data by time intervals, reducing the amount of data that needs to be scanned to answer a query. Amazon Timestream also executes inserts and queries in separate processing tiers which eliminates resource contention and improves performance. [Note: This claim is based on internal benchmarks and applies to both queries and inserts. The service team is comfortable making the claim].
2/ Trillions of daily events– With its speed and purpose-built architecture, Amazon Timestream is capable of processing trillions of events daily. This opens up the door to more IoT devices, more sensor reads and ultimately a larger data set to make smarter decisions on time series and machine data. Amazon Timestream’s adaptive query processing engine and data retention policies adjust the query performance and storage capacity to maintain steady, predictable performance at the lowest possible cost as your data grows over time.
3/ Analytics optimized for time series data– Analyzing time-series data with Amazon Timestream is easy, with built-in functions for interpolation, smoothing, and approximation that can be used to identify trends, patterns, and anomalies. For example, a smart home device manufacturer can use Amazon Timestream to collect motion or temperature data from the device sensors in a home, interpolate that data to identify the time ranges without any motion in the home, and alert consumers to take actions such as turning off the lights or turning down the heat to save energy during times when no one is in the house.
4/ Serverless – With Amazon Timestream, there are no servers to manage. As your application needs change, Amazon Timestream automatically scales up or down to adjust capacity and performance. Amazon Timestream takes care of time-consuming tasks such as server provisioning, software patching, setup, and configuration so you can focus on building your applications. In addition, you can set policies to automate the retention and tiering of how data is stored, which can significantly reduce your manual effort, storage requirements, and cost.
Amazon Timestream is available for preview today.
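As a rough sketch of what working with Timestream looks like (the service was in preview at the time of this talk, so details may differ), the example below writes one record and runs a time-bounded query with boto3; the database, table, and dimension names are made up.

```python
# Rough sketch only: Timestream was in preview at the time of this talk, so the
# details may differ. Database, table, and dimension names are made up.
import time
import boto3

write = boto3.client("timestream-write")
query = boto3.client("timestream-query")

# Ingest one temperature reading, organized along the time axis
write.write_records(
    DatabaseName="smart_home",
    TableName="sensor_readings",
    Records=[{
        "Dimensions": [{"Name": "device_id", "Value": "thermostat-1"}],
        "MeasureName": "temperature",
        "MeasureValue": "21.5",
        "MeasureValueType": "DOUBLE",
        "Time": str(int(time.time() * 1000)),  # milliseconds since epoch
    }],
)

# Query the last 15 minutes of readings for that device
result = query.query(QueryString="""
    SELECT device_id, time, measure_value::double AS temperature
    FROM "smart_home"."sensor_readings"
    WHERE measure_name = 'temperature' AND time > ago(15m)
    ORDER BY time DESC
""")
print(result["Rows"])
```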
Ledgers are typically used to record a history of economic and financial activity in an organization. Many organizations build applications with ledger-like functionality because they want to maintain an accurate history of their applications' data, for example, tracking the history of credits and debits in banking transactions, verifying the data lineage of an insurance claim, or tracing the movement of an item in a supply chain network. Ledger applications are often implemented using custom audit tables or audit trails created in relational databases (a sketch of such a table follows below). However, building audit functionality with relational databases is time-consuming and prone to human error. It requires custom development, and since relational databases are not inherently immutable, any unintended changes to the data are hard to track and verify. Alternatively, blockchain frameworks, such as Hyperledger Fabric and Ethereum, can also be used as a ledger. However, this adds complexity, as you need to set up an entire blockchain network with multiple nodes, manage its infrastructure, and require the nodes to validate each transaction before it can be added to the ledger.
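To make "custom audit tables" concrete, here is a sketch of the kind of insert-only history table teams typically hand-roll in a relational database, using Python's built-in sqlite3; note that nothing in the database itself prevents this history from being changed later, which is exactly the gap described above.

```python
# Sketch of a hand-rolled, insert-only audit table in a relational database
# (sqlite3 here for portability). Nothing makes this table truly immutable,
# which is the gap that ledger databases aim to close.
import sqlite3
from datetime import datetime, timezone

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE account_audit (
        id        INTEGER PRIMARY KEY AUTOINCREMENT,
        account   TEXT NOT NULL,
        change    REAL NOT NULL,   -- credit (+) or debit (-)
        reason    TEXT,
        recorded  TEXT NOT NULL    -- UTC timestamp
    )
""")

def record(account, change, reason):
    """Append a row; application code never updates or deletes audit rows."""
    db.execute(
        "INSERT INTO account_audit (account, change, reason, recorded) VALUES (?, ?, ?, ?)",
        (account, change, reason, datetime.now(timezone.utc).isoformat()),
    )

record("acct-1", +100.0, "deposit")
record("acct-1", -30.0, "withdrawal")
for row in db.execute("SELECT * FROM account_audit ORDER BY id"):
    print(row)
```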
To help customers move their databases to the cloud, we built the AWS Database Migration Service (DMS).
DMS helps you move common commercial databases such as Oracle and SQL Server
Open source databases such as MySQL
and NoSQL databases such as MongoDB to AWS
When you think about this fundamental question
2 fundamental areas of focus emerge
Customers are migrating their workloads to AWS in record numbers
Verizon is migrating over 1,000 business-critical applications and database backend systems to AWS, several of which also include the migration of production databases to Amazon Aurora—AWS’s relational database engine that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases.
Wappa, a market leader in taxi expense management, available in 22 countries, migrated from their Oracle database to Amazon Aurora and improved their reporting time per user by 75 percent.
Trimble, a global leader in telematics solutions, had a significant investment in on-premises hardware running Oracle databases. Rather than refresh the hardware and renew their licenses, they opted to migrate the databases to Amazon RDS and project they will pay about 1/4th of what they were paying when managing their private infrastructure.
Thomas Publishing is using AWS to save hundreds of thousands of dollars in data center costs, quickly launch new websites by spinning up resources in one day instead of several weeks, and rapidly migrate critical Oracle applications to the cloud. The company links industrial buyers and suppliers by offering an array of information and services online and in print. Thomas runs its primary website on AWS, and worked with AWS Partner Apps Associates to migrate key content management applications to Amazon Aurora on RDS. Thomas Publishing was one of the first customers to use DMS to migrate from Oracle to Aurora.
Amazon.com have been moving off Oracle as fast as we can. Today 92% of Amazon’s Fulfillment Centers worldwide have migrated off Oracle to AWS Database services, mostly Aurora PostgreSQL (338 out of 368 FC’s). (Note: By the end of December 2018 Amazon.com will have migrated 170 (56%) of the Critical service databases to DynamoDB and 4316 (58%) of the non-Critical service databases to Aurora and RDS PostgreSQL in a single year).
Amazon’s DW: Amazon builds and operates thousands of micro-services to serve millions of customers. These include catalog browsing, order placement, transaction processing, delivery scheduling, video services, and Prime registration. Each service publishes datasets to Amazon’s massive analytics infrastructure, including over 50 petabytes of data and 75,000 tables, processing 600,000 user analytics jobs each day. They migrated this Data Warehouse from Oracle to AWS (S3, EMR, and Redshift). The new analytics infrastructure has one Data Lake with over 100 petabytes of data – almost twice the size of the previous Oracle Data Warehouse. Teams across Amazon are now using over 2700 Amazon Redshift or Amazon EMR clusters to process data from the Data Lake.
Samsung - Samsung Electronics migrated their Cassandra cluster to DynamoDB for their Samsung Cloud workload, saving 70%.
Intuit - Intuit is a business and financial software company that develops and sells financial, accounting, and tax-preparation software and services for small businesses, accountants, and individuals. Intuit migrated from Microsoft SQL Server to Amazon Redshift to reduce data-processing timelines and get insights to decision makers faster and more frequently. "With Amazon Redshift, our small team has handled a tenfold increase in data volumes while reducing the amount of time spent on system administration—freeing up time for value-add development." -Jason Rhoades, Systems Architect, Intuit
Equinox - Equinox Fitness Clubs transitioned from on-premises Data Warehouses and data marts in Teradata to a cloud-based, integrated data platform, built on AWS and Amazon Redshift. They went from static reports, redundant data, and inefficient data integration to a modern and flexible Data Lake and Data Warehouse architecture that delivers dynamic reports based on trusted data.
Eventbrite – Eventbrite’s mission is to bring the world together through live experiences. To achieve this goal, Eventbrite relies on data-driven decisions at every level. Previously they had Cloudera on-premises. With Amazon EMR, they are able to intelligently resize clusters and cut costs considerably. They only pay for what they use and reduce costs further by purchasing Reserved Instances and bidding on Spot Instances (saving > 80%).
We have >400,000 customer accounts using AWS database and analytics.
While the database and analytics markets have been around for a while, with many mature offerings for customers to choose from, we continue to see customers move to the cloud for a number of reasons, and our recent growth in the database market is evidence of how rapidly the landscape is changing.
In 2016, in the database market, Amazon grew an order of magnitude faster than all other leading commercial database vendors. And this is off an already sizeable revenue base.
We see this for a number of reasons
Customers move to the cloud to minimize time spent managing infrastructure
Customers are choosing the cloud and migrating more and more of their workloads to it. In the next 10 to 15 years, the majority of computing is going to be done in the cloud. In the fullness of time, very few companies will want to own their own data centers, manage infrastructure whether it is compute, storage, databases or analytics.
Customers move to the cloud for performance, scale, reliability, and cost
Increasingly, new applications need to be globally distributed, support millions of users and devices, work with petabytes of data, run 24/7, and be responsive.
As customers move to the cloud and to micro-service architectures, developers are increasingly the ones making technology decisions
As customers move from monolithic apps to micro-service architectures with loosely coupled components and DevOps cultures, developers are increasingly making decisions, as part of their application development lifecycle, about which frameworks and components they use.
Luckily, we have 2 customers here to speak to us today.
So does it make sense for CIOs to use both relational databases and graph databases? Or should they standardize across the enterprise on one or the other? Today it makes pragmatic sense to use both. Each model has its pros and cons; as the enterprise IT user typically has a wide set of problems to solve, there is no single database or database model that is best at everything.
How to choose
However, that’s not a very rigorous approach to delineating which is best. How can you tell when the situation is right for graph databases over RDBMS? Start by drawing the domain on a whiteboard. If your domain entities have relationships to other entities, and your queries rely on exploring those relationships, then a graph database is a great fit.
Developers find the whiteboard test very convenient to work with because of its adaptability. This ability to adapt is particularly useful as new information about the domain becomes known or changes in requirements cause the model to change.
https://sdtimes.com/databases/guest-view-relational-vs-graph-databases-use/
Better performance – highly connected data can cause a lot of joins, which generally are expensive. After more than 7 self/recursive joins, the RDBMS starts to get really slow compared to Neo4j/native graph databases.
Flexibility – graph data is not forced into a structure like a relational table, and attributes can be added and removed easily. This is especially useful for semi-structured data where a representation in relational database would result in lots of NULL column values.
Easier data modelling – since you are not bound by a predefined structure, you can also model your data easily.
SQL query pain – query syntax becomes complex and large as the joins increase.
https://dzone.com/articles/why-are-native-graph-databases-more-efficient-than
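To illustrate the join pain described above, the snippet below contrasts a three-hop "friend of a friend of a friend" question written as SQL self-joins with the equivalent graph traversal; all table, column, and label names are illustrative.

```python
# Illustrative only: the same 3-hop "friend of a friend of a friend" question.
# In SQL, each extra hop adds another self-join on the friendships table:
sql_three_hops = """
SELECT DISTINCT p3.name
FROM people p0
JOIN friendships f1 ON f1.person_id = p0.id
JOIN people p1      ON p1.id = f1.friend_id
JOIN friendships f2 ON f2.person_id = p1.id
JOIN people p2      ON p2.id = f2.friend_id
JOIN friendships f3 ON f3.person_id = p2.id
JOIN people p3      ON p3.id = f3.friend_id
WHERE p0.name = 'alice';
"""

# In a graph query language, traversal depth is just a repeated step
# (Gremlin shown; in Cypher it would be MATCH (p {name:'alice'})-[:FRIEND*3]->(f)):
gremlin_three_hops = (
    "g.V().has('person','name','alice')"
    ".out('friend').out('friend').out('friend')"
    ".dedup().values('name')"
)
print(sql_three_hops)
print(gremlin_three_hops)
```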
Example: Mona Lisa -> Person or Painting???
Graphs for integration of heterogeneous data
Flexible, resource centric data model, use of Linked Data techniques
Natural representation for a variety of data models
Asset Tracking
Problem: Medical staff spends significant time looking for hospital equipment, leading to overstocking and low employee productivity
Solution: Digital construction twin as the calculation, asset-type, and user-interface basis; different tracking technologies and asset data as input for the digital performance twin
Space utilization
Problem: Low efficiencies due to a lack of good occupancy and usage data
Solution: Digital construction twin as the calculation and user-interface basis; occupancy sensors based on different technologies as input for the digital performance twin