Hackolade Tutorial - part 3 - Query-driven data modeling based on access patterns

Hackolade Tutorial
Part 3 - Query-driven data modeling based on access patterns
Copyright © 2016-2023 Hackolade

The rules of data modeling
• Data modeling for RDBMSs uses the rules of normalization

• NoSQL databases are completely different
• Different data models
• Different sizing parameters and capabilities
• True “horizontal” scalability in infinitely distributed systems
• Different transactional capabilities
• Immediate vs. eventual consistency
• ACID vs. BASE
• Different use cases
• NoSQL requires a mindshift in schema design, adhering to different
rules and parameters

• NoSQL advocates UNLEARNING the rules of normalization
• NoSQL allows you to aggregate information that belongs together
• Join the data “on write”, instead of (time and time again) “on read”

The NoSQL mindshift
• From APPLICATION-AGNOSTIC to APPLICATION-SPECIFIC data
modeling

The “Embedding” approach
Query-driven data modeling
• First define the queries (aka “access patterns”) for the
application,
• then store the data according to the query needs
• Ideally, a single database access should return all related,
joined-up information, by EMBEDDING the data into a single
atomic document
• As opposed to “referencing”: leveraging data stored elsewhere using foreign keys,
pulled in with joins
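The difference between embedding and referencing can be sketched with plain Python dictionaries standing in for collections. This is only an illustration: the order, customer, and item structures and the function names are hypothetical, and the "lookup" counter is a stand-in for round trips to a real database.

```python
# Hypothetical in-memory "collections"; a real document store would hold these.
customers = {"c1": {"name": "Ada", "city": "Brussels"}}
items = {"i1": {"sku": "A-100", "qty": 2}, "i2": {"sku": "B-200", "qty": 1}}

# Referencing: the order only stores keys pointing at data kept elsewhere.
orders_referenced = {"o1": {"customer_id": "c1", "item_ids": ["i1", "i2"]}}

# Embedding: the same information is joined once, at write time.
orders_embedded = {
    "o1": {
        "customer": {"name": "Ada", "city": "Brussels"},
        "items": [{"sku": "A-100", "qty": 2}, {"sku": "B-200", "qty": 1}],
    }
}


def read_referenced(order_id):
    """Join on read: returns the assembled order and the number of lookups."""
    order = orders_referenced[order_id]
    lookups = 1
    customer = customers[order["customer_id"]]
    lookups += 1
    order_items = []
    for item_id in order["item_ids"]:
        order_items.append(items[item_id])
        lookups += 1
    return {"customer": customer, "items": order_items}, lookups


def read_embedded(order_id):
    """Single access: the document already contains everything."""
    return orders_embedded[order_id], 1
```

Both functions return the same order view; the referenced version needs one lookup per related record, while the embedded version needs exactly one.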

Important factors for Query-driven Data Modeling
• Aggregate / Document size and transaction volume
• Cardinality of relationships
• Beware of unbounded arrays: consequences of unlimited growth!
• When embedding one-to-many relationships, one should estimate
the cardinality
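One way to make the cardinality estimate concrete is a rough size check before deciding to embed. The sketch below is an assumption-laden approximation: it measures JSON text rather than the database's actual storage format, the function names are hypothetical, and the 16 MB budget is used as an example because it is MongoDB's documented maximum BSON document size; other databases have different limits.

```python
import json

# Example budget: MongoDB caps a BSON document at 16 MB. Adjust for other stores.
MAX_DOC_BYTES = 16 * 1024 * 1024


def estimated_doc_bytes(base_doc, sample_element, cardinality):
    """Approximate size of a document with `cardinality` embedded elements.

    JSON length is only a proxy for the real on-disk size.
    """
    base = len(json.dumps(base_doc).encode("utf-8"))
    element = len(json.dumps(sample_element).encode("utf-8"))
    return base + cardinality * element


def should_embed(base_doc, sample_element, cardinality, budget=MAX_DOC_BYTES):
    """True if the estimated document stays within the size budget."""
    return estimated_doc_bytes(base_doc, sample_element, cardinality) <= budget


# Hypothetical blog post with embedded comments.
post = {"_id": "post1", "title": "Hello"}
comment = {"author": "someone", "text": "y" * 100}
```

With these sample values, `should_embed(post, comment, 10)` holds, while an unbounded array that reaches ten million comments would blow far past the budget and is a candidate for referencing instead.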

Important factors for Query-driven Data Modeling (continued)
• Referential integrity
• To ACID or not to ACID
• Embedding vs. ACID: documents are atomic units!
• Role of the application!
• Indexing impacts
• Polymorphic document designs can lead to a proliferation of
indexes
• Data duplication can be a good idea!
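When data is duplicated for read performance, keeping the copies consistent becomes the application's job. A minimal sketch of that responsibility follows, with hypothetical `customers` and `orders` collections held in memory; a real system would issue the equivalent updates against the database.

```python
# Master record for each customer (hypothetical data).
customers = {"c1": {"name": "Ada"}}

# Orders carry a duplicated copy of the customer name for fast reads.
orders = {
    "o1": {"customer_id": "c1", "customer_name": "Ada", "total": 40},
    "o2": {"customer_id": "c1", "customer_name": "Ada", "total": 15},
    "o3": {"customer_id": "c2", "customer_name": "Bob", "total": 99},
}


def rename_customer(customer_id, new_name):
    """Update the master record, then fan the change out to every copy.

    Returns the number of order documents that were touched.
    """
    customers[customer_id]["name"] = new_name
    touched = 0
    for order in orders.values():
        if order["customer_id"] == customer_id:
            order["customer_name"] = new_name
            touched += 1
    return touched
```

The trade-off is visible here: reads of an order never need a join, but each rename costs one write per duplicated copy, and nothing but the application enforces that the fan-out happens.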

Schema versioning and migration
• Schemas can often be evolved without interruption of
database operations.
• Handle with care!
• Especially when multiple applications / reporting & analytical tools access
the same DB!
• Transition periods & strategies matter!

Schema versioning and migration (continued)
Different strategies are used:
• Eager: first migrate data, then application
• Does not leverage the benefits of JSON
• Lazy: only update the document when used
• Some documents will never be migrated!
• Incremental: migrate during periods of lower load!
• Predictive migration: based on heuristics/estimates
• Also combinations of strategies: predictive migration
first, followed by incremental
• Endless versioning: not desirable!
• See the entire chapter on this topic in the book MongoDB Data
Modeling & Schema Design
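The lazy strategy can be sketched as a read path that upgrades a document only when it is accessed. Everything below is illustrative: the `schema_version` field, the two migration steps, and the user fields are hypothetical, and the dictionary `store` stands in for a collection.

```python
CURRENT_VERSION = 3


def v1_to_v2(doc):
    """Hypothetical step: split a single 'name' into first and last name."""
    first, _, last = doc.pop("name").partition(" ")
    doc["first_name"], doc["last_name"] = first, last
    return doc


def v2_to_v3(doc):
    """Hypothetical step: turn a single 'email' into a list of emails."""
    doc["emails"] = [doc.pop("email")] if "email" in doc else []
    return doc


MIGRATIONS = {1: v1_to_v2, 2: v2_to_v3}


def read_lazily(store, doc_id):
    """Return the document in the current shape, migrating it on first use."""
    doc = store[doc_id]
    version = doc.get("schema_version", 1)  # unversioned documents count as v1
    while version < CURRENT_VERSION:
        doc = MIGRATIONS[version](doc)
        version += 1
        doc["schema_version"] = version
    store[doc_id] = doc  # write the upgraded document back
    return doc


store = {
    "u1": {"name": "Ada Lovelace", "email": "ada@example.com"},
    "u2": {"name": "Alan Turing", "email": "alan@example.com"},
}
```

After `read_lazily(store, "u1")`, document `u1` is at version 3, while `u2` stays in its original shape until somebody reads it, which is exactly why some documents may never be migrated under this strategy.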

Backward- and forward-compatibility
• No database is an island: many systems interacting with it
• Avoid the introduction of breaking changes: huge impacts on
agility and costs.
• Think through each evolution: use the features of schema standards
(JSON Schema, Avro, etc.) to achieve full compatibility of schemas.
• Consumers and producers can upgrade at their own pace!
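A small sketch of what compatibility in both directions means in practice, using two hypothetical consumers of a user document. The field names are made up for illustration; the idea of giving an added field a default mirrors how schema standards such as Avro handle added fields, but this code does not use any schema library.

```python
def read_v1(doc):
    """Old consumer: knows only 'id' and 'name' and ignores anything else.

    Ignoring unknown fields is what lets it keep working on newer documents
    (forward compatibility).
    """
    return {"id": doc["id"], "name": doc["name"]}


def read_v2(doc):
    """New consumer: also reads the optional 'nickname' field.

    Supplying a default when the field is absent is what lets it read
    older documents (backward compatibility).
    """
    return {
        "id": doc["id"],
        "name": doc["name"],
        "nickname": doc.get("nickname", ""),
    }


old_doc = {"id": 1, "name": "Ada"}
new_doc = {"id": 2, "name": "Alan", "nickname": "Prof"}
```

Because the change was additive and optional, either consumer can read either document, so producers and consumers do not have to be upgraded in lockstep. Renaming or removing `name`, by contrast, would be a breaking change for both.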

Choice of partition / sharding keys
• NoSQL databases offer horizontal scaling
• Through distribution of data across servers, data centers, and
geographies
• Requires careful design
• Find a scalable way to facilitate efficient retrieval of information
when serving queries, from a minimal number of shards
• Ideally, a query should hit a single shard!
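The effect of the shard key on query routing can be illustrated with a toy hash-partitioned store. This is a simplified sketch, not how any particular database routes queries: the number of shards, the choice of `customer_id` as shard key, and all function names are assumptions for the example.

```python
import hashlib

NUM_SHARDS = 4  # assumed cluster size for the example

# Each shard maps a customer_id to that customer's list of orders.
shards = [dict() for _ in range(NUM_SHARDS)]


def shard_for(key):
    """Deterministically map a shard-key value to a shard index."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def insert(order):
    """Route the order to a shard based on its customer_id (the shard key)."""
    shard = shards[shard_for(order["customer_id"])]
    shard.setdefault(order["customer_id"], []).append(order)


def orders_for_customer(customer_id):
    """Query on the shard key: exactly one shard is consulted."""
    shard = shards[shard_for(customer_id)]
    return shard.get(customer_id, []), 1


def orders_with_status(status):
    """Query on another field: every shard must be scanned (scatter-gather)."""
    hits = []
    for shard in shards:
        for customer_orders in shard.values():
            hits.extend(o for o in customer_orders if o["status"] == status)
    return hits, len(shards)


insert({"customer_id": "c1", "status": "open"})
insert({"customer_id": "c1", "status": "shipped"})
insert({"customer_id": "c2", "status": "open"})
```

If the dominant access pattern is "all orders for a customer", this key keeps that query on one shard. If the dominant pattern were "all open orders", the same key would force every query to touch all shards, which is the situation the design should avoid.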

Facilitating communication and collaboration
• The purpose of all Data Modeling!!!
• “Looking at the code” vs. sharing an ERD picture
• Data modeling and schema design for NoSQL databases and data
formats provide some guardrails in the face of the unlimited flexibility
and power of NoSQL.

Reading material
• See Hackolade online documentation
• The Hackolade Blog
• This excellent new book:
MongoDB Data Modeling & Schema Design
• Many of the principles in the book are related to query-driven
modeling based on access patterns!
• Hackolade on social media: LinkedIn page, Twitter page
• Download Hackolade studio for free

Questions?
Answers!
