For a long time, the architectures of analytics systems diverged sharply from modern software architectures. While recent years were dominated by talk of Domain-Driven Design and more and more monoliths were broken up into microservices, analytics platforms such as data lakes and data warehouses remained cumbersome. Now, however, data mesh and data products are being discussed here as well. So how much do these concepts still differ, and what is the difference between a microservice and a data product? Don't both call for similar frameworks and architecture patterns? This talk provides that classification and presents an example of how the two worlds can be integrated even more closely within a company, creating real added value for your IT landscape.
Self-service, including free choice of tooling (ETL tool, etc.)
Analytics is often not taken into account in OLTP systems
Source: https://www.datamesh-architecture.com/
The domain ownership principle mandates the domain teams to take responsibility for their data.
According to this principle, analytical data should be composed around domains, similar to the team boundaries aligning with the system’s bounded context.
Following the domain-driven distributed architecture, analytical and operational data ownership is moved to the domain teams, away from the central data team.
The data as a product principle projects a product thinking philosophy onto analytical data.
This principle means that there are consumers for the data beyond the domain.
The domain team is responsible for satisfying the needs of other domains by providing high-quality data.
Basically, domain data should be treated as any other public API.
The idea behind the self-serve data infrastructure platform is to adopt platform thinking to data infrastructure.
A dedicated data platform team provides domain-agnostic functionality, tools, and systems to build, execute, and maintain interoperable data products for all domains.
With its platform, the data platform team enables domain teams to seamlessly consume and create data products.
The federated governance principle achieves interoperability of all data products through standardization, which is promoted through the whole data mesh by the governance group.
The main goal of federated governance is to create a data ecosystem with adherence to the organizational rules and industry regulations.
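One way to picture "interoperability through standardization" is as a set of global rules that are checked automatically against every data product's metadata. The following is only a minimal sketch under assumptions: the descriptor fields (`owner_domain`, `description`, `output_format`), the list of approved formats, and the function `check_policies` are all hypothetical and not taken from the source.

```python
# Hypothetical global policies agreed on by the governance group.
REQUIRED_FIELDS = ("owner_domain", "description", "output_format")
ALLOWED_FORMATS = {"parquet", "delta", "kafka", "bigquery"}


def check_policies(descriptor: dict) -> list:
    """Return a list of policy findings for one data product descriptor."""
    findings = []
    # Every data product must declare the agreed metadata fields.
    for field_name in REQUIRED_FIELDS:
        if not descriptor.get(field_name):
            findings.append(f"missing required field: {field_name}")
    # Only formats approved for the whole mesh are allowed.
    fmt = descriptor.get("output_format")
    if fmt and fmt not in ALLOWED_FORMATS:
        findings.append(f"format not approved: {fmt}")
    return findings


compliant = {
    "owner_domain": "sales",
    "description": "Customer master data",
    "output_format": "parquet",
}
non_compliant = {"owner_domain": "sales", "output_format": "csv"}

print(check_policies(compliant))
print(check_policies(non_compliant))
```

In practice the domain teams would still own their data products. A check like this would only enforce the few rules that have to hold across the whole mesh.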
Examples of data products:
View in a database
API service
Kafka topic
Source: https://www.datamesh-architecture.com/
A data product is a logical unit that contains all components to process and store domain data for analytical or data-intensive use cases and makes them available to other teams via output ports. You can think of a module or microservice, but for analytical data.
Data products connect to sources, such as operational systems or other data products and perform data transformation. Data products serve data sets in one or many output ports. Output ports are typically structured data sets, as defined by a data contract. Some examples:
A BigQuery dataset with multiple related tables
Parquet files in an AWS S3 bucket
Delta files in Azure Data Lake Storage Gen2
Messages in a Kafka topic
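The comparison with a microservice can be made concrete by modelling a data product the way one would model a service with a public interface. The sketch below is an illustration under assumptions, not a reference implementation: the classes `DataContract`, `OutputPort` and `DataProduct`, the `publish` method, and the example column names are all hypothetical. The contract is reduced to a mapping from column name to expected Python type.

```python
from dataclasses import dataclass, field


@dataclass
class DataContract:
    """Hypothetical contract: column name -> expected Python type."""
    columns: dict

    def violations(self, row: dict) -> list:
        """Return the contract violations found in a single row."""
        problems = []
        for name, expected in self.columns.items():
            if name not in row:
                problems.append(f"missing column: {name}")
            elif not isinstance(row[name], expected):
                problems.append(f"wrong type for {name}")
        return problems


@dataclass
class OutputPort:
    name: str        # e.g. "customers_v1"
    technology: str  # e.g. "kafka-topic", "parquet-on-s3"
    contract: DataContract


@dataclass
class DataProduct:
    domain: str
    name: str
    output_ports: list = field(default_factory=list)

    def publish(self, port_name: str, rows: list) -> list:
        """Serve rows on an output port only if they satisfy its contract."""
        port = next(p for p in self.output_ports if p.name == port_name)
        for row in rows:
            problems = port.contract.violations(row)
            if problems:
                raise ValueError(f"{port_name}: {problems}")
        return rows


contract = DataContract(columns={"customer_id": int, "country": str})
product = DataProduct(
    domain="sales",
    name="customers",
    output_ports=[OutputPort("customers_v1", "kafka-topic", contract)],
)
accepted = product.publish("customers_v1", [{"customer_id": 1, "country": "DE"}])
print(accepted)
```

As with a versioned service API, consumers here depend only on the output port and its contract, not on how the domain team transforms the data internally.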
Minute 25 max
Fabian
Too complex? Then insert a CORE table in between: a "360-degree customer view"
– Enterprise entity: customer