9. Graph Databases / 1 – Definition
A graph database is a database that uses graph structures with
nodes, edges, and properties to represent and store
information. General graph databases that can store any graph
are distinct from specialized graph databases such as
triplestores and network databases.
http://en.wikipedia.org/wiki/Graph_database
Confidential sones GmbH| 2/9/2011 1
10. Graph Databases / 2 – Property Graph Data Model
[Figure: vertex Alice (ID: 0, Age: 23) connected to vertex Bob (ID: 1, Age: 42) by the edge "communicates with" (Encrypted: true, Method: RSA)]
Extension of the graph data model
– Additional properties on vertices and edges
– Properties are key/value pairs (Age: 23)
– Keys are prescribed by the schema of the vertex type
A property graph is a directed, multi-relational graph
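The property graph model described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `Vertex`, `Edge` and `PropertyGraph` and the label `communicates_with` are hypothetical, and the sketch does not enforce a vertex-type schema for property keys.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Vertex:
    """A vertex with free-form key/value properties (hypothetical helper)."""
    vertex_id: int
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    """A directed, labeled edge that carries its own properties."""
    source: int
    target: int
    label: str
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Directed, multi-relational graph: several labeled edges may join the same pair."""

    def __init__(self) -> None:
        self.vertices: Dict[int, Vertex] = {}
        self.edges: List[Edge] = []

    def add_vertex(self, vertex_id: int, **properties: Any) -> Vertex:
        vertex = Vertex(vertex_id, dict(properties))
        self.vertices[vertex_id] = vertex
        return vertex

    def add_edge(self, source: int, target: int, label: str, **properties: Any) -> Edge:
        if source not in self.vertices or target not in self.vertices:
            raise KeyError("both endpoints must exist before they are linked")
        edge = Edge(source, target, label, dict(properties))
        self.edges.append(edge)
        return edge

    def out_edges(self, source: int, label: Optional[str] = None) -> List[Edge]:
        """Return the outgoing edges of a vertex, optionally filtered by label."""
        return [e for e in self.edges
                if e.source == source and (label is None or e.label == label)]


# The example from the slide: Alice communicates with Bob over an encrypted channel.
graph = PropertyGraph()
graph.add_vertex(0, Name="Alice", Age=23)
graph.add_vertex(1, Name="Bob", Age=42)
graph.add_edge(0, 1, "communicates_with", Encrypted=True, Method="RSA")
```

Because the edge is directed, `graph.out_edges(1)` is empty until a second edge from Bob to Alice is added.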
11. Graph Databases / 3 – Property Graph Data Model
[Figure: property graph with the vertices Alice (ID: 0, Age: 23), Bob (ID: 1, Age: 42), Carol (ID: 3, Age: 18), TU Ilmenau and Uni Stuttgart; Alice and Bob are linked by the edge "communicates with" (Encrypted: true, Method: RSA)]
12. Drawbacks of the Relational Model / 1 – Inflexible Data Storage
Schema changes are very costly
– ALTER TABLE
No semi-structured / unstructured data
– e.g. XML, JSON, …
No list attributes
– List<String>, Set<Integer>, Set<User>
No simple way to version records
13. Drawbacks of the Relational Model / 2 – Foreign Key Constraints
1:n and n:m relationships can only be represented through additional mapping tables
– No explicit concept for relationships
Tables are connected through computationally expensive JOIN statements
No recursive JOINs

User                    Communicates with
ID  Name   Age          User_ID1  User_ID2
0   Alice  23           0         1
1   Bob    42           1         0
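To make the mapping-table point concrete, the two tables above can be rebuilt in an in-memory SQLite database. This is a minimal sketch, not part of the original deck; the lowercase table and column names are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    -- The relationship has no representation of its own; it lives in a mapping table.
    CREATE TABLE communicates_with (
        user_id1 INTEGER REFERENCES user(id),
        user_id2 INTEGER REFERENCES user(id)
    );
    INSERT INTO user VALUES (0, 'Alice', 23), (1, 'Bob', 42);
    INSERT INTO communicates_with VALUES (0, 1), (1, 0);
""")

# Answering "who does Alice communicate with?" takes two JOINs through the mapping table.
rows = conn.execute("""
    SELECT receiver.name
    FROM user AS sender
    JOIN communicates_with AS c ON c.user_id1 = sender.id
    JOIN user AS receiver ON receiver.id = c.user_id2
    WHERE sender.name = ?
""", ("Alice",)).fetchall()

partners = [name for (name,) in rows]
```

Each additional hop (partners of partners, and so on) adds another pair of JOINs to the query.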
14. Drawbacks of the Relational Model / 3 – Scalability
No explicit means of scaling and partitioning within the relational model
No JOINs across distributed database systems / vendors
No built-in support for current state-of-the-art web technologies
– HTTP/REST, Hypermedia, Semantic Web
15. Advantages of Graph Databases / 1 – Abstraction
Explicit data model
Direct mapping of real-world network structures
Better understanding of the application domain
16. Advantages of Graph Databases / 2 – Index-Free Adjacency
Relationships are modeled as edges directly on the vertex
– No additional mapping required
No indices required for relationships
– High performance when querying the graph
– Every vertex acts as a "mini index"
Data locality / efficient storage
– Adjacent vertices can be persisted "close to each other"
Performance is independent of the size of the graph
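Index-free adjacency can be illustrated with vertices that hold direct references to their neighbors. The sketch below is an in-memory illustration only, with a hypothetical `Node` class; it says nothing about how sones GraphDB stores edges on disk.

```python
class Node:
    """A vertex that keeps direct references to its neighbors."""

    def __init__(self, name):
        self.name = name
        self.outgoing = {}  # edge label -> list of neighboring Node objects

    def link(self, label, other):
        self.outgoing.setdefault(label, []).append(other)

    def neighbors(self, label):
        # One dictionary lookup on this node; no graph-wide index is consulted.
        return self.outgoing.get(label, [])


alice, bob, carol = Node("Alice"), Node("Bob"), Node("Carol")
alice.link("communicates_with", bob)
bob.link("communicates_with", alice)
bob.link("communicates_with", carol)

# Two hops from Alice: follow references twice instead of joining tables twice.
two_hops = sorted({second.name
                   for first in alice.neighbors("communicates_with")
                   for second in first.neighbors("communicates_with")})
```

The cost of each hop depends on the number of edges at the current vertex, not on the total number of vertices, which is the sense in which each vertex acts as a "mini index".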
17. Advantages of Graph Databases / 3 – Traversal
One of the most important operations in graph databases
Partial or complete querying of a graph
– Builds a tree structure
Searching for vertices / edges with specific properties
– e.g. users with more than 500 friends (hubs)
Different traversal methods
– Breadth-first search, depth-first search, algorithmic traversal, random walks
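Two of the traversal methods named above, plus the hub search, can be sketched over a plain adjacency dictionary. The friendship data and the function names are made up for illustration.

```python
from collections import deque

# Hypothetical friendship graph: vertex -> list of adjacent vertices.
friends = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Alice", "Dave"],
    "Carol": ["Alice"],
    "Dave": ["Bob"],
}


def breadth_first(graph, start):
    """Visit vertices level by level and return them in visiting order."""
    visited, order, queue = {start}, [], deque([start])
    while queue:
        vertex = queue.popleft()
        order.append(vertex)
        for neighbor in graph[vertex]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order


def depth_first(graph, start, visited=None):
    """Follow each branch to its end before backtracking."""
    visited = [] if visited is None else visited
    visited.append(start)
    for neighbor in graph[start]:
        if neighbor not in visited:
            depth_first(graph, neighbor, visited)
    return visited


def hubs(graph, more_than):
    """Vertices with more than `more_than` adjacent vertices."""
    return [vertex for vertex, adjacent in graph.items() if len(adjacent) > more_than]
```

Starting from Alice, the breadth-first order reaches both direct friends before Dave, while the depth-first order follows Bob through to Dave before returning to Carol.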
18. Advantages of Graph Databases / 4 – Further Advantages
Support for state-of-the-art interfaces
Foundation for scientific analysis of real-world network structures
Additional indices for simple attributes or complex subgraphs
Designed for use in distributed systems / cloud offerings
19. Use Cases / 1 – Graph-Based Algorithms
Ranking websites in search engines – PageRank
Who-knows-whom in social networks – shortest path
Recommender systems – bipartite matching
Transport infrastructure – minimum spanning tree
Detecting hubs in networks – betweenness centrality
Transport plan optimization – maximum flow
…
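As one example from this list, PageRank can be approximated by simple power iteration. The sketch below is a textbook-style simplification, not how any particular search engine or sones GraphDB computes it; the four-page `web` graph, the damping factor of 0.85 and the iteration count are assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power iteration over a dict that maps each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a base share from the random-jump term.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: every link target also appears as a key.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(web)
```

Page C collects links from A, B and D and ends up with the highest rank, while D, which nobody links to, keeps only the base share.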
20. Use Cases / 2 – Business Use Cases
Web
– Click-path analysis – which paths do customers take through the portal?
Universal Data Access
– Central metadata repository – manage company data centrally. Link data from various editorial sources (images, articles, etc.)
eCommerce
– Recommendations – recommend the right products to the right customer at the right time (customer-specific advertising)
…
22. sones GraphDB / 2 – Property Hypergraph
[Figure: vertex Alice (ID = 1, Age = 21) owns the hyperedge "SET<Person> Friends" (hyperedge property SetMaxNumber: 12), which bundles two "Person Friend" edges: one to Bob (ID = 2, Age = 23, since: 2009/09/21) and one to Carol (ID = 3, Age = 20, since: 2010/04/11)]
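The structure in this figure, a hyperedge with its own properties that bundles several single edges, each with properties of its own, can be approximated as follows. The class names are hypothetical, and treating `SetMaxNumber` as an upper bound on the number of bundled edges is an assumption drawn from its name, not from sones documentation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Person:
    person_id: int
    name: str
    age: int


@dataclass
class SingleEdge:
    """One member edge inside a hyperedge, with its own properties."""
    target: Person
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class HyperEdge:
    """Bundles many edges from one source and carries properties of its own."""
    source: Person
    name: str
    properties: Dict[str, Any] = field(default_factory=dict)
    members: List[SingleEdge] = field(default_factory=list)

    def link(self, target: Person, **edge_properties: Any) -> None:
        # Assumption: SetMaxNumber caps how many edges the hyperedge may hold.
        limit = self.properties.get("SetMaxNumber")
        if limit is not None and len(self.members) >= limit:
            raise ValueError("hyperedge is full")
        self.members.append(SingleEdge(target, dict(edge_properties)))


alice = Person(1, "Alice", 21)
bob = Person(2, "Bob", 23)
carol = Person(3, "Carol", 20)

friends = HyperEdge(alice, "Friends", {"SetMaxNumber": 12})
friends.link(bob, since="2009/09/21")
friends.link(carol, since="2010/04/11")
```

Properties such as `since` belong to an individual friendship, whereas `SetMaxNumber` describes the set of friendships as a whole.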
23. sones GraphDB / 3 – Graph Query Language
// sones GQL Example
// define Vertex Type (QDL)
CREATE VERTEX User
ADD ATTRIBUTES (String Name, SET<User> Friends)
INDICES (Name)
// add vertices Alice and Bob (QML)
INSERT INTO User VALUES (Name = "Alice", Age = 23)
INSERT INTO User VALUES (Name = "Bob", Age = 42)
// add edges between Alice and Bob (QML)
LINK User(Name = 'Alice') VIA Friends TO User(Name = 'Bob')
LINK User(Name = 'Bob') VIA Friends TO User(Name = 'Alice')