Graph databases and Python
Maksym Klymyshyn
CTO @ GVMachines Inc. (zakaz.ua)
What’s inside?
‣ PostgreSQL
‣ Neo4j
‣ ArangoDB
Python Frameworks
‣ Bulbflow
‣ py2neo
‣ NetworkX
‣ Arango-python
Relational to Graph model
crash course
“Switching from relational to the graph model”
by Luca Garulli

http://goo.gl/z08q...
My motivation is quite
simple:
“The best material model of a cat is another, or
preferably the same, cat.”

–Norbert Wiener
Good old Postgres
create table nodes (
    node integer primary key,
    name varchar(10) not null,
    feat1 char(1), feat2 char(1));

create table e...
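The nodes/edges schema from this slide can be tried without a Postgres server. The following is a minimal sketch that swaps in Python's built-in sqlite3 module (an assumption for portability, not what the talk used). It keeps the two tables, the no-self-loops check and the two neighbour queries, and leaves out the feat1/feat2 columns and the LEAST/GREATEST unique index. The helper names outbound_neighbours and undirected_neighbours are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption, see above).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table nodes (
        node integer primary key,
        name varchar(10) not null);
    create table edges (
        a integer not null references nodes(node),
        b integer not null references nodes(node),
        primary key (a, b),
        check (a <> b));  -- no self-loops
""")
conn.executemany("insert into nodes values (?, ?)",
                 [(i, "node%d" % i) for i in range(1, 8)])
# Same edge list as on the slide.
conn.executemany("insert into edges values (?, ?)",
                 [(1, 3), (2, 1), (2, 4), (3, 4), (3, 5),
                  (3, 6), (4, 7), (5, 1), (5, 6), (6, 1)])


def outbound_neighbours(node):
    """Directed view: follow edges a -> b starting at `node`."""
    rows = conn.execute(
        "select b from edges where a = ? order by b", (node,))
    return [r[0] for r in rows]


def undirected_neighbours(node):
    """Undirected view: take the other endpoint of every edge touching `node`."""
    rows = conn.execute(
        "select case when a = ? then b else a end as other "
        "from edges where ? in (a, b) order by other", (node, node))
    return [r[0] for r in rows]


print(outbound_neighbours(2))
print(undirected_neighbours(1))
```

The undirected query is the same CASE trick as on the slide: whichever column does not hold the node of interest is the neighbour.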
I'm from Odessa,
I just drink.
Neo4j
Most famous graph database.
• 1,333 mentions within repositories on GitHub
• 1,140,000 results in Google
• 26,868 twe...
A lot of Python libraries

Py2Neo, Neomodel, neo4django, bulbflow
// Create node1, node2 and
// a RELATED relationship between the two nodes
CREATE (node1 {name:"node1"}),
       (node2 {name: "node2"}),
       (...
Neo4j is friendly and powerful.
The only drawback is its somewhat complex
query language, Cypher.
py2neo nodes
from py2neo import neo4j, node, rel

graph_db = neo4j.GraphDatabaseService(
    "http://localhost:7474/db/data...
py2neo paths
from py2neo import neo4j, node

graph_db = neo4j.GraphDatabaseService(
    "http://localhost:7474/db/data/")
al...
Alice KNOWS Bob KNOWS Carol
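A path written as "Alice KNOWS Bob KNOWS Carol" alternates nodes and relationship types. As a rough illustration in plain Python (this is not the py2neo API, and split_path is a hypothetical helper), such a sequence can be taken apart into its nodes and its (start, relationship, end) triples:

```python
def split_path(*items):
    """Split node, rel, node, rel, node ... into nodes and (start, rel, end) triples."""
    if len(items) % 2 == 0:
        raise ValueError("a path alternates nodes and relationships, ending on a node")
    nodes = list(items[0::2])   # every even position is a node
    rels = list(items[1::2])    # every odd position is a relationship type
    triples = [(nodes[i], rels[i], nodes[i + 1]) for i in range(len(rels))]
    return nodes, triples


nodes, triples = split_path("Alice", "KNOWS", "Bob", "KNOWS", "Carol")
print(nodes)
print(triples)
```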
bulbflow framework

from bulbs.neo4jserver import Graph
g = Graph()
james = g.vertices.create(name="James")
julie = g.verti...
FlockDB
OrientDB
InfoGrid
HyperGraphDB

WAT?
ArangoDB
“In any investment, you expect to have fun and
make profit.”

–Michael Jordan
I’m the developer of a Python driver for ArangoDB
• NoSQL database storage
• Graph of documents
• AQL (ArangoDB Query Language) to execute graph queries
• Edge data ty...
Small experiment with graphs and Twitter:
I looked at my tweets and the people who added them
to favorites.
After that I’ve ...
1-level depth
2-level depth
3-level depth
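The 1-, 2- and 3-level depths correspond to how many hops the experiment follows from the starting account. A minimal sketch of such a depth-limited walk, assuming the "who favorited whose tweets" data is already available as a plain dict (the crawl function, the sample data and the edge direction are all hypothetical; the talk's actual scraper is linked at the end):

```python
from collections import deque


def crawl(start, favoriters, max_depth):
    """Breadth-first walk; returns (user, fan) edges reachable within max_depth hops."""
    seen = {start}
    edges = []
    queue = deque([(start, 0)])
    while queue:
        user, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand users beyond the requested depth
        for fan in favoriters.get(user, []):
            edges.append((user, fan))
            if fan not in seen:
                seen.add(fan)
                queue.append((fan, depth + 1))
    return edges


# Hypothetical data: user -> people who favorited that user's tweets.
sample = {"me": ["ann", "bob"], "ann": ["cat"], "cat": ["dan"]}

for depth in (1, 2, 3):
    print(depth, crawl("me", sample, depth))
```

Each extra level only expands the users discovered at the previous level, which is why the graph grows so quickly between the three pictures.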
Code behind
from arango import create

arango = create(db="tweets_maxmaxmaxmax")
arango.database.create()
arango.tweets....
Here we create an edge from from_doc to to_doc

from_doc = arango.tweets.documents.create({})
to_doc = arango.tweets.docu...
Full example

• Sample dataset with 10 users
• Relations between users
• Visualise within admin interface
Sample dataset
from arango import create

def dataset(a):
    a.database.create()
    a.users.create()
    a.knows.create(type=a.COLL...
Relations between users
def relations(a):
    rels = (
        (0, 1), (0, 2), (2, 3), (4, 3), (3, 5),
        (5, 1), (0, 5), (5, 6), (6, 7),...
Relations between users
users/2744664487->users/2744926631:
users/2744664487->users/2745123239:
users/2745123239->users/27...
AQL, getting paths
FOR p IN PATHS(users, knows, 'outbound')
FILTER p.source.name == 'user_5'
RETURN p.vertices[*].name

fr...
Paths output
['user_5']
['user_5', 'user_1']
['user_5', 'user_6']
['user_5', 'user_6', 'user_7']
['user_5', 'user_6', 'us...
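To see where these paths come from without a running ArangoDB, the outbound traversal can be approximated in plain Python over the same "knows" pairs used in the relations slide. This is only a sketch of the idea, not how PATHS is implemented; outbound_paths is a hypothetical helper, and the cycle guard is an added assumption (the sample graph has no cycles).

```python
# The (from, to) pairs from the "Relations between users" slide.
RELS = ((0, 1), (0, 2), (2, 3), (4, 3), (3, 5),
        (5, 1), (0, 5), (5, 6), (6, 7), (7, 8), (9, 8))


def outbound_paths(start, rels=RELS):
    """Yield every outbound path that begins at `start`, depth-first."""
    def walk(path):
        yield path
        for src, dst in rels:
            # Follow edges leaving the last vertex; skip vertices already on the path.
            if src == path[-1] and dst not in path:
                yield from walk(path + [dst])
    return walk([start])


paths = [["user_{}".format(n) for n in p] for p in outbound_paths(5)]
for p in paths:
    print(p)
```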
Links
• Arango paths: http://goo.gl/n2L3SK
• Neo4j: http://goo.gl/au5y9I
• Scraper: http://goo.gl/nvMFGk
• Visuali...
Thanks. Q’s?
@maxmaxmaxmax
Odessapy2013 - Graph databases and Python
Page 10, "Я из Одессы, я просто бухаю.", translates as "I'm from Odessa, I just drink", meaning he drinks a lot of vodka ^_^ (@tuc @hackernews)
This is a local meme, used when someone asks a question and you would look stupid because you don't have an answer.

Odessapy2013 - Graph databases and Python

  1. 1. Graph databases and Python. Maksym Klymyshyn, CTO @ GVMachines Inc. (zakaz.ua)
  2. 2. What’s inside? ‣ PostgreSQL ‣ Neo4j ‣ ArangoDB
  3. 3. Python Frameworks ‣ Bulbflow ‣ py2neo ‣ NetworkX ‣ Arango-python
  4. 4. Relational to Graph model crash course: “Switching from relational to the graph model” by Luca Garulli, http://goo.gl/z08qwk, http://www.slideshare.net/lvca/switching-from-relational-to-the-graph-model
  5. 5. My motivation is quite simple:
  6. 6. “The best material model of a cat is another, or preferably the same, cat.” –Norbert Wiener
  7. 7. Good old Postgres
  8. 8.
      create table nodes (
          node integer primary key,
          name varchar(10) not null,
          feat1 char(1), feat2 char(1));

      create table edges (
          a integer not null references nodes(node)
              on update cascade on delete cascade,
          b integer not null references nodes(node)
              on update cascade on delete cascade,
          primary key (a, b));

      create index a_idx on edges(a);
      create index b_idx on edges(b);

      create unique index pair_unique_idx
          on edges (LEAST(a, b), GREATEST(a, b));

      -- and no self-loops
      alter table edges add constraint no_self_loops_chk check (a <> b);

      insert into nodes values (1, 'node1', 'x', 'y');
      insert into nodes values (2, 'node2', 'x', 'w');
      insert into nodes values (3, 'node3', 'x', 'w');
      insert into nodes values (4, 'node4', 'z', 'w');
      insert into nodes values (5, 'node5', 'x', 'y');
      insert into nodes values (6, 'node6', 'x', 'z');
      insert into nodes values (7, 'node7', 'x', 'y');

      insert into edges values
          (1, 3), (2, 1), (2, 4), (3, 4), (3, 5),
          (3, 6), (4, 7), (5, 1), (5, 6), (6, 1);

      -- directed graph
      select * from nodes n
          left join edges e on n.node = e.b
          where e.a = 2;

      -- undirected graph
      select * from nodes where node in (
          select case when a=1 then b else a end
          from edges where 1 in (a,b));
  9. 9. I'm from Odessa, I just drink.
  10. 10. Neo4j
  11. 11. Most famous graph database. • 1,333 mentions within repositories on Github • 1,140,000 results in Google • 26,868 tweets • Really nice Admin interface • Awesome help tips
  12. 12. A lot of python libraries Py2Neo, Neomodel, neo4django, bulbflow
  13. 13.
      // Create node1, node2 and
      // a RELATED relationship between the two nodes
      CREATE (node1 {name:"node1"}),
             (node2 {name: "node2"}),
             (node1)-[:RELATED]->(node2);
  14. 14. Neo4j is friendly and powerful. The only drawback is its somewhat complex query language, Cypher.
  15. 15. py2neo nodes
      from py2neo import neo4j, node, rel

      graph_db = neo4j.GraphDatabaseService(
          "http://localhost:7474/db/data/")

      die_hard = graph_db.create(
          node(name="Bruce Willis"),
          node(name="John McClane"),
          node(name="Alan Rickman"),
          node(name="Hans Gruber"),
          node(name="Nakatomi Plaza"),
          rel(0, "PLAYS", 1),
          rel(2, "PLAYS", 3),
          rel(1, "VISITS", 4),
          rel(3, "STEALS_FROM", 4),
          rel(1, "KILLS", 3))
  16. 16. py2neo paths
      from py2neo import neo4j, node

      graph_db = neo4j.GraphDatabaseService(
          "http://localhost:7474/db/data/")
      alice, bob, carol = node(name="Alice"), node(name="Bob"), node(name="Carol")
      abc = neo4j.Path(
          alice, "KNOWS", bob, "KNOWS", carol)
      abc.create(graph_db)
      abc.nodes
      # [node(**{'name': 'Alice'}),
      #  node(**{'name': 'Bob'}),
      #  node(**{'name': 'Carol'})]
  17. 17. Alice KNOWS Bob KNOWS Carol
  18. 18. bulbflow framework
      from bulbs.neo4jserver import Graph
      g = Graph()
      james = g.vertices.create(name="James")
      julie = g.vertices.create(name="Julie")
      g.edges.create(james, "knows", julie)
  19. 19. FlockDB OrientDB InfoGrid HyperGraphDB WAT?
  20. 20. ArangoDB
  21. 21. “In any investment, you expect to have fun and make profit.” –Michael Jordan
  22. 22. I’m the developer of a Python driver for ArangoDB
  23. 23. • NoSQL database storage • Graph of documents • AQL (ArangoDB Query Language) to execute graph queries • Edge data type to create edges between nodes (with properties) • Multiple edge collections to keep different kinds of edges • Support for the Gremlin graph query language
  24. 24. Small experiment with graphs and Twitter: I looked at my tweets and the people who added them to favorites. After that I looked at those people's tweets and did the same thing with the people who favorited their tweets.
  25. 25. 1-level depth
  26. 26. 2-level depth
  27. 27. 3-level depth
  28. 28. Code behind
      from arango import create

      arango = create(db="tweets_maxmaxmaxmax")
      arango.database.create()
      arango.tweets.create()
      arango.tweets_edges.create(
          type=arango.COLLECTION_EDGES)
  29. 29. Here we create an edge from from_doc to to_doc

      from_doc = arango.tweets.documents.create({})
      to_doc = arango.tweets.documents.create({})
      arango.tweets_edges.edges.create(from_doc, to_doc)

      Getting edges for tweet 196297127

      query = db.tweets_edge.query.over(
          F.EDGES(
              "tweets_edges",
              ~V("tweets/196297127"),
              ~V("outbound")))
  30. 30. Full example • Sample dataset with 10 users • Relations between users • Visualise within admin interface
  31. 31. Sample dataset
      from arango import create

      def dataset(a):
          a.database.create()
          a.users.create()
          a.knows.create(type=a.COLLECTION_EDGES)

          for u in range(10):
              a.users.documents.create({
                  "name": "user_{}".format(u),
                  "age": u + 20,
                  "gender": u % 2 == 0})

      a = create(db="experiments")
      dataset(a)
  32. 32. Relations between users
      def relations(a):
          rels = (
              (0, 1), (0, 2), (2, 3), (4, 3), (3, 5),
              (5, 1), (0, 5), (5, 6), (6, 7), (7, 8), (9, 8))

          get_user = lambda id: a.users.query.filter(
              "obj.name == 'user_{}'".format(id)).execute().first

          for f, t in rels:
              what = "user_{} knows user_{}".format(f, t)
              from_doc, to_doc = get_user(f), get_user(t)
              a.knows.edges.create(from_doc, to_doc, {"what": what})
              print("{}->{}: {}".format(from_doc.id, to_doc.id, what))

      a = create(db="experiments")
      relations(a)
  33. 33. Relations between users
      users/2744664487->users/2744926631: user_0 knows user_1
      users/2744664487->users/2745123239: user_0 knows user_2
      users/2745123239->users/2745319847: user_2 knows user_3
      users/2745516455->users/2745319847: user_4 knows user_3
      users/2745319847->users/2745713063: user_3 knows user_5
      users/2745713063->users/2744926631: user_5 knows user_1
      users/2744664487->users/2745713063: user_0 knows user_5
      users/2745713063->users/2745909671: user_5 knows user_6
      users/2745909671->users/2746106279: user_6 knows user_7
      users/2746106279->users/2746302887: user_7 knows user_8
      users/2746499495->users/2746302887: user_9 knows user_8
  34. 34. AQL, getting paths
      FOR p IN PATHS(users, knows, 'outbound')
          FILTER p.source.name == 'user_5'
          RETURN p.vertices[*].name

      from arango import create
      from arango.aql import F, V

      def querying(a):
          # Parentheses let the chained calls span several lines.
          query = (a.knows.query
                   .over(F.PATHS("users", "knows", ~V("outbound")))
                   .filter("obj.source.name == '{}'".format("user_5"))
                   .result("obj.vertices[*].name"))
          for data in query.execute(wrapper=lambda c, i: i):
              print(data)

      a = create(db="experiments")
      querying(a)
  35. 35. Paths output
      ['user_5']
      ['user_5', 'user_1']
      ['user_5', 'user_6']
      ['user_5', 'user_6', 'user_7']
      ['user_5', 'user_6', 'user_7', 'user_8']
  36. 36. Links • Arango paths: http://goo.gl/n2L3SK • Neo4j: http://goo.gl/au5y9I • Scraper: http://goo.gl/nvMFGk • Visualiser: http://goo.gl/Rzdwci
  37. 37. Thanks. Q’s? @maxmaxmaxmax