Diese Präsentation wurde erfolgreich gemeldet.
Wir verwenden Ihre LinkedIn Profilangaben und Informationen zu Ihren Aktivitäten, um Anzeigen zu personalisieren und Ihnen relevantere Inhalte anzuzeigen. Sie können Ihre Anzeigeneinstellungen jederzeit ändern.
Nächste SlideShare
×

# Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at Big Data Spain 2014

1.340 Aufrufe

Veröffentlicht am

This talk will give a good overview over the complex architecture of the Pregel framework and will give some insights where there are potential bottlenecks when writing a Pregel algorithm.

Veröffentlicht in: Technologie
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Als Erste(r) kommentieren

• Gehören Sie zu den Ersten, denen das gefällt!

### Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at Big Data Spain 2014

1. 1. PROCESSING LARGE-SCALE GRAPHS WITH GOOGLE(TM) PREGEL MICHAEL HACKSTEIN FRONT END AND GRAPH SPECIALIST ARANGODB
2. 2. Processing large-scale graphs with GoogleTMPregel November 17th Michael Hackstein @mchacki www.arangodb.com
3. 3. Michael Hackstein ArangoDB Core Team Web Frontend Graph visualisation Graph features Host of cologne.js Master’s Degree (spec. Databases and Information Systems) 1
4. 4. Graph Algorithms Pattern matching Search through the entire graph Identify similar components ) Touch all vertices and their neighbourhoods 2
5. 5. Graph Algorithms Pattern matching Search through the entire graph Identify similar components ) Touch all vertices and their neighbourhoods Traversals De1ne a speci1c start point Iteratively explore the graph ) History of steps is known 2
6. 6. Graph Algorithms Pattern matching Search through the entire graph Identify similar components ) Touch all vertices and their neighbourhoods Traversals De1ne a speci1c start point Iteratively explore the graph ) History of steps is known Global measurements Compute one value for the graph, based on all it’s vertices or edges Compute one value for each vertex or edge ) Often require a global view on the graph 2
7. 7. Pregel A framework to query distributed, directed graphs. Known as “Map-Reduce” for graphs Uses same phases Has several iterations Aims at: Operate all servers at full capacity Reduce network traZc Good at calculations touching all vertices Bad at calculations touching a very small number of vertices 3
8. 8. Example – Connected Components 1 1 2 2 5 7 7 5 4 3 4 3 6 6 active inactive 3 forward message 2 backward message 4
9. 9. Example – Connected Components 1 1 2 2 5 7 7 5 6 7 5 4 3 4 3 6 6 4 2 3 4 active inactive 3 forward message 2 backward message 4
10. 10. Example – Connected Components 1 1 2 2 5 7 7 5 6 7 5 4 3 4 3 6 6 4 2 3 4 active inactive 3 forward message 2 backward message 4
11. 11. Example – Connected Components 1 1 2 2 5 6 7 5 6 5 5 4 3 4 3 5 6 3 1 2 2 active inactive 3 forward message 2 backward message 4
12. 12. Example – Connected Components 1 1 2 2 5 6 7 5 6 5 5 4 3 4 3 5 6 3 1 2 2 active inactive 3 forward message 2 backward message 4
13. 13. Example – Connected Components 1 1 1 2 5 5 7 5 2 2 4 3 5 6 1 1 2 2 active inactive 3 forward message 2 backward message 4
14. 14. Example – Connected Components 1 1 1 2 5 5 7 5 2 2 4 3 5 6 1 1 2 2 active inactive 3 forward message 2 backward message 4
15. 15. Example – Connected Components 1 1 1 2 5 5 7 5 1 1 4 3 5 6 1 1 active inactive 3 forward message 2 backward message 4
16. 16. Example – Connected Components 1 1 1 2 5 5 7 5 1 1 4 3 5 6 1 1 active inactive 3 forward message 2 backward message 4
17. 17. Example – Connected Components 1 1 1 2 5 5 7 5 1 1 4 3 5 6 active inactive 3 forward message 2 backward message 4
18. 18. Pregel – Sequence 5
19. 19. Pregel – Sequence 5
20. 20. Pregel – Sequence 5
21. 21. Pregel – Sequence 5
22. 22. Pregel – Sequence 5
23. 23. Worker ^= Map “Map” a user-de1ned algorithm over all vertices Output: set of messages to other vertices Available parameters: The current vertex and his outbound edges All incoming messages Global values Allow modi1cations on the vertex: Attach a result to this vertex and his outgoing edges Delete the vertex and his outgoing edges Deactivate the vertex 6
24. 24. Combine ^= Reduce “Reduce” all generated messages Output: An aggregated message for each vertex. Executed on sender as well as receiver. Available parameters: One new message for a vertex The stored aggregate for this vertex Typical combiners are SUM, MIN or MAX Reduces network traZc 7
25. 25. Activity ^= Termination Execute several rounds of Map/Reduce Count active vertices and messages Start next round if one of the following is true: At least one vertex is active At least one message is sent Terminate if neither a vertex is active nor messages were sent Store all non-deleted vertices and edges as resulting graph 8
26. 26. Pregel at ArangoDB Started as a side project in free hack time Experimental on operational database Implemented as an alternative to traversals Make use of the 2exibility of JavaScript: No strict type system No pre-compilation, on-the-2y queries Native JSON documents Really fast development 9