Lynx Analytics develops a big graph analysis engine on top of Apache Spark. One of our recent developments is a recurrent neural network library that learns from the structure of a graph in order to predict missing features of its vertices.
A real-life use case is demographic estimation: predicting the ages of a telco's customers from their connections to other people, the ages of those people, and classical features such as internet or phone usage patterns.
One of the main challenges we faced was designing a suitable training process. The usual way of training a supervised learning algorithm treats each vertex as an independent prediction problem, but because our algorithm uses the connections between vertices, we cannot treat vertices independently. On the other hand, if we consider the whole graph as one problem, then we have no separate training data at all. In this talk we show some tricks we used to perform prediction and training on the same graph.
The other main challenge is handling graphs so big that they do not fit into the memory of a single machine, and performing very resource-intensive computations on them. To tackle this, the graph must be stored and processed in a distributed fashion. The difficulty is that we cannot simply cut the graph into smaller pieces, since the training process needs to propagate data along the edges.
In the talk we will show core algorithmic ideas to tackle the above-mentioned problems and present some experimental results.
2. Who is Lynx Analytics?
Big data analytics with a focus on graph data.
Our core product is a graph-oriented analytics application called LynxKite.
• Web UI for fluid exploration workflow with rapid visualization.
• API and automation for autonomous operation.
• Implemented with Apache Spark.
• Machine learning toolbox.
Telecommunications · Financial Services · Smart City · Transport
Outline: the challenge → the prior solutions → the novel solution
3. Problem statement
A single giant graph, such as the friendship relations in a social network.
A partially populated vertex attribute, such as age.
Predict the missing attribute values!
4. Old approach I: Machine Learning
Prediction? Use machine learning! Vertex attributes ⇒ label.
How does it perform?
Entirely ignores edges.
25% accuracy
5. Experiment setup
Anonymized friendship data from a now-defunct social network.
• Filtered to a single city for faster experimentation
• 27,783 profiles
• 1,095,707 friendships
• Age & gender attributes
6. The challenge: Multi-class classification by age
• Age is bucketed into quartiles.
• Model is trained on training data (85%).
• Accuracy is evaluated on test data (15%).
• Accuracy = correct predictions / test set size.
• Final number is the median accuracy from 11 trials.
Experiment setup
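The setup above can be sketched in a few lines of plain Python. This is an illustrative sketch of the evaluation protocol only, not the actual LynxKite code; all function names are assumptions.

```python
import random
import statistics

def quartile_buckets(ages):
    """Return a function mapping an age to its quartile bucket (0..3),
    based on the three quartile cut points of the given ages."""
    cuts = statistics.quantiles(ages, n=4)  # three cut points
    return lambda age: sum(age > c for c in cuts)

def accuracy(predicted, actual):
    """Accuracy = correct predictions / test set size."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

def run_experiment(trial, n_trials=11, seed=0):
    """Run `trial` (a function rng -> accuracy) 11 times, report the median."""
    rng = random.Random(seed)
    return statistics.median(trial(rng) for _ in range(n_trials))
```

Taking the median over 11 trials makes the reported number robust to a lucky or unlucky random train/test split.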
7. Old approach II: ML with graph metrics
To present the graph structure to the ML algorithm, calculate every graph metric we can
think of: degree, PageRank, clustering coefficient, harmonic centrality, graph coloring...
(Easy with LynxKite.)
How does it perform?
Expert has to find metrics that are good predictors.
Network neighborhood still largely ignored.
32% accuracy
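As a sketch of one such metric, here is PageRank by power iteration on an adjacency-list graph in plain Python. This is only an illustration of the kind of feature being computed; LynxKite's actual implementation runs distributed on Spark.

```python
def pagerank(adj, damping=0.85, iters=50):
    """Power-iteration PageRank on an adjacency dict {vertex: [neighbors]}."""
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iters):
        nxt = {v: (1.0 - damping) / n for v in adj}
        for v, nbrs in adj.items():
            if nbrs:
                share = damping * rank[v] / len(nbrs)
                for u in nbrs:
                    nxt[u] += share
            else:  # dangling vertex: spread its mass evenly
                for u in adj:
                    nxt[u] += damping * rank[v] / n
        rank = nxt
    return rank
```

Each vertex's metric values (degree, PageRank, ...) are then concatenated into its feature vector for the classifier.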
8. Old approach III: Neighborhood
Take average value of neighbors.
How does it perform?
Not adaptive. (Average not best option in all cases.)
70% accuracy
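The neighborhood-average baseline is simple enough to sketch directly (an illustrative sketch, not the production code):

```python
def neighborhood_average(adj, labels):
    """Predict each vertex's value as the mean of its labeled neighbors.
    adj: {vertex: [neighbors]}; labels: {vertex: known value}."""
    pred = {}
    for v, nbrs in adj.items():
        known = [labels[u] for u in nbrs if u in labels]
        if known:
            pred[v] = sum(known) / len(known)
    return pred
```

Vertices with no labeled neighbor get no prediction, which is one reason this approach is not adaptive.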
9. Old approach IV: Communities
Find communities. (E.g. connected components in overlapping maximal cliques.)
Take the average value from the most homogeneous community that meets minimal criteria.
How does it perform?
Expert has to pick a good way to identify communities.
Not adaptive. (Why average? Why most homogeneous?)
73% accuracy
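Assuming communities have already been detected upstream, the "most homogeneous community" rule can be sketched like this (an illustration; the function name, the variance-based homogeneity measure, and the size threshold are assumptions):

```python
import statistics

def predict_from_communities(vertex, communities, labels, min_size=3):
    """Among the vertex's communities with enough labeled members,
    take the average label of the most homogeneous one (lowest variance)."""
    best = None
    for comm in communities:
        if vertex not in comm:
            continue
        known = [labels[u] for u in comm if u in labels and u != vertex]
        if len(known) < min_size:
            continue
        spread = statistics.pvariance(known)
        if best is None or spread < best[0]:
            best = (spread, sum(known) / len(known))
    return best[1] if best else None
```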
10. Old approach V: ML with neighborhood data
In addition to metrics, provide the machine learning with the neighborhood average.
How does it perform?
Expert has to manufacture features for ML.
Not perfectly adaptive. (Why average?)
75% accuracy
11. New approach: ML on graph data
Avoid all expert decisions. Just train the model on the raw graph. Model can learn to
identify communities or calculate PageRank if those are required for optimal predictions.
How does it perform?
No expert knowledge required.
Adaptively computes the best features.
81% accuracy
12. Model
Strong recent results with deep learning on graphs, e.g. graph convolutional networks.
Yujia Li, Daniel Tarlow, Marc Brockschmidt, Richard Zemel (2016). Gated Graph Sequence Neural Networks. arXiv:1511.05493 [cs.LG]
A recurrent neural network (GRU) in every vertex, with shared parameters.
State is communicated along edges.
Trained on many small labeled graphs; gives predictions on small unlabeled graphs.
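The core idea can be sketched with a minimal 1-dimensional GRU shared by all vertices: at each step, a vertex's input is the sum of its neighbors' states, and every vertex applies the same cell. This is a deliberately tiny sketch (scalar states, summed messages); the real model uses vector states and learned message functions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class ScalarGRU:
    """A 1-dimensional GRU cell; the same parameters are shared by every vertex."""
    def __init__(self, wz, uz, wr, ur, wh, uh):
        self.wz, self.uz, self.wr, self.ur, self.wh, self.uh = wz, uz, wr, ur, wh, uh

    def step(self, x, h):
        z = sigmoid(self.wz * x + self.uz * h)            # update gate
        r = sigmoid(self.wr * x + self.ur * h)            # reset gate
        h_cand = math.tanh(self.wh * x + self.uh * r * h) # candidate state
        return (1.0 - z) * h + z * h_cand

def propagate(adj, state, cell, steps=3):
    """Each step, every vertex reads the sum of its neighbors' states
    and updates its own state with the shared GRU cell."""
    for _ in range(steps):
        message = {v: sum(state[u] for u in adj[v]) for v in adj}
        state = {v: cell.step(message[v], state[v]) for v in adj}
    return state
```

After a few steps, information from a labeled vertex has diffused several hops away, which is exactly what the baselines above could not learn to do adaptively.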
13. Simplified architecture
[Diagram: stacked copies of the same GRU network (shared weights W). Initial state: label (if known) + features. Intermediate state flows along the graph edges between copies of the same network, ending in a prediction.]
14. Problems
• Hard to apply supervised learning when we have a single graph.
• Hard to do anything when this graph does not fit on a single machine.
15. Supervised learning on a single graph
If the network sees none of the known labels, the model ignores existing labels. If the network sees all the known labels, the model learns to just return its own label.
Solution:
• Show some of the known labels.
• Backpropagate error only from the vertices where the label was hidden.
• Hide different labels in each iteration.
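The label-hiding scheme can be sketched as a batch generator (an illustration; the function name and the hide fraction are assumptions, not the actual implementation):

```python
import random

def masked_batches(labeled_vertices, hide_fraction=0.5, iterations=4, seed=0):
    """Each iteration, hide a different random subset of the known labels.
    The visible labels are shown as inputs; the loss is computed only on
    the hidden vertices, whose true labels the network never saw."""
    rng = random.Random(seed)
    labeled = list(labeled_vertices)
    for _ in range(iterations):
        hidden = set(rng.sample(labeled, int(len(labeled) * hide_fraction)))
        visible = set(labeled) - hidden
        yield visible, hidden
```

Because a different subset is hidden each time, every labeled vertex eventually serves both as input evidence and as a training target.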
16. Distributed training and prediction
• Pick representative subgraphs.
• Train in parallel locally on subgraphs. Periodically combine adjustments.
• Prediction on the whole graph is fully distributed.
• Accuracy impact depends on amount of computational resources. Great for scaling.
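One simple way to "combine adjustments" from the parallel workers is to average their parameters periodically. This is only a sketch of that idea; the talk does not specify the exact combination scheme, so the averaging rule here is an assumption.

```python
def combine_adjustments(worker_params):
    """Average the parameter dicts trained in parallel on different subgraphs.
    worker_params: list of {parameter_name: value} dicts, one per worker."""
    n = len(worker_params)
    return {k: sum(p[k] for p in worker_params) / n
            for k in worker_params[0]}
```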
17. Implementation
Closed-source. Sorry.
• Evaluating classical methods and preparing data: LynxKite on Spark
• Researching and prototyping neural networks: TensorFlow
• Distributed forward pass: LynxKite on Spark
• Distributed training (in development): LynxKite on Spark + TensorFlow (TensorFrames?)
18. @LynxAnalytics @Hanna_Gabor @DanielDarabos
You can find us at booth K2. Swing by to see if we have any swag left!
Special thanks to Gabor Olah, Andras Nemeth,
and many others at Lynx for their contributions.