Structure of the presentation
What
High-level overview on the emerging field of geometric
deep learning (and graph deep learning)
How
P...
Brief Review of
Geometric Deep Learning
Geometric Deep Learning #1
Bronstein et al. (July 2017): “Geometric deep learning (
http://geometricdeeplearning.com/) is ...
Geometric Deep Learning #2
Bronstein et al. (July 2017): “The non-Euclidean nature of data
implies that there are no such ...
Geometric Deep Learning #3
Bronstein et al. (July 2017): “We expect the following years to bring exciting new approaches
a...
Primer on
GRAPHs
Taylor and Wrana (2012)
doi: 10.1002/pmic.201100594
Graph theory especially useful for network analysis
https://doi.org/10.1126/science.286.5439.509
Cited by 29,071 articles...
Graph theory Common metrics and definitions
Graph-theoretic node
importance mining on
network topology
- Xue et al. (2017)...
Ranking in time-varying complex networks
Ranking in evolving complex networks
Hao Liao, Manuel Sebastian Mariani, Matúš Me...
information diffusion intro
Many graphs can be modeled or used to predict how information flows in the given graph.
● H...
information diffusion Social Networks #1
Nonlinear Dynamics of Information Diffusion in Social Networks
ACM Transactions o...
information diffusion Social Networks #2
Literature Survey on Interplay of Topics, Information Diffusion and
Connections o...
information diffusion scientific citation networks #1
Integration of Scholarly Communication Metadata
Using Knowledge Grap...
information diffusion scientific citation networks #2
Implicit Multi-Feature Learning for Dynamic Time
Series Prediction o...
information diffusion Finance, Quant trading, decision making
Information Diffusion, Cluster formation
and Entropy-based N...
“Intelligent knowledge graphs” with “actionable insights”
Model-Driven Analytics: Connecting Data,
Domain Knowledge, and L...
Graph theory example Applications beyond typical networks
Construction (BIM): “Graph theory based
representation of buildi...
Graph Signal Processing and quantitative graph theory
Defferrard et al. (2016): “The emerging field of Graph Signal Proces...
Graph Fourier Transform GFT
The use of Graph Fourier Transform in image
processing: A new solution to classical problems
V...
Graph signal Processing #1
Adaptive Least Mean Squares Estimation of Graph
Signals
Paolo Di Lorenzo ; Sergio Barbarossa ; ...
Graph signal Processing #2
Kernel Regression for Signals over Graphs
Arun Venkitaraman, Saikat Chatterjee, Peter Händel
(S...
Graph signal Processing #3 Time-varying graphs
Kernel-Based Reconstruction of Space-Time
Functions on Dynamic Graphs
Danie...
Graph signal Processing #4 Time-varying graphs
Signal Processing on Graphs: Causal Modeling of
Unstructured Data
Jonathan ...
Graph Wavelet transform vs. GFT #1
Compression of dynamic 3D point clouds using
subdivisional meshes and graph wavelet tra...
Graph Wavelet transform vs. GFT #2
Bipartite Approximation for Graph
Wavelet Signal Decomposition
Jin Zeng ; Gene Cheung ;...
Graphlet induced subgraphs of a large network
Estimation of Graphlet Statistics
Ryan A. Rossi, Rong Zhou, and Nesreen K. A...
Graph Computing Accelerations
Parallel Local Algorithms for Core, Truss,
and Nucleus Decompositions
Ahmet Erdem Sariyuce, ...
P-Laplacian on graphs
p-Laplacian Regularized Sparse Coding for Human
Activity Recognition
Weifeng Liu ; Zheng-Jun Zha ; Y...
“Applied Laplacian” Mesh processing #1A
Spectral Mesh Processing
H. Zhang, O. Van Kaick, R. Dyer
Computer Graphics Forum 9...
Graph Framework for Manifold-valued Data image processing
Nonlocal Inpainting of Manifold-valued Data on Finite
Weighted G...
segmentation of graphs #1
Convex variational methods for multiclass data
segmentation on graphs
Egil Bae, Ekaterina Merkur...
segmentation of graphs #2:
Scalable Motif-aware Graph Clustering
CE Tsourakakis, J Pachocki, Michael Mitzenmacher Harvard ...
Graph Summarization #1A
Graph Summarization: A Survey
Yike Liu, Abhilash Dighe, Tara Safavi, Danai Koutra
(Submitted on 14...
Graph Summarization #1B
Table I: Qualitative comparison of static graph summarization techniques. The first six columns de...
Point cloud resampling via graphs
Fast Resampling of 3D Point Clouds via Graphs
Siheng Chen ; Dong Tian ; Chen Feng ; Anth...
2D Image Processing with graphs
Directional graph weight prediction for
image compression
Francesco Verdoja ; Marco Grange...
Background on
GRAPH Deep learning
Beyond the short
introduction from
the review above
Graph structure known or not?
GRAPH KNOWN
”Graph well defined, when the temperature
measurement positions are known, and
t...
Convolutions for graphs #1
Deep Convolutional Networks on
Graph-Structured Data
Mikael Henaff, Joan Bruna, Yann LeCun
(Sub...
Convolutions for graphs #2
Learning Convolutional Neural Networks
for Graphs
Mathias Niepert, Mohamed Ahmed, Konstantin Ku...
Convolutions for graphs #3
Geometric deep learning on graphs
and manifolds using mixture model
CNNs
Federico Monti, Davide...
Convolutions for graphs #4
Convolutional Neural Networks on Graphs
with Fast Localized Spectral Filtering
Michaël Defferra...
Convolutions for graphs #5
Top: Schematic
illustration of a
standard CNN where
patches of w×h
pixels are convolved
with D×...
Convolutions for graphs #6
CayleyNets: Graph Convolutional Neural
Networks with Complex Rational Spectral Filters
Ron Levi...
Convolutions for graphs #7
Graph Convolutional Matrix Completion
Rianne van den Berg, Thomas N. Kipf, Max Welling
(Submitt...
Convolutions for graphs #8
Graph Based Convolutional Neural Network
Michael Edwards, Xianghua Xie
(Submitted on 28 Sep 201...
Convolutions for graphs #9
Generalizing CNNs for data structured on
locations irregularly spaced out
Jean-Charles Vialatte...
Convolutions for graphs #10
Robust Spatial Filtering with Graph
Convolutional Neural Networks
Felipe Petroski Such, Shagan...
Convolutions for graphs #11
A Generalization of Convolutional Neural
Networks to Graph-Structured Data
Yotam Hechtlinger, ...
Autoencoders for graphs
Variational Graph Auto-Encoders
Thomas N. Kipf, Max Welling
(Submitted on 21 Nov 2016)
https://arx...
Representation Learning For graphs #1
Inductive Representation Learning on Large Graphs
William L. Hamilton, Rex Ying, Jur...
Representation Learning For graphs #2
Skip-graph: Learning graph embeddings with an
encoder-decoder model
John Boaz Lee, X...
Semi-supervised Learning For graphs
Inductive Representation Learning on Large Graphs
Thang D. Bui, Sujith Ravi, Vivek Ram...
Recurrent Networks for graphs #1
Geometric Matrix Completion with Recurrent
Multi-Graph Neural Networks
Federico Monti, Mi...
Recurrent Networks for graphs #2
Learning From Graph Neighborhoods Using
LSTMs
Rakshit Agrawal, Luca de Alfaro, Vassilis P...
Time-series analysis with graphs #1
Spectral Algorithms for Temporal Graph Cuts
Arlei Silva, Ambuj Singh, Ananthram Swami
...
Active learning on Graphs
Active Learning for Graph Embedding
Hongyun Cai, Vincent W. Zheng, Kevin Chen-Chuan Chang
(Submi...
Transfer learning on Graphs
Intrinsic Geometric Information Transfer
Learning on Multiple Graph-Structured
Datasets
Jaekoo...
Transfer learning on Graphs #2
Deep Feature Learning for Graphs
Ryan A. Rossi, Rong Zhou, Nesreen K. Ahmed
(Submitted on 2...
Learning Graphs learning the graph itself #1
Learning Graph While Training: An Evolving
Graph Convolutional Neural Network...
Graph structure as the “signal” for prediction
DeepGraph: Graph Structure Predicts
Network Growth
Cheng Li, Xiaoxiao Guo, ...

Geometric Deep Learning

For non-grid 3D data such as point clouds and meshes, and for inherently graph-based data.

Inherently graph-based data include, for example, brain connectivity analysis, scientific article citation networks, (social) network analysis, etc.

Alternative download link:
https://www.dropbox.com/s/2o3cofcd6d6e2qt/geometricGraph_deepLearning.pdf?dl=0


Geometric Deep Learning

  1. Structure of the presentation. What: a high-level overview on the emerging field of geometric deep learning (and graph deep learning). How: a presentation focused on startup-style organizations with everyone doing a bit of everything, and everyone needing to understand a bit of everything. The CEO cannot be the 'idea guy' who knows nothing about graphs and geometric deep learning, if you are operating in this space. EFFECTUATION – THE BEST THEORY OF ENTREPRENEURSHIP YOU ACTUALLY FOLLOW, WHETHER YOU'VE HEARD OF IT OR NOT by Ricardo dos Santos
  2. Brief Review of Geometric Deep Learning
  3. Geometric Deep Learning #1. Bronstein et al. (July 2017): “Geometric deep learning (http://geometricdeeplearning.com/) is an umbrella term for emerging techniques attempting to generalize (structured) deep neural models to non-Euclidean domains, such as graphs and manifolds. The purpose of this article is to overview different examples of geometric deep-learning problems and present available solutions, key difficulties, applications, and future research directions in this nascent field.” [Figure: timeline of models: GNN (2009), SCNN (2013), Geodesic CNN (2015), Localized SCNN (2015), GCNN/ChebNet (2016), GCN (2016), Anisotropic CNN (2016), MoNet (2016)]
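Among the models in that timeline, the GCN propagation rule of Kipf and Welling (2016) is compact enough to sketch in a few lines. The sketch below is our own toy illustration in plain numpy (sizes, names, and random inputs are ours, not code from the cited papers):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN-style layer: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    A_hat = A + np.eye(A.shape[0])                 # adjacency with self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))  # D^{-1/2} as a vector
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)         # ReLU nonlinearity

# toy 4-node path graph, 3 input features per node, 2 output features
A = np.array([[0,1,0,0],[1,0,1,0],[0,1,0,1],[0,0,1,0]], float)
H = np.random.randn(4, 3)                          # node feature matrix
W = np.random.randn(3, 2)                          # learnable weights
print(gcn_layer(A, H, W).shape)                    # -> (4, 2)
```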
  4. Geometric Deep Learning #2. Bronstein et al. (July 2017): “The non-Euclidean nature of data implies that there are no such familiar properties as global parameterization, common system of coordinates, vector space structure, or shift-invariance. Consequently, basic operations like convolution that are taken for granted in the Euclidean case are even not well defined on non-Euclidean domains.” “First attempts to generalize neural networks to graphs we are aware of are due to Gori et al. (2005), who proposed a scheme combining recurrent neural networks and random walk models. This approach went almost unnoticed, re-emerging in a modern form in Sukhbaatar et al. (2016) and Li et al. (2015) due to the renewed recent interest in deep learning.” “In a parallel effort in the computer vision and graphics community, Masci et al. (2015) showed the first CNN model on meshed surfaces, resorting to a spatial definition of the convolution operation based on local intrinsic patches. Among other applications, such models were shown to achieve state-of-the-art performance in finding correspondence between deformable 3D shapes. Follow-up works proposed different constructions of intrinsic patches on point clouds (Boscaini et al. 2016a,b) and general graphs (Monti et al. 2016).” In calculus, the notion of derivative describes how the value of a function changes with an infinitesimal change of its argument. One of the big differences distinguishing classical calculus from differential geometry is a lack of vector space structure on the manifold, prohibiting us from naïvely using expressions like f(x+dx). The conceptual leap that is required to generalize such notions to manifolds is the need to work locally in the tangent space. Physically, a tangent vector field can be thought of as a flow of material on a manifold. The divergence measures the net flow of a field at a point, allowing to distinguish between field ‘sources’ and ‘sinks’. Finally, the Laplacian (or Laplace-Beltrami operator in differential geometric jargon), Δf = −div(∇f), measures the difference between the value of a function at a point and its average over an infinitesimal neighborhood. “A centerpiece of classical Euclidean signal processing is the property of the Fourier transform diagonalizing the convolution operator, colloquially referred to as the Convolution Theorem. This property allows to express the convolution f⋆g of two functions in the spectral domain as the element-wise product of their Fourier transforms. Unfortunately, in the non-Euclidean case we cannot even define the operation x−x’ on the manifold or graph, so the notion of convolution does not directly extend to this case.”
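The Convolution Theorem quoted above is precisely the property that spectral graph methods fall back on: define convolution through the graph Fourier transform (projection onto the Laplacian eigenbasis) plus element-wise multiplication in that basis. A minimal numpy sketch on a toy 4-cycle (variable names and the example graph are ours):

```python
import numpy as np

def graph_spectral_convolution(L, f, g):
    """Convolution via the graph analogue of the Convolution Theorem:
    GFT both signals, multiply element-wise, inverse-GFT the product."""
    lam, Phi = np.linalg.eigh(L)             # Laplacian eigenbasis = GFT basis
    return Phi @ ((Phi.T @ f) * (Phi.T @ g)) # inverse GFT of the product

# combinatorial Laplacian L = D - A of a 4-cycle
A = np.array([[0,1,0,1],[1,0,1,0],[0,1,0,1],[1,0,1,0]], float)
L = np.diag(A.sum(1)) - A
f, g = np.random.randn(4), np.random.randn(4)
print(graph_spectral_convolution(L, f, g))
```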
  5. Geometric Deep Learning #3. Bronstein et al. (July 2017): “We expect the following years to bring exciting new approaches and results, and conclude our review with a few observations of current key difficulties and potential directions of future research.” Generalization: Generalizing deep learning models to geometric data requires not only finding non-Euclidean counterparts of basic building blocks (such as convolutional and pooling layers), but also generalization across different domains. Generalization capability is a key requirement in many applications, including computer graphics, where a model is learned on a training set of non-Euclidean domains (3D shapes) and then applied to previously unseen ones. Time-varying domains: An interesting extension of geometric deep learning problems discussed in this review is coping with signals defined over a dynamically changing structure. In this case, we cannot assume a fixed domain and must track how these changes affect signals. This could prove useful to tackle applications such as abnormal activity detection in social or financial networks. In the domain of computer graphics and vision, potential applications deal with dynamic shapes (e.g. 3D video captured by a range sensor). Computation: The final consideration is a computational one. All existing deep learning software frameworks are primarily optimized for Euclidean data. One of the main reasons for the computational efficiency of deep learning architectures (and one of the factors that contributed to their renaissance) is the assumption of regularly structured data on 1D or 2D grid, allowing to take advantage of modern GPU hardware. Geometric data, on the other hand, in most cases do not have a grid structure, requiring different ways to achieve efficient computations. It seems that computational paradigms developed for large-scale graph processing are more adequate frameworks for such applications.
  6. Primer on GRAPHs. Taylor and Wrana (2012), doi: 10.1002/pmic.201100594
  7. Graph theory especially useful for network analysis. https://doi.org/10.1126/science.286.5439.509 (cited by 29,071 articles); https://doi.org/10.1038/30918 (cited by 33,772). Random rewiring procedure for interpolating between a regular ring lattice and a random network, without altering the number of vertices or edges in the graph. http://www.bbc.co.uk/newsbeat/article/35500398/how-facebook-updated-six-degrees-of-separation-its-now-357 https://research.fb.com/three-and-a-half-degrees-of-separation/ http://slideplayer.com/slide/9267536/
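The rewiring procedure mentioned above is the Watts-Strogatz small-world construction, which networkx ships, so the lattice-to-random interpolation can be reproduced in a few lines (the parameter values here are illustrative):

```python
import networkx as nx

# Interpolate from a regular ring lattice (p = 0) toward a random network
# (p = 1) by rewiring each edge with probability p; node and edge counts
# stay fixed. Watch average path length drop long before clustering does.
for p in (0.0, 0.1, 1.0):
    G = nx.connected_watts_strogatz_graph(n=100, k=6, p=p, seed=42)
    print(f"p={p}: avg path length {nx.average_shortest_path_length(G):.2f}, "
          f"avg clustering {nx.average_clustering(G):.2f}")
```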
  8. Graph theory Common metrics and definitions. Graph-theoretic node importance mining on network topology - Xue et al. (2017). The graph-theoretic node importance mining methods based on network topologies comprise two main categories: node relevance and shortest path. The method of node relevance is measured by degree analysis. The methods of shortest path that aim at finding optimal spreading paths are measured by several node importance analyses, e.g., betweenness, closeness centrality, eigenvector centrality, Bonacich centrality and alter-based centrality. Betweenness is used particularly for measurements of power while closeness centrality and eigenvector centrality are used particularly for measurements of centrality. Bonacich centrality is an extension of eigenvector centrality which measures node importance on both centrality and power. The other mining methods for node importance based on network topologies included in this review are via processes such as node deleting, node contraction, and data mining and machine learning embedded techniques. For heterogeneous network structures, fusion methods integrate all the previously mentioned measurements. 28 February, 2013: Google’s Knowledge Graph: one step closer to the semantic web? By Andrew Isidoro. Knowledge Graph, a database of over 570m of the most searched-for people, places and things (entities), including around 18bn cross-references. The knowledge graph as the default data model for learning on heterogeneous knowledge. Wilcke, Xander; Bloem, Peter; de Boer, Victor. Data Science, vol. Preprint, no. Preprint, pp. 1-19, 2017. http://doi.org/10.3233/DS-170007 The FuhSen Architecture. High-level architecture comprising (a) Mediator and wrappers architecture to build the (b) knowledge graph on demand. The answer of a keyword query corresponds to an RDF subject-molecule that integrates RDF molecules collected from the wrappers. (c) The components to enrich the results KG. FuhSen: A Federated Hybrid Search Engine for building a knowledge graph on-demand. July 2016. https://doi.org/10.1007/978-3-319-48472-3_47 + https://doi.org/10.1109/ICSC.2017.85 researchgate.net
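Most of the node-importance measures listed by Xue et al. are one-liners in networkx; a quick comparison on a standard small graph (Zachary's karate club, which also appears on the next slide):

```python
import networkx as nx

G = nx.karate_club_graph()
centralities = {
    "degree":      nx.degree_centrality(G),
    "betweenness": nx.betweenness_centrality(G),  # shortest-path based
    "closeness":   nx.closeness_centrality(G),
    "eigenvector": nx.eigenvector_centrality(G),
}
for name, c in centralities.items():
    top = max(c, key=c.get)
    print(f"{name:12s} top node: {top} (score {c[top]:.3f})")
```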
  9. Ranking in time-varying complex networks. Ranking in evolving complex networks. Hao Liao, Manuel Sebastian Mariani, Matúš Medo, Yi-Cheng Zhang, Ming-Yang Zhou. Physics Reports, Volume 689, 19 May 2017, Pages 1-54. https://doi.org/10.1016/j.physrep.2017.05.001 Top: The often-studied Zachary’s karate club network has 34 nodes and 78 links (here visualized with the Gephi software). Bottom: Ranking of the nodes in the Zachary karate club network by the centrality metrics described in this section. Node labels on the horizontal axis correspond to the node labels in the top panel. For the APS citation data from the period 1893–2015 (560,000 papers in total), we compute the ranking of papers according to various metrics—citation count c, PageRank centrality p (with the teleportation parameter α = 0.5), and rescaled PageRank R(p). The figure shows the median ranking position of the top 1% of papers from each year. The three curves show three distinct patterns. For c, the median rank is stable until approximately 1995; then it starts to grow because the best young papers have not yet reached sufficiently high citation counts. For p, the median rank grows during the whole displayed time period because PageRank applied on an acyclic time-ordered citation network favors old papers. By contrast, the curve is approximately flat for R(p) during the whole period, which confirms that the metric is not biased by paper age and gives equal chances to all papers. An illustration of the difference between the first-order Markovian (time-aggregated) and second-order network representation of the same data. Panels A–B represent the destination cities (the right-most column) of flows of passengers from Chicago to other cities, given the previous location (the left-most column). When including memory effects (panel B), the fraction of passengers coming back to the original destination is large, in agreement with our intuition. A similar effect is found for the network of academic journals.
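The PageRank ranking discussed above is a one-liner in networkx. In the sketch below we take networkx's damping factor alpha to correspond to the slide's teleportation parameter alpha = 0.5 (an assumption about conventions), and use a random directed graph as a stand-in for a citation network:

```python
import networkx as nx

# Stand-in for a citation network: a sparse random directed graph.
G = nx.gnp_random_graph(200, 0.02, seed=1, directed=True)
pr = nx.pagerank(G, alpha=0.5)   # alpha: probability of following a link
print("top-10 nodes by PageRank:", sorted(pr, key=pr.get, reverse=True)[:10])
```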
  10. information diffusion intro. Many graphs can be modeled or used to predict how information flows in the given graph. ● How influential are you with your Instagram posts, tweets, LinkedIn posts, etc.? ● How does a tweet affect the stock market, or in more general terms, how can causality be inferred from a graph? ● In practice, you see heat diffusion methods also applied to information diffusion. Random walks and diffusion on networks. Naoki Masuda, Mason A. Porter, Renaud Lambiotte. Physics Reports (Available online 31 August 2017). https://doi.org/10.1016/j.physrep.2017.07.007 Fig. 12. The weary random walker retires from the network and heads off into the distant sunset. [This picture was drawn by Yulian Ng.] Inferring networks of diffusion and influence. Manuel Gomez Rodriguez, Jure Leskovec, Andreas Krause. KDD '10 Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining. https://doi.org/10.1145/1835804.1835933 There are several interesting directions for future work. Here we only used time difference to infer edges and thus it would be interesting to utilize more informative features (e.g., textual content of postings etc.) to more accurately estimate the influence probabilities. Moreover, our work considers static propagation networks; however, real influence networks are dynamic and thus it would be interesting to relax this assumption. Last, there are many other domains where our methodology could be useful: inferring interaction networks in systems biology (protein-protein and gene interaction networks), neuroscience (inferring physical connections between neurons) and epidemiology. We believe that our results provide a promising step towards understanding complex processes on networks based on partial observations.
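The slide notes that heat-diffusion machinery gets reused for information diffusion; the basic computation is a matrix exponential of the graph Laplacian. A toy sketch (the graph and diffusion times are illustrative):

```python
import numpy as np
from scipy.linalg import expm

# Heat diffusion x(t) = exp(-t L) x(0): an impulse of "information" at
# node 0 spreads along edges; total mass is conserved because the rows
# of the combinatorial Laplacian sum to zero.
A = np.array([[0,1,1,0,0],
              [1,0,1,0,0],
              [1,1,0,1,0],
              [0,0,1,0,1],
              [0,0,0,1,0]], float)
L = np.diag(A.sum(1)) - A
x0 = np.zeros(5); x0[0] = 1.0
for t in (0.1, 1.0, 10.0):
    print(t, np.round(expm(-t * L) @ x0, 3))
```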
  11. information diffusion Social Networks #1. Nonlinear Dynamics of Information Diffusion in Social Networks. ACM Transactions on the Web (TWEB), Volume 11, Issue 2, May 2017, Article No. 11. https://doi.org/10.1145/3057741 Online Social Networks and information diffusion: The role of ego networks. Valerio Arnaboldi, Marco Conti, Andrea Passarella, Robin I.M. Dunbar. Online Social Networks and Media 1 (2017) 44–55. http://dx.doi.org/10.1016/j.osnem.2017.04.001 Data Driven Modeling of Continuous Time Information Diffusion in Social Networks. Liang Liu ; Bin Chen ; Bo Qu ; Lingnan He ; Xiaogang Qiu. Data Science in Cyberspace (DSC), 2017 IEEE. https://doi.org/10.1109/DSC.2017.103 Online Bayesian Inference of Diffusion Networks. Shohreh Shaghaghian ; Mark Coates. IEEE Transactions on Signal and Information Processing over Networks ( Volume: 3, Issue: 3, Sept. 2017 ). https://doi.org/10.1109/TSIPN.2017.2731160 Modeling the reemergence of information diffusion in social network. Dingda Yang, Xiangwen Liao, Huawei Shen, Xueqi Cheng, Guolong Chen. Physica A: Statistical Mechanics and its Applications [Available online 1 September 2017]. http://dx.doi.org/10.1016/j.physa.2017.08.115 Information Diffusion in Online Social Networks: A Survey. Adrien Guille, Hakim Hacid, Cécile Favre, Djamel A. Zighed. ACM SIGMOD, Volume 42, Issue 2, May 2013, Pages 17-28. https://doi.org/10.1145/2503792.2503797
  12. information diffusion Social Networks #2. Literature Survey on Interplay of Topics, Information Diffusion and Connections on Social Networks. Kuntal Dey, Saroj Kaushik, L. Venkata Subramaniam (Submitted on 3 Jun 2017). https://arxiv.org/abs/1706.00921
  13. information diffusion scientific citation networks #1. Integration of Scholarly Communication Metadata Using Knowledge Graphs. Afshin Sadeghi, Christoph Lange, Maria-Esther Vidal, Sören Auer. International Conference on Theory and Practice of Digital Libraries, TPDL 2017: Research and Advanced Technology for Digital Libraries, pp 328-341. https://doi.org/10.1007/978-3-319-67008-9_26 Particularly, we demonstrate the benefits of exploiting semantic web technology to reconcile data about authors, papers, and conferences. A Recommendation System Based on Hierarchical Clustering of an Article-Level Citation Network. Jevin D. West ; Ian Wesley-Smith ; Carl T. Bergstrom. IEEE Transactions on Big Data ( Volume: 2, Issue: 2, June 1 2016 ). https://doi.org/10.1109/TBDATA.2016.2541167 http://babel.eigenfactor.org/ The scholarly literature is expanding at a rate that necessitates intelligent algorithms for search and navigation. For the most part, the problem of delivering scholarly articles has been solved. If one knows the title of an article, locating it requires little effort and, paywalls permitting, acquiring a digital copy has become trivial. However, the navigational aspect of scientific search - finding relevant, influential articles that one does not know exist - is in its early development. Big Scholarly Data: A Survey. Feng Xia ; Wei Wang ; Teshome Megersa Bekele ; Huan Liu. IEEE Transactions on Big Data ( Volume: 3, Issue: 1, March 1 2017 ). https://doi.org/10.1109/TBDATA.2016.2641460 ASNA - Academic Social Network Analysis
  14. information diffusion scientific citation networks #2. Implicit Multi-Feature Learning for Dynamic Time Series Prediction of the Impact of Institutions. Xiaomei Bai ; Fuli Zhang ; Jie Hou ; Feng Xia ; Amr Tolba ; Elsayed Elashkar. IEEE Access ( Volume: 5 ). https://doi.org/10.1109/ACCESS.2017.2739179 Predicting the impact of research institutions is an important tool for decision makers, such as resource allocation for funding bodies. Despite significant effort of adopting quantitative indicators to measure the impact of research institutions, little is known that how the impact of institutions evolves in time. The Role of Positive and Negative Citations in Scientific Evaluation. Xiaomei Bai ; Ivan Lee ; Zhaolong Ning ; Amr Tolba ; Feng Xia. IEEE Access ( Volume: PP, Issue: 99 ). https://doi.org/10.1109/ACCESS.2017.2740226 Recommendation for Cross-Disciplinary Collaboration Based on Potential Research Field Discovery. Wei Liang ; Xiaokang Zhou ; Suzhen Huang ; Chunhua Hu ; Qun Jin. Advanced Cloud and Big Data (CBD), 2017. https://doi.org/10.1109/CBD.2017.67 The cross-disciplinary information is hidden in tons of publications, and the relationships between different fields are complicated, which make it challengeable recommending cross-disciplinary collaboration for a specific researcher. Petteri: Whether to recommend “outliers”, i.e. unexpected combinations of fields, or something outside your field that would be useful to you. Or just the typical landmark papers of your field? Depends on your needs for sure. https://iris.ai/ http://www.bibblio.org/learning-and-knowledge In the future, we will further explore the relationships between the impact of institutions and the features driving the impact of institutions change to enhance the prediction performance. In addition, this work is conducted only on literatures from the eight top conferences based on the Microsoft Academic Graph (MAG) dataset; examining other conferences for the same observed patterns could widen the significance of our findings.
  15. information diffusion Finance, Quant trading, decision making. Information Diffusion, Cluster formation and Entropy-based Network Dynamics in Equity and Commodity Markets. Stelios Bekiros, Duc Khuong Nguyen, Leonidas Sandoval Junior, Gazi Salah Uddin. European Journal of Operational Research (2016). http://dx.doi.org/10.1016/j.ejor.2016.06.052 https://www.prowler.io/ https://www.causalitylink.com/ https://www.forbes.com/sites/antoinegara/2017/02/28/kensho-sp-500-million-valuation-jpmorgan-morgan-stanley/#6fe4bb0b5cbf Technology that brings transparency to complex systems. https://www.kensho.com/ Our platform uses artificial intelligence to discover, extract and index events, variables and relationships about markets, sectors, industries and equities. It absorbs news articles, analysts’ points of view or equity-related materials as they are published. Save time and get ahead by letting AI do the repetitive reading for you. Focus on new knowledge. Analysis of Investment Relationships Between Companies and Organizations Based on Knowledge Graph. Xiaobo Hu, Xinhuai Tang, Feilong Tang. In: Barolli L., Enokido T. (eds) Innovative Mobile and Internet Services in Ubiquitous Computing. IMIS 2017. Advances in Intelligent Systems and Computing, vol 612. https://doi.org/10.1007/978-3-319-61542-4_20 A design for a common-sense knowledge-enhanced decision-support system: Integration of high-frequency market data and real-time news. Kun Chen, Jian Yin, Sulin Pang. Expert Systems (June 2017). doi: 10.1111/exsy.12209 Compared with previous work, our model is the first to incorporate broad common-sense knowledge into a decision support system, thereby improving the news analysis process through the application of a graphic random-walk framework. Prototype and experiments based on Hong Kong stock market data have demonstrated that common-sense knowledge is an important factor in building financial decision models that incorporate news information. Dynamics of financial markets and transaction costs: A graph-based study. Felipe Lillo, Rodrigo Valdés. Research in International Business and Finance, Volume 38, September 2016, Pages 455-465. Using financialization as a conceptual framework to understand the current trading patterns of financial markets, this work employs a market graph model for studying the stock indexes of geographically separated financial markets. By using an edge creation condition based on a transaction cost threshold, the resulting market graph features a strong connectivity, some traces of a power law in the degree distribution and an intensive presence of cliques. Ponzi scheme diffusion in complex networks. Anding Zhu, Peihua Fu, Qinghe Zhang, Zhenyue Chen. Physica A: Statistical Mechanics and its Applications, Volume 479, 1 August 2017, Pages 128-136. https://doi.org/10.1016/j.physa.2017.03.015
  16. “Intelligent knowledge graphs” with “actionable insights”. Model-Driven Analytics: Connecting Data, Domain Knowledge, and Learning. Thomas Hartmann, Assaad Moawad, Francois Fouquet, Gregory Nain, Jacques Klein, Yves Le Traon, Jean-Marc Jezequel (Submitted on 5 Apr 2017). https://arxiv.org/abs/1704.01320 Gaining profound insights from collected data of today's application domains like IoT, cyber-physical systems, health care, or the financial sector is business-critical and can create the next multi-billion dollar market. However, analyzing these data and turning it into valuable insights is a huge challenge. This is often not alone due to the large volume of data but due to an incredibly high domain complexity, which makes it necessary to combine various extrapolation and prediction methods to understand the collected data. Model-driven analytics is a refinement process of raw data driven by a model reflecting deep domain understanding, connecting data, domain knowledge, and learning.
  17. Graph theory example Applications beyond typical networks. Construction (BIM): “Graph theory based representation of building information models (BIM) for access control applications”. Automation in Construction, Volume 68, August 2016, Pages 44-51. https://doi.org/10.1016/j.autcon.2016.04.001 IFC 4 model, IFC-SPF format. Medical Imaging (OCT): “Improving Segmentation of 3D Retina Layers Based on Graph Theory Approach for Low Quality OCT Images”. Metrology and Measurement Systems, Volume 23, Issue 2 (Jun 2016). https://doi.org/10.1515/mms-2016-0016 Dijkstra shortest path algorithm. Risk Assessment: “A New Risk Assessment Framework Using Graph Theory for Complex ICT Systems”. MIST '16: Proceedings of the 8th ACM CCS International Workshop on Managing Insider Security Threats. https://doi.org/10.1145/2995959.2995969 Biodiversity management: “Multiscale connectivity and graph theory highlight critical areas for conservation under climate change”. Ecological Applications (8 June 2016). http://doi.org/10.1890/15-0925 Brain Imaging: “‘Small World’ architecture in brain connectivity and hippocampal volume in Alzheimer’s disease: a study via graph theory from EEG data”. Brain Imaging and Behavior, April 2017, Volume 11, Issue 2, pp 473–485. doi: 10.1007/s11682-016-9528-3 Small World trends in the two groups of subjects. Medical Imaging (OCT): “Reconstruction of 3D surface maps from anterior segment optical coherence tomography images using graph theory and genetic algorithms”. Biomedical Signal Processing and Control, Volume 25, March 2016, Pages 91-98. https://doi.org/10.1016/j.bspc.2015.11.004 Cybersecurity: “Big Data Behavioral Analytics Meet Graph Theory: On Effective Botnet Takedowns”. IEEE Network ( Volume: 31, Issue: 1, January/February 2017 ). https://doi.org/10.1109/MNET.2016.1500116NM
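Several of the applications above (the OCT layer-segmentation papers in particular) reduce their problem to a shortest path on a weighted grid graph. The mechanics in networkx, with hypothetical edge costs standing in for the image-gradient weights those papers derive:

```python
import networkx as nx

# Toy version of the shortest-path formulation: a 4x6 "pixel" lattice.
G = nx.grid_2d_graph(4, 6)
for u, v in G.edges:
    G.edges[u, v]["weight"] = 0.1 + abs(u[0] - v[0])  # vertical moves cost more
path = nx.dijkstra_path(G, source=(0, 0), target=(3, 5), weight="weight")
print(path)   # the cheapest pixel path from one corner to the other
```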
  18. Graph Signal Processing and quantitative graph theory. Defferrard et al. (2016): “The emerging field of Graph Signal Processing (GSP) aims at bridging the gap between signal processing and spectral graph theory [Shuman et al. 2013], a blend between graph theory and harmonic analysis. A goal is to generalize fundamental analysis operations for signals from regular grids to irregular structures embodied by graphs. We refer the reader to Belkin and Niyogi 2008 for an introduction of the field.” Matthias Dehmer, Frank Emmert-Streib, Yongtang Shi. https://doi.org/10.1016/j.ins.2017.08.009 The main goal of quantitative graph theory is the structural quantification of information contained in complex networks by employing a measurement approach based on numerical invariants and comparisons. Furthermore, the methods as well as the networks do not need to be deterministic but can be statistical. Shuman et al. 2013. Perraudin and Vandergheynst 2016: ”the proposed Wiener regularization framework offers a compelling way to solve traditional problems such as denoising, regression or semi-supervised learning”. [Figure: experiments on the temperature of Molene. Top: a realization of the stochastic graph signal (first measure). Bottom center: the temperature of the Island of Brehat. Bottom right: recovery errors (inpainting error) for different noise levels.]
  19. Graph Fourier Transform GFT. The use of Graph Fourier Transform in image processing: A new solution to classical problems. Francesco Verdoja, PhD thesis, 2017. https://doi.org/10.1109/ICASSP.2017.7952886 On the Graph Fourier Transform for Directed Graphs. Stefania Sardellitti ; Sergio Barbarossa ; Paolo Di Lorenzo. IEEE Journal of Selected Topics in Signal Processing ( Volume: 11, Issue: 6, Sept. 2017 ). https://doi.org/10.1109/JSTSP.2017.2726979 The analysis of signals defined over a graph is relevant in many applications, such as social and economic networks, big data or biological networks, and so on. A key tool for analyzing these signals is the so-called Graph Fourier Transform (GFT). Alternative definitions of GFT have been suggested in the literature, based on the eigen-decomposition of either the graph Laplacian or adjacency matrix. In this paper, we address the general case of directed graphs and we propose an alternative approach that builds the graph Fourier basis as the set of orthonormal vectors that minimize a continuous extension of the graph cut size, known as the Lovász extension. Graph-based approaches have recently seen a spike of interest in the image processing and computer vision communities, and many classical problems are finding new solutions thanks to these techniques. The Graph Fourier Transform (GFT), the equivalent of the Fourier transform for graph signals, is used in many domains to analyze and process data modeled by a graph. In this thesis we present some classical image processing problems that can be solved through the use of GFT. We’ll focus our attention on two main research areas: the first is image compression, where the use of the GFT is finding its way in recent literature; we’ll propose two novel ways to deal with the problem of graph weight encoding. We’ll also propose approaches to reduce overhead costs of shape-adaptive compression methods. The second research field is image anomaly detection; GFT has never been proposed to this date to solve this class of problems. We’ll discuss here a novel technique and we’ll test its application on hyperspectral and medical (PET tumor scan) images.
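The compression use of the GFT described in Verdoja's thesis boils down to: transform, keep few coefficients, inverse-transform. A minimal numpy sketch using the combinatorial-Laplacian eigenbasis (one of the alternative GFT definitions mentioned above); the graph and K are illustrative:

```python
import numpy as np

def gft_compress(L, x, K):
    """Keep only the K largest-magnitude GFT coefficients of x and invert."""
    lam, Phi = np.linalg.eigh(L)            # graph Fourier basis (Laplacian variant)
    x_hat = Phi.T @ x                       # forward GFT
    keep = np.argsort(np.abs(x_hat))[-K:]   # indices of the K biggest coefficients
    x_hat_k = np.zeros_like(x_hat)
    x_hat_k[keep] = x_hat[keep]
    return Phi @ x_hat_k                    # inverse GFT of the truncated spectrum

rng = np.random.default_rng(0)
A = (rng.random((30, 30)) < 0.2).astype(float)
A = np.triu(A, 1); A = A + A.T              # random undirected graph
L = np.diag(A.sum(1)) - A
x = rng.normal(size=30)
print("residual with K=10:", float(np.linalg.norm(x - gft_compress(L, x, K=10))))
```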
  20. Graph signal Processing #1. Adaptive Least Mean Squares Estimation of Graph Signals. Paolo Di Lorenzo ; Sergio Barbarossa ; Paolo Banelli ; Stefania Sardellitti. IEEE Transactions on Signal and Information Processing over Networks ( Volume: 2, Issue: 4, Dec. 2016 ). https://doi.org/10.1109/TSIPN.2016.2613687 Distributed Adaptive Learning of Graph Signals. Paolo Di Lorenzo ; Sergio Barbarossa ; Paolo Banelli ; Stefania Sardellitti. IEEE Transactions on Signal Processing ( Volume: 65, Issue: 16, Aug. 15, 2017 ). https://doi.org/10.1109/TSP.2017.2708035 The aim of this paper is to propose a least mean squares (LMS) strategy for adaptive estimation of signals defined over graphs. Assuming the graph signal to be band-limited, over a known bandwidth, the method enables reconstruction, with guaranteed performance in terms of mean-square error, and tracking from a limited number of observations over a subset of vertices. Furthermore, to cope with the case where the bandwidth is not known beforehand, we propose a method that performs a sparse online estimation of the signal support in the (graph) frequency domain, which enables online adaptation of the graph sampling strategy. Finally, we apply the proposed method to build the power spatial density cartography of a given operational region in a cognitive network environment. “We apply the proposed distributed framework to power density cartography in cognitive radio (CR) networks. We consider a 5G scenario, where a dense deployment of radio access points (RAPs) is envisioned to provide a service environment characterized by very low latency and high rate access. Each RAP collects data related to the transmissions of primary users (PUs) at its geographical position, and communicates with other RAPs with the aim of implementing advanced cooperative sensing techniques.” “This paper represents the first work that merges the well established field of adaptation and learning over networks, and the emerging topic of signal processing over graphs. Several interesting problems are still open, e.g., distributed reconstruction in the presence of directed and/or switching graph topologies, online identification of the graph signal support from streaming data, distributed inference of the (possibly unknown) graph signal topology, adaptation of the sampling strategy to time-varying scenarios, optimization of the sampling probabilities, just to name a few. We plan to investigate on these exciting problems in our future works.”
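A rough sketch of the LMS idea in Di Lorenzo et al.: assuming a known bandwidth, project the error on the sampled vertices back onto the low-frequency subspace of the Laplacian at each step. Everything below (sizes, sampling rate, step size) is illustrative, not the paper's exact algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
N, F, mu = 20, 5, 0.5                        # nodes, bandwidth, LMS step size
A = (rng.random((N, N)) < 0.25).astype(float)
A = np.triu(A, 1); A = A + A.T               # random undirected graph
L = np.diag(A.sum(1)) - A
lam, Phi = np.linalg.eigh(L)
B = Phi[:, :F] @ Phi[:, :F].T                # projector onto the F lowest graph frequencies
x_true = B @ rng.normal(size=N)              # bandlimited ground-truth signal
D = np.diag((rng.random(N) < 0.6).astype(float))  # fixed vertex-sampling mask
x = np.zeros(N)
for _ in range(500):
    y = x_true + 0.05 * rng.normal(size=N)   # new noisy snapshot each step
    x = x + mu * B @ D @ (y - x)             # LMS update on sampled vertices only
print("final MSE:", float(np.mean((x - x_true) ** 2)))
```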
  21. Graph signal Processing #2. Kernel Regression for Signals over Graphs. Arun Venkitaraman, Saikat Chatterjee, Peter Händel (Submitted on 7 Jun 2017). https://arxiv.org/abs/1706.02191 Uncertainty Principles and Sparse Eigenvectors of Graphs. Arun Venkitaraman, Saikat Chatterjee, Peter Händel. IEEE Transactions on Signal Processing ( Volume: 65, Issue: 20, Oct. 15, 2017 ). https://doi.org/10.1109/TSP.2017.2731299 We propose kernel regression for signals over graphs. The optimal regression coefficients are learnt using a constraint that the target vector is a smooth signal over an underlying graph. The constraint is imposed using a graph-Laplacian based regularization. We discuss how the proposed kernel regression exhibits a smoothing effect, simultaneously achieving noise-reduction and graph-smoothness. We further extend the kernel regression to simultaneously learn the underlying graph and the regression coefficients. Our hypothesis was that incorporating the graph smoothness constraint would help kernel regression to perform better, particularly when we lack sufficient and reliable training data. Our experiments illustrate that this is indeed the case in practice. Through experiments we also conclude that graph signals carry sufficient information about the underlying graph structure which may be extracted in the regression setting even with moderately small number of samples in comparison with the graph dimension. Thus, our approach helps both predict and infer the underlying topology of the network or graph. When the graph has repeated eigenvalues we explained that a graph Fourier basis (GFB) is not unique, and the derived lower bound can have different values depending on the selected GFB. We provided a constructive method to find a GFB that yields the smallest uncertainty bound. In order to find the signals that achieve the derived lower bound we considered sparse eigenvectors of the graph. We showed that the graph Laplacian has a 2-sparse eigenvector if and only if there exists a pair of nodes with the same neighbors. When this happens, the uncertainty bound is very low and the 2-sparse eigenvectors achieve this bound. We presented examples of both classical and real-world graphs with 2-sparse eigenvectors. We also discussed that, in some examples, the neighborhood structure has a meaningful interpretation.
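The graph-Laplacian regularization used by Venkitaraman et al. penalizes the smoothness term x^T L x. Its simplest instance, Laplacian-regularized least squares, has a closed form; the sketch below shows only that penalty at work, not the paper's full kernel regression:

```python
import numpy as np

def laplacian_smooth(L, y, beta):
    """Minimize ||x - y||^2 + beta * x^T L x.
    Closed form from the normal equations: x = (I + beta L)^{-1} y."""
    return np.linalg.solve(np.eye(L.shape[0]) + beta * L, y)

A = np.array([[0,1,1,0],[1,0,1,0],[1,1,0,1],[0,0,1,0]], float)
L = np.diag(A.sum(1)) - A
y = np.array([1.0, 0.9, 1.1, -2.0])     # noisy signal with one outlier node
print(laplacian_smooth(L, y, beta=1.0)) # the outlier is pulled toward its neighbors
```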
  22. Graph signal Processing #3 Time-varying graphs. Kernel-Based Reconstruction of Space-Time Functions on Dynamic Graphs. Daniel Romero ; Vassilis N. Ioannidis ; Georgios B. Giannakis. IEEE Journal of Selected Topics in Signal Processing ( Volume: 11, Issue: 6, Sept. 2017 ). https://doi.org/10.1109/JSTSP.2017.2726976 Filtering Random Graph Processes Over Random Time-Varying Graphs. Kai Qiu ; Xianghui Mao ; Xinyue Shen ; Xiaohan Wang ; Tiejian Li ; Yuantao Gu. IEEE Journal of Selected Topics in Signal Processing ( Volume: 11, Issue: 6, Sept. 2017 ). https://doi.org/10.1109/JSTSP.2017.2726969 DSLR: distributed least squares reconstruction; LMS: least mean-squares; KKF: kernel Kalman filter; ECoG: electrocorticography; NMSE: cumulative normalized mean-square error. This paper investigated kernel-based reconstruction of space-time functions on graphs. The adopted approach relied on the construction of an extended graph, which regards the time dimension just as a spatial dimension. Several kernel designs were introduced together with a batch and an online function estimators. The latter is a kernel Kalman filter developed from a purely deterministic standpoint without any need to adopt any state-space model. Future research will deal with multi-kernel and distributed versions of the proposed algorithms. Schemes tailored for time-evolving functions on graphs include [Bach and Jordan 2004] and [Mei and Moura 2016], which predict the function values at time t given observations up to time t − 1. However, these schemes assume that the function of interest adheres to a specific vector autoregression and all vertices are observed at previous time instances. Moreover, [Bach and Jordan 2004] requires Gaussianity along with an ad hoc form of stationarity. However, many real-world graph signals are time-varying, and they evolve smoothly, so instead of the signals themselves being bandlimited or smooth on graph, it is more reasonable that their temporal differences are smooth on graph. In this paper, a new batch reconstruction method of time-varying graph signals is proposed by exploiting the smoothness of the temporal difference signals, and the uniqueness of the solution to the corresponding optimization problem is theoretically analyzed. Furthermore, driven by practical applications faced with real-time requirements, huge size of data, lack of computing center, or communication difficulties between two non-neighboring vertices, an online distributed method is proposed by applying local properties of the temporal difference operator and the graph Laplacian matrix. In the future, we will further study the applications of smoothness of temporal difference signals, and may combine it with other properties of signals, such as low rank. Besides, it is also interesting to consider the situation where both the signal and the graph are time-varying.
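A compact way to see the "smooth temporal differences" prior from the second paper above is to write the batch problem as a sparse linear system: fit the observed entries while penalizing (x_t − x_{t−1})^T L (x_t − x_{t−1}). The sketch below is our own formulation under that prior, with illustrative sizes and a tiny ridge for invertibility; it is not the paper's algorithm:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

rng = np.random.default_rng(1)
N, T, beta = 10, 6, 5.0                      # nodes, time steps, penalty weight
A = (rng.random((N, N)) < 0.4).astype(float)
A = np.triu(A, 1); A = A + A.T
L = np.diag(A.sum(1)) - A                    # combinatorial Laplacian
mask = rng.random((N, T)) < 0.5              # which (node, time) entries are observed
Y = rng.normal(size=(N, T)) * mask           # partial noisy observations
# vec(X) stacks the columns x_1..x_T, so the temporal-difference smoothness
# sum_t (x_t - x_{t-1})^T L (x_t - x_{t-1}) equals z^T (D^T D kron L) z.
D = sp.diags([-np.ones(T - 1), np.ones(T - 1)], [0, 1], shape=(T - 1, T))
H = sp.kron(D.T @ D, sp.csr_matrix(L))
S = sp.diags(mask.T.ravel().astype(float))   # selects the observed entries
ridge = 1e-8 * sp.eye(N * T)                 # tiny ridge keeps the system nonsingular
z = spsolve((S + beta * H + ridge).tocsr(), mask.T.ravel() * Y.T.ravel())
X = z.reshape(T, N).T                        # reconstructed node-by-time matrix
print("mean abs fit on observed entries:", float(np.abs((X - Y)[mask]).mean()))
```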
  23. Graph signal Processing #4 Time-varying graphs. Signal Processing on Graphs: Causal Modeling of Unstructured Data. Jonathan Mei, José M. F. Moura (Submitted on 28 Feb 2015 (v1), last revised 8 Feb 2017 (this version, v6)). https://arxiv.org/abs/1503.00173 Learning Directed Graph Shifts from High-Dimensional Time Series. Lukas Nagel (June 2017), Master's thesis, Institute of Telecommunications (TU Wien). https://pdfs.semanticscholar.org/8822/526b7b2862f6374f5f950c89a14a7a931820.pdf Many applications collect a large number of time series, for example, the financial data of companies quoted in a stock exchange, the health care data of all patients that visit the emergency room of a hospital, or the temperature sequences continuously measured by weather stations across the US. These data are often referred to as unstructured. A first task in its analytics is to derive a low dimensional representation, a graph or discrete manifold, that describes well the interrelations among the time series and their intrarelations across time. This paper presents a computationally tractable algorithm for estimating this graph that structures the data. The resulting graph is directed and weighted, possibly capturing causal relations, not just reciprocal correlations as in many existing approaches in the literature. A convergence analysis is carried out. The algorithm is demonstrated on random graph datasets and real network time series datasets, and its performance is compared to that of related methods. The adjacency matrices estimated with the new method are close to the true graph in the simulated data and consistent with prior physical knowledge in the real dataset tested. Frequency ordering depending on the position of the eigenvalues λ in C. Both graphics are from Sandryhaila and Moura 2014. Causal graph signal process. Visualization of the information spreading through graph shifts for P3(A, c). We want to apply the causal graph process estimation algorithm to stock prices and especially point out some additional points of failure we spotted. In the shift matrix shown in Figure 4.9a, we observe that the stocks number 2, 16 and 24 have many incoming connections. It appears unlikely that this is due to some economic relations and points towards a numerical problem. As we were interested in potential interpretations of the shift recovered from the stock data, we chose to visualize the largest possible directions of the shift shown in Figure 4.11 as a graph in Figure 4.12. The only observation we could draw from the graph is that there are multiple bank stocks, which affect multiple other stocks. Otherwise, the connected companies show no common ownership structure nor even similar or related products. The stocks example with no clear expectation did not lead to promising results. Despite this, we described with scaling and averaging two processing steps that could be applied before starting the estimation algorithm. It is unclear whether further tuning is needed or whether the domain of daily stock data cannot reasonably be modeled with causal graph processes, and we therefore leave this question open for future research.
  24. Graph Wavelet transform vs. GFT #1. Compression of dynamic 3D point clouds using subdivisional meshes and graph wavelet transforms. Aamir Anis ; Philip A. Chou ; Antonio Ortega. University of Southern California, Los Angeles, CA; † Microsoft Research, Redmond, WA. Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE. https://doi.org/10.1109/ICASSP.2016.7472901 The subdivisional structure also allows us to obtain a sequence of bipartite graphs that facilitate the use of GraphBior [Narang et al. (2012)] to compute the wavelet transform coefficients of the geometry and color attributes. Compact Support Biorthogonal Wavelet Filterbanks for Arbitrary Undirected Graphs. Sunil K. Narang, Antonio Ortega (Submitted on 30 Oct 2012 (v1), last revised 19 Nov 2012 (this version, v2)). https://arxiv.org/abs/1210.8129 In this paper, we provide a framework for compression of 3D point cloud sequences. Our approach involves representing sets of frames by a consistently-evolving high-resolution subdivisional triangular mesh. This representation helps us facilitate efficient implementations of motion estimation and graph wavelet transforms. The subdivisional structure plays a crucial role in designing a simple hierarchical method for efficiently estimating these meshes, and the application of Biorthogonal Graph Wavelet Filterbanks for compression. Preliminary experimental results show promising performances of both the estimation and the compression steps, and we believe this work shall open new avenues of research in this emerging field. Comparison of graph wavelet designs in terms of key properties: zero highpass response for constant graph-signal (DC), critical sampling (CS), perfect reconstruction (PR), compact support (Comp), orthogonal expansion (OE), requires graph simplification (GS). In this paper we have presented novel graph-wavelet filterbanks that provide a critically sampled representation with compactly supported basis functions. The filterbanks come in two flavors: a) nonzeroDC filterbanks, and b) zeroDC filterbanks. The former filterbanks are designed as polynomials of the normalized graph Laplacian matrix, and the latter filterbanks are extensions of the former to provide a zero response by the highpass operators. Preliminary results showed that the filterbanks are useful not only for arbitrary graphs but also for standard regular signal-processing domains. Extensions of this work will focus on the application of these filters to different scenarios, including, for example, social network analysis, sensor networks etc.
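GraphBior itself takes some machinery, but the underlying two-channel intuition can be shown with a naive spectral split: complementary lowpass/highpass kernels on the normalized-Laplacian spectrum that sum to one, giving exact reconstruction. This is our simplification for illustration, not the cited filterbank:

```python
import numpy as np

def spectral_filter(L_norm, x, kernel):
    """Apply a spectral kernel h(lambda) to a graph signal x."""
    lam, Phi = np.linalg.eigh(L_norm)
    return Phi @ (kernel(lam) * (Phi.T @ x))

A = np.array([[0,1,0,1],[1,0,1,0],[0,1,0,1],[1,0,1,0]], float)
d = A.sum(1)
L_norm = np.eye(4) - np.diag(d**-0.5) @ A @ np.diag(d**-0.5)  # spectrum in [0, 2]
x = np.random.randn(4)
low  = spectral_filter(L_norm, x, lambda lam: np.cos(np.pi * lam / 4) ** 2)
high = spectral_filter(L_norm, x, lambda lam: np.sin(np.pi * lam / 4) ** 2)
print(np.allclose(low + high, x))   # kernels sum to 1, so the split is lossless
```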
  25. Graph Wavelet transform vs. GFT #2. Bipartite Approximation for Graph Wavelet Signal Decomposition. Jin Zeng ; Gene Cheung ; Antonio Ortega. IEEE Transactions on Signal Processing ( Volume: 65, Issue: 20, Oct. 15, 2017 ). https://doi.org/10.1109/TSP.2017.2733489 Splines and Wavelets on Circulant Graphs. Madeleine S. Kotzagiannidis, Pier Luigi Dragotti (Submitted on 15 Mar 2016). https://arxiv.org/abs/1603.04917 (a) Two-channel wavelet filterbank on bipartite graph; (b) Kernels of H0, H1 in graphBior (Narang et al. 2012) with filter length of 19. Unlike previous works, our design of the two metrics relates directly to energy compaction for bipartite subgraph decomposition. Comparison with the state-of-the-art schemes validates our proposed metrics for energy compaction and illustrates the efficiency of our approach. We are currently working on different applications of graphBior with our bipartite approximation, e.g., graph-signal denoising, which will benefit from the energy compaction in the wavelet domain. In this paper, we have introduced novel families of wavelets and associated filterbanks on circulant graphs with vanishing moment properties, which reveal (e-)spline-like functions on graphs, and promote sparse multiscale representations. Moreover, we have discussed generalizations to arbitrary graphs in the form of a multidimensional wavelet analysis scheme based on graph product decomposition, facilitating a sparsity-promoting generalization with the advantage of lower-dimensional processing. In our future work, we wish to further explore the sets of graph signals which can be annihilated with existing and/or evolved graph wavelets as well as refine its extensions and relevance for arbitrary graphs.
  26. Graphlets: induced subgraphs of a large network. Estimation of Graphlet Statistics. Ryan A. Rossi, Rong Zhou, and Nesreen K. Ahmed (Submitted on 6 Jan 2017 (v1), last revised 28 Feb 2017 (this version, v2)). https://arxiv.org/abs/1701.01772
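For orientation, the two connected 3-node graphlets (triangles and open 2-paths) can still be counted exactly on small graphs; estimation frameworks like the one above exist because these counts explode on large networks. A small numpy sketch on a toy graph of our own:

```python
import numpy as np

A = np.zeros((5, 5))
for u, v in [(0,1),(0,2),(1,2),(1,3),(2,3),(3,4)]:
    A[u, v] = A[v, u] = 1.0
triangles = np.trace(A @ A @ A) / 6      # each triangle is counted 6 times
deg = A.sum(1)
wedges = (deg * (deg - 1) / 2).sum()     # all 2-paths centered at a node
open_paths = wedges - 3 * triangles      # each triangle closes 3 wedges
print(int(triangles), int(open_paths))   # -> 2 triangles, 4 open 2-paths
```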
  27. Graph Computing Accelerations. Parallel Local Algorithms for Core, Truss, and Nucleus Decompositions. Ahmet Erdem Sariyuce, C. Seshadhri, Ali Pinar. Sandia National Laboratories, University of California (Submitted on 2 Apr 2017). https://arxiv.org/abs/1704.00386 Finding the dense regions of a graph and relations among them is a fundamental task in network analysis. Nucleus decomposition is a principled framework of algorithms that generalizes the k-core and k-truss decompositions. It can leverage the higher-order structures to locate the dense subgraphs with hierarchical relations. … We present a framework of local algorithms to obtain the exact and approximate nucleus decompositions. Our algorithms are pleasingly parallel and can provide approximations to explore time and quality trade-offs. Our shared-memory implementation verifies the efficiency, scalability, and effectiveness of our algorithms on real-world networks. In particular, using 24 threads, we obtain up to 4.04x and 7.98x speedups for k-truss and (3, 4) nucleus decompositions.
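k-core decomposition, the simplest member of the core/truss/nucleus family generalized above, is available directly in networkx:

```python
import networkx as nx

G = nx.karate_club_graph()
core = nx.core_number(G)   # largest k such that the node belongs to a k-core
print("core numbers present:", sorted(set(core.values())))
print("nodes in the 4-core:", [v for v, k in core.items() if k >= 4])
```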
  28. P-Laplacian on graphs. p-Laplacian Regularized Sparse Coding for Human Activity Recognition. Weifeng Liu ; Zheng-Jun Zha ; Yanjiang Wang ; Ke Lu ; Dacheng Tao. IEEE Transactions on Industrial Electronics ( Volume: 63, Issue: 8, Aug. 2016 ). https://doi.org/10.1109/TIE.2016.2552147 On the game p-Laplacian on weighted graphs with applications in image processing and data clustering. A. ELMOATAZ, X. DESQUESNES and M. TOUTAIN (3 July 2017). European Journal of Applied Mathematics. https://doi.org/10.1017/S0956792517000122 In this paper, we have introduced a new class of normalized p-Laplacian operators as a discrete adaptation of the game-theoretic p-Laplacian on weighted graphs. This class is based on a new partial difference operator which interpolates between the normalized 2-Laplacian, 1-Laplacian and ∞-Laplacian on graphs. This operator is also connected to non-local average operators such as non-local mean, non-local median and non-local midrange. It generalizes the normalized p-Laplacian on graphs for 1 ≤ p ≤ ∞. We have shown the connections with local and non-local PDEs of p-Laplacian types and the stochastic game Tug-of-War with noise (Peres et al. 2008). We have proved existence and uniqueness of the Dirichlet problem involving operators of this new class. Finally, we have illustrated the interest and behaviour of such operators in some inverse problems in image processing and machine learning. The framework of human activity recognition. Firstly, we extract the representative features of human activity including SIFT, STIP and MFCC. Then we concatenate the histograms formed by bags of each feature. Thirdly, we learn the sparse codes of each sample and the corresponding dictionary simultaneously by the p-Laplacian regularized sparse coding algorithm. Finally, we input the learned sparse codes into classifiers, i.e., support vector machines, to conduct human activity recognition. As a sparse representation, the proposed p-Laplacian regularized sparse coding algorithm can also be employed for modern industry using data-based techniques [Jung et al. 2015; Shen et al. 2015] and other computer vision applications such as video summary and visual tracking [Bai and Li 2014; Yu et al. 2016]. In the future, we will apply the proposed p-Laplacian regularized sparse coding for more practical implementations. We will also study the extensions to the multiview learning and deep architecture construction for more attractive performance. Sparse coding has achieved promising performance in classification. The most prominent Laplacian regularized sparse coding employs Laplacian regularization to preserve the manifold structure; however, Laplacian regularization suffers from poor generalization. To tackle this problem, we present a p-Laplacian regularized sparse coding algorithm by introducing the nonlinear generalization of the standard graph Laplacian to exploit the local geometry. Compared to the conventional graph Laplacian, the p-Laplacian has a tighter isoperimetric inequality and the p-Laplacian regularized sparse coding can achieve superior theoretical evidence.
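Graph p-Laplacian conventions differ between the two papers above; one common discrete (anisotropic) form, with its p = 2 sanity check, as a numpy sketch of our own:

```python
import numpy as np

def graph_p_laplacian(W, f, p):
    """(Delta_p f)(u) = sum_v W[u,v] |f(v)-f(u)|^(p-2) (f(v)-f(u));
    for p = 2 this equals minus the combinatorial Laplacian applied to f."""
    diff = f[None, :] - f[:, None]            # diff[u, v] = f(v) - f(u)
    mag = np.abs(diff)
    with np.errstate(divide="ignore", invalid="ignore"):
        phi = np.where(mag > 0, mag ** (p - 2) * diff, 0.0)
    return (W * phi).sum(axis=1)

W = np.array([[0,1,1],[1,0,1],[1,1,0]], float)  # weighted triangle graph
f = np.array([0.0, 1.0, 3.0])
for p in (1.5, 2.0, 3.0):
    print(p, graph_p_laplacian(W, f, p))
```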
  29. “Applied Laplacian” Mesh processing #1A. Spectral Mesh Processing. H. Zhang, O. Van Kaick, R. Dyer. Computer Graphics Forum, 9 April 2010. http://dx.doi.org/10.1111/j.1467-8659.2010.01655.x
  30. Graph Framework for Manifold-valued Data image processing. Nonlocal Inpainting of Manifold-valued Data on Finite Weighted Graphs. Ronny Bergmann, Daniel Tenbrinck (Submitted on 21 Apr 2017 (v1), last revised 12 Jul 2017 (this version, v2)). https://arxiv.org/abs/1704.06424 Open source code: http://www.mathematik.uni-kl.de/imagepro/members/bergmann/mvirt/ A Graph Framework for Manifold-valued Data. Ronny Bergmann, Daniel Tenbrinck (Submitted on 17 Feb 2017). https://arxiv.org/abs/1702.05293 Recently, there has been a strong ambition to translate models and algorithms from traditional image processing to non-Euclidean domains, e.g., to manifold-valued data. While the task of denoising has been extensively studied in the last years, there was rarely an attempt to perform image inpainting on manifold-valued data. In this paper we present a nonlocal inpainting method for manifold-valued data given on a finite weighted graph. First numerical examples using a nonlocal graph construction with patch-based similarity measures demonstrate the capabilities and performance of the inpainting algorithm applied to manifold-valued images. Besides an analytic investigation of the convergence of the presented scheme, future work includes further development of numerical algorithms, as well as properties of the ∞-Laplacian for manifold-valued vertex functions on graphs. Illustration of the basic definitions and concepts on a Riemannian manifold M. In the following we present several examples illustrating the large variety of problems that can be tackled using the proposed manifold-valued graph framework. Furthermore, we compare our framework for the special case of nonlocal denoising of phase-valued data to a state-of-the-art method. Finally, we demonstrate a real-world application from denoising surface normals in digital elevation maps from LiDAR data. Subsequently, we model manifold-valued data measured on samples of an explicitly given surface and in particular illustrate denoising of diffusion tensors measured on a sphere. Finally, we investigate denoising of real DT-MRI data from medical applications both on a regular pixel grid as well as on an implicitly given surface. All algorithms were implemented in MathWorks Matlab by extending the open source software package Manifold-valued Image Restoration Toolbox (MVIRT). Reconstruction results of measured surface normals in digital elevation maps (DEM) generated by light detection and ranging (LiDAR) measurements of earth's surface topology. Reconstruction results of manifold-valued data given on the implicit surface of the open Camino brain data set.
31. 31. segmentation of graphs #1 Convex variational methods for multiclass data segmentation on graphs Egil Bae, Ekaterina Merkurjev (Submitted on 4 May 2016 (v1), last revised 16 Feb 2017 (this version, v4)) https://arxiv.org/abs/1605.01443 | https://doi.org/10.1007/s10851-017-0713-9 Theoretical Analysis of Active Contours on Graphs Christos Sakaridis, Kimon Drakopoulos, Petros Maragos (Submitted on 24 Oct 2016) https://arxiv.org/abs/1610.07381 Detection of a triangle on a random geometric graph. Edges are omitted for illustration purposes. (a) Original triangle on the graph. (b)–(f) Instances of active contour evolution at intervals of 60 iterations, with vertices in the contour's interior shown in red and the rest in blue. (g) Final detection result after 300 iterations, using green for true positives, blue for true negatives, red for false positives and black for false negatives. Experiments on 3D point clouds acquired by a LiDAR in outdoor scenes demonstrate that the scenes can be accurately segmented into object classes such as vegetation, the ground plane and regular structures. The experiments also demonstrate fast and highly accurate convergence of the algorithms, and show that the approximation difference between the convex and original problems vanishes or becomes extremely low in practice. In the future, it would be interesting to investigate region homogeneity terms for general unsupervised classification problems. In addition to avoiding the problem of trivial global minimizers, the region terms may improve the accuracy compared to models based primarily on boundary terms. Region homogeneity may, for instance, be defined in terms of the eigendecomposition of the covariance matrix or graph Laplacian.
32. 32. segmentation of graphs #2: Scalable Motif-aware Graph Clustering CE Tsourakakis, J Pachocki, Michael Mitzenmacher Harvard University, Cambridge, MA, USA WWW '17 Proceedings of the 26th International Conference on World Wide Web Pages 1451-1460 https://doi.org/10.1145/3038912.3052653 Coarsening Massive Influence Networks for Scalable Diffusion Analysis Naoto Ohsaka, Tomohiro Sonobe, Sumio Fujita, Ken-ichi Kawarabayashi SIGMOD '17 Proceedings of the 2017 ACM International Conference on Management of Data Pages 635-650 https://doi.org/10.1145/3035918.3064045 "Superpixelization"/clustering to speed up computations. Higher-order organization of complex networks Austin R. Benson, David F. Gleich, Jure Leskovec (Submitted on 26 Dec 2016) https://arxiv.org/abs/1612.08447 pre-print to Science → https://doi.org/10.1126/science.aad9029 Theoretical results in the supplementary materials also explain why classes of hypergraph partitioning methods are more general than previously assumed and how motif-based clustering provides a rigorous framework for the special case of partitioning directed graphs. Finally, the higher-order network clustering framework is generally applicable to a wide range of network types, including directed, undirected, weighted, and signed networks.
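The core trick behind motif-aware clustering is easy to state for the triangle motif on an undirected graph: reweight each edge by the number of triangles it closes, then run an ordinary cut/clustering algorithm on the reweighted graph. A minimal sketch follows (the function name and toy graph are mine; Benson et al. handle general directed motifs via their motif adjacency matrix):

```python
import numpy as np

def triangle_motif_adjacency(A):
    """For an undirected graph with 0/1 adjacency A, weight each edge by the
    number of triangles it participates in: W_ij = (A @ A)_ij * A_ij.
    Clustering W (e.g., with a spectral cut on its Laplacian) then minimizes
    triangles cut instead of edges cut."""
    A = (A + A.T > 0).astype(float)
    np.fill_diagonal(A, 0.0)
    return (A @ A) * A

# Two triangles joined by a single bridge edge (nodes 2-3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1
W = triangle_motif_adjacency(A)
print(W[2, 3])  # 0.0: the bridge closes no triangle, so a motif cut is free
print(W[0, 1])  # 1.0: edge (0,1) closes one triangle
```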
33. 33. Graph Summarization #1A Graph Summarization: A Survey Yike Liu, Abhilash Dighe, Tara Safavi, Danai Koutra (Submitted on 14 Dec 2016 (v1), last revised 12 Apr 2017 (this version, v2)) https://arxiv.org/abs/1612.04883 The abundance of generated data and its velocity call for data summarization, one of the main data mining tasks. … This survey focuses on summarizing interconnected data, otherwise known as graphs or networks. … In general, graph summarization (or coarsening, or aggregation) approaches seek to find a short representation of the input graph, often in the form of a summary or sparsified graph, which reveals patterns in the original data and preserves specific structural or other properties, depending on the application domain.
34. 34. Graph Summarization #1B Table I: Qualitative comparison of static graph summarization techniques. The first six columns describe the type of the input graph (e.g. with weighted/directed edges, and one/multiple types of node entities), followed by three algorithm-specific properties (i.e., user parameters, algorithmic complexity (linear in the number of edges or higher), and type of output). The last column gives the final purpose of each approach. Notation: (1) ∗ indicates that the algorithm can be extended to handle the corresponding type of input, but the authors do not provide details in the paper; for complexity, ∗ indicates sub-linear; (2) + means that at least one parameter can be set by the user, but it is not required (i.e., the algorithm provides a default value). - Liu et al. (2017)
35. 35. Point cloud resampling via graphs Fast Resampling of 3D Point Clouds via Graphs Siheng Chen; Dong Tian; Chen Feng; Anthony Vetro; Jelena Kovačević Acoustics, Speech and Signal Processing (ICASSP), 2017 IEEE https://doi.org/10.1109/ICASSP.2017.7952695 https://arxiv.org/abs/1702.06397 The proposed resampling strategy enhances the contours of a point cloud. Plots (a) and (b) resample 2% of the points from a 3D point cloud of a building containing 381,903 points. Plot (b) is more visually friendly than Plot (a). Note that the proposed resampling strategy is able to enhance any information depending on users' preferences.
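A hedged sketch of the graph-based resampling idea: score every point by a high-pass response on a k-nearest-neighbor graph (how much the point deviates from the average of its neighbors) and sample points with probability proportional to that score, so contours survive subsampling. `highpass_resample` and its parameters are my illustration, not the exact filter of Chen et al.:

```python
import numpy as np
from scipy.spatial import cKDTree

def highpass_resample(points, m, k=10, seed=0):
    """Keep m points, preferring those whose coordinates differ most from the
    average of their k nearest neighbors (a Haar-like high-pass response),
    so contours and edges are kept more densely than flat regions."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k + 1)        # idx[:, 0] is the point itself
    neighbor_mean = points[idx[:, 1:]].mean(axis=1)
    score = np.linalg.norm(points - neighbor_mean, axis=1)
    prob = score / score.sum()
    rng = np.random.default_rng(seed)
    keep = rng.choice(len(points), size=m, replace=False, p=prob)
    return points[keep]

pts = np.random.rand(2000, 3)
sub = highpass_resample(pts, m=40)   # resample 2% of the points
```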
36. 36. 2D Image Processing with graphs Directional graph weight prediction for image compression Francesco Verdoja; Marco Grangetto Acoustics, Speech and Signal Processing (ICASSP), 2017 IEEE https://doi.org/10.1109/ICASSP.2017.7952410 The experimental results showed that the proposed technique is able to improve compression efficiency; as an example we reported a Bjøntegaard Delta (BD) rate reduction of about 30% over JPEG. Future work will investigate the integration of the proposed method into more advanced image and video coding tools comprising adaptive block sizes and a richer set of intra prediction modes. Luminance coding in graph-based representation of multiview images Thomas Maugey; Yung-Hsuan Chao; Akshay Gadde; Antonio Ortega; Pascal Frossard Image Processing (ICIP), 2014 IEEE https://doi.org/10.1109/ICIP.2014.7025025 (a) Wavelet decomposition on graphs in GraphBior, where shapes {circle, triangle, square, and cross} denote coefficients in the LL, LH, HL, HH subbands. (b) Parent-children relationship: a P node in the LH band of level l + 1 has five children from two views in level l, marked in blue. (c) The procedure of finding the children nodes in level l for the parent node in level l + 1.
  37. 37. Background on GRAPH Deep learning Beyond the short introduction from the review above
38. 38. Graph structure known or not? GRAPH KNOWN: "Graph well defined, when the temperature measurement positions are known, and temperature measurement uncertainty is small" - Perraudin and Vandergheynst 2016. GRAPH "SEMI-KNOWN": In a way the structure is known, as we can quantify the graph signal as the number of citations with some journal-impact-factor weighting, but does this really represent the impact of an article? Scientists are known to game the system and to just respond to the metrics[*]. Are there alternative ways to improve the graph to better represent the impact of an article and the … GRAPH NOT KNOWN: A point cloud measured with a terrestrial laser scanner is an unordered point cloud given on non-grid x,y,z coordinates. It is not trivial to define how the points are connected to each other. Bibliometric network analysis by Nees Jan van Eck. [*] See e.g. Clauset, Aaron, Daniel B. Larremore, and Roberta Sinatra. "Data-driven predictions in the science of science." Science 355.6324 (2017): 477-480. DOI: 10.1126/science.aal4217; Sinatra, Roberta, et al. "Quantifying the evolution of individual scientific impact." Science 354.6312 (2016): aaf5239. DOI: 10.1126/science.aaf5239; Furlanello, Cesare, et al. "Towards a scientific blockchain framework for reproducible data analysis." arXiv preprint arXiv:1707.06552 (2017); the R-factor, with R standing for reputation, reproducibility, responsibility, and robustness, http://verumanalytics.io/ Overview of the segmentation method: (a) the initial LiDAR point cloud, (b) height raster image, (c) patches formed with adjacent cells of the same value, (d) hierarchized patches, (e) weighted graph, (f) graph partition, (g) partition result on the raster, (h) segmented point cloud. - Strimbu and Strimbu (2015) Graphics and Media Lab (GML) is a part of the Department of Computational Mathematics and Cybernetics of M.V. Lomonosov Moscow State University. http://graphics.cs.msu.ru/en/node/922 http://slideplayer.com/slide/8146222/
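When the graph is not known, as in the laser-scanning case above, the usual first move is to infer one. Below is a minimal sketch of the standard default: a k-nearest-neighbor graph with Gaussian edge weights over the x,y,z coordinates (the function name and the bandwidth heuristic are my assumptions):

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import csr_matrix

def knn_graph(points, k=8, sigma=None):
    """Infer a graph for unordered x,y,z points: connect each point to its
    k nearest neighbors and weight edges by a Gaussian kernel on distance."""
    n = len(points)
    dist, idx = cKDTree(points).query(points, k=k + 1)
    dist, idx = dist[:, 1:], idx[:, 1:]          # drop self-matches
    if sigma is None:
        sigma = dist.mean()                      # common heuristic bandwidth
    w = np.exp(-dist ** 2 / (2 * sigma ** 2))
    rows = np.repeat(np.arange(n), k)
    W = csr_matrix((w.ravel(), (rows, idx.ravel())), shape=(n, n))
    return W.maximum(W.T)                        # symmetrize

W = knn_graph(np.random.rand(500, 3))
```

The choice of k and the kernel bandwidth already bake modeling assumptions into everything downstream, which is exactly why the "graph not known" case is the hard one.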
39. 39. Convolutions for graphs #1 Deep Convolutional Networks on Graph-Structured Data Mikael Henaff, Joan Bruna, Yann LeCun (Submitted on 16 Jun 2015) https://arxiv.org/abs/1506.05163 However, as our results demonstrate, their extension poses significant challenges: • Although the learning complexity requires O(1) parameters per feature map, the evaluation, both forward and backward, requires a multiplication by the Graph Fourier Transform, which costs O(N²) operations. This is a major difference with respect to traditional ConvNets, which require only O(N). Fourier implementations of ConvNets bring the complexity to O(N log N) thanks again to the specific symmetries of the grid. An open question is whether one can find an approximate eigenbasis of general Graph Laplacians using Givens decompositions similar to those of the FFT. • Our experiments show that when the input graph structure is not known a priori, graph estimation is the statistical bottleneck of the model, requiring O(N²) for general graphs and O(MN) for M-dimensional graphs. Supervised graph estimation performs significantly better than unsupervised graph estimation based on low-order moments. Furthermore, we have verified that the architecture is quite sensitive to graph estimation errors. In the supervised setting, this step can be viewed in terms of a bootstrapping mechanism, where an initially unconstrained network is self-adjusted to become more localized and with weight sharing. • Finally, the statistical assumptions of stationarity and compositionality are not always verified. In those situations, the constraints imposed by the model risk reducing its capacity for no reason. One possibility for addressing this issue is to insert fully connected layers between the input and the spectral layers, such that data can be transformed into the appropriate statistical model. Another strategy, left for future work, is to relax the notion of weight sharing by introducing instead a commutation error ∥WiL − LWi∥ with the graph Laplacian, which puts a soft penalty on transformations that do not commute with the Laplacian, instead of imposing exact commutation as is the case in the spectral net. We explore two areas of application for which it has not been possible to apply convolutional networks before: text categorization and bioinformatics. Our results show that our method is capable of matching or outperforming large, fully-connected networks trained with dropout using fewer parameters. Our main contributions can be summarized as follows: ● We extend the ideas from Bruna et al. (2013) to large-scale classification problems, specifically ImageNet object recognition, text categorization and bioinformatics. ● We consider the most general setting where no prior information on the graph structure is available, and propose unsupervised and new supervised graph estimation strategies in combination with the supervised graph convolutions.
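The O(N²) cost discussed in the first bullet is easiest to see in code: spectral filtering needs two dense multiplications by the graph Fourier basis U. A minimal numpy sketch (the filter g and the toy 4-cycle are mine; real spectral nets learn g per feature map):

```python
import numpy as np

def graph_fourier_filter(W, x, g):
    """Spectral filtering x_out = U diag(g(lambda)) U^T x, where U, lambda come
    from the combinatorial Laplacian L = D - W. The two dense multiplies by U
    are the O(N^2) cost that motivates polynomial (ChebNet-style) filters."""
    L = np.diag(W.sum(1)) - W
    lam, U = np.linalg.eigh(L)       # graph Fourier basis
    x_hat = U.T @ x                  # forward graph Fourier transform
    return U @ (g(lam) * x_hat)      # filter in the spectral domain, invert

# Low-pass filtering on a 4-cycle graph.
W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
x = np.array([1.0, -1.0, 1.0, -1.0])            # highest-frequency signal
y = graph_fourier_filter(W, x, lambda lam: np.exp(-lam))   # strongly damped
```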
40. 40. Convolutions for graphs #2 Learning Convolutional Neural Networks for Graphs Mathias Niepert, Mohamed Ahmed, Konstantin Kutzkov; Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:2014-2023, 2016. http://proceedings.mlr.press/v48/niepert16.html A CNN with a receptive field of size 3x3. The field is moved over an image from left to right and top to bottom using a particular stride (here: 1) and zero-padding (here: none) (a). The values read by the receptive fields are transformed into a linear layer and fed to a convolutional architecture (b). The node sequence for which the receptive fields are created and the shapes of the receptive fields are fully determined by the hyper-parameters. An illustration of the proposed architecture. A node sequence is selected from a graph via a graph labeling procedure. For some nodes in the sequence, a local neighborhood graph is assembled and normalized. The normalized neighborhoods are used as receptive fields and combined with existing CNN components. The normalization is performed for each of the graphs induced on the neighborhood of a root node v (the red node; node colors indicate distance to the root node). A graph labeling is used to rank the nodes and to create the normalized receptive fields, one of size k (here: k = 9) for node attributes and one of size k × k for edge attributes. Normalization also includes cropping of excess nodes and padding with dummy nodes. Each vertex (edge) attribute corresponds to an input channel with the respective receptive field. Visualization of RBM features learned with 1-dimensional WL normalized receptive fields of size 9 for a torus (periodic lattice, top left), a preferential attachment graph (Barabási & Albert 1999, bottom left), a co-purchasing network of political books (top right), and a random graph (bottom right). Instances of these graphs with about 100 nodes are depicted on the left. A visual representation of the feature's weights (the darker a pixel, the stronger the corresponding weight) and 3 graphs sampled from the RBMs by setting all but the hidden node corresponding to the feature to zero. Yellow nodes have position 1 in the adjacency matrices. "Directions for future work include the use of alternative neural network architectures such as recurrent neural networks (RNNs); combining different receptive field sizes; pretraining with restricted Boltzmann machines (RBMs) and autoencoders; and statistical relational models based on the ideas of the approach."
41. 41. Convolutions for graphs #3 Geometric deep learning on graphs and manifolds using mixture model CNNs Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Jan Svoboda, Michael M. Bronstein (Submitted on 25 Nov 2016 (v1), last revised 6 Dec 2016 (this version, v3)) https://arxiv.org/abs/1611.08402 Left: intrinsic local polar coordinates ρ, θ on a manifold around a point marked in white. Right: patch operator weighting functions wi(ρ, θ) used in different generalizations of convolution on the manifold (hand-crafted in GCNN and ACNN, learned in MoNet). All kernels are L∞-normalized; red curves represent the 0.5 level set. Representation of images as graphs. Left: regular grid (the graph is fixed for all images). Right: graph of superpixel adjacency (different for each image). Vertices are shown as red circles, edges as red lines. Learning configuration used for the Cora and PubMed experiments. Predictions obtained applying MoNet to the Cora dataset. Marker fill color represents the predicted class; marker outline color represents the ground-truth class. In this paper, we propose a unified framework allowing to generalize CNN architectures to non-Euclidean domains (graphs and manifolds) and learn local, stationary, and compositional task-specific features. We show that various non-Euclidean CNN methods previously proposed in the literature can be considered as particular instances of our framework. We test the proposed method on standard tasks from the realms of image, graph and 3D shape analysis and show that it consistently outperforms previous approaches.
42. 42. Convolutions for graphs #4 Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering Michaël Defferrard, Xavier Bresson, Pierre Vandergheynst Advances in Neural Information Processing Systems 29 (NIPS 2016) https://arxiv.org/abs/1606.09375 https://github.com/mdeff/cnn_graph https://youtu.be/cIA_m7vwOVQ Architecture of a CNN on graphs and the four ingredients of a (graph) convolutional layer. It is however known that graph clustering is NP-hard [Bui and Jones, 1992] and that approximations must be used. While there exist many clustering techniques, e.g. the popular spectral clustering [von Luxburg, 2007], we are most interested in multilevel clustering algorithms where each level produces a coarser graph which corresponds to the data domain seen at a different resolution. Future works will investigate two directions. On one hand, we will enhance the proposed framework with newly developed tools in GSP. On the other hand, we will explore applications of this generic model to important fields where the data naturally lies on graphs, which may then incorporate external information about the structure of the data rather than artificially created graphs whose quality may vary, as seen in the experiments. Another natural future approach, pioneered in [Henaff et al. 2015], would be to alternate the learning of the CNN parameters and the graph.
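The "fast localized" part of this paper replaces the dense Fourier transform with a Chebyshev polynomial in the rescaled Laplacian, computed by the recurrence T_k = 2 L̃ T_{k−1} − T_{k−2}; only K sparse matrix-vector products are needed, O(K|E|), and the filter is K-hop localized. A hedged sketch with learnable coefficients theta (names and toy graph are mine):

```python
import numpy as np

def chebyshev_filter(L, x, theta, lmax=2.0):
    """Approximate spectral filtering as sum_k theta_k T_k(L_tilde) x via the
    Chebyshev recurrence T_k = 2 L_tilde T_{k-1} - T_{k-2}. Only K (sparse)
    matrix-vector products: O(K|E|) instead of the O(N^2) Fourier approach."""
    n = L.shape[0]
    L_tilde = (2.0 / lmax) * L - np.eye(n)   # rescale the spectrum to [-1, 1]
    t_prev, t_curr = x, L_tilde @ x          # T_0 x and T_1 x
    out = theta[0] * t_prev + theta[1] * t_curr
    for k in range(2, len(theta)):
        t_next = 2 * L_tilde @ t_curr - t_prev
        out = out + theta[k] * t_next
        t_prev, t_curr = t_curr, t_next
    return out

W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d = W.sum(1)
L = np.eye(4) - W / np.sqrt(np.outer(d, d))   # normalized Laplacian, lmax <= 2
y = chebyshev_filter(L, np.array([1.0, 0.0, 0.0, 0.0]), theta=[0.5, 0.3, 0.2])
```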
  43. 43. Convolutions for graphs #5 Top: Schematic illustration of a standard CNN where patches of w×h pixels are convolved with D×E filters to map the D dimensional input features to E dimensional output features. Middle: same, but representing the CNN parameters as a set of M = w×h weight matrices, each of size D×E. Each weight matrix is associated with a single relative position in the input patch. Bottom: our graph convolutional network, where each relative position in the input patch is associated in a soft manner to each of the M weight matrices using the function q(xi , xj ).
44. 44. Convolutions for graphs #6 CayleyNets: Graph Convolutional Neural Networks with Complex Rational Spectral Filters Ron Levie, Federico Monti, Xavier Bresson, Michael M. Bronstein (Submitted on 22 May 2017) https://arxiv.org/abs/1705.07664 The core ingredient of our model is a new class of parametric rational complex functions (Cayley polynomials) allowing to efficiently compute localized regular filters on graphs that specialize on frequency bands of interest. Our model scales linearly with the size of the input data for sparsely-connected graphs, can handle different constructions of Laplacian operators, and typically requires fewer parameters than previous models. Filters (spatial domain, top; spectral domain, bottom) learned by CayleyNet (left) and ChebNet (center, right) on the MNIST dataset. Cayley filters are able to realize larger supports for the same order r. Eigenvalues of the unnormalized Laplacian ∆u of the 15-communities graph, mapped onto the complex unit half-circle by means of the Cayley transform with spectral zoom values (left to right) h = 0.1, 1, and 10. The first 15 frequencies, carrying most of the information about the communities, are marked in red. Larger values of h zoom in (right) on the low-frequency band.
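The spectral zoom in the figure caption is the Cayley transform C(hλ) = (hλ − i)/(hλ + i) applied to the Laplacian eigenvalues. A tiny sketch (my code, using the transform as defined in the paper) shows how the zoom parameter h spreads the low-frequency end along the complex unit half-circle:

```python
import numpy as np

def cayley_transform(lam, h=1.0):
    """Cayley transform C(h*lam) = (h*lam - i) / (h*lam + i): maps the real
    Laplacian spectrum onto the complex unit half-circle. The spectral zoom
    h controls how much of the half-circle the low frequencies occupy."""
    return (h * lam - 1j) / (h * lam + 1j)

lam = np.linspace(0.0, 10.0, 5)          # stand-in for Laplacian eigenvalues
for h in (0.1, 1.0, 10.0):
    print(h, np.round(cayley_transform(lam, h), 2))
```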
45. 45. Convolutions for graphs #7 Graph Convolutional Matrix Completion Rianne van den Berg, Thomas N. Kipf, Max Welling (Submitted on 7 Jun 2017) https://arxiv.org/abs/1706.02263 Left: Rating matrix M with entries that correspond to user-item interactions (ratings between 1-5) or missing observations (0). Right: User-item interaction graph with bipartite structure. Edges correspond to interaction events; numbers on edges denote the rating a user has given to a particular item. The matrix completion task (i.e. predictions for unobserved interactions) can be cast as a link prediction problem and modeled using an end-to-end trainable graph auto-encoder. Schematic of a forward pass through the GC-MC model, which is comprised of a graph convolutional encoder [U, V] = f(X, M1, . . . , MR) that passes and transforms messages from user to item nodes, and vice versa, followed by a bilinear decoder model that predicts entries of the (reconstructed) rating matrix M = g(U, V), based on pairs of user and item embeddings. "Our model can be seen as a first step towards modeling recommender systems where the interaction data is integrated into other structured modalities, such as a social network or a knowledge graph. As a next step, it would be interesting to investigate how the differentiable message passing scheme of our encoder model can be extended to such structured data environments. We expect that further approximations, e.g. subsampling of local graph neighborhoods, will be necessary in order to keep requirements in terms of computation and memory in a feasible range."
46. 46. Convolutions for graphs #8 Graph Based Convolutional Neural Network Michael Edwards, Xianghua Xie (Submitted on 28 Sep 2016) https://arxiv.org/abs/1609.08965 Graph based Convolutional Neural Network components. The GCNN is designed from an architecture of graph convolution and pooling operator layers. Convolution layers generate O output feature maps dependent on the selected O for that layer. Graph pooling layers coarsen the current graph and graph signal based on the selected vertex reduction method. Two levels of graph pooling operation on a regular and an irregular grid with an MNIST signal. From left: Regular grid, AMG level 1, AMG level 2, Irregular grid, AMG level 1, AMG level 2. Feature maps formed by a feed-forward pass of the regular domain. From left: Original image, Convolution round 1, Pooling round 1, Convolution round 2, Pooling round 2. Feature maps formed by a feed-forward pass of the irregular domain. From left: Original image, Convolution round 1, Pooling round 1, Convolution round 2, Pooling round 2. This study proposes a novel method of performing deep convolutional learning on the irregular graph by coupling standard graph signal processing techniques and backpropagation-based neural network design. Convolutions are performed in the spectral domain of the graph Laplacian and allow for the learning of spatially localized features whilst handling the nontrivial irregular kernel design. Results are provided on both a regular and an irregular domain classification problem and show the ability to learn localized feature maps across multiple layers of a network. A graph pooling method is provided that agglomerates vertices in the spatial domain to reduce complexity and generalize the features learnt. GPU performance of the algorithm improves upon training and testing speed; however, further optimization is needed. Although the results on the regular grid are outperformed by the standard CNN architecture, this is understandable due to the direct use of a local kernel in the spatial domain. The major contribution over standard CNNs, the ability to function on the irregular graph, is not to be underestimated. Graph based CNN requires costly forward and inverse graph Fourier transforms, and this requires some work to enhance usability in the community. Ongoing study into graph construction and reduction techniques is required to encourage uptake by a wider range of problem domains.
47. 47. Convolutions for graphs #9 Generalizing CNNs for data structured on locations irregularly spaced out Jean-Charles Vialatte, Vincent Gripon, Grégoire Mercier (Submitted on 3 Jun 2016 (v1), last revised 4 Jul 2017 (this version, v3)) https://arxiv.org/abs/1606.01166 In this paper, we have defined a generalized convolution operator. This operator makes it possible to transport the CNN paradigm to irregular domains. It retains the properties of a regular convolution operator: namely, it is linear, supported locally, and uses the same kernel of weights for each local operation. The generalized convolution operator can then naturally be used in place of convolutional layers in a deep learning framework. Typically, the resulting model is well suited for input data that has an underlying graph structure. The definition of this operator is flexible enough to allow adapting its weight-allocation map to any input domain, so that, depending on the case, the distribution of the kernel weights can be done in a way that is natural for that domain. However, in some cases there is no single natural way but multiple acceptable methods to define the weight allocation. In future work, we plan to study these methods. We also plan to apply the generalized operator to unsupervised learning tasks.
48. 48. Convolutions for graphs #10 Robust Spatial Filtering with Graph Convolutional Neural Networks Felipe Petroski Such, Shagan Sah, Miguel Dominguez, Suhas Pillai, Chao Zhang, Andrew Michael, Nathan Cahill, Raymond Ptucha (Submitted on 2 Mar 2017 (v1), last revised 14 Jul 2017 (this version, v3)) https://arxiv.org/abs/1703.00792 https://github.com/fps7806/Graph-CNN Two types of graph datasets. Left: Homogeneous datasets. All samples in homogeneous graph data have identical graph structure, but different vertex values or "signals". Right: Heterogeneous graph samples. Heterogeneous graph samples can vary in the number of vertices, the structure of edge connections, and the vertex values. General vertex-edge domain Graph-CNN architecture. Convolution and pooling layers are cascaded into a deep network. FC are fully-connected layers for graph classification. V is the vertex set and A is the adjacency matrix that define a graph. Graph convolution and pooling setting. The convolution operation obtains a filtered representation of the graph after a multi-hop vertex filter. Likewise, the pooling layer obtains a compact representation of the graph.
  49. 49. Convolutions for graphs #11 A Generalization of Convolutional Neural Networks to Graph-Structured Data Yotam Hechtlinger, Purvasha Chakravarti, Jining Qin (Submitted on 26 Apr 2017) https://arxiv.org/abs/1704.08165 https://github.com/hechtlinger/graph_cnn Visualization of the graph convolution size 5. For a given node, the convolution is applied on the node and its 4 closest neighbors selected by the random walk. As the right figure demonstrates, the random walk can expand further into the graph to higher degree neighbors. The convolution weights are shared according to the neighbors’ closeness to the nodes and applied globally on all nodes. Visualization of a row of Q(k) on the graph generated over the 2-D grid at a node near the center, when connecting each node to its 8 adjacent neighbors. For k = 1, most of the weight is on the node, with smaller weights on the first order neighbors. This corresponds to a standard 3 × 3 convolution. As k increases the number of active neighbors also increases, providing greater weight to neighbors farther away, while still keeping the local information. We propose a generalization of convolutional neural networks from grid-structured data to graph-structured data, a problem that is being actively researched by our community. Our novel contribution is a convolution over a graph that can handle different graph structures as its input. The proposed convolution contains many sought-after attributes; it has a natural and intuitive interpretation, it can be transferred within different domains of knowledge, it is computationally efficient and it is effective. Furthermore, the convolution can be applied on standard regression or classification problems by learning the graph structure in the data, using the correlation matrix or other methods. Compared to a fully connected layer, the suggested convolution has significantly fewer parameters while providing stable convergence and comparable performance. Our experimental results on the Merck Molecular Activity data set and MNIST data demonstrate the potential of this approach. Convolutional Neural Networks have already revolutionized the fields of computer vision, speech recognition and language processing. We think an important step forward is to extend it to other problems which have an inherent graph structure.
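The neighbor selection described above can be sketched directly: form the random-walk transition matrix P = D⁻¹A, accumulate Q = P + P² + … + P^k, and take the top-weighted nodes as the convolution support for each node. The function name and toy graph below are mine, as a hedged illustration of the idea:

```python
import numpy as np

def random_walk_neighbors(A, node, size=5, k=3):
    """Select a convolution support for `node` as in random-walk graph CNNs:
    rank nodes by the k-step expected-visit weight Q = P^1 + ... + P^k,
    where P = D^{-1} A, and keep the `size` highest-ranked nodes."""
    P = A / A.sum(axis=1, keepdims=True)
    Q, Pk = np.zeros_like(P), np.eye(len(A))
    for _ in range(k):
        Pk = Pk @ P
        Q += Pk
    order = np.argsort(-Q[node])
    return order[:size]                      # the node's convolution support

A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
print(random_walk_neighbors(A, node=0, size=3))
```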
  50. 50. Autoencoders for graphs Variational Graph Auto-Encoders Thomas N. Kipf, Max Welling (Submitted on 21 Nov 2016) https://arxiv.org/abs/1611.07308 https://github.com/tkipf/gae → http://tkipf.github.io/graph-convolutional-networks/ Latent space of unsupervised VGAE model trained on Cora citation network dataset. Grey lines denote citation links. Colors denote document class (not provided during training). Future work will investigate better-suited prior distributions (instead of Gaussian here), more flexible generative models and the application of a stochastic gradient descent algorithm for improved scalability. Modeling Relational Data with Graph Convolutional Networks Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, Max Welling (Submitted on 17 Mar 2017 (v1), last revised 6 Jun 2017 (this version, v3)) https://arxiv.org/abs/1703.06103 In this work, we introduce relational GCNs (R-GCNs). R-GCNs are specifically designed to deal with highly multi-relational data, characteristic of realistic knowledge bases. Our entity classification model, similarly to Kipf and Welling [see left], uses softmax classifiers at each node in the graph. The classifiers take node representations supplied by an R-GCN and predict the labels. The model, including R-GCN parameters, is learned by optimizing the cross-entropy loss. Our link prediction model can be regarded as an autoencoder consisting of (1) an encoder: an R-GCN producing latent feature representations of entities, and (2) a decoder: a tensor factorization model exploiting these representations to predict labeled edges. Though in principle the decoder can rely on any type of factorization (or generally any scoring function), we use one of the simplest and most effective factorization methods: DistMult [ Yang et al. 2014]. (a) R-GCN per-layer update for a single graph node (in light red). Activations from neighboring nodes (dark blue) are gathered and then transformed for each relation type individually (for both in- and outgoing edges). The resulting representation is accumulated in a (normalized) sum and passed through an activation function (such as the ReLU). This per-node update can be computed in parallel with shared parameters across the whole graph. (b) Depiction of an R-GCN model for entity classification with a per-node loss function. (c) Link prediction model with an R-GCN encoder (interspersed with fully-connected/dense layers) and a DistMult decoder that takes pairs of hidden node representations and produces a score for every (potential) edge in the graph. The loss is evaluated per edge.
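Below is a minimal numpy sketch of the (non-variational) graph auto-encoder that VGAE builds on: two GCN propagation steps with Â = D̃^(−1/2)(A + I)D̃^(−1/2) as the encoder, and the inner-product decoder σ(ZZᵀ) for link prediction. Weights are random here rather than trained, and the variational reparameterization is deliberately omitted:

```python
import numpy as np

def normalize_adj(A):
    """Symmetric propagation matrix A_hat = D^{-1/2} (A + I) D^{-1/2}."""
    A_tilde = A + np.eye(len(A))
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(1))
    return A_tilde * np.outer(d_inv_sqrt, d_inv_sqrt)

def gae_forward(A, X, W0, W1):
    """Minimal (non-variational) graph auto-encoder:
    encoder  Z = A_hat ReLU(A_hat X W0) W1   (two GCN layers),
    decoder  A_rec = sigmoid(Z Z^T)          (inner-product link predictor)."""
    A_hat = normalize_adj(A)
    H = np.maximum(A_hat @ X @ W0, 0.0)
    Z = A_hat @ H @ W1
    return 1.0 / (1.0 + np.exp(-Z @ Z.T))

n, f, h, d = 6, 4, 8, 2
rng = np.random.default_rng(0)
A = (rng.random((n, n)) < 0.3).astype(float)
A = np.triu(A, 1); A += A.T                      # random symmetric adjacency
A_rec = gae_forward(A, rng.standard_normal((n, f)),
                    rng.standard_normal((f, h)) * 0.1,
                    rng.standard_normal((h, d)) * 0.1)
```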
51. 51. Representation Learning For graphs #1 Inductive Representation Learning on Large Graphs William L. Hamilton, Rex Ying, Jure Leskovec (Submitted on 7 Jun 2017) https://arxiv.org/abs/1706.02216 http://snap.stanford.edu/graphsage/ We propose a general framework, called GraphSAGE (SAmple and aggreGatE), for inductive node embedding. Unlike embedding approaches that are based on matrix factorization, we leverage node features (e.g., text attributes, node profile information, node degrees) in order to learn an embedding function that generalizes to unseen nodes. By incorporating node features in the learning algorithm, we simultaneously learn the topological structure of each node's neighborhood as well as the distribution of node features in the neighborhood. While we focus on feature-rich graphs (e.g., citation data with text attributes, biological data with functional/molecular markers), our approach can also make use of structural features that are present in all graphs (e.g., node degrees). Thus, our algorithm can also be applied to graphs without node features (e.g. point clouds with only the xyz-coordinates, without RGB texture, normals, etc.). Low-dimensional vector embeddings of nodes in large graphs have proved extremely useful as feature inputs for a wide variety of prediction and graph analysis tasks. The basic idea behind node embedding approaches is to use dimensionality reduction techniques to distill the high-dimensional information about a node's neighborhood into a dense vector embedding. These node embeddings can then be fed to downstream machine learning systems and aid in tasks such as node classification, clustering, and link prediction (e.g. LINE, see below). However, previous works have focused on embedding nodes from a single fixed graph, and many real-world applications require embeddings to be quickly generated for unseen nodes, or entirely new (sub)graphs. This inductive capability is essential for high-throughput, production machine learning systems, which operate on evolving graphs and constantly encounter unseen nodes (e.g., posts on Reddit, users and videos on YouTube). An inductive approach to generating node embeddings also facilitates generalization across graphs with the same form of features: for example, one could train an embedding generator on protein-protein interaction graphs derived from a model organism, and then easily produce node embeddings for data collected on new organisms using the trained model. LINE: Large-scale Information Network Embedding Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan, Qiaozhu Mei (Submitted on 12 Mar 2015) https://arxiv.org/abs/1503.03578 https://github.com/tangjianpku/LINE
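GraphSAGE's key operator is easy to sketch: sample a fixed number of neighbors, aggregate their features (here with the mean aggregator), and combine with the node's own features. Because the learned weights act on features rather than on one fixed graph, the same layer applies to unseen nodes. A hedged single-layer sketch; the names, ReLU choice and normalization follow the paper's spirit, not its exact code:

```python
import numpy as np

def sage_mean_layer(H, neighbors, W_self, W_neigh, sample=5, seed=0):
    """One GraphSAGE-style layer with a mean aggregator: for each node,
    average the features of (at most `sample`) sampled neighbors, then
    combine with the node's own features and apply ReLU."""
    rng = np.random.default_rng(seed)
    out = []
    for v, nbrs in enumerate(neighbors):
        nbrs = list(nbrs)
        if len(nbrs) > sample:
            nbrs = rng.choice(nbrs, size=sample, replace=False)
        agg = H[nbrs].mean(axis=0) if len(nbrs) else np.zeros(H.shape[1])
        out.append(np.maximum(H[v] @ W_self + agg @ W_neigh, 0.0))
    h = np.stack(out)
    return h / np.linalg.norm(h, axis=1, keepdims=True).clip(1e-12)

H = np.random.rand(4, 3)                       # input node features
neighbors = [[1, 2], [0], [0, 3], [2]]         # adjacency lists
H1 = sage_mean_layer(H, neighbors, np.random.rand(3, 8), np.random.rand(3, 8))
```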
  52. 52. Representation Learning For graphs #2 Skip-graph: Learning graph embeddings with an encoder-decoder model John Boaz Lee, Xiangnan Kong 04 Nov 2016 (modified: 11 Jan 2017) ICLR 2017 conference submission https://openreview.net/forum?id=BkSqjHqxg&noteId=BkSqjHqxg We introduced an unsupervised method, based on the encoder-decoder model, for generating feature representations for graph-structured data. The model was evaluated on the binary classification task on several real-world datasets. The method outperformed several state-of-the-art algorithms on the tested datasets. There are several interesting directions for future work. For instance, we can try training multiple encoders on random walks generated using very different neighborhood selection strategies. This may allow the different encoders to capture different properties in the graphs. We would also like to test the approach using different neural network architectures. Finally, it would be interesting to test the method on other types of heterogeneous information networks.
53. 53. Semi-supervised Learning For graphs Neural Graph Machines: Learning Neural Networks Using Graphs Thang D. Bui, Sujith Ravi, Vivek Ramavajjala University of Cambridge, United Kingdom; Google Research, Mountain View, CA, USA (Submitted on 14 Mar 2017) https://arxiv.org/abs/1703.04818 We have revisited graph-augmented training of neural networks and proposed Neural Graph Machines as a general framework for doing so. Its label-propagation-inspired objective function (for semi-supervised CNNs see e.g. Tarvainen and Valpola 2017) encourages the neural networks to make accurate node-level predictions, as in vanilla neural network training, and also constrains the networks to learn similar hidden representations for nodes connected by an edge in the graph. Importantly, the objective can be trained by stochastic gradient descent and scaled to large graphs. We validated the efficacy of the graph-augmented objective on various tasks including bloggers' interest, text category and semantic intent classification problems, using a wide range of neural network architectures (FFNNs, CNNs and LSTM RNNs). The experimental results demonstrated that graph-augmented training almost always helps to find better neural networks that outperform other techniques in predictive performance, or even much smaller networks that are faster and easier to train. Additionally, the node-level input features can be combined with graph features as inputs to the neural networks. We showed that a neural network that simply takes the adjacency matrix of a graph and produces node labels can perform better than a recently proposed two-stage approach using sophisticated graph embeddings and a linear classifier. Our framework also excels when the neural network is small, or when there is limited supervision available. While our objective can be applied to multiple graphs which come from different domains, we have not fully explored this aspect and leave it as future work. We expect the domain-specific networks can interact with the graphs to determine the importance of each domain/graph source in prediction. We also did not explore using graph regularisation for different hidden layers of the neural networks; we expect this is key for the multi-graph transfer setting (Yosinski et al., 2014). Another possible future extension is to use our objective on directed graphs, that is, to control the direction of influence between nodes during training.
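The graph-augmented objective has a simple shape: an ordinary supervised loss plus a penalty α Σ_(u,v)∈E w_uv d(h_u, h_v) that pulls the hidden representations of connected nodes together. A minimal numpy sketch; the function and variable names are mine, and the paper distinguishes labeled-labeled, labeled-unlabeled and unlabeled-unlabeled edges, which this sketch collapses into one term:

```python
import numpy as np

def neural_graph_loss(logits, labels, hidden, edges, alpha=0.1):
    """Graph-augmented training objective (sketch): supervised cross-entropy
    on labeled nodes plus a penalty pulling together the hidden
    representations of nodes joined by an edge."""
    # Supervised part: softmax cross-entropy over the labeled nodes.
    p = np.exp(logits - logits.max(1, keepdims=True))
    p /= p.sum(1, keepdims=True)
    sup = -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))
    # Graph part: weighted squared distance between connected hidden vectors.
    reg = sum(w * np.sum((hidden[u] - hidden[v]) ** 2) for u, v, w in edges)
    return sup + alpha * reg

logits = np.random.randn(4, 3)
hidden = np.random.randn(4, 8)
loss = neural_graph_loss(logits, np.array([0, 2, 1, 0]), hidden,
                         edges=[(0, 1, 1.0), (2, 3, 0.5)])
```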
54. 54. Recurrent Networks for graphs #1 Geometric Matrix Completion with Recurrent Multi-Graph Neural Networks Federico Monti, Michael M. Bronstein, Xavier Bresson (Submitted on 22 Apr 2017) https://arxiv.org/abs/1704.06803 Main contribution. In this work, we treat the matrix completion problem as deep learning on graph-structured data. We introduce a novel neural network architecture that is able to extract local stationary patterns from the high-dimensional spaces of users and items, and use these meaningful representations to infer the non-linear temporal diffusion mechanism of ratings. The spatial patterns are extracted by a new CNN architecture designed to work on multiple graphs. The temporal dynamics of the rating diffusion is produced by a Long Short-Term Memory (LSTM) recurrent neural network (RNN). To our knowledge, our work is the first application of graph-based deep learning to the matrix completion problem. Recurrent GCNN (RGCNN) architecture using the full matrix completion model and operating simultaneously on the rows and columns of the matrix X. The output of the Multi-Graph CNN (MGCNN) module is a q-dimensional feature vector for each element of the input matrix. The number of parameters to learn is O(1) and the learning complexity is O(mn). Separable Recurrent GCNN (sRGCNN) architecture using the factorized matrix completion model and operating separately on the rows and columns of the factors W, Hᵀ. The output of the GCNN module is a q-dimensional feature vector for each input row/column, respectively. The number of parameters to learn is O(1) and the learning complexity is O(m + n). Evolution of the matrix X(t) with our architecture using the full matrix completion model RGCNN (top) and the factorized matrix completion model sRGCNN (bottom). Numbers indicate the RMS error. Absolute value of the first 8 spectral filters learnt by our bidimensional convolution. On the left, the first filter with the reference axes associated to the row and column graph eigenvalues.
55. 55. Recurrent Networks for graphs #2 Learning From Graph Neighborhoods Using LSTMs Rakshit Agrawal, Luca de Alfaro, Vassilis Polychronopoulos (Submitted on 21 Nov 2016) https://arxiv.org/abs/1611.06882 https://sites.google.com/view/ml-on-structures → https://github.com/ML-on-structures/blockchain-lstm → Bitcoin blockchain data used in the paper. "The approach is based on a multi-level architecture built from Long Short-Term Memory neural nets (LSTMs); the LSTMs learn how to summarize the neighborhood from data. We demonstrate the effectiveness of the proposed technique on a synthetic example and on real-world data related to crowdsourced grading, Bitcoin transactions, and Wikipedia edit reversions." The blockchain is the public immutable distributed ledger where Bitcoin transactions are recorded [20]. In Bitcoin, coins are held by addresses, which are hash values; these address identifiers are used by their owners to anonymously hold bitcoins, with ownership provable with public key cryptography. A Bitcoin transaction involves a set of source addresses and a set of destination addresses: all coins in the source addresses are gathered, and they are then sent in various amounts to the destination addresses. Mining data on the blockchain is challenging [Meiklejohn et al. 2013] due to the anonymity of addresses. We use data from the blockchain to predict whether an address will spend the funds that were deposited to it. We obtain a dataset of addresses by using a slice of the blockchain. In particular, we consider all the addresses where deposits happened in a short range of 101 blocks, from 200,000 to 200,100 (included). They contain 15,709 unique addresses where deposits took place. Looking at the state of the blockchain after 50,000 blocks (which corresponds to roughly one year later, as each block is mined on average every 10 minutes), 3,717 of those addresses still had funds sitting: we call these "hoarding addresses". The goal is to predict which addresses are hoarding addresses, and which spent the funds. We randomly split the 15,709 addresses into a training set of 10,000 and a validation set of 5,709 addresses. We built a graph with addresses as nodes, and transactions as edges. Each edge was labeled with features of the transaction: its time, amount of funds transmitted, number of recipients, and so forth, for a total of 9 features. We compared two different algorithms: ● Baseline: an informed guess; it guesses a label with a probability equal to its percentage in the training set. ● MLSL of depths 1, 2, 3. The output and memory sizes of the learners for the reported results are K2 = K3 = 3. Increasing these to 5 maintained virtually the same performance while increasing training time. Using only 1 output and memory cell did not provide any gain in performance. Quantitative Analysis of the Full Bitcoin Transaction Graph Dorit Ron, Adi Shamir Financial Cryptography 2012 http://doi.org/10.1007/978-3-642-39884-1_2
  56. 56. Time-series analysis with graphs #1 Spectral Algorithms for Temporal Graph Cuts Arlei Silva, Ambuj Singh, Ananthram Swami (Submitted on 15 Feb 2017) https://arxiv.org/abs/1702.04746 We propose novel formulations and algorithms for computing temporal cuts using spectral graph theory, multiplex graphs, divide-and-conquer and low-rank matrix approximation. Furthermore, we extend our formulation to dynamic graph signals, where cuts also capture node values, as graph wavelets. Experiments show that our solutions are accurate and scalable, enabling the discovery of dynamic communities and the analysis of dynamic graph processes. This work opens several lines for future investigation: (i) temporal cuts, as a general framework for solving problems involving dynamic data, can be applied in many scenarios, we are particularly interested to see how our method performs in computer vision tasks; (ii) Perturbation Theory can provide deeper theoretical insights into the properties of temporal cuts [Sole-Ribalta et al. 2013; Taylor et al. 2015] ; finally, (iii) we want to study Cheeger inequalities [Chung 1996] for temporal cuts, as means to better understand the performance of our algorithms. Temporal graph cut for a primary school network. The cut, represented as node colors, reflects the network dynamics, capturing major changes in the children’s interactions.
  57. 57. Active learning on Graphs Active Learning for Graph Embedding Hongyun Cai, Vincent W. Zheng, Kevin Chen-Chuan Chang (Submitted on 15 May 2017) https://arxiv.org/abs/1705.05085 https://github.com/vwz/AGE In this paper, we proposed a novel active learning framework for graph embedding named Active Graph Embedding (AGE). Unlike the traditional active learning algorithms, AGE processes the data with structural information and learnt representations (node embeddings), and it is carefully designed to address the challenges brought by these two characteristics. First, to exploit the graphical information, a graphical centrality based measurement is considered in addition to the popular information entropy based and information density based query criteria. Second, the active learning and graph embedding process are jointly run together by posing the label query at the end of every epoch of the graph embedding training process. Moreover, the time-sensitive weights are put on the three active learning query criteria which focus on the graphical centrality at the beginning and shift the focus to the other two embedding based criteria as the training process progresses (i.e., more accurate embeddings are learnt).
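A hedged sketch of the kind of query score AGE combines: prediction entropy, embedding density and graph centrality, mixed with time-varying weights that start centrality-heavy and shift toward the embedding-based criteria as training progresses. The density proxy and the weighting schedule here are my simplifications of the paper's criteria, not its exact formulas:

```python
import numpy as np

def age_scores(probs, emb, centrality, t, t_max):
    """Active-learning query score (sketch): a time-weighted mix of
    prediction entropy, embedding-space density, and graph centrality.
    Early in training the centrality term dominates; later the two
    embedding-based terms take over as embeddings become reliable."""
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    center = emb.mean(axis=0)               # crude density proxy: distance
    density = 1.0 / (1.0 + np.linalg.norm(emb - center, axis=1))
    gamma = t / t_max                       # shifts the focus over time
    w_cent, w_emb = 1.0 - gamma, gamma / 2.0
    score = w_emb * entropy + w_emb * density + w_cent * centrality
    return np.argsort(-score)               # nodes to query first

probs = np.random.dirichlet(np.ones(3), size=5)   # predicted class probs
order = age_scores(probs, np.random.rand(5, 16),
                   centrality=np.random.rand(5), t=2, t_max=10)
```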
58. 58. Transfer learning on Graphs Intrinsic Geometric Information Transfer Learning on Multiple Graph-Structured Datasets Jaekoo Lee, Hyunjae Kim, Jongsun Lee, Sungroh Yoon (Submitted on 15 Nov 2016 (v1), last revised 5 Dec 2016 (this version, v2)) https://arxiv.org/abs/1611.04687 A conventional CNN works on a regular grid domain (top); the proposed transfer learning framework for CNNs can transfer intrinsic geometric information obtained from a source graph domain to a target graph domain (bottom). Overview of the proposed method. Conclusion: We have proposed a new transfer learning framework for deep learning on graph-structured data. Our approach can transfer the intrinsic geometric information learned from the graph representation of the source domain to the target domain. We observed that knowledge transfer between task domains is most effective when the source and target domains possess high similarity in their graph representations. We anticipate that adoption of our methodology will help extend the territory of deep learning to data in non-grid structure as well as to cases with limited quantity and quality of data. To demonstrate this, we plan to apply our approach to diverse datasets in different domains.
59. 59. Transfer learning on Graphs #2 Deep Feature Learning for Graphs Ryan A. Rossi, Rong Zhou, Nesreen K. Ahmed (Submitted on 28 Apr 2017) https://arxiv.org/abs/1704.08829 This paper presents a general graph representation learning framework called DeepGL for learning deep node and edge representations from large (attributed) graphs. In particular, DeepGL begins by deriving a set of base features (e.g., graphlet features) and automatically learns a multi-layered hierarchical graph representation where each successive layer leverages the output from the previous layer to learn features of a higher order. Contrary to previous work, DeepGL learns relational functions (each representing a feature) that generalize across networks and are therefore useful for graph-based transfer learning tasks. Moreover, DeepGL naturally supports attributed graphs, learns interpretable features, and is space-efficient (by learning sparse feature vectors). Thus, features learned by DeepGL are interpretable and naturally generalize to across-network transfer learning tasks, as they can be derived on any arbitrary graph. The framework is flexible, with many interchangeable components; it is expressive, interpretable, parallel, and both space- and time-efficient for large graphs, with a runtime that is linear in the number of edges. DeepGL has all the following desired properties: ● Effective for attributed graphs and across-network transfer learning tasks ● Space-efficient, requiring up to 6× less memory ● Fast, with up to 182× speedup in runtime ● Accurate, with a mean improvement of 20% or more on many applications ● Parallel, with strong scaling results.
60. 60. Learning Graphs learning the graph itself #1 Learning Graph While Training: An Evolving Graph Convolutional Neural Network Ruoyu Li, Junzhou Huang (Submitted on 10 Aug 2017) https://arxiv.org/abs/1708.04675 "In this paper, we propose a more general and flexible graph convolution network (EGCN) fed by a batch of arbitrarily shaped data together with their evolving graph Laplacians trained in a supervised fashion. Extensive experiments have been conducted to demonstrate the superior performance in terms of both the acceleration of parameter fitting and the significantly improved prediction accuracy on multiple graph-structured datasets." In this paper, we explore our approach primarily on chemical molecular datasets, although the network can be straightforwardly trained on other graph-structured data, such as point clouds, social networks and so on. Our contributions can be summarized as follows, with a sketch of the Laplacian-learning idea after this list: ● A novel spectral graph convolution layer boosted by Laplacian learning (SGC-LL) has been proposed to dynamically update the residual graph Laplacians via metric learning for deep graph learning. ● Re-parametrization on the feature domain has been introduced in K-hop spectral graph convolution to enable our proposed deep graph learning and to grant graph CNNs a capability of feature extraction on graph data similar to that of classical CNNs on grid data. ● An evolving graph convolution network (EGCN) has been designed to be fed by a batch of arbitrarily shaped graph-structured data. The network is able to construct and learn, for each data sample, the graph structure that best serves the prediction part of the network. Extensive experimental results indicate the benefits from the evolving graph structure of the data.
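A minimal sketch of the Laplacian-learning ingredient referenced in the list above: a trainable matrix defines a Mahalanobis distance between node features, a Gaussian kernel on that distance defines the adjacency, and the resulting graph Laplacian evolves as the metric is trained. The names and the plain combinatorial Laplacian here are my assumptions, not the paper's exact construction:

```python
import numpy as np

def learned_laplacian(X, M_half, sigma=1.0):
    """Metric-learned graph (sketch): M = M_half^T M_half defines a
    Mahalanobis distance between node features; a Gaussian kernel on that
    distance yields an adjacency, hence a Laplacian that changes as
    M_half is trained."""
    Z = X @ M_half.T                              # project into learned metric
    d2 = np.sum((Z[:, None] - Z[None, :]) ** 2, axis=-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(1)) - W                  # combinatorial Laplacian

X = np.random.rand(6, 4)                          # node features
L = learned_laplacian(X, M_half=np.random.rand(4, 4))
```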
61. 61. Graph structure as the "signal" for prediction DeepGraph: Graph Structure Predicts Network Growth Cheng Li, Xiaoxiao Guo, Qiaozhu Mei (Submitted on 20 Oct 2016) https://arxiv.org/abs/1610.06251 "Extensive experiments on five large collections of real-world networks demonstrate that the proposed prediction model significantly improves the effectiveness of existing methods, including linear or nonlinear regressors that use hand-crafted features, graph kernels, and competing deep learning methods." Graph descriptor vs. adjacency matrix. We have described the process of converting an adjacency matrix into our graph descriptor, which is then passed through a deep neural network for further feature extraction. All computation in this process serves to obtain a more effective low-level representation of the topological structure information than the original adjacency matrix. First, isomorphic graphs can be represented by many different adjacency matrices, while our graph descriptor provides a unique representation for those isomorphic graphs. The unique representation simplifies the neural network structures for network growth prediction. Second, our graph descriptor provides similar representations for graphs with similar structures. The similarity of graphs is less well preserved in the adjacency matrix representation. Such information loss could place a great burden on deep neural networks in growth prediction tasks. Third, our graph descriptor is a universal graph structure representation which does not depend on vertex ordering or the number of vertices, while the adjacency matrix is not. The motivation for adopting the Heat Kernel Signature (HKS) is its theoretically proven properties for representing graphs: HKS is an intrinsic and informative representation for graphs [31]. Intrinsicness means that isomorphic graphs map to the same HKS representation, and informativeness means that if two graphs have the same HKS representation, then they must be isomorphic graphs. A meaningful future direction is to integrate network structure with other types of information, such as the content of information cascades in the network. A joint representation of multi-modal information may maximize the performance of particular prediction tasks.

For non-grid 3D data like point clouds and meshes, and for inherently graph-based data. Inherently graph-based data include, for example, brain connectivity analysis, scientific article citation networks, and (social) network analysis. Alternative download link: https://www.dropbox.com/s/2o3cofcd6d6e2qt/geometricGraph_deepLearning.pdf?dl=0
