Graph Day 2017 Spring Boot

Tinkering the Graph
Karthik Karuppaiya, Ten-X
Chris Pounds, Expero
GraphDay San Francisco 06-17-2017

Agenda
1. Company Overview
2. Problem Overview
3. High Level Architecture
4. Data Infrastructure Details
5. API Implementation Details
6. Live Demo
7. Conclusion

About Us
Karthik Karuppaiya
Sr. Engineering Manager, Data and Analytics
Ten-X
@karthikkrk
https://www.linkedin.com/in/karthikkrk/
TEN-X IS HIRING!

About Us
Chris Pounds
Sr. Developer
Expero Inc
@chrislbs_
https://github.com/chrislbs
https://www.linkedin.com/in/chrislbs/

© 2017 Expero, Inc. All Rights Reserved
WHAT WE DO
We bring challenging product ideas to reality.
• User Experience for Complex Domains
• Software Modernization
• Assessments: Technology / UX / Strategy
• User Research
• Graph Data Modeling & Visualization
• Product Innovation & Discovery
• Training & Bootcamps

OPEN SOURCE & LICENSED TECHNOLOGIES
Behind every great product is a great team. Let’s build something great together.

EXPERO INSIGHTS
GRAPH DATA
ACCELERATORS & TOOLKITS

Data Set Overview
• Commercial Real Estate Data
• Both Internal and External Datasets - Quality of the data varies significantly
• Different types of Data: Transactions, Entities, Individuals, Assets
• Relationships are inherently hidden due to the nature of the business

Problem Overview
• Find the hidden relationships between different entities in Commercial Real Estate Data
• Help Business easily analyze the data and gain insights into hidden relationships
• Expose Data through easy to consume APIs
• Both Data Storage and Data Queries need to scale - start small and add more datasets and users
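
To make "hidden relationships" concrete, here is a minimal sketch in plain Java, assuming records are modeled as labeled nodes and each known association as an undirected edge. The class name, node labels, and sample data are hypothetical, and the in-memory map only stands in for a graph database. A breadth-first search returns the shortest chain of associations connecting two records that share no direct link.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory graph; names and data are illustrative only. */
class RelationshipGraph {
    private final Map<String, Set<String>> adjacency = new HashMap<>();

    /** Records an undirected association between two records. */
    void link(String a, String b) {
        adjacency.computeIfAbsent(a, k -> new LinkedHashSet<>()).add(b);
        adjacency.computeIfAbsent(b, k -> new LinkedHashSet<>()).add(a);
    }

    /** Breadth-first search: shortest chain from start to goal, or an empty list if none. */
    List<String> shortestPath(String start, String goal) {
        if (!adjacency.containsKey(start) || !adjacency.containsKey(goal)) {
            return Collections.emptyList();
        }
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        parent.put(start, null); // the start node has no predecessor
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (current.equals(goal)) {
                // Walk the parent links back to the start, then reverse.
                List<String> path = new ArrayList<>();
                for (String n = goal; n != null; n = parent.get(n)) {
                    path.add(n);
                }
                Collections.reverse(path);
                return path;
            }
            for (String next : adjacency.get(current)) {
                if (!parent.containsKey(next)) {
                    parent.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        RelationshipGraph graph = new RelationshipGraph();
        graph.link("Individual:Alice", "Entity:Acme LLC");
        graph.link("Entity:Acme LLC", "Transaction:T-1");
        graph.link("Transaction:T-1", "Asset:100 Main St");
        System.out.println(graph.shortestPath("Individual:Alice", "Asset:100 Main St"));
    }
}
```

In this toy data the individual and the asset are never linked directly; the connection only appears by walking through the entity and the transaction, which is the kind of question a graph model is meant to answer.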

Platform Design Goals
• Private Data Center
• All Open Source Tools
• Ability to Iterate faster
• Multi-tenant – one platform for all lines of business and all teams
• Easily scalable
• Keep it Simple, Stupid

Technologies That Power the Platform
• Data Storage
• Graph Database
• Search Index
• Containerization
• API/Application Deployment
• Java Based API Framework

JanusGraph
• Forked from Titan DB
• Support for multiple persistence engines
• Integration with geospatial and text search (Elasticsearch)
• Implements Apache TinkerPop Gremlin Server
• Support For Apache TinkerPop Gremlin Language
• Open Source Apache 2.0 Licensing

Cassandra
• Elastic and Linear Scalability with Data Growth
• Resiliency to hardware failures
• Replication Across Data Centers
• OLTP and OLAP
• Open Source Apache 2.0 Licensing

Infrastructure Configuration
• 3 Node Cluster
• Cassandra and JanusGraph servers are co-hosted
• 1 TB SSDs on each Node for a total of 3 TB
• 16 cores
• 128 GB total Memory - Ability to scale vertically as we grow
• RHEL 7.3
• Standalone Elasticsearch cluster for text indexing

Infrastructure Deployment Configuration
• Use Ansible for JanusGraph and Cassandra deployment
• Configuration Expressed as YAML
• Declarative Representation of Deployment
• Agentless, only requires Python 2.7 on host
• Simple for small teams
• Consistently reproduce the environment

API Layer
• Create read-only APIs that serve the data to the applications
• Use Spring Boot for API Development
• Use Docker and Mesos for scalable API layer
• Publish feedback information to Kafka, which then goes through the pipeline again
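
As a rough illustration of the read-only idea, the sketch below uses the JDK's built-in `com.sun.net.httpserver.HttpServer` rather than Spring Boot, purely so it runs without extra dependencies; it is not the API from the talk. The `/entities` path, the port, and the JSON payload are all hypothetical. The only point it makes is that the endpoint answers GET and rejects every other method with 405.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

/** Hypothetical read-only endpoint; path and payload are illustrative only. */
class ReadOnlyApiSketch {

    /** Read-only rule: GET is served, every other method is rejected. */
    static int statusFor(String method) {
        return "GET".equals(method) ? 200 : 405;
    }

    static void handleEntities(HttpExchange exchange) throws IOException {
        int status = statusFor(exchange.getRequestMethod());
        if (status != 200) {
            exchange.getResponseHeaders().add("Allow", "GET");
            exchange.sendResponseHeaders(status, -1); // -1 means no response body
            exchange.close();
            return;
        }
        // Placeholder payload; a real service would query the graph here.
        byte[] body = "[{\"id\":1,\"type\":\"Entity\"}]".getBytes(StandardCharsets.UTF_8);
        exchange.getResponseHeaders().add("Content-Type", "application/json");
        exchange.sendResponseHeaders(status, body.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(body);
        }
    }

    /** Starts the server on the given port (0 picks any free port). */
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/entities", ReadOnlyApiSketch::handleEntities);
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start(8080); // hypothetical port
        System.out.println("Listening on port " + server.getAddress().getPort());
    }
}
```

In a Spring Boot service the same rule would normally fall out of only declaring GET mappings on the controller, so no explicit method check would be needed.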

Spring Boot
• Embedded HTTP Server
• Simple Application Configuration
• Easy to design REST endpoints
• Ease of Testing
• Java based (Apache TinkerPop Gremlin Language Variant)

Docker
• Simplifies Integration Testing
• Application deployed as a generic unit (container)
• Configuration provided through environment variables
• Easy to set up developer environment
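
Configuration through environment variables can be sketched in a few lines of Java. The variable names `GRAPH_HOST` and `GRAPH_PORT` and the class name are assumptions made for illustration; the fallback port 8182 is the Gremlin Server default. Taking the environment as a map keeps the parsing testable without touching the real process environment.

```java
import java.util.Map;

/** Hypothetical settings holder; variable names are illustrative only. */
class ApiConfig {
    final String graphHost;
    final int graphPort;

    ApiConfig(String graphHost, int graphPort) {
        this.graphHost = graphHost;
        this.graphPort = graphPort;
    }

    /** Builds the configuration from an environment-style map, falling back to defaults. */
    static ApiConfig fromEnv(Map<String, String> env) {
        String host = env.getOrDefault("GRAPH_HOST", "localhost");
        int port = Integer.parseInt(env.getOrDefault("GRAPH_PORT", "8182"));
        return new ApiConfig(host, port);
    }

    public static void main(String[] args) {
        // In a container, these values would come from the variables set at launch.
        ApiConfig config = fromEnv(System.getenv());
        System.out.println(config.graphHost + ":" + config.graphPort);
    }
}
```

The same image can then run unchanged on a developer laptop and in the cluster, with only the injected variables differing.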

Mesos/Marathon
• Scalability: Easy to scale as the load increases. Supports both horizontal and vertical scaling on a cluster of machines.
• Resiliency: If a container dies, Mesos will act on it as necessary and spawn a new container.
• Multi-Tenancy: Easy to control how resources are used. Prioritize jobs' access to limited resources.
• Service Discovery and Load Balancing: Easy to load balance and allow services to be discovered automatically.
• Health Check: Out of the box health checks.

Concluding Thoughts
• Start Simple - Do not over engineer
• Build a solid CI/CD pipeline - makes development faster
• Ensure developers can work in parallel
• Sample Demo code available here:
• https://github.com/experoinc/spring-boot-graph-day
• https://github.com/experoinc/dropwizard-tinkerpop

Thank you!
Q & A

Karthik Karuppaiya
@karthikkrk
https://www.linkedin.com/in/karthikkrk/
kkaruppaiya@ten-x.com
Ten-X IS HIRING!

Chris Pounds
https://github.com/chrislbs
https://www.linkedin.com/in/chrislbs/
chris.pounds@experoinc.com
