1. GRAU Data Space 2.0 – The Secure Communication Platform for Businesses and Organizations
YOUR DATA. YOUR CONTROL
7 Dec 2013
2. Architectural Overview
● The GDS is based on a very robust core that has been available for years
● The architecture scales from SMB (<100 users) to large enterprises and service providers (>100,000 users)
● The key features for scalability are:
– Separation between data and metadata (optional)
– Transactional, scalable storage backend
– Versioning of all file objects (UUID)
– Chunking of large objects (chunk size can be different for each object)
– Hashing of chunked objects (offloading to object store is possible)
– Chunk-level deduplication based on hash (under development)
– Bidirectional master/master replication of all data and metadata at folder level
– Session director allows redirection of sessions to another node
– RESTful APIs
– CMIS (getContentChanges)
– Distributable in-memory cache for metadata
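The chunking, hashing and deduplication items above can be illustrated with a minimal Python sketch. This is not GDS code: `ChunkStore`, `FileVersion`, `store_version` and `read_version` are hypothetical names, and SHA-256, fixed-size chunks and one UUID per stored version are assumptions made for illustration only.

```python
import hashlib
import uuid
from dataclasses import dataclass, field


@dataclass
class ChunkStore:
    """Hypothetical content-addressed store: one blob per distinct hash."""
    blobs: dict = field(default_factory=dict)

    def put(self, chunk: bytes) -> str:
        digest = hashlib.sha256(chunk).hexdigest()  # assumed hash function
        # Deduplication: an identical chunk is kept only once.
        self.blobs.setdefault(digest, chunk)
        return digest


@dataclass
class FileVersion:
    """Metadata record for one stored version of a file object."""
    version_id: str      # assumed: one UUID per version
    chunk_size: int      # may differ from object to object
    chunk_hashes: list   # ordered hashes, enough to rebuild the content


def store_version(data: bytes, chunk_size: int, store: ChunkStore) -> FileVersion:
    """Split data into fixed-size chunks, hash each one, record the version."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    hashes = [store.put(data[i:i + chunk_size])
              for i in range(0, len(data), chunk_size)]
    return FileVersion(str(uuid.uuid4()), chunk_size, hashes)


def read_version(version: FileVersion, store: ChunkStore) -> bytes:
    """Reassemble the content of a version from its chunk hashes."""
    return b"".join(store.blobs[h] for h in version.chunk_hashes)


store = ChunkStore()
v1 = store_version(b"A" * 8 + b"B" * 8, 8, store)
v2 = store_version(b"A" * 8 + b"C" * 8, 8, store)  # shares its first chunk with v1
print(len(store.blobs), "distinct chunks stored for 4 logical chunks")
```

Because only the ordered list of hashes lives in the metadata record, the chunk payloads could sit in a separate object store, which matches the optional separation of data and metadata named above.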
3. Open interfaces
● Open standard interfaces
– JSON/SOAP core API
– WebDAV
– CIFS
● Gateways
– ownCloud
– CMIS 1.1 (SOAP, AtomPub, JSON)
● Identity Management
– Provisioning Gateway (LDAP, AD, SQL)
– Authentication Gateway (LDAP, AD, RADIUS)
4. Architecture
[Diagram. Labels, top to bottom: Admin GUI, WebGUI; ownCloud GW, Adm GW, CMIS GW, CIFS, WebDAV; GDS2 API (JSON); GDS core; Storage Backend: Object-Store (Caringo, S3, SWIFT), FS/CIFS, NAS, GAM; Metadata: SQL (DB/2, Oracle, MySQL, Postgres)]
7. Storage Backend (3)
[Diagram. Labels: two stacks of GDS2 API (JSON), GDS core, Metadata and Object Store, joined by Replication; below them SWIFT, RADOS GW, librados and three RADOS OSDs]
8. Scalability / High availability
● Master/master replication on folder level
– Data, metadata
– Users, groups
– Access lists
● Shared nothing architecture
– Horizontal scalability
– High availability
– Users that share a lot of folders can be relocated to the same node
– Adding or removing nodes dynamically
– Software updates on deactivated nodes
● Distributed metadata cache
– CMIS gateway allows session and metadata caching
● Session redirector (reverse proxy)
– Redirects session to the home node of the user
– If the home node is down, one of the backup nodes will be used
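The failover rule of the session redirector can be sketched in a few lines of Python. The `Account` record, the `pick_node` function and the policy of taking the first healthy backup node in list order are assumptions for illustration; the slide only states that one of the backup nodes is used when the home node is down.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Account:
    """Hypothetical account record holding the user's node assignment."""
    user: str
    home_node: str
    backup_nodes: tuple = ()


def pick_node(account: Account, healthy: set) -> Optional[str]:
    """Return the home node if it is up, otherwise the first healthy backup."""
    if account.home_node in healthy:
        return account.home_node
    for node in account.backup_nodes:  # assumed policy: first healthy backup wins
        if node in healthy:
            return node
    return None  # no node is currently available for this user


alice = Account("alice", "node1", ("node2", "node3"))
print(pick_node(alice, {"node1", "node2"}))  # home node is up
print(pick_node(alice, {"node3"}))           # home node and first backup are down
```

A real director would also need health checks and session stickiness; both are left out here to keep the routing decision visible.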
9. High availability
[Diagram. Labels: two stacks of Load Balancer, GDS (Session) Director, GDS2 API (JSON) and GDS core, each with its own Storage (Data, Metadata), joined by Replication]
10. Scalability (1)
[Diagram. Labels: two stacks of Load Balancer, GDS (Session) Director, GDS2 API (JSON) and GDS core; Metadata with Master/Master Replication; Data and Metadata on an Objectstore / Cluster filesystem]
11. Scalability (2)
[Diagram. Labels: two Load Balancers, two GDS (Session) Directors, four CMIS Caches, three stacks of GDS2 API (JSON) and GDS core, each with MD and Data; Metadata Replication between the stacks; Objectstore / Cluster filesystem]
12. Multiple Sites - Roaming (1)
● Every user has a home node which is stored in the account data
● Redundancy of file objects is provided by the object store at each site
● Users, groups and ACLs are synchronized between all sites
● File objects are not synchronized between sites
● Synchronization takes place asynchronously
● Load balancer directs client requests to the session director
● Session director redirects the request based on the user account to
– Home node of the user [my]
– Node which hosts shared data room [shared]
– Any node [global]
● Session director analyzes the request and forwards it to
– CMIS caching layer
– JSON API layer
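The two routing decisions of the session director (which node, which layer) can be sketched as follows. Everything beyond the three scopes and the two layers is an assumption: the `Director` class, the lookup tables, the round-robin choice for [global] requests and the use of a protocol name to pick the layer are hypothetical.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Director:
    """Hypothetical session director with in-memory routing tables."""
    home_nodes: dict   # user -> home node (from the account data)
    room_nodes: dict   # shared data room -> node that hosts it
    all_nodes: list    # candidates for [global] requests

    def __post_init__(self):
        # Assumed policy for "any node": simple round robin.
        self._next_global = itertools.cycle(self.all_nodes)

    def target_node(self, user: str, scope: str, room: str = "") -> str:
        if scope == "my":
            return self.home_nodes[user]
        if scope == "shared":
            return self.room_nodes[room]
        if scope == "global":
            return next(self._next_global)
        raise ValueError(f"unknown scope: {scope!r}")

    @staticmethod
    def target_layer(protocol: str) -> str:
        layers = {"cmis": "CMIS caching layer", "json": "JSON API layer"}
        if protocol not in layers:
            raise ValueError(f"unknown protocol: {protocol!r}")
        return layers[protocol]


director = Director(
    home_nodes={"alice": "siteA-node1"},
    room_nodes={"sales": "siteB-node2"},
    all_nodes=["siteA-node1", "siteB-node2"],
)
print(director.target_node("alice", "my"))
print(director.target_node("alice", "shared", room="sales"))
print(Director.target_layer("cmis"))
```

Keeping the node choice and the layer choice in separate functions mirrors the two separate bullets above: the first depends on the account data, the second on the request itself.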
13. Multiple Sites - Roaming (2)
[Diagram. Labels: CMIS and JSON clients; Site A and Site B, each with two stacks of LB, GDS Director, CMIS Cache, GDS2 API and GDS core on top of MD and Data]
14. Identity Management (1)
● Separation between user provisioning and authentication
● Multiple instances of gateways are possible
● Multiple directories can be connected in parallel
● Provisioning gateway
– LDAP/AD/SQL crawler
– Users that match a regular expression are created in the GDS
– Users that are deleted in the directory are deactivated in the GDS
– SCIM/SAML module [planned]
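One crawler pass of the provisioning gateway might look like the sketch below. It assumes the directory has already been read into a set of user names; `GdsAccounts`, `sync_directory` and the example pattern are hypothetical, and re-activating a user who reappears in the directory is an assumption not stated on the slide.

```python
import re
from dataclasses import dataclass, field


@dataclass
class GdsAccounts:
    """Hypothetical stand-in for the GDS account table: user -> active flag."""
    active: dict = field(default_factory=dict)


def sync_directory(directory_users: set, pattern: str, accounts: GdsAccounts) -> None:
    """One crawler pass: create matching users, deactivate vanished ones."""
    matcher = re.compile(pattern)
    for user in directory_users:
        if matcher.fullmatch(user):
            accounts.active[user] = True   # create (or, by assumption, re-activate)
    for user in accounts.active:
        if user not in directory_users:
            accounts.active[user] = False  # deactivate, do not delete


accounts = GdsAccounts()
sync_directory({"alice@example.com", "bob@other.org"}, r".+@example\.com", accounts)
sync_directory({"bob@other.org"}, r".+@example\.com", accounts)  # alice was removed
print(accounts.active)
```

Deactivating instead of deleting keeps the account and its data in the GDS after the directory entry is gone, which is what the slide describes.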
17. Multi Tenancy
● Dedicated Hardware
– Highest level of separation and security
– No performance impact of virtualization layer
● Full virtualization (KVM, Hyper-V, VMware, Xen)
– Highest level of separation and security in virtualized environment
– GDS version can be different for each tenant
– Similar static memory pages can be shared between instances
● Linux Containers (LXC)
– Lightweight virtualization
– Memory and program files on disk can be shared between instances
● Single instance
– Same GDS version for all tenants
– Everything gets shared
– Software bugs or operational problems affect all tenants
18. Distributed Data Space
[Diagram. Labels: Site A, Site B, Site C and Site D, each with a GDS offering CIFS and JSON on the LAN behind a firewall (FW); the sites are connected over the Internet via HTTPS]
19. CMIS
[Diagram. Labels: Site A, Site B (with Site B1 and Site B2) and Site C; GDS instances with WebDAV, CIFS and CMIS access; CMIS Cache, SD, OS; HTTPS links; Corporate CDN]
20. Cloud attached Data Space
[Diagram. Labels: Site A and Site B, each with a GDS offering CIFS and JSON on the LAN behind a firewall (FW); HTTPS links over the Internet to GDS instances behind load balancers (LB) and a firewall (FW)]
21. YOUR DATA. YOUR CONTROL.
WWW: HTTP://WWW.GRAUDATA.COM/DATASPACE
E-MAIL: THOMAS.UHL@GRAUDATA.COM
CEL: +49 151 54354373
TWITTER: @graudataspace