OSIS19_Cloud: Objects in the cloud that stay there: the experience of developing CRESON, support for strongly consistent remote objects in Infinispan


The experience of developing CRESON, support for strongly consistent remote objects in Infinispan, by Etienne Rivière (UCLouvain).
This talk presents results obtained in the European project LEADS, which I coordinated and in which Red Hat was a partner. The resulting code was integrated into the "staging" area of the Infinispan NoSQL database and evaluated with an open-source equivalent of Dropbox developed by CloudSpaces, another European project.


  1. Objects in the cloud that stay there: the experience of developing CRESON, support for strongly consistent remote objects in Infinispan
     Prof. Etienne Rivière, Ecole Polytechnique de Louvain, UCLouvain, Belgium
     (I don't do AI either, nor Machine Learning)
     OSIS: Open Source pour le Cloud, 14 June 2019, Paris
     etienne.riviere@uclouvain.be
  2. Menu for today
     [Footer on slides 2 to 29: Open Source pour le Cloud - OSIS - Paris - Juin 2019]
     • Object-oriented programming and data in the cloud
     • Why ORMs/OGMs are ill-suited for sharing concurrent objects
     • How we can do better by keeping objects in the cloud: CRESON
       • Joint work with: Pierre Sutra (now Télécom SudParis), Cristian Cotes, Marc Sánchez Artigas, Pedro Garcia Lopez (Uni. Rovira i Virgili, Spain), Emmanuel Bernard, William Burns and Galder Zamarreño (Red Hat)
       • Published at IEEE ICDCS 2017
       • Result of the LEADS FP7 EU project, with Uni. Neuchâtel and Red Hat
       • With another open source FP7 EU project, CloudSpaces
     • Some reflections on the academia-open source interaction
  3. Your typical Cloud application
     ✓ Scalability and elasticity (pay-as-you-go model)
     ✓ Availability even in the presence of failures
     ✓ Simplicity and separation of concerns
     [Diagram: users, presentation instances, application logic instances, NoSQL data store]
     ➡ NoSQL database: elastic and scalable storage for application state
       • But typically offering a simple key/value store interface
     ➡ Application data shared and accessed concurrently by many service instances
     🤔 Can we make the storage of, and access to, application data in the database transparent?
     [Background image: page excerpt on the Microservices Architecture pattern, including "Figure 1-5. Database architecture for the taxi-hailing application."]
  4. Object-oriented apps with NoSQL
     💡 Encapsulate application data in objects stored in some NoSQL DB
       • Object-DB mapper (Hibernate OGM)
       • Language integration (Java Persistence API)
     ☹ Object access: get the serialized object, create a new instance in local memory
       • Methods called locally
       • Data structure traversal = multiple round trips to the DB
     ☹ State-based replication of the entire serialized object
     ☹ Weak or no consistency guarantees for shared objects
       • e.g. Objectify (for Google App Engine) is not thread-safe
     [Diagram: mapping between an in-memory object and a serialized object over the network; state-based replication]
  5. Our proposal: CRESON
     💡 Efficient support for shared objects over a NoSQL database
     ➡ Callable distributed objects
       • Objects instantiated from DB representation but remain at server side
     ➡ Replication at the level of operations (method calls)
       • Support for arbitrarily large objects
     ➡ Strong consistency guarantees for concurrent accesses
       • Including for composed operations accessing multiple objects
     [Diagram: traditional Object-NoSQL mapping (mapping between in-memory and serialized object over the network, state-based replication) versus CRESON: callable and replicated shared objects (client proxy sends operations over the network, operation-based replication)]
  6. CRESON: components
     • LKVS: a novel NoSQL storage abstraction
       • Listenable Key-Value Store
     • Object management logic built atop the LKVS
       • Handle method calls for shared objects
       • Maintain multiple replicas of in-memory objects
       • Implement state-machine (operation-based) replication
     • Client-side integration with the Java language
       • Using annotations similar to JPA
  7. Listenable Key/Value Store
     • Extend the classical Key/Value Store API …
       • void put(key k, value v)
       • value_type get(key k)
     • … with two additional calls
       • void regListener(key k, client_id c, handler h)
       • void unregListener(key k, client_id c)
     • A put() call for key k calls all handlers associated with k
       • Each handler receives the new value for the key
       • For handlers that return something, the listener client is notified
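The four calls on the Listenable Key/Value Store slide can be sketched as a small in-memory class. This is only an illustration under stated assumptions, not the Infinispan implementation: everything runs in one process, the class name `ListenableKvs` and the `notifications` list (a stand-in for messages sent back to clients) are hypothetical, and "a handler returns something" is modelled as a non-empty `Optional`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** In-memory, single-process sketch of the LKVS calls listed on the slide (not Infinispan code). */
class ListenableKvs<K, V> {
    private final Map<K, V> store = new LinkedHashMap<>();
    // key -> (client id -> handler); LinkedHashMap keeps registration order.
    private final Map<K, Map<String, Function<V, Optional<String>>>> handlers = new LinkedHashMap<>();
    // Stand-in for the network: each notification "sent" to a client is recorded here.
    final List<String> notifications = new ArrayList<>();

    synchronized V get(K k) {
        return store.get(k);
    }

    synchronized void regListener(K k, String clientId, Function<V, Optional<String>> handler) {
        handlers.computeIfAbsent(k, unused -> new LinkedHashMap<>()).put(clientId, handler);
    }

    synchronized void unregListener(K k, String clientId) {
        Map<String, Function<V, Optional<String>>> forKey = handlers.get(k);
        if (forKey != null) {
            forKey.remove(clientId);
        }
    }

    /** Stores the value, then runs every handler registered for k with the new value. */
    synchronized void put(K k, V v) {
        store.put(k, v);
        Map<String, Function<V, Optional<String>>> forKey = handlers.get(k);
        if (forKey == null) {
            return;
        }
        for (Map.Entry<String, Function<V, Optional<String>>> e : forKey.entrySet()) {
            // Only handlers that return something cause a notification to their client.
            e.getValue().apply(v).ifPresent(result -> notifications.add(e.getKey() + ": " + result));
        }
    }
}

class LkvsDemo {
    public static void main(String[] args) {
        ListenableKvs<String, String> kvs = new ListenableKvs<>();
        kvs.put("k1", "v1");
        kvs.regListener("k1", "c2", v -> Optional.empty());         // returns nothing
        kvs.regListener("k1", "c3", v -> Optional.of("saw " + v));  // returns a result
        kvs.put("k1", "v1'");
        System.out.println(kvs.get("k1"));      // v1'
        System.out.println(kvs.notifications);  // [c3: saw v1']
    }
}
```

The demo follows the same sequence as the "LKVS illustrated" slides: one put, two listener registrations, then a second put that triggers both handlers but notifies only the client whose handler returned a result.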
  8. LKVS illustrated
     [Animated diagram over slides 8 to 14: client application instances C1, C2, C3 and a Key-Value Store]
     put(k1,v1): the store holds v1 under k1
  9. LKVS illustrated: regListener(k1,c2,h2) registers handler h2 for k1
  10. LKVS illustrated: the store lists h2 among the handlers for k1
  11. LKVS illustrated: regListener(k1,c3,h3) registers handler h3 for k1
  12. LKVS illustrated: the handlers for k1 are h2 and h3
  13. LKVS illustrated: put(k1,v1’) writes the new value v1’
  14. LKVS illustrated: both handlers receive v1’ (diagram labels: false, true), and listener C3 is notified (success)
  15. Object management in CRESON (1)
      • Client-side proxy + a session handler
      • Object handler owns the actual object
      • Method calls and object creation/closing are sent as operations through regular put() calls for key k
        • Intercepted by handlers registered for k
        • The client calling a method receives its result in its listener notification
      • Lifetime of the object for key k
        • First use: on the server, map from the serialized object in the DB, or instantiate a new object
        • Object closed by the last client for key k: on the server, the object is serialized and stored in the DB
      [Diagram: client application instance C1 with a proxy for O1 and a session handler; method calls become put(kO1,op); the Listenable Key-Value Store holds key kO1, the listener for kO1, and the object handler owning O1]
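The proxy and object-handler roles described on this slide can be sketched in plain Java. This is a simplified illustration under several assumptions: the slides say CRESON integrates through AspectJ, whereas this sketch swaps in `java.lang.reflect.Proxy`; the put() call and the listener notification are collapsed into a direct in-process call to a hypothetical `ObjectHandler.onPut`; and `Operation`, `Counter`, `CounterImpl` and `Proxies` are names invented for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

/** A method call turned into data: what a put(k, op) would carry. */
final class Operation {
    final String method;
    final Class<?>[] parameterTypes;
    final Object[] args;

    Operation(String method, Class<?>[] parameterTypes, Object[] args) {
        this.method = method;
        this.parameterTypes = parameterTypes;
        this.args = (args == null) ? new Object[0] : args;
    }
}

/** Hypothetical shared type used only for this illustration. */
interface Counter {
    int addAndGet(int delta);
}

class CounterImpl implements Counter {
    private int value;

    @Override
    public int addAndGet(int delta) {
        value += delta;
        return value;
    }
}

/** Server-side stand-in: owns the actual objects and applies operations to them. */
class ObjectHandler {
    private final Map<String, Object> liveObjects = new HashMap<>();

    synchronized void create(String key, Object object) {
        liveObjects.put(key, object);
    }

    /** Plays the role of the handler triggered by put(key, op); the return value stands for the notification. */
    synchronized Object onPut(String key, Operation op) throws ReflectiveOperationException {
        Object target = liveObjects.get(key);
        if (target == null) {
            throw new IllegalStateException("no object for key " + key);
        }
        Method m = target.getClass().getMethod(op.method, op.parameterTypes);
        return m.invoke(target, op.args);
    }
}

class Proxies {
    /** Returns a client-side proxy whose method calls are forwarded as operations for the given key. */
    @SuppressWarnings("unchecked")
    static <T> T remote(Class<T> type, String key, ObjectHandler server) {
        InvocationHandler forward = (proxy, method, args) ->
                server.onPut(key, new Operation(method.getName(), method.getParameterTypes(), args));
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, forward);
    }
}

class ProxyDemo {
    public static void main(String[] args) {
        ObjectHandler server = new ObjectHandler();
        server.create("Counter:hits", new CounterImpl());
        Counter c1 = Proxies.remote(Counter.class, "Counter:hits", server);
        Counter c2 = Proxies.remote(Counter.class, "Counter:hits", server);
        System.out.println(c1.addAndGet(2)); // 2
        System.out.println(c2.addAndGet(3)); // 5: both proxies reach the same server-side object
    }
}
```

The point of the sketch is that the client never holds the object's state: it only ships the method name and arguments and gets the result back, which is what lets the object stay on the server side.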
  16. State Machine Replication
      • Objects replicated at the LKVS side
        • (Dormant) copies of serialized objects
        • In-memory instances of currently used shared objects
      • Fault tolerance: survive up to f faults with f+1 servers
      • In-memory copies must remain consistent under concurrent accesses
      ➡ Operation-based (state machine) replication
        • Copies receive the exact same stream of operations
      ⚠ Constraint: objects must be deterministic
      ✓ Applying operations in the same order to deterministic objects ensures strong consistency (linearizability)
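The argument on this slide (same operations, same order, deterministic object, hence identical copies) can be checked with a toy example. The sketch below is an assumption-laden illustration, not CRESON code: `Account` and `ReplicatedAccount` are hypothetical names, all replicas live in one process, and a `synchronized` method stands in for whatever ordering mechanism the real system uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Deterministic object: its next state depends only on its current state and the operation. */
class Account {
    long balance;

    void deposit(long amount) {
        balance += amount;
    }

    void applyInterest(int percent) {
        balance += balance * percent / 100;
    }
}

/** Stand-in for operation-based replication: one total order, applied to every copy. */
class ReplicatedAccount {
    private final List<Account> replicas = new ArrayList<>();

    /** With f+1 replicas, the state survives up to f crashed copies. */
    ReplicatedAccount(int copies) {
        for (int i = 0; i < copies; i++) {
            replicas.add(new Account());
        }
    }

    /** synchronized gives a single order of submissions; each operation runs on all copies in that order. */
    synchronized void submit(Consumer<Account> operation) {
        for (Account replica : replicas) {
            operation.accept(replica);
        }
    }

    List<Account> replicas() {
        return replicas;
    }
}

class SmrDemo {
    public static void main(String[] args) {
        ReplicatedAccount group = new ReplicatedAccount(2); // f = 1
        group.submit(a -> a.deposit(100));
        group.submit(a -> a.applyInterest(10));
        for (Account replica : group.replicas()) {
            System.out.println(replica.balance); // 110 on every replica
        }

        // The same two operations in the opposite order give a different state,
        // which is why every copy must see the same order.
        Account reordered = new Account();
        reordered.applyInterest(10);
        reordered.deposit(100);
        System.out.println(reordered.balance); // 100
    }
}
```

The two operations do not commute, so a replica that applied them in a different order would diverge; a non-deterministic operation (for example one reading a local clock) would break the scheme in the same way, which is the constraint the slide flags.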
  17. Putting everything together
      [Diagram: on the client side, application instances C1, C2, C3 with proxies and session handlers; on the server side, the Listenable Key-Value Store with listeners for kO1 and kO2 and object handlers for O1 and O2. Numbered steps 1 to 5 carry the labels: method calls, put(kO1,op), linearize & replicate op, trigger, call, notify.]
  18. CRESON guarantees
      ✓ Strong consistency: linearizability
      ✓ Composition
        • A shared object can call other objects
        • Maintains linearizability
      ✓ Persistence
      ✓ Disjoint-access parallelism
        • Accesses to distinct objects use distinct LKVS components
      ✓ Elasticity
        • Can add/remove storage nodes without restarting the service
  19. Use case and Interface
      • StackSync: open-source equivalent of Dropbox
        • Synchronization of a user's file system with a cloud-stored file system
        • Sharing of folders and files between users' spaces
        • Trace collected from the Ubuntu One personal cloud service
      • Data stored in an immutable object store (OpenStack Swift)
      • Metadata requires strong consistency
  20. Original metadata management in StackSync
      • PostgreSQL relational database
        • Stores the association between users, files, authorized devices, etc.
        • Accessed from the application using SQL directly (manual object-relational mapping)
        • Storage and querying handled by a Facade object
      • Performance: use of stored procedures implementing application logic at the server side
      • Scalability: sharded (partitioned) database using PL/Proxy
      ☹ No support for elastic scaling
      ☹ No consistency (ACID) guarantees across shards
      [Diagram: class schema with Facade, User, Device, Workspace, Item, ItemVersion, Chunk]
  21. StackSync metadata management with CRESON
      • Logic for metadata management reimplemented in plain Java, as methods in StackSync’s classes
      • Determining which objects to store independently in CRESON depends on update visibility requirements
        • Embedding Item, etc. in Workspace = atomic operations within one Workspace object
      • Porting took less than a week of effort
      • Code is simpler and more coherent than with SQL
      [Diagram: class schema with Facade, User, Device, Workspace, Item, ItemVersion, Chunk, distinguishing independent objects stored in CRESON from embedded objects]

      [Paper excerpt shown on the slide:]
      We depict the class schema of the SyncService in Figure 3. At some SyncService instance, a thread interacts with CRESON using a Facade object. Classes including the Facade and below differ from one persistence technology to another. Porting them is where we spent most of our effort. A Workspace object models a synced folder. It is composed of files and directories (Item in Figure 3). For each Item, the SyncService stores versioning information as ItemVersion objects. Similarly to other personal cloud storage services, StackSync operates at the sub-file level by splitting files into chunks; this greatly reduces the cost of data synchronization. A Chunk is immutable and identified with a fingerprint. It may appear in one or more files.

      Relational Approach. To scale up the original relational implementation, we followed conventional wisdom and sharded metadata across multiple servers. The key enabler of this process is PL/Proxy, a stored procedure language for PostgreSQL. PL/Proxy allows dispatching requests to several PostgreSQL servers. It was originally developed by Skype to scale up their services to millions of users. Following this approach, we horizontally partition metadata by hashing user identifiers with a PostgreSQL built-in function. As a result, all the metadata of a user is slotted into the same shard. Any request for committing changes made by the same user is redirected to the appropriate shard.

      Fig. 4. Workspace and Facade classes:

          import java.util.List;
          import java.util.Map;
          import java.util.UUID;

          @Entity(key = "id")
          public class Workspace {

              public UUID id;
              private Item root;
              private List<User> users;

              /* ... */

              // True if the given user is one of this workspace's users.
              public boolean isAllowed(User user) {
                  return users.contains(user);
              }
          }

          @Entity(key = "id")
          public class Facade {

              @Entity(key = "deviceIndex")
              public static Map<UUID,Device> devices;

              @Entity(key = "workspaceIndex")
              public static Map<UUID,Workspace> workspaces;

              @Entity(key = "userIndex")
              public static Map<UUID,User> users;

              public UUID id;

              /* ... */

              // Registers the device; returns false if its id was already present.
              public boolean add(Device device) {
                  return devices.putIfAbsent(
                      device.getId(), device) == null;
              }
          }
  22. CRESON interface
      • Integration in Java (using AspectJ)
      • Similar to the Java Persistence API (Hibernate, etc.) … but using remote calls
      • @Entity(key = "id") annotation
        • On a class: object o of this class is stored in CRESON under key (classname+":"+o.id)
        • On a field: the static field is stored in CRESON under key (classname+":"+id)
          • Only applies to static fields!
      • No further action required from developer
      • Shared maps (e.g. deviceIndex) are transparently stored as collections in LKVS
      [The slide repeats the paper excerpt and Fig. 4 (Workspace and Facade classes) from the previous slide.]
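The key-naming rule on the CRESON interface slide (classname + ":" + id) can be made concrete with a few lines of reflection. This is a sketch under assumptions, not CRESON's implementation: the `Entity` annotation below is a local stand-in declared for the example, `Keys` is a hypothetical helper, the simple class name is assumed to be the "classname" the slide refers to, and the map value type is simplified to `String`.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Local stand-in declared for this sketch; not CRESON's own annotation type. */
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.FIELD})
@interface Entity {
    String key();
}

@Entity(key = "id")
class Workspace {
    public UUID id;
}

@Entity(key = "id")
class Facade {
    @Entity(key = "deviceIndex")
    public static Map<UUID, String> devices = new HashMap<>();

    public UUID id;
}

/** Hypothetical helper computing the storage keys described on the slide. */
class Keys {
    /** Class-level annotation: classname + ":" + value of the field named by key(). */
    static String forObject(Object o) {
        Class<?> type = o.getClass();
        Entity entity = type.getAnnotation(Entity.class);
        if (entity == null) {
            throw new IllegalArgumentException(type + " is not annotated with @Entity");
        }
        try {
            Field idField = type.getDeclaredField(entity.key());
            idField.setAccessible(true);
            return type.getSimpleName() + ":" + idField.get(o);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Field-level annotation on a static field: classname + ":" + key(). */
    static String forStaticField(Class<?> type, String fieldName) {
        try {
            Field field = type.getDeclaredField(fieldName);
            Entity entity = field.getAnnotation(Entity.class);
            if (entity == null || !Modifier.isStatic(field.getModifiers())) {
                throw new IllegalArgumentException(fieldName + " is not an annotated static field");
            }
            return type.getSimpleName() + ":" + entity.key();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }
}

class KeyDemo {
    public static void main(String[] args) {
        Workspace w = new Workspace();
        w.id = UUID.fromString("00000000-0000-0000-0000-000000000001");
        System.out.println(Keys.forObject(w));                            // Workspace:00000000-0000-0000-0000-000000000001
        System.out.println(Keys.forStaticField(Facade.class, "devices")); // Facade:deviceIndex
    }
}
```

With keys derived this way, two clients that hold a Workspace with the same id resolve to the same stored object, and a static index such as deviceIndex has a single well-known key per class.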
  23. Implementation
      • Built over the open source Infinispan NoSQL database
        • Basis for Red Hat JBoss Data Grid
      • LKVS = 13,500 SLOC; CRESON = 4,000 SLOC
      • Along with other developments from the LEADS EU project, CRESON was integrated into the ‘experimental’ features in Infinispan
      • Red Hat developed and integrated the LKVS
      • Also used for data curation and processing
      • Under its previous name ‘Atomic Object Factory’
  24. Evaluation
      • Experiments
        • Micro-benchmarks with single and multiple objects
        ✓ StackSync: throughput and latency using a real trace
          • Using a real trace from the “Ubuntu One” drive
        ✓ StackSync: replication and elasticity
          • Using the trace collected from the Ubuntu One personal cloud
      • Cluster of 8-core/8 GB Xeon 2.5 GHz machines, switched 1 Gbps network
        • 2 to 6 Infinispan servers (default = 3)
        • Replication factor is 2 by default
  25. StackSync performance: throughput
      • 24 StackSync clients, each with 10 threads
      • Dispatch commands from trace using RabbitMQ
      • PostgreSQL using 2 additional PL/Proxy nodes (out of 6)
      [Figure: two bar charts of throughput (operations/s, 0 to 10000, higher is better): one versus CRESON nodes (2, 4, 6), one versus PostgreSQL shards (2, 4; + 2 additional PL/Proxy nodes). Annotation: median performance using 6 servers: +50%]
  26. StackSync performance: latency
      • About 300 doCommit() operations per second
      • Most common method: adding new files to a personal space
      [Figure: CDF (%) of latency (ms, 5 to 40) for CRESON and PostgreSQL; leftmost is better]
  27. Elasticity and replication
      • Single StackSync client with synthetic workload
      [Figure: throughput (operations/s, 0 to 4000, higher is better) versus CRESON nodes (2 to 6), with no replication and with a replication factor of 2. Annotations: adding server with no restart (elasticity); cost of replication]
  28. Academia contributing to open source
      • Open source was beneficial in LEADS
        • Implement ideas in a state-of-the-art KVS
        • Got feedback on code and integration into staging
        • Much easier IP management (often complicated in EU projects)
      • But is it enough?
        • Non-maintained code gets deprecated
        • EU (and generally, academic) project lifecycles do not match the requirements for high impact in open source
        • Lack of funding and competencies for long-term support
      • Funding agencies should act?
        • Movement towards reproducible research artifacts
        • Benefits from and to open source
  29. Conclusion
      • Client-side object-NoSQL mapping: costly and inefficient for strongly consistent shared objects
      • CRESON: a solution to host callable shared objects directly on a NoSQL DB
        • Leverages the support of a novel LKVS abstraction implemented in the Infinispan KVS
        • Simple programming model, and better performance and elasticity than a state-of-the-art PostgreSQL solution
  30. Objects in the cloud that stay there: the experience of developing CRESON, support for strongly consistent remote objects in Infinispan
      🤔 Questions?
