We consider use cases typical for advanced applications of cloud architectures incorporating the Cloud Foundry PaaS. High availability, fault tolerance, and scaling down to smaller form factors while operating in mission-critical environments all put constraints on architecture, configuration, and testing. Cloud Foundry's operation depends on a number of external and internal dependencies, and points of failure may exist at different levels, stretching from the hardware/IaaS foundation to microservices.
3. @altoros
Solution Requirements
● An IoT healthcare solution:
○ Connect devices and users located at customer sites
○ Thousands of devices
○ Hundreds of customers
○ Collect, process, and visualize device data
4. @altoros
Solution Requirements
● Available as a private regional cloud:
○ Operated by a third-party
○ Addressing specific region regulations
○ Serving clients and providing region proximity
● A “scaled-down” version for on-site deployments:
○ Cost-effective
○ Easy remote maintenance
○ Backup data to the regional cloud
Regional Cloud
Customer Facility 1
Local Cloud
Customer Facility 2
Local Cloud
5. @altoros
Solution Requirements
● Consider implementation restrictions:
○ Limited resources for on-site deployment
● Review and approval by government agencies:
○ Open source technologies and products
○ Unified architecture for regional and local clouds
6. @altoros
Solution Requirements
● High availability and scalability:
○ A hardware and infrastructure platform
○ Cloud services and applications
● Security is essential:
○ VPN connectivity
○ Non-VPN connections should be supported
○ WebSocket, TCP, and HTTP protocols
8. @altoros
Infrastructure: OpenStack vs. VMware
● VMware vSphere is about virtualization:
○ ESXi is the only supported hypervisor
○ vCenter for management
● OpenStack is about cloud:
○ Storage, network, and compute services
○ Security groups and access control
○ Projects and quotas
○ Supports KVM, ESXi, and QEMU
9. @altoros
VMware component                                  License cost, USD
VMware vSphere Standard, 1 CPU                    $995
VMware vCenter Server Standard                    $4,995

Server CPU                                        Cost per node, USD
SuperMicro 5038MR-H8TRF, Intel Xeon E5-2620 v2    $1,800

OpenStack                                         Cost, USD
5 compute nodes                                   5 * $1,800
3 controller nodes                                3 * $1,800
Total                                             $14,400

VMware                                            Cost, USD
5 ESXi (compute) nodes                            5 * $1,800 + 5 * $995
1 vCenter appliance                               1 * $4,995
Total                                             $18,970
Infrastructure: OpenStack vs. VMware
● Cost estimation for 5 nodes
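The totals in the tables above follow from simple arithmetic; a quick sketch, using the per-unit prices from the tables:

```python
# Rough cost comparison for a 5-node cluster, using the per-unit
# prices from the tables above (all figures in USD).
NODE_COST = 1_800        # SuperMicro node with an Intel Xeon E5-2620 v2
ESXI_LICENSE = 995       # VMware vSphere Standard, per CPU
VCENTER_LICENSE = 4_995  # VMware vCenter Server Standard

# OpenStack: 5 compute nodes + 3 controller nodes, no license fees.
openstack_total = (5 + 3) * NODE_COST

# VMware: 5 ESXi nodes (hardware plus a per-CPU license) + 1 vCenter appliance.
vmware_total = 5 * (NODE_COST + ESXI_LICENSE) + VCENTER_LICENSE

print(openstack_total)  # 14400
print(vmware_total)     # 18970
```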
11. @altoros
OpenStack Deployment Considerations
● Availability zones:
○ Identical zones for compute and storage services
● Support for VM migration:
○ Use Ceph for volumes and ephemeral disks
○ Keep spare capacity of one compute node in every zone
● Increase default values in nova.conf:
○ security_groups = 100
○ security_group_rules = 300
○ volumes = 500
○ cpu_overcommit = 4
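As a sketch, the limits above map onto nova.conf roughly as follows. The exact option names vary by OpenStack release: the quota options shown here are from the Juno/Kilo era, the volume quota later moved to cinder.conf, and the overcommit ratio is set via cpu_allocation_ratio. Verify against your release before applying.

```ini
# Hypothetical nova.conf fragment matching the values above;
# check the option names against your OpenStack release.
[DEFAULT]
quota_security_groups = 100
quota_security_group_rules = 300
quota_volumes = 500
cpu_allocation_ratio = 4.0
```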
13. @altoros
● For microservices architecture
● Runtime automation
● Organizations, users, spaces, and security groups
● Health checks, load balancing, and scaling
● AWS, OpenStack, and VMware
The Application Platform: Cloud Foundry
15. @altoros
Jobs                Inst. z1  Inst. z2  Inst. z3  CPU per inst.  RAM per inst., GB  RAM total, GB  CPU total
etcd                   1         1         1           1                2                 6             3
UAA + CC DB            1         -         -           1                2                 2             1
Cloud Controller       1         1         -           1                4                 8             2
Doppler                1         1         1           1                1                 3             3
Traffic Controller     1         1         -           1                1                 2             2
Runners                2         2         2          16               64               384            96
Total for CF jobs (all jobs, including ones not shown on the slide): 33 instances, 447 GB RAM, 133 vCPUs
Cloud Foundry Planning
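The per-job totals can be recomputed from the per-instance figures in the planning table above; a small sketch using the fully specified rows:

```python
# Capacity-planning sketch: recompute per-job totals from the
# per-instance figures in the planning table above.
jobs = {
    # name: (instances in zones 1-3, vCPUs per instance, RAM GB per instance)
    "etcd":    ((1, 1, 1), 1, 2),
    "Doppler": ((1, 1, 1), 1, 1),
    "Runners": ((2, 2, 2), 16, 64),
}

for name, (zones, cpu, ram) in jobs.items():
    n = sum(zones)
    print(f"{name}: {n} instances, {n * cpu} vCPUs, {n * ram} GB RAM")
# etcd: 3 instances, 3 vCPUs, 6 GB RAM
# Doppler: 3 instances, 3 vCPUs, 3 GB RAM
# Runners: 6 instances, 96 vCPUs, 384 GB RAM
```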
16. @altoros
Cloud Foundry HA Deployment Issues
● CC and UAA databases?
✓ Use BOSH Resurrector
✓ Use external MariaDB Galera
● BOSH Director?
✓ Plan BOSH VM recovery
● Blob store?
✓ Store blobs in OpenStack Swift
17. @altoros
BOSH Director Recovery
● You will need:
○ bosh-state.json
○ bosh.yml manifest
○ BOSH persistent disk
● Edit bosh-state.json, leaving only these properties:
○ installation_id
○ current_disk_id
● Re-deploy BOSH and attach the persistent disk:
bosh-init deploy bosh.yml
Total time: around 25 min
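The state-file edit can be scripted; a minimal sketch, assuming bosh-state.json is flat JSON with these two top-level keys (the helper name is ours):

```python
import json

def trim_bosh_state(path: str) -> None:
    """Keep only the properties needed before re-deploying the Director."""
    with open(path) as f:
        state = json.load(f)
    # Drop everything except the installation id and the persistent disk id.
    trimmed = {k: state[k] for k in ("installation_id", "current_disk_id")
               if k in state}
    with open(path, "w") as f:
        json.dump(trimmed, f, indent=2)
```

After trimming the file, `bosh-init deploy bosh.yml` recreates the Director VM and re-attaches the persistent disk.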
18. @altoros
Blob Storage in OpenStack Swift
● Set OpenStack as the provider in the deployment manifest:
properties:
  cc:
    packages:
      app_package_directory_key: cc-packages
      fog_connection: &fog_connection
        provider: 'OpenStack'
        openstack_username: 'cfdeployer'
        openstack_api_key: 'ddd3dd23'
        openstack_auth_url: 'http://172.30.0.3:5000/v2.0/tokens'
        openstack_temp_url_key: '1328d0212'
19. @altoros
BOSH Resurrection
● Configure resurrection for the database VM:
$ bosh vm resurrection pg_data 0 on
● Measure the approximate time for restoring a VM:
○ 60 sec: agent health-check interval
○ 60 sec: time to mark the agent as unresponsive
○ 120 sec: time to recreate the VM on OpenStack
○ 60 sec: time for the new VM to initialize
Total: around 5 min.
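The five-minute figure is just the sum of the stages above:

```python
# Resurrection time budget, in seconds, per the stages listed above.
health_check = 60        # agent health-check interval
mark_unresponsive = 60   # time to mark the agent as unresponsive
recreate_vm = 120        # time to recreate the VM on OpenStack
initialize = 60          # time for the new VM to initialize

total = health_check + mark_unresponsive + recreate_vm + initialize
print(total / 60)  # 5.0 minutes
```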
● When a physical machine is down:
○ The Resurrector recreates all its VMs in the same AZ
21. @altoros
Cassandra in OpenStack Ceph: Pros and Cons
● Pros:
○ Automation: all cloud services are in OpenStack.
○ Ceph is distributed, replicated storage.
○ Low cost compared to hardware SAN.
● Cons:
○ The replication factor is 6: 2 in Ceph * 3 in Cassandra.
○ Cassandra performance is impacted by network performance.
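The storage overhead from the stacked replication is easy to quantify; a sketch:

```python
# Effective replication when Cassandra (RF 3) stores its data on Ceph
# volumes that are themselves replicated twice: each row is kept 6 times.
CEPH_REPLICAS = 2
CASSANDRA_RF = 3

def raw_storage_tb(logical_data_tb: float) -> float:
    """Raw disk capacity needed for a given amount of logical data."""
    return logical_data_tb * CEPH_REPLICAS * CASSANDRA_RF

print(raw_storage_tb(10))  # 60.0 -> 10 TB of data needs 60 TB of raw disk
```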
22. @altoros
Testing Cassandra in OpenStack Ceph
● OpenStack configuration:
○ 1 Gb network
○ 1 CPU per node: Intel Xeon E5-2630 v3, 2.40 GHz
○ 2.0 TB SATA 6.0 Gb/s 7,200 RPM drives for Ceph
● Cassandra configuration:
○ Node: 8 vCPUs, 32 GB of RAM
○ 6 nodes in 3 AZ; 2 nodes per AZ
○ A simple strategy with a replication factor of 3
○ Cassandra stress-test tool
23. @altoros
Testing Cassandra in OpenStack Ceph

Workload                  Operations/sec  Avg. latency, ms  99% latency, ms  Max latency, ms
100% writes                   47,700            2.8              10.1            3,851.7
100% reads                    65,250            2.1               5.5               50.8
50% writes, 50% reads         54,150            2.5               7.1            2,062.1
24. @altoros
Cassandra Recommendations
● Cluster and node sizing:
○ Effective data size per node: 3–5 TB
○ Tables in all keyspaces: 500–1,000
○ 30–50% of free space for the compaction process
● DataStax storage recommendations:
○ Use local SSD drives in the JBOD mode
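These sizing rules can be turned into a rough node-count estimate; a sketch with illustrative defaults picked from the recommended ranges (4 TB effective data per node, 40% free space):

```python
import math

# Rough Cassandra cluster sizing from the rules above; the default
# figures are illustrative picks from the recommended ranges.
def nodes_needed(data_tb: float, rf: int = 3,
                 effective_per_node_tb: float = 4.0) -> int:
    """Nodes required to hold data_tb of logical data at replication rf."""
    return math.ceil(data_tb * rf / effective_per_node_tb)

def disk_per_node_tb(effective_per_node_tb: float = 4.0,
                     free_fraction: float = 0.4) -> float:
    """Raw disk per node, keeping 30-50% free for compaction."""
    return effective_per_node_tb / (1 - free_fraction)

print(nodes_needed(8))     # 6 nodes for 8 TB of data at RF 3
print(disk_per_node_tb())  # ~6.7 TB of disk per node
```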
26. @altoros
Altoros’s Contributions to Cloud Foundry
● Cassandra Service Broker for CF:
https://github.com/Altoros/cf-cassandra-broker-release.git
● Improvements to the ELK BOSH release and CF integration:
○ RabbitMQ input, Cassandra output for Logstash
○ Logstash filters
https://github.com/logsearch/logsearch-boshrelease/commits?author=axelaris
https://github.com/cloudfoundry-community/logsearch-for-cloudfoundry/
Hello, colleagues. My name is Sergey and I’m glad to see you at this session. I work as a project manager and software architect at Altoros.
Today, I’m going to share with you the experience our team gained when working on an ongoing project related to healthcare. The project is about building a highly available solution for customers who operate with various medical devices.
Let’s take a look at some of the requirements.
First of all, what are the business requirements for the solution? We call this system the “Internet of Things for healthcare”. The main idea is to create a Software-as-a-Service solution that lets clients connect medical devices and users in a secure way. The service collects data from devices, and also stores and visualizes device data. Users will have various dashboards to view data in near real-time, and they will also be able to locate and manage devices. Some of the customers are large organizations that operate many facilities and thousands of devices. It is expected that this IoT solution will dramatically simplify device and user connectivity, making it transparent and unified. The cloud solution should reduce the time to deliver, upgrade, and support healthcare applications for clients.
The new solution should serve customers in different geographical regions, providing region proximity. It also makes it possible to address specific regulations in each region. You know, the rules for the healthcare industry are different in North and South America, Europe, and Asia.
Besides the regional cloud, there is a plan to create a scaled-down, smaller version of the cloud for on-site deployments, so that some of the biggest customers, who are sensitive to data locality, can install the solution and keep all the data inside their data center. This scaled-down version needs to be cost-effective and support remote maintenance in the same way as is planned for the regional cloud. As an additional feature, the data stored in the local cloud can be backed up to the regional cloud.
And if we talk about the two versions of the cloud, the regional cloud and the cloud for on-site deployment, we understand that their architectures must be very similar or identical. First of all, having one implementation for different scales of deployment reduces the time to deliver the solution to the market. You also need to consider a whole range of implementation restrictions; for example, the cloud for on-site deployment is limited in resources and cost.
It is clear that when you deliver a healthcare solution, it must be reviewed and approved by government agencies. So the platform must be based on open source products that can be tested and examined for vulnerabilities. Open components also make it possible to easily extend the functionality of the platform and the products used in the solution, and they make it easier to review all the components and get the necessary approvals.
Another set of requirements is related to the availability and security of the solution. High availability is extremely important in healthcare. In our case, it means that all apps and services, as well as the hardware and the infrastructure platform, must be available all the time. The platform deals with very sensitive data, so security is essential. In most cases, customers are connected to the cloud through secure VPN tunnels, but for small customers the cloud also needs to provide connectivity without a VPN tunnel.
As for communication with the cloud, devices operate using internet protocols, and support for TCP devices is planned to be added in the near future.
OK, now let's take a look at how the platform is implemented. I won't go into all the technical details. Instead, I will focus on the infrastructure platform and the cloud services that we've selected, as well as some of the high availability and scalability aspects. I'll also share which parts of this project we contributed to the community.
So, speaking about infrastructure, we chose between VMware vSphere and OpenStack, because we had to build a private cloud.
We chose OpenStack, because VMware vSphere is about virtualization and management of virtual resources, and all VMware products are licensed and proprietary.
In contrast to VMware, OpenStack is open source and includes components for building storage, network, and compute services in the cloud. OpenStack supports multi-tenancy through cloud resources called projects, and it has fine-grained security and access controls. It also supports several hypervisors, and it can be integrated with the ESXi hypervisor too.
Let's take a look at this rough cost estimation for an infrastructure platform running VMware and OpenStack. As an example, we are calculating the effective cost for 5 nodes. We're using a blade chassis with five compute nodes for virtual machines and storage. The cost is estimated for a SuperMicro chassis with 6-core Intel Xeon CPUs.
If we use VMware vSphere, we have to buy licenses for 5 ESXi hypervisors and for vCenter management, and the initial cost will be around $19K. With OpenStack, we need to use 3 additional nodes for the OpenStack management services.
As you can see, even though OpenStack uses three additional machines, the total cost is less than with VMware. This is an example that you may use for cost estimation when selecting a private infrastructure platform.
On the next slide, you can see a high-level deployment view of our OpenStack cloud. It is protected by a firewall that supports VPN tunnels and non-VPN HTTPS connections. At the hardware level, we are using a blade chassis to build a highly available OpenStack. At least three nodes are used for the OpenStack management components, and the compute services are distributed across three availability zones. This creates redundancy for virtual machines launched by OpenStack. One of the reasons why we create 3 zones is that some components and cloud services require 3 or more virtual nodes for availability.
The OpenStack storage may be distributed across compute nodes, or we can set up separate storage nodes. Additional management services, like DNS, NTP, and the OpenStack deployment tools, run on additional chassis nodes.
With this deployment approach, we can scale OpenStack's computing and storage capacity simply by adding new blades or nodes.
What are some of the important OpenStack deployment considerations? First, it's required to create identical availability zones for compute and storage services. Second, to enable support for live VM migration, we need to configure OpenStack Ceph for persistent volumes and ephemeral disks, and there should additionally be spare capacity of around 1 physical node in every availability zone. Third, we recommend increasing the default limits for the number of security groups, security group rules, and volumes. It is also important to evaluate the CPU overcommit ratio. The recommended value is from 1.5 to 2, but in our tests OpenStack could sustain a CPU overcommit ratio of 4.
Besides the OpenStack platform, we are using a number of other services in the cloud platform.
Cassandra is a scalable, redundant, and master-less data store. This is where we keep all the device data.
MariaDB Galera is our relational database cluster for structured data with low velocity.
RabbitMQ provides queueing and messaging for different applications.
And Elasticsearch, Logstash, and Kibana serve for application log aggregation and indexing.
What about running applications? The solution we are building is based on a microservices architecture, so we need an application platform that will manage the services effectively. When it comes to microservices, we think Cloud Foundry is by far the best option.
It automates up to 90% of all routine work related to application lifecycle management.
It is a complete platform that supports traditional application runtime automation and also Docker containers.
And the most important advantage, at least for our customer, was that, with Cloud Foundry, new features and apps can be released a lot faster.
So what does it take to distribute the components of the Cloud Foundry platform and the cloud services inside the OpenStack deployment?
As I have already said, there are three availability zones that are actually three groups of physical nodes in chassis. If we distribute our service instances across the availability zones, we can ensure redundancy on the service level.
For example, the MariaDB cluster requires at least three nodes, and for redundancy, we place one node in every availability zone. The same approach applies to RabbitMQ and Cassandra.
As for Cloud Foundry, we need to place the components that support HA in at least two zones. We can expect that most of the platform resources in Cloud Foundry are allocated to application runners (the DEA and Diego cells). The Runners are deployed to three availability zones, so that the application workload can be distributed evenly to all hardware nodes.
Management services can be replicated as well. The approach to replication depends on the specific service. Some services, like DNS and NTP, are assumed to be mission-critical, so they have 2 instances on 2 physical nodes. Other services may have more relaxed HA requirements. OK, let's see a more detailed planning of resources for a CF deployment.
On this slide, you can see how we distributed the Cloud Foundry configuration across three OpenStack availability zones. Of course, this slide shows only some of the CF jobs, to give the idea of how the planning is done.
This planning page helps us calculate the memory and CPU usage per OpenStack zone and the number of virtual machines for Cloud Foundry. The values in the row called “Total” represent the total number of instances, memory, and virtual CPUs. There are also totals calculated for each availability zone.
The cells highlighted in yellow are the jobs that we recommend placing in all three availability zones. They are the service registry, etcd, which should have 3 instances; the Loggregator Traffic Controller, which is recommended to have at least one instance in every zone; and, as I mentioned, the Runners for application containers. The Runners are the major resource consumers in any Cloud Foundry deployment.
But, at the same time, there are CF jobs that don't support a High Availability configuration by default, and we need to decide how to recover them in case they fail, or find some workarounds. Let's move on to the next slide and see what can be done.
One group of non-HA jobs is the CC and UAA databases. For these databases, we can configure BOSH resurrection or use an external MariaDB Galera Cluster.
Other non-HA components in the deployment are:
BOSH Director, which provides CF and cloud services automation. BOSH is not directly related to the availability of Cloud Foundry and applications, but we need to plan how we will recover the virtual machine with the BOSH Director.
Another non-HA component is the blobstore. The default NFS blobstore is a single instance; we can use object storage for it, for example OpenStack Swift.
Let’s take a look at some of the details for these points.
So, what does a plan for recovering a BOSH Director virtual machine look like? The approach is quite straightforward. To recover BOSH Director, we need the BOSH state file, the deployment manifest, and a persistent disk for the BOSH VM.
First, we have to edit the BOSH state file, leaving only a few properties. And then we can re-deploy BOSH and attach the persistent disk.
So in our tests, the recovery of the BOSH director VM according to this scenario took around 25 minutes.
As an alternative, we can use OpenStack's VM migration functionality, if the ephemeral drives are located in OpenStack Ceph storage and can be attached to the new VM in the same way as the persistent disk. In addition, the Ceph option for ephemeral drives makes live migration of VMs in OpenStack possible.
To set OpenStack Swift as the blob store, we need to define the credentials and the URL for connecting to OpenStack, and also set a temporary URL key in the Cloud Foundry deployment manifest. It is very important that the temporary key is unique for every Cloud Foundry installation on OpenStack, if you have, for example, 2 installations in one OpenStack. And it should work!
OK, let's see the effect of BOSH resurrection. What is important about BOSH resurrection is that it takes around two minutes to mark the agent as unresponsive. After that, the VM will be recreated. So, the total time in our tests ranged from 4 to 6 minutes for a Postgres database instance. This timeframe can be acceptable: applications that have already been deployed will continue to work, as long as we don't install new applications during this downtime. In our case, we decided to go with this approach.
But take the side effect into account: when you stop a physical machine intentionally, the BOSH Resurrector tries to recreate all the VMs hosted on this physical machine in the same OpenStack availability zone, and you should have enough resources for this process in that zone.
As an alternative to BOSH resurrection, you can configure an external MariaDB cluster for CF databases.
Let's take a look at the Cassandra storage. In our case, we are using OpenStack Ceph with replication, and the data blocks are distributed among all storage nodes. This means that a single data read request triggers several network operations.
First, the application calls the Cassandra coordinator node, which is the virtual machine the application is connected to.
Second, the Cassandra coordinator contacts the Cassandra data node that stores the requested data row. This Cassandra node runs on a specific compute node in OpenStack.
Then the compute node talks to the OpenStack Ceph controller.
And, finally, the Ceph controller reads data blocks from the OpenStack storage nodes.
So, what are the pros and cons of running Cassandra in OpenStack Ceph?
On the good side:
- With Ceph, all cloud services are in OpenStack. This simplifies deployment automation and management, because the services can be deployed and managed, for example, by BOSH.
- Ceph is scalable, replicated storage, so the failure of one drive or storage node should not affect the availability of data volumes.
- And, last but not least, the cost of this storage is quite low compared to dedicated hardware SAN systems.
Speaking about the cons, we can say that:
- In Ceph storage, an additional replication factor of 2 results in a total of 6 replicas of the Cassandra data, if we use the recommended replication factor of 3 in Cassandra.
- And Cassandra performance depends directly on network performance, so it is recommended to use a 10G or faster network to connect the OpenStack storage nodes.
In our case, we decided to benchmark Cassandra in OpenStack to understand whether it can satisfy our requirements.
We used Cassandra Stress Test Tool on a cluster of 6 nodes. There was a simple replication strategy with a factor of 3. The network was 1Gb.
Every Cassandra node was configured with 8 vCPUs and 32 GB of RAM, which is the recommended ratio between CPU and memory for a Cassandra node. The test was conducted with one table, and the approximate test duration was 300 seconds.
On this slide, you can see the results of the benchmark. The Cassandra stress-test tool measures throughput as the number of operations per second, plus several request latencies that show the distribution of response time during the test.
We recorded the number of operations per second and the average, 99th-percentile, and maximum latencies, measured in milliseconds. In terms of deviation, the 99th-percentile and maximum latencies are the figures that can give you an idea of what should be examined in more detail.
This type of test can be executed very quickly after you've installed the cluster, and it can give you an insight into what kind of performance you can expect. For example, if your requirement is to serve 10,000 operations per second with an average latency of less than 10 ms, this Cassandra deployment in OpenStack can meet it. But also remember that Cassandra's data model and access patterns influence application performance as well.
Other recommendations for Cassandra cluster planning: the effective data size per Cassandra node is 3-5 TB; the number of tables in all keyspaces should be less than 1,000 to keep the compaction process effective; and 30 to 50% of disk space should be free for the compaction process.
As for the recommended storage options, DataStax recommends running Cassandra on bare metal using local SSD drives.
These are some of the technical details from the project that I decided to share with you within our short time frame.
So, in the last part of my presentation, I would like to say a few words about what Altoros contributed to the community from this project. Don't be surprised: even though we are working in such a restricted area as healthcare, we can find a way to spread ideas and experience.
During the project, we created a CF service broker for a Cassandra cluster that supports authentication and keyspace provisioning. We update it regularly to accommodate changes in the latest Cassandra versions.
And we're continuously improving the ELK stack; specifically, we have added a number of inputs and outputs to Logstash, such as RabbitMQ and Cassandra. In this project, ELK serves as the main storage for all log events. Our team has developed an approach and some Logstash filters to merge multiple lines of exceptions and stack traces into one message object in Elasticsearch. This helps to find and view the full context of any application error in Kibana.
We also developed a web tool that allows developers who work with Cassandra to view keyspaces and objects, run any valid Cassandra CQL statements, and store them in history. This tool is extremely useful if you need to interact with a Cassandra cluster in a private cloud without access to any of the Cassandra nodes. We were inspired by DataStax DevCenter, a desktop tool for working with a Cassandra cluster that requires direct connectivity to the cluster nodes. For a private cloud in OpenStack behind a firewall, DataStax DevCenter doesn't work, so we needed a web-based tool. Moreover, it's a Cloud Foundry-ready application.
That’s all in this short presentation. I’ll be glad to answer your questions.