Streaming API clients perform many of the same operations as our Streaming API servers. We'll discuss our streaming server's internal architecture and stream-processing algorithms, and how they relate to a typical client implementation. The focus will be on techniques for sorting and de-duplicating infinite, roughly sorted, at-least-once delivery streams; preventing data loss; scaling for the Firehose; and practical operational experience.
This presentation describes the challenges we faced building, scaling and operating a Kubernetes cluster of more than 1000 nodes to host the Datadog applications
Running large Kubernetes clusters is challenging. This talk focuses on how you can optimize your network setup in clusters with 1000-2000 nodes. It discusses standard ingress solutions and their drawbacks, as well as potential alternatives.
This document discusses deploying and managing Apache Solr at scale. It introduces the Solr Scale Toolkit, an open source tool for deploying and managing SolrCloud clusters in cloud environments like AWS. The toolkit uses Python tools like Fabric to provision machines, deploy ZooKeeper ensembles, configure and start SolrCloud clusters. It also supports benchmark testing and system monitoring. The document demonstrates using the toolkit and discusses lessons learned around indexing and query performance at scale.
ZooKeeper - wait-free protocol for coordinating processes (Julia Proskurnia)
ZooKeeper is a service for coordinating processes within distributed systems. A stress test of the tool was performed, and reliable multicast and dynamic Logback configuration management were implemented with ZooKeeper.
More details: http://proskurnia.in.ua/wiki/zookeeper_research
Scaling SolrCloud to a large number of Collections (Anshum Gupta)
Anshum Gupta presented on scaling SolrCloud to support thousands of collections. Some challenges included limitations on the cluster state size, overseer performance issues under high load, and difficulties moving or exporting large amounts of data. Solutions involved splitting the cluster state, improving overseer performance through optimizations and dedicated nodes, enabling finer-grained shard splitting and data migration between collections, and implementing distributed deep paging for large result sets. Testing was performed on an AWS infrastructure to validate scaling to billions of documents and thousands of queries/updates per second. Ongoing work continues to optimize and benchmark SolrCloud performance at large scales.
Docker is all the rage these days. While one doesn't hear much about Solr on Docker, we're here to tell you not only that it can be done, but also share how it's done.
We'll quickly go over the basic Docker ideas - containers are lighter than VMs, they solve "but it worked on my laptop" issues - so we can dive into the specifics of running Solr on Docker.
We'll do a live demo showing you how to run Solr master-slave as well as SolrCloud using containers, how to manage CPU assignments, constrain memory, and use Docker data volumes when running Solr in containers. We will also show you how to create your own containers with custom configurations.
Finally, we'll address one of the core Solr questions - which deployment type should I use? We will demonstrate performance differences between the following deployment types:
- Single Solr instance running on a bare metal machine
- Multiple Solr instances running on a single bare metal machine
- Solr running in containers
- Solr running on a virtual machine
- Solr running on a virtual machine using a unikernel
For each deployment type we'll address how it impacts performance, operational flexibility and all other key pros and cons you ought to keep in mind.
My personal highlights from the Reactive Summit 2017. I loved the conference from the beginning till the end and I shared some of that with my Reactive Amsterdam meetup. All content belongs to the respective speakers.
Deploying Immutable infrastructures with RabbitMQ and Solr (Jordi Llonch)
This document discusses deploying immutable infrastructures for RabbitMQ and Solr clusters. It describes how to deploy a new RabbitMQ cluster using federated queues to migrate services from the old to new cluster with zero downtime. For Solr, it explains how to deploy a new cluster and reindex data from the old cluster using double near real-time indexing before switching search traffic over. Maintaining both clusters allows for A/B testing, performance testing, and functional testing of new configurations without impacting real users.
Running Kubernetes at scale is challenging and you can often end up in situations where you have to debug complex and unexpected issues. This requires understanding in detail how the different components work and interact with each other. Over the last 3 years, Datadog migrated most of its workloads to Kubernetes and now manages dozens of clusters consisting of thousands of nodes each. During this journey, engineers have debugged complex issues with root causes that were sometimes very surprising. In this talk Laurent and Tabitha will share some of these stories, including a favorite: how a complex interaction between familiar Kubernetes components allowed an OOM-killer invocation to trigger the deletion of a namespace.
Kube-proxy enables access to Kubernetes services (virtual IPs backed by pods) by configuring client-side load-balancing on nodes. The first implementation relied on a userspace proxy which was not very performant. The second implementation used iptables and is still the one used in most Kubernetes clusters. Recently, the community introduced an alternative based on IPVS. This talk will start with a description of the different modes and how they work. It will then focus on the IPVS implementation, the improvements it brings, the issues we encountered and how we fixed them as well as the remaining challenges and how they could be addressed. Finally, the talk will present alternative solutions based on eBPF such as Cilium.
10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you! (Laurent Bernaille)
Kubernetes is a very powerful and complicated system, and many users don’t understand the underlying systems. Come learn how your users can abuse container runtimes, overwhelm your control plane, and cause outages - it’s actually quite easy!
In the last year, we have containerized hundreds of applications and deployed them in large-scale clusters (more than 1000 nodes). The journey was eventful and we learned a lot along the way. We'll share stories of our ten favorite Kubernetes foot guns, including the dangers of cargo culting, rolling updates gone wrong, the pitfalls of initContainers, and nightmarish daemonset upgrades. The talk will present solutions we adopted to avoid or work around some of these problems, and will finally show several improvements we plan to deploy in the future.
Similar to the Kubecon talk with the same title with a few new incidents.
Self-Created Load Balancer for MTA on AWS (sharu1204)
This document summarizes the creation of a self-managed load balancer on AWS to distribute mail traffic across multiple mail gateway servers. It describes the existing mail system architecture, the need for a load balancer due to traffic volume limitations, and the technical implementation using Linux Virtual Server (LVS) and keepalived for load balancing and iptables for network address translation (SNAT) to support load balancing of SMTP traffic. The results were an increased ability to scale mail gateway servers elastically and observe traffic patterns from email services like Google Apps. A note of caution is provided about network bandwidth limitations based on the EC2 instance type used for the load balancer.
Realtime Statistics based on Apache Storm and RocketMQ (Xin Wang)
This document discusses using Apache Storm and RocketMQ for real-time statistics. It begins with an overview of the streaming ecosystem and its components. It then describes challenges with stateful statistics and introduces Alien, an open-source middleware for handling stateful event counting. The document concludes with best practices for Storm performance and data hot spots.
The document discusses proposed enhancements to improve the performance of Apache Storm 2.0. It analyzes the current messaging architecture and identifies bottlenecks. Preliminary testing shows the redesigned messaging architecture improves latency by 116x and increases throughput by 50% over Storm 1.0 and the 2.0 master branch. Further optimizations to grouping, tuple implementation, and the acking mechanism could potentially yield even higher throughput of 15 million tuples per second. A new threading and execution model is also proposed to improve performance.
The Kubernetes audit logs are a rich source of information: all of the calls made to the API server are stored, along with additional metadata such as usernames, timings, and source IPs. They help to answer questions such as “What is overloading my control plane?” or “Which sequence of events led to this problematic situation?”. These questions are hard to answer otherwise—especially in large clusters. At Datadog, we have been running clusters with 1000+ nodes for more than a year and during that time, the audit logs have proved invaluable.
In this presentation, we will first introduce the audit logs, explain how they are configured, and review the type of data they store. Finally, we will describe in detail several scenarios where they have helped us to diagnose complex problems.
This document discusses various configurations and techniques for optimizing Kafka for latency, throughput, durability, and availability. For latency, it recommends small batches, no compression, low replication guarantees, and fetching data as soon as possible. For throughput, it suggests batching, compression, increasing memory, and parallelizing with consumer groups. For durability, it highlights replication, idempotent producers, and exactly-once processing. And for availability, it notes the importance of replicas and log recovery configurations. The document provides guidance on tuning Kafka deployments for different performance objectives.
Docker and Maestro for fun, development and profit (Maxime Petazzoni)
Presentation on MaestroNG, an orchestration and management tool for multi-host container deployments with Docker.
#lspe meetup, February 20th, 2014 at Yahoo!'s URL café.
SaltConf14 - Anita Kuno, HP & OpenStack - Using SaltStack for event-driven or... (SaltStack)
This talk will highlight how the OpenStack Infrastructure team uses SaltStack for event-driven orchestration of its various cloud infrastructure components. The speakers will review the flexibility of Salt in a complex automation environment. Salt plays very well with other tools, including Puppet, which is especially critical in the OpenStack Infrastructure environment which requires the event-driven orchestration functions of Salt to synchronize workflow timing of OpenStack Infrastructure components and events.
To learn when and where the next SaltConf will be, subscribe to our newsletter here: http://www.saltstack.com/salt-ink-newsletter or follow us on Twitter: http://www.twitter.com/saltstackinc
Gregory Holt proposes a simpler Container Sync feature for OpenStack Swift that replicates objects across geographically distinct clusters with fewer configuration options than originally planned. The new approach replicates all objects from a source container to a destination container without tracking remote replica counts. It allows per-container sync configuration and could support migrating accounts between clusters. Key components include updating the Swift API and adding a new daemon to handle background synchronization between clusters.
DNS is one of the Kubernetes core systems and can quickly become a source of issues when you’re running clusters at scale. For over a year at Datadog, we’ve run Kubernetes clusters with thousands of nodes that host workloads generating tens of thousands of DNS queries per second. It wasn’t easy to build an architecture able to handle this load, and we’ve had our share of problems along the way.
This talk starts with a presentation of how Kubernetes DNS works. It then dives into the challenges we’ve faced, which span a variety of topics related to load, connection tracking, upstream servers, rolling updates, resolver implementations, and performance. We then show how our DNS architecture evolved over time to address or mitigate these problems. Finally, we share our solutions for detecting these problems before they happen—and identifying misbehaving clients.
Scaling an invoicing SaaS from zero to over 350k customers (Speck&Tech)
ABSTRACT: Fatture in Cloud was born in late 2013 on a single-server machine and scaled from zero to 35k customers by the end of 2018. Then we faced mandatory electronic invoicing, which came into effect in Italy on 1st January 2019, and we experienced huge growth to 350k customers in a few months. In these 5 years, I've learned a lot about cloud architecture, scalability, optimization, and DevOps, and we eventually achieved 99.99% uptime even during the huge growth period.
BIO: Daniele Ratti is the Founder and CEO of Fatture in Cloud, currently the leading invoicing platform in Italy, counting more than 350k customers.
Integration testing for Salt states using AWS EC2 Container Service (SaltStack)
A SaltConf16 use case talk by Steven Braverman of Dun & Bradstreet. Testing configuration changes for multiple server roles can be time consuming when real instances or legacy container systems are used. Applying configuration changes to each role in parallel can be difficult. So what's the best way to test configuration changes efficiently, quickly, and securely prior to applying them? See how an integrated test setup using AWS EC2 Container Service (ECS), AWS AutoScaling Group, and SaltStack simplifies the application of configuration changes and allows you to test configuration changes in parallel to reduce the time spent testing.
New features in Ansible 2.0 include improved variable management, better use of object-oriented programming, and many new modules. It was released on September 8th. Execution strategies were added, allowing plays to run tasks in parallel using the "free" strategy. Blocks were enhanced to better handle errors and always/rescue tasks. Notable new modules help manage packages, execute commands with responses, and work with AWS services like EC2, IAM, S3, and Route53. Playbooks should be fully compatible with 2.0 but newer modules can be used right away.
Kubernetes networking can be complex to scale due to issues like growing iptables rules, but newer solutions are helping. Pod networking uses CNI plugins like flannel or Calico to assign each pod an IP and allow communication. Service networking uses kube-proxy and iptables or IPVS for load balancing to pods. DNS is used to resolve service names to IPs. While Kubernetes networking brings flexibility, operators must learn the nuances of their specific CNI plugin and issues can arise, but the ecosystem adapts quickly to new needs and changes don't impact all workloads.
The CAP theorem states that it is impossible for a distributed computer system to simultaneously provide consistency, availability, and partition tolerance; during a partition, you must give up one of the first two. Many systems sacrifice consistency (eventual consistency), making them AP (available during partition), while CP systems (consistent during partition) sacrifice availability instead. With AP systems like MongoDB, updates will propagate between nodes eventually, so clients may temporarily see inconsistent or stale data. CP systems guarantee consistency during a partition by blocking writes.
In the big data world, our data stores communicate over an asynchronous, unreliable network to provide a facade of consistency. However, to really understand the guarantees of these systems, we must understand the realities of networks and test our data stores against them.
Jepsen is a tool which simulates network partitions in data stores and helps us understand the guarantees of our systems and their failure modes. In this talk, I will help you understand why you should care about network partitions and how we can test data stores against partitions using Jepsen. I will explain what Jepsen is, how it works, and the kinds of tests it lets you create. We will try to understand the subtleties of distributed consensus and the CAP theorem, and demonstrate how different data stores such as MongoDB, Cassandra, Elasticsearch and Solr behave under network partitions. Finally, I will describe the results of the tests I wrote using Jepsen for Apache Solr and discuss the kinds of rare failures which were found by this excellent tool.
Abstract:
Cassandra is a new kind of database: it is more than a single-machine system. It naturally runs in a High-Availability configuration. All nodes in the system are symmetric; there is no single point of failure. As you add machines, failure becomes routine, and Cassandra is built to tolerate that with no interruptions.
Cassandra is linearly scalable with good performance characteristics for very small and very large data stores. Unlike earlier efforts, Cassandra is more than just a key-value store; it is a structured data store which can facilitate complex use cases and queries. Cassandra allows for random access to your data organized into rows and columns.
Cassandra is different, and exciting. This presentation will discuss the pros and cons of using Cassandra, and why it has seen such amazing adoption in the past year.
Bio:
Ben Coverston is Director of Operations at DataStax (formerly known as Riptano), a provider of software, support, services, training, resources and help for Cassandra. He has been involved in enterprise software his entire career. Working in the airline industry, he helped to build some of the highest-volume online booking sites in the world. He saw firsthand the consequences of trying to solve real-world scalability problems at the limit of what traditional relational databases are capable of.
The document discusses using dashboards and information radiators to track key metrics and provide visibility. It provides examples of metrics that could be tracked, such as requests per second and installs per minute. It also discusses Kafka, a tool for building real-time data pipelines and streaming applications, and how it provides high-throughput, persistent, publish-subscribe messaging capabilities. The document recommends not flying blind, checking out Kafka, and being creative with dashboards and information radiators.
The Data Mullet: From all SQL to No SQL back to Some SQL (Datadog)
This document discusses Datadog's data architecture, which uses a combination of SQL and NoSQL databases. It initially used all SQL (Postgres) but found it did not scale well. It added Cassandra for durable storage and Redis for in-memory storage to improve performance and scalability. While Cassandra provided large-scale durable storage, it had issues with I/O latency on EC2. The document examines different database choices and how Datadog addressed scaling and latency issues through a hybrid "data mullet" approach using different databases for their strengths.
Generators, Coroutines and Other Brain Unrolling Sweetness - Adi Shavit (corehard_by)
C++20 brings us coroutines, and with them the power to create generators, iterables and ranges. We'll see how coroutines allow for cleaner, more readable code, easier abstraction and genericity, and composition, while avoiding callbacks and inversion of control. We'll discuss the pains of writing iterator types with distributed internal state and old-school coroutines. Then we'll look at C++20 coroutines and how easy they make it to write clean linear code. Coroutines prevent inversion of control and reduce callback hell. We'll see how they compose and play with Ranges, with examples from math, filtering, and rasterization. The talk will focus more on co_yield and less on co_await and async-related usages.
How to build a high-load service without knowing the amount of load in advance / Oleg Oble... (Ontico)
There are many architectures and ways of scaling systems. Today many companies migrate to cloud services or use containers. But is this really necessary, and should you follow the trends?
In this talk I would like to describe the architecture I designed and deployed at InnoGames: an architecture that requires no administrator intervention when load spikes and, more importantly, can shrink itself when there is no load, to save costs.
You will learn about the experience of building a service with very demanding requirements, and see that you don't have to pay three times more for AWS or any similar system.
- What CRM is, and why we need this service.
- Infrastructure.
-- Graphite: why it must be reliable and fast.
-- Puppet + GitLab.
-- Load balancing.
-- Our cloud: why do we need OpenStack when there is serveradmin!? How a server's role is defined by a few attributes in a web interface.
-- Nagios + aggregators: another take on monitoring services through Graphite.
-- Cluster monitoring: Clusterhc and Grafsy.
-- Brassmonkey: how we wrote our own sysadmin in Python.
-- Backups.
- The CRM3 architecture.
- Autoscaling, or how to analyze a pile of data and make decisions.
This document discusses the author's experience using PostgreSQL in cloud environments over several startup projects. Some key lessons learned include: PostgreSQL can work well on Amazon EC2 but I/O performance is unpredictable; redundant infrastructure is still vulnerable to failures so regular backups are important; and query performance relies heavily on buffer cache warming so load testing is critical. Overall the document emphasizes being pragmatic with PostgreSQL, focusing on architecture choices that minimize I/O bottlenecks and failures in the cloud.
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka... (confluent)
In the financial industry, losing data is unacceptable. Financial firms are adopting Kafka for their critical applications. Kafka provides the low latency, high throughput, high availability, and scale that these applications require. But can it also provide complete reliability? As a system architect, when asked “Can you guarantee that we will always get every transaction,” you want to be able to say “Yes” with total confidence.
In this session, we will go over everything that happens to a message – from producer to consumer, and pinpoint all the places where data can be lost – if you are not careful. You will learn how developers and operation teams can work together to build a bulletproof data pipeline with Kafka. And if you need proof that you built a reliable system – we’ll show you how you can build the system to prove this too.
Tomas Doran presented TIM Group's implementation of Logstash, which processes over 55 million messages per day. Their applications are all Java/Scala/Clojure, and they developed their own library to send structured log events as JSON to Logstash, using ZeroMQ for reliability. They index data in Elasticsearch and use it for metrics, alerts and dashboards, but face challenges with data growth.
The document discusses the author's journey with functional programming languages and Clojure specifically. It covers the author's background and interests that led to Clojure, key features of Clojure like immutability and persistent data structures, approaches to modeling data and processes, and ideas around computation and efficiency in a functional style.
Kubernetes can orchestrate and manage container workloads through components like Pods, Deployments, DaemonSets, and StatefulSets. It schedules containers across a cluster based on resource needs and availability. Services enable discovery and network access to Pods, while ConfigMaps and Secrets allow injecting configuration and credentials into applications.
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO... (ScyllaDB)
Outbrain is the world's largest content discovery platform. Learn about their use case with Scylla, where they lowered latency while doing 20X the IOPS of Cassandra.
"It's important that even under load, Apache Kafka ensures user topics are fully replicated in synch.
Replication is essential to endure resilience to data loss, so both users and operators care about it.
If a topic partition falls out of the ISR (In-Synch-replicas) set, a user experiences unavailability (when producing with the default acknowledgment setting).
Users may use non-default acks mode to work around it, but the effect on a Kafka cluster is to make the under-replication worse.
Even simple Under replication with no Under Min Isr is to be avoided as a cluster update may cause the dreaded Under Min ISR.
There are a number of settings that can be used, from quotas to number of replication threads to more low-level settings.
This session wants to show how we successfully measured and evolved our Kafkas configuration, with the goal of giving the best possible user experience (and resilience to their data).
Hofstadter's Law applied!
""It always takes longer than you expect, even when you take into account Hofstadter's Law."""
Slides from the presentation "Modern Cryptography" delivered at Devoxx UK 2013. See Parleys.com for the full video: https://www.parleys.com/speaker/5148920c0364bc17fc5697a5
We designed a new framework made for microservices, making it easier for developers to build microservices-based systems - systems that communicate asynchronously, self-heal, scale elastically and remain responsive no matter what bad stuff is happening.
And all this without the pain of selecting and mixing components from a plethora of libraries that were originally built for other things.
In this presentation, we reveal this new way for Java developers to not only understand and begin building microservices, but also to seamlessly push them into staging and production.
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented... (Lucidworks)
George Bailey and Cameron Baker of Rackspace presented their solution for indexing over 50,000 documents per second for Rackspace Email. They modernized their system using Apache Flume for event processing and aggregation and SolrCloud for real-time search. This reduced indexing time from over 20 minutes to under 5 seconds, reduced the number of physical servers needed from over 100 to 14, and increased indexing throughput from 1,000 to over 50,000 documents per second while supporting over 13 billion searchable documents.
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLab
Big Data with Hadoop & Spark Training: http://bit.ly/2kvXlPd
This CloudxLab Introduction to Apache ZooKeeper tutorial helps you to understand ZooKeeper in detail. Below are the topics covered in this tutorial:
1) Data Model
2) Znode Types
3) Persistent Znode
4) Sequential Znode
5) Architecture
6) Election & Majority Demo
7) Why Do We Need Majority?
8) Guarantees - Sequential consistency, Atomicity, Single system image, Durability, Timeliness
9) ZooKeeper APIs
10) Watches & Triggers
11) ACLs - Access Control Lists
12) Usecases
13) When Not to Use ZooKeeper
Writing a concurrent program is hard; maintaining one is even more of a nightmare. Fortunately, a pattern that helps us write good concurrent code is available: using "channels" to communicate.
This talk shares the channel concept using common libraries, like threading and multiprocessing, to make concurrent code elegant.
This is the talk given at PyCon TW 2017 [1] and PyCon APAC/MY 2017 [2].
[1]: https://tw.pycon.org/2017
[2]: https://pycon.my/pycon-apac-2017-program-schedule/
3. Turtles All The Way Down
• Your client ≅ Our server
• Gather Events
• Parse JSON
• Match on Predicates
• Route to Consumers
4. Properties
• Offered:
  • At Least Once
  • Roughly Sorted (K-Sorted)
• Desired:
  • Exactly Once
  • Sorted
5. Plan
• Over-deliver: Ensure At Least Once
• De-duplicate: Unordered Exactly Once
• Sort: Ordered Exactly Once
6. Why At Least Once?
• Exactly Once is impractical across streams
• Clients must handle reconnect over-delivery
• Reuse this capability to:
  • Mask upstream failures
  • Relax server restart issues
9. [image-only slide]
10. Startup
• Prefetch from peer to populate circular buffer
• Go multi-user
• Consume Kestrel backlog - duplicates between:
  • Buffer and backlog
  • Previous connection and backlog
• Steady State: Exactly Once Delivery
12. Upstream Failure
• Cascaded source fails
• Fail over to next peer
• Over-request to avoid loss
• Steady State: Exactly Once Delivery
13. Client Over-delivery
• Use the Count Parameter after a fast reconnect
• Deep backfill from the REST API if the client was offline for a while
• Overlap connections slightly when the user first issues a new query
15. Infinite Streams
• De-duplicating a randomly ordered infinite stream requires infinite time and storage
• Sorting? Ditto
• I have neither infinite time nor storage
16. Roughly Sorted
• A sequence a₁ … aₙ is k-sorted iff ∀ i, r, 1 ≤ i ≤ r ≤ n, i < r - k implies aᵢ ≤ aᵣ
• Strictly sorted is 0-sorted.
• Transposing two adjacent values in a 0-sorted sequence makes it 1-sorted.
• What is K for the Firehose?
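A naive checker for this definition can serve as a sanity test. This is a minimal sketch (the function name is mine, and the quadratic scan is fine for small samples, not production streams):

def is_k_sorted(seq, k):
    # k-sorted per slide 16: elements more than k positions apart
    # must already be in order (i < r - k implies seq[i] <= seq[r],
    # written 0-indexed here).
    n = len(seq)
    return all(seq[i] <= seq[r]
               for i in range(n)
               for r in range(i + k + 1, n))

assert is_k_sorted([1, 2, 3, 4], 0)      # strictly sorted is 0-sorted
assert not is_k_sorted([2, 1, 3, 4], 0)  # one adjacent transposition...
assert is_k_sorted([2, 1, 3, 4], 1)      # ...makes the sequence 1-sorted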
19. Pessimist’s K
• In theory, K could be hours and millions of events
• In practice, if both current and stale queues exist:
  • We’ll flush the stale queues before exposing them
  • You’ll never know this happened
• If all queues are stale:
  • We’ll deliver the backlog
  • K remains reasonable
20. Unordered De-duplication
• Create two HashSets, Primary and Secondary, each preallocated to size K
• A new event is a duplicate if its ID exists in Primary
• Add the new ID to both HashSets
• When Primary.size > K / 2:
  Primary.clear
  Swap Primary & Secondary
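A minimal Python sketch of this two-HashSet scheme. The class and method names are mine, and the swap here triggers on the younger set's size, which is one reasonable reading of the slide; it guarantees that at least the last `window` IDs are always retained:

class TwoSetDeduplicator:
    # Generational de-duplication per slide 20. Every ID goes into
    # both sets, so Primary is always a superset of Secondary and a
    # single lookup covers both generations. Clearing a whole set
    # ages out old IDs in O(1), with no per-item bookkeeping.
    def __init__(self, window):
        # window should comfortably exceed the stream's K; the exact
        # sizing is a tuning assumption, not something the slide fixes.
        self.window = window
        self.primary = set()    # previous + current generation
        self.secondary = set()  # current generation only

    def is_duplicate(self, event_id):
        if event_id in self.primary:
            return True
        self.primary.add(event_id)
        self.secondary.add(event_id)
        if len(self.secondary) > self.window:
            # Drop the oldest generation: the current generation
            # becomes the new Primary, the cleared set is reused.
            self.primary.clear()
            self.primary, self.secondary = self.secondary, self.primary
        return False

Memory stays bounded at roughly twice the window size, and any duplicate arriving within the last `window` events is caught.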
21. Unordered De-duplication
• Bounded memory consumption
• O(n) behavior
• Low latency:
  • Emit first tweet
  • Discard subsequent duplicates
• Cheaper than de-duplication by sorting? Probably depends on K
22. Ordered & De-duplicated
• Insertion sort and de-duplicate by ID into a decreasing-order list
• While length > K, remove sorted tail
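A sketch of slide 22 in Python. The slide keeps a decreasing list and trims its tail; this version keeps an ascending list and emits from the front, which is the same idea expressed with bisect (names are illustrative):

from bisect import bisect_left

def sorted_dedup(stream, k):
    # Insertion-sort-and-deduplicate for a k-sorted stream of IDs.
    # Once k newer arrivals have been seen, the smallest buffered
    # element cannot be displaced by any future event, so it is
    # safe to emit.
    buf = []  # ascending; buf[0] is the oldest candidate
    for eid in stream:
        i = bisect_left(buf, eid)
        if i < len(buf) and buf[i] == eid:
            continue        # duplicate ID: discard
        buf.insert(i, eid)  # the insertion-sort step
        if len(buf) > k:
            yield buf.pop(0)  # sorted, de-duplicated output
    yield from buf  # only reachable if the stream ends (sketch only)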
23. Ordered & De-duplicated
• O(n) best case, O(n * K) worst case
• Bounded memory consumption
• Induces a latency of K
• Assumes the average item is not very far out of order
• K is usually large, to handle the outliers
24. Routing Events
• By Keyword or by UserId
• Add predicates to HashMap
• Apply events to Map
• Query holds private predicate set for later Map removal
• O(n)
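A sketch of this predicate map in Python, assuming keyword predicates and consumer objects with a deliver method (the structure is mine; the slide only specifies a HashMap of predicates):

from collections import defaultdict

class PredicateRouter:
    def __init__(self):
        self.routes = defaultdict(set)  # keyword -> consumers
        self.predicates = {}            # consumer -> its private keyword set

    def register(self, consumer, keywords):
        # Each query keeps its own predicate set so it can be
        # removed from the map later (slide 24).
        self.predicates[consumer] = set(keywords)
        for kw in self.predicates[consumer]:
            self.routes[kw].add(consumer)

    def unregister(self, consumer):
        for kw in self.predicates.pop(consumer, ()):
            self.routes[kw].discard(consumer)

    def route(self, event, tokens):
        # Collect targets in a set first: a consumer whose several
        # OR predicates all match should still get the event once.
        targets = set()
        for token in tokens:
            targets |= self.routes.get(token, set())
        for consumer in targets:
            consumer.deliver(event)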
26. Monitoring
• What to look at?
  • Latency
  • Throughput
  • Errors
• Alerting
27. Horizontal Scale
• The Firehose keeps growing.
• Eventually a single Firehose stream will become impractical.
• Partition the Firehose into N streams.
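One simple way to realize such a partitioning (a sketch, not necessarily Twitter's scheme): assign each status ID to one of N streams with a stable modulo, so each consumer sees a disjoint, still roughly sorted slice.

def stream_for(status_id: int, n_streams: int) -> int:
    # Stable assignment: the same ID always lands in the same
    # partition, and each partition stays roughly sorted.
    return status_id % n_streams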
Editor's Notes
There is a lot of symmetry in what the Streaming API servers do and what your streaming clients do.
In both cases we’re gathering events, parsing them, and farming them out to various consumers.
The issues are similar at all processing points in the stream.
We present a stream of events that is roughly sorted by created at time.
This means that the events are mostly in created at time order, but not exactly so.
We’ve designed our system to publish each event at least once --
which means none are lost, but there may, at times, be duplicates.
I’ll discuss why our streams have these properties.
Also, you’ll probably want to display or process tweets exactly once -- none missing and none duplicated.
You might also want to present them sorted, or you might be OK with a rough sorting.
I’ll go over two algorithms for converting what the API offers into the stream that you want.
The basic plan is to over deliver events and then de-duplicate them to provide an exactly once quality of service.
One technique is to just de-duplicate with set logic, the other is to sort and de-duplicate.
There are trade offs with each.
First, let’s see why the Streaming API offers events at least once.
It would be nice if we could offer everything transactionally, that is, exactly once.
But, it’s impractical to synchronize this state across client reconnections.
For example, it’s unlikely that you’ll reconnect to the same server.
Also, event streams aren’t strictly ordered, so we wouldn’t know what to deliver.
We’d have to coordinate a large vector of sent events between servers.
And, clients would have to transactionally acknowledge all events received.
This is quite impractical at scale unless we sorted the streams, but sorting would introduce latency.
We’ll see why sorting induces latency later.
Yet, first and foremost, we want a very low latency experience.
And, we want a simple programming model for clients.
So, we assume that clients can over-request when reconnecting, and post process to get the required stream properties.
Once we make this fundamental assumption, we can reuse this to also handle the internal data loss risk as well.
Our Streaming API server is called Hosebird.
Hosebird receives events from the rest of the Twitter system through Kestrel message queues.
Two hosebird processes in each cluster read transactionally from Kestrel.
The rest of the servers in a cluster cascade via Streaming HTTP.
When a hosebird server starts, it prefetches events from a peer to pre-populate its circular buffers.
These buffers are used to support the count parameter, which allows some historical back fill on streaming queries.
Count allows your stream to start back a few minutes, then catch up and transition to real time streaming.
This startup prefetching creates a window where you might see the same event twice, if you are unlucky enough to connect to a very recently restarted server.
The backlog read from kestrel will contain some of the same events that were prefetched into the buffer.
The backlog may also have events that you read on your last connection.
You might have to suffer through a minute or so of duplicates as the backlog is processed and displaces the prefetched events in the circular buffer.
Outside of this restart case, during steady state processing, we deliver each event exactly once on fanout servers.
When a cascaded server has its source Hosebird restart, say during a deploy, the server needs to quickly fail over to another source.
A gap in the stream would be introduced during the failure, detection and reconnection window.
We cover this gap by requesting some back-fill from the new source.
This causes a short period of duplicated events.
During steady state processing, however, we deliver each event exactly once on cascaded servers.
Your client should use these same techniques on reconnect.
Over request with the count parameter if the connection was momentarily lost.
If the client has been disconnected for an extended period, you’ll have to back fill from the REST API.
When you need to make a predicate change, you can create a new connection, wait for the first event to arrive, then disconnect the old connection.
This should generally produce an at least once stream.
Let’s talk about de-duplication on your end.
A finite stream looks a lot like a relational database table -- a finite relation.
We’re used to thinking about finite relations.
But, a stream appears as an infinite relation, you can’t ever read to the end.
Also, since we want very low latency, we can’t wait to read to the end.
We have to present results immediately.
A roughly sorted sequence is mostly sorted, where no element is more than K positions away from its strictly sorted position.
At Twitter, we talk about K sorted things all the time.
K this, K that. Nothing is strictly ordered.
We have relaxed various legs of the CAP theorem to make our distributed system feasible.
We’ve never had strictly ordered event processing. Tweets are applied to your timelines in a rough ordering.
On the REST API, we sort the vector before we present it to you, but it’s very loose behind the scenes.
Likewise, events show up in the Streaming API roughly sorted by created at time.
Here are two samples from the status firehose.
I took five hundred thousand status ids, and did an insertion sort into a reverse sorted list.
The most recent id at the head, the oldest status at the tail.
These distributions show the number of list elements traversed before finding the sorted insertion point.
So, the average and median number of hops are pretty small.
The hundred percent case, the worst case, shows a much larger K.
Assuming about 600 events per second on this stream,
back when I took this sample,
we can see that events show up as much as 5 seconds out of order.
Close comparison of the distributions shows that they’re very noisy.
If you took many samples, they’d all have a different shape.
Having an idea of K helps us tune our de-duplication algorithms.
Daily operational issues cause K to grow beyond 5 seconds now and then.
It’s hard to say what a good upper bound for a display client should be.
Something around a few minutes would cover most issues we’ve had over the last six months.
A long-term storage client might want to assume a K of a few hours or a day or so.
In the unlikely event that something goes really wrong with the system, we’ll make a judgement call on recovery.
We’ll probably bias towards delivering the backlog, but, if there’s a partial failure, we’ll keep your K in mind.
Now that we have a handle on K we can think about de-duplication.
An infinite, but roughly sorted, stream can be de-duplicated with some set logic.
The key is efficiently aging out irrelevant set members.
One way is to keep two hashes and alternately clear them.
You don’t have to do any fancy tracking of items, and off-the-shelf HashSets will work just fine.
The union of the two sets contains at least K items and allows de-duplication of a K-sorted sequence.
Given the Firehose K, you don’t even need all that much space to de-duplicate.
Please don’t resort to using MySQL primary keys to de-duplicate streams. It’s unnecessary.
The nice thing here is that we can emit events as they arrive and throw away late arriving dups.
We don’t need to add any latency.
On the other hand, if we want a sorted and deduplicated stream, we have to do a little more work.
Given the Firehose K distribution, doing an insertion sort isn’t the worst thing.
Most events don’t need to traverse too deeply into the list.
Elements dequeued from the tail of the list are sorted and deduplicated.
This algorithm does, however, introduce a latency of K.
We can’t emit a sorted event unless we have at least K elements to examine.
Still, this is quite practical to do in memory.
You can plow through a lot of ids per second even in a scripting language like Ruby.
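For concreteness, here is a toy run of the sorted_dedup sketch from slide 22, with made-up IDs:

sample = [1, 3, 2, 2, 4, 6, 5, 7, 9, 8]       # 2-sorted, one duplicate
print(list(sorted_dedup(iter(sample), k=2)))  # -> [1, 2, 3, 4, 5, 6, 7, 8, 9]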
Now that we have a de-duplicated stream, we need to route it to consumers.
This can be done very cheaply by registering every consumer’s predicates in a HashMap.
If, say, you are displaying columns of search results, like TweetDeck,
you can have each column register its keywords in the HashMap.
Each new event is applied to the HashMap, and routed to all consumers easily.
Duplicates can arise, as a given column may have several OR predicates that match.
Hosebird uses a generational de-duplication scheme to solve this.
This scheme is the degenerate case of the sorted algorithm above.
Each client stream maintains just the primary key of the last event.
If the same id is presented twice in a row, it can be discarded.
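As a sketch, that last-ID check is just a one-element window (the helper name is hypothetical):

def drop_consecutive_repeats(stream):
    # Degenerate de-duplication: remember only the previous ID and
    # discard an event when the same ID arrives twice in a row.
    last = None
    for eid in stream:
        if eid != last:
            last = eid
            yield eid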
Break things up into components.
Host components in separate processes.
Measure what happens between components.
Use (reliable) queues between components.