Streaming API clients perform many of the same operations as our Streaming API servers. We'll discuss our streaming server's internal architecture and stream-processing algorithms, and how they relate to a typical client implementation. The focus will be on techniques for sorting and de-duplicating infinite, roughly sorted, at-least-once delivery streams; preventing data loss; scaling for the Firehose; and practical operational experience.
This presentation describes the challenges we faced building, scaling and operating a Kubernetes cluster of more than 1000 nodes to host the Datadog applications
Running large Kubernetes clusters is challenging. This talk focuses on how you can optimize your network setup in clusters with 1000-2000 nodes. It discusses standard ingress solutions and their drawbacks, as well as potential alternatives.
This document discusses deploying and managing Apache Solr at scale. It introduces the Solr Scale Toolkit, an open source tool for deploying and managing SolrCloud clusters in cloud environments like AWS. The toolkit uses Python tools like Fabric to provision machines, deploy ZooKeeper ensembles, configure and start SolrCloud clusters. It also supports benchmark testing and system monitoring. The document demonstrates using the toolkit and discusses lessons learned around indexing and query performance at scale.
ZooKeeper - wait-free protocol for coordinating processes (Julia Proskurnia)
ZooKeeper is a service for coordinating processes within distributed systems. A stress test of the tool was performed, and reliable multicast and dynamic Logback configuration management were implemented with ZooKeeper.
More details: http://proskurnia.in.ua/wiki/zookeeper_research
Scaling SolrCloud to a large number of Collections (Anshum Gupta)
Anshum Gupta presented on scaling SolrCloud to support thousands of collections. Some challenges included limitations on the cluster state size, overseer performance issues under high load, and difficulties moving or exporting large amounts of data. Solutions involved splitting the cluster state, improving overseer performance through optimizations and dedicated nodes, enabling finer-grained shard splitting and data migration between collections, and implementing distributed deep paging for large result sets. Testing was performed on an AWS infrastructure to validate scaling to billions of documents and thousands of queries/updates per second. Ongoing work continues to optimize and benchmark SolrCloud performance at large scales.
Docker is all the rage these days. While one doesn't hear much about Solr on Docker, we're here to tell you not only that it can be done, but also share how it's done.
We'll quickly go over the basic Docker ideas - containers are lighter than VMs, they solve "but it worked on my laptop" issues - so we can dive into the specifics of running Solr on Docker.
We'll do a live demo showing you how to run Solr master-slave as well as SolrCloud using containers, how to manage CPU assignments, constrain memory, and use Docker data volumes when running Solr in containers. We will also show you how to create your own containers with custom configurations.
Finally, we'll address one of the core Solr questions - which deployment type should I use? We will demonstrate performance differences between the following deployment types:
- Single Solr instance running on a bare metal machine
- Multiple Solr instances running on a single bare metal machine
- Solr running in containers
- Solr running on a virtual machine
- Solr running on a virtual machine using a unikernel
For each deployment type we'll address how it impacts performance, operational flexibility and all other key pros and cons you ought to keep in mind.
My personal highlights from the Reactive Summit 2017. I loved the conference from the beginning till the end and I shared some of that with my Reactive Amsterdam meetup. All content belongs to the respective speakers.
Deploying Immutable infrastructures with RabbitMQ and Solr (Jordi Llonch)
This document discusses deploying immutable infrastructures for RabbitMQ and Solr clusters. It describes how to deploy a new RabbitMQ cluster using federated queues to migrate services from the old to new cluster with zero downtime. For Solr, it explains how to deploy a new cluster and reindex data from the old cluster using double near real-time indexing before switching search traffic over. Maintaining both clusters allows for A/B testing, performance testing, and functional testing of new configurations without impacting real users.
Running Kubernetes at scale is challenging and you can often end up in situations where you have to debug complex and unexpected issues. This requires understanding in detail how the different components work and interact with each other. Over the last 3 years, Datadog migrated most of its workloads to Kubernetes and now manages dozens of clusters consisting of thousands of nodes each. During this journey, engineers have debugged complex issues with root causes that were sometimes very surprising. In this talk Laurent and Tabitha will share some of these stories, including a favorite: how a complex interaction between familiar Kubernetes components allowed an OOM-killer invocation to trigger the deletion of a namespace.
Kube-proxy enables access to Kubernetes services (virtual IPs backed by pods) by configuring client-side load-balancing on nodes. The first implementation relied on a userspace proxy which was not very performant. The second implementation used iptables and is still the one used in most Kubernetes clusters. Recently, the community introduced an alternative based on IPVS. This talk will start with a description of the different modes and how they work. It will then focus on the IPVS implementation, the improvements it brings, the issues we encountered and how we fixed them as well as the remaining challenges and how they could be addressed. Finally, the talk will present alternative solutions based on eBPF such as Cilium.
10 ways to shoot yourself in the foot with kubernetes, #9 will surprise you! (Laurent Bernaille)
Kubernetes is a very powerful and complicated system, and many users don’t understand the underlying systems. Come learn how your users can abuse container runtimes, overwhelm your control plane, and cause outages - it’s actually quite easy!
In the last year, we have containerized hundreds of applications and deployed them in large-scale clusters (more than 1000 nodes). The journey was eventful and we learned a lot along the way. We'll share stories of our ten favorite Kubernetes foot guns, including the dangers of cargo culting, rolling updates gone wrong, the pitfalls of initContainers, and nightmarish daemonset upgrades. The talk will present solutions we adopted to avoid or work around some of these problems, and will finally show several improvements we plan to deploy in the future.
Similar to the Kubecon talk with the same title with a few new incidents.
Self-Created Load Balancer for MTA on AWS (sharu1204)
This document summarizes the creation of a self-managed load balancer on AWS to distribute mail traffic across multiple mail gateway servers. It describes the existing mail system architecture, the need for a load balancer due to traffic volume limitations, and the technical implementation using Linux Virtual Server (LVS) and keepalived for load balancing and iptables for network address translation (SNAT) to support load balancing of SMTP traffic. The results were an increased ability to scale mail gateway servers elastically and observe traffic patterns from email services like Google Apps. A note of caution is provided about network bandwidth limitations based on the EC2 instance type used for the load balancer.
Realtime Statistics based on Apache Storm and RocketMQ (Xin Wang)
This document discusses using Apache Storm and RocketMQ for real-time statistics. It begins with an overview of the streaming ecosystem and its components. It then describes challenges with stateful statistics and introduces Alien, an open-source middleware for handling stateful event counting. The document concludes with best practices for Storm performance and data hot spots.
The document discusses proposed enhancements to improve the performance of Apache Storm 2.0. It analyzes the current messaging architecture and identifies bottlenecks. Preliminary testing shows the redesigned messaging architecture improves latency by 116x and increases throughput by 50% over Storm 1.0 and the 2.0 master branch. Further optimizations to grouping, tuple implementation, and the acking mechanism could potentially yield even higher throughput of 15 million tuples per second. A new threading and execution model is also proposed to improve performance.
The Kubernetes audit logs are a rich source of information: all of the calls made to the API server are stored, along with additional metadata such as usernames, timings, and source IPs. They help to answer questions such as “What is overloading my control plane?” or “Which sequence of events led to this problematic situation?”. These questions are hard to answer otherwise—especially in large clusters. At Datadog, we have been running clusters with 1000+ nodes for more than a year and during that time, the audit logs have proved invaluable.
In this presentation, we will first introduce the audit logs, explain how they are configured, and review the type of data they store. Finally, we will describe in detail several scenarios where they have helped us to diagnose complex problems.
This document discusses various configurations and techniques for optimizing Kafka for latency, throughput, durability, and availability. For latency, it recommends small batches, no compression, low replication guarantees, and fetching data as soon as possible. For throughput, it suggests batching, compression, increasing memory, and parallelizing with consumer groups. For durability, it highlights replication, idempotent producers, and exactly-once processing. And for availability, it notes the importance of replicas and log recovery configurations. The document provides guidance on tuning Kafka deployments for different performance objectives.
Docker and Maestro for fun, development and profit (Maxime Petazzoni)
Presentation on MaestroNG, an orchestration and management tool for multi-host container deployments with Docker.
#lspe meetup, February 20th, 2014 at Yahoo!'s URL café.
SaltConf14 - Anita Kuno, HP & OpenStack - Using SaltStack for event-driven or... (SaltStack)
This talk will highlight how the OpenStack Infrastructure team uses SaltStack for event-driven orchestration of its various cloud infrastructure components. The speakers will review the flexibility of Salt in a complex automation environment. Salt plays very well with other tools, including Puppet, which is especially critical in the OpenStack Infrastructure environment which requires the event-driven orchestration functions of Salt to synchronize workflow timing of OpenStack Infrastructure components and events.
To learn when and where the next SaltConf will be, subscribe to our newsletter here: http://www.saltstack.com/salt-ink-newsletter or follow us on Twitter: http://www.twitter.com/saltstackinc
Gregory Holt proposes a simpler Container Sync feature for OpenStack Swift that replicates objects across geographically distinct clusters with fewer configuration options than originally planned. The new approach replicates all objects from a source container to a destination container without tracking remote replica counts. It allows per-container sync configuration and could support migrating accounts between clusters. Key components include updating the Swift API and adding a new daemon to handle background synchronization between clusters.
DNS is one of the Kubernetes core systems and can quickly become a source of issues when you’re running clusters at scale. For over a year at Datadog, we’ve run Kubernetes clusters with thousands of nodes that host workloads generating tens of thousands of DNS queries per second. It wasn’t easy to build an architecture able to handle this load, and we’ve had our share of problems along the way.
This talk starts with a presentation of how Kubernetes DNS works. It then dives into the challenges we’ve faced, which span a variety of topics related to load, connection tracking, upstream servers, rolling updates, resolver implementations, and performance. We then show how our DNS architecture evolved over time to address or mitigate these problems. Finally, we share our solutions for detecting these problems before they happen—and identifying misbehaving clients.
Scaling an invoicing SaaS from zero to over 350k customers (Speck&Tech)
ABSTRACT: Fatture in Cloud was born in late 2013 on a single-server machine and scaled from zero to 35k customers by the end of 2018. Then we faced mandatory electronic invoicing, which came into effect in Italy on 1st January 2019, and we experienced huge growth to 350k customers in a few months. In these 5 years, I've learned a lot about cloud architecture, scalability, optimization, and DevOps, and we eventually achieved 99.99% uptime even during the huge growth period.
BIO: Daniele Ratti is the Founder and CEO of Fatture in Cloud, currently the leading invoicing platform in Italy, counting more than 350k customers.
Integration testing for Salt states using AWS EC2 Container Service (SaltStack)
A SaltConf16 use case talk by Steven Braverman of Dun & Bradstreet. Testing configuration changes for multiple server roles can be time consuming when real instances or legacy container systems are used. Applying configuration changes to each role in parallel can be difficult. So what's the best way to test configuration changes efficiently, quickly, and securely prior to applying them? See how an integrated test setup using AWS EC2 Container Service (ECS), AWS AutoScaling Group, and SaltStack simplifies the application of configuration changes and allows you to test configuration changes in parallel to reduce the time spent testing.
New features in Ansible 2.0 include improved variable management, better use of object-oriented programming, and many new modules. It was released on September 8th. Execution strategies were added, allowing plays to run tasks in parallel using the "free" strategy. Blocks were enhanced to better handle errors and always/rescue tasks. Notable new modules help manage packages, execute commands with responses, and work with AWS services like EC2, IAM, S3, and Route53. Playbooks should be fully compatible with 2.0 but newer modules can be used right away.
Kubernetes networking can be complex to scale due to issues like growing iptables rules, but newer solutions are helping. Pod networking uses CNI plugins like flannel or Calico to assign each pod an IP and allow communication. Service networking uses kube-proxy and iptables or IPVS for load balancing to pods. DNS is used to resolve service names to IPs. While Kubernetes networking brings flexibility, operators must learn the nuances of their specific CNI plugin and issues can arise, but the ecosystem adapts quickly to new needs and changes don't impact all workloads.
The CAP theorem states that it is impossible for a distributed computer system to simultaneously provide consistency, availability, and partition tolerance; during a partition, you must give up one of the first two. Many systems sacrifice consistency (eventual consistency), making them AP (available during partition), while CP systems (consistent during partition) sacrifice availability instead. With AP systems like MongoDB, updates will propagate between nodes eventually, so clients may temporarily see inconsistent or stale data. CP systems guarantee consistency during a partition by blocking writes.
In the big data world, our data stores communicate over an asynchronous, unreliable network to provide a facade of consistency. However, to really understand the guarantees of these systems, we must understand the realities of networks and test our data stores against them.
Jepsen is a tool which simulates network partitions in data stores and helps us understand the guarantees of our systems and their failure modes. In this talk, I will help you understand why you should care about network partitions and how we can test data stores against partitions using Jepsen. I will explain what Jepsen is, how it works, and the kinds of tests it lets you create. We will try to understand the subtleties of distributed consensus and the CAP theorem, and demonstrate how different data stores such as MongoDB, Cassandra, Elasticsearch and Solr behave under network partitions. Finally, I will describe the results of the tests I wrote using Jepsen for Apache Solr and discuss the kinds of rare failures which were found by this excellent tool.
Abstract:
Cassandra is a new kind of database: it is more than a single-machine system. It naturally runs in a High-Availability configuration. All nodes in the system are symmetric; there is no single point of failure. As you add machines, failure becomes routine, and Cassandra is built to tolerate that with no interruptions.
Cassandra is linearly scalable with good performance characteristics for very small and very large data stores. Unlike earlier efforts, Cassandra is more than just a key-value store; it is a structured data store which can facilitate complex use cases and queries. Cassandra allows for random access to your data organized into rows and columns.
Cassandra is different, and exciting. This presentation will discuss the pros and cons of using Cassandra, and why it has seen such amazing adoption in the past year.
Bio:
Ben Coverston is Director of Operations at DataStax (formerly known as Riptano), a provider of software, support, services, training, resources and help for Cassandra. He has been involved in enterprise software his entire career. Working in the airline industry, he helped to build some of the highest-volume online booking sites in the world. He saw firsthand the consequences of trying to solve real-world scalability problems at the limit of what traditional relational databases are capable of.
The document discusses using dashboards and information radiators to track key metrics and provide visibility. It provides examples of metrics that could be tracked, such as requests per second and installs per minute. It also discusses Kafka, a tool for building real-time data pipelines and streaming applications, and how it provides high-throughput, persistent, publish-subscribe messaging capabilities. The document recommends not flying blind, checking out Kafka, and being creative with dashboards and information radiators.
The Data Mullet: From all SQL to No SQL back to Some SQL (Datadog)
This document discusses Datadog's data architecture, which uses a combination of SQL and NoSQL databases. It initially used all SQL (Postgres) but found it did not scale well. It added Cassandra for durable storage and Redis for in-memory storage to improve performance and scalability. While Cassandra provided large-scale durable storage, it had issues with I/O latency on EC2. The document examines different database choices and how Datadog addressed scaling and latency issues through a hybrid "data mullet" approach using different databases for their strengths.
Generators, Coroutines and Other Brain Unrolling Sweetness - Adi Shavit (corehard_by)
C++20 brings us coroutines, and with them the power to create generators, iterables and ranges. We'll see how coroutines allow for cleaner, more readable code, easier abstraction and genericity, and composition, while avoiding callbacks and inversion of control. We'll discuss the pains of writing iterator types with distributed internal state and old-school coroutines. Then we'll look at C++20 coroutines and how easy they make it to write clean linear code. Coroutines prevent inversion of control and reduce callback hell. We'll see how they compose and play with Ranges, with examples from math, filtering, and rasterization. The talk will focus more on co_yield and less on co_await and async-related usages.
How to build a high-load service without knowing the amount of load in advance / Oleg Oble... (Ontico)
There are many architectures and ways of scaling systems. Today many companies migrate to cloud services or use containers. But is this really necessary, and should you follow the trends?
In this talk I would like to describe the architecture I designed and deployed at InnoGames: an architecture that requires no administrator intervention when load spikes and, more importantly, can shrink itself when there is no load, to save costs.
You will learn about the experience of building a service with very demanding requirements, and see that you don't have to pay three times more for AWS or any similar system.
- What CRM is, and why we need this service.
- Infrastructure.
-- Graphite: why it must be reliable and fast.
-- Puppet + GitLab.
-- Load balancing.
-- Our cloud: why do we need OpenStack when there is serveradmin!? How a server's role is defined by a few attributes in a web interface.
-- Nagios + aggregators: another take on monitoring services through Graphite.
-- Cluster monitoring: Clusterhc and Grafsy.
-- Brassmonkey: how we wrote our own sysadmin in Python.
-- Backups.
- The CRM3 architecture.
- Autoscaling, or how to analyze a pile of data and make decisions.
This document discusses the author's experience using PostgreSQL in cloud environments over several startup projects. Some key lessons learned include: PostgreSQL can work well on Amazon EC2 but I/O performance is unpredictable; redundant infrastructure is still vulnerable to failures so regular backups are important; and query performance relies heavily on buffer cache warming so load testing is critical. Overall the document emphasizes being pragmatic with PostgreSQL, focusing on architecture choices that minimize I/O bottlenecks and failures in the cloud.
When it Absolutely, Positively, Has to be There: Reliability Guarantees in Ka... (confluent)
In the financial industry, losing data is unacceptable. Financial firms are adopting Kafka for their critical applications. Kafka provides the low latency, high throughput, high availability, and scale that these applications require. But can it also provide complete reliability? As a system architect, when asked “Can you guarantee that we will always get every transaction,” you want to be able to say “Yes” with total confidence.
In this session, we will go over everything that happens to a message – from producer to consumer, and pinpoint all the places where data can be lost – if you are not careful. You will learn how developers and operation teams can work together to build a bulletproof data pipeline with Kafka. And if you need proof that you built a reliable system – we’ll show you how you can build the system to prove this too.
Tomas Doran presented TIM Group's implementation of Logstash, which processes over 55 million messages per day. Their applications are all Java/Scala/Clojure, and they developed their own library to send structured log events as JSON to Logstash, using ZeroMQ for reliability. They index data in Elasticsearch and use it for metrics, alerts and dashboards, but face challenges with data growth.
The document discusses the author's journey with functional programming languages and Clojure specifically. It covers the author's background and interests that led to Clojure, key features of Clojure like immutability and persistent data structures, approaches to modeling data and processes, and ideas around computation and efficiency in a functional style.
Kubernetes can orchestrate and manage container workloads through components like Pods, Deployments, DaemonSets, and StatefulSets. It schedules containers across a cluster based on resource needs and availability. Services enable discovery and network access to Pods, while ConfigMaps and Secrets allow injecting configuration and credentials into applications.
Scylla Summit 2016: Outbrain Case Study - Lowering Latency While Doing 20X IO... (ScyllaDB)
Outbrain is the world's largest content discovery platform. Learn about their use case with Scylla, where they lowered latency while doing 20X the IOPS of Cassandra.
"It's important that even under load, Apache Kafka ensures user topics are fully replicated in synch.
Replication is essential to endure resilience to data loss, so both users and operators care about it.
If a topic partition falls out of the ISR (In-Synch-replicas) set, a user experiences unavailability (when producing with the default acknowledgment setting).
Users may use non-default acks mode to work around it, but the effect on a Kafka cluster is to make the under-replication worse.
Even simple Under replication with no Under Min Isr is to be avoided as a cluster update may cause the dreaded Under Min ISR.
There are a number of settings that can be used, from quotas to number of replication threads to more low-level settings.
This session wants to show how we successfully measured and evolved our Kafkas configuration, with the goal of giving the best possible user experience (and resilience to their data).
Hofstadter's Law applied!
""It always takes longer than you expect, even when you take into account Hofstadter's Law."""
Slides from the presentation "Modern Cryptography" delivered at Devoxx UK 2013. See Parleys.com for the full video: https://www.parleys.com/speaker/5148920c0364bc17fc5697a5
We designed a new framework made for microservices, making it easier for developers to build microservices-based systems - systems that communicate asynchronously, self-heal, scale elastically and remain responsive no matter what bad stuff is happening.
And all this without the pain of selecting and mixing components from a plethora of libraries that were originally built for other things.
In this presentation, we reveal this new way for Java developers to not only understand and begin building microservices, but also to seamlessly push them into staging and production.
Rackspace: Email's Solution for Indexing 50K Documents per Second: Presented... (Lucidworks)
George Bailey and Cameron Baker of Rackspace presented their solution for indexing over 50,000 documents per second for Rackspace Email. They modernized their system using Apache Flume for event processing and aggregation and SolrCloud for real-time search. This reduced indexing time from over 20 minutes to under 5 seconds, reduced the number of physical servers needed from over 100 to 14, and increased indexing throughput from 1,000 to over 50,000 documents per second while supporting over 13 billion searchable documents.
Introduction to Apache ZooKeeper | Big Data Hadoop Spark Tutorial | CloudxLab
Big Data with Hadoop & Spark Training: http://bit.ly/2kvXlPd
This CloudxLab Introduction to Apache ZooKeeper tutorial helps you to understand ZooKeeper in detail. Below are the topics covered in this tutorial:
1) Data Model
2) Znode Types
3) Persistent Znode
4) Sequential Znode
5) Architecture
6) Election & Majority Demo
7) Why Do We Need Majority?
8) Guarantees - Sequential consistency, Atomicity, Single system image, Durability, Timeliness
9) ZooKeeper APIs
10) Watches & Triggers
11) ACLs - Access Control Lists
12) Usecases
13) When Not to Use ZooKeeper
Writing a concurrent program is hard; maintaining one is even more of a nightmare. Fortunately, a pattern that helps us write good concurrent code is available: using "channels" to communicate.
This talk shares the channel concept using common libraries, like threading and multiprocessing, to make concurrent code elegant.
This is the talk given at PyCon TW 2017 [1] and PyCon APAC/MY 2017 [2].
[1]: https://tw.pycon.org/2017
[2]: https://pycon.my/pycon-apac-2017-program-schedule/
3. Turtles All The Way Down
• Your client ≅ Our server
• Gather Events
• Parse JSON
• Match on Predicates
• Route to Consumers
4. Properties
• Offered:
  • At Least Once
  • Roughly Sorted (K-Sorted)
• Desired:
  • Exactly Once
  • Sorted
5. Plan
• Over-deliver: Ensure At Least Once
• De-duplicate: Unordered Exactly Once
• Sort: Ordered Exactly Once
6. Why At Least Once?
• Exactly Once is impractical across streams
• Clients must handle reconnect over-delivery
• Reuse this capability to:
  • Mask upstream failures
  • Relax server restart issues
9. [image-only slide]
10. Startup
• Prefetch from peer to populate circular buffer
• Go multi-user
• Consume Kestrel backlog - duplicates between:
  • Buffer and backlog
  • Previous connection and backlog
• Steady State: Exactly Once Delivery
12. Upstream Failure
• Cascaded source fails
• Fail over to next peer
• Over-request to avoid loss
• Steady State: Exactly Once Delivery
13. Client Over-delivery
• Use the Count Parameter after a fast reconnect
• Deep backfill from the REST API if the client was offline for a while
• Overlap connections slightly when the user first issues a new query
15. Infinite Streams
• De-duplicating a randomly ordered infinite stream requires infinite time and storage
• Sorting? Ditto
• I have neither infinite time nor storage
16. Roughly Sorted
• A sequence a₁ … aₙ is k-sorted iff ∀ i, r, 1 ≤ i ≤ r ≤ n, i < r - k implies aᵢ ≤ aᵣ
• Strictly sorted is 0-sorted.
• Transposing two adjacent values in a 0-sorted sequence makes it 1-sorted.
• What is K for the Firehose?
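A naive checker for this definition can serve as a sanity test. This is a minimal sketch (the function name is mine, and the quadratic scan is fine for small samples, not production streams):

def is_k_sorted(seq, k):
    # k-sorted per slide 16: elements more than k positions apart
    # must already be in order (i < r - k implies seq[i] <= seq[r],
    # written 0-indexed here).
    n = len(seq)
    return all(seq[i] <= seq[r]
               for i in range(n)
               for r in range(i + k + 1, n))

assert is_k_sorted([1, 2, 3, 4], 0)      # strictly sorted is 0-sorted
assert not is_k_sorted([2, 1, 3, 4], 0)  # one adjacent transposition...
assert is_k_sorted([2, 1, 3, 4], 1)      # ...makes the sequence 1-sorted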
19. Pessimist’s K
• In theory, K could be hours and millions of events
• In practice, if both current and stale queues exist:
  • We’ll flush the stale queues before exposing them
  • You’ll never know this happened
• If all queues are stale:
  • We’ll deliver the backlog
  • K remains reasonable
20. Unordered De-duplication
• Create two HashSets, Primary and Secondary, each preallocated to size K
• A new event is a duplicate if its ID exists in Primary
• Add the new ID to both HashSets
• When Primary.size > K / 2:
  Primary.clear
  Swap Primary & Secondary
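A minimal Python sketch of this two-HashSet scheme. The class and method names are mine, and the swap here triggers on the younger set's size, which is one reasonable reading of the slide; it guarantees that at least the last `window` IDs are always retained:

class TwoSetDeduplicator:
    # Generational de-duplication per slide 20. Every ID goes into
    # both sets, so Primary is always a superset of Secondary and a
    # single lookup covers both generations. Clearing a whole set
    # ages out old IDs in O(1), with no per-item bookkeeping.
    def __init__(self, window):
        # window should comfortably exceed the stream's K; the exact
        # sizing is a tuning assumption, not something the slide fixes.
        self.window = window
        self.primary = set()    # previous + current generation
        self.secondary = set()  # current generation only

    def is_duplicate(self, event_id):
        if event_id in self.primary:
            return True
        self.primary.add(event_id)
        self.secondary.add(event_id)
        if len(self.secondary) > self.window:
            # Drop the oldest generation: the current generation
            # becomes the new Primary, the cleared set is reused.
            self.primary.clear()
            self.primary, self.secondary = self.secondary, self.primary
        return False

Memory stays bounded at roughly twice the window size, and any duplicate arriving within the last `window` events is caught.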
21. Unordered De-duplication
• Bounded memory consumption
• O(n) behavior
• Low latency:
  • Emit first tweet
  • Discard subsequent duplicates
• Cheaper than de-duplication by sorting? Probably depends on K
22. Ordered & De-duplicated
• Insertion sort and de-duplicate by ID into a decreasing-order list
• While length > K, remove sorted tail
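A sketch of slide 22 in Python. The slide keeps a decreasing list and trims its tail; this version keeps an ascending list and emits from the front, which is the same idea expressed with bisect (names are illustrative):

from bisect import bisect_left

def sorted_dedup(stream, k):
    # Insertion-sort-and-deduplicate for a k-sorted stream of IDs.
    # Once k newer arrivals have been seen, the smallest buffered
    # element cannot be displaced by any future event, so it is
    # safe to emit.
    buf = []  # ascending; buf[0] is the oldest candidate
    for eid in stream:
        i = bisect_left(buf, eid)
        if i < len(buf) and buf[i] == eid:
            continue        # duplicate ID: discard
        buf.insert(i, eid)  # the insertion-sort step
        if len(buf) > k:
            yield buf.pop(0)  # sorted, de-duplicated output
    yield from buf  # only reachable if the stream ends (sketch only)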
23. Ordered & De-duplicated
• O(n) best case, O(n * K) worst case
• Bounded memory consumption
• Induces a latency of K
• Assumes the average item is not very far out of order
• K is usually large, to handle the outliers
24. Routing Events
• By Keyword or by UserId
• Add predicates to HashMap
• Apply events to Map
• Query holds private predicate set for later Map removal
• O(n)
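A sketch of this predicate map in Python, assuming keyword predicates and consumer objects with a deliver method (the structure is mine; the slide only specifies a HashMap of predicates):

from collections import defaultdict

class PredicateRouter:
    def __init__(self):
        self.routes = defaultdict(set)  # keyword -> consumers
        self.predicates = {}            # consumer -> its private keyword set

    def register(self, consumer, keywords):
        # Each query keeps its own predicate set so it can be
        # removed from the map later (slide 24).
        self.predicates[consumer] = set(keywords)
        for kw in self.predicates[consumer]:
            self.routes[kw].add(consumer)

    def unregister(self, consumer):
        for kw in self.predicates.pop(consumer, ()):
            self.routes[kw].discard(consumer)

    def route(self, event, tokens):
        # Collect targets in a set first: a consumer whose several
        # OR predicates all match should still get the event once.
        targets = set()
        for token in tokens:
            targets |= self.routes.get(token, set())
        for consumer in targets:
            consumer.deliver(event)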
26. Monitoring
• What to look at?
  • Latency
  • Throughput
  • Errors
• Alerting
27. Horizontal Scale
• The Firehose keeps growing.
• Eventually a single Firehose stream will become impractical.
• Partition the Firehose into N streams.
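One simple way to realize such a partitioning (a sketch, not necessarily Twitter's scheme): assign each status ID to one of N streams with a stable modulo, so each consumer sees a disjoint, still roughly sorted slice.

def stream_for(status_id: int, n_streams: int) -> int:
    # Stable assignment: the same ID always lands in the same
    # partition, and each partition stays roughly sorted.
    return status_id % n_streams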
Editor's Notes
There is a lot of symmetry in what the Streaming API servers do and what your streaming clients do.
In both cases we’re gathering events, parsing them, and farming them out to various consumers.
The issues are similar at all processing points in the stream.
We present a stream of events that is roughly sorted by created at time.
This means that the events are mostly in created at time order, but not exactly so.
We’ve designed our system to publish each event at least once --
which means none are lost, but there may, at times, be duplicates.
I’ll discuss why our streams have these properties.
Also, you’ll probably want to display or process tweets exactly once -- none missing and none duplicated.
You might also want to present them sorted, or you might be OK with a rough sorting.
I’ll go over two algorithms for converting what the API offers into the stream that you want.
The basic plan is to over deliver events and then de-duplicate them to provide an exactly once quality of service.
One technique is to just de-duplicate with set logic, the other is to sort and de-duplicate.
There are trade offs with each.
First, let’s see why the Streaming API offers events at least once.
It would be nice if we could offer everything transactionally, that is, exactly once.
But, it’s impractical to synchronize this state across client reconnections.
For example, it’s unlikely that you’ll reconnect to the same server.
Also, event streams aren’t strictly ordered, so we wouldn’t know what to deliver.
We’d have to coordinate a large vector of sent events between servers.
And, clients would have to transactionally acknowledge all events received.
This is quite impractical at scale unless we sorted the streams, but sorting would introduce latency.
We’ll see why sorting induces latency later.
Yet, first and foremost, we want a very low latency experience.
And, we want a simple programming model for clients.
So, we assume that clients can over-request when reconnecting, and post process to get the required stream properties.
Once we make this fundamental assumption, we can reuse this to also handle the internal data loss risk as well.
Our Streaming API server is called Hosebird.
Hosebird receives events from the rest of the Twitter system through Kestrel message queues.
Two hosebird processes in each cluster read transactionally from Kestrel.
The rest of the servers in a cluster cascade via Streaming HTTP.
When a hosebird server starts, it prefetches events from a peer to pre-populate its circular buffers.
These buffers are used to support the count parameter, which allows some historical back fill on streaming queries.
Count allows your stream to start back a few minutes, then catch up and transition to real time streaming.
This startup prefetching creates a window where you might see the same event twice, if you are unlucky enough to connect to a very recently restarted server.
The backlog read from kestrel will contain some of the same events that were prefetched into the buffer.
The backlog may also have events that you read on your last connection.
You might have to suffer through a minute or so of duplicates as the backlog is processed and displaces the prefetched events in the circular buffer.
Outside of this restart case, during steady state processing, we deliver each event exactly once on fanout servers.
When a cascaded server has its source Hosebird restart, say during a deploy, the server needs to quickly fail over to another source.
A gap in the stream would be introduced during the failure, detection and reconnection window.
We cover this gap by requesting some back-fill from the new source.
This causes a short period of duplicated events.
During steady state processing, however, we deliver each event exactly once on cascaded servers.
Your client should use these same techniques on reconnect.
Over request with the count parameter if the connection was momentarily lost.
If the client has been disconnected for an extended period, you’ll have to back fill from the REST API.
When you need to make a predicate change, you can create a new connection, wait for the first event to arrive, then disconnect the old connection.
This should generally produce an at least once stream.
Let’s talk about de-duplication on your end.
A finite stream looks a lot like a relational database table -- a finite relation.
We’re used to thinking about finite relations.
But, a stream appears as an infinite relation, you can’t ever read to the end.
Also, since we want very low latency, we can’t wait to read to the end.
We have to present results immediately.
A roughly sorted sequence is mostly sorted, where no element is more than K positions away from its strictly sorted position.
At Twitter, we talk about K sorted things all the time.
K this, K that. Nothing is strictly ordered.
We have relaxed various legs of the CAP theorem to make our distributed system feasible.
We’ve never had strictly ordered event processing. Tweets are applied to your timelines in a rough ordering.
On the REST API, we sort the vector before we present it to you, but it’s very loose behind the scenes.
Likewise, events show up in the Streaming API roughly sorted by created at time.
Here are two samples from the status firehose.
I took five hundred thousand status ids, and did an insertion sort into a reverse sorted list.
The most recent id at the head, the oldest status at the tail.
These distributions show the number of list elements traversed before finding the sorted insertion point.
So, the average and median number of hops are pretty small.
The hundred percent case, the worst case, shows a much larger K.
Assuming about 600 events per second on this stream,
back when I took this sample,
we can see that events show up as much as 5 seconds out of order.
Close comparison of the distributions shows that they’re very noisy.
If you took many samples, they’d all have a different shape.
Having an idea of K helps us tune our de-duplication algorithms.
Daily operational issues cause K to grow beyond 5 seconds now and then.
It’s hard to say what a good upper bound for a display client should be.
Something around a few minutes would cover most issues we’ve had over the last six months.
A long-term storage client might want to assume a K of a few hours or a day or so.
In the unlikely event that something goes really wrong with the system, we’ll make a judgement call on recovery.
We’ll probably bias towards delivering the backlog, but, if there’s a partial failure, we’ll keep your K in mind.
Now that we have a handle on K we can think about de-duplication.
An infinite, but roughly sorted, stream can be de-duplicated with some set logic.
The key is efficiently aging out irrelevant set members.
One way is to keep two hashes and alternately clear them.
You don’t have to do any fancy tracking of items, and off-the-shelf HashSets will work just fine.
The union of the two sets contains at least K items and allows de-duplication of a K-sorted sequence.
Given the Firehose K, you don’t even need all that much space to de-duplicate.
Please don’t resort to using MySQL primary keys to de-duplicate streams. It’s unnecessary.
The nice thing here is that we can emit events as they arrive and throw away late arriving dups.
We don’t need to add any latency.
On the other hand, if we want a sorted and deduplicated stream, we have to do a little more work.
Given the Firehose K distribution, doing an insertion sort isn’t the worst thing.
Most events don’t need to traverse too deeply into the list.
Elements dequeued from the tail of the list are sorted and deduplicated.
This algorithm does, however, introduce a latency of K.
We can’t emit a sorted event unless we have at least K elements to examine.
Still, this is quite practical to do in memory.
You can plow through a lot of ids per second even in a scripting language like Ruby.
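For concreteness, here is a toy run of the sorted_dedup sketch from slide 22, with made-up IDs:

sample = [1, 3, 2, 2, 4, 6, 5, 7, 9, 8]       # 2-sorted, one duplicate
print(list(sorted_dedup(iter(sample), k=2)))  # -> [1, 2, 3, 4, 5, 6, 7, 8, 9]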
Now that we have a de-duplicated stream, we need to route it to consumers.
This can be done very cheaply by registering every consumer’s predicates in a HashMap.
If, say, you are displaying columns of search results, like TweetDeck,
you can have each column register its keywords in the HashMap.
Each new event is applied to the HashMap, and routed to all consumers easily.
Duplicates can arise, as a given column may have several OR predicates that match.
Hosebird uses a generational de-duplication scheme to solve this.
This scheme is the degenerate case of the sorted algorithm above.
Each client stream maintains just the primary key of the last event.
If the same id is presented twice in a row, it can be discarded.
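As a sketch, that last-ID check is just a one-element window (the helper name is hypothetical):

def drop_consecutive_repeats(stream):
    # Degenerate de-duplication: remember only the previous ID and
    # discard an event when the same ID arrives twice in a row.
    last = None
    for eid in stream:
        if eid != last:
            last = eid
            yield eid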
Break things up into components.
Host components in separate processes.
Measure what happens between components.
Use (reliable) queues between components.