This document provides an overview of the data streaming ecosystem at Booking.com. It discusses how Booking.com uses Apache Kafka, Kafka Connect, and related tools across over 300 clusters containing over 350 brokers to handle large volumes of streaming data from its various services and applications. Key aspects of Booking.com's data streaming infrastructure are highlighted, including its use of multiple data centers, global and local clusters, monitoring and alerting systems, and operational best practices.
4. ● travel e-commerce, part of Booking Holdings (NASDAQ: BKNG)
● 28,000,000+ listings of hotels, homes, apartments and other places to stay
● 140,000+ destinations in 230 countries
● 1,550,000+ room nights reserved daily
● based in Amsterdam, with 198 offices worldwide
12. a set of small clusters instead of a few giant ones
13. ● vanilla Kafka 1.0.1
● 8 brokers per cluster
● rack awareness
● auto.create.topics.enable = false
● tuning
○ replica fetchers
○ replica fetch max bytes
○ network and IO threads
○ socket buffers for a 10G network
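The broker settings on this slide could be captured in a small generator like the one below. This is only a sketch: the property names are standard Kafka broker settings, but every numeric value, the rack names, and the helper `render_server_properties` are assumptions for illustration, since the slide names the tuning knobs without giving numbers. Only `auto.create.topics.enable = false` and the count of 8 brokers come from the slide.

```python
# Sketch of per-broker server.properties content for one 8-broker cluster.
# Only auto.create.topics.enable=false is taken from the slide; all other
# values below are hypothetical placeholders, not Booking.com's settings.
BROKER_SETTINGS = {
    "auto.create.topics.enable": "false",     # stated on the slide
    "num.replica.fetchers": "4",              # assumed value
    "replica.fetch.max.bytes": "2097152",     # assumed value (2 MiB)
    "num.network.threads": "8",               # assumed value
    "num.io.threads": "16",                   # assumed value
    "socket.send.buffer.bytes": "1048576",    # assumed value for a 10G network
    "socket.receive.buffer.bytes": "1048576", # assumed value for a 10G network
}

# Hypothetical rack names; rack awareness is enabled by setting broker.rack.
RACKS = ["rack-a", "rack-b", "rack-c"]


def render_server_properties(broker_id: int, rack: str) -> str:
    """Return the text of a server.properties fragment for one broker."""
    settings = {"broker.id": str(broker_id), "broker.rack": rack}
    settings.update(BROKER_SETTINGS)
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


# Eight brokers per cluster, assigned to racks round-robin.
configs = [render_server_properties(i, RACKS[i % len(RACKS)]) for i in range(8)]

if __name__ == "__main__":
    print(configs[0])
```

Generating the fragment from one shared dictionary keeps every broker in a cluster identical apart from its id and rack, which matters when many small clusters are managed the same way.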
14. ● N partitions, where N is a multiple of the number of brokers in the cluster
● retention = 1 hour
● replication factor = 3 (rack-aware)
● min insync replicas = 1
● 1 MB max message size
● 1 GB segments
● timestamp = CreateTime
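The topic defaults above map onto Kafka's topic-level configuration keys roughly as follows. This is a sketch under stated assumptions: the helper names `partition_count` and `topic_spec` and the topic name in the usage note are hypothetical, "1 MB" and "1 GB" are interpreted as binary units (1024-based), and the default of 8 brokers is taken from the cluster layout described earlier.

```python
# Sketch: the slide's topic defaults expressed as Kafka topic-level configs.
# The byte values assume binary units (1 MB = 1024 * 1024 bytes).
TOPIC_CONFIG = {
    "retention.ms": str(60 * 60 * 1000),       # retention = 1 hour
    "min.insync.replicas": "1",
    "max.message.bytes": str(1024 * 1024),     # 1 MB max message size
    "segment.bytes": str(1024 ** 3),           # 1 GB segments
    "message.timestamp.type": "CreateTime",
}
REPLICATION_FACTOR = 3


def partition_count(min_partitions: int, brokers: int = 8) -> int:
    """Round min_partitions up to the nearest multiple of the broker count."""
    if min_partitions < 1 or brokers < 1:
        raise ValueError("min_partitions and brokers must be positive")
    return -(-min_partitions // brokers) * brokers  # ceiling division


def topic_spec(name: str, min_partitions: int, brokers: int = 8) -> dict:
    """Build a plain description of a topic that follows the defaults above."""
    return {
        "name": name,
        "partitions": partition_count(min_partitions, brokers),
        "replication_factor": REPLICATION_FACTOR,
        "config": dict(TOPIC_CONFIG),
    }


if __name__ == "__main__":
    print(topic_spec("example-events", 20))
```

For example, a topic that needs at least 20 partitions on an 8-broker cluster would be created with 24. Keeping the partition count a multiple of the broker count lets partitions be spread evenly, so no broker carries more partitions of a topic than another.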