SlideShare a Scribd company logo
1 of 30
Download to read offline
Lyft's Envoy: From monolith to service mesh
Matt Klein, Software Engineer @Lyft
Agenda
● Historical Lyft SoA architecture
● State of SoA networking in industry
● What is Envoy
● Envoy deployment @Lyft
● Envoy future directions
● Q&A
Lyft ~3.5 years ago
PHP / Apache
monolith
MongoDB
InternetClients AWS ELB
Simple! No SoA!
Lyft ~2 years ago
PHP / Apache
monolith
(+haproxy/nsq)
MongoDB
Internet
Clients
AWS external
ELB
DynamoDB
AWS internal
ELBs
Python services
Not simple! SoA! With monolith!
(and some haproxy/nsq)
Agenda
● Historical Lyft SoA architecture
● State of SoA networking in industry
● What is Envoy
● Envoy deployment @Lyft
● Envoy future directions
● Q&A
State of SoA networking in industry
● Languages and frameworks.
● Per language libraries for service calls.
● Protocols (HTTP/1, HTTP/2, gRPC, databases, caching, etc.).
● Infrastructures (IaaS, CaaS, on premise, etc.).
● Intermediate load balancers (AWS ELB, F5, etc.).
● Observability output (stats, tracing, and logging).
● Implementations (often partial) of retry, circuit breaking, rate limiting,
timeouts, and other distributed systems best practices.
● Authentication and Authorization.
State of SoA networking in industry
A really big and confusing mess...
State of SoA networking in industry
● Likely already in a world of hurt or rapidly approaching that point.
● Debugging is difficult or impossible (each application exposes different stats
and logs with no tracing).
● Limited visibility into infra components such as hosted load balancers,
databases, caches, network topologies, etc.).
● Multiple and partial implementations of circuit breaking, retry, and rate limiting
(If I had a $ for every time someone told me that retries are “easy” …).
● Furthermore, if you do have a good solution, you are likely using a library and
are locked into a particular technology stack essentially forever.
● Libraries are incredibly painful to upgrade. (Think CVEs).
State of SoA networking in industry
● Ultimately, robust observability and easy debugging are everything.
● As SoAs become more complicated, it is critical that we provide a common
solution to all of these problems or developer productivity grinds to a halt (and
the site goes down … often).
Can we do better?
Agenda
● Historical Lyft SoA architecture
● State of SoA networking in industry
● What is Envoy
● Envoy deployment @Lyft
● Envoy future directions
● Q&A
What is Envoy
The network should be transparent to applications. When
network and application problems do occur it should be easy to
determine the source of the problem.
This sounds great! But it turns out it’s really, really hard.
What is Envoy
● Out of process architecture: Let’s do a lot of really hard stuff in one place and
allow application developers to focus on business logic.
● Modern C++11 code base: Fast and productive.
● L3/L4 filter architecture: A TCP proxy at its core. Can be used for things other
than HTTP (e.g., MongoDB, redis, stunnel replacement, TCP rate limiter, etc.).
● HTTP L7 filter architecture: Make it easy to plug in different functionality.
● HTTP/2 first! (Including gRPC and a nifty gRPC HTTP/1.1 bridge).
● Service discovery and active health checking.
● Advanced load balancing: Retry, timeouts, circuit breaking, rate limiting,
shadowing, etc.
● Best in class observability: stats, logging, and tracing.
● Edge proxy: routing and TLS.
Envoy service to service topology
Service Cluster
Envoy
Service
Discovery
Service Cluster
Envoy
Service
External Services
HTTP/2
REST / GRPC
Envoy edge proxy topology
“Front” Envoy
Edge Proxy
Region #1
InternetExternal Clients
HTTP/1.1, HTTP/2, TLS
“Front” Envoy
Edge Proxy
Region #2
Private
Infra
HTTP/2, TLS, Client Auth
Lyft today
Legacy monolith
(+Envoy)
MongoDB
Internet
Clients
“Front” Envoy
(via TCP ELB) DynamoDB
Python services
(+Envoy)
Service mesh! Awesome!
Go services
(+Envoy)
Stats / tracing
(direct from
Envoy)
Discovery
Eventually consistent service discovery
● Fully consistent service discovery systems are very popular (ZK, etcd, consul,
etc.).
● In practice they are hard to run at scale.
● Service discovery is actually an eventually consistent problem. Let’s
recognize that and design for it.
● Envoy is designed from the get go to treat service discovery as lossy.
● Active health checking used in combination with service discovery to produce a
routable overlay.
Discovery Status HC OK HC Failed
Discovered Route Don’t Route
Absent Route Don’t Route / Delete
Advanced load balancing
● Different service discovery types.
● Zone aware least request load balancing.
● Dynamic stats: Per zone, canary specific stats, etc.
● Circuit breaking: Max connections, requests, and retries.
● Rate limiting: Integration with global rate limit service.
● Shadowing: Fork traffic to a test cluster.
● Retries: HTTP router has built in retry capability with different policies.
● Timeouts: Both “outer” (including all retries) and “inner” (per try) timeouts.
● Outlier detection: Consecutive 5xx
● Deploy control: Blue/green, canary, etc.
Observability
● Observability is by far the most important thing that Envoy provides.
● Having all SoA traffic transit through Envoy gives us a single place where we
can:
○ Produce consistent statistics for every hop
○ Create and propagate a stable request ID
○ Consistent logging
○ Distributed tracing
Observability
Observability
Observability
Observability
Performance matters for a service proxy
● Throughput is important ultimately for cost and operational scaling, but for many
companies developer time is worth more than infra costs.
● BUT: Latency and predictability is what matters. And in particular tail latency
(P99+).
● We already deal with incredibly confusing deployments (virtual IaaS, multiple
languages and runtimes, languages that use GC, etc.). All of these niceties
improve productivity and reduce upfront dev costs, but they make debugging
really difficult.
● What is leading to sporadic error? The IaaS? The app? GC?
● Ability to reason about overall performance and reliability is critical.
Performance matters for a service proxy
● A service proxy provides invaluable benefits, but if the proxy itself has tail
latencies that are hard to reason about, most of the debugging benefits go out
the window and you are back to square one.
● Anyone that is trying to sell you a service infra that does not consider the above
points is selling you a dream...
Agenda
● Historical Lyft SoA architecture
● State of SoA networking in industry
● What is Envoy
● Envoy deployment @Lyft
● Envoy future directions
● Q&A
Envoy deployment @Lyft
● > 100 services.
● > 10,000 hosts.
● > 2,000,000 RPS.
● All service to service traffic (REST and gRPC).
● Use gRPC bridge to unlock Python and PHP clients.
● MongoDB proxy.
● DynamoDB proxy.
● External service proxy (AWS and other partners).
● Kibana/Elastic Search for logging.
● LightStep for tracing.
● Wavefront for stats.
Agenda
● Historical Lyft SoA architecture
● State of SoA networking in industry
● What is Envoy
● Envoy deployment @Lyft
● Envoy future directions
● Q&A
Envoy future directions
● Redis.
● More outlier detection and ejection (SR and latency).
● LB subset support.
● More rate limiting options / open source rate limit service / IP tagging.
● Configuration schema and better error output.
● Work with Google/community to add k8s support.
● Authentication and authorization.
● Envoy ecosystem?
Agenda
● Historical Lyft SoA architecture
● State of SoA networking in industry
● What is Envoy
● Envoy deployment @Lyft
● Envoy future directions
● Q&A
Q&A
● Thanks for coming!
● We are super excited about building a community around Envoy. Talk to us if
you need help getting started.
● https://lyft.github.io/envoy/
● Lyft is hiring: Contact us if you want to work on hard scaling problems in a fast
moving company: https://www.lyft.com/jobs

More Related Content

More from Ambassador Labs

[Confoo Montreal 2020] Build Your Own Serverless with Knative - Alex Gervais
[Confoo Montreal 2020] Build Your Own Serverless with Knative - Alex Gervais[Confoo Montreal 2020] Build Your Own Serverless with Knative - Alex Gervais
[Confoo Montreal 2020] Build Your Own Serverless with Knative - Alex GervaisAmbassador Labs
 
[QCon London 2020] The Future of Cloud Native API Gateways - Richard Li
[QCon London 2020] The Future of Cloud Native API Gateways - Richard Li[QCon London 2020] The Future of Cloud Native API Gateways - Richard Li
[QCon London 2020] The Future of Cloud Native API Gateways - Richard LiAmbassador Labs
 
What's New in the Ambassador Edge Stack 1.0?
What's New in the Ambassador Edge Stack 1.0? What's New in the Ambassador Edge Stack 1.0?
What's New in the Ambassador Edge Stack 1.0? Ambassador Labs
 
Webinar: Effective Management of APIs and the Edge when Adopting Kubernetes
Webinar: Effective Management of APIs and the Edge when Adopting Kubernetes Webinar: Effective Management of APIs and the Edge when Adopting Kubernetes
Webinar: Effective Management of APIs and the Edge when Adopting Kubernetes Ambassador Labs
 
Ambassador: Building a Control Plane for Envoy
Ambassador: Building a Control Plane for Envoy Ambassador: Building a Control Plane for Envoy
Ambassador: Building a Control Plane for Envoy Ambassador Labs
 
Telepresence - Fast Development Workflows for Kubernetes
Telepresence - Fast Development Workflows for KubernetesTelepresence - Fast Development Workflows for Kubernetes
Telepresence - Fast Development Workflows for KubernetesAmbassador Labs
 
[KubeCon NA 2018] Telepresence Deep Dive Session - Rafael Schloming & Luke Sh...
[KubeCon NA 2018] Telepresence Deep Dive Session - Rafael Schloming & Luke Sh...[KubeCon NA 2018] Telepresence Deep Dive Session - Rafael Schloming & Luke Sh...
[KubeCon NA 2018] Telepresence Deep Dive Session - Rafael Schloming & Luke Sh...Ambassador Labs
 
[KubeCon NA 2018] Effective Kubernetes Develop: Turbocharge Your Dev Loop - P...
[KubeCon NA 2018] Effective Kubernetes Develop: Turbocharge Your Dev Loop - P...[KubeCon NA 2018] Effective Kubernetes Develop: Turbocharge Your Dev Loop - P...
[KubeCon NA 2018] Effective Kubernetes Develop: Turbocharge Your Dev Loop - P...Ambassador Labs
 
The rise of Layer 7, microservices, and the proxy war with Envoy, NGINX, and ...
The rise of Layer 7, microservices, and the proxy war with Envoy, NGINX, and ...The rise of Layer 7, microservices, and the proxy war with Envoy, NGINX, and ...
The rise of Layer 7, microservices, and the proxy war with Envoy, NGINX, and ...Ambassador Labs
 
The Simply Complex Task of Implementing Kubernetes Ingress - Velocity NYC
The Simply Complex Task of Implementing Kubernetes Ingress - Velocity NYCThe Simply Complex Task of Implementing Kubernetes Ingress - Velocity NYC
The Simply Complex Task of Implementing Kubernetes Ingress - Velocity NYCAmbassador Labs
 
Ambassador Kubernetes-Native API Gateway
Ambassador Kubernetes-Native API GatewayAmbassador Kubernetes-Native API Gateway
Ambassador Kubernetes-Native API GatewayAmbassador Labs
 
Micro xchg 2018 - What is a Service Mesh?
Micro xchg 2018 - What is a Service Mesh? Micro xchg 2018 - What is a Service Mesh?
Micro xchg 2018 - What is a Service Mesh? Ambassador Labs
 
KubeCon NA 2017: Ambassador and Envoy (Envoy Salon)
KubeCon NA 2017: Ambassador and Envoy (Envoy Salon)KubeCon NA 2017: Ambassador and Envoy (Envoy Salon)
KubeCon NA 2017: Ambassador and Envoy (Envoy Salon)Ambassador Labs
 
Webinar: Code Faster on Kubernetes
Webinar: Code Faster on KubernetesWebinar: Code Faster on Kubernetes
Webinar: Code Faster on KubernetesAmbassador Labs
 
QCon SF 2017 - Microservices: Service-Oriented Development
QCon SF 2017 - Microservices: Service-Oriented DevelopmentQCon SF 2017 - Microservices: Service-Oriented Development
QCon SF 2017 - Microservices: Service-Oriented DevelopmentAmbassador Labs
 
Montreal Kubernetes Meetup: Developer-first workflows (for microservices) on ...
Montreal Kubernetes Meetup: Developer-first workflows (for microservices) on ...Montreal Kubernetes Meetup: Developer-first workflows (for microservices) on ...
Montreal Kubernetes Meetup: Developer-first workflows (for microservices) on ...Ambassador Labs
 
O'Reilly Software Architecture Conference London 2017: Building Resilient Mic...
O'Reilly Software Architecture Conference London 2017: Building Resilient Mic...O'Reilly Software Architecture Conference London 2017: Building Resilient Mic...
O'Reilly Software Architecture Conference London 2017: Building Resilient Mic...Ambassador Labs
 
Velocity NYC 2017: Building Resilient Microservices with Kubernetes, Docker, ...
Velocity NYC 2017: Building Resilient Microservices with Kubernetes, Docker, ...Velocity NYC 2017: Building Resilient Microservices with Kubernetes, Docker, ...
Velocity NYC 2017: Building Resilient Microservices with Kubernetes, Docker, ...Ambassador Labs
 
DevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOpsDevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOpsAmbassador Labs
 
DevOps Days Boston 2017: Developer first workflows for Kubernetes
DevOps Days Boston 2017: Developer first workflows for KubernetesDevOps Days Boston 2017: Developer first workflows for Kubernetes
DevOps Days Boston 2017: Developer first workflows for KubernetesAmbassador Labs
 

More from Ambassador Labs (20)

[Confoo Montreal 2020] Build Your Own Serverless with Knative - Alex Gervais
[Confoo Montreal 2020] Build Your Own Serverless with Knative - Alex Gervais[Confoo Montreal 2020] Build Your Own Serverless with Knative - Alex Gervais
[Confoo Montreal 2020] Build Your Own Serverless with Knative - Alex Gervais
 
[QCon London 2020] The Future of Cloud Native API Gateways - Richard Li
[QCon London 2020] The Future of Cloud Native API Gateways - Richard Li[QCon London 2020] The Future of Cloud Native API Gateways - Richard Li
[QCon London 2020] The Future of Cloud Native API Gateways - Richard Li
 
What's New in the Ambassador Edge Stack 1.0?
What's New in the Ambassador Edge Stack 1.0? What's New in the Ambassador Edge Stack 1.0?
What's New in the Ambassador Edge Stack 1.0?
 
Webinar: Effective Management of APIs and the Edge when Adopting Kubernetes
Webinar: Effective Management of APIs and the Edge when Adopting Kubernetes Webinar: Effective Management of APIs and the Edge when Adopting Kubernetes
Webinar: Effective Management of APIs and the Edge when Adopting Kubernetes
 
Ambassador: Building a Control Plane for Envoy
Ambassador: Building a Control Plane for Envoy Ambassador: Building a Control Plane for Envoy
Ambassador: Building a Control Plane for Envoy
 
Telepresence - Fast Development Workflows for Kubernetes
Telepresence - Fast Development Workflows for KubernetesTelepresence - Fast Development Workflows for Kubernetes
Telepresence - Fast Development Workflows for Kubernetes
 
[KubeCon NA 2018] Telepresence Deep Dive Session - Rafael Schloming & Luke Sh...
[KubeCon NA 2018] Telepresence Deep Dive Session - Rafael Schloming & Luke Sh...[KubeCon NA 2018] Telepresence Deep Dive Session - Rafael Schloming & Luke Sh...
[KubeCon NA 2018] Telepresence Deep Dive Session - Rafael Schloming & Luke Sh...
 
[KubeCon NA 2018] Effective Kubernetes Develop: Turbocharge Your Dev Loop - P...
[KubeCon NA 2018] Effective Kubernetes Develop: Turbocharge Your Dev Loop - P...[KubeCon NA 2018] Effective Kubernetes Develop: Turbocharge Your Dev Loop - P...
[KubeCon NA 2018] Effective Kubernetes Develop: Turbocharge Your Dev Loop - P...
 
The rise of Layer 7, microservices, and the proxy war with Envoy, NGINX, and ...
The rise of Layer 7, microservices, and the proxy war with Envoy, NGINX, and ...The rise of Layer 7, microservices, and the proxy war with Envoy, NGINX, and ...
The rise of Layer 7, microservices, and the proxy war with Envoy, NGINX, and ...
 
The Simply Complex Task of Implementing Kubernetes Ingress - Velocity NYC
The Simply Complex Task of Implementing Kubernetes Ingress - Velocity NYCThe Simply Complex Task of Implementing Kubernetes Ingress - Velocity NYC
The Simply Complex Task of Implementing Kubernetes Ingress - Velocity NYC
 
Ambassador Kubernetes-Native API Gateway
Ambassador Kubernetes-Native API GatewayAmbassador Kubernetes-Native API Gateway
Ambassador Kubernetes-Native API Gateway
 
Micro xchg 2018 - What is a Service Mesh?
Micro xchg 2018 - What is a Service Mesh? Micro xchg 2018 - What is a Service Mesh?
Micro xchg 2018 - What is a Service Mesh?
 
KubeCon NA 2017: Ambassador and Envoy (Envoy Salon)
KubeCon NA 2017: Ambassador and Envoy (Envoy Salon)KubeCon NA 2017: Ambassador and Envoy (Envoy Salon)
KubeCon NA 2017: Ambassador and Envoy (Envoy Salon)
 
Webinar: Code Faster on Kubernetes
Webinar: Code Faster on KubernetesWebinar: Code Faster on Kubernetes
Webinar: Code Faster on Kubernetes
 
QCon SF 2017 - Microservices: Service-Oriented Development
QCon SF 2017 - Microservices: Service-Oriented DevelopmentQCon SF 2017 - Microservices: Service-Oriented Development
QCon SF 2017 - Microservices: Service-Oriented Development
 
Montreal Kubernetes Meetup: Developer-first workflows (for microservices) on ...
Montreal Kubernetes Meetup: Developer-first workflows (for microservices) on ...Montreal Kubernetes Meetup: Developer-first workflows (for microservices) on ...
Montreal Kubernetes Meetup: Developer-first workflows (for microservices) on ...
 
O'Reilly Software Architecture Conference London 2017: Building Resilient Mic...
O'Reilly Software Architecture Conference London 2017: Building Resilient Mic...O'Reilly Software Architecture Conference London 2017: Building Resilient Mic...
O'Reilly Software Architecture Conference London 2017: Building Resilient Mic...
 
Velocity NYC 2017: Building Resilient Microservices with Kubernetes, Docker, ...
Velocity NYC 2017: Building Resilient Microservices with Kubernetes, Docker, ...Velocity NYC 2017: Building Resilient Microservices with Kubernetes, Docker, ...
Velocity NYC 2017: Building Resilient Microservices with Kubernetes, Docker, ...
 
DevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOpsDevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOps
 
DevOps Days Boston 2017: Developer first workflows for Kubernetes
DevOps Days Boston 2017: Developer first workflows for KubernetesDevOps Days Boston 2017: Developer first workflows for Kubernetes
DevOps Days Boston 2017: Developer first workflows for Kubernetes
 

Recently uploaded

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 

Recently uploaded (20)

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 

Lyft's Envoy: From Monolith to Service Mesh - Matt Klein, Lyft

  • 1. Lyft's Envoy: From monolith to service mesh Matt Klein, Software Engineer @Lyft
  • 2. Agenda ● Historical Lyft SoA architecture ● State of SoA networking in industry ● What is Envoy ● Envoy deployment @Lyft ● Envoy future directions ● Q&A
  • 3. Lyft ~3.5 years ago PHP / Apache monolith MongoDB InternetClients AWS ELB Simple! No SoA!
  • 4. Lyft ~2 years ago PHP / Apache monolith (+haproxy/nsq) MongoDB Internet Clients AWS external ELB DynamoDB AWS internal ELBs Python services Not simple! SoA! With monolith! (and some haproxy/nsq)
  • 5. Agenda ● Historical Lyft SoA architecture ● State of SoA networking in industry ● What is Envoy ● Envoy deployment @Lyft ● Envoy future directions ● Q&A
  • 6. State of SoA networking in industry ● Languages and frameworks. ● Per language libraries for service calls. ● Protocols (HTTP/1, HTTP/2, gRPC, databases, caching, etc.). ● Infrastructures (IaaS, CaaS, on premise, etc.). ● Intermediate load balancers (AWS ELB, F5, etc.). ● Observability output (stats, tracing, and logging). ● Implementations (often partial) of retry, circuit breaking, rate limiting, timeouts, and other distributed systems best practices. ● Authentication and Authorization.
  • 7. State of SoA networking in industry A really big and confusing mess...
  • 8. State of SoA networking in industry ● Likely already in a world of hurt or rapidly approaching that point. ● Debugging is difficult or impossible (each application exposes different stats and logs with no tracing). ● Limited visibility into infra components such as hosted load balancers, databases, caches, network topologies, etc.). ● Multiple and partial implementations of circuit breaking, retry, and rate limiting (If I had a $ for every time someone told me that retries are “easy” …). ● Furthermore, if you do have a good solution, you are likely using a library and are locked into a particular technology stack essentially forever. ● Libraries are incredibly painful to upgrade. (Think CVEs).
  • 9. State of SoA networking in industry ● Ultimately, robust observability and easy debugging are everything. ● As SoAs become more complicated, it is critical that we provide a common solution to all of these problems or developer productivity grinds to a halt (and the site goes down … often). Can we do better?
  • 10. Agenda ● Historical Lyft SoA architecture ● State of SoA networking in industry ● What is Envoy ● Envoy deployment @Lyft ● Envoy future directions ● Q&A
  • 11. What is Envoy The network should be transparent to applications. When network and application problems do occur it should be easy to determine the source of the problem. This sounds great! But it turns out it’s really, really hard.
  • 12. What is Envoy ● Out of process architecture: Let’s do a lot of really hard stuff in one place and allow application developers to focus on business logic. ● Modern C++11 code base: Fast and productive. ● L3/L4 filter architecture: A TCP proxy at its core. Can be used for things other than HTTP (e.g., MongoDB, redis, stunnel replacement, TCP rate limiter, etc.). ● HTTP L7 filter architecture: Make it easy to plug in different functionality. ● HTTP/2 first! (Including gRPC and a nifty gRPC HTTP/1.1 bridge). ● Service discovery and active health checking. ● Advanced load balancing: Retry, timeouts, circuit breaking, rate limiting, shadowing, etc. ● Best in class observability: stats, logging, and tracing. ● Edge proxy: routing and TLS.
  • 13. Envoy service to service topology Service Cluster Envoy Service Discovery Service Cluster Envoy Service External Services HTTP/2 REST / GRPC
  • 14. Envoy edge proxy topology “Front” Envoy Edge Proxy Region #1 InternetExternal Clients HTTP/1.1, HTTP/2, TLS “Front” Envoy Edge Proxy Region #2 Private Infra HTTP/2, TLS, Client Auth
  • 15. Lyft today Legacy monolith (+Envoy) MongoDB Internet Clients “Front” Envoy (via TCP ELB) DynamoDB Python services (+Envoy) Service mesh! Awesome! Go services (+Envoy) Stats / tracing (direct from Envoy) Discovery
  • 16. Eventually consistent service discovery ● Fully consistent service discovery systems are very popular (ZK, etcd, consul, etc.). ● In practice they are hard to run at scale. ● Service discovery is actually an eventually consistent problem. Let’s recognize that and design for it. ● Envoy is designed from the get go to treat service discovery as lossy. ● Active health checking used in combination with service discovery to produce a routable overlay. Discovery Status HC OK HC Failed Discovered Route Don’t Route Absent Route Don’t Route / Delete
  • 17. Advanced load balancing ● Different service discovery types. ● Zone aware least request load balancing. ● Dynamic stats: Per zone, canary specific stats, etc. ● Circuit breaking: Max connections, requests, and retries. ● Rate limiting: Integration with global rate limit service. ● Shadowing: Fork traffic to a test cluster. ● Retries: HTTP router has built in retry capability with different policies. ● Timeouts: Both “outer” (including all retries) and “inner” (per try) timeouts. ● Outlier detection: Consecutive 5xx ● Deploy control: Blue/green, canary, etc.
  • 18. Observability ● Observability is by far the most important thing that Envoy provides. ● Having all SoA traffic transit through Envoy gives us a single place where we can: ○ Produce consistent statistics for every hop ○ Create and propagate a stable request ID ○ Consistent logging ○ Distributed tracing
  • 23. Performance matters for a service proxy ● Throughput is important ultimately for cost and operational scaling, but for many companies developer time is worth more than infra costs. ● BUT: Latency and predictability is what matters. And in particular tail latency (P99+). ● We already deal with incredibly confusing deployments (virtual IaaS, multiple languages and runtimes, languages that use GC, etc.). All of these niceties improve productivity and reduce upfront dev costs, but they make debugging really difficult. ● What is leading to sporadic error? The IaaS? The app? GC? ● Ability to reason about overall performance and reliability is critical.
  • 24. Performance matters for a service proxy ● A service proxy provides invaluable benefits, but if the proxy itself has tail latencies that are hard to reason about, most of the debugging benefits go out the window and you are back to square one. ● Anyone that is trying to sell you a service infra that does not consider the above points is selling you a dream...
  • 25. Agenda ● Historical Lyft SoA architecture ● State of SoA networking in industry ● What is Envoy ● Envoy deployment @Lyft ● Envoy future directions ● Q&A
  • 26. Envoy deployment @Lyft ● > 100 services. ● > 10,000 hosts. ● > 2,000,000 RPS. ● All service to service traffic (REST and gRPC). ● Use gRPC bridge to unlock Python and PHP clients. ● MongoDB proxy. ● DynamoDB proxy. ● External service proxy (AWS and other partners). ● Kibana/Elastic Search for logging. ● LightStep for tracing. ● Wavefront for stats.
  • 27. Agenda ● Historical Lyft SoA architecture ● State of SoA networking in industry ● What is Envoy ● Envoy deployment @Lyft ● Envoy future directions ● Q&A
  • 28. Envoy future directions ● Redis. ● More outlier detection and ejection (SR and latency). ● LB subset support. ● More rate limiting options / open source rate limit service / IP tagging. ● Configuration schema and better error output. ● Work with Google/community to add k8s support. ● Authentication and authorization. ● Envoy ecosystem?
  • 29. Agenda ● Historical Lyft SoA architecture ● State of SoA networking in industry ● What is Envoy ● Envoy deployment @Lyft ● Envoy future directions ● Q&A
  • 30. Q&A ● Thanks for coming! ● We are super excited about building a community around Envoy. Talk to us if you need help getting started. ● https://lyft.github.io/envoy/ ● Lyft is hiring: Contact us if you want to work on hard scaling problems in a fast moving company: https://www.lyft.com/jobs