
The Wix Microservice Stack

(A talk given at Wix R&D in Dnipro, Ukraine in March 2017. Video available at https://www.youtube.com/watch?v=eIX33mQdkAI&feature=youtu.be)

While microservices are conceptually simple, they are a deep rabbit hole to go down. Deceptively simple questions can have far-reaching implications: Which communication protocol should I choose? Is event-driven the way to go? What monitoring tools should I put in place?

In this talk we'll cover some of the fundamental questions, outline the solutions adopted or developed by Wix, and share our hindsight on what worked well for us, what didn't, and our thoughts on future directions for our stack.

The Wix Microservice Stack

  1. The Wix Microservice Stack
     Tomer Gabel, Wix
     March 2017 @ Dnipro, UA
  2. Agenda
     1. Topology
     2. Networking
     3. Structure
     4. Operations
     5. Beer
  3. Our conceptual system
     Diagram: Store Service, Checkout Service, Cart Service
  4. 1. TOPOLOGY
     Image: Penrose Steps by Alex Eylar (CC BY-NC-SA 2.0)
  5. Our conceptual system
     Diagram: Store Service, Checkout Service, Cart Service; Host A, Host B, Host C
  6. Topology
     • Service → host mapping
     • Server inventory
     • Service catalogue
     Formally, “scheduling”
  7. Service Scheduling
     • A hard problem!
     • Multiple dimensions:
       – Resource utilization (disk space, I/O, RAM, network, power…)
       – Resource availability
       – Failover (physical server, rack, row…)
       – Custom constraints (zoning, e.g. PCI compliance)
  9. Service Scheduling
     • The middle ground:
       – Naïve automatic scheduler
       – Human-configured overrides for zoning, optimization
     • Easy but limited scale
       – A few hundred servers
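The middle ground on slide 9 can be illustrated with a small sketch. This is not Wix's scheduler: the class name `NaiveScheduler`, the least-loaded placement rule and the shape of the override map are all assumptions chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Naive scheduler sketch: least-loaded placement plus human-configured pins. */
final class NaiveScheduler {
    private final Map<String, Integer> load = new LinkedHashMap<>(); // host -> placed instances
    private final Map<String, String> overrides;                     // service -> pinned host

    NaiveScheduler(List<String> hosts, Map<String, String> overrides) {
        for (String host : hosts) load.put(host, 0);
        this.overrides = overrides;
    }

    /** Returns the host chosen for the service and records the placement. */
    String place(String service) {
        String host = overrides.get(service);
        if (host == null) {
            // Automatic path: pick the host with the fewest placed instances
            // (the first host wins ties because the map keeps insertion order).
            for (Map.Entry<String, Integer> e : load.entrySet()) {
                if (host == null || e.getValue() < load.get(host)) host = e.getKey();
            }
        } else if (!load.containsKey(host)) {
            throw new IllegalArgumentException("Override names unknown host: " + host);
        }
        if (host == null) throw new IllegalStateException("No hosts available");
        load.merge(host, 1, Integer::sum);
        return host;
    }
}
```

A manual override (for example, pinning a PCI-scoped service to a dedicated host) always wins; everything else falls through to the automatic rule. A real scheduler would weigh the other dimensions listed on slide 7, which this sketch ignores.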
  10. Our conceptual system
      Diagram: Store Service, Checkout Service, Cart Service
      http://err:42/uh … derp?
  11. Service Discovery
      Matrix: Static vs. Dynamic, Logical vs. Physical
      That way madness lies
  12. Service Discovery
      Matrix: Static vs. Dynamic, Logical vs. Physical
  14. In practice
      • Static topology
        – Managed with Frying Pan
        – Exported to Chef
        – Deployed via configuration files
      • Live registry in ZooKeeper
        – Deployment only
        – … for now
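A static topology deployed via configuration files amounts to a lookup from a logical service name to a configured address. The sketch below shows that idea only; `StaticTopology` and the `service.<name>=<host>:<port>` key format are hypothetical, since the slides do not show what Frying Pan or Chef actually emit.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Static service discovery sketch: logical service name -> configured address. */
final class StaticTopology {
    private final Properties entries = new Properties();

    /** Parses lines of the hypothetical form "service.<name>=<host>:<port>". */
    StaticTopology(String config) {
        try {
            entries.load(new StringReader(config));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Resolves a logical name to a base URL, failing fast on unknown services. */
    String baseUrl(String service) {
        String address = entries.getProperty("service." + service);
        if (address == null) {
            throw new IllegalArgumentException("Unknown service: " + service);
        }
        return "http://" + address;
    }
}
```

Callers ask for `baseUrl("cart")` rather than hard-coding a host, so moving a service means redeploying configuration, not code.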
  15. 2. NETWORKING
      Image: Neurons by Birth Into Being (CC BY-NC-SA 2.0)
  17. Back to diagrams
      Diagram: Store Service, Checkout Service, Cart Service
      Protocol
  18. Protocol
      • RPC-style
        – Sync or async
        – Point-to-point
      • Message passing
        – Async only
        – Requires broker
      Shared concerns: topology, serialization, operations
  19. Protocol
      • Wix RPC
        – RPC-style
        – Custom JSON
        – HTTP
      • Pros/cons
        – Rock-solid
        – Sync/blocking
        – Legacy
      Image: psycho chicken by Bernhard Latzko (CC BY-ND 2.0)
  20. Protocol
      • Greyhound
        – Message-passing
        – Custom JSON
        – Kafka
      • Pros/cons
        – Async + replayable
        – Still experimental
      Image: Robin Fledgeling by edgeplot (CC BY-NC-SA 2.0)
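The "replayable" property on slide 20 comes from the broker keeping messages in an offset-addressed log instead of discarding them on delivery. The sketch below imitates that with an in-memory list. It uses neither Kafka nor Greyhound, and `ReplayableLog` and its methods are made-up names.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory stand-in for a broker topic: an append-only log addressed by offset. */
final class ReplayableLog {
    private final List<String> messages = new ArrayList<>();

    /** Appends a message and returns its offset. */
    synchronized long publish(String message) {
        messages.add(message);
        return messages.size() - 1;
    }

    /** Returns every message at or after the given offset; messages are never removed. */
    synchronized List<String> readFrom(long offset) {
        if (offset < 0 || offset > messages.size()) {
            throw new IllegalArgumentException("Offset out of range: " + offset);
        }
        return new ArrayList<>(messages.subList((int) offset, messages.size()));
    }
}
```

Because reading does not consume anything, a consumer that crashed or shipped a bug can rewind to an earlier offset and process the same messages again, which point-to-point RPC cannot offer.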
  22. Load balancing
      • Centralized
        – Simple
        – Limited flexibility
        – Limited scale
        – Thin implementation → highly portable
        – Suitable for static topologies
      • Distributed
        – Highly scalable
        – Flexible
        – Fully dynamic
        – Fat implementation → difficult to port
      • Quasi-distributed
        – e.g. Synapse
        – Best of both worlds?
      Frying Pan → Chef → Nginx
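With a static topology, the balancing decision itself can be very small. As an illustration, the sketch below does round-robin selection over a fixed upstream list. The class name and host strings are hypothetical, and the slides do not say which balancing method Wix configures in Nginx.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Round-robin selection over a fixed upstream list (static topology). */
final class RoundRobinBalancer {
    private final List<String> upstreams;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> upstreams) {
        if (upstreams.isEmpty()) throw new IllegalArgumentException("No upstreams");
        this.upstreams = List.copyOf(upstreams);
    }

    /** Returns the next upstream; floorMod keeps the index valid after int overflow. */
    String pick() {
        int i = Math.floorMod(next.getAndIncrement(), upstreams.size());
        return upstreams.get(i);
    }
}
```

The logic is thin enough to live in a central proxy, which is the portability argument on the slide: clients in any language only need to speak HTTP to one address.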
  23. To our shame
      • There’s always IDL.
      • Informal
        – Text documentation
        – Code samples
      • Formal
        – Swagger, Apiary etc.
        – ProtoBuf, Thrift, Avro
        – WSDL, god forbid!
      • … or
        – Ad-hoc

          public interface SiteMembersService {
              SiteMemberDto getMemberById(
                  Guid<SiteMember> memberId, UserGuid userId);
              SiteMemberDto getMemberOrOwnerById(
                  Guid<SiteMember> memberId, Guid<SMCollection> collectionId);
              SiteMemberDto getMemberDtoByEmailAndCollectionId(
                  String email, Guid<SMCollection> collectionId);
              List<SiteMemberDto> listMembersByCollectionId(
                  Guid<SMCollection> collectionId);
          }
  25. In Detail
      • Java interfaces?
        + Ridiculously simple
        + Lend well to RPC
        – Coupled to JVM
      • JSON serialization
        + Jackson-based
        + Custom, extensible mapping
        – Reflection-based
      • Server stack (JVM)
        – Jetty
        – Spring + Spring MVC
        – Custom handler
      • RPC client stack (JVM)
        – Spring
        – Proxy classes generated at runtime
        – AsyncHttpClient
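"Proxy classes generated at runtime" can be illustrated with the JDK's own `java.lang.reflect.Proxy`. This is a sketch under stated assumptions, not the Wix RPC client: the slide names Jackson and AsyncHttpClient, whereas here the request is rendered as a plain string and the transport is a function supplied by the caller. `RpcProxySketch` and `GreetingService` are hypothetical names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.function.Function;

/** Builds runtime proxies that turn interface calls into request strings. */
final class RpcProxySketch {
    /**
     * The transport stands in for the HTTP client: it receives a rendered
     * request and returns the (already deserialized) response.
     */
    static <T> T client(Class<T> api, Function<String, Object> transport) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getDeclaringClass() == Object.class) {
                // Keep toString/hashCode/equals local instead of sending them remotely.
                switch (method.getName()) {
                    case "toString": return api.getSimpleName() + "Proxy";
                    case "hashCode": return System.identityHashCode(proxy);
                    default: return proxy == args[0];
                }
            }
            String request = api.getSimpleName() + "." + method.getName()
                    + Arrays.toString(args == null ? new Object[0] : args);
            return transport.apply(request);
        };
        return api.cast(Proxy.newProxyInstance(api.getClassLoader(), new Class<?>[] {api}, handler));
    }
}

/** Hypothetical service interface used only for the demonstration. */
interface GreetingService {
    String greet(String name, int times);
}
```

The caller writes `RpcProxySketch.client(GreetingService.class, transport).greet("Wix", 2)` and never sees the wire format, which is what makes plain Java interfaces workable as an ad-hoc IDL, and also why the approach is coupled to the JVM.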
  26. In Detail
      • Java interfaces?
        + Ridiculously simple
        + Lend well to RPC
        – Coupled to JVM
      • JSON serialization
        + Jackson-based
        + Custom, extensible mapping
        – Reflection-based
      • Alternative stack
        – Based on Node.js
        – Generated RPC clients
        – Manually-converted entity schema :-(
  28. Cascade Failures
      • What is a cascade failure?
      • Mitigations
        – Bulkheading
        – Circuit breakers
        – Load shedding
      • We don’t do any of that (mostly)
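Of the mitigations listed, the circuit breaker is the easiest to show in a few lines. The slide states that Wix mostly does not use these techniques, so the following is a generic illustration of the pattern rather than anything from the Wix stack. The class, its count-based threshold and the fixed cool-down are assumptions.

```java
import java.util.function.Supplier;

/** Minimal count-based circuit breaker; not thread-safe, for illustration only. */
final class CircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private int consecutiveFailures = 0;
    private long openedAt = 0;
    private boolean open = false;

    CircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    /** Runs the call unless the breaker is open, in which case the fallback is returned. */
    <T> T call(Supplier<T> remoteCall, T fallback) {
        if (open) {
            if (System.currentTimeMillis() - openedAt < openMillis) {
                return fallback;          // fail fast, sparing the struggling dependency
            }
            open = false;                 // cool-down elapsed: let one trial call through
            consecutiveFailures = failureThreshold - 1;
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0;      // any success closes the breaker fully
            return result;
        } catch (RuntimeException e) {
            if (++consecutiveFailures >= failureThreshold) {
                open = true;
                openedAt = System.currentTimeMillis();
            }
            return fallback;
        }
    }
}
```

Once the breaker opens, callers stop queuing on a dependency that is already failing, which is how the pattern keeps one slow service from dragging down its upstream callers.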
  29. Does it go?
      • Short answer: yes.
      • Battle-tested
        – Evolving since 2010.
        – >200 services in production.
      • Known quantity
        – Easy to operate
        – Performs well enough
        – Known workarounds
  31. Not all is well, though
      • Polyglot development
        – Custom client stack
        – Expensive to port!
      • Implicit state
        – Transparently handled by the framework
        – Thread local storage
        – Hard to go async!
      Diagram: Client, Proxy, Service A, Service B; state passed along the calls: session info, transaction ID, A/B experiment
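Why thread-local state makes it "hard to go async" shows up as soon as work hops to another thread. The sketch below is a generic demonstration with hypothetical names (`ImplicitStateSketch`, `TRANSACTION_ID`); it does not reproduce how the Wix framework stores its context.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

/** Shows request context held in a ThreadLocal getting lost on an async hop. */
final class ImplicitStateSketch {
    // Hypothetical request-scoped context, set by a framework on the request thread.
    static final ThreadLocal<String> TRANSACTION_ID = new ThreadLocal<>();

    /** Reads the context on a worker thread: the value set by the caller is not visible there. */
    static String implicitHop(ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> String.valueOf(TRANSACTION_ID.get()), pool).join();
    }

    /** Captures the context first and passes it explicitly, so it survives the hop. */
    static String explicitHop(ExecutorService pool) {
        String captured = TRANSACTION_ID.get();
        return CompletableFuture.supplyAsync(() -> String.valueOf(captured), pool).join();
    }
}
```

With the transaction ID set on the calling thread, `implicitHop` yields the string `"null"` while `explicitHop` yields the ID. An async framework therefore has to capture and restore context at every thread boundary, or make it an explicit parameter.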
  32. 3. STRUCTURE
  33. Codebase modeling
      • A product comprises multiple services
      • Services have dependencies
        – Creating a DAG
        – Tends to cluster around domains
      • Org structure reflects the clustering (Conway)
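Because the dependencies form a DAG, a valid build order can be derived with a topological sort. The sketch below uses a depth-first post-order traversal. It only illustrates the general technique; the slides do not describe the algorithm behind Wix's own tooling, and `BuildOrder` and the module names in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Orders modules so that every module comes after the modules it depends on. */
final class BuildOrder {
    /** The map takes each module to the list of modules it depends on. */
    static List<String> of(Map<String, List<String>> dependencies) {
        List<String> order = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String module : dependencies.keySet()) {
            visit(module, dependencies, done, inProgress, order);
        }
        return order;
    }

    private static void visit(String module, Map<String, List<String>> dependencies,
                              Set<String> done, Set<String> inProgress, List<String> order) {
        if (done.contains(module)) return;
        if (!inProgress.add(module)) {
            throw new IllegalStateException("Dependency cycle through " + module);
        }
        for (String dep : dependencies.getOrDefault(module, List.of())) {
            visit(dep, dependencies, done, inProgress, order);
        }
        inProgress.remove(module);
        done.add(module);
        order.add(module);           // post-order: dependencies are already listed
    }
}
```

Given `checkout` depending on `cart` and `store`, and `cart` depending on `store`, the result lists `store` first and `checkout` last. A cycle is reported as an error, since it would mean the graph is not a DAG after all.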
  34. Codebase modeling
      Repository-per-domain
      • Small repositories
      • Artifacts built independently
      • Binary dependencies
      • Requires specialized tools to manage:
        – Versions
        – Build dependencies
      Monorepo
      • Repository contains everything
      • Code is built atomically
      • Source dependencies
      • Requires a specialized build tool
  35. At Wix
      • One repo per domain
      • Dependencies:
        – Declared in POMs
        – Version management via custom plugin
        – Builds managed by custom tool*
      • Custom dashboard, “Wix Lifecycle”
      * Lifecycle – Dependency Management Algorithm
  36. Version management
      [INFO] QuickRelease /home/builduser/agent01/work/d9922a1c87aee4bb bf1bc8bcfb2eccebc4268651c5f19faa689be6e4
      [08:10:55][INFO] Adding tag RC;.;1.20.0
      [08:10:56][INFO] Tag RC;.;1.20.0 added successfully
      [08:10:56][INFO] Working on onboarding-server-web
      [08:10:56][INFO] onboarding-server-web-1.19.0-SNAPSHOT jar deployable copied
      [08:10:56][INFO] onboarding-server-web-1.19.0-SNAPSHOT jar sources copied
      [08:10:56][INFO] onboarding-server-web-1.19.0-SNAPSHOT jar copied
      [08:10:56][INFO] onboarding-server-web-1.19.0-SNAPSHOT jar tests copied
      [08:10:56][INFO] onboarding-server-web pom deployed
      [08:10:57][INFO] Deploying artifacts to release artifacts repository
      [08:10:57][INFO] Deploying onboarding-server-web to RELEASE
      [08:10:57][INFO] pushing new pom
      [08:10:59]2016-02-22 08:10:39 [INFO ] /usr/bin/git push --tag origin master exitValue = 0

      • All artifacts share a common parent
        – Master list of versions
      • Manually-triggered release builds
        – Custom release plugin
        – Increments version
        – Updates master
        – Pushes changes to git
  37. 4. OPERATIONS
  38. Back to diagrams
      Diagram: Store Service, Checkout Service, Cart Service
      How ya doin’?
  39. Health
      • Host monitoring
        – Sensu alerts
        – Usual host metrics
        – Health-check endpoint in framework
      • End-to-end
        – Pingdom
      • Business
        – Custom BI toolchain
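A framework-provided health-check endpoint typically boils down to running a set of named checks and mapping the result to an HTTP status. The sketch below covers only that aggregation step, with no HTTP server attached. `HealthCheck`, the check names and the 200/503 convention are assumptions; the slides do not describe what the Wix endpoint returns.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

/** Aggregates named checks into the status a health endpoint would return. */
final class HealthCheck {
    private final Map<String, BooleanSupplier> checks = new LinkedHashMap<>();

    /** Registers a check under a name and returns this object for chaining. */
    HealthCheck register(String name, BooleanSupplier check) {
        checks.put(name, check);
        return this;
    }

    /** 200 if every check passes, 503 otherwise; a throwing check counts as a failure. */
    int status() {
        for (BooleanSupplier check : checks.values()) {
            try {
                if (!check.getAsBoolean()) return 503;
            } catch (RuntimeException e) {
                return 503;
            }
        }
        return 200;
    }
}
```

A monitoring agent then only needs to poll one URL per service and alert on anything other than a success status.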
  40. Instrumentation
      • Metrics
        – DropWizard Metrics
        – Graphite and Anodot
        – Built-in metrics (RPC, resource pools…)
        – APIs for custom metrics
      • Alerts
        – Anodot, NewRelic
        – Via PagerDuty
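To make the metrics pipeline concrete, the sketch below shows a counter rendered in Graphite's plaintext line format (`<path> <value> <timestamp>`). It is written without the DropWizard Metrics library, whose reporters would normally do this work. `CounterMetric` and the metric path in the usage note are hypothetical.

```java
import java.util.Locale;
import java.util.concurrent.atomic.LongAdder;

/** A counter that renders itself in Graphite's plaintext format. */
final class CounterMetric {
    private final String path;
    private final LongAdder count = new LongAdder();

    CounterMetric(String path) {
        this.path = path;
    }

    void increment() {
        count.increment();
    }

    /** Renders "<path> <value> <epoch-seconds>" followed by a newline. */
    String toGraphiteLine(long epochSeconds) {
        return String.format(Locale.ROOT, "%s %d %d\n", path, count.sum(), epochSeconds);
    }
}
```

For example, a counter at path `rpc.cart.requests` incremented twice renders as `rpc.cart.requests 2 <timestamp>`. Built-in metrics for RPC calls and resource pools would feed the same kind of line without any code in the service itself.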
  41. Debugging
      • Logs
        – Good old Logback
        – No centralized aggregation
        – Not particularly useful
      • Feature toggle overrides
      • Distributed tracing
  42. WE’RE DONE HERE!
      … AND YES, WE’RE HIRING :-)
      Thank you for listening
      tomer@tomergabel.com
      @tomerg
      http://il.linkedin.com/in/tomergabel
      Wix Engineering blog: http://engineering.wix.com
