Moving product pipelines from one team to another, from one department to another, or from one country to another is a common and challenging task in a dynamic environment, especially in big companies with many teams, where projects can move back and forth due to internal structure changes, team merges and splits, etc.
The Zalando Engineering team will share the approach and practices they use to make those migrations smooth. The team will provide insights into how this works at scale at Zalando and what the technical and non-technical challenges are: the moving parts of such migrations, timelines, cross-team collaboration, and rolling out to production when you are not the original owner of the system.
ZALANDO AT A GLANCE
~ 6.5 billion EUR revenue 2019
~ 350 million visits per month
~ 15,000 employees in Europe
> 80% of visits via mobile devices
31 million active customers
> 500,000 product choices
> 2,500 brands
17 countries
as of September 2019
SOME OF OUR MOST INNOVATIVE TECH IS DEVELOPED IN HELSINKI
Connected retail
We are integrating 640,000 physical stores into the Zalando
New product for Zalando, launched in March 2018 and built in Helsinki.
Zalando logistics
Warehouse software to manage logistics network complexity and improve distribution of the goods
Fashion Store
Personalised content selection
CAMPAIGNS & COLLECTIONS TEAM
[Diagram: system landscape of the team. Labelled components: Happy customer, Internal users, Interface Framework, Content selection system, Content solutions system, UI Tools 1-3, BFF services, Mgmt APIs, Q1-Q3, Fashion Store API, Delivery API, GraphQL, Media service, Validations API, Score, Rendering Engine, Renderers, Entity IDs, Outfit Mgmt, Creator Mgmt, Search, Catalog, Auction. Legend: within scope / outside scope.]
ZALANDO TECH “KITCHEN”
1. Autonomous teams = You run what you’ve built
2. Cloud native = AWS + K8S + Continuous Delivery
3. Open code = Everyone can see the code of everyone
4. API First = Microservices + OpenAPI REST (guidelines)
5. Writing culture = Our engineers are great writers
THE 5 PRINCIPLES
1. Start early
Better safe than sorry
2. (Over-) communicate
DRY? Please, repeat yourself, leave nothing to chance
3. One document as a single source of truth
Write it. Pale ink is better than the best memory
4. Iterate
Don’t try to solve everything at once
5. “Don’t break things”
Luckily you know how to fix it when it happens, right?
1. START EARLY
- Estimate your team capacity
How many people need to be involved?
- Identify constraints and risks
Time constraints? Do we know the project's tech stack?
- Innersource small things when possible
Learning by doing is a good way to get familiar with a new project. Start small.
2. (OVER-) COMMUNICATE
- With involved teams
In person, Hangouts, via Google Docs.
- With your stakeholders
Make sure they know whom to contact in case of issues, requests, etc.
- By documenting exhaustively
Gather as much info as possible while the previous team is there and knowledge is fresh.
3. ONE DOCUMENT AS A SINGLE SOURCE OF TRUTH
- Create it
Create a Google Doc, put all info there, document exhaustively
- Share it
Everyone should know where to look and track the progress
- Get feedback
Gather as much info as possible while the previous team is there and knowledge is fresh.
4. ITERATE
- We know how to operate the system
Monitoring, logs, alerts. Incident impact is clear
- We know the dependencies and stakeholders
Upstream/downstream. Relevant contacts
- We know how to fix small bugs
Development/deployment process is clear
- We know how to evolve the project
Ability to build mid/long-term features. Longer-term vision
5. “DON’T BREAK THINGS”
- Blue-Green deployment model
Make API, database, message queue migrations great again
- Graceful URL migrations
Make use of DNS CNAME records, give people time to migrate
- Monitor access logs and help stakeholders to switch
Log requests with old URLs, notify the owners
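The last point, logging requests that still use old URLs so their owners can be notified, could look roughly like the sketch below. It is an illustration only: the log line format (`<client> <host> <method> <path> <status>`), the hostname `old-api.example.org`, and the client names are all hypothetical and not taken from the talk.

```python
from collections import Counter

OLD_HOST = "old-api.example.org"  # hypothetical legacy hostname (assumption)


def parse_line(line):
    """Parse one line of the assumed format:
    '<client> <host> <method> <path> <status>'. Returns None if malformed."""
    parts = line.split()
    if len(parts) != 5:
        return None
    client, host, method, path, status = parts
    return {"client": client, "host": host, "method": method,
            "path": path, "status": status}


def clients_on_old_host(lines, old_host=OLD_HOST):
    """Count requests per client that still target the old hostname."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["host"] == old_host:
            counts[entry["client"]] += 1
    return counts


# Made-up sample data, for illustration only.
SAMPLE_LOG = [
    "checkout-service old-api.example.org GET /items/1 200",
    "search-service new-api.example.org GET /items/2 200",
    "checkout-service old-api.example.org POST /items 201",
    "reporting-job old-api.example.org GET /items 200",
    "malformed line",
]

if __name__ == "__main__":
    # Clients listed here are the ones whose owners still need to switch.
    for client, hits in clients_on_old_host(SAMPLE_LOG).most_common():
        print(f"{client}: {hits} request(s) still using {OLD_HOST}")
```

Running such a report periodically while the old name is still served shows which clients have yet to migrate and when the old URL can be retired.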
DATABASE MIGRATIONS
- Downtime for writes is allowed
The system can tolerate no writes for minutes or a few hours
- Only minimal downtime for writes is allowed
The system can tolerate no writes only for a few minutes
MINIMAL DOWNTIME ALLOWED: STANDBY
[Diagram: API, old-cluster and new-cluster (standby cluster), each with S3 backups]
1. Restore from the backup
2. Keep replicating the data
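The two steps above, followed by a short write pause for the switch, can be sketched as a small in-memory simulation. This is not the team's tooling: the `Cluster` class and the functions `restore_from_backup`, `replicate`, and `cut_over` are hypothetical stand-ins, and the naive copy used here ignores deletes and everything a real replication mechanism handles.

```python
class Cluster:
    """In-memory stand-in for a database cluster (not a real driver)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.writable = True


def restore_from_backup(backup, target):
    """Step 1: seed the new cluster from the latest backup."""
    target.rows = dict(backup)
    target.writable = False  # a standby accepts no direct writes


def replicate(source, target):
    """Step 2: copy rows that are new or changed on the source.
    Returns the number of rows applied. Deletes are not handled."""
    changed = {k: v for k, v in source.rows.items()
               if target.rows.get(k) != v}
    target.rows.update(changed)
    return len(changed)


def cut_over(old, new):
    """Short write pause: freeze the old cluster, apply the remaining
    changes, then promote the standby."""
    old.writable = False
    replicate(old, new)
    new.writable = True
    return new  # the API should point at this cluster from now on


if __name__ == "__main__":
    old = Cluster("old-cluster")
    old.rows = {"a": 1, "b": 2}
    backup = dict(old.rows)       # stand-in for a backup taken earlier

    new = Cluster("new-cluster")
    restore_from_backup(backup, new)

    old.rows["c"] = 3             # writes keep arriving on the old cluster
    old.rows["a"] = 10
    replicate(old, new)           # the standby catches up in the background

    old.rows["d"] = 4             # one last write before the switch
    active = cut_over(old, new)
    print(active.name, active.rows == old.rows)
```

The point of the sketch is that writes are blocked only for the final catch-up and promotion, which is what keeps write downtime to a few minutes.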
THE 5 PRINCIPLES
- Start early
- (Over-) communicate
- One document as a single source of truth
- Iterate
- “Don’t break things”
WHAT’S NEXT?
- Assess integration with existing components
Does it make sense? Are there overlaps?
- Define a target architecture
Based on a longer-term vision and strategy
- Progress step by step towards the target
Re-assess as you go