2. ❖ Who am I?
- Chief System Architect of SiteGround.com
- Sysadmin since 1996
- Organizer of OpenFest, BG Perl
Workshops, LUG-BG and others
- Teaching Network Security and
Linux System Administration
courses at Sofia University
and SoftUni
3.
4. Google
Sep 8, 2016: announced that from the end of Jan 2017 it will
mark sites that collect CC data or passwords without
HTTPS as "Not secure"
Jan 31, 2017: it happened
Apr 27, 2017: announced that HTTP sites which have forms will
be added to "Not secure" in October
Feb 8, 2018: announced that it will mark all sites that
don't have HTTPS as "Not secure"
5. Let's Encrypt
❖ Announced in the beginning of 2016
❖ Launched on April 12, 2016
❖ SiteGround is one of the first
sponsors of Let's Encrypt
❖ Wildcard certificates were
introduced in March 2018
6. SiteGround
❖ At the beginning of 2016 we started
working on a cPanel plugin for Let's
Encrypt
❖ In Oct 2017 we started work on wildcard certs
7. SiteGround
❖❖ In the beginning of Dec 2017 we
decided to issue certs to every
host that is hosted on our
infrastructure...
9. ❖ We first decided to create an account
for every client that we have, so
everyone can take their own Let's
Encrypt account with them if they
leave...
Creating that many accounts:
- it was slow... so we decided to do it
in parallel
- about 10 requests per server
- but all servers at the same time
10. ❖ We broke the infrastructure of
Let's Encrypt
11. ❖ We got a call from Let's Encrypt :)
12. ❖ They had one SIMPLE request:
Implement a global limit (across all
of our Data Centers) to create at most 2
accounts per second
13. ❖ Implementing such a lock, with
performance and reliability in mind,
is not simple...
❖ We decided to use HashiCorp's
Consul locking for that
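
A minimal sketch of the "at most 2 accounts per second, globally" idea. This is not the SiteGround implementation and it does not call Consul: `GlobalRateLimiter` is a hypothetical name, and the `threading.Lock` plus the `next_slot` value are in-process stand-ins for what would have to be a distributed lock and a shared key in a multi-datacenter setup. Every caller reserves the next free time slot while holding the lock, then waits for that slot outside the lock.

```python
import threading
import time


class GlobalRateLimiter:
    """Allow at most `rate` operations per second across all callers.

    Assumption: the lock and `next_slot` live in one process here. Across
    data centers both would have to be kept in a shared store, and the
    clock would have to be one that all servers agree on.
    """

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / rate      # 2 per second -> one slot every 0.5 s
        self.clock = clock
        self.sleep = sleep
        self.lock = threading.Lock()    # stand-in for a distributed lock
        self.next_slot = 0.0            # stand-in for a shared key/value entry

    def acquire(self):
        """Reserve the next free slot, wait until it arrives, return its time."""
        with self.lock:
            now = self.clock()
            slot = max(now, self.next_slot)
            self.next_slot = slot + self.interval
        delay = slot - now
        if delay > 0:
            self.sleep(delay)           # wait outside the lock
        return slot


if __name__ == "__main__":
    limiter = GlobalRateLimiter(rate=2)
    for n in range(4):
        limiter.acquire()
        print("create account", n)      # placeholder for the real request
```

Holding the lock only while the slot is reserved keeps the critical section short, which matters when the lock is a network round trip instead of a local mutex.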
14. ❖ We have a lot of customers... such
a limit was never going to be
enough...
❖ So we decided to go with a single
account for all servers (proposed by
Let's Encrypt)
15. ❖ At that point everything started to
work...
❖ But we issued certs only to people
that wanted SSL, not everyone
20. ❖ Using certbot it took around 1 min
to issue a single certificate
❖ But we had millions of hosts...
❖ 1,000,000 min ~ 16,667 h ~ 694 days
❖ No problem :) We have thousands of
machines...
❖ We will issue them in parallel from
all servers :)
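
The arithmetic on this slide can be checked in a few lines. The 1 minute per certificate and the 1,000,000 hosts come from the slide; the figure of 1,000 parallel workers is only an illustrative assumption, since the slide says "thousands of machines" without an exact number.

```python
def issuance_time(hosts, minutes_per_cert, parallel_workers=1):
    """Return (minutes, hours, days) needed to issue one cert per host."""
    minutes = hosts * minutes_per_cert / parallel_workers
    hours = minutes / 60
    days = hours / 24
    return minutes, hours, days


# Figures from the slide: one host per minute, one at a time.
_, hours, days = issuance_time(1_000_000, 1)
print(f"serial: {hours:,.0f} h, about {days:.0f} days")   # serial: 16,667 h, about 694 days

# Hypothetical: 1,000 servers each issuing one certificate at a time.
_, hours, _ = issuance_time(1_000_000, 1, parallel_workers=1000)
print(f"parallel: {hours:.1f} h")                          # parallel: 16.7 h
```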
21. SiteGround
➢ I don't remember why, but we decided to issue
all certs between Christmas and New Year :)
25. ❖ 6 Data Centers on 3 continents
❖ Every server trying to issue up to
6 certificates in parallel
❖ Let's Encrypt did not have limits
for this...
26. ❖ Needless to say... we had a call
from Let's Encrypt :)
27. ❖ They had one SIMPLE request:
Implement a global limit (across all
of our Data Centers) to issue fewer
than 10 certificates at the same time
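
A cap on how many certificates are in flight at once is a counting semaphore. The sketch below shows the idea inside one process only: `IssuanceGate`, `issue_all` and `fake_issue` are hypothetical names, and `threading.BoundedSemaphore` stands in for whatever shared, cross-datacenter mechanism actually enforces the limit. The value 9 is one reading of "fewer than 10".

```python
import threading
import time

MAX_IN_FLIGHT = 9  # "fewer than 10 certificates at the same time"


class IssuanceGate:
    """Context manager that lets at most `limit` callers through at once."""

    def __init__(self, limit=MAX_IN_FLIGHT):
        self._sem = threading.BoundedSemaphore(limit)  # local stand-in
        self._lock = threading.Lock()
        self.in_flight = 0
        self.peak = 0          # highest concurrency seen, for checking

    def __enter__(self):
        self._sem.acquire()    # blocks while `limit` issuances are running
        with self._lock:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
        return self

    def __exit__(self, *exc):
        with self._lock:
            self.in_flight -= 1
        self._sem.release()
        return False           # do not swallow exceptions


def issue_all(hosts, issue, gate):
    """Run `issue(host)` for every host, never exceeding the gate's limit."""
    def worker(host):
        with gate:
            issue(host)

    threads = [threading.Thread(target=worker, args=(h,)) for h in hosts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    def fake_issue(host):      # placeholder for the real ACME exchange
        time.sleep(0.01)

    gate = IssuanceGate()
    issue_all([f"host{n}.example" for n in range(40)], fake_issue, gate)
    print("peak concurrency:", gate.peak)
```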
28. ❖ While we were again implementing a
Consul lock for this...
❖ Let's Encrypt implemented specific
limits, because of us :)
❖ For a week there were intermittent
issues with Let's Encrypt, but it was
between Christmas and New Year... so
neither we nor Let's Encrypt had
the people to address the issue
29. In March, renewals became a problem :)
- we hit several different limits:
- number of issuances of a cert for the same
hosts
- number of errors per server
- number of new certs
30. ❖ One of my DevOps guys unintentionally
switched from a single account to one
account per server
- now we had a new challenge: when
moving an account, we also had to move
its certs, because otherwise it would hit
the issuance limit
- the certs were still slow to issue
- we had to make some changes to certbot
in order to be able to finalize
interrupted cert issuances
31. I wrote our own certbot replacement, called
acmebot, based on Net::ACME2.
We spent the next 2 months adding the same
limits that LE has
32. While doing that...
The next wave of renewals came...
Again we had issues.
33. We implemented a per-LE-account limit
for every server that had its own
account.
We implemented a sliding window limit for all
servers that use the big account.
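
For readers unfamiliar with the term, a sliding window limit counts only the events from the last N seconds and refuses new ones while that count is at the maximum. The sketch below is a generic single-process illustration, not acmebot code: `SlidingWindowLimiter` is a hypothetical name, and the numbers (3 events per 10 seconds) are made up for the example. It is not thread-safe, and sharing the window between servers would need a common store.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_events` within any `window_seconds` interval."""

    def __init__(self, max_events, window_seconds, clock=time.monotonic):
        self.max_events = max_events
        self.window = window_seconds
        self.clock = clock
        self.events = deque()   # timestamps of accepted events, oldest first

    def try_acquire(self):
        """Record an event and return True, or return False if over the limit."""
        now = self.clock()
        cutoff = now - self.window
        # Drop timestamps that have slid out of the window.
        while self.events and self.events[0] <= cutoff:
            self.events.popleft()
        if len(self.events) >= self.max_events:
            return False
        self.events.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_events=3, window_seconds=10)
    for n in range(5):
        if limiter.try_acquire():
            print("issue cert", n)          # placeholder for the real request
        else:
            print("postpone cert", n)
```

Unlike a fixed per-hour counter, the window moves with the clock, so a burst at the end of one period cannot be followed immediately by another burst at the start of the next.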