Instant messenger with Python
Back-end development
Viacheslav Kakovskyi
WebCamp 2016
Me!
@kakovskyi
Python Developer at SoftServe
Contributor to Atlassian HipChat (Python 2, Twisted)
Maintainer of KPIdata (Python 3, asyncio)
2
Agenda
● What is 'instant messenger'?
● Related projects from my experience
● Messaging protocols
● Life of messaging platform
● Lessons learned
● Summary
● Further reading
3
What is 'instant messenger'?
4
What is 'instant messenger'?
● online chat
● real-time delivery
● short messages
5
What is 'instant messenger'?
● history search
● file sharing
● mobile push notifications
● video calling
● bots and integrations
6
Related projects from my experience
● HipChat: hosted chat for teams and enterprises
● Founded in 2009 by 3 students
● 100 000+ connected users
● 100+ nodes
● REST API for integrations and bots
● Built with Python 2 and Twisted
7
Messaging protocols
Protocol is about:
● Message format
● Allowed types of messages
● Limitations
● Routine
○ How to encode data?
○ How to establish/close connection?
○ How to authenticate?
○ How to encrypt?
8
Messaging protocols
● OSCAR (1997)
● XMPP (1999)
● Skype (2003)
● WebSocket-based (2011)
● MQTT, MTProto, DHT-based, etc.
9
XMPP
● XMPP - signaling protocol
● BOSH - transport protocol
● Started from Jabber in 1999
● XML as a message format
● Stanza - basic unit in XMPP
● Types of stanzas:
○ Message
○ Presence
○ Info/Query
10
XMPP
● Extensions are defined by XEPs (XMPP Extension Protocols):
○ Bidirectional-streams Over Synchronous HTTP (BOSH)
○ Serverless messaging
○ File transfer, etc.
11
XMPP: Establishing a connection
12
Client:
<?xml version='1.0'?>
<stream:stream to='example.com' xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
Server:
<?xml version='1.0'?>
<stream:stream from='example.com' id='someid'
xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'
version='1.0'>
XMPP: Sending a message
13
Client:
<message from='juliet@example.com' to='romeo@example.net'
xml:lang='en'>
<body>Art thou not Romeo, and a Montague?</body>
</message>
Server:
<message from='romeo@example.net' to='juliet@example.com'
xml:lang='en'>
<body>
Neither, fair saint, if either thee dislike.
</body>
</message>
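A minimal sketch of producing and reading a message stanza like the ones above, using only the standard library. The helper names `build_message` and `read_body` are illustrative assumptions; a real client would use one of the XMPP libraries listed later and would parse the stream incrementally rather than one stanza at a time.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the reserved "xml:" prefix (used for xml:lang).
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_message(sender: str, recipient: str, text: str, lang: str = "en") -> bytes:
    """Serialize a minimal <message/> stanza (hypothetical helper)."""
    message = ET.Element("message", {"from": sender, "to": recipient})
    message.set(f"{{{XML_NS}}}lang", lang)
    ET.SubElement(message, "body").text = text
    return ET.tostring(message)


def read_body(stanza: bytes) -> str:
    """Return the stripped text of the <body/> child, or '' if it is absent."""
    body = ET.fromstring(stanza).find("body")
    return (body.text or "").strip() if body is not None else ""


stanza = build_message(
    "juliet@example.com", "romeo@example.net",
    "Art thou not Romeo, and a Montague?",
)
print(read_body(stanza))
```

The sketch ignores the `jabber:client` default namespace that a real stream inherits from the opening `<stream:stream>` element.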
XMPP: Closing a connection
14
Client:
</stream:stream>
Server:
</stream:stream>
XMPP: Pros
● Robust and standardized
● Extendable via XEPs
● Secured
● Native support of multi-sessions
● Many client implementations
15
XMPP: Cons
● Overhead
○ Presence
○ Downloading the World on startup
● XML
○ Large documents
○ Expensive parsing
16
XMPP and Python
● Servers:
○ TwistedWords - good place to start
○ Tornado-based example
○ aioxmpp
○ XMPPFlask
○ Punjab - BOSH-server on Twisted
17
XMPP and Python
● Clients:
○ SleekXMPP - mature and solid
○ Slixmpp - asyncio-support
○ TwistedWords
○ Wokkel - Twisted-based
○ xmpp.py
● JS-client: Strophe.js
18
WebSocket-based solutions
● WebSocket - transport protocol
● Protocol standardized by the IETF in 2011 (RFC 6455); the browser API is specified by the W3C
● Full-duplex communication channel
● JSON as a message format
● Custom message types
19
WebSocket: Establishing a connection
20
Client:
GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Origin: http://example.com
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
Server:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
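The server derives `Sec-WebSocket-Accept` from the client's `Sec-WebSocket-Key`: it appends a fixed GUID defined in RFC 6455, takes the SHA-1 digest and Base64-encodes it. A small sketch of that step (the function name `accept_key` is an assumption; the libraries listed later do this for you):

```python
import base64
import hashlib

# Fixed GUID from RFC 6455, appended to the client's key before hashing.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Key from the handshake above.
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```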
WebSocket: Sending a message
21
Client:
{
"type": "message",
"ts": 1469563519,
"user": "kakovskyi",
"text": "Hello, @WebCamp!"
}
Server:
{
"type": "notification",
"ts": 1469563519,
"user": "WebCamp Bot",
"text": "Howdy @kakovskyi?"
}
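Because WebSocket only provides transport, the application has to define and validate its own message types. A minimal sketch of encoding and validating frames shaped like the two above; the field set and the allowed types are taken from the slide, while the function names and error messages are assumptions.

```python
import json

ALLOWED_TYPES = {"message", "notification"}   # the two types shown above
REQUIRED_FIELDS = {"type", "ts", "user", "text"}


def encode_message(msg_type: str, ts: int, user: str, text: str) -> str:
    """Serialize one frame of the custom JSON protocol."""
    return json.dumps({"type": msg_type, "ts": ts, "user": user, "text": text})


def decode_message(raw: str) -> dict:
    """Parse one frame and reject anything outside the custom protocol."""
    payload = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(payload, dict):
        raise ValueError("frame must be a JSON object")
    missing = REQUIRED_FIELDS - payload.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if payload["type"] not in ALLOWED_TYPES:
        raise ValueError(f"unknown message type: {payload['type']!r}")
    return payload
```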
WebSocket: Closing a connection
22
Client:
Close frame (opcode 0x8)
Server:
Close frame (opcode 0x8)
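The 0x8 above is the opcode of a WebSocket Close control frame. As a rough sketch, this is how an unmasked, server-to-client Close frame with a status code could be assembled by hand. The helper names are assumptions, and client-to-server frames must additionally be masked, which this sketch leaves out.

```python
import struct

OPCODE_CLOSE = 0x8
FIN_BIT = 0x80  # control frames are never fragmented, so FIN is always set


def build_close_frame(status_code: int = 1000, reason: str = "") -> bytes:
    """Build an unmasked Close frame: header byte, payload length, payload."""
    payload = struct.pack("!H", status_code) + reason.encode("utf-8")
    if len(payload) > 125:
        raise ValueError("control frame payload must not exceed 125 bytes")
    return bytes([FIN_BIT | OPCODE_CLOSE, len(payload)]) + payload


def is_close_frame(frame: bytes) -> bool:
    """Check the opcode in the low four bits of the first byte."""
    return len(frame) >= 2 and (frame[0] & 0x0F) == OPCODE_CLOSE
```

Status code 1000 means normal closure.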
WebSocket: Pros
● Supported by the majority of browsers
● Low latency
● Low bandwidth overhead
● Easy to start development
23
WebSocket: Cons
● Requires designing your own signaling protocol
● Timeouts and reconnections need extra handling
24
WebSocket and Python
● Servers:
○ Autobahn - Twisted and asyncio implementations
○ aiohttp
○ Tornado
○ Flask-SocketIO
○ Flask-Sockets
25
WebSocket and Python
● Clients:
○ Autobahn
○ aiohttp
○ Tornado-based example
○ Vanilla websocket-client
● JS-client: Socket.IO
26
Life of messaging platform
● Authentication
● Access control checks
● Delivery
○ Messages
○ User's presence
○ Push notifications
● History retrieval
● History search
27
Life of messaging platform
● Parsing
○ Protocol
○ Message content
● Dealing with file uploads
○ Security checks
○ Thumbnails distribution
● Multi-session support
● Reconnection handling
● Rate-limiting
28
Life of messaging platform
● The server keeps a connection open for every client
● Large number of long-lived concurrent connections
● A thread-per-connection approach is inefficient because of per-thread overhead
● Requires a select-style I/O multiplexing call on the backend:
○ poll
○ epoll
○ kqueue
● Asynchronous Python frameworks are preferred for high-load solutions
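To illustrate the idea, here is a minimal sketch of one event-loop iteration built on the standard `selectors` module, which picks epoll, kqueue or poll depending on the platform. The function name `echo_ready` is an assumption, and the asynchronous frameworks mentioned above wrap this pattern in a much more complete way.

```python
import selectors
import socket


def echo_ready(sel: selectors.BaseSelector, timeout: float = 1.0) -> int:
    """Serve every socket that is readable right now; return how many were served."""
    served = 0
    for key, _events in sel.select(timeout):
        conn = key.fileobj
        data = conn.recv(4096)
        if data:
            conn.sendall(data)        # echo the payload back
            served += 1
        else:
            sel.unregister(conn)      # an empty read means the peer closed
            conn.close()
    return served


# Usage: one thread watches many connections instead of one thread per client.
sel = selectors.DefaultSelector()
server_side, client_side = socket.socketpair()
server_side.setblocking(False)
sel.register(server_side, selectors.EVENT_READ)
client_side.sendall(b"ping")
echo_ready(sel)
print(client_side.recv(4096))
```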
29
Life of messaging platform
● Authentication
○ OAuth2
○ Run encryption operations in a separate Python thread
○ Cache user identities with Redis/Memcached
● Access-control checks
○ Make the checks lightweight and cheap
○ Raise an exception when an operation isn't permitted
30
EAFP: Easier to ask for forgiveness than permission
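A small sketch of the exception-based access check described above. The in-memory `ROOM_MEMBERS` table, the exception class and the function names are all hypothetical; a real platform would look the membership up in a cache or database.

```python
class PermissionDenied(Exception):
    """Raised when a user may not perform an operation on a room."""


# Hypothetical ACL: room id -> set of user ids allowed to post.
ROOM_MEMBERS = {"dev": {"alice", "bob"}}


def check_can_post(user_id: str, room_id: str) -> None:
    """Cheap check: one dict lookup and one set membership test."""
    if user_id not in ROOM_MEMBERS.get(room_id, ()):
        raise PermissionDenied(f"{user_id} may not post to {room_id}")


def post_message(user_id: str, room_id: str, text: str) -> str:
    # EAFP: attempt the operation and let the exception signal a refusal.
    try:
        check_can_post(user_id, room_id)
    except PermissionDenied:
        return "rejected"
    return "delivered"
```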
Delivery
● Make message delivery fault-tolerant
● Limit size of a message
● Filter content of messages:
○ Users like to send characters that break everything
● Reduce presence traffic; it can become a bottleneck in large chats
● Use an asynchronous task queue for delivery when a user is offline (email or push)
○ Celery
○ RQ
○ Amazon Simple Queue Service
○ Huey
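The size limit and content filter could look roughly like this. The 4096-byte limit and the choice to strip control characters (Unicode category `Cc`) are assumptions for illustration; the slides do not say which characters caused trouble.

```python
import unicodedata

MAX_MESSAGE_BYTES = 4096  # hypothetical limit; pick one that fits your product


def sanitize_message(text: str) -> str:
    """Drop control characters (except newline and tab) and enforce a size limit."""
    cleaned = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) != "Cc"
    )
    if len(cleaned.encode("utf-8")) > MAX_MESSAGE_BYTES:
        raise ValueError("message is too large")
    return cleaned
```

The limit is checked on the UTF-8 encoded size, since that is what travels over the wire and lands in storage.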
31
Life of messaging platform
● Push notifications
■ Vendors
● Amazon SNS
● APNS
● Google Cloud Messaging
● Firebase Cloud Messaging
■ Python tools
● PyAPNs
● Python-GCM
● Pusher
● Be careful with device registration
● Make delivery of pushes fault-tolerant
32
History retrieval
● Return the latest messages for every chat instantly
○ Use double writes
■ In-memory queue for the latest messages only
■ Persistent storage for everything
● Most history retrievals are for the last few days
○ Optimize for that case
● Index messages by date
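A toy sketch of the double-write idea: every message goes both to a bounded in-memory queue and to persistent storage. Here a plain list stands in for the database, and the class name, method names and `RECENT_LIMIT` are assumptions.

```python
from collections import defaultdict, deque

RECENT_LIMIT = 3  # hypothetical; a real service would keep far more


class HistoryStore:
    def __init__(self) -> None:
        # Bounded queue per chat: old entries fall off automatically.
        self._recent = defaultdict(lambda: deque(maxlen=RECENT_LIMIT))
        # Stand-in for a real persistent store.
        self._persistent = defaultdict(list)

    def append(self, chat_id: str, message: str) -> None:
        # Double write: the same message lands in both stores.
        self._recent[chat_id].append(message)
        self._persistent[chat_id].append(message)

    def last_messages(self, chat_id: str) -> list:
        """Fast path, served from memory."""
        return list(self._recent[chat_id])

    def full_history(self, chat_id: str) -> list:
        """Slow path, served from persistent storage."""
        return list(self._persistent[chat_id])
```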
33
History search
● Elasticsearch is the default choice for full-text search
● @a_soldatenko: What is the best full text search engine for Python?
● Add timing for search requests
34
Parsing
● Protocol
○ Avoid pure-Python parsers
■ ujson
■ lxml
○ Run benchmarks against your typical cases
● Message content
○ Be careful with regular expressions
■ re2
■ pyre2
○ Alternative parsers in Python
35
Dealing with file uploads
● Security checks
○ File upload vulnerabilities
○ Image upload
■ Decompression bomb
■ Other vulnerabilities with Pillow
○ Amazon S3 as file storage
■ boto
■ aiobotocore
■ botornado
● Thumbnails distribution
○ Delegate that to S3
○ Requested by a client even if not needed
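One inexpensive guard against decompression bombs is to read the declared image dimensions before handing the file to an image library. The sketch below does this for PNG only, by reading the IHDR chunk with the standard library; the pixel budget and function names are assumptions. Pillow also has its own pixel-count limit (`Image.MAX_IMAGE_PIXELS`), so this is an additional pre-check, not a replacement.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_PIXELS = 50_000_000  # hypothetical budget


def png_dimensions(data: bytes) -> tuple:
    """Read (width, height) from the IHDR chunk without decoding pixel data."""
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR", width, height.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    return struct.unpack(">II", data[16:24])


def is_safe_png(data: bytes, max_pixels: int = MAX_PIXELS) -> bool:
    """Reject images whose declared size exceeds the pixel budget."""
    width, height = png_dimensions(data)
    return width * height <= max_pixels
```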
36
Life of messaging platform
● Multi-session support
○ Set expiration time
○ Be ready to handle up to 4 simultaneous sessions per user
■ Desktop
■ Mobile
■ Tablet
■ Laptop
● Reconnection handling
○ Spin up a proxy layer between the messaging server and clients
● Rate-limiting
○ Limit the number of operations per user/group for heavy operations
○ Leaky bucket
○ Throttling
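A compact sketch of a leaky bucket used as a meter: each operation adds to the bucket, the bucket drains at a constant rate, and an operation is refused when it would overflow. The class and parameter names are assumptions, and the clock is injectable so the behaviour can be tested deterministically.

```python
import time


class LeakyBucket:
    def __init__(self, capacity: float, leak_rate: float, clock=time.monotonic) -> None:
        self.capacity = capacity      # maximum burst size
        self.leak_rate = leak_rate    # units drained per second
        self._clock = clock
        self._level = 0.0
        self._updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True if the operation fits in the bucket, False if it is throttled."""
        now = self._clock()
        # Drain the bucket for the time elapsed since the last call.
        self._level = max(0.0, self._level - (now - self._updated) * self.leak_rate)
        self._updated = now
        if self._level + cost > self.capacity:
            return False
        self._level += cost
        return True
```

In practice there would be one bucket per user or group, for example kept in a dict keyed by user id.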
37
Lessons learned
● Bursty traffic
○ Load testing is a must, but not always enough
■ Locust
■ Yandex Tank
● Reconnect storm could be a big deal
○ We should handle that on platform and client-side
● AWS outages hurt the customer experience
○ Spread nodes across multiple Availability Zones (Multi-AZ)
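On the client side, one common way to soften a reconnect storm is exponential backoff with random jitter, so that clients dropped at the same moment do not all reconnect at once. The slides do not prescribe a specific technique; this is a sketch of that general approach, with assumed names and defaults.

```python
import random


def reconnect_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                    rng=random.random) -> float:
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt))."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


# Delays grow with each failed attempt but stay random and bounded by `cap`.
for attempt in range(5):
    print(attempt, round(reconnect_delay(attempt), 2))
```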
38
Lessons learned
● Incident prevention is cheaper than resolution
○ Collect stats and metrics about your services and storage
■ Redis for per-chat stats
■ StatsD
■ Grafana
○ Be notified when something starts going wrong
■ Elastalert
■ Monit
■ DataDog
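StatsD accepts metrics as short plain-text datagrams of the form `name:value|type`, usually over UDP. A minimal sketch of emitting a counter and a timer without a client library; the metric names, host and helper names are assumptions, and 8125 is the conventional StatsD port.

```python
import socket


def format_metric(name: str, value, metric_type: str) -> bytes:
    """Render one metric in the StatsD line format: name:value|type."""
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_metric(payload: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget over UDP, so a metrics outage does not block delivery."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


counter = format_metric("chat.messages.sent", 1, "c")       # counter
timer = format_metric("search.request.duration", 42, "ms")  # timing in milliseconds
```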
39
Lessons learned
● Don't stick with one language/stack
○ Python is great, but in some cases Go, Ruby or PHP suit the product better
○ Avoid duplicating business logic across several repos; spin up a service and call its endpoint
● Releasing new features only to certain groups makes product management easier
○ LaunchDarkly
40
Lessons learned
● Don’t F**k the Customer
○ Provide unit/integration tests with every PR
○ Keep the development environment the same as prod
○ Keep the staging environment the same as prod
○ Make deployments fast
○ Make rollbacks faster
○ Have a fallback plan
41
Summary
42
Summary
● Select a messaging protocol which aligns with your needs
● WebSocket + JSON could be the thing for new projects
● Usage of asynchronous frameworks is preferred
● Execute blocking operations in a separate thread
● Collect metrics for common service operations
● Caching saves a lot of time
● Use C or Cython-based solutions for CPU-bound tasks
● Have a fast release/deploy/rollback cycle
● Python is great, but don't hesitate to pick other tools
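As an illustration of running a blocking operation in a separate thread, here is a sketch that offloads a password hash from an asyncio event loop to the default thread pool. It assumes Python 3.7+; the function names and the iteration count are placeholders for illustration.

```python
import asyncio
import hashlib


def hash_password(password: str, salt: bytes) -> bytes:
    """Blocking, CPU-heavy work that would stall the event loop if run inline."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


async def hash_password_async(password: str, salt: bytes) -> bytes:
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, hash_password, password, salt)


digest = asyncio.run(hash_password_async("s3cret", b"salt"))
print(len(digest))
```

While the hash runs in a worker thread, the event loop stays free to serve other connections.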
43
Further reading
● How HipChat Stores and Indexes Billions of Messages Using ElasticSearch
● @kakovskyi: Maintaining a high load Python project for newcomers
● HipChat: Important improvements to staging, presence & database storage
● HipChat and the little connection that could
● Elasticsearch at HipChat: 10x faster queries
● Atlassian: How IT and SRE use ChatOps to run incident management
● A Study of Internet Instant Messaging and Chat Protocols
● What Is Async, How Does It Work, And When Should I Use It?
● Leaky Bucket & Token Bucket - Traffic shaping
● A guide to analyzing Python performance
● Why Leading Companies Dark Launch - LaunchDarkly Blog
● @bmwant: Asyncio-stack for web development
44
Questions?
45
Viacheslav Kakovskyi
viach.kakovskyi@gmail.com
@kakovskyi
Instant messenger with Python
Back-end development

WebCamp Ukraine 2016: Instant messenger with Python. Back-end development

  • 1. Instant messenger with Python Back-end development Viacheslav Kakovskyi WebCamp 2016
  • 2. Me! @kakovskyi Python Developer at SoftServe Contributor of Atlassian HipChat — Python 2, Twisted Maintainer of KPIdata — Python 3, asyncio 2
  • 3. Agenda ● What is 'instant messenger'? ● Related projects from my experience ● Messaging protocols ● Life of messaging platform ● Lessons learned ● Summary ● Further reading 3
  • 4. What is 'instant messenger'? 4
  • 5. What is 'instant messenger'? ● online chat ● real-time delivery ● short messages 5
  • 6. What is 'instant messenger'? ● history search ● file sharing ● mobile push notifications ● video calling ● bots and integrations 6
  • 7. Related projects from my experience ● Hosted chat for teams and enterprises ● Founded in 2009 by 3 students ● 100 000+ connected users ● 100+ nodes ● REST API for integrations and bots ● Built with Python 2 and Twisted 7
  • 8. Messaging protocols Protocol is about: ● Message format ● Allowed types of messages ● Limitations ● Routine ○ How to encode data? ○ How to establish/close connection? ○ How to authenticate? ○ How to encrypt? 8
  • 9. Messaging protocols ● OSCAR (1997) ● XMPP (1999) ● Skype (2003) ● WebSocket-based (2011) ● MQTT, MTProto, DHT-based, etc. 9
  • 10. XMPP ● XMPP - signaling protocol ● BOSH - transport protocol ● Started from Jabber in 1999 ● XML as a message format ● Stanza - basic unit in XMPP ● Types of stanzas: ○ Message ○ Presence ○ Info/Query 10
  • 11. XMPP ● Extensions defined by XEPs (XMPP Extension Protocols): ○ Bidirectional-streams Over Synchronous HTTP (BOSH) ○ Serverless messaging ○ File transfer and etc. 11
  • 12. XMPP: Establishing a connection 12 Client: <?xml version='1.0'?> <stream:stream to='example.com' xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' version='1.0'> Server: <?xml version='1.0'?> <stream:stream from='example.com' id='someid' xmlns='jabber:client' xmlns:stream='http://etherx.jabber. org/streams' version='1.0'>
  • 13. XMPP: Sending a message 13 Client: <message from='juliet@example.com' to='romeo@example.net' xml:lang='en'> <body>Art thou not Romeo, and a Montague?</body> </message> Server: <message from='romeo@example.net' to='juliet@example.com' xml: lang='en'> <body> Neither, fair saint, if either thee dislike. </body> </message>
  • 14. XMPP: Closing a connection 14 Client: </stream:stream> Server: </stream:stream>
  • 15. XMPP: Pros ● Robust and standardized ● Extendable via XEPs ● Secured ● Native support of multi-sessions ● A lot of clients implementations 15
  • 16. XMPP: Cons ● Overhead ○ Presence ○ Downloading the World on startup ● XML ○ Large documents ○ Expensive parsing 16
  • 17. XMPP and Python ● Servers: ○ TwistedWords - good place to start ○ Tornado-based example ○ aioxmpp ○ XMPPFlask ○ Punjab - BOSH-server on Twisted 17
  • 18. XMPP and Python ● Clients: ○ SleekXMPP - mature and solid ○ Slixmpp - asyncio-support ○ TwistedWords ○ Wokkel - Twisted-based ○ xmpp.py ● JS-client: Strophe.js 18
  • 19. WebSocket-based solutions ● WebSocket - transport protocol ● Standardized in 2011 by W3C ● Full-duplex communication channel ● JSON as a message format ● Custom message types 19
  • 20. WebSocket: Establishing a connection 20 Client: GET /chat HTTP/1.1 Host: server.example.com Upgrade: websocket Connection: Upgrade Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ== Origin: http://example.com Sec-WebSocket-Protocol: chat, superchat Sec-WebSocket-Version: 13 Server: HTTP/1.1 101 Switching Protocols Upgrade: websocket Connection: Upgrade Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
  • 21. WebSocket: Sending a message 21 Client: { "type": "message", "ts": 1469563519, "user": "kakovskyi", "text": "Hello, @WebCamp!" } Server: { "type": "notification", "ts": 1469563519, "user": "WebCamp Bot", "text": "Howdy @kakovskyi?" }
  • 22. WebSocket: Closing a connection 22 Client: 0x8 Server: 0x8
  • 23. WebSocket: Pros ● Supported by majority of browsers ● Low latency ● Small bandwidth ● Easy to start development 23
  • 24. WebSocket: Cons ● Needs development of signaling protocol ● Timeouts/reconnections should be additionally handled 24
  • 25. WebSocket and Python ● Servers: ○ Autobahn - Twisted and asyncio implementations ○ aiohttp ○ Tornado ○ Flask-SocketIO ○ Flask-Sockets 25
  • 26. WebSocket and Python ● Clients: ○ Autobahn ○ aiohttp ○ Tornado-based example ○ Vanilla websocket-client ● JS-client: SocketIO 26
  • 27. Life of messaging platform ● Authentication ● Access control checks ● Delivery ○ Messages ○ User's presence ○ Push notifications ● History retrieval ● History search
  • 28. Life of messaging platform ● Parsing ○ Protocol ○ Message content ● Dealing with file uploads ○ Security checks ○ Thumbnails distribution ● Multi-session support ● Reconnection handling ● Rate-limiting
  • 29. Life of messaging platform ● The server keeps a connection open for every client ● Large number of long-lived concurrent connections ● A thread-per-connection approach is inefficient due to per-thread overhead ● Requires an I/O multiplexing mechanism (a select-style system call) on the backend: ○ poll ○ epoll ○ kqueue ● Asynchronous Python frameworks are preferred for high-load solutions
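To illustrate how one thread can watch many connections, the sketch below uses the standard-library `selectors` module, which picks epoll, kqueue, or poll depending on the platform. It uses `socket.socketpair()` as a stand-in for real client connections, and the names `read_ready` and `demo` are illustrative assumptions; asynchronous frameworks such as Twisted or asyncio wrap this same mechanism in an event loop.

```python
import selectors
import socket


def read_ready(sel, timeout=1.0):
    """Read from every registered socket that has data, without blocking on idle ones."""
    received = {}
    for key, _events in sel.select(timeout):
        # key.data holds whatever was attached at registration time (here: a user name).
        received[key.data] = key.fileobj.recv(4096)
    return received


def demo():
    """Register two idle 'connections', make one of them speak, and poll once."""
    sel = selectors.DefaultSelector()  # epoll/kqueue/poll, chosen per platform
    alice_srv, alice_cli = socket.socketpair()
    bob_srv, bob_cli = socket.socketpair()
    sel.register(alice_srv, selectors.EVENT_READ, data="alice")
    sel.register(bob_srv, selectors.EVENT_READ, data="bob")
    alice_cli.sendall(b"hi")  # only alice's connection has pending data
    try:
        return read_ready(sel)
    finally:
        sel.close()
        for sock in (alice_srv, alice_cli, bob_srv, bob_cli):
            sock.close()
```

A real server keeps the selector alive for the lifetime of the process and registers each connection once, when it is accepted.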
  • 30. Life of messaging platform ● Authentication ○ OAuth2 ○ Run encryption operations in a separate Python thread ○ Cache user identities with Redis/Memcached ● Access-control checks ○ Make the checks lightweight and cheap ○ Raise an exception when an operation isn't permitted (EAFP: easier to ask for forgiveness than permission)
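Two of the points above lend themselves to a short sketch: pushing a CPU-heavy cryptographic operation off the event loop, and signalling a failed access check with an exception. The example assumes asyncio rather than Twisted, uses PBKDF2 as a stand-in for "encryption operations", and invents `ROOM_MEMBERS`, `AccessDenied`, and the function names purely for illustration.

```python
import asyncio
import hashlib


def hash_password(password: str, salt: bytes) -> bytes:
    """CPU-heavy key derivation; blocks the calling thread while it runs."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


async def hash_password_async(password: str, salt: bytes) -> bytes:
    """Run the blocking hash in the default thread pool so the event loop stays responsive."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, hash_password, password, salt)


class AccessDenied(Exception):
    """Raised when a user is not allowed to perform an operation."""


# Hypothetical in-memory membership table; a real service would consult a cache or database.
ROOM_MEMBERS = {"webcamp": {"kakovskyi"}}


def check_can_post(user: str, room: str) -> None:
    """Cheap membership check that raises instead of returning a flag."""
    if user not in ROOM_MEMBERS.get(room, ()):
        raise AccessDenied(f"{user} may not post to {room}")


def post_message(user: str, room: str, text: str) -> str:
    """EAFP style: attempt the operation and let AccessDenied propagate to the caller."""
    check_can_post(user, room)
    return f"[{room}] {user}: {text}"
```

Callers wrap `post_message` in `try`/`except AccessDenied` and turn the exception into an error reply, rather than testing permissions up front at every call site.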
  • 31. Delivery ● Make message delivery fault-tolerant ● Limit the size of a message ● Filter the content of messages: ○ Users like to send characters that break all the things ● Reduce presence traffic; it can become a bottleneck in large chats ● Use an asynchronous task queue or broker for delivery when a user is offline (email or push) ○ Celery ○ RQ ○ Amazon Simple Queue Service ○ Huey
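As a sketch of the size limit and content filter mentioned above, the function below rejects oversized messages and strips control characters other than newline and tab. The 4096-character limit and the choice of which characters to drop are assumptions for illustration; a real platform would tune both.

```python
import unicodedata

MAX_MESSAGE_LENGTH = 4096  # hypothetical limit, in characters
ALLOWED_CONTROL_CHARS = {"\n", "\t"}


def sanitize_text(text: str) -> str:
    """Reject oversized messages and drop control characters that tend to break clients."""
    if len(text) > MAX_MESSAGE_LENGTH:
        raise ValueError("message too long")
    return "".join(
        ch
        for ch in text
        # Unicode category "Cc" covers control characters such as NUL and BEL.
        if ch in ALLOWED_CONTROL_CHARS or unicodedata.category(ch) != "Cc"
    )
```

Format characters (category `Cf`) are deliberately left alone here because emoji sequences rely on some of them.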
  • 32. Life of messaging platform ● Push notifications ○ Vendors ■ Amazon SNS ■ APNS ■ Google Cloud Messaging ■ Firebase Cloud Messaging ○ Python tools ■ PyAPNs ■ Python-GCM ■ Pusher ○ Be careful with device registration ○ Make delivery of pushes fault-tolerant
  • 33. History retrieval ● Return the last messages for every chat instantly ○ Use double writes ■ In-memory queue only for the last messages ■ Persistent storage for all the things ● Most history retrievals are for the last few days ○ Optimize for that case ● Index messages by date
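The double-write idea can be sketched as follows: every message goes both to a bounded in-memory queue that serves "last messages" instantly and to a persistent store that holds everything. Here a plain list stands in for the persistent store, and the class name and the limit of 50 are illustrative assumptions.

```python
from collections import defaultdict, deque

RECENT_LIMIT = 50  # hypothetical number of messages kept in memory per chat


class History:
    """Double-write store: bounded in-memory tail plus a full archive."""

    def __init__(self, recent_limit=RECENT_LIMIT):
        # deque(maxlen=...) silently discards the oldest item once it is full.
        self._recent = defaultdict(lambda: deque(maxlen=recent_limit))
        # Stand-in for persistent storage (a database in a real deployment).
        self._archive = defaultdict(list)

    def append(self, chat_id, message):
        """Write the message to both the archive and the in-memory tail."""
        self._archive[chat_id].append(message)
        self._recent[chat_id].append(message)

    def last_messages(self, chat_id):
        """Serve the most recent messages from memory only."""
        return list(self._recent.get(chat_id, ()))

    def full_history(self, chat_id):
        """Serve the complete history from the (slower) persistent store."""
        return list(self._archive.get(chat_id, ()))
```

In a multi-node deployment the in-memory part would typically live in a shared cache such as Redis rather than in process memory.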
  • 34. History search ● ElasticSearch is the default solution for full-text search ● @a_soldatenko: What is the best full text search engine for Python? ● Add timing for search requests
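One lightweight way to add timing to search requests is a context manager that records how long each call took. In this sketch the timings go into a plain dictionary and `search` is a hypothetical stand-in for the real search backend; in production the durations would more likely be sent to a metrics system.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(name, sink):
    """Append the elapsed seconds of the wrapped block to sink[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the duration even if the wrapped block raised.
        sink.setdefault(name, []).append(time.perf_counter() - start)


def search(query, documents):
    """Hypothetical stand-in for a call to the real search backend."""
    return [doc for doc in documents if query.lower() in doc.lower()]
```

Usage: wrap the call as `with timed("search", timings): hits = search(...)` and inspect `timings["search"]` to spot slow queries.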
  • 35. Parsing ● Protocol ○ Avoid pure-Python parsers; prefer C-backed ones ■ ujson ■ lxml ○ Run benchmarks against your typical cases ● Message content ○ Be careful with regular expressions ■ re2 ■ pyre2 ○ Alternative parsers in Python
  • 36. Dealing with file uploads ● Security checks ○ File upload vulnerabilities ○ Image upload ■ Decompression bomb ■ Other vulnerabilities with Pillow ○ Amazon S3 as file storage ■ boto ■ aiobotocore ■ botornado ● Thumbnails distribution ○ Delegate that to S3 ○ Requested by a client even if not needed
  • 37. Life of messaging platform ● Multi-session support ○ Set expiration time ○ Be ready to handle up to 4 simultaneous sessions per user ■ Desktop ■ Mobile ■ Tablet ■ Laptop ● Reconnection handling ○ Put a proxy layer between the messaging server and clients ● Rate-limiting ○ Limit the number of heavy operations per user/group ○ Leaky bucket ○ Throttling
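The leaky bucket mentioned above can be sketched as a meter: each operation adds to a level that drains at a constant rate, and an operation is rejected when it would overflow the bucket. The class and parameter names are illustrative assumptions, and the clock is injectable so the behaviour can be checked deterministically.

```python
import time


class LeakyBucket:
    """Leaky bucket used as a meter: reject operations that would overflow it."""

    def __init__(self, capacity, leak_rate, clock=time.monotonic):
        self.capacity = capacity    # maximum burst size, in operations
        self.leak_rate = leak_rate  # operations drained per second
        self._clock = clock
        self._level = 0.0
        self._updated = clock()

    def allow(self, cost=1.0):
        """Return True and account for the operation if it fits, else False."""
        now = self._clock()
        # Drain the bucket for the time elapsed since the last call.
        elapsed = now - self._updated
        self._level = max(0.0, self._level - elapsed * self.leak_rate)
        self._updated = now
        if self._level + cost > self.capacity:
            return False
        self._level += cost
        return True
```

Keeping one bucket per user or group, for example in a dictionary keyed by user ID, gives the per-user limits the slide describes.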
  • 38. Lessons learned ● Bursty traffic ○ Load testing is a must, but not always enough ■ Locust ■ Yandex Tank ● A reconnect storm can be a big deal ○ Handle it on both the platform and the client side ● AWS issues hurt the customer experience ○ Spread nodes across multiple Availability Zones (Multi-AZ)
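On the client side, a common way to soften a reconnect storm is exponential backoff with random jitter, so that clients dropped at the same moment do not all retry at the same moment. This is a general technique rather than something the slides spell out; the function name and default values below are assumptions.

```python
import random


def reconnect_delay(attempt, base=1.0, cap=60.0, rng=random):
    """Return a randomized delay in seconds before reconnect attempt number `attempt`.

    Uses "full jitter": a uniform value between 0 and an exponentially
    growing ceiling that is capped at `cap` seconds.
    """
    # Clamp the exponent so very large attempt counts cannot overflow a float.
    ceiling = min(cap, base * (2 ** min(attempt, 32)))
    return rng.uniform(0, ceiling)
```

A client would sleep for `reconnect_delay(n)` seconds before its n-th retry and reset `n` after a successful connection.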
  • 39. Lessons learned ● Incident prevention is cheaper than resolution ○ Collect stats and metrics about your services and storage ■ Redis for per-chat stats ■ StatsD ■ Grafana ○ Be notified when something starts going wrong ■ Elastalert ■ Monit ■ DataDog
  • 40. Lessons learned ● Don't stick with one language/stack ○ Python is great, but in some cases Go, Ruby or PHP are more suitable from the product side ○ Avoid duplicating business logic across several repos; spin up a service and just call the endpoint ● Releasing new features only for certain groups makes product management easier ○ LaunchDarkly
  • 41. Lessons learned ● Don’t F**k the Customer ○ Provide unit/integration tests with every PR ○ Keep the development environment the same as prod ○ Keep the staging environment the same as prod ○ Make deployments fast ○ Roll back faster ○ Have a fallback plan
  • 43. Summary ● Select a messaging protocol that aligns with your needs ● WebSocket + JSON could be the thing for new projects ● Asynchronous frameworks are preferred ● Execute blocking operations in a separate thread ● Collect metrics for common service operations ● Caching saves a lot of time ● Use C or Cython-based solutions for CPU-bound tasks ● Have a fast release/deploy/rollback cycle ● Python is great, but don't hesitate to pick other tools
  • 44. Further reading ● How HipChat Stores and Indexes Billions of Messages Using ElasticSearch ● @kakovskyi: Maintaining a high load Python project for newcomers ● HipChat: Important improvements to staging, presence & database storage ● HipChat and the little connection that could ● Elasticsearch at HipChat: 10x faster queries ● Atlassian: How IT and SRE use ChatOps to run incident management ● A Study of Internet Instant Messaging and Chat Protocols ● What Is Async, How Does It Work, And When Should I Use It? ● Leaky Bucket & Token Bucket - Traffic shaping ● A guide to analyzing Python performance ● Why Leading Companies Dark Launch - LaunchDarkly Blog ● @bmwant: Asyncio-stack for web development