In this session, we will look at how Apache Flink can be used to stream anonymized API request and response data from a production environment so that staging environments stay up to date and reflect the most recent features (and bugs) of a service. The talk will also examine how to deal with issues of data retention, throttling, and persistence, finishing with recommendations for how to use these sandbox environments to rapidly prototype and test new features and fixes.
4. Stories from the dark side
• A microservice sent a text message from a staging
environment to a real customer saying “I want you
back <3” over twenty times a day
• A cucumber test verified an emergency alert is sent when
a table is accidentally deleted … and did it in production
• A team’s APIs drifted so far from their swagger spec that
they decided to stop sending the spec to consultants and
just ask them to figure stuff out from production logs
5. Mocks allow teams to test
their services and their
integrations with confidence
9. But staging environments had their problems…
• Monolithic: One shared environment means you can’t count
on its state while testing, because a colleague may be
using it at the same time.
• Drifting: Frequently falls out of step with production.
• Heavy: Difficult to deploy anywhere, which made CI/CD
cumbersome.
11. Containerization helps mocking
• Tools like Docker and Kubernetes allow developers to spin
up and tear down ephemeral versions of their stack on a
per-test basis.
• No more monolith, not heavy, always (theoretically) in line
with production when using Ansible, Terraform etc.
12. But there are still issues
• Containers do not know anything about downstream
dependencies, making it hard to mock containers that
depend on a robust ecosystem of services.
• Containers cannot start in an arbitrary state without ad hoc
logic.
• Containers have no inherent service discovery mechanism,
making it tough to know what to test.
13. At Meeshkan, we solve
these issues by recording
streams of a service’s I/O
operations and using those
streams to reverse
engineer an accurate
mock of the service.
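As a rough illustration of the idea, here is a minimal Python sketch that turns a list of recorded request/response pairs into a lookup-based mock. The recording format (`method`, `path`, `status`, `body`) and the `build_mock` and `serve` helpers are assumptions made for this example; they are not Meeshkan's actual data model or API.

```python
from collections import defaultdict

def build_mock(recordings):
    """Group recorded responses by (method, path), in arrival order."""
    mock = defaultdict(list)
    for rec in recordings:
        key = (rec["method"], rec["path"])
        mock[key].append({"status": rec["status"], "body": rec["body"]})
    return dict(mock)

def serve(mock, method, path):
    """Replay the most recently recorded response for a route.

    Routes that never appeared in the recordings get a 404.
    """
    responses = mock.get((method, path))
    if not responses:
        return {"status": 404, "body": None}
    return responses[-1]

# Hypothetical recordings, as they might arrive from a traffic stream.
recordings = [
    {"method": "GET", "path": "/users/1", "status": 200,
     "body": {"id": 1, "name": "Ada"}},
    {"method": "POST", "path": "/users", "status": 201,
     "body": {"id": 2}},
]

mock = build_mock(recordings)
```

A real mock would generalize over path parameters and response schemas rather than replaying exact responses; the sketch only shows the record-then-derive flow.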
16. Benefits of in-memory mocks
• Allows programmers to collocate application logic and test
logic.
• Certain frameworks (e.g. unmock) allow for fuzzing and
property-based testing.
• Very light to set up.
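To make "collocated" and "light" concrete, here is a small sketch using Python's standard `unittest.mock` (not unmock, which targets other ecosystems). The `welcome_message` function and the `get_user` client method are hypothetical names chosen for the example.

```python
from unittest.mock import Mock

def welcome_message(client, user_id):
    """Application logic: fetch a user and format a greeting."""
    user = client.get_user(user_id)
    return f"Welcome back, {user['name']}!"

# Test logic sits right next to the application logic.
# The fake client lives entirely in memory; no server is started.
fake_client = Mock()
fake_client.get_user.return_value = {"id": 7, "name": "Ada"}

message = welcome_message(fake_client, 7)

# Verify the dependency was called exactly once with the expected argument.
fake_client.get_user.assert_called_once_with(7)
```

The canned `return_value` is written by hand, which is exactly the part that needs maintenance as the real service evolves.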
17. Downsides of in-memory mocks
• Allows programmers to collocate application logic and test
logic (this was also a benefit!).
• Difficult to transfer to other languages.
• Requires lots of manual maintenance as services evolve.
• Tends to bloat tests with logic, which defeats the purpose of
testing, as you have to test your tests.
22. How mocking with streams solves the mock
server problem
• When ingesting tens of millions of requests, mocking based
on stored fixtures is simply not an option.
• Mocking a server based on streams of its input and output
allows testing against a snapshot of a service as it functions
over an arbitrary timeframe.
• Tests can execute against different versions of a server to
find a bug – like git bisect but for your mocks.
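The bisect analogy can be sketched as a binary search over an ordered list of mock snapshots. This is an illustration only: the `first_failing_version` helper and the snapshot format are assumptions, and the search presumes that the first snapshot passes, the last one fails, and a failure persists once it appears.

```python
def first_failing_version(versions, passes):
    """Return the index of the first version for which `passes` is False.

    Assumes versions[0] passes, versions[-1] fails, and that every
    version after the first failing one also fails.
    """
    lo, hi = 0, len(versions) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(versions[mid]):
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical snapshots of one mocked route over time.
snapshots = [
    {"GET /users/1": {"id": 1, "email": "ada@example.com"}},
    {"GET /users/1": {"id": 1, "email": "ada@example.com"}},
    {"GET /users/1": {"id": 1, "email": "ada@example.com"}},
    {"GET /users/1": {"id": 1}},  # the email field disappears here
    {"GET /users/1": {"id": 1}},
]

def has_email(snapshot):
    """The test under bisection: the user payload must include an email."""
    return "email" in snapshot["GET /users/1"]

culprit = first_failing_version(snapshots, has_email)
```

As with git bisect, the number of test runs grows with the logarithm of the number of snapshots rather than linearly.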
23. Mocking with streams - Kafka
• In your K8s cluster, include a Meeshkan node
(https://github.com/meeshkan/meeshkan) that records
request and response data from a Kafka stream.
• The output is a mock server that exists as another K8s node
against which you execute your tests.
• https://dev.to/meeshkan/building-a-real-time-http-traffic-stream-with-apache-kafka-dim
24. Mocking with Kafka is hard because
it requires a stateful service that
retains a large window of data that
is transformed into the mock server.
26. Flink solves the state problem in service
mocking
• Meeshkan is deployed as a stateful function on Flink thanks to
the new Stateful Functions (StateFun) API (thank you Flink team!)
• The stateful function keeps the current mock as the state, ingests
recordings, and outputs a new mock based on the diff.
• If the new mock is not equivalent to the previous one, an alert
can be sent that the service has drifted.
• Creating a mock from a diff obviates the need for long stream
retention.
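The state-plus-diff loop described above can be sketched in plain Python. This deliberately does not use the StateFun SDK: the `MockUpdater` class, its recording format, and its notion of a response "shape" (status code plus sorted field names) are all assumptions for illustration. In a Flink deployment, the `mock` dictionary would live in managed state instead of an instance attribute.

```python
class MockUpdater:
    """Keeps the current mock as state and updates it per recording."""

    def __init__(self):
        # State: route -> last observed response shape.
        self.mock = {}

    def ingest(self, recording):
        """Fold one recording into the mock.

        Returns a drift alert (a dict) when a known route changes shape,
        otherwise None. Only the current mock is kept, so the recording
        itself can be discarded afterwards.
        """
        route = f"{recording['method']} {recording['path']}"
        observed = {
            "status": recording["status"],
            "fields": sorted(recording["body"]),  # field names only
        }
        previous = self.mock.get(route)
        if previous == observed:
            return None  # mocks are equivalent, nothing to do
        self.mock[route] = observed
        if previous is None:
            return None  # first sighting of this route, not drift
        return {"route": route, "before": previous, "after": observed}

updater = MockUpdater()
first = updater.ingest({"method": "GET", "path": "/users/1",
                        "status": 200, "body": {"id": 1, "email": "a@b.c"}})
same = updater.ingest({"method": "GET", "path": "/users/1",
                       "status": 200, "body": {"id": 9, "email": "x@y.z"}})
drift = updater.ingest({"method": "GET", "path": "/users/1",
                        "status": 200, "body": {"id": 1}})
```

Because each recording is compared against the current mock and then dropped, the stream only needs to be retained long enough to be processed once.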
27. Meeshkan: Your partners in testing using Flink
• If your company is deployed on Flink and interested in
building mock servers for testing using streams and stateful
functions, reach out!
• Meeshkan’s open-source tools are available at
https://github.com/meeshkan, and our automatic-testing
alpha is accepting sign-ups at https://meeshkan.com.