This document discusses scaling the backend of a financial platform for big data and blockchain workloads. It covers integrating big data with Apache Spark and Cassandra for tasks like predictive modeling, recommendations, and credit scoring, and deploying a microservices architecture built on Spring Cloud, Docker, and Kubernetes. Blockchain integration involves a private Ethereum network on Kubernetes for tokenization and a connection to the public Ethereum mainnet through Infura for payments and transfers.
3. 1. Introduction (The companies, project and me) ·········· P. 3
2. Backend challenge ·········· P. 7
3. Big Data Integration ·········· P. 17
4. Blockchain Integration ·········· P. 21
13. Backend challenge - Why microservices?
• Easier migration of the Dapp
• Easy to scale
• Polyglot databases and languages
14. Why not use the blockchain exclusively, instead of a database?
15. 1. Spring Cloud Netflix and Kubernetes
• Easy to learn.
• Nice integrations
• Spring 5 reactive
2. Docker
• Most widely adopted container technology
• Well supported
3. Kubernetes
• Runs on multiple cloud providers and in on-premises data centers
• Self-healing and health-check capabilities
• Auto-scaling
Backend challenge - Microservice Architecture Stack
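As an illustration of this stack, a minimal Spring Cloud service that registers itself with a discovery server (e.g. Eureka from Spring Cloud Netflix) might look like the sketch below. The service name, class, and endpoint are hypothetical, not taken from the deck:

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.discovery.EnableDiscoveryClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical "accounts" microservice: registers with the discovery
// server and exposes a single HTTP endpoint.
@SpringBootApplication
@EnableDiscoveryClient
@RestController
public class AccountServiceApplication {

    @GetMapping("/status")
    public String status() {
        return "accounts service up";
    }

    public static void main(String[] args) {
        SpringApplication.run(AccountServiceApplication.class, args);
    }
}
```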
18. • Generation of PFM (personal finance management) values from user data: Apache Spark + Cassandra
• Forecast prediction and regeneration of these models: Apache Spark + Cassandra
• Product recommendations based on the user's economic profile and real needs: Apache Spark + Cassandra + Neo4j
• Credit scoring calculation: Apache Spark + Cassandra (sketched below)
Big Data Integration - Tasks
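As a rough sketch of one of these jobs, the code below computes a toy credit-scoring aggregate with Spark's DataFrame API and the Spark-Cassandra connector. The keyspace, table, and column names are invented for the example:

```java
import static org.apache.spark.sql.functions.avg;
import static org.apache.spark.sql.functions.count;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class CreditScoringJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("credit-scoring")
                .getOrCreate();

        // Read transactions from Cassandra (keyspace/table are hypothetical).
        Dataset<Row> txs = spark.read()
                .format("org.apache.spark.sql.cassandra")
                .option("keyspace", "finance")
                .option("table", "transactions")
                .load();

        // Toy score: aggregate per-user spending behaviour.
        Dataset<Row> scores = txs.groupBy("user_id")
                .agg(avg("amount").alias("avg_amount"),
                     count("*").alias("tx_count"));

        // Write the result back to a Cassandra table.
        scores.write()
                .format("org.apache.spark.sql.cassandra")
                .option("keyspace", "finance")
                .option("table", "credit_scores")
                .mode(SaveMode.Append)
                .save();

        spark.stop();
    }
}
```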
19. • These tasks are heavy and need:
• Time
• Resources
• Real time is not required.
• An event-driven architecture fits (see the sketch below).
Big Data Integration - Events
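A minimal sketch of what publishing such an event could look like with the RabbitMQ Java client; the queue name, broker host, and payload are assumptions for illustration:

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import java.nio.charset.StandardCharsets;

public class EventPublisher {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("rabbitmq"); // hypothetical broker host inside the cluster

        try (Connection connection = factory.newConnection();
             Channel channel = connection.createChannel()) {
            // Durable queue so events survive a broker restart.
            channel.queueDeclare("pfm.recalculate", true, false, false, null);

            // Fire-and-forget event: a Spark job consumes it asynchronously,
            // since none of these tasks need a real-time answer.
            String event = "{\"userId\": 42, \"type\": \"PFM_RECALCULATE\"}";
            channel.basicPublish("", "pfm.recalculate", null,
                    event.getBytes(StandardCharsets.UTF_8));
        }
    }
}
```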
21. Big Data Integration - RabbitMQ vs Kafka
RabbitMQ:
• Designed as a general-purpose message broker.
• Supports existing protocols like AMQP, STOMP, MQTT.
• Finer-grained consistency control/guarantees on a per-message basis.
• Complex routing.
Kafka:
• Designed for high-volume publish-subscribe messages and streams, meant to be durable, fast, and scalable.
• Event sourcing.
• Your application needs access to the stream history.
• No complex routing.
https://content.pivotal.io/blog/understanding-when-to-use-rabbitmq-or-apache-kafka
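To contrast with the RabbitMQ sketch above: a Kafka consumer can replay the full retained stream history, which is what makes it a fit for event sourcing. The topic, broker address, and group id below are invented:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class HistoryReplayConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka:9092");   // hypothetical broker
        props.put("group.id", "pfm-replay");
        props.put("auto.offset.reset", "earliest");      // start from the oldest retained event
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("user-events"));
            while (true) {
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d key=%s value=%s%n",
                            record.offset(), record.key(), record.value());
                }
            }
        }
    }
}
```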
22. • Deployed in Kubernetes.
• Only accessible through the NodeJS API.
• All keys are stored in secret vaults.
• Used for:
• Tokenization
• User transactions
Blockchain Integration - Private Ethereum
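The deck states that the private chain is reachable only through the NodeJS API. Purely as an illustration of the kind of calls such an API would make, here is a web3j sketch against a private Geth node; the node URL and account address are placeholders:

```java
import java.math.BigInteger;
import org.web3j.protocol.Web3j;
import org.web3j.protocol.core.DefaultBlockParameterName;
import org.web3j.protocol.http.HttpService;

public class PrivateChainClient {
    public static void main(String[] args) throws Exception {
        // Hypothetical in-cluster endpoint of the private Ethereum node.
        Web3j web3 = Web3j.build(new HttpService("http://geth-node:8545"));

        // Query the current block height.
        BigInteger block = web3.ethBlockNumber().send().getBlockNumber();
        System.out.println("Current block: " + block);

        // Balance of a (placeholder) token-holding account, in wei.
        BigInteger balance = web3
                .ethGetBalance("0x0000000000000000000000000000000000000001",
                        DefaultBlockParameterName.LATEST)
                .send()
                .getBalance();
        System.out.println("Balance (wei): " + balance);
    }
}
```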
26. Blockchain Integration - Ethereum Main Net
• We are the owners of the wallets.
• We use Infura to connect to the blockchain.
• Used for:
• Payments
• Transfers
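A sketch of connecting to the mainnet through Infura with web3j and sending an ether transfer, under the assumption that the platform signs transactions with its own wallet keys since Infura only relays them. The project ID, private key, and recipient address are placeholders:

```java
import java.math.BigDecimal;
import org.web3j.crypto.Credentials;
import org.web3j.protocol.Web3j;
import org.web3j.protocol.core.methods.response.TransactionReceipt;
import org.web3j.protocol.http.HttpService;
import org.web3j.tx.Transfer;
import org.web3j.utils.Convert;

public class MainnetPayment {
    public static void main(String[] args) throws Exception {
        // Infura endpoint; <PROJECT_ID> is a placeholder.
        Web3j web3 = Web3j.build(
                new HttpService("https://mainnet.infura.io/v3/<PROJECT_ID>"));

        // In production the key comes from the secrets vault,
        // never from source code; <PRIVATE_KEY_HEX> is a placeholder.
        Credentials wallet = Credentials.create("<PRIVATE_KEY_HEX>");

        // Send 0.01 ETH to a placeholder recipient; web3j signs locally
        // and Infura relays the signed transaction.
        TransactionReceipt receipt = Transfer.sendFunds(
                web3, wallet,
                "0x0000000000000000000000000000000000000002",
                BigDecimal.valueOf(0.01), Convert.Unit.ETHER).send();

        System.out.println("Tx hash: " + receipt.getTransactionHash());
    }
}
```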