1.
© 2016 MapR Technologies
Handling the Extremes
Scaling and Streaming in Finance
2.
Agenda
• History
– Past, present, future
• Messaging platforms
– Defining the extremes
• Use cases
– Email, fraud
• Resources
• Q&A
3.
Legacy Business Platforms
• IT must integrate all the products
• Inability to operationalize the insight rapidly
• Can’t deal with high-speed data ingestion and processing
• Scale-up architecture leads to high cost
[Diagram labels: Operational Applications (J2EE AppServer, Message Bus, Relational Database, Specialized Storage); Analytical Applications (ETL Tool, BI Tool, Analytic Database, Specialized Storage)]
4.
Converged Data Platform
Complete Access to Real-time and Historical Data in One Platform
[Diagram: Converged Applications span two sides of one platform.
Analytical Applications (Historical Data, Bottom Line Initiatives): Analysts creating BI reports and KPIs on the data warehouse.
Operational Applications (Current Data, Top Line Initiatives): Developers creating database and event-based applications.]
5.
Application Development and Deployment
[Diagram comparing an application stack from 2005 with one from today.
2005: Desktop Browser (JavaScript, jQuery); HTML, CSS, JS; IIS, ASP.NET; SQL; SQL Server; Microsoft Reporting Service.
Today, clients: Desktop Browser (JavaScript, 20+ Frameworks); Tablet; Native Android; Native iOS; JSON; JSON, CSS, HTML, JS.
Today, services: Backend for Frontend (Java); Microservice (.NET); Microservice (NodeJS); Microservice (Java).
Today, data: Oracle; SQL Server; NoSQL; Graph DB; Insights DB; Events (Kafka); Bulk Load; Data Lake.
Today, analytics: Machine Learning; Predictive Modeling; BI / Reporting; Customer Insights.]
6.
Application Development and Deployment
[Diagram: same 2005 and Today labels as slide 5.]
7.
MapR Platform Services: Open API Architecture
Assures Interoperability, Avoids Lock-in
[Diagram labels.
Services: Web-Scale Storage (MapR-FS); Database (MapR-DB); Event Streaming (MapR Streams).
APIs: HDFS API; POSIX; NFS; SQL; HBase API; JSON API; Kafka API.
Platform capabilities: Real Time; Unified Security; Multi-tenancy; Disaster Recovery; Global Namespace; High Availability.]
8.
Converged Application Benefits
• Consumers scale horizontally with partitions
• 1:1 mapping between consumer and partition
• Enables predictable scaling as production needs grow
• Data can be seamlessly replicated to another cluster
• Enables HA with zero code changes
• Data is indexed dynamically according to receivers, senders
• Scales beyond the capabilities of Kafka
• Snapshots can be taken to capture state
• Enables faster testing and deployment of applications
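The partition bullets above can be illustrated with a small sketch. It does not use the Kafka client: `PartitionAssignmentSketch` and `assign` are hypothetical names, and the round-robin rule is an assumption chosen for illustration, not the assignment strategy of Kafka or MapR Streams. It shows that each partition is owned by exactly one consumer, so adding consumers only helps up to the number of partitions.

```java
import java.util.ArrayList;
import java.util.List;

class PartitionAssignmentSketch {
    // Assign partitions 0..partitionCount-1 to consumers round-robin.
    // result.get(c) lists the partitions owned by consumer c.
    static List<List<Integer>> assign(int partitionCount, int consumerCount) {
        List<List<Integer>> owned = new ArrayList<>();
        for (int c = 0; c < consumerCount; c++) {
            owned.add(new ArrayList<>());
        }
        for (int p = 0; p < partitionCount; p++) {
            owned.get(p % consumerCount).add(p);
        }
        return owned;
    }

    public static void main(String[] args) {
        System.out.println(assign(4, 2)); // [[0, 2], [1, 3]]
        System.out.println(assign(4, 4)); // [[0], [1], [2], [3]]  (1:1 mapping)
        System.out.println(assign(4, 6)); // [[0], [1], [2], [3], [], []]  (two idle consumers)
    }
}
```

With four partitions, the fifth and sixth consumers receive nothing, which is why the partition count is chosen with future consumer parallelism in mind.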
9.
Messaging platforms
10.
What’s a Stream?
A stream is an unbounded sequence of events carried from a set of producers to a set of consumers.
Producers and consumers don’t have to be aware of each other; instead, they participate in shared topics. This is called publish/subscribe.
[Diagram labels: Producers; /Events:Topic; Consumers]
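A minimal in-memory sketch of publish/subscribe, assuming nothing about the real platform: `PubSubSketch`, `subscribe`, and `publish` are hypothetical names, and events are pushed synchronously rather than persisted and polled as they would be in Kafka or MapR Streams. The point is only that the publisher addresses a topic, never a particular consumer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class PubSubSketch {
    // Topic name -> handlers registered for that topic.
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    // Deliver the event to every subscriber of the topic and
    // return how many subscribers received it.
    int publish(String topic, String event) {
        List<Consumer<String>> handlers =
            subscribers.getOrDefault(topic, new ArrayList<>());
        for (Consumer<String> handler : handlers) {
            handler.accept(event);
        }
        return handlers.size();
    }

    public static void main(String[] args) {
        PubSubSketch bus = new PubSubSketch();
        List<String> dashboard = new ArrayList<>();
        List<String> alerts = new ArrayList<>();
        // Two independent consumers share one topic.
        bus.subscribe("/Events:Topic", dashboard::add);
        bus.subscribe("/Events:Topic", alerts::add);
        // The producer knows the topic name only.
        bus.publish("/Events:Topic", "MyEvent");
        System.out.println(dashboard + " " + alerts); // [MyEvent] [MyEvent]
    }
}
```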
11.
Ability to Handle the “Extreme”
Extreme
• 1+ Trillion Events
– per day
• Millions of Producers
– Billions of events per second
• Multiple Consumers
– Potentially for every event
• Multiple Data Centers
– Plan for success
– Plan for drastic failure
Average Reality
Think that is crazy? Consider having 100 servers and performing:
Monitoring and Application logs…
– 100 metrics per server
– 60 samples per minute
– 50 metrics per request
– 1,000 log entries per request (abnormally small, depends on level)
– 1 million requests per day
~ 2 billion events per day, for one small(ish) use case
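The "~ 2 billion events per day" figure can be checked with a few lines of arithmetic. This sketch assumes that the 60 samples per minute apply to every metric on every server, and that each metric sample and each log entry counts as one event; `EventEstimate` and its method names are hypothetical.

```java
class EventEstimate {
    // Server-side metric samples collected per day.
    static long metricSamplesPerDay(long servers, long metricsPerServer, long samplesPerMinute) {
        long minutesPerDay = 24 * 60;
        return servers * metricsPerServer * samplesPerMinute * minutesPerDay;
    }

    // Per-request metrics and log entries produced per day.
    static long requestEventsPerDay(long requestsPerDay, long metricsPerRequest, long logEntriesPerRequest) {
        return requestsPerDay * (metricsPerRequest + logEntriesPerRequest);
    }

    public static void main(String[] args) {
        long metrics = metricSamplesPerDay(100, 100, 60);           // 864,000,000
        long requests = requestEventsPerDay(1_000_000, 50, 1_000);  // 1,050,000,000
        System.out.println(metrics + requests);                     // 1914000000
    }
}
```

Under these assumptions the total is about 1.9 billion events per day, which matches the rounded figure on the slide.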
12.
Producing and Consuming is Easy
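import java.util.Arrays;
import java.util.Iterator;
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.clients.producer.*;

// producerProps and consumerProps are java.util.Properties objects holding the
// serializer and deserializer settings (plus bootstrap.servers and group.id on plain Kafka).
KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps);
ProducerRecord<String, String> event =
    new ProducerRecord<>("/Events:Topic", "MyEvent");
producer.send(event);

// Subscribe to the same topic the producer writes to, then poll forever.
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
consumer.subscribe(Arrays.asList("/Events:Topic"));
while (true) {
    ConsumerRecords<String, String> events = consumer.poll(1000);
    Iterator<ConsumerRecord<String, String>> newEvents = events.iterator();
    while (newEvents.hasNext()) {
        System.out.println(newEvents.next().toString());
    }
}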
/Events:Topic
13.
Producers and Consumers
[Diagram: /Events:Topic sits between producers and consumers; further labels: Analytics, Consumers, Stream Processors.
Producers: Social Platforms; Servers (Logs, Metrics); Sensors; Mobile Apps; Other Apps & Microservices.
Consumers: Alerting Systems; Stream Processing Frameworks; Databases & Search Engines; Dashboards; Other Apps & Microservices.]
14.
Considering a Messaging Platform
• 50-100k messages per second used to be good
– Not really good enough to handle decoupled communication between services
• Kafka model is BLAZING fast
– Kafka 0.9 API with message sizes at 200 bytes
– MapR Streams on a 5-node cluster sustained 18 million events / sec
– Throughput of 3.5 GB/s and over 1.5 trillion events / day
• Manual sharding is not a “great” solution
– Adding more servers should be easy and foolproof, not painful
– Yes, I have lived through this
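The throughput figures above are consistent with each other, as a quick calculation shows. The sketch assumes the 200-byte message size is the only contribution to bytes on the wire (no keys, headers, or protocol overhead); `ThroughputCheck` and its method names are hypothetical.

```java
class ThroughputCheck {
    static long bytesPerSecond(long eventsPerSecond, long bytesPerEvent) {
        return eventsPerSecond * bytesPerEvent;
    }

    static long eventsPerDay(long eventsPerSecond) {
        long secondsPerDay = 24L * 60 * 60;
        return eventsPerSecond * secondsPerDay;
    }

    public static void main(String[] args) {
        long eventsPerSecond = 18_000_000L;
        System.out.println(bytesPerSecond(eventsPerSecond, 200)); // 3600000000
        System.out.println(eventsPerDay(eventsPerSecond));        // 1555200000000
    }
}
```

That is 3.6 billion bytes per second (about 3.35 GiB/s), in the neighborhood of the quoted 3.5 GB/s, and roughly 1.56 trillion events per day, in line with "over 1.5 trillion".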
16.
Event-based Data Drives Applications
Failure Alerts
Real-time application & network monitoring
Trending now
Web
Personalized Offers
Real-time Fraud Detection
Ad optimization
Supply Chain Optimization
18.
Fighting Fraudulent E-Mail
• Phishing attempts
• Malware
• Spam
19.
Prevention Options
• Train people to not click random links in emails
– This will NEVER happen (Honestly!)
• E-mail appliances to prevent users from seeing fraudulent emails
– Most typically require users to intervene
– Costly
20.
Constructing an E-Mail Management Pipeline
[Diagram labels: Postfix Mail Server (MTA); E-Mail Stream; Spam Filters; Phishing Classification; Internal Affairs; Legal Archive; Postfix Mail Server (MTA)]
21.
Benefits of Approach
• Customizable pipeline
• Can learn and apply new policies
– Spam
– Phishing classification
– Fraud attempts
• Retention policies
– Auditable
– Simple search and discovery
– Litigation hold
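The "customizable pipeline" idea above can be sketched as a list of pluggable stages applied to each message. Everything here is hypothetical: `EmailPipelineSketch`, the stage labels, and the toy keyword rules stand in for real spam filters and phishing classifiers, and the in-memory archive list stands in for a retained e-mail stream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class EmailPipelineSketch {
    // A stage pairs a verdict label with the rule that triggers it.
    static final class Stage {
        final String label;
        final Predicate<String> rule;
        Stage(String label, Predicate<String> rule) {
            this.label = label;
            this.rule = rule;
        }
    }

    private final List<Stage> stages = new ArrayList<>();
    // Every message is retained, whatever the verdict, so it stays auditable.
    private final List<String> archive = new ArrayList<>();

    // New policies are added without touching the existing ones.
    EmailPipelineSketch addStage(String label, Predicate<String> rule) {
        stages.add(new Stage(label, rule));
        return this;
    }

    // Returns the label of the first matching stage, or DELIVER if none match.
    String process(String message) {
        archive.add(message);
        for (Stage stage : stages) {
            if (stage.rule.test(message)) {
                return stage.label;
            }
        }
        return "DELIVER";
    }

    int archivedCount() {
        return archive.size();
    }

    public static void main(String[] args) {
        EmailPipelineSketch pipeline = new EmailPipelineSketch()
            .addStage("PHISHING", m -> m.toLowerCase().contains("verify your account"))
            .addStage("SPAM", m -> m.toLowerCase().contains("free money"));
        System.out.println(pipeline.process("Please verify your account now")); // PHISHING
        System.out.println(pipeline.process("Free money inside"));              // SPAM
        System.out.println(pipeline.process("Lunch at noon?"));                 // DELIVER
        System.out.println(pipeline.archivedCount());                           // 3
    }
}
```

Because stages are data rather than code paths, a newly learned policy is just another `addStage` call.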
22.
Fighting Fraudulent Web Traffic
[Diagram labels: Click Stream; Activity Stream; User Activity Profile; Deviation from Normal; Classifiers (Known Bad Classifier, All OK Classifier); Blacklist Activities; Whitelist Activities; Session Alteration Stream; Notify Security]
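One way to read the diagram is as a classifier that checks an activity against known-bad and known-good lists and otherwise measures its deviation from the user's normal profile. The sketch below is an assumption about how such a step could look, not the deck's implementation: `SessionClassifierSketch`, the `Verdict` values, the z-score rule, and the threshold are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SessionClassifierSketch {
    enum Verdict { KNOWN_BAD, ALL_OK, DEVIATION, NORMAL }

    // Lists win first; otherwise compare the observed rate with the
    // user's profile (mean and standard deviation of requests per minute).
    static Verdict classify(String activity, double requestsPerMinute,
                            double profileMean, double profileStdDev,
                            Set<String> blacklist, Set<String> whitelist,
                            double threshold) {
        if (blacklist.contains(activity)) {
            return Verdict.KNOWN_BAD;
        }
        if (whitelist.contains(activity)) {
            return Verdict.ALL_OK;
        }
        double z = profileStdDev == 0
            ? 0
            : Math.abs(requestsPerMinute - profileMean) / profileStdDev;
        return z > threshold ? Verdict.DEVIATION : Verdict.NORMAL;
    }

    public static void main(String[] args) {
        Set<String> blacklist = new HashSet<>(Arrays.asList("credential-stuffing"));
        Set<String> whitelist = new HashSet<>(Arrays.asList("health-check"));
        // Profile: this user normally makes 10 +/- 2 requests per minute.
        System.out.println(classify("browse", 30, 10, 2, blacklist, whitelist, 3));              // DEVIATION
        System.out.println(classify("browse", 11, 10, 2, blacklist, whitelist, 3));              // NORMAL
        System.out.println(classify("credential-stuffing", 1, 10, 2, blacklist, whitelist, 3));  // KNOWN_BAD
    }
}
```

A `DEVIATION` or `KNOWN_BAD` verdict is the kind of result that would feed a session alteration stream or a security notification.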
23.
Similarities between Marketing and Fraud?
Customer 360
• Build a user profile
– What type of content do they like
• Build “segmented” profiles
– Company affiliation
• Dynamically alter website
– Show alternate content
• Kick-off external workflows
– Nurture emails
Website Fraud
• Build a user profile
– What are their normal usage patterns
• Build “segmented” profiles
– What do real users normally do
• Dynamically alter website
– Prevent user functionality
• Kick-off external workflows
– Notify security team
25.
Learn More about Converged Applications
Check out our Converged Application Blueprint
Visit www.mapr.com/appblueprint
26.
Engage with us!
@kingmesal
jscott@mapr.com
kingmesal