4. Yesterday
9/10/2013 Confidential | Copyright 2013 TrendMicro Inc.
~40 Hadoop nodes
~15 Service/user accounts
3 Teams
<50 TB storage
<100 Jobs per day
5. Today
~200 Hadoop nodes
~130 Service/user accounts
11 Teams
~500 TB storage
>16000 Jobs per day
6. 1 MapReduce job submitted every 5.4 seconds
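The 5.4-second figure is consistent with the job volume on the "Today" slide if ">16000 jobs per day" is taken as exactly 16,000 and submissions are assumed to be spread evenly over 24 hours. Both are simplifying assumptions for this check, not statements from the slides. A minimal Python sketch:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def seconds_between_jobs(jobs_per_day: float) -> float:
    """Average gap between submissions, assuming an even spread over the day."""
    return SECONDS_PER_DAY / jobs_per_day


# 16,000 jobs per day is the assumed round figure behind ">16000".
print(round(seconds_between_jobs(16_000), 1))  # prints 5.4
```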
7. Why?
Raw Data
Actionable Intelligence
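14. Network threats are growing explosively
New Unique Malware Discovered
Virus variants of every kind, spam, unknown download sources and more: these network-borne threats evade detection by traditional security systems and keep growing explosively, creating a serious information security threat.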
1M unique malware samples every month
15. Reality Check
[Figure panels: Skyrocketing Volume, Dangerous Risks, Avoiding Detection]
New Unique Threats per Hour (worldwide estimate*), 2007 to 2011: chart values 2400 and 12600 per hour; a new threat every 0.28 seconds
Threats Found in Enterprises (real-world data from 150+ assessments*): Network Worms, Data-Stealing Malware, IRC Bots, Targeting Malware; chart values 42%, 56%, 77%, 100%
Chart axes: COMPLEXITY, DANGER
• 52% of companies failed to report or remediate a cyber breach in 2011. (SAIC, 2011)
• Two new pieces of malware are created every second. (Trend Micro, 2012)
• A cyber intrusion occurs every 5 minutes. (US CERT, 2012)
18. New approach for cyber threat solution
Web Crawler
Trend Micro Endpoint Protection
Trend Micro Mail Protection
Trend Micro Web Protection
Honeypot
CDN / xSP
Researcher Intelligence
3+ Billion Worldwide Sensors
19. SPN: Smart Protection Network
Collects
Protects
Identifies
BIG DATA ANALYTICS (Data Mining, Machine Learning, Modeling, Correlation)
DAILY STATS:
• 7.2 TB data correlated
• 1B IP addresses
• 90K malicious threats identified
• 100+M good files
20. SPN High Level Architecture
Receiver
Trend Message Exchange (Message Bus)
Hadoop Distributed File System (HDFS)
HBase
MapReduce
Adhoc-Query (Pig)
Oozie
CDN/xSP
Log
Honeypot
SPN Feedback
Data Sourcing
APP 1
MySPN Platform
Solr Cloud
API Server/Portal
Service Platform
APP 2
Service Delivery
21. MySPN Ecosystem
Portal & API
Single Entry-Point
SPN Infrastructure
APT KB Service
TopCVE Service
APT KB
VE DB
FB Logs
Census
MySPN Market Place
Service Platform
SSO
New App
OPS RD / Team
Monitor SDK
All My
Guard
Threat Connect
Dashboard
Service Catalog
Census
Profile Alert
New App
Dispatcher
Access
Login
Trender
Need Solution
Customer
Publish
Implement
Operate
Develop Solution
backed-by Data Catalogue
22. SPN Solution Architecture
File
URL
Web / URL
Email
Domain
IP
File Reputation Service
Email Reputation Service
Customer
SmartProtection
Community Intelligence (Feedback loop)
Web Reputation Service
Sourcing
Processing & Analysis
Validate & Create Solution
Quality Assurance
Solution Distribution
Solution Adoption
SPN Correlation
28. How to Scale?
• Unstructured data first
• If you really need structured data
– Use Google Protocol Buffers or
– a JSON string
• Purify your data before processing
• Leverage HBase more
– Design row keys well to prevent hot-spotting
• Use MapReduce to create the Lucene index
• Leverage SolrCloud for complex real-time use cases
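The "JSON string" and "purify your data" bullets above might look like this in practice: each record is kept as one JSON string per line, and malformed or incomplete records are dropped before any heavier processing. This is only a sketch; the field names (`ts`, `ip`, `url`), the `purify` helper, and the validity rules are hypothetical and do not come from the slides.

```python
import json
from typing import Iterable, Iterator

REQUIRED_FIELDS = ("ts", "ip", "url")  # hypothetical schema, for illustration only


def purify(lines: Iterable[str]) -> Iterator[dict]:
    """Yield only records that parse as JSON objects and carry every required field."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # drop unparseable input
        if not isinstance(record, dict):
            continue  # drop JSON values that are not objects
        if all(record.get(field) not in (None, "") for field in REQUIRED_FIELDS):
            yield record


if __name__ == "__main__":
    raw = [
        '{"ts": 1378771200, "ip": "192.0.2.10", "url": "http://example.com/a"}',
        "not json at all",
        '{"ts": 1378771200, "ip": ""}',
    ]
    for rec in purify(raw):
        print(json.dumps(rec, sort_keys=True))
```

Filtering this early keeps bad input out of every later job, so downstream MapReduce or Pig steps do not each need their own defensive parsing.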
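The row-key advice on this slide can be illustrated with key salting, one common way to avoid HBase hot-spotting when keys would otherwise be written in sequential order (for example, by timestamp). This is a sketch of the general technique, not the design used by the presenters; the bucket count, the key layout, and the `salted_row_key` helper are all hypothetical.

```python
import hashlib

NUM_BUCKETS = 16  # assumed value; usually chosen relative to the number of regions


def salted_row_key(source_id: str, timestamp_ms: int, buckets: int = NUM_BUCKETS) -> bytes:
    """Build a row key whose leading salt spreads writes across key ranges.

    The salt is derived from source_id, so all rows of one source share a
    bucket and can still be range-scanned in time order.
    """
    digest = hashlib.md5(source_id.encode("utf-8")).digest()
    salt = digest[0] % buckets
    # Zero-padded fields keep lexicographic order equal to numeric order.
    return f"{salt:02d}|{source_id}|{timestamp_ms:013d}".encode("utf-8")


if __name__ == "__main__":
    for sensor in ("sensor-a", "sensor-b", "sensor-c"):
        print(salted_row_key(sensor, 1378771200000))
```

The trade-off is that a scan across all sources has to fan out over every bucket prefix, so the number of buckets is worth keeping small.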
29. Our Learning
• Have a clear strategy first
• Start small, scale quickly
• Choose the right solution for the right problem