© 2014 MapR Technologies
Objective
• Advanced Persistent Threat (APT)
• Big Data + Threat Intelligence
• Hadoop + Spark Solution
• Example Detection Algorithm Development Scenarios (most of
them are still open problems)
Topics covered in this talk
Advanced Persistent Threat
APT
• Advanced Persistent Threat (APT) is one of the biggest headaches for
IT departments
– The Target compromise
– Countless DDoS attacks (thousands a day, according to Arbor Networks)
– These are only the known cases; they could be just the tip of the iceberg
• Why is APT so prevalent?
– Hacking is no longer just a hobby for smart hackers
– Huge money is involved, including from organized crime
– Political tool (the recent conflict between Ukraine and Russia sparked malware
warfare between them)
– Cyber warfare (Stuxnet)
Overview
APT
• Hard to detect
– New software stacks appear every day without thorough vulnerability testing
• Storm, Spark, YARN, Grails, Play, Spring, Flask, …
– Mobile is even worse
• Particularly Android
• By some estimates, 30% or more of devices worldwide are already compromised
• Antivirus is useful only up to a point
– It takes months to years to define a malware signature
– Zero-day attacks still cannot be prevented
– It has become almost a placebo
• Firewalls are not much use anymore
– A device can be infected when the user takes it outside the firewall perimeter
• Botnets themselves are becoming more complex, with many levels of hierarchy
– Minimal binary delivery
– Surreptitious C&C connections with a complex hierarchy, or even headless peer-to-peer bots (Gameover Zeus botnet)
Status
APT
• Snort / Suricata
– Rule based system
– Community support, pre/post-compromise detection
– Needs constant updates; cannot detect zero-day attacks
– Sourcefire provides a paid service
• Sandbox technology
– Firewall + on-premises detection
– FireEye
• Polymorphism technology
– Shape Security
• Methods based on log data mining
– Splunk / Sumo Logic, Solutionary
Defense, state of the art
APT
• Many security companies worldwide run malware labs and publish
threat reports
• The analysis takes from 2 weeks to several months
• It involves
– Decoding the binary's execution and decrypting its load / config parameters
– A complete timeline analysis, from infection to exploit
– Identifying which devices, IPs, and domain names are involved
• Sometimes analyzing IRC data, or even social network data
– Tracing the botnet connection and verifying the command and control (C&C)
• Can we automate this with Big Data?
Threat Report
APT
Example Annual Threat Report (from FireEye, 2013, Europe)
The top two industries in threat findings were healthcare and finance.
APT
• Configuration (Decrypted)
• ID: F16 08-07-2013
Group:
DNS/Port: Direct: toornt.servegame.com:443,
Proxy DNS/Port:
Proxy Hijack: No
ActiveX Startup Key:
HKLM Startup Entry:
File Name:
Install Path: C:\Documents and Settings\Admin\Local Settings\Temp\morse.exe
Keylog Path: C:\Documents and Settings\Admin\Local Settings\Temp\morse
Inject: No
Process Mutex: gdfgdfgdg
Key Logger Mutex:
ActiveX Startup: No
HKLM Startup: No
Copy To: No
Melt: No
Persistence: No
Keylogger: No
Password: !@#GooD#@!
Example Threat Report (from FireEye)
C&C Servers
toornt.servegame.com
updateo.servegame.com
egypttv.sytes.net
skype.servemp3.com
natco2.no-ip.net
Why does it need a password?
APT
• CHAIN OF EVENTS
• ASSOCIATED DOMAINS
• 192.81.171.13 - www.toonzone.net - Compromised website
• 190.123.47.198 - ilinsting.com - Redirect
• 64.202.116.124 - bgbyhn.in.ua - Fiesta EK
• INFECTION CHAIN OF EVENTS
• 06:40:07 UTC - www.toonzone.net - GET /forums/adult-swim-toonami-forum/
• 06:40:08 UTC - ilinsting.com - GET /szjhmucw.js?3ad1359a5153d640
• 06:40:09 UTC - bgbyhn.in.ua - GET /hdjng94/?2
• 06:40:11 UTC - bgbyhn.in.ua - GET /hdjng94/?25b6d1b1cb76ec625b500e0d560a50040703520d5053520a0706510355090109
• 06:40:12 UTC - bgbyhn.in.ua - GET /hdjng94/?2d8a97d01a056fdd41084e5a0b0c56050752085a0d55540b07570b54080f0708;5110411
• 06:40:14 UTC - bgbyhn.in.ua - GET /hdjng94/?02bb88c62d7306c8534209590a035103050452590c5a530d0501515709000057;5
• 06:40:15 UTC - bgbyhn.in.ua - GET /hdjng94/?02bb88c62d7306c8534209590a035103050452590c5a530d0501515709000057;5;1
• 06:40:42 UTC - bgbyhn.in.ua - GET /hdjng94/?2ad5cdef3fc4ef9851110f0e515f57530757540e5706555d07525700525c065e;6
• 06:40:43 UTC - bgbyhn.in.ua - GET /hdjng94/?2ad5cdef3fc4ef9851110f0e515f57530757540e5706555d07525700525c065e;6;1
• 06:40:43 UTC - bgbyhn.in.ua - GET /hdjng94/?5998786b9c7a1ffe544b580305030457000f0903035a0659000a0a0d0600555a
• 06:40:49 UTC - bgbyhn.in.ua - GET /hdjng94/?59576b00f4cfd03e5641500c04590205000f050c0200000b000a0602075a5308;1;2
• 06:40:49 UTC - bgbyhn.in.ua - GET /hdjng94/?59576b00f4cfd03e5641500c04590205000f050c0200000b000a0602075a5308;1;2;1
Another Example, Fiesta EK, from malware-traffic-analysis.net
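The chain of events above follows a regular "time UTC - host - GET path" shape, so it can be summarized automatically. The following is a minimal sketch, assuming that line format holds for every event; the function names `parse_events` and `summarize` are illustrative, not taken from the talk.

```python
import re
from collections import Counter
from datetime import datetime

# Lines shaped like "06:40:07 UTC - host - GET /path" (format assumed from the slide).
EVENT_RE = re.compile(r"(\d{2}:\d{2}:\d{2}) UTC - (\S+) - (GET|POST) (\S+)")

def parse_events(lines):
    """Return (time, host, method, path) tuples for every line that matches."""
    events = []
    for line in lines:
        m = EVENT_RE.search(line)
        if m:
            ts = datetime.strptime(m.group(1), "%H:%M:%S")
            events.append((ts, m.group(2), m.group(3), m.group(4)))
    return events

def summarize(events):
    """Count requests per host and measure the duration of the chain in seconds."""
    per_host = Counter(host for _, host, _, _ in events)
    times = [ts for ts, _, _, _ in events]
    span = (max(times) - min(times)).total_seconds() if times else 0.0
    return per_host, span

sample = [
    "06:40:07 UTC - www.toonzone.net - GET /forums/adult-swim-toonami-forum/",
    "06:40:08 UTC - ilinsting.com - GET /szjhmucw.js?3ad1359a5153d640",
    "06:40:09 UTC - bgbyhn.in.ua - GET /hdjng94/?2",
]
per_host, span = summarize(parse_events(sample))
print(per_host, span)
```

On the full chain, a host that receives a burst of requests within seconds of a redirect (here the exploit-kit host) stands out in `per_host`.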
Big Data + Threat Intelligence
• Tom Brady + Gisele Bundchen
– An Ideal Marriage
• With all the advances in computing and data resources, why can't we
automate malware detection?
• Big Data is an ideal platform for malware study
– Simple packet capture can easily produce petabytes of data, even from small offices
– Huge storage + fast processing are essential for malware study
• Various aspects of Big Data fit well with malware analysis
– Streaming analysis (Storm, Spark Streaming)
– Volumetric data analysis (Spark)
– Graph analysis
• View network devices as nodes and discover the command and control role
• Each URL can be a node and the basis of graph analysis
– Visualization for intuitive analysis
Pros
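One way to read "view network devices as nodes, discover command and control role" is to count how many distinct devices contact each destination. The sketch below uses that fan-in count as a crude first signal; the threshold `min_sources` and the sample addresses are assumptions for illustration, and popular legitimate services would need to be whitelisted in practice.

```python
from collections import defaultdict

def fan_in(edges):
    """Map each destination node to the set of distinct sources that contacted it."""
    sources = defaultdict(set)
    for src, dst in edges:
        sources[dst].add(src)
    return sources

def cc_candidates(edges, min_sources=3):
    """Destinations contacted by at least `min_sources` distinct nodes, busiest first.

    `min_sources` is an illustrative threshold, not a value from the talk.
    """
    hits = [(dst, len(srcs)) for dst, srcs in fan_in(edges).items()
            if len(srcs) >= min_sources]
    return sorted(hits, key=lambda item: (-item[1], item[0]))

edges = [
    ("10.0.0.1", "203.0.113.9"),
    ("10.0.0.2", "203.0.113.9"),
    ("10.0.0.3", "203.0.113.9"),
    ("10.0.0.1", "198.51.100.7"),
    ("10.0.0.1", "203.0.113.9"),  # repeat contact, counted once
]
print(cc_candidates(edges))
```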
Big Data + Threat Intelligence
• Anomaly detection
– Typical log analysis
– Routers and switches have built-in alarm settings
• Simple level-based (threshold) detection
– Is this going to be useful?
• How much can it tell you?
• Machine Learning
– Not very useful
• It is not easy to get labeled data
• Even with labeled data, it is very hard to develop a feature set
– If the feature set is known, hackers will revise their code
• Zero-day attacks do not come with a label
– Modeling needs a complete understanding of criminal minds
Cons (e.g., Gwyneth Paltrow and Chris Martin)
Big Data + Threat Intelligence
An Example Architecture
[Diagram components]
• Input: packet stream or binary downloads, read by a Storm spout
• Storm bolt: packet analysis (alert and store packet data)
• Storm bolt: metadata extraction
• Store to HDFS
• Spark analysis
[Diagram callouts]
• The packet stream truly reveals malware expression, compared to logs
• Connect the dots with strong in-memory processing
Big Data + Threat Intelligence
• Reduce false positives
– The mantra of the malware detection business
• Big data is a great resource for reducing false positives (Type I
errors)
– As soon as an algorithm is updated, test it against the Big
Data test cases
– The test can even be applied to old cases, greatly reducing false
positives
• Previously, we typically had to sample the test data, giving older data a lower weight
False Positives
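The regression idea above (re-run every algorithm update against the full archive of cases, not a sample) can be sketched as follows. The two detector versions, the feature names, and the archive contents are hypothetical and exist only to show the comparison.

```python
def false_positive_rate(detector, benign_cases):
    """Fraction of known-benign cases that `detector` flags as malicious."""
    if not benign_cases:
        return 0.0
    flagged = sum(1 for case in benign_cases if detector(case))
    return flagged / len(benign_cases)

# Two hypothetical detector versions; each case is a dict of flow features.
def detector_v1(case):
    return case["bytes_out"] > 1_000_000

def detector_v2(case):
    return case["bytes_out"] > 1_000_000 and case["dest_port"] not in (80, 443)

# Archive of cases already confirmed benign (made-up values).
benign_archive = [
    {"bytes_out": 2_000_000, "dest_port": 443},   # large but ordinary HTTPS upload
    {"bytes_out": 500, "dest_port": 53},
    {"bytes_out": 3_000_000, "dest_port": 6667},
    {"bytes_out": 1_200, "dest_port": 80},
]

print(false_positive_rate(detector_v1, benign_archive))
print(false_positive_rate(detector_v2, benign_archive))
```

Running both versions over the same archive makes it visible whether an update lowers or raises the false positive rate before it is deployed.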
Big Data + Threat Intelligence
• Wireshark (tshark) is the go-to software for packet analysis
– But it is a huge memory hog
• Need to put packet data onto HDFS
• Packetpig (open-sourced by Packetloop and showcased by Hortonworks) has been developed for this
– Much more work is needed before it comes anywhere near the strength of
Wireshark
• Need to design efficient metadata collection and storage
mechanisms
– Use Snort or a custom C library to extract the essential flow data
• A flow is a 5-tuple: source IP, source port, destination IP, destination port, protocol
• Flow is the de facto unit of network malware expression analysis
Packet to HDFS
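To make the flow 5-tuple concrete, here is a minimal sketch that groups packet records into flows and keeps only per-flow metadata rather than payloads. The packet record layout (dict keys such as `ts` and `length`) is an assumption for illustration; a real extractor would fill these from a capture.

```python
from collections import namedtuple

# The 5-tuple that identifies a (unidirectional) flow.
FlowKey = namedtuple("FlowKey", "src_ip src_port dst_ip dst_port protocol")

def aggregate_flows(packets):
    """Group packets by 5-tuple and accumulate packet count, bytes, and time range.

    Each packet is a dict with keys src_ip, src_port, dst_ip, dst_port,
    protocol, ts (seconds) and length (bytes); this layout is assumed.
    """
    flows = {}
    for p in packets:
        key = FlowKey(p["src_ip"], p["src_port"], p["dst_ip"], p["dst_port"], p["protocol"])
        f = flows.setdefault(
            key, {"packets": 0, "bytes": 0, "first_ts": p["ts"], "last_ts": p["ts"]}
        )
        f["packets"] += 1
        f["bytes"] += p["length"]
        f["first_ts"] = min(f["first_ts"], p["ts"])
        f["last_ts"] = max(f["last_ts"], p["ts"])
    return flows

packets = [
    {"src_ip": "10.0.0.5", "src_port": 49152, "dst_ip": "203.0.113.9",
     "dst_port": 443, "protocol": "tcp", "ts": 0.0, "length": 60},
    {"src_ip": "10.0.0.5", "src_port": 49152, "dst_ip": "203.0.113.9",
     "dst_port": 443, "protocol": "tcp", "ts": 0.5, "length": 1500},
    {"src_ip": "10.0.0.5", "src_port": 49153, "dst_ip": "198.51.100.7",
     "dst_port": 53, "protocol": "udp", "ts": 1.0, "length": 80},
]
flows = aggregate_flows(packets)
for key, meta in flows.items():
    print(key, meta)
```

Storing only these per-flow records is far smaller than storing raw packets, which is the point of designing a metadata layer before writing to HDFS.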
Big Data + Threat Intelligence
• Big Data provides the opportunity to map out all the IP addresses
used on a particular network
• Through graph analysis, find rogue IP addresses
• Use geographical information with IPs to find abnormal
connection behavior
• DNS provides many insights into malware connections
– A static IP is of little use for malware control, so malware leans on DNS
– Fast flux
– Awkward-looking names
IP based analysis
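Fast flux shows up in DNS data as one domain resolving to many different IPs with short TTLs. A minimal sketch of that check follows; the thresholds `min_ips` and `max_ttl` and the sample records are assumptions, and content delivery networks can trip the same rule, so hits are leads rather than verdicts.

```python
from collections import defaultdict

def fast_flux_suspects(dns_answers, min_ips=5, max_ttl=300):
    """Return domains that resolved to many distinct IPs, all with short TTLs.

    dns_answers: iterable of (domain, ip, ttl_seconds) tuples.
    min_ips and max_ttl are illustrative thresholds.
    """
    ips = defaultdict(set)
    ttls = defaultdict(list)
    for domain, ip, ttl in dns_answers:
        ips[domain].add(ip)
        ttls[domain].append(ttl)
    return sorted(
        domain for domain in ips
        if len(ips[domain]) >= min_ips and max(ttls[domain]) <= max_ttl
    )

answers = [("flux.example", f"192.0.2.{i}", 60) for i in range(1, 6)]
answers.append(("stable.example", "198.51.100.7", 86400))
print(fast_flux_suspects(answers))
```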
Big Data + Threat Intelligence
• Flow is an essential malware analysis unit
• Flow identifies
– Who’s connecting to whom
• Frequency, duration, communication bandwidth
• The application can be identified from the flow
– Port, actual content
– E.g., Palo Alto Networks
• Normal flow vs. abnormal flow
– With enough data, we could potentially identify normal flow
• Use the first 16 bytes?
– Cluster analysis to detect anomalies
Flow to detect malware expression
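As a simpler stand-in for the cluster analysis mentioned above, the sketch below learns a baseline (mean and standard deviation) from flows assumed to be normal and flags new values that sit far from it. This is a z-score test, not clustering; the feature (bytes per flow), the threshold, and the sample values are assumptions.

```python
import statistics

def fit_baseline(values):
    """Mean and population standard deviation of a feature over normal flows."""
    return statistics.mean(values), statistics.pstdev(values)

def is_abnormal(value, baseline, threshold=3.0):
    """True when `value` lies more than `threshold` standard deviations from the mean."""
    mean, stdev = baseline
    if stdev == 0:
        return value != mean
    return abs(value - mean) / stdev > threshold

# Bytes per flow observed during a period assumed to be clean (made-up values).
normal_bytes = [500, 520, 480, 510, 490]
baseline = fit_baseline(normal_bytes)

print(is_abnormal(505, baseline))
print(is_abnormal(50_000, baseline))
```

Because network behavior drifts, the baseline would need to be refit regularly on recent data.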
Spark on Hadoop
Apache Spark
• spark.apache.org
• github.com/apache/spark
• user@spark.apache.org
• Originally developed in 2009 at UC Berkeley's AMPLab
• Fully open sourced in 2010; now at the Apache Software Foundation
Easy: Example – IP Count
• Hadoop MapReduce
import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;

// Mapper: emit (token, 1) for every whitespace-separated token in a line
public static class WordCountMapClass extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, IntWritable> {
  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();
  public void map(LongWritable key, Text value,
                  OutputCollector<Text, IntWritable> output,
                  Reporter reporter) throws IOException {
    String line = value.toString();
    StringTokenizer itr = new StringTokenizer(line);
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      output.collect(word, one);
    }
  }
}
// Reducer: sum the counts collected for each key
public static class WordCountReduce extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, IntWritable> {
  public void reduce(Text key, Iterator<IntWritable> values,
                     OutputCollector<Text, IntWritable> output,
                     Reporter reporter) throws IOException {
    int sum = 0;
    while (values.hasNext()) {
      sum += values.next().get();
    }
    output.collect(key, new IntWritable(sum));
  }
}
• Spark
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

// sparkHome and jars may be passed as optional third and fourth arguments
val spark = new SparkContext(master, appName)
val file = spark.textFile("hdfs://...")
// The IP is the first comma-separated field of each line
val counts = file.map(line => line.split(",")(0))
                 .map(ip => (ip, 1))
                 .reduceByKey(_ + _)
counts.saveAsTextFile("hdfs://...")
Fast: Using RAM, Operator Graphs
• In-memory Caching
• Data partitions read from RAM instead of disk
• Operator Graphs
• Scheduling Optimizations
• Fault Tolerance
[Figure: operator graph over RDDs A–F, divided into Stages 1–3 by map, filter, groupBy, and join operations; the legend distinguishes RDDs from cached partitions]
SPARK RDD
• The Resilient Distributed Dataset (RDD) is the key data structure,
(potentially) held in memory
• An RDD is distributed over the Hadoop nodes and typically resides in
memory
• Transform an RDD, then get data out of it; evaluation is lazy
– Two sets of interfaces are provided: one for transformations, the other for
actions (e.g., count, save)
• Most of the interface is quite similar to Lisp operations and SQL
operations
• Use persist (cache) to keep an RDD in memory
RDD
Working With RDDs
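[Diagram: transformations turn one RDD into another; an action on an RDD returns a value]
textFile = sc.textFile("SomeFile.txt")   # sc is the SparkContext provided by the PySpark shell
linesWithSpark = textFile.filter(lambda line: "Spark" in line)   # transformation (lazy)
linesWithSpark.count()   # action; the slide shows 74
linesWithSpark.first()   # action; the slide shows "# Apache Spark"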
Spark, Hadoop Malware Analysis
Why useful
• Packet stream
• Real-time event processing
• Fast classification or anomaly detection
• Construct groups of suspected flows in an RDD
– E.g., suspected DNS tunnels, IRC communications
• Analyze with Spark on the RDD, in memory
– Connect the dots: flows, syslogs, and events
– Huge advantage over Wireshark!
• Store in HDFS for easy access and use HBase for database support
SPARK and Hadoop
• Connecting the dots needs huge storage and fast access
– There may be a need to go back in time to find correlating events
• A DDoS attack found today + spotty IRC chat 10 days ago + NXDOMAIN
events 20 days ago, all from the suspected infected machine
– Sometimes it takes months to learn that a domain the machine contacted is suspicious (e.g.,
scored in VirusTotal)
– Then see whether these patterns match known malware expressions
– Approximate matching technology is quite important here
» HMM and correlation modeling
– HDFS + HBase would be a good solution
• Store relevant temporal data
• Retrieve it quickly according to the criteria
• Spark + Hadoop provide a fast development cycle
– From prototype to evaluation
Why Hadoop
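The "go back in time" correlation described above (a DDoS today, IRC chat 10 days ago, NXDOMAIN events 20 days ago) can be sketched as a lookback query over stored events. The event kinds, the 30-day window, and the sample hosts are assumptions; at scale the same logic would run over data kept in HDFS or HBase.

```python
from datetime import date, timedelta

def correlated_hosts(events, trigger="ddos", precursors=("irc", "nxdomain"),
                     lookback_days=30):
    """Hosts whose trigger event was preceded by every precursor kind within the window.

    events: iterable of (host, day, kind) tuples, where day is a datetime.date.
    """
    by_host = {}
    for host, day, kind in events:
        by_host.setdefault(host, []).append((day, kind))
    matches = []
    for host, items in by_host.items():
        for day, kind in items:
            if kind != trigger:
                continue
            window_start = day - timedelta(days=lookback_days)
            seen = {k for d, k in items if window_start <= d < day}
            if all(p in seen for p in precursors):
                matches.append(host)
                break
    return sorted(matches)

today = date(2014, 6, 30)
events = [
    ("10.0.0.5", today, "ddos"),
    ("10.0.0.5", today - timedelta(days=10), "irc"),
    ("10.0.0.5", today - timedelta(days=20), "nxdomain"),
    ("10.0.0.8", today, "ddos"),   # no precursors on record
]
print(correlated_hosts(events))
```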
Example Detection Algorithm Development Scenarios
Introduction to Botnet (Terminology)
[Diagram: Bot Master, Bots, Code Server, IRC Server, and Victim, linked by IRC Channel, C&C Traffic, Updates, and Attack connections]
Old-days botnet operation, just for reference
Companies are interested in finding these on their premises
(Malware Expression) Detection Phases
• Pre Infection Detection
– Intrusion Detection System
• Active Infection Detection
– Recruit and Reconnaissance in the internal network
• Post Infection Detection
– Exploit and Monetize
Pre Infection Detection
• Detect suspicious URLs
– When a device tries to contact or download from a suspicious URL, block it
• How it works
– If suspicious or unknown content is detected, send it to the back-end big
data deep-analysis engine
– Update the list of suspicious IPs, domain names, and URLs
– Update the hash of the binary
– Regularly remove old hashes and suspicious URLs
CAMP
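The update-and-expire cycle above can be sketched as a small blocklist that stores indicators (URLs, domains, IPs, or binary hashes) with the time they were last reported and prunes stale ones. The class name, the use of SHA-256, and the integer timestamps are assumptions for illustration.

```python
import hashlib

class Blocklist:
    """Indicators (URL, domain, IP, or binary hash) with a last-reported time."""

    def __init__(self, max_age_seconds):
        self.max_age = max_age_seconds
        self.entries = {}  # indicator -> time it was last reported

    def add(self, indicator, now):
        self.entries[indicator] = now

    def add_binary(self, data, now):
        """Store the SHA-256 digest of a suspicious binary and return it."""
        digest = hashlib.sha256(data).hexdigest()
        self.add(digest, now)
        return digest

    def is_blocked(self, indicator):
        return indicator in self.entries

    def prune(self, now):
        """Drop entries older than max_age; return how many were removed."""
        stale = [k for k, seen in self.entries.items() if now - seen > self.max_age]
        for k in stale:
            del self.entries[k]
        return len(stale)

bl = Blocklist(max_age_seconds=100)
bl.add("bad.example/payload", now=0)
digest = bl.add_binary(b"suspicious bytes", now=90)
removed = bl.prune(now=150)
print(removed, bl.is_blocked("bad.example/payload"), bl.is_blocked(digest))
```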
Ongoing Infection Detection
• How it works
– Detect suspicious internal behavior
– Develop a normal behavioral model for the target customer site
– Detect abnormal authentication behavior, e.g., Kerberos, LDAP
– Detect suspicious data movement
– Detect suspicious port usage
– Detect tunnels
• It is highly important to leverage Big Data to develop a sustainable
normal behavioral model and keep it updated, because network data and
models are constantly changing.
• Consult with security experts to define the measurement points
In-network infection propagation
Post Infection Detection
• HTTP and DNS are the most frequently abused protocols
– Firewalls let these ports through
– If needed, play man in the middle for SSL data inspection
• Ill-formed HTTP header detection
– Abnormal location
– Abnormal referrer
– Abnormal User-Agent
– Abnormal size
• Abnormal HTTP POST detection (e.g., entropy analysis)
• Ill-formed XML / HTML
• SQL Injection
– SELECT * FROM users WHERE name = '' OR '1'='1';
• LDAP Code Injection
Protocol Abnormality
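Entropy analysis of an HTTP POST body, mentioned above, can be sketched with Shannon entropy over the byte distribution: ordinary form data scores low, while encrypted or compressed exfiltration tends toward 8 bits per byte. Where to place the cutoff is left open here, since it depends on payload length and site traffic.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for empty input, at most 8.0."""
    if not data:
        return 0.0
    total = len(data)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(data).values())

form_body = b"user=alice&action=login&remember=true"
uniform_body = bytes(range(256))  # every byte value once: the maximum-entropy case

print(shannon_entropy(form_body))
print(shannon_entropy(uniform_body))
```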
Collect malware expression samples → Develop feature set with Hadoop and SME → Deploy and continually update the model
Post Infection Detection
• Click Fraud
• Like Fraud
• DDoS
• SPAM
Volumetric Abnormality
Post Infection Detection
• Cadence
• Weird domain name resolution
• Fast Fluxing domain names
• Abnormal IRC traffic behavior
• Abnormal Twitter behavior
• Abnormal Facebook behavior
Command and Control Contact
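Cadence, the first item above, refers to bots checking in with their controller at regular intervals. One hedged way to measure it is the coefficient of variation of the gaps between contacts: values near zero indicate a machine-like rhythm. The function name and the sample timestamps are assumptions, and real bots often add jitter, so this is a starting feature only.

```python
import statistics

def beacon_score(timestamps):
    """Coefficient of variation of inter-arrival gaps; near 0 means a regular cadence.

    Returns None when there are fewer than two gaps or the mean gap is zero.
    """
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    if len(gaps) < 2:
        return None
    mean_gap = statistics.mean(gaps)
    if mean_gap == 0:
        return None
    return statistics.pstdev(gaps) / mean_gap

bot_like = [0, 60, 120, 180, 240]      # contact every 60 seconds
human_like = [0, 5, 200, 210, 900]     # bursty, irregular browsing

print(beacon_score(bot_like))
print(beacon_score(human_like))
```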
DGA (domain generation algorithm)
ClickSecurity.com
What features would you use?
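As one possible answer to the question above, the sketch below computes a few lexical features that tend to separate algorithmically generated names from human-chosen ones. The feature set is an assumption, not the talk's answer; it is applied here to a domain from the Fiesta EK example earlier in the deck.

```python
VOWELS = set("aeiou")

def domain_features(domain):
    """Lexical features of the leftmost label of a domain name."""
    label = domain.lower().split(".")[0]
    letters = [c for c in label if c.isalpha()]
    digits = sum(c.isdigit() for c in label)
    vowels = sum(c in VOWELS for c in letters)
    longest_run = run = 0
    for c in label:
        # Extend the current consonant run, or reset it on a vowel, digit, or hyphen.
        run = run + 1 if (c.isalpha() and c not in VOWELS) else 0
        longest_run = max(longest_run, run)
    return {
        "length": len(label),
        "digit_ratio": digits / len(label) if label else 0.0,
        "vowel_ratio": vowels / len(letters) if letters else 0.0,
        "max_consonant_run": longest_run,
    }

print(domain_features("google.com"))
print(domain_features("bgbyhn.in.ua"))
```

A label with no vowels and a long consonant run, like the second one, is the kind of "awkward name" a classifier could pick up from these features.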
Conclusion
• Threat Intelligence and Big Data are very hot topics
• Big Data is the ideal analysis platform for malware expression analysis
– Caution: remember the cons
– Useful for efficiently connecting the dots
• Big Data enables
– Persistent model building and updating
– Reducing false positives through exhaustive data checks rather than spot checks
• Hadoop / Spark provide an ideal platform for malware expression analysis
– Spark provides strong in-memory processing power for complex malware data analysis
with simpler, scripting-level coding
• Scala
– MapR provides the fastest data access on Hadoop nodes
• M7
• MapR is the better Hadoop
• Don't underestimate the convenience of NFS and volumes
• Questions are welcome; send them to syoon@maprtech.com,
mvasquez@maprtech.com, nestrada@maprtech.com

Extracting the Malware Signal from Internet NoiseExtracting the Malware Signal from Internet Noise
Extracting the Malware Signal from Internet NoiseEndgameInc
 
Security Breakout Session
Security Breakout Session Security Breakout Session
Security Breakout Session Splunk
 
Complete notes security
Complete notes securityComplete notes security
Complete notes securityKitkat Emoo
 
Distributed tracing
Distributed tracingDistributed tracing
Distributed tracingnishantmodak
 
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...Codemotion
 
Distributed Sensor Data Contextualization for Threat Intelligence Analysis
Distributed Sensor Data Contextualization for Threat Intelligence AnalysisDistributed Sensor Data Contextualization for Threat Intelligence Analysis
Distributed Sensor Data Contextualization for Threat Intelligence AnalysisJason Trost
 
MMIX Peering Forum and MMNOG 2020: Packet Analysis for Network Security
MMIX Peering Forum and MMNOG 2020: Packet Analysis for Network SecurityMMIX Peering Forum and MMNOG 2020: Packet Analysis for Network Security
MMIX Peering Forum and MMNOG 2020: Packet Analysis for Network SecurityAPNIC
 
MapR and Lucidworks Joint Webinar 2012
MapR and Lucidworks Joint Webinar 2012MapR and Lucidworks Joint Webinar 2012
MapR and Lucidworks Joint Webinar 2012MapR Technologies
 

Ähnlich wie Hadoop / Spark on Malware Expression (20)

Luiz eduardo. introduction to mobile snitch
Luiz eduardo. introduction to mobile snitchLuiz eduardo. introduction to mobile snitch
Luiz eduardo. introduction to mobile snitch
 
Crowd sourced intelligence built into search over hadoop
Crowd sourced intelligence built into search over hadoopCrowd sourced intelligence built into search over hadoop
Crowd sourced intelligence built into search over hadoop
 
Combating Advanced Persistent Threats with Flow-based Security Monitoring
Combating Advanced Persistent Threats with Flow-based Security MonitoringCombating Advanced Persistent Threats with Flow-based Security Monitoring
Combating Advanced Persistent Threats with Flow-based Security Monitoring
 
Data torrent meetup-productioneng
Data torrent meetup-productionengData torrent meetup-productioneng
Data torrent meetup-productioneng
 
Fighting cyber fraud with hadoop v2
Fighting cyber fraud with hadoop v2Fighting cyber fraud with hadoop v2
Fighting cyber fraud with hadoop v2
 
Apache Metron Meetup May 4, 2016 - Big data cybersecurity
Apache Metron Meetup May 4, 2016 - Big data cybersecurityApache Metron Meetup May 4, 2016 - Big data cybersecurity
Apache Metron Meetup May 4, 2016 - Big data cybersecurity
 
Good Guys vs Bad Guys: Using Big Data to Counteract Advanced Threats
Good Guys vs Bad Guys: Using Big Data to Counteract Advanced ThreatsGood Guys vs Bad Guys: Using Big Data to Counteract Advanced Threats
Good Guys vs Bad Guys: Using Big Data to Counteract Advanced Threats
 
Architecting R into Storm Application Development Process
Architecting R into Storm Application Development ProcessArchitecting R into Storm Application Development Process
Architecting R into Storm Application Development Process
 
R + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San Jose
R + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San JoseR + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San Jose
R + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San Jose
 
Ariu - Ph.D. Defense Slides
Ariu - Ph.D. Defense SlidesAriu - Ph.D. Defense Slides
Ariu - Ph.D. Defense Slides
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptx
 
Extracting the Malware Signal from Internet Noise
Extracting the Malware Signal from Internet NoiseExtracting the Malware Signal from Internet Noise
Extracting the Malware Signal from Internet Noise
 
Extracting the Malware Signal from Internet Noise
Extracting the Malware Signal from Internet NoiseExtracting the Malware Signal from Internet Noise
Extracting the Malware Signal from Internet Noise
 
Security Breakout Session
Security Breakout Session Security Breakout Session
Security Breakout Session
 
Complete notes security
Complete notes securityComplete notes security
Complete notes security
 
Distributed tracing
Distributed tracingDistributed tracing
Distributed tracing
 
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
 
Distributed Sensor Data Contextualization for Threat Intelligence Analysis
Distributed Sensor Data Contextualization for Threat Intelligence AnalysisDistributed Sensor Data Contextualization for Threat Intelligence Analysis
Distributed Sensor Data Contextualization for Threat Intelligence Analysis
 
MMIX Peering Forum and MMNOG 2020: Packet Analysis for Network Security
MMIX Peering Forum and MMNOG 2020: Packet Analysis for Network SecurityMMIX Peering Forum and MMNOG 2020: Packet Analysis for Network Security
MMIX Peering Forum and MMNOG 2020: Packet Analysis for Network Security
 
MapR and Lucidworks Joint Webinar 2012
MapR and Lucidworks Joint Webinar 2012MapR and Lucidworks Joint Webinar 2012
MapR and Lucidworks Joint Webinar 2012
 

Mehr von MapR Technologies

Converging your data landscape
Converging your data landscapeConverging your data landscape
Converging your data landscapeMapR Technologies
 
ML Workshop 2: Machine Learning Model Comparison & Evaluation
ML Workshop 2: Machine Learning Model Comparison & EvaluationML Workshop 2: Machine Learning Model Comparison & Evaluation
ML Workshop 2: Machine Learning Model Comparison & EvaluationMapR Technologies
 
Self-Service Data Science for Leveraging ML & AI on All of Your Data
Self-Service Data Science for Leveraging ML & AI on All of Your DataSelf-Service Data Science for Leveraging ML & AI on All of Your Data
Self-Service Data Science for Leveraging ML & AI on All of Your DataMapR Technologies
 
Enabling Real-Time Business with Change Data Capture
Enabling Real-Time Business with Change Data CaptureEnabling Real-Time Business with Change Data Capture
Enabling Real-Time Business with Change Data CaptureMapR Technologies
 
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...MapR Technologies
 
ML Workshop 1: A New Architecture for Machine Learning Logistics
ML Workshop 1: A New Architecture for Machine Learning LogisticsML Workshop 1: A New Architecture for Machine Learning Logistics
ML Workshop 1: A New Architecture for Machine Learning LogisticsMapR Technologies
 
Machine Learning Success: The Key to Easier Model Management
Machine Learning Success: The Key to Easier Model ManagementMachine Learning Success: The Key to Easier Model Management
Machine Learning Success: The Key to Easier Model ManagementMapR Technologies
 
Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action MapR Technologies
 
Live Tutorial – Streaming Real-Time Events Using Apache APIs
Live Tutorial – Streaming Real-Time Events Using Apache APIsLive Tutorial – Streaming Real-Time Events Using Apache APIs
Live Tutorial – Streaming Real-Time Events Using Apache APIsMapR Technologies
 
Bringing Structure, Scalability, and Services to Cloud-Scale Storage
Bringing Structure, Scalability, and Services to Cloud-Scale StorageBringing Structure, Scalability, and Services to Cloud-Scale Storage
Bringing Structure, Scalability, and Services to Cloud-Scale StorageMapR Technologies
 
Live Machine Learning Tutorial: Churn Prediction
Live Machine Learning Tutorial: Churn PredictionLive Machine Learning Tutorial: Churn Prediction
Live Machine Learning Tutorial: Churn PredictionMapR Technologies
 
An Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data PlatformAn Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data PlatformMapR Technologies
 
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...MapR Technologies
 
Best Practices for Data Convergence in Healthcare
Best Practices for Data Convergence in HealthcareBest Practices for Data Convergence in Healthcare
Best Practices for Data Convergence in HealthcareMapR Technologies
 
Geo-Distributed Big Data and Analytics
Geo-Distributed Big Data and AnalyticsGeo-Distributed Big Data and Analytics
Geo-Distributed Big Data and AnalyticsMapR Technologies
 
MapR Product Update - Spring 2017
MapR Product Update - Spring 2017MapR Product Update - Spring 2017
MapR Product Update - Spring 2017MapR Technologies
 
3 Benefits of Multi-Temperature Data Management for Data Analytics
3 Benefits of Multi-Temperature Data Management for Data Analytics3 Benefits of Multi-Temperature Data Management for Data Analytics
3 Benefits of Multi-Temperature Data Management for Data AnalyticsMapR Technologies
 
Cisco & MapR bring 3 Superpowers to SAP HANA Deployments
Cisco & MapR bring 3 Superpowers to SAP HANA DeploymentsCisco & MapR bring 3 Superpowers to SAP HANA Deployments
Cisco & MapR bring 3 Superpowers to SAP HANA DeploymentsMapR Technologies
 
MapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR Technologies
 
Evolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQLEvolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQLMapR Technologies
 

Mehr von MapR Technologies (20)

Converging your data landscape
Converging your data landscapeConverging your data landscape
Converging your data landscape
 
ML Workshop 2: Machine Learning Model Comparison & Evaluation
ML Workshop 2: Machine Learning Model Comparison & EvaluationML Workshop 2: Machine Learning Model Comparison & Evaluation
ML Workshop 2: Machine Learning Model Comparison & Evaluation
 
Self-Service Data Science for Leveraging ML & AI on All of Your Data
Self-Service Data Science for Leveraging ML & AI on All of Your DataSelf-Service Data Science for Leveraging ML & AI on All of Your Data
Self-Service Data Science for Leveraging ML & AI on All of Your Data
 
Enabling Real-Time Business with Change Data Capture
Enabling Real-Time Business with Change Data CaptureEnabling Real-Time Business with Change Data Capture
Enabling Real-Time Business with Change Data Capture
 
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ...
 
ML Workshop 1: A New Architecture for Machine Learning Logistics
ML Workshop 1: A New Architecture for Machine Learning LogisticsML Workshop 1: A New Architecture for Machine Learning Logistics
ML Workshop 1: A New Architecture for Machine Learning Logistics
 
Machine Learning Success: The Key to Easier Model Management
Machine Learning Success: The Key to Easier Model ManagementMachine Learning Success: The Key to Easier Model Management
Machine Learning Success: The Key to Easier Model Management
 
Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action
 
Live Tutorial – Streaming Real-Time Events Using Apache APIs
Live Tutorial – Streaming Real-Time Events Using Apache APIsLive Tutorial – Streaming Real-Time Events Using Apache APIs
Live Tutorial – Streaming Real-Time Events Using Apache APIs
 
Bringing Structure, Scalability, and Services to Cloud-Scale Storage
Bringing Structure, Scalability, and Services to Cloud-Scale StorageBringing Structure, Scalability, and Services to Cloud-Scale Storage
Bringing Structure, Scalability, and Services to Cloud-Scale Storage
 
Live Machine Learning Tutorial: Churn Prediction
Live Machine Learning Tutorial: Churn PredictionLive Machine Learning Tutorial: Churn Prediction
Live Machine Learning Tutorial: Churn Prediction
 
An Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data PlatformAn Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data Platform
 
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...
 
Best Practices for Data Convergence in Healthcare
Best Practices for Data Convergence in HealthcareBest Practices for Data Convergence in Healthcare
Best Practices for Data Convergence in Healthcare
 
Geo-Distributed Big Data and Analytics
Geo-Distributed Big Data and AnalyticsGeo-Distributed Big Data and Analytics
Geo-Distributed Big Data and Analytics
 
MapR Product Update - Spring 2017
MapR Product Update - Spring 2017MapR Product Update - Spring 2017
MapR Product Update - Spring 2017
 
3 Benefits of Multi-Temperature Data Management for Data Analytics
3 Benefits of Multi-Temperature Data Management for Data Analytics3 Benefits of Multi-Temperature Data Management for Data Analytics
3 Benefits of Multi-Temperature Data Management for Data Analytics
 
Cisco & MapR bring 3 Superpowers to SAP HANA Deployments
Cisco & MapR bring 3 Superpowers to SAP HANA DeploymentsCisco & MapR bring 3 Superpowers to SAP HANA Deployments
Cisco & MapR bring 3 Superpowers to SAP HANA Deployments
 
MapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR and Cisco Make IT Better
MapR and Cisco Make IT Better
 
Evolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQLEvolving from RDBMS to NoSQL + SQL
Evolving from RDBMS to NoSQL + SQL
 

Último

Where developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is goingWhere developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is goingFrancesco Corti
 
TrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie WorldTrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie WorldTrustArc
 
20140402 - Smart house demo kit
20140402 - Smart house demo kit20140402 - Smart house demo kit
20140402 - Smart house demo kitJamie (Taka) Wang
 
UiPath Studio Web workshop series - Day 1
UiPath Studio Web workshop series  - Day 1UiPath Studio Web workshop series  - Day 1
UiPath Studio Web workshop series - Day 1DianaGray10
 
Introduction to RAG (Retrieval Augmented Generation) and its application
Introduction to RAG (Retrieval Augmented Generation) and its applicationIntroduction to RAG (Retrieval Augmented Generation) and its application
Introduction to RAG (Retrieval Augmented Generation) and its applicationKnoldus Inc.
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
 
UiPath Studio Web workshop series - Day 4
UiPath Studio Web workshop series - Day 4UiPath Studio Web workshop series - Day 4
UiPath Studio Web workshop series - Day 4DianaGray10
 
Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Muhammad Tiham Siddiqui
 
Patch notes explaining DISARM Version 1.4 update
Patch notes explaining DISARM Version 1.4 updatePatch notes explaining DISARM Version 1.4 update
Patch notes explaining DISARM Version 1.4 updateadam112203
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNeo4j
 
3 Pitfalls Everyone Should Avoid with Cloud Data
3 Pitfalls Everyone Should Avoid with Cloud Data3 Pitfalls Everyone Should Avoid with Cloud Data
3 Pitfalls Everyone Should Avoid with Cloud DataEric D. Schabell
 
How to become a GDSC Lead GDSC MI AOE.pptx
How to become a GDSC Lead GDSC MI AOE.pptxHow to become a GDSC Lead GDSC MI AOE.pptx
How to become a GDSC Lead GDSC MI AOE.pptxKaustubhBhavsar6
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightSafe Software
 
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedInOutage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedInThousandEyes
 
My key hands-on projects in Quantum, and QAI
My key hands-on projects in Quantum, and QAIMy key hands-on projects in Quantum, and QAI
My key hands-on projects in Quantum, and QAIVijayananda Mohire
 
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxEmil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxNeo4j
 
Planetek Italia Srl - Corporate Profile Brochure
Planetek Italia Srl - Corporate Profile BrochurePlanetek Italia Srl - Corporate Profile Brochure
Planetek Italia Srl - Corporate Profile BrochurePlanetek Italia Srl
 
March Patch Tuesday
March Patch TuesdayMarch Patch Tuesday
March Patch TuesdayIvanti
 
2024.03.12 Cost drivers of cultivated meat production.pdf
2024.03.12 Cost drivers of cultivated meat production.pdf2024.03.12 Cost drivers of cultivated meat production.pdf
2024.03.12 Cost drivers of cultivated meat production.pdfThe Good Food Institute
 

Último (20)

Where developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is goingWhere developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is going
 
TrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie WorldTrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie World
 
20140402 - Smart house demo kit
20140402 - Smart house demo kit20140402 - Smart house demo kit
20140402 - Smart house demo kit
 
UiPath Studio Web workshop series - Day 1
UiPath Studio Web workshop series  - Day 1UiPath Studio Web workshop series  - Day 1
UiPath Studio Web workshop series - Day 1
 
Introduction to RAG (Retrieval Augmented Generation) and its application
Introduction to RAG (Retrieval Augmented Generation) and its applicationIntroduction to RAG (Retrieval Augmented Generation) and its application
Introduction to RAG (Retrieval Augmented Generation) and its application
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
UiPath Studio Web workshop series - Day 4
UiPath Studio Web workshop series - Day 4UiPath Studio Web workshop series - Day 4
UiPath Studio Web workshop series - Day 4
 
Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)
 
SheDev 2024
SheDev 2024SheDev 2024
SheDev 2024
 
Patch notes explaining DISARM Version 1.4 update
Patch notes explaining DISARM Version 1.4 updatePatch notes explaining DISARM Version 1.4 update
Patch notes explaining DISARM Version 1.4 update
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4j
 
3 Pitfalls Everyone Should Avoid with Cloud Data
3 Pitfalls Everyone Should Avoid with Cloud Data3 Pitfalls Everyone Should Avoid with Cloud Data
3 Pitfalls Everyone Should Avoid with Cloud Data
 
How to become a GDSC Lead GDSC MI AOE.pptx
How to become a GDSC Lead GDSC MI AOE.pptxHow to become a GDSC Lead GDSC MI AOE.pptx
How to become a GDSC Lead GDSC MI AOE.pptx
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedInOutage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
Outage Analysis: March 5th/6th 2024 Meta, Comcast, and LinkedIn
 
My key hands-on projects in Quantum, and QAI
My key hands-on projects in Quantum, and QAIMy key hands-on projects in Quantum, and QAI
My key hands-on projects in Quantum, and QAI
 
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxEmil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
 
Planetek Italia Srl - Corporate Profile Brochure
Planetek Italia Srl - Corporate Profile BrochurePlanetek Italia Srl - Corporate Profile Brochure
Planetek Italia Srl - Corporate Profile Brochure
 
March Patch Tuesday
March Patch TuesdayMarch Patch Tuesday
March Patch Tuesday
 
2024.03.12 Cost drivers of cultivated meat production.pdf
2024.03.12 Cost drivers of cultivated meat production.pdf2024.03.12 Cost drivers of cultivated meat production.pdf
2024.03.12 Cost drivers of cultivated meat production.pdf
 

Hadoop / Spark on Malware Expression

  • 1. © 2014 MapR Technologies
  • 2. Objective (topics covered in this talk)
      • Advanced Persistent Threat (APT)
      • Big Data + Threat Intelligence
      • Hadoop + Spark Solution
      • Example Detection Algorithm Development Scenarios (most of them are still open problems)
  • 3. Advanced Persistent Threat
  • 4. APT: Overview
      • Advanced Persistent Threat (APT) is one of the biggest headaches in IT departments
          – Target Compromise
          – Countless DDoS attacks (thousands a day, according to Arbor Networks)
          – These are only the known cases, which could be just the tip of the iceberg
      • Why is APT so prevalent?
          – It is no longer a hobby for smart hackers
          – Huge money is involved, with even organized crime behind it
          – Political tool (the recent conflict between Ukraine and Russia sparked malware warfare between them)
          – Cyber warfare (Stuxnet)
  • 5. APT: Status
      • Hard to detect
          – More software stacks appear every day without thorough vulnerability testing
              • Storm, Spark, YARN, Grails, Play, Spring, Flask, …
          – The mobile area is even worse
              • Particularly Android
              • Some estimates say 30% or more of devices worldwide are already compromised
      • Anti-virus is useful only up to a certain point
          – It takes months to years to define a malware signature
          – Zero-day attacks still cannot be prevented
          – It has become almost a placebo
      • Firewalls are not much use anymore
          – A device can be infected when the user takes it outside the firewall perimeter
      • Botnets themselves are becoming more complex, with many hierarchies
          – Minimal binary delivery
          – Surreptitious C&C connections with a complex hierarchy, or even headless peer-to-peer bots (Gameover Zeus botnet)
  • 6. APT: Defense, state of the art
      • Snort / Suricata
          – Rule-based systems
          – Community support, pre/post-compromise detection
          – Constant updates are needed; they cannot detect zero-day attacks
          – Sourcefire provides a paid service
      • Sandbox technology
          – Firewall + on-premise detection
          – FireEye
      • Polymorphing technology
          – ShapeSecurity
      • Log data mining based methods
          – Splunk / Sumo Logic, Solutionary
  • 7. APT: Threat Report
      • Many security labs worldwide run malware labs and generate threat reports
      • The analysis takes from 2 weeks to months
      • It involves
          – Decoding binary execution and decrypting load / config parameters
          – Complete timeline analysis, from infection to exploit
          – Identifying which devices, IPs, and domain names are involved
              • Sometimes analyzing IRC data, or even social network data
          – Identifying the botnet connection and verifying the command and control
      • Can we automate this with Big Data?
  • 8. APT: Example Annual Threat Report (from FireEye, 2013, Europe)
      • The top two industries in threat findings were healthcare and finance
  • 9. APT: Example Threat Report (from FireEye)
      • Configuration (decrypted)
          – ID: F16 08-07-2013
          – Group:
          – DNS/Port: Direct: toornt.servegame.com:443,
          – Proxy DNS/Port:
          – Proxy Hijack: No
          – ActiveX Startup Key:
          – HKLM Startup Entry:
          – File Name:
          – Install Path: C:\Documents and Settings\Admin\Local Settings\Temp\morse.exe
          – Keylog Path: C:\Documents and Settings\Admin\Local Settings\Temp\morse
          – Inject: No
          – Process Mutex: gdfgdfgdg
          – Key Logger Mutex:
          – ActiveX Startup: No
          – HKLM Startup: No
          – Copy To: No
          – Melt: No
          – Persistence: No
          – Keylogger: No
          – Password: !@#GooD#@!
      • C&C servers
          – toornt.servegame.com
          – updateo.servegame.com
          – egypttv.sytes.net
          – skype.servemp3.com
          – natco2.no-ip.net
      • Why does it need a password?
  • 10. APT: Another Example, Fiesta EK, from malware-traffic-analysis.net
      • CHAIN OF EVENTS
      • ASSOCIATED DOMAINS
          – 192.81.171.13 - www.toonzone.net - Compromised website
          – 190.123.47.198 - ilinsting.com - Redirect
          – 64.202.116.124 - bgbyhn.in.ua - Fiesta EK
      • INFECTION CHAIN OF EVENTS
          – 06:40:07 UTC - www.toonzone.net - GET /forums/adult-swim-toonami-forum/
          – 06:40:08 UTC - ilinsting.com - GET /szjhmucw.js?3ad1359a5153d640
          – 06:40:09 UTC - bgbyhn.in.ua - GET /hdjng94/?2
          – 06:40:11 UTC - bgbyhn.in.ua - GET /hdjng94/?25b6d1b1cb76ec625b500e0d560a50040703520d5053520a0706510355090109
          – 06:40:12 UTC - bgbyhn.in.ua - GET /hdjng94/?2d8a97d01a056fdd41084e5a0b0c56050752085a0d55540b07570b54080f0708;5110411
          – 06:40:14 UTC - bgbyhn.in.ua - GET /hdjng94/?02bb88c62d7306c8534209590a035103050452590c5a530d0501515709000057;5
          – 06:40:15 UTC - bgbyhn.in.ua - GET /hdjng94/?02bb88c62d7306c8534209590a035103050452590c5a530d0501515709000057;5;1
          – 06:40:42 UTC - bgbyhn.in.ua - GET /hdjng94/?2ad5cdef3fc4ef9851110f0e515f57530757540e5706555d07525700525c065e;6
          – 06:40:43 UTC - bgbyhn.in.ua - GET /hdjng94/?2ad5cdef3fc4ef9851110f0e515f57530757540e5706555d07525700525c065e;6;1
          – 06:40:43 UTC - bgbyhn.in.ua - GET /hdjng94/?5998786b9c7a1ffe544b580305030457000f0903035a0659000a0a0d0600555a
          – 06:40:49 UTC - bgbyhn.in.ua - GET /hdjng94/?59576b00f4cfd03e5641500c04590205000f050c0200000b000a0602075a5308;1;2
          – 06:40:49 UTC - bgbyhn.in.ua - GET /hdjng94/?59576b00f4cfd03e5641500c04590205000f050c0200000b000a0602075a5308;1;2;1
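The chain of events above is the kind of timeline an analyst reconstructs by hand. As a minimal sketch of automating that step, the snippet below takes the (time, host) pairs from the slide and reports the order in which hosts first appear and how long the infection took. The function name `summarize_chain` and the tuple layout are illustrative assumptions, not part of the original talk.

```python
from datetime import datetime

# (UTC time, host) pairs taken from the Fiesta EK chain of events above.
EVENTS = [
    ("06:40:07", "www.toonzone.net"),
    ("06:40:08", "ilinsting.com"),
    ("06:40:09", "bgbyhn.in.ua"),
    ("06:40:11", "bgbyhn.in.ua"),
    ("06:40:12", "bgbyhn.in.ua"),
    ("06:40:14", "bgbyhn.in.ua"),
    ("06:40:15", "bgbyhn.in.ua"),
    ("06:40:42", "bgbyhn.in.ua"),
    ("06:40:43", "bgbyhn.in.ua"),
    ("06:40:43", "bgbyhn.in.ua"),
    ("06:40:49", "bgbyhn.in.ua"),
    ("06:40:49", "bgbyhn.in.ua"),
]


def summarize_chain(events):
    """Return (hosts in order of first appearance, elapsed seconds)."""
    # dict.fromkeys keeps insertion order, so this de-duplicates hosts
    # while preserving the order in which they were first contacted.
    hosts = list(dict.fromkeys(host for _, host in events))
    times = [datetime.strptime(t, "%H:%M:%S") for t, _ in events]
    elapsed = (max(times) - min(times)).total_seconds()
    return hosts, elapsed


hosts, elapsed = summarize_chain(EVENTS)
print(" -> ".join(hosts), f"({elapsed:.0f} s)")
```

The whole sequence from compromised site to exploit kit spans well under a minute, which is one reason the talk argues for working from packet data rather than coarse logs.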
  • 11. Big Data + Threat Intelligence
  • 12. Big Data + Threat Intelligence: Pros
      • Tom Brady + Gisele Bundchen: an ideal marriage
      • With all the advances in computing and data resources, why can't we automate malware detection?
      • Big Data is an ideal platform for malware study
          – Simple packet capture can easily generate petabytes of data, even from small offices
          – Huge storage + fast processing is essential for malware study
      • Various aspects of Big Data fit well with malware analysis
          – Streaming analysis (Storm, Spark Streaming)
          – Volumetric data analysis (Spark)
          – Graph analysis
              • View network devices as nodes and discover which ones play a command and control role
              • Each URL can be a node and the basis of graph analysis
          – Visualization for intuitive analysis
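To make the graph-analysis bullet concrete, here is a minimal sketch that treats devices as nodes and connections as directed edges, then lists destinations contacted by many distinct sources. High fan-in is only one possible hint of a command and control role (popular legitimate services look the same), and the `min_sources` threshold, function names, and documentation-range IP addresses are all assumptions made for illustration.

```python
from collections import defaultdict


def fan_in(connections):
    """Map each destination node to the set of distinct sources contacting it."""
    sources = defaultdict(set)
    for src, dst in connections:
        sources[dst].add(src)
    return sources


def candidate_controllers(connections, min_sources=3):
    """Destinations contacted by at least `min_sources` distinct sources."""
    return sorted(
        dst for dst, srcs in fan_in(connections).items()
        if len(srcs) >= min_sources
    )


# Hypothetical (source, destination) edges; repeats do not inflate the count.
connections = [
    ("10.0.0.5", "203.0.113.9"),
    ("10.0.0.6", "203.0.113.9"),
    ("10.0.0.7", "203.0.113.9"),
    ("10.0.0.5", "198.51.100.20"),
    ("10.0.0.5", "203.0.113.9"),
]
print(candidate_controllers(connections))
```

On a cluster the same grouping would typically be expressed as a distributed group-by over the edge list; the single-machine version here only shows the shape of the computation.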
  • 13. Big Data + Threat Intelligence: Cons (e.g., Gwyneth Paltrow and Chris Martin)
      • Anomaly detection
          – Typical log analysis
          – Routers / switches have built-in alarm settings
              • Simple level-based detection
          – Is this going to be useful?
              • How much can you tell?
      • Machine learning
          – Not very useful
              • Labeled data is not easy to get
              • Even with labeled data, it is very hard to develop a feature set
                  – If the feature set is known, hackers will revise their code
              • Zero-day attacks do not come with a label
          – Modeling needs a complete understanding of criminal minds
  • 14. © 2014 MapR Technologies 14 Big Data + Threat Intelligence: An Example Architecture • Diagram components: Storm Spout; Packet Stream or Binary Downloads; Storm Bolt (Packet Analysis); Alert and store packet data; Store to HDFS; Spark Analysis; Storm Bolt (Meta Data Extraction) • The packet stream truly reveals malware expression, compared to logs • Connect the dots with strong in-memory processing
  • 15. © 2014 MapR Technologies 15 Big Data + Threat Intelligence: False Positives • Reducing false positives is the mantra of the malware detection business • Big data is a great resource for reducing false positives (Type I errors) – As soon as an algorithm is updated, test it against the Big Data test cases – The test can even be applied to old cases, greatly reducing false positives • Previously, we typically had to sample the test data, weighting old data lower
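The regression test described above can be sketched as a false positive rate computed over the whole labeled archive each time a detector changes. This is a minimal sketch under stated assumptions: the detector is any callable returning True for "malicious", samples are (record, is_malicious) pairs, and `false_positive_rate` is a hypothetical helper name.

```python
def false_positive_rate(detector, samples):
    """Return FP / (FP + TN) over the benign records in `samples`.

    `samples` is an iterable of (record, is_malicious) pairs (assumed schema).
    """
    false_pos = 0
    true_neg = 0
    for record, is_malicious in samples:
        if is_malicious:
            continue  # only benign records can produce false positives
        if detector(record):
            false_pos += 1
        else:
            true_neg += 1
    benign_total = false_pos + true_neg
    return false_pos / benign_total if benign_total else 0.0
```

Running the old and the updated detector over the same archive and comparing the two rates shows whether the update made false positives better or worse on every historical case, not just on a sample.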
  • 16. © 2014 MapR Technologies 16 Big Data + Threat Intelligence: Packet to HDFS • Wireshark (tshark) is the go-to software for packet analysis – It is a huge memory hog • Need to put packet data onto HDFS • Packetpig has been developed from Hortonworks – A lot more has to be done to come anywhere near the strength of Wireshark • Need to design efficient metadata collection and storage mechanisms – Use Snort or a custom C library to extract essential flow data • A flow is a 5-tuple: source IP, destination IP, source port, destination port, protocol • The flow is the de facto unit of network malware expression analysis
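To make the flow definition concrete, here is a minimal sketch of grouping packet records into flows keyed by the 5-tuple. The packet schema (a dict with `src_ip`, `dst_ip`, `src_port`, `dst_port`, `protocol`, `ts`, `length`) and the names `FlowKey` and `aggregate_flows` are assumptions for illustration, not the output format of Snort or any other tool.

```python
from collections import defaultdict, namedtuple

# The 5-tuple that identifies a (unidirectional) flow.
FlowKey = namedtuple("FlowKey", "src_ip dst_ip src_port dst_port protocol")


def aggregate_flows(packets):
    """Group packet dicts into per-flow summaries keyed by FlowKey."""
    flows = defaultdict(
        lambda: {"packets": 0, "bytes": 0, "first_ts": None, "last_ts": None}
    )
    for pkt in packets:
        key = FlowKey(pkt["src_ip"], pkt["dst_ip"],
                      pkt["src_port"], pkt["dst_port"], pkt["protocol"])
        flow = flows[key]
        flow["packets"] += 1
        flow["bytes"] += pkt["length"]
        # Track the time span of the flow so duration can be derived later.
        ts = pkt["ts"]
        flow["first_ts"] = ts if flow["first_ts"] is None else min(flow["first_ts"], ts)
        flow["last_ts"] = ts if flow["last_ts"] is None else max(flow["last_ts"], ts)
    return dict(flows)
```

Storing these per-flow summaries instead of raw packets is one way to keep the metadata on HDFS small. In this sketch the two directions of a conversation count as two separate flows.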
  • 17. © 2014 MapR Technologies 17 Big Data + Threat Intelligence: IP based analysis • Big Data provides the opportunity to map out all the IP addresses used on a particular network • Through graph analysis, find rogue IP addresses • Use geographical information with IP addresses to find abnormal connection behavior • DNS provides many insights into malware connections – Malware control cannot rely on a static IP – Fast flux – Awkward names
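One simple way to look for fast flux in DNS data is to count how many distinct addresses a name resolves to while its TTLs stay short. The sketch below assumes DNS answers are available as (domain, ip, ttl) tuples. The thresholds `min_ips` and `max_ttl` are arbitrary illustrative values, and legitimate CDNs can trip the same rule, so hits are candidates for review rather than verdicts.

```python
from collections import defaultdict


def fast_flux_candidates(dns_answers, min_ips=5, max_ttl=300):
    """Return sorted domains with many distinct IPs and only short TTLs.

    `dns_answers` is an iterable of (domain, ip, ttl_seconds) tuples
    (assumed schema).
    """
    ips_by_domain = defaultdict(set)
    longest_ttl = defaultdict(int)
    for domain, ip, ttl in dns_answers:
        ips_by_domain[domain].add(ip)
        longest_ttl[domain] = max(longest_ttl[domain], ttl)
    return sorted(
        domain
        for domain, ips in ips_by_domain.items()
        if len(ips) >= min_ips and longest_ttl[domain] <= max_ttl
    )
```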
  • 18. © 2014 MapR Technologies 18 Big Data + Threat Intelligence: Flow to detect malware expression • The flow is an essential unit of malware analysis • A flow identifies – Who's connecting to whom • Frequency, duration, communication bandwidth • The app can be identified from the flow – Port, actual content – Palo Alto Networks • Normal flow vs abnormal flow – With enough data, we could potentially identify normal flows • Use the first 16 bytes? – Cluster analysis, detect anomalies
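As a stand-in for the cluster analysis mentioned above, the sketch below uses a much simpler technique, a robust z-score based on the median and the median absolute deviation (MAD), to flag flows whose byte count is far from the bulk of the data. The function name and the 3.5 threshold are illustrative assumptions. A real deployment would work on several flow features at once.

```python
import statistics


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose robust z-score exceeds `threshold`."""
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # no spread in the data, so no basis for flagging anything
    # 0.6745 scales the MAD so the score is comparable to a standard z-score.
    return [
        index
        for index, value in enumerate(values)
        if 0.6745 * abs(value - median) / mad > threshold
    ]
```

For per-flow byte counts such as `[500, 520, 480, 510, 495, 505, 50000]`, only the last flow is flagged.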
  • 19. © 2014 MapR Technologies 19 Spark on Hadoop
  • 20. © 2014 MapR Technologies 20 Apache Spark • spark.apache.org • github.com/apache/spark • user@spark.apache.org • Originally developed in 2009 in UC Berkeley’s AMP Lab • Fully open sourced in 2010 – now at Apache Software Foundation
  • 21. © 2014 MapR Technologies 21 Easy: Example – IP Count
    • Hadoop MapReduce
        public static class WordCountMapClass extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {
          private final static IntWritable one = new IntWritable(1);
          private Text word = new Text();
          // Emit (token, 1) for every whitespace-separated token in the line.
          public void map(LongWritable key, Text value,
                          OutputCollector<Text, IntWritable> output,
                          Reporter reporter) throws IOException {
            String line = value.toString();
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
              word.set(itr.nextToken());
              output.collect(word, one);
            }
          }
        }
        public static class WordCountReduce extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {
          // Sum the counts collected for each key.
          public void reduce(Text key, Iterator<IntWritable> values,
                             OutputCollector<Text, IntWritable> output,
                             Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
              sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
          }
        }
    • Spark
        // [sparkHome] and [jars] denote optional arguments
        val spark = new SparkContext(master, appName, [sparkHome], [jars])
        val file = spark.textFile("hdfs://...")
        // Take the first comma-separated field of each line as the IP
        val counts = file.map(line => line.split(",")(0))
                         .map(ip => (ip, 1))
                         .reduceByKey(_ + _)
        counts.saveAsTextFile("hdfs://...")
  • 22. © 2014 MapR Technologies 22 Fast: Using RAM, Operator Graphs • In-memory caching • Data partitions read from RAM instead of disk • Operator graphs • Scheduling optimizations • Fault tolerance [Figure: operator graph of RDDs A to F connected by map, filter, groupBy and join operators, split into Stage 1, Stage 2 and Stage 3; legend: RDD, cached partition]
  • 23. © 2014 MapR Technologies 23 SPARK RDD • The Resilient Distributed Dataset (RDD) is the key data structure, (potentially) held in memory • An RDD is distributed over Hadoop nodes and typically resides in memory • Transform RDDs, then get data out of them; evaluation is lazy – Two sets of operations are provided: one for transformations, the other for actions (e.g., count, save) • Most of the interface is quite similar to Lisp operations and SQL operations • Use persist (cache) to keep an RDD in memory
  • 24. © 2014 MapR Technologies 24 RDD
  • 25. © 2014 MapR Technologies 25 Working With RDDs (RDD → Transformations → RDD → Action → Value)
        textFile = sc.textFile("SomeFile.txt")
        linesWithSpark = textFile.filter(lambda line: "Spark" in line)
        linesWithSpark.count()    → 74
        linesWithSpark.first()    → # Apache Spark
  • 26. © 2014 MapR Technologies 26 Spark, Hadoop Malware Analysis: Why useful • Packet stream • Construct groups of suspected flows in RDDs, e.g., suspected DNS tunnels, IRC communications • Analyze with SPARK on RDDs, IN MEMORY • Connect the dots: flows, syslogs and events • Huge advantage over Wireshark! • Store in HDFS for easy access and use HBase for database support • Real-time event processing • Fast classification or anomaly detection
  • 27. © 2014 MapR Technologies 27 SPARK and Hadoop: Why Hadoop • Connecting the dots needs huge storage and fast access – Potential need to go back in time to find correlating events • DDoS attack found today + spotty IRC chat 10 days ago + NXDomain events 20 days ago, all from the suspected infected machine – Sometimes it takes months to learn that a domain the machine contacted is suspicious (e.g., scored in VirusTotal) – Then see if these patterns match known malware expressions – Approximate matching technology is quite important here » HMM and correlation modeling – HDFS + HBase would be a good solution • Store relevant temporal data • Retrieve it fast according to the criteria • SPARK + Hadoop provides a fast development cycle – From prototype to evaluation
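The "go back in time" correlation can be illustrated with a small exact-match sketch: for each host, look back a fixed number of days from its latest event and check whether every required kind of event appears in that window. The event schema (host, kind, timestamp), the event kind labels and the function name are assumptions for illustration. The approximate matching (HMM, correlation modeling) mentioned above is not attempted here.

```python
from datetime import datetime, timedelta


def correlated_hosts(events, required_kinds, window_days=30):
    """Return sorted hosts that show every kind in `required_kinds`
    within `window_days` before their most recent event.

    `events` is an iterable of (host, kind, timestamp) tuples and
    `required_kinds` is a set of kind labels (assumed schema).
    """
    by_host = {}
    for host, kind, ts in events:
        by_host.setdefault(host, []).append((ts, kind))

    hits = []
    for host, items in by_host.items():
        latest = max(ts for ts, _ in items)
        cutoff = latest - timedelta(days=window_days)
        kinds_in_window = {kind for ts, kind in items if ts >= cutoff}
        if required_kinds <= kinds_in_window:
            hits.append(host)
    return sorted(hits)
```

With events such as a DDoS today, IRC chatter 10 days earlier and NXDomain responses 20 days earlier from the same host, that host is returned when all three kinds are required.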
  • 28. © 2014 MapR Technologies 28 Example Detection Algorithm Development Scenarios
  • 29. © 2014 MapR Technologies 29 Introduction to Botnet (Terminology) • Diagram labels: Bot Master; Bots; Code Server; IRC Server; Victim; IRC Channel; Attack; IRC Channel; C&C Traffic; Updates • Old-days botnet operation, just for reference • Companies are interested in finding these on their premises
  • 30. © 2014 MapR Technologies 30 (Malware Expression) Detection Phases • Pre Infection Detection – Intrusion Detection System • Active Infection Detection – Recruit and Reconnaissance in the internal network • Post Infection Detection – Exploit and Monetize
  • 31. © 2014 MapR Technologies 31 Pre Infection Detection: CAMP • Detect suspicious URLs – When a device tries to contact or download from a suspicious URL, block it • How it works – If suspicious or unknown contents are detected, send them to the backend big data deep analysis engine – Update suspicious IPs / domain names / URLs – Update the hash of the binary – Regularly remove old hashes / suspicious URLs
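The update and expiry steps above can be sketched as a small in-memory blocklist. This is only an illustration of the bookkeeping: the class name `Blocklist`, the choice of SHA-256 for binary hashes and the timestamp-based expiry policy are all assumptions, not a description of any particular product.

```python
import hashlib


class Blocklist:
    """Indicators (URLs, domains, IPs, binary hashes) with last-seen times."""

    def __init__(self, max_age_seconds):
        self.max_age = max_age_seconds
        self.entries = {}  # indicator -> last-seen timestamp (seconds)

    def add(self, indicator, now):
        """Add or refresh an indicator."""
        self.entries[indicator] = now

    def add_binary(self, data, now):
        """Record the SHA-256 digest of a suspicious binary and return it."""
        digest = hashlib.sha256(data).hexdigest()
        self.add(digest, now)
        return digest

    def is_blocked(self, indicator):
        return indicator in self.entries

    def expire(self, now):
        """Drop indicators not refreshed within max_age; return how many."""
        stale = [k for k, ts in self.entries.items() if now - ts > self.max_age]
        for key in stale:
            del self.entries[key]
        return len(stale)
```

Passing `now` explicitly instead of reading the clock inside the class keeps the expiry logic easy to test against archived data.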
  • 32. © 2014 MapR Technologies 32 Ongoing Infection Detection: In-network infection propagation • How it works – Detect suspicious internal behavior – Develop a normal behavioral model for the target customer site – Detect abnormal authentication behavior, e.g., Kerberos, LDAP, etc. – Detect suspicious data movement – Detect suspicious port usage – Detect tunnels • It is highly important to leverage Big Data to develop a sustainable normal behavioral model and keep it constantly updated, because network data and models are constantly changing • Consult with security experts to define the measurement points
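A very reduced form of a normal behavioral model for port usage is a per-host set of ports seen during a training period, with anything outside that set reported later. The sketch assumes observations arrive as (host, port) pairs. The function names are hypothetical, and a practical model would also need the constant updating described above.

```python
from collections import defaultdict


def build_baseline(history):
    """Map each host to the set of ports it used during the training period."""
    baseline = defaultdict(set)
    for host, port in history:
        baseline[host].add(port)
    return baseline


def unusual_ports(baseline, observed):
    """Return (host, port) observations absent from the host's baseline."""
    return [
        (host, port)
        for host, port in observed
        if port not in baseline.get(host, set())
    ]
```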
  • 33. © 2014 MapR Technologies 33 Post Infection Detection: Protocol Abnormality • HTTP and DNS are the most frequently abused protocols – Firewalls let these ports through – If needed, play man in the middle for SSL data inspection • Ill-formed HTTP header detection – Abnormal location – Abnormal referrer – Abnormal User-Agent – Abnormal size • Abnormal HTTP POST detection (e.g., entropy analysis) • Ill-formed XML / HTML • SQL injection – SELECT * FROM users WHERE name = '' OR '1'='1'; • LDAP code injection • Process: collect malware expression samples; develop a feature set with Hadoop and SMEs; deploy and continually update the model
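The entropy analysis mentioned for HTTP POST bodies can be sketched with Shannon entropy over byte values, where encrypted or compressed payloads approach 8 bits per byte and plain form data sits much lower. The function name is illustrative, and any cut-off between "normal" and "abnormal" would have to be learned from real traffic.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum(
        (count / total) * math.log2(count / total)
        for count in counts.values()
    )
```

A body consisting of one repeated byte has entropy 0, while a body containing every byte value equally often reaches the maximum of 8.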
  • 34. © 2014 MapR Technologies 34 Post Infection Detection: Volumetric Abnormality • Click Fraud • Like Fraud • DDoS • SPAM
  • 35. © 2014 MapR Technologies 35 Post Infection Detection: Command and Control Contact • Cadence • Weird domain name resolution • Fast-fluxing domain names • Abnormal IRC traffic behavior • Abnormal Twitter behavior • Abnormal Facebook behavior
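Cadence, meaning a bot checking in with its controller at regular intervals, can be scored by how uniform the gaps between connections are. The sketch below computes the coefficient of variation of the inter-arrival times for one host and destination pair, where a value near 0 indicates a very regular beacon. The function name is hypothetical, and real bots often add jitter, so a threshold would need tuning.

```python
import statistics


def beacon_score(timestamps):
    """Return stdev/mean of the gaps between sorted timestamps (seconds).

    Returns None when there are fewer than three timestamps or the mean gap
    is zero, since regularity cannot be judged in those cases.
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        return None
    mean_gap = statistics.mean(gaps)
    if mean_gap == 0:
        return None
    return statistics.pstdev(gaps) / mean_gap
```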
  • 36. © 2014 MapR Technologies 36 DGA (domain generation algorithm), ClickSecurity.com: What Features Would You Use?
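As one possible answer to the question on this slide, the sketch below computes a few lexical features that are commonly tried on algorithmically generated domain names: label length, character entropy, digit ratio and vowel ratio. The feature choice and the function name are assumptions for illustration. The slide itself leaves the question open.

```python
import math
from collections import Counter


def domain_features(domain):
    """Return simple lexical features of the leftmost label of `domain`."""
    label = domain.lower().split(".")[0]
    length = len(label)
    if length == 0:
        return {"length": 0, "entropy": 0.0, "digit_ratio": 0.0, "vowel_ratio": 0.0}
    counts = Counter(label)
    entropy = -sum(
        (c / length) * math.log2(c / length) for c in counts.values()
    )
    return {
        "length": length,
        "entropy": entropy,  # bits per character
        "digit_ratio": sum(ch.isdigit() for ch in label) / length,
        "vowel_ratio": sum(ch in "aeiou" for ch in label) / length,
    }
```

For example, the exploit kit domain `bgbyhn.in.ua` from the earlier chain of events has a vowel ratio of 0 and a higher character entropy than `google.com`, which is the kind of separation a classifier could build on.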
  • 37. © 2014 MapR Technologies 37 Conclusion
  • 38. © 2014 MapR Technologies 38 Conclusion • Threat Intelligence and Big Data are very HOT • Big Data is the ideal analysis platform for malware expression analysis – Caution: remember the cons – Useful for efficiently connecting the dots • Big Data enables – Persistent model building and updating – Reducing false positives through exhaustive data checks rather than spot checks • Hadoop / SPARK provides an ideal platform for malware expression analysis – SPARK provides strong in-memory processing power for complex malware data analysis with simpler scripting-level coding • Scala – MapR provides the fastest data access on Hadoop nodes • M7 • MapR is the better Hadoop • Don't underestimate NFS and Volume convenience • Questions are welcome; send to syoon@maprtech.com, mvasquez@maprtech.com, nestrada@maprtech.com