Src: http://www.birthdaydirect.com/images/cupcake-candle-cake-pan-desc-09.png
*PROVOCATIVE* The Anti-Virus industry is celebrating its 20th birthday. [Clap] So why do we still suffer from computer viruses? [Click]
Src: http://www.publicdomainpictures.net/pictures/20000/velka/sad-child-portrait.jpg
[2-SecondPause] So why are malware and computer viruses still considered an unsolvable threat? To help answer that question... [Click]
[http://images.suite101.com/2486354_com_turingalan.jpg]
In 1936, Alan Turing proved that there is no general algorithm that solves the halting problem. The halting problem can be described as follows: given a program I and an input X, the algorithm should return 1 if program I halts on input X, or 0 if it doesn’t. However, not all programs halt immediately; maybe they will after 5 minutes, maybe after 1 hour, or maybe they won’t stop at all. Alan Turing proved that there is no algorithm that can produce 1 or 0 for every program and input, which means that this problem is undecidable.
Simply put, a closely related result (Rice’s theorem) says that for any non-trivial property of a program’s behavior that you’re interested in, no program can tell whether this property holds for every program (i.e. it is undecidable). It is important to note that this problem is undecidable for machines with unbounded state, such as Turing machines.
Now let’s look at a different problem, which I like to call [Click]
The Malware Problem. Given a program I, the algorithm should return 1 if the program is malicious and 0 if it’s benign. The halting problem can easily be reduced to this problem, which means that it is unsolvable as well. This makes sense: we are looking for a property of a program, checking whether the program is malicious just as, with the halting problem, we checked whether it stops. However, “halts” is well defined; “being malicious” isn’t.
Now, we can solve the Halting Problem or the Malware Problem when we are talking about finite-state machines. However, we won’t be able to give TRUE or FALSE for ANY program, only for a SUBSET of programs. This means that, theoretically, we cannot catch all Malware all the time, only a subset of it, using the current detection methods that I will elaborate on in my next slides. Even if we focus on a subset of these programs, we have problems detecting them.
I don’t want to be accused of being pessimistic, so let’s assume that we can detect 99% of the subset. This still leaves us with what I like to call [click]
[http://irregulartimes.com/aapaypalfiles/images/99percentbuttonthumb.png]
The One Percent: Malware that does not re-use old techniques and bypasses current detection mechanisms. Unfortunately, as you know, there is a cat-and-mouse game over that 1%. The defenders develop a mitigation technique and the attackers bypass it, keeping the 1% alive and kicking. However, there is a limited number of ways an attacker can achieve a certain goal without getting caught. In order to take a bite out of this 1%, we need to understand the constraints that we have today in finite-state machines. [click]
The major constraints are TIME & SPACE.
TIME: We just cannot analyze a program forever. We have to stop at some point and make a decision. Malware authors know that, and they will try to slow-walk us using loops, Sleep, and time-consuming operations such as encryption, packing (Host Identity Based Encryption, DRM-like) and self-modifying code. There are some ways to fight that: for example, by overclocking the machine we can make the Malware think that it has run for longer, and we can detect that a thread went to sleep and wake it up using interrupts. But there are always ways around it.
SPACE: In a finite-state machine we just cannot maintain unlimited states. Malware authors know that as well, and they will try to avoid calling known patterns one after the other. One of the most popular techniques is “Run The Clock”, or as I like to call it, “Almost There”. For example: Malware that tries to inject a thread into another process will call “OpenProcess” followed by “VirtualAllocEx” and “WriteProcessMemory”; it will then loop, causing the state machine to lose its state. At some point it will call “CreateRemoteThread”, thus bypassing traditional API call trace analysis mechanisms.
Advanced Malware will often try to exploit these constraints. [click]
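To make the “Almost There” trick concrete, here is a minimal sketch of a call-trace detector with bounded memory. It is only an illustration under assumptions: the window size, the filler call `GetTickCount`, and the function names are made up for this example and do not describe any specific product.

```python
from collections import deque

# The injection sequence from the example above.
INJECTION_SEQUENCE = [
    "OpenProcess", "VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread",
]


def _is_subsequence(pattern, calls):
    """True if `pattern` occurs in `calls` in order (gaps allowed)."""
    it = iter(calls)
    return all(step in it for step in pattern)


def detects_injection(trace, window_size):
    """Scan `trace` while remembering only the last `window_size` calls.

    The bounded deque models the SPACE constraint: older calls are forgotten.
    """
    window = deque(maxlen=window_size)
    for call in trace:
        window.append(call)
        if _is_subsequence(INJECTION_SEQUENCE, window):
            return True
    return False


# A direct injection, and one that "runs the clock" with 50 filler calls
# (hypothetical filler) before the final step.
direct = list(INJECTION_SEQUENCE)
stalled = INJECTION_SEQUENCE[:3] + ["GetTickCount"] * 50 + ["CreateRemoteThread"]

print(detects_injection(direct, window_size=8))    # caught
print(detects_injection(stalled, window_size=8))   # missed: state was lost
print(detects_injection(stalled, window_size=64))  # caught, at a memory cost
```

Growing the window catches the stalled variant, but the attacker can always loop longer than whatever window we can afford.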
Malware trying to bypass static analysis will use packing, encryption and anti-reversing techniques, forcing the analysis to explore every possible path in order to determine whether the program is malicious. This requires saving many states, and since we are analyzing programs using finite-state machines, it sometimes does not scale (more states -> more memory, and memory is limited).
Malware trying to bypass dynamic analysis will use DRM-like encryption to slow down the decryption/unpacking operation. This process can fill up memory pretty fast (SPACE) and can last for a very long time (TIME). Since we cannot analyze forever, we will eventually have to halt and give an answer that is not accurate (benign even though it’s malicious).
However, these constraints are not the only thing we need to deal with. [Click]
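The TIME side can be sketched the same way. The toy below assumes a sample can be modeled as a stream of actions and that the analyzer gets a fixed step budget; both `stalling_sample` and `analyze` are hypothetical names, not a real sandbox API.

```python
def stalling_sample(stall_steps):
    """Hypothetical sample: burns `stall_steps` harmless steps, then acts."""
    for _ in range(stall_steps):
        yield "sleep"
    yield "malicious_payload"


def analyze(sample, step_budget):
    """Observe at most `step_budget` steps, then return a verdict.

    "benign" here really means "nothing malicious seen within the budget".
    """
    for steps_used, action in enumerate(sample, start=1):
        if action == "malicious_payload":
            return "malicious"
        if steps_used >= step_budget:
            break
    return "benign"


print(analyze(stalling_sample(10), step_budget=100))    # payload seen in time
print(analyze(stalling_sample(1000), step_budget=100))  # budget ran out first
```

Whatever budget we pick, a sample that stalls one step longer gets the inaccurate "benign" answer described above.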
Here is some more depressing news, which will be followed by a picture of a kitten.
We have the problem of Malware elevating its privileges to kernel mode: by using legitimate API calls, it is able to bypass security products, since they cannot predict this flow in advance. Later in this talk, we will review a new technique that tries to detect the “after effect”: the modification of kernel objects during exploitation. [Click]
We also saw a sharp increase in stolen certificates issued by trusted certificate authorities in the last year. This helps criminals stay in that 1%, because security products often lower their guard when they see an object signed with a trusted certificate. [Click]
Automatic static analysis is hard; it took Veracode more than 10 years to master it, and it is still pretty trivial to bypass using packing, obfuscation and encryption. [Click]
Sometimes this job has to be done manually, and this process is both time-consuming and not scalable. [click]
And as we will see in the next couple of slides, running the malware dynamically runs into the constraints and the Malware Problem. [click]
Relax! It’s a kitten. [2-sec-break] So how can we fight Malware, knowing that we have these constraints and problems? Well, to find out, let’s quickly review the detection methods that we have today.
So if we quickly review the detection methods we use today, we have:
Pattern-based detection: using MD5 / SHA1 / fuzzy hashing / regex / classifiers.
Static analysis: detecting anti-VM / anti-debugging / anti-disassembly / obfuscation tricks. Rodrigo Branco created an excellent reference for all the static tricks that are out there in his BH presentation last year; I encourage you to take a look.
Dynamic analysis: today we have emulators and sandboxes that do API call trace analysis and look for suspicious network activity (why does my PDF suddenly contact China?), registry modifications, and process and file activity, and of course good ol’ debuggers that come to the rescue. One of the problems with this approach is “what you see is all you get”: if the Malware identifies that it’s being analyzed, it won’t perform its malicious activity and we won’t spot it. That’s why it’s very important to try to detect such techniques during static analysis.
Hybrid approach: theoretically it works, but I haven’t seen it in practice. This is semantic-aware detection that tries to figure out code patterns using an intermediate representation, optimizing the code and matching it to a pre-defined template rather than to a specific pattern or signature. Another hybrid is running the sample dynamically to get a memory snapshot and then analyzing that snapshot statically.
In order to understand how attackers bypass these mechanisms, we need to understand the sample lifecycle as it enters one of the security companies’ Malware feeds. [click]
The sample arrives through one of the corporate feeds. Each company maintains a feed of verified malicious files, and most of them share it with each other (there are also services such as VirusTotal, etc.).
If the sample is unknown (not in the DB), it goes through static analysis detection. This process rates how malicious the file is and scores it.
If the sample did not raise any suspicious flags, or the score was not high enough, the file goes through dynamic analysis: it runs under an emulator or sandbox and is tested for abnormal activity depending on the file type. Again, this process is scored, and the result is added to the static analysis score.
If the number of flags or the score that was raised passes a certain threshold (which could be dynamic), the Malware is classified. If the classifier finds a match, it is categorized as a family threat, like Zeus for example. If not, it is handled as a generic threat. If the Malware raised an “interesting” flag, it is further analyzed manually.
The last thing an attacker wants is to have a researcher manually analyze the Malware. The author aims to appear benign by passing static analysis (don’t look malicious) and passing dynamic analysis (don’t act malicious). Malware will often try to break the trust by presenting a legitimate certificate; security products will often mark such software as trusted or give it a lower score.
Let’s see how attackers take advantage of this process.
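The lifecycle above can be sketched as a small scoring pipeline. This is a simplified illustration, not any vendor’s implementation: the flag names, the scores, the threshold of 10 and the `triage` function are all assumptions, and the manual-analysis branch for “interesting” flags is left out.

```python
from dataclasses import dataclass, field
from typing import Optional

KNOWN_HASHES = {"known-bad-hash"}  # stand-in for the vendor DB
THRESHOLD = 10                     # assumed fixed; the notes say it may be dynamic


@dataclass
class Sample:
    sha1: str
    static_flags: dict = field(default_factory=dict)   # flag name -> score
    dynamic_flags: dict = field(default_factory=dict)  # flag name -> score
    family: Optional[str] = None  # what a classifier would match, if anything


def triage(sample, known_hashes=KNOWN_HASHES, threshold=THRESHOLD):
    """Walk one sample through the stages described above."""
    if sample.sha1 in known_hashes:
        return "known"
    score = sum(sample.static_flags.values())
    if score < threshold:
        # Static analysis was not conclusive: run dynamically, add the scores.
        score += sum(sample.dynamic_flags.values())
    if score < threshold:
        return "not detected"
    return f"family:{sample.family}" if sample.family else "generic threat"


print(triage(Sample("a", {"packed": 4}, {"injects_thread": 7}, family="Zeus")))
print(triage(Sample("b", {"packed": 4}, {"reads_registry": 2})))
print(triage(Sample("c", {"anti_vm": 12})))
```

The attacker’s goal is to land in the "not detected" branch: stay unknown, keep the static score low, and keep the dynamic score low.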
The first step is to be unknown, meaning that the sample’s hash is not in the security vendor’s DB. This can easily be achieved using builders that build variants, append garbage, or use encoding. Malware will often try to stay compliant and make life hard for the security product: security products won’t take the risk of producing a false positive, and Malware takes advantage of this gray area. [Click]
Malware will also use packing and obfuscation to avoid being detected by static analysis. The use of packing by itself does not say anything about a program; many independent developers use packers to make their software hard to reverse engineer. Obfuscation and encryption make static analysis of a program even harder. [Click]
At that point in the lifecycle, the Malware will probably be executed, and it can use many tricks to detect that it is being analyzed. One technique that we spotted is to split the maliciousness between multiple files. The attacker knows that the files are fed into an emulator one by one, and will split the malicious flags between several files, so that each file may be somewhat malicious but not malicious enough to be considered a threat. [Click]
A few weeks ago, Metasploit released a new evasion module that allows penetration testers to embed malicious code inside trusted templates such as “calc”, or to use executables such as PowerShell, to avoid using the same malicious template over and over again. Malware authors do that too, and easily bypass detection.
Products sometimes make assumptions when analyzing Malware; Malware authors know this and exploit those assumptions. [click]
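The “split the maliciousness” trick is easy to see with numbers. In this sketch the file names, scores and threshold are invented, and summing scores across a bundle is only one conceivable countermeasure; it assumes we can tell which files belong together, which is often the hard part.

```python
THRESHOLD = 10  # assumed per-verdict threshold


def per_file_verdicts(bundle, threshold=THRESHOLD):
    """Score each file on its own, as an emulator fed one file at a time would."""
    return {name: score >= threshold for name, score in bundle.items()}


def bundle_verdict(bundle, threshold=THRESHOLD):
    """Score the files together (assumes we know they belong to one sample)."""
    return sum(bundle.values()) >= threshold


# Hypothetical bundle: every file stays under the threshold on its own.
bundle = {"dropper.exe": 4, "loader.dll": 5, "payload.dat": 6}

print(per_file_verdicts(bundle))  # every file passes individually
print(bundle_verdict(bundle))     # the combined score crosses the threshold
```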
One of those assumptions, or myths, is that “Malware executes immediately”. This is true for most Malware today: it wants to execute its payload as fast as it can and move on. However, today’s targeted Malware doesn’t run immediately; it waits for a specific event to happen (opening an interesting application, etc.), taking advantage of the TIME constraint, knowing that we cannot emulate or inspect a file forever. So does “Malware execute immediately”? [Click]
NO!
Another assumption, or myth, is that “Malware is usually small”. This is true for most Malware today: it wants to be as small as possible so that it won’t look suspicious when it is downloaded. However, we have seen examples of Malware carrying legitimate files and taking advantage of the SPACE constraint, knowing that we won’t scan files that are bigger than a certain size. So is “Malware usually small”? [Click]
NO!
Alan Turing proved that the halting problem is unsolvable on an infinite-state machine. In this table we can see the correlation between malware detection and its size. It means that if we only check files up to 10MB, then we will find 99% of all the malware out there. And here is another way the 1% can slip inside the organization without detection. So do malware and size have something to do with each other? YES. But is it 100%? [Click]
Here is an example of the Stuxnet and Flame Malware and the way they bypassed static & dynamic analysis.
Stuxnet:
Static: It had 4 zero-days, which are unknown patterns, and it breaks the trust using stolen certificates.
Dynamic: You can’t just run it in a dynamic analysis machine; you need to find the entry point, which means that an automated malware analysis machine will fail here. It came in multiple files, each one malicious, but not enough to pass the threshold (“spreading the maliciousness”). Stuxnet won’t run its main payload on a normal computer; it is looking for a special environment.
Flame:
Static: 20 MB of code; some products will not bother to scan it, since they believe it is too big to be Malware. It breaks the trust by showing a legitimate trusted certificate. It comes with legitimate software, which some vendors believe is safe.
Dynamic: It does not execute immediately; it takes its time, which means that if it is emulated, it won’t do anything malicious. It came in multiple files, each one malicious, but not enough to pass the threshold (“spreading the maliciousness”). It needs a special loader.
It seems that if we take advantage of the myths and constraints, we will be able to evade detection.
Problem: Good, not Great
Stuxnet, 1%: 4 zero-days = 4 undetected paths to reach the same goal
Many ways to inject a process / load a library -> same result
Focus on the result and path commonalities
For instance..
-----------
So we’ve got a problem. Detection is good, but not great, and we need to do something about it. Stuxnet, for example, had 4 zero-days; these are 4 paths that we didn’t have a pattern for back then. It’s very challenging to fight this, because there will always be new paths to achieve the same goal. For example, there are many ways to achieve process injection or DLL loading, but the result is similar in all those paths: a thread was injected or a DLL was loaded. So the idea is to focus on the result and find path commonalities. For instance [click]
Check existence in a weird way
Stuxnet/Flame check if AV/HIPS is running -> disable
Malware won’t compare clear-text process names: too straightforward, caught by static analysis
Will compare in a non-conventional way
Solution is taint analysis with a weight-based mechanism
SRC: http://2.bp.blogspot.com/-AkUidiy5pPM/TeRFd4a87xI/AAAAAAAAAIc/na8x23U0mWQ/s1600/MathProblem_3.jpg
Another pattern we have noticed is that advanced malware will often check for the existence of a security product in a “weird way” and try to disable it. We have seen Stuxnet and Flame checking whether an instance of Anti-Virus or HIPS is running in order to try to disable it. To find out which product is up, the Malware will usually enumerate the process list. However, it won’t just compare these processes to a clear-text string such as “McAfee” or “Kaspersky”, because that’s too straightforward and can easily be detected during static analysis. It will usually encrypt the string and compare it in a non-conventional way. The solution we came up with for this uses taint analysis and a weight-based mechanism: we are trying to score the process of enumeration.
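A rough sketch of the idea, under heavy assumptions: the event format `(operation, destination, sources)`, the operation names and the weights below are all invented for illustration, and a real taint engine tracks data at the instruction or memory level rather than on a tidy event list.

```python
# Made-up weights: how suspicious each step is once it touches tainted data.
WEIGHTS = {"enumerate": 1, "transform": 2, "compare": 3, "disable": 5}


def score_trace(events):
    """Score a trace of (operation, destination, sources) events.

    Data returned by process enumeration is the taint source. Taint
    propagates to anything derived from it, and every weighted operation
    that consumes tainted data adds to the score.
    """
    tainted = set()
    score = 0
    for op, dst, srcs in events:
        if op == "enumerate":
            tainted.add(dst)
            score += WEIGHTS[op]
        elif any(src in tainted for src in srcs):
            tainted.add(dst)             # taint propagates to the result
            score += WEIGHTS.get(op, 0)  # unweighted operations add nothing
    return score


# Enumerate processes, hash the names, compare against an encrypted
# constant, then act on the match.
suspicious = [
    ("enumerate", "names", ()),
    ("transform", "hashed", ("names",)),
    ("compare", "match", ("hashed", "const")),
    ("disable", "_", ("match",)),
]
# A task-manager-like program that only displays the list.
benign = [("enumerate", "names", ()), ("display", "ui", ("names",))]

print(score_trace(suspicious))
print(score_trace(benign))
```

The point is that the score follows the data, not the strings: hashing or encrypting the product name does not help the Malware, because the comparison still consumes data derived from the process list.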
Using taint analysis and a weight-based mechanism to detect if a process is looking for the presence of a security product.
When we are talking about drive-by downloads, Malware uses obfuscation to hide its malicious intent. Most network security products are blind to it, and the only way to stop it is on the endpoint.
Take advantage of the fact that Malware does not come bundled with rootkit detection tools. If Malware starts using rootkit detection techniques some day, we can catch it during static or dynamic analysis: a sort of anti-anti-analysis.
But sometimes Malware does slip inside, and we need to have a post-infection strategy for the 1% of malware.
War Between Spam Group
DNS attack utilizing open DNS servers (there are more than 21 million online)
Attack reached 350 gigabits per second
CloudFlare and Spamhaus survived the attack
Slowdown was felt in Europe
LINX = London Internet Exchange
http://blog.cloudflare.com/the-ddos-that-almost-broke-the-internet
http://www.theregister.co.uk/2013/03/27/spamhaus_ddos_megaflood/
http://nakedsecurity.sophos.com/2013/03/28/massive-ddos-attack-against-anti-spam-provider-impacts-millions-of-internet-users/ - How it works.
“Traditionally even large botnets are only able to deliver hundreds of megabits or a few gigabits per second.”
“DNS reflection attack that takes advantage of misconfigured DNS servers to amplify the power of a much smaller botnet.”
http://internetcensus2012.bitbucket.org/paper.html - Full Story
Abstract: While playing around with the Nmap Scripting Engine (NSE) we discovered an amazing number of open embedded devices on the Internet. Many of them are based on Linux and allow login to standard BusyBox with empty or default credentials. We used these devices to build a distributed port scanner to scan all IPv4 addresses. These scans include service probes for the most common ports, ICMP ping, reverse DNS and SYN scans. We analyzed some of the data to get an estimation of the IP address usage. All data gathered during our research is released into the public domain for further study.
Introduction: Two years ago while spending some time with the Nmap Scripting Engine (NSE) someone mentioned that we should try the classic telnet login root:root on random IP addresses. This was meant as a joke, but was given a try. We started scanning and quickly realized that there should be several thousand unprotected devices on the Internet. After completing the scan of roughly one hundred thousand IP addresses, we realized the number of insecure devices must be at least one hundred thousand. Starting with one device and assuming a scan speed of ten IP addresses per second, it should find the next open device within one hour. The scan rate would be doubled if we deployed a scanner to the newly found device. After doubling the scan rate in this way about 16.5 times, all unprotected devices would be found; this would take only 16.5 hours. Additionally, with one hundred thousand devices scanning at ten probes per second we would have a distributed port scanner to port scan the entire IPv4 Internet within one hour.
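A quick back-of-the-envelope check of the quoted numbers, using only the figures from the passage (one hundred thousand devices, ten probes per second each) and assuming one probe per IPv4 address:

```python
import math

devices = 100_000
probes_per_device_per_second = 10

# Going from 1 scanner to 100,000 by repeated doubling.
doublings = math.log2(devices)

# Sweeping the full 32-bit address space once with every device scanning.
ipv4_addresses = 2 ** 32
sweep_seconds = ipv4_addresses / (devices * probes_per_device_per_second)
sweep_minutes = sweep_seconds / 60

print(f"doublings needed: {doublings:.1f}")
print(f"full IPv4 sweep:  {sweep_minutes:.0f} minutes")
```

The doubling count comes out close to the quoted “about 16.5”, and a single-probe sweep lands in the neighborhood of an hour, so the quoted figures read as round numbers rather than exact ones.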
Continuing the analogy, similar to the door-locking issue: soon after someone publishes an exploit, it gets into an exploit kit, allowing much less skilled attackers to use it. An exploit pack deploys a web site with code that detects the client version and likely vulnerabilities, and serves exploits for the most relevant ones. It is sold for a ~$1,500 annual license. All you need to do is get someone to visit the infected site, and if the exploit kit includes a relevant vulnerability, they will be infected.
The second reason applies to the Syrian attack: this specific implementation of the exploit is very rare, and has been seen fewer than 150 times worldwide.
Introducing Check Point’s new Anti-Bot SW Blade to REVOLUTIONIZE BOT PREVENTION!