Classifying internet hosts along a reputation continuum

Abstract

With increasing inventiveness and agility, cutting edge Internet attack techniques such as “fast
fluxing” and advanced persistent threats challenge the effectiveness of traditional blacklists. The
challenge undertaken by HP TippingPoint is to rapidly counteract attacks such as these by
classifying as much of the Internet as possible along a continuum between reputable and
disreputable. Our solution implements a number of novel methods to identify and track Internet
hosts, which in turn provides intelligence to the Reputation Digital Vaccine (Reputation DV) Service.
Reputation DV provides IPv4, IPv6 and Domain Name System (DNS) security intelligence feeds
from a global reputation database that enables customers to actively enforce and manage
reputation security policies using the HP TippingPoint Intrusion Prevention System (IPS) Platform.

Bio

Marc Eisenbarth recently noticed the word "Architect" has been appended to his business cards,
and while not entirely sure what that means, he has continued to just do what he has been doing
for the last five years, namely improving the HP TippingPoint Intrusion Prevention System (IPS) as a
member of DVLabs' Advanced Security Intelligence team. Prior to this, he managed "cyber
liability" at a US defense contractor for five years and completed a graduate program at
Columbia University in Computer Science. Off the clock, he is a "hardware guy" who enjoys
releasing various do-it-yourself projects to the general public.

HP Confidential 1 16 June 2011

Problem Statement

A blacklist is simply a list of Internet hosts from which all traffic should be discarded
indiscriminately. Challenges with traditional blacklists exist in both the development and
implementation of the blacklist.


Problem Statement

The greatest technical challenge to blacklists is how to keep abreast of a rapidly changing
threat landscape. Attack techniques such as “fast fluxing” [1] and advanced persistent
threats [2] are particularly difficult to identify and protect against because they represent
two extremes of the time scale. In the former, a single Internet host constantly varies its IP
address and DNS name in order to avoid detection. In the latter, a single stealthy attack is
carried out over a long period of time, using specific domain knowledge of the target.
Other complications include DNS entries with an inordinately large number of A or NS
records, and similar mechanisms to resist take down and complicate traditional mitigation
strategies. Detection of these techniques requires storage, evaluation and comparison
against historical state information.


Problem Statement

The major shortfall of existing blacklists is the fact that they do not classify or discriminate
via a relative or absolute reputation score, or offer a confidence metric. Furthermore,
traditional blacklists assess reputation simplistically, using a binary classifier rather than a
continuum of risk and reputation.


Problem Statement

The source and quality of the data used to compile a blacklist is often suspect, originating
from email server logs, firewall logs, and DNS responses; all of which provide meaningful
information but fall far short of profiling a modern attacker for inclusion in a truly reliable
and trusted blacklist. To complicate the problem, people who would rather purchase this data
than stand up internal systems to compile and maintain blacklist entries run into poor
quality data sold by blacklist vendors, typically because of increasing pressure to deliver
ever larger numbers of entries. They also encounter vendors who balk at the question of
whether their lists are of suitable quality to be used for blocking in large enterprise
networks.

Even among highly specialized, trusted feeds we see disagreement. For example,
zeustracker.abuse.ch and malwaredomainlist.com produce disjoint blacklists for Zeus
botnets, which are admittedly among the more difficult botnets to track and attribute.
We assume that this is due to the inherently limited visibility of both monitoring
approaches.


Problem Statement

Maintaining a reliable, timely, and actionable blacklist that can then be enforced in
today's enterprise networks is challenging. The problem that needs resolution is to better
scrutinize host systems before adding them to a blacklist so as to minimize false positives
and to reassure skeptical customers who want greater transparency into the implications of
implementing the blacklist. To achieve this level of improvement, blacklist research must
perform additional intelligence gathering out-of-band and must analyze attacks that occur
across multiple, disparate network flows which can occur over an arbitrary amount of time.
Finally, active interaction with a suspected malicious host is often needed to confirm its
disreputable intent. These reasons highlight why we chose not to do this analysis inline
using the existing IPS engine and instead invented the HP TippingPoint Reputation Digital
Vaccine (Reputation DV) service, which automates the identification and blocking of known
bad traffic before it reaches the IPS deep packet inspection engine, thus relieving the load
on the IPS and deterring traffic from disreputable host systems.


Question

This sophisticated use of the DNS system by modern attackers is very much in response to
the simplistic attempts at DNS blacklists that began more than 10 years ago. Attackers are
generally lazy and innovation is necessity driven. At this point in time, technology is
responding. It seems to be responding favorably, evidenced by customers who are
blocking millions of reputation sourced events per day.


Question

The next logical step is to consider replacing your IDS with a reputation-based system.
However, if you really mean intrusion detection system, I’d argue that this does not really
make sense. This is due to the historical nature of IDS and the interest in answering
questions such as “what happened, and why did it happen? [emphasis on past tense]”.
Now, if you want to talk about IPS, with a focus on the timely enforcement component that
IPS brings to the table, then the conversation gets interesting.

Reputation in the context of IPS focuses on the source of a threat, not the vulnerability or
payload of the attack. As such, it’s not a replacement for an IPS, but another tool to
provide preemptive threat protection. In some senses, reputation provides a cloudy crystal
ball which has the ability to forecast to some extent what attackers will do and how to
catch them, by being vulnerability agnostic. Again, this allows reputation-based
approaches to outpace attackers by focusing on their infrastructure rather than their wares.


Approach

The first step in generating the Reputation DV package is continual acquisition of potentially
malicious hosts from various external intelligence feeds (Figure 1). These feeds fall into three
broad categories: commercial, open-source, and automated customer submissions. Without
exception, the first two feed types suffer from the pitfalls and limitations outlined in the
previous section. The third category is split between customers who have elected to share
security event data and entities that have allowed collocation of an HP TippingPoint
controlled IPS as part of our Lighthouse program. The data received from these sources is
unique as it is not simply low level event data, but a set of context rich security events
associated with particular HP TippingPoint Digital Vaccine (DV) filters. This allows high level
correlation between attacks and attackers, despite their efforts to evade detection by
manipulating the fluid assignment scheme of IP addresses and DNS host names.


Approach

The process of placing a host on a blacklist uses a series of modules, each of which
generates data we use to classify the host. The first module is responsible for tracking
content on these hosts, retrieving copies of malicious documents, scripts, and executables,
and tracking changes to these files. This data is used to compute a similarity metric which is
then used to cluster hosts which are hosting related malware and exploits. The last is a
series of modules collectively called “meta” which is a collection of active and passive
intelligence gathering techniques.
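
The clustering step can be sketched roughly as follows. This is a minimal illustration only: the notes do not say which similarity metric is used, so Jaccard similarity over byte shingles is an assumption, as are the function names, the shingle size and the 0.5 threshold.

```python
from __future__ import annotations

from itertools import combinations


def shingles(content: bytes, k: int = 4) -> set[bytes]:
    """Return the set of k-byte substrings (shingles) of the content."""
    if len(content) < k:
        return {content}
    return {content[i:i + k] for i in range(len(content) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_hosts(samples: dict[str, bytes], threshold: float = 0.5) -> list[set[str]]:
    """Single-linkage clustering: merge hosts whose content similarity meets the threshold."""
    parent = {host: host for host in samples}

    def find(host: str) -> str:
        # Union-find lookup with path halving.
        while parent[host] != host:
            parent[host] = parent[parent[host]]
            host = parent[host]
        return host

    sets = {host: shingles(content) for host, content in samples.items()}
    for a, b in combinations(samples, 2):
        if jaccard(sets[a], sets[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters: dict[str, set[str]] = {}
    for host in samples:
        clusters.setdefault(find(host), set()).add(host)
    return list(clusters.values())
```

Two hosts serving near-identical exploit scripts end up in one cluster, while a host serving unrelated content stays on its own. The pairwise comparison is quadratic in the number of hosts, so a production system would need something cheaper (for example, locality-sensitive hashing).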

These techniques are used to define more data points for each entry which are ultimately
used to mine additional interrelationships between entries. The passive intelligence
techniques that we employ include search engine results, DNS and whois information, and
the like. The active intelligence techniques include, but are not limited to, port scanning,
banner grabbing, content spidering, operating system fingerprinting, uptime tracking, and even
high interaction honeypots. This monitoring results in a rich collection of out-of-band
intelligence that can be warehoused and later compared against the current state.
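
Comparing warehoused intelligence against current state can be as simple as a keyed diff. The sketch below assumes each observation is stored as a flat dictionary of attributes; the attribute names in the usage note are hypothetical and not taken from the actual system.

```python
def diff_snapshots(previous: dict, current: dict) -> dict:
    """Return {attribute: (old, new)} for every attribute that changed, appeared or vanished."""
    changes = {}
    for key in previous.keys() | current.keys():
        old, new = previous.get(key), current.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes
```

For example, diffing a stored record with `open_ports=(22, 80)` against a fresh scan showing `open_ports=(22, 80, 6667)` reports only the `open_ports` attribute, which is the kind of outwardly visible change that can feed later scoring.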


Approach (continued)

Inline monitoring is expensive and difficult to scale for global coverage. Our goal is to
monitor an arbitrary system by detecting outwardly visible changes, wherever possible. The
motivation for this module lies in the desire to increase the number of Internet systems
under surveillance. Unlike many organizations, we do in fact have a large network of
sensors that are monitoring the Internet both in an inline capacity as well as through
network span ports. While this is useful in its own right and currently scaled out to the
degree which allows the results to be considered statistically viable, the chief limitation of
this approach is that only traffic which crosses this sphere of inspection can be considered
for analysis. Born out of this realization was the concept of an active monitoring system
which could reach out and query an arbitrary host and could scale to the point that it
could track the Internet as a whole.

Currently our systems track around four billion events annually, which are distilled into a set
of approximately two million IP addresses and half a million DNS entries. These are
distributed to the end user and comprise the Reputation DV. To support this massive effort,
we developed the extensible architecture outlined above, which is responsible for
constructing, maintaining and distributing this blacklist of disreputable Internet hosts.


Approach

Once module-based classification work is complete, there is an enormous amount of
information associated with each entry that now can be consumed by the “rule engine”
module, which exists to further classify and score each entry. At the heart of the rule engine
is the support vector machine (SVM) algorithm [3]. SVMs are a set of related supervised
learning methods that analyze data and recognize patterns. The advantage that SVMs
offer is a soft-margin classifier, which reduces the chance of overfitting the data by
accounting for mislabeled examples. In addition, a single multiclass problem can be reduced
to multiple binary classifications, and it is possible to operate on arbitrary data types.
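
To make the soft-margin idea concrete, here is a small sketch of a linear SVM trained by subgradient descent on the regularized hinge loss. It is not the rule engine itself: the two features (fraction of feeds listing the host, normalized rate of DNS changes), the training data and the hyperparameters are all invented for illustration, and a real deployment would more likely use an established SVM library.

```python
def train_linear_svm(samples, labels, c=1.0, epochs=200, lr=0.01):
    """Minimize 0.5*||w||^2 + c * sum(max(0, 1 - y*(w.x + b))) by subgradient descent."""
    dim = len(samples[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the regularizer 0.5*||w||^2
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # inside the margin or misclassified: hinge loss is active
                for i in range(dim):
                    grad_w[i] -= c * y * x[i]
                grad_b -= c * y
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (disreputable) or -1 (reputable) for feature vector x."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical training set; the last sample is deliberately mislabeled as reputable.
SAMPLES = [(0.0, 0.1), (0.1, 0.0), (0.1, 0.2), (0.9, 0.8), (0.8, 1.0), (1.0, 0.9), (0.9, 0.9)]
LABELS = [-1, -1, -1, 1, 1, 1, -1]
```

Because the margin is soft, the single mislabeled sample is paid for as hinge loss rather than forcing the boundary to bend around it, so nearby hosts are still classified with the majority of their neighbors.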

Additional algorithms are used to assign various scores to the blacklist entries. These scores
represent our assessment of the host’s potential to generate malicious behavior along with
our confidence that it is not a false positive. We also distribute tags comprised of the
above mined metadata, which serve to classify each entry and provide useful data, such as
country of origin, attack family and reason for inclusion.


Approach

Administrators can leverage the tags and scores to build custom filters used to tailor the
blacklist to their company’s business and risk management requirements. An example filter
would read, “block all botnets but not spam originating from Kazakhstan with a score greater
than 80”. The flexibility that these filters offer gives a level of transparency and control to
administrators that traditional blacklists cannot provide. Bad traffic is dropped before the
IPS deep packet inspection engine resulting in efficient, scalable policy enforcement.
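
A filter of this kind can be pictured as a predicate over tagged, scored entries. The sketch below is one possible reading of the example filter, not the product's actual policy language: the `Entry` fields, the 0 to 100 scale with higher meaning more disreputable, and the placeholder country code are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    host: str
    score: int        # assumed scale: 0-100, higher means more disreputable
    country: str      # assumed: two-letter country code
    tags: frozenset   # e.g. frozenset({"botnet"})


def make_filter(country, min_score, include_tags, exclude_tags=frozenset()):
    """Build a predicate that is True for entries the policy should block."""
    include_tags = frozenset(include_tags)
    exclude_tags = frozenset(exclude_tags)

    def should_block(entry: Entry) -> bool:
        return (
            entry.country == country
            and entry.score > min_score
            and bool(entry.tags & include_tags)
            and not (entry.tags & exclude_tags)
        )

    return should_block
```

With `make_filter("XX", 80, {"botnet"}, {"spam"})`, a botnet entry from the placeholder country "XX" scoring 95 is blocked, while the same entry also tagged as spam, or one scoring 60, is let through to the rest of the policy.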


Observations

It became clear very early on that a whitelist mechanism was needed to train and validate
our algorithms. Alexa, which offers a ranking of the top million domains, serves as a good
place to start, along with search engine results. However, looking at the top 250,000
domains, we find that a notable percentage show up in our blacklist algorithms
from time to time. It’s important to note that this list contains popular file sharing,
porn-related, and unethical advertising websites which often deserve disreputable scores. In
further investigating some of these domains, we note that they are often hosted in networks
that contain proven malicious domains, and thus there is some validity to a certain
amount of “guilt by association”. This idea of the reputation of an ISP is something that we
are looking to explore further, and something that has already made the news on a few
blogs and elicited a response from a known German ISP, which appeared near the
top of this list. All this to say, reputation-based blocking lacks granularity. For cases
where a site such as Google is delivering malicious content, the IPS signature
engine is much more adept. For the less obvious cases, compared to Google anyway,
a mixture of reputation and filter technologies proves promising, as we shall explore
later.


Observations (continued)

A corollary to this is perhaps that vendors can artificially inflate their reputation lists by
including large numbers of addresses that have a low probability of causing business
interruption, but are not necessarily malicious. However, we believe this approach to be
specious at best, citing the example where an online retailer may be deploying a blacklist
and blocking these hosts results in loss of revenue. In fact, it’s conceivable that
the retailer in this scenario wouldn’t require a squeaky clean bill of reputation health in
order to do business with certain potential customers, and this use case helps underscore
the flexibility of our approach.


Observations

After an initial period of time to learn new entries, we see that the rate of new DNS entries is relatively flat.
In fact, it doesn’t take long to discover that a relatively small percentage of the Internet is actively queried.
Given a large ISP to sample from, we see that this list converges fairly rapidly. Furthermore, we would
expect this convergence in smaller networks given additional time. Anomalous deviation from this trend is
often a sign of malicious activity.
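
One way to picture this is to count never-seen-before names per sampling interval and flag intervals that jump well above the settled baseline. The sketch below is illustrative only: the length of the learning period, the three-sigma limit and the floor on the deviation are assumed values, not parameters of our system.

```python
from statistics import mean, pstdev


def new_entries_per_interval(intervals):
    """For each interval (an iterable of queried names), count names not seen in any earlier interval."""
    seen = set()
    counts = []
    for names in intervals:
        fresh = set(names) - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts


def anomalous_intervals(counts, learning=2, min_history=3, k=3.0):
    """Return indexes of intervals whose new-entry count exceeds the post-learning baseline.

    The first `learning` intervals are ignored, and at least `min_history`
    baseline intervals are required before anything is flagged.
    """
    flagged = []
    for i in range(learning + min_history, len(counts)):
        baseline = counts[learning:i]
        # Floor the deviation at 1.0 so a perfectly flat baseline does not flag tiny changes.
        limit = mean(baseline) + k * max(pstdev(baseline), 1.0)
        if counts[i] > limit:
            flagged.append(i)
    return flagged
```

Given counts such as `[500, 120, 10, 12, 9, 11, 10, 90, 10]`, the initial burst is treated as learning and only the interval with 90 new names is flagged. Note that a flagged interval still enters the baseline for later intervals in this simple version.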

A notable exception is content distribution networks, such as Akamai. It’s interesting to note that these behave
very similarly to fast-flux networks. The number of unique IP addresses keeps pace with the number of new,
unique domains. Furthermore, each new DNS entry is a new IP and a new child domain. The reason for this
is in fact very similar to why fast-flux networks behave in this fashion, namely geographically distributed high availability. In
the malicious case, the new entries are compromised hosts, which again have a new IP address in a
different topological location in the network and this host is given a new domain name for tracking
purposes.


Observations (continued)

On the opposite side of the spectrum, we note that popular sites tend to use a relatively
small number of IP addresses and have a large number of associated domain names.
Obvious reasons for this include schemes such as virtual hosting and less obvious reasons
point to the fact that more sites are encoding information into the domain name itself.
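
The contrast between these two ends of the spectrum can be expressed as the ratio of unique IP addresses to unique names under a parent domain. The sketch below is a rough illustration: taking the last two labels as the parent domain is a simplifying assumption (a real system would consult the Public Suffix List), and the function names are hypothetical.

```python
from collections import defaultdict


def parent_domain(name: str) -> str:
    """Naive parent domain: the last two labels of the name."""
    return ".".join(name.rstrip(".").lower().split(".")[-2:])


def ip_to_name_ratios(observations):
    """Map each parent domain to (unique names, unique IPs, IPs per name).

    `observations` is an iterable of (dns_name, ip_address) pairs.
    """
    names = defaultdict(set)
    ips = defaultdict(set)
    for name, ip in observations:
        parent = parent_domain(name)
        names[parent].add(name.rstrip(".").lower())
        ips[parent].add(ip)
    return {
        parent: (len(names[parent]), len(ips[parent]), len(ips[parent]) / len(names[parent]))
        for parent in names
    }
```

A ratio near 1.0 across many names matches the CDN and fast-flux pattern described earlier (each new name arrives with a new address), while a ratio well below 1.0 matches virtual hosting. The ratio alone does not separate a CDN from a fast-flux network; it only narrows down what to examine further.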

Finally, Dynamic DNS and publicly routed DHCP networks form a very interesting study
in and of themselves. Yet the observations that the IP address is often encoded into the
DNS entry itself, as well as the one to one relationship between dynamic DNS names and
IP addresses, make identification and tracking much more tractable than it seems at first.
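
As an illustration of the first observation, an IPv4 address embedded in a host name can often be recovered with a simple pattern. The naming schemes handled here (octets joined by dashes or dots, in forward or reversed order) are assumptions about common provider conventions, and the example host names are made up.

```python
import re

# Four groups of 1-3 digits joined by '-' or '.', not adjacent to other digits.
_OCTETS = re.compile(r"(?<!\d)(\d{1,3})[-.](\d{1,3})[-.](\d{1,3})[-.](\d{1,3})(?!\d)")


def embedded_ipv4(hostname: str):
    """Return the first IPv4 address embedded in the host name as a dotted quad, or None."""
    for match in _OCTETS.finditer(hostname):
        octets = [int(group) for group in match.groups()]
        if all(octet <= 255 for octet in octets):
            return ".".join(str(octet) for octet in octets)
    return None


def name_matches_address(hostname: str, address: str) -> bool:
    """True if the host name embeds the address in forward or reversed octet order."""
    found = embedded_ipv4(hostname)
    if found is None:
        return False
    return address in (found, ".".join(reversed(found.split("."))))
```

A name such as `pool-192-0-2-17.dyn.example.net` yields `192.0.2.17`, which can then be checked against the address the name actually resolves to.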


Observations

In the case where attackers do not reuse domain names and address space, always
procuring new resources, reputation becomes difficult. This becomes much more
problematic as the shift towards IPv6 occurs.


Work In Progress

Combining reputation information with filters into a hybrid policy model allows increased
performance and accuracy of the overall security solution. For example, imagine being
able to push a policy that simply states: “Block all compound document types originating
from China” or instruct a filter that might not be “recommended on” in the default
configuration to block only if the host has a reputation score below a given threshold. This
additional information allows customers to justify a more aggressive, and thus effective,
security policy.
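
The second example can be sketched as a small decision function. This is a hypothetical model rather than the shipping policy engine: the rule fields and action names are invented, and the code follows this slide's wording in treating a lower reputation score as worse.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HybridRule:
    filter_name: str
    enabled_by_default: bool
    # If set, a filter that is off by default still blocks hosts whose
    # reputation score is below this value (assumption: lower is worse).
    reputation_below: Optional[int] = None


def decide(rule: HybridRule, filter_matched: bool, reputation: Optional[int]) -> str:
    """Return "block" or "permit" for one flow evaluated against one rule."""
    if not filter_matched:
        return "permit"
    if rule.enabled_by_default:
        return "block"
    if (
        rule.reputation_below is not None
        and reputation is not None
        and reputation < rule.reputation_below
    ):
        return "block"
    return "permit"
```

Under these assumptions, a filter that is too aggressive to enable for all traffic blocks only traffic from hosts that already look suspect, and hosts with no reputation data fall back to the default behavior.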


Work In Progress

We believe that we can not only offer protection for the subscribers and consumers of
cloud computing and large data center deployments, but have the unique capability to
protect the reputation of the services themselves by vetting outbound traffic and thereby
bringing to market a significant differentiator in this rapidly emerging space.


Work In Progress

Dynamically decide reputation of never-seen-before hosts by moving from historical and
statistical evaluation to predictive, dynamic methods.


References

[1] Jamie Riden, 2008, “The Honeynet Project: How Fast-Flux Service Networks Work”,
http://www.honeynet.org/node/132

[2] Michael K. Daly, 2009, “Advanced Persistent Threat (or Informationized Force
Operations)”, http://www.usenix.org/event/lisa09/tech/slides/daly.pdf

[3] Corinna Cortes and V. Vapnik, 1995, “Support-Vector Networks”,
http://www.springerlink.com/content/k238jx04hm87j80g/


Q&A

Evaluation accounts are available that you can use to check out the system, punch in your
own address range, etc. A couple of publicly available whitepapers are also available.

