Master’s Thesis
Automatically assessing exposure to known security
vulnerabilities in third-party dependencies
Edward M. Poot
edwardmp@gmail.com
July 2016, 55 pages
Supervisors: dr. Magiel Bruntink
Host organisation: Software Improvement Group, https://www.sig.eu
Universiteit van Amsterdam
Faculteit der Natuurwetenschappen, Wiskunde en Informatica
Master Software Engineering
http://www.software-engineering-amsterdam.nl
Abstract
Up to 80 percent of code in modern software systems originates from the third-party components used by a
system. Software systems incorporate these third-party components (’dependencies’) to preclude reinventing
the wheel when common or generic functionality is needed. For example, Java systems often incorporate
logging libraries like the popular Log4j library. Usage of such components is not without risk; third-party
software dependencies frequently expose host systems to their vulnerabilities, such as the ones listed in
publicly accessible CVE (vulnerability) databases. Yet, a system’s dependencies are often not updated to
versions that are known to be immune to these vulnerabilities. When a dependency is not updated promptly
after a vulnerability is disclosed, persons with malicious intent may try to compromise the system. Tools
such as Shodan∗ have emerged that can identify servers running a specific version of a vulnerable component,
for instance version 4.2 of the Jetty web server†, which is known to be vulnerable‡. Once a vulnerability is
disclosed publicly, finding vulnerable systems is trivial using such tooling. This risk is often overlooked by
the maintainers of a system. In 2011, researchers discovered that 37% of the 1,261 versions of 31 popular
libraries studied contained at least one known vulnerability.
Tooling that continuously scans a system’s dependencies for known vulnerabilities can help mitigate this
risk. One such tool, the Vulnerability Alert Service (’VAS’), has already been developed and is in active use at the
Software Improvement Group (’SIG’) in Amsterdam. The vulnerability reports generated by this tool are generally
considered helpful but there are limitations to the current tool. VAS does not report whether the vulnerable
parts of the dependency are actually used or potentially invoked by the system; VAS only reports whether
a vulnerable version of a dependency is used but not the extent to which this vulnerability can actually be
exploited in a system.
Links to a specific Version Control System revision (’commit’) of a system’s code-base are frequently
included in so-called CVE entries. CVE entries are bundles of meta-data related to a specific software
vulnerability that has been disclosed. Using this information, the methods whose implementations have been
changed can be determined by looking at the changes contained within a commit. These changes reveal
which methods were involved in the conception of the vulnerability; these methods are assumed to contain
the vulnerability. By tracing which of these vulnerable methods are invoked directly or indirectly by the
system, we can determine the actual exposure to a vulnerability. The purpose of this thesis is to develop a
proof-of-concept tool that incorporates such an approach to assessing the exposure to known vulnerabilities.
As a final step, the usefulness of the prototype tool is validated: we first use the tool in the context of SIG
and then determine to what extent the results can be generalized to other contexts. We will argue why tools
like the one proposed can be useful in multiple contexts.
Keywords: software vulnerability, vulnerability detection, known vulnerabilities in dependencies, CVE, CPE,
CPE matching, call graph analysis
∗https://www.shodan.io
†https://www.shodan.io/search?query=jetty+4.2
‡https://www.cvedetails.com/cve/CVE-2004-2478
Contents
1 Introduction 1
1.1 Problem analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Research questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.5 Research method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.6 Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.7 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 Related work 7
2.1 Tracking Known Security Vulnerabilities in Proprietary Software Systems . . . . . . . . . . 7
2.2 Tracking known security vulnerabilities in third-party components . . . . . . . . . . . . . . 8
2.3 The Unfortunate Reality of Insecure Libraries . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.4 Impact assessment for vulnerabilities in open-source software libraries . . . . . . . . . . . . 9
2.5 Measuring Dependency Freshness in Software Systems . . . . . . . . . . . . . . . . . . . . . 10
2.6 Monitoring Software Vulnerabilities through Social Networks Analysis . . . . . . . . . . . . 10
2.7 An Analysis of Dependence on Third-party Libraries in Open Source and Proprietary Systems 11
2.8 Exploring Risks in the Usage of Third-Party Libraries . . . . . . . . . . . . . . . . . . . . . . 12
2.9 Measuring Software Library Stability Through Historical Version Analysis . . . . . . . . . . 12
2.10 An Empirical Analysis of Exploitation Attempts based on Vulnerabilities in Open Source Software . . . 13
2.11 Understanding API Usage to Support Informed Decision Making in Software Maintenance . 13
3 Research method 15
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.2 Client helper cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2.1 Problem investigation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.2.2 Treatment design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2.3 Design validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2.4 Implementation and Implementation evaluation . . . . . . . . . . . . . . . . . . . . . 17
3.3 Research cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3.1 Research problem investigation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3.2 Research design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3.3 Research design validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3.4 Analysis of results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.4 Design cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.4.1 Problem investigation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.4.2 Artifact design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.4.3 Design validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.4.4 Implementation and Implementation evaluation . . . . . . . . . . . . . . . . . . . . . 19
4 Designing a proof of concept tool 20
4.1 Research context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.2 High-level overview tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
CONTENTS
4.2.1 Gathering and downloading dependencies of a system . . . . . . . . . . . . . . . . . 21
4.2.2 Gathering CVE data relevant to included dependencies . . . . . . . . . . . . . . . . . 21
4.2.3 Establishing vulnerable methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
4.2.4 Ascertaining which library methods are invoked . . . . . . . . . . . . . . . . . . . . 22
4.2.5 Identifying vulnerable methods that are invoked . . . . . . . . . . . . . . . . . . . . 22
4.3 Detailed approach for automatically assessing exposure to known vulnerabilities . . . . . . 22
4.3.1 Determining vulnerable methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.3.2 Extracting dependency information . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.3.3 Creating a call graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.3.4 Determining actual exposure to vulnerable methods . . . . . . . . . . . . . . . . . . 29
4.3.5 External interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
5 Evaluation 32
5.1 Conducting analysis on client projects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.1.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.1.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.1.3 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.2 Finding known vulnerabilities without using CVE databases . . . . . . . . . . . . . . . . . . 35
5.2.1 Implementing retrieval of data from another source . . . . . . . . . . . . . . . . . . . 35
5.2.2 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
5.2.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.2.4 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.3 Finding vulnerabilities through GitHub that are not listed in CVE databases . . . . . . . . . 41
5.3.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
5.3.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.3.3 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.3.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.4 Evaluating usefulness with security consultants . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.4.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.4.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.4.3 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.5 Reflection on usefulness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5.5.1 Result analysis research cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5.5.2 Implementation evaluation of the design cycle . . . . . . . . . . . . . . . . . . . . . . 48
5.6 Threats to validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
5.6.1 Conclusion validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5.6.2 Construct validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5.6.3 External validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
6 Conclusion and future work 50
6.1 Answering the research questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
6.1.1 To what extent is it possible to automatically determine whether vulnerable code in
dependencies can potentially be executed? . . . . . . . . . . . . . . . . . . . . . . . . 50
6.1.2 How can we generalize the usefulness of the prototype tool based on its usefulness
in the SIG context? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
6.2 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Bibliography 53
Acronyms 55
Preface
Before you lies the result of five months of hard work. Although I am the one credited for this work, this
thesis could not have been produced without the help of several people.
First of all I would like to thank Mircea Cadariu for his reflections on the research direction I should
pursue. My gratitude goes out to Theodoor Scholte for his input on the tool I developed. I would also like
to acknowledge Reinier Vis for connecting me with the right persons. Special thanks to Marina Stojanovski,
Sanne Brinkhorst and Brenda Langedijk for participating in interviews or facilitating them. I want to give a
shout-out to Wander Grevink for setting up the technical infrastructure used during my research.
I sincerely appreciate the advice and guidance of my supervisor Magiel Bruntink during this period. Fur-
thermore, I would like to express my gratitude to everyone in the research department at Software Im-
provement Group (SIG) for their input: Xander Schrijen, Haiyun Xu, Bárbara Vieira and Cuiting Chen. I
would also like to thank all the other interns at SIG for their companionship during this period.
Finally, I would like to thank everybody else at SIG for providing me with the opportunity to write my
thesis here.
Edward Poot
Amsterdam, The Netherlands
July 2016
Chapter 1
Introduction
1.1 Problem analysis
In April 2014, the cyber-security community learned of a security vulnerability unprecedented in scale
and severity. The vulnerability, quickly dubbed ’Heartbleed’, was found in OpenSSL, a popular
cryptography library that implements the Transport Layer Security (TLS) protocol. OpenSSL is incorporated
in widely used web-server software like Apache, which powers a large share of the websites found on the
internet today. The library is also used by thousands of other systems requiring cryptographic functionality.
After the disclosure of this vulnerability, security researchers identified at least 600,000 systems connected
to the public Internet that were exploitable due to this vulnerability1. This specific security incident makes
it painfully clear that there is a shadow side to the use of open-source software. The widespread adoption of
open-source software has made such systems easy targets: once a vulnerability is disclosed, it can be trivial
for malicious actors to exploit thousands of affected systems.
Contrary to popular belief, analysis done by Ransbotham (2010) corroborates that, compared to proprietary
systems, open source systems have a greater risk of exploitation, exploits diffuse earlier and more widely,
and they experience a greater overall volume of exploitation attempts. The OWASP Top Ten2 lists the most
commonly occurring security flaws in software systems; using components with known vulnerabilities is
listed as number nine in the 2013 edition. The emergence of dependency management tools has caused a
significant increase in the number of libraries involved in a typical application. In a report by Williams and
Dabirsiaghi (2012), in which the prevalence of using vulnerable libraries is investigated, it is recommended
that systems and processes for monitoring the usage of libraries be established.
The SIG analyses the maintainability of clients’ software systems and certifies systems to assess the long-
term maintainability of such systems. Security is generally considered to be related to the maintainability of
the system. Use of outdated dependencies with known vulnerabilities provides a strong hint that maintain-
ability is not a top priority in the system. Furthermore, IT security is one of the main themes of the work SIG
fulfills for its clients. The systems of SIG’s clients typically depend on third-party components for common
functionality. However, as indicated before this is not without risk. In security-critical applications, such as
banking systems, it is crucial to minimize the time between the disclosure of the vulnerability and the appli-
cation of a patch to fix the vulnerability. Given the increasing number of dependencies used by applications,
this can only be achieved by employing dedicated tooling.
In 2014, an intern at SIG, Mircea Cadariu (see Cadariu (2014); Cadariu et al. (2015)), modified an existing
tool as part of his master’s thesis so that it could scan the dependencies of a system for vulnerabilities. The
tool was extended to index Project Object Model (POM)3 files, in which the dependencies of a system
are declared when the Maven dependency management system is used. Interviews with consultants at SIG
revealed that they typically considered the vulnerability reports useful, even though false positives
were frequently reported. The interviewees mentioned that they would typically consider whether the
vulnerability description could be linked to functionality in dependencies that the client uses. However,
a consultant may mistakenly conclude that the vulnerable code is never executed, since this kind of manual
1http://blog.erratasec.com/2014/04/600000-servers-vulnerable-to-heartbleed.html
2https://www.owasp.org/index.php/Top_10_2013-Top_10
3https://maven.apache.org/pom.html
verification is prone to human error. Furthermore, the need for manual verification by humans means that
the disclosure of a critical and imminent threat to the client may be delayed. We propose to create a prototype
tool that will automatically indicate the usage of vulnerable functionality.
Plate et al. (2015) have published a paper in which a technique is proposed to identify vulnerable code in
dependencies based on references to Common Vulnerabilities and Exposures (CVE) identifiers in the commit
messages of a dependency. CVE identifiers are assigned to specific vulnerabilities when they are disclosed.
The issue with this approach was that CVE identifiers were rarely referenced in commit messages, at least not
structurally. In addition, manual effort was required to match Version Control System (VCS) repositories to
specific dependencies. Moreover, Plate et al. (2015) indicate that even once a vulnerability is confirmed to be
present in one of a system’s dependencies, the dependency is often still not updated to mitigate the risk of
exposure. In the enterprise context this can be attributed to the fact that such systems are presumed to be
mission-critical, so downtime has to be minimized, and to the belief that updating will introduce new issues.
Because of such beliefs there is a need to carefully assess whether a system requires an urgent patch to avert
exposure to a vulnerability or whether the patch can be applied during the application’s regular release cycle:
a vulnerability that is actually exploitable and can be used to compromise the integrity of the system requires
immediate intervention, while updating a library with a known vulnerability in unused parts can usually
be postponed.
Bouwers et al. (2015) state that prioritizing dependency updates proves to be difficult because the use
of outdated dependencies is often opaque. The authors have devised a metric (’dependency freshness’) to
indicate whether recent versions of dependencies are generally used in a specific system. After calculating
this metric for 75 systems, the authors conclude that only 16.7% of the dependencies incorporated in systems
display no update lag at all. The large majority (64.1%) of the dependencies used in a system show an update
lag of over 365 days, with a tail of up to 8 years. Overall, the authors determine that it is not common practice
in most systems to update dependencies on a regular basis. They also discover that the freshness rating
correlates negatively with the number of dependencies that contain known security vulnerabilities: systems
with a high median dependency freshness rating have fewer dependencies with reported security
vulnerabilities, and vice versa. However, these metrics do not take into account how the dependency is
actually used by the system. The tool we propose would be able to justify the urgency of updating
dependencies by showing that a system is actually vulnerable; the risk of using outdated dependencies is no
longer opaque.
Raemaekers et al. (2011) sought to assess the frequency of use of third-party libraries in both proprietary
and open source systems. Using this information, a rating is derived based on the frequency of use of partic-
ular libraries and on the dependence on third-party libraries in a software system. This rating can be used
to indicate the exposure to potential security risks introduced by these libraries. Raemaekers et al. (2012a)
continue this inquiry in another paper, the goal of which was to explore to what extent risks involved in the
use of third-party libraries can be assessed automatically. The authors hypothesize that risks in the usage of
third party libraries are influenced by the way a given system is using a specific library. They do not rely on
CVE information but the study does look at Application Programming Interface (API) usage as an indicator
of risk.
We can conclude from the existing literature reviewed that vulnerabilities introduced in a system by its
dependencies are a prevalent threat in today’s technological landscape. Various tools have been developed
aiming to tackle this problem. However, a tool that tries to determine the actual usage of the API units
introducing the vulnerable behavior is currently lacking to our knowledge. Therefore, the problem we seek
to solve is assessing how we can automatically determine actual exposure to vulnerabilities introduced by a
system’s dependencies rather than hypothetical exposure alone. A proof-of-concept tool will be created to
indicate the feasibility of this approach. We will evaluate this tool in the context of our host company (SIG).
Furthermore, we will generalize the usefulness of a tool featuring such functionality in multiple contexts.
1.2 Research questions
Research question 1 To what extent is it possible to automatically determine whether vulnerable code
in dependencies can potentially be executed?
– How can we retrieve all CVEs relevant to a specific dependency?
– How can we determine which methods of a dependency are called directly or indirectly?
– How do we determine which code was changed to fix a CVE?
– How can we validate the correctness of the prototype tool we will design?
Research question 2 How can we generalize the usefulness of the prototype tool based on its usefulness
in the SIG context?
– In what ways can the tool implementing the aforementioned technique be exploited in useful ways
at SIG?
– In what ways is the SIG use case similar to other cases?
1.3 Definitions
First, we will establish some common vocabulary that will be used in the remainder of this thesis. An overview
of the acronyms we use is also provided at the end of this thesis.
Software vulnerabilities According to the Internet Engineering Task Force (IETF)4
a software vulnerabil-
ity is defined to be: “a flaw or weakness in a system’s design, implementation, or operation and management
that could be exploited to violate the system’s security policy”. For the purpose of this thesis, we are primarily
concerned with known vulnerabilities. These are vulnerabilities that have been disclosed in the past through
some public channel.
CVE CVE is the abbreviated form of the term Common Vulnerabilities and Exposures. Depending on the
context, it can have a slightly different meaning, but in all circumstances CVE relates to known security
vulnerabilities in software systems.
First of all, CVE can refer to an identifier assigned to a specific security vulnerability. When a
vulnerability is disclosed, it is assigned an identifier of the form “CVE-YYYY-1234”: the CVE prefix, followed
by the year in which the vulnerability was discovered, followed by a number that is unique among the
vulnerabilities discovered that year. This identifier serves as a mechanism through
which different information sources can refer to the same vulnerability.
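To make the scheme concrete, the identifier format described above can be parsed with a simple regular expression. The sketch below is our own illustration, not part of any existing CVE tooling; note that since 2014 the sequence number may have more than four digits:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CveId {
    // CVE identifiers have the form "CVE-YYYY-NNNN"; the sequence number
    // may exceed four digits for recent years, hence {4,}.
    private static final Pattern CVE = Pattern.compile("CVE-(\\d{4})-(\\d{4,})");

    /** Returns the year component of a CVE identifier. */
    public static int yearOf(String id) {
        Matcher m = CVE.matcher(id);
        if (!m.matches()) throw new IllegalArgumentException("not a CVE id: " + id);
        return Integer.parseInt(m.group(1));
    }

    public static void main(String[] args) {
        System.out.println(yearOf("CVE-2014-0160")); // Heartbleed -> 2014
    }
}
```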
Secondly, a CVE can refer to a bundle of meta-data related to a vulnerability identified by a CVE identifier,
something to which we will refer as CVE entry. For instance, a score indicating the severity of vulnerability
(“CVSS”) is assigned as well as a description indicating how the vulnerability manifests. Moreover, a list
of references is attached, which basically is a collection of links to other sources that have supplementary
information on a specific vulnerability.
Finally, CVE is sometimes used synonymously with the databases containing the CVE entries. This is
something we will refer to as CVE databases from now on. The National Vulnerability Database (NVD) is a
specific database that we will use.
CPE CPE is an acronym for Common Platform Enumeration. One or more CPEs can be found in a CVE
entry. CPEs are identifiers that identify the platforms affected by a specific vulnerability.
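As an illustration of what such matching could look like, the sketch below parses a CPE 2.2 URI of the form `cpe:/a:vendor:product:version` and naively compares it against a dependency's artifact name and version. This is our own simplified example, not the matching logic of VAS or any existing tool; real matchers must handle the naming ambiguities discussed later:

```java
public class CpeMatch {
    // A CPE 2.2 URI looks like "cpe:/a:vendor:product:version",
    // e.g. "cpe:/a:apache:log4j:1.2.17". This naive matcher checks whether
    // a dependency's artifactId and version appear as the CPE's product
    // and version components.
    static boolean matches(String cpeUri, String artifactId, String version) {
        String[] parts = cpeUri.split(":");
        if (parts.length < 5) return false;           // no version component
        String product = parts[3];
        String cpeVersion = parts[4];
        return product.equalsIgnoreCase(artifactId) && cpeVersion.equals(version);
    }

    public static void main(String[] args) {
        System.out.println(matches("cpe:/a:apache:log4j:1.2.17", "log4j", "1.2.17")); // true
    }
}
```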
VCS VCS is an abbreviation for Version Control System. This refers to a class of systems used to track
changes in source code over time. Version Control Systems use the notion of revisions: the initial source
code that is added is revision one; after the first change is made, the code is at revision two, and so on.
As of 2016, the most popular VCS is Git. Git is a distributed VCS, in which the source code may be dispersed
over multiple locations. Git has the concept of repositories, in which such a copy of the source code is stored.
The website GitHub is currently the most popular platform for hosting these repositories.
In Git, revisions are called commits. Moreover, Git and GitHub introduce other meta-data concepts such
as tags and pull requests respectively. We will commonly refer to such pieces of meta-data as VCS artifacts.
GitHub also introduces the notion of issues, through which problems related to a system can be discussed.
4https://tools.ietf.org/html/rfc2828
Dependencies Software systems often incorporate third-party libraries that provide common functionality,
precluding the need to develop such functionality in-house and thereby reinvent the wheel. The advantages
of using such libraries include shortened development times and cost savings due to not having to develop
and maintain such components.
Since a system now depends on these libraries to function, we call these libraries the dependencies of
a system. New versions of libraries containing bug-fixes and security improvements may be released by
the maintainers. To aid in the process of keeping these dependencies up-to-date, dependency management
systems have emerged. One of the most popular dependency management systems is Maven, a dependency
management system for applications written in the Java programming language. In Maven, the dependencies
are declared in an XML file referred to as the Project Object Model file, or POM file in short.
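For illustration, a minimal POM fragment declaring Log4j as a dependency might look as follows (coordinates and version chosen for the example):

```xml
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>demo-app</artifactId>
  <version>1.0.0</version>
  <dependencies>
    <!-- each dependency is identified by groupId, artifactId and version -->
    <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>1.2.17</version>
    </dependency>
  </dependencies>
</project>
```

The groupId/artifactId/version triple in such a file is exactly the information a vulnerability scanner must map onto CPE identifiers.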
1.4 Assumptions
Based on initial analysis conducted, we have established the following assumptions about known security
vulnerabilities:
Assumption 1 It is becoming increasingly likely that CVE entries refer to VCS artifacts.
Assumption 2 The commits referred to in CVE entries contain the fix for the vulnerability.
Assumption 3 The methods whose implementation has been changed as indicated by the commit contain
the fix for a vulnerability.
We will substantiate each assumption in the following paragraphs.
It is becoming increasingly likely that CVE entries contain references to VCS artifacts The
approach we envision for assessing the actual exposure to vulnerabilities relies heavily on the presence of VCS
references in CVE entries. The percentage of CVEs having at least one VCS reference is still quite low (6.48%
to be precise5
) but over the years we observe a positive trend. Figure 1.1 provides a graphical depiction of this
trend. With the notable exception of the year 2015, the absolute number of CVE entries having at least one
VCS reference increases year over year. The year 2015 probably deviates from this trend simply because
the absolute number of CVEs in that year is lower than in other years.
Figure 1.1: The absolute number of CVE in the NVD database having at least one VCS reference increases
almost every year.
5Relative to all CVE entries in the NVD database
The commits referred to in CVE entries contain the fix for the vulnerability Based on manual
examination of several CVE entries, it appears that when there is a reference to a commit or other VCS artifact,
the code changes included in that commit encompass the fix for the vulnerability. There are corner cases
where this does not apply; we already encountered a commit link that referred to an updated change-log file
stating that the problem was solved, rather than to the actual code change remedying the problem. This does
not matter in our case, since we only take source code into account.
The methods whose implementation has been changed as indicated by the commit contain the fix
for a vulnerability We have analyzed a number of patches. Typically, when a vulnerability is disclosed
publicly, only certain method implementations are changed to fix it. A helpful illustration is the
commit containing the fix for the now-infamous Heartbleed vulnerability (CVE-2014-0160) in the OpenSSL
library mentioned at the beginning of this chapter. Investigating the related CVE entry, we observe that
there is indeed a link to the commit containing the fix, as expected. Looking at the modifications in the
respective commit6
we can observe that, apart from added comments, only a single method implementation
was changed: the one containing the fix for the vulnerability.
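Assumption 3 suggests a heuristic: walk the unified diff of the fixing commit and record which method declaration encloses each changed line. The sketch below is a heavily simplified illustration of our own (it recognizes only straightforward Java-style method signatures via a regular expression; a real implementation would parse the AST):

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Pattern;

public class DiffMethods {
    // Very rough pattern for a Java method declaration, e.g.
    // "public int processHeartbeat(byte[] payload)". Real tools parse the AST.
    private static final Pattern METHOD =
        Pattern.compile(".*\\b(?:public|protected|private|static)\\b.*\\b(\\w+)\\s*\\(.*");

    // Walks a unified diff: remembers the most recent method declaration seen
    // (in context or changed lines) and marks it as vulnerable whenever an
    // added/removed line follows it.
    static Set<String> changedMethods(String unifiedDiff) {
        Set<String> methods = new LinkedHashSet<>();
        String current = null;
        for (String line : unifiedDiff.split("\n")) {
            String body = line.isEmpty() ? "" : line.substring(1); // strip diff marker
            var m = METHOD.matcher(body);
            if (m.matches()) current = m.group(1);
            boolean isChange = (line.startsWith("+") || line.startsWith("-"))
                    && !line.startsWith("+++") && !line.startsWith("---");
            if (isChange && current != null) methods.add(current);
        }
        return methods;
    }
}
```

A usage sketch: feeding this the diff of a fixing commit yields the set of method names that Assumption 3 marks as vulnerable.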
1.5 Research method
We will employ Action Research to evaluate the usefulness of a prototype tool that can automatically assess
exposure to known vulnerabilities. More specifically, we employ Technical Action Research (TAR); our
instantiation of TAR is presented in Chapter 3. Action Research is a form of research in which researchers seek
to combine theory and practice (Moody et al., 2002; Sjøberg et al., 2007). The tool will be created in the context
of our host company, the Software Improvement Group (SIG), located in Amsterdam. First, the usefulness of
such a tool is determined in the context of this company; later on, we try to identify the components
that contribute to this perceived usefulness and hypothesize whether they would also contribute to
usefulness in other contexts. During the initial study of the prototype tool's usefulness in the context of the
host organization, potential problems threatening the usefulness of the tool can be solved.
1.6 Complexity
There are a lot of moving parts involved in the construction of the prototype tool that need to be carefully
aligned to obtain meaningful results. These complexities include working with a multitude of vulnerability
sources and third-party libraries. We need to interact with local and remote Git repositories, retrieve infor-
mation using the GitHub API, invoke Maven commands programmatically, conduct call graph analysis, work
with existing vulnerability sources and parse source code.
Limitations of using CVEs CVE databases can be used, but they are known to have certain limitations.
A limitation we are aware of is that the correct matching between information extracted from dependency
management systems and CPE identifiers is not always possible due to ambiguities in naming conventions.
Heuristics can be employed to overcome some of these limitations.
Working with APIs of GitHub/Git We could use the GitHub API to retrieve patches included in a specific
commit. However, not all open-source dependencies use GitHub; they may also serve Git through private
servers. Fortunately, we can also clone a remote repository locally using JGit7
to obtain patch information.
In addition, the GitHub API for issues can be used to obtain other metadata that could be of interest to us.
Call graph analysis Once we have retrieved the relevant patches for our library and derived a list of
methods that are expected to be vulnerable, we need to determine if these methods are executed directly or
indirectly by the parent system. This can be achieved using a technique known as call graph analysis. Call graph analysis tools are available for virtually any programming language, and there is a large body of research describing the currently used methods, both static and dynamic analysis, in detail.
6https://git.openssl.org/gitweb/?p=openssl.git;a=commitdiff;h=96db902
7https://eclipse.org/jgit
Also, we need to know the limitations of these tools. All call graph tools identified for Java have issues in
processing source code as opposed to JAR files containing bytecode. Therefore, a different method needs
to be devised to trace the initial method call within a system’s source code to a library method. Based on
evaluating various tools to generate call graphs, we expect that we can reliably determine this under normal circumstances. By normal circumstances we mean cases that do not rely on reflection, since method invocations made through reflection are usually not traced by call graph libraries. Nonetheless, we do not expect systems to make extensive use of reflection when interacting with third-party libraries.
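The reachability question above can be sketched as a simple graph search. The sketch below assumes the call graph has already been extracted by some external tool and is represented as a map from caller to callees; all method names are hypothetical.

```java
import java.util.*;

// Minimal sketch: given a call graph (caller -> callees), determine whether any
// entry point of the parent system reaches a method flagged as vulnerable.
public class ReachabilityCheck {

    // Breadth-first search from the system's entry points through the call graph.
    static boolean reachesVulnerableMethod(Map<String, List<String>> callGraph,
                                           Set<String> entryPoints,
                                           Set<String> vulnerableMethods) {
        Deque<String> work = new ArrayDeque<>(entryPoints);
        Set<String> visited = new HashSet<>();
        while (!work.isEmpty()) {
            String method = work.pop();
            if (!visited.add(method)) continue;          // already explored
            if (vulnerableMethods.contains(method)) return true;
            for (String callee : callGraph.getOrDefault(method, List.of())) {
                work.push(callee);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical call graph: main -> service.handle -> library.parse
        Map<String, List<String>> graph = Map.of(
            "app.Main.main", List.of("app.Service.handle"),
            "app.Service.handle", List.of("lib.Parser.parse"));
        boolean exposed = reachesVulnerableMethod(
            graph, Set.of("app.Main.main"), Set.of("lib.Parser.parse"));
        System.out.println("exposed=" + exposed);        // exposed=true
    }
}
```

Note that this sketch only answers the boolean reachability question; the actual analysis must also cope with the bytecode-versus-source issue discussed above.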
1.7 Outline
The rest of this thesis is structured as follows. We will first examine related work. This is followed by explain-
ing our instantiation of TAR. Then, we will describe both the high-level design and low-level implementation
of our prototype tool. This is followed by an evaluation of the usefulness of the tool. Finally, we will answer
the research questions in the conclusion.
Chapter 2
Related work
In this chapter we will review related work on the topic of known vulnerabilities in third-party components.
The goal of the chapter is to provide insight into the prevalence of the problem and the research that has
been conducted related to this topic so far.
2.1 Tracking Known Security Vulnerabilities in Proprietary Soft-
ware Systems
Cadariu et al. (2015)
Software systems are often prone to security vulnerabilities introduced by their third-party components. It is therefore crucial that these components are kept up to date, which requires early warnings when new vulnerabilities in those dependencies are disclosed, allowing appropriate action to be taken.
A high-level description of an approach for creating a tool that provides such early warnings is given. In modern build environments, dependency managers, such as Maven for Java projects, are used. These tools read the set of dependencies to be included from a structured XML file; for Maven systems this is called the POM file. This file can then be used to gather a list
of dependencies used by the project, as opposed to other strategies, such as looking at import statements in
Java code. This approach can easily be extended for dependency managers in other programming languages
that use similar configuration files, such as Python (PyPi), Node.js (NPM), PHP (composer) and Ruby (Gems).
As a source of vulnerability data, existing CVE databases are used. Common Platform Enumeration (CPE)
identifiers contained within CVE reports uniquely identify affected platforms.
An existing system, OWASP Dependency Check, that already features some requested functionality is
employed and extended to support retrieving dependencies from POM files.
A matching mechanism is devised to match dependency names retrieved from Maven with CPE identifiers.
For example, a specific Maven dependency can be identified as “org.mortbay.jetty:jetty:6.1.20” and the CPE is
“cpe:/a:mortbay:jetty:6.1.20”. False positive and false negative rates are determined by calculating precision and recall over a random sample of 50 matches, judging for each whether the match is relevant. Precision is quite low (14%), while recall is higher (80%).
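The kind of matching heuristic involved can be illustrated with a small sketch. The transformation rule below is an assumption inferred from the single example above, not the actual algorithm used by the authors; real-world naming ambiguities are exactly what makes precision so low.

```java
// Illustrative sketch (not the paper's actual algorithm): a naive heuristic that
// derives a candidate CPE identifier from Maven coordinates. The rule below,
// dropping a leading "org."/"com."/"net." prefix and a trailing artifact-id
// segment of the group id, is an assumption for illustration only.
public class CpeHeuristic {

    static String toCandidateCpe(String mavenCoordinate) {
        String[] parts = mavenCoordinate.split(":");     // groupId:artifactId:version
        String groupId = parts[0], artifactId = parts[1], version = parts[2];
        String vendor = groupId.replaceFirst("^(org|com|net)\\.", "");
        // Drop a trailing ".artifactId" segment, e.g. "mortbay.jetty" -> "mortbay"
        if (vendor.endsWith("." + artifactId)) {
            vendor = vendor.substring(0, vendor.length() - artifactId.length() - 1);
        }
        return "cpe:/a:" + vendor + ":" + artifactId + ":" + version;
    }

    public static void main(String[] args) {
        System.out.println(toCandidateCpe("org.mortbay.jetty:jetty:6.1.20"));
        // cpe:/a:mortbay:jetty:6.1.20
    }
}
```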
The prevalence of the known-vulnerabilities-in-dependencies phenomenon in practice is assessed. A total of 75 client systems available at SIG are used to test the prototype tool. The majority of them, 54, have
at least one vulnerable dependency, while the maximum is seven vulnerable dependencies.
Finally, technical consultants working at the host company evaluate the usefulness of such a system in
practice. Interviews with consultants working at SIG are held to discuss the analysis results. Without the
system, respondents would not have considered outdated dependencies and their impact on the security of
the system. One specific customer who was informed greatly appreciated the detection of this vulnerability in his system.
The problem investigated is partially similar to the topic we are researching. The difference between this
approach and our topic is that the tool proposed in this paper does not report whether an identified vulnerability really affects the system, i.e. to what extent the reported vulnerable methods or classes are actually used. In addition, like this research, we are also interested in evaluating the usefulness of such a security tool.
2.2 Tracking known security vulnerabilities in third-party compo-
nents
Cadariu (2014)
The paper "Tracking Known Security Vulnerabilities in Proprietary Software Systems" described previously is based on this earlier thesis. The thesis covers largely the same material in somewhat more detail. The goal of the thesis is to propose a method to continuously track known vulnerabilities in third-party components of software systems and assess its usefulness
in a relevant context.
All potentially relevant publicly available sources of vulnerability reports (CVEs) are considered. Eventually the NVD is chosen, because it appeared to be the only source at that time offering XML feeds listing the vulnerabilities.
Finally, interviews with consultants at SIG are conducted to assess the usefulness of the prototype tool
that was developed during the course of this research. Evaluation shows that the method produces useful
security-related alerts consistently reflecting the presence of known vulnerabilities in third party libraries of
software projects.
This study has shown the NVD to be the most useful vulnerability database for this kind of research, due to its adequacy for the research goal and its convenient data export features. This
database contains known vulnerabilities that have been assigned a standardized CVE identifier. However,
for a vulnerability to be known, it does not necessarily need to go through the process that leads to a CVE
assignment. Some security vulnerabilities are public knowledge before receiving a CVE identifier, such as
when users of open-source projects signal security vulnerabilities. Ideally, tracking known vulnerabilities
would mean indexing every possible source of information that publishes information regarding software
security threats. This has not been investigated in that research. In our research we will keep in mind that CVE databases are not the only data source for vulnerabilities, in case we run into problems with these traditional sources of vulnerability information.
2.3 The Unfortunate Reality of Insecure Libraries
Williams and Dabirsiaghi (2012)
This article shows the prevalence and relevance of the issue that is using libraries with known vulnerabilities.
The authors show that there are significant risks associated with the use of libraries.
A significant majority of the code found in modern applications originates from third-party libraries and frameworks. Organizations place strong trust in these libraries by incorporating them in their systems. However, after analyzing nearly 30 million downloads from the Maven Central dependency repository, the authors find that almost 30% of the downloaded dependencies contain known vulnerabilities. The authors conclude that this phenomenon shows that most organizations are unlikely to have a strong policy in place for
keeping libraries up to date to prevent systems becoming compromised by the known vulnerabilities in the
dependencies used.
In-house developed code is normally given proper security attention but, in contrast, the possibility that risk comes from third-party libraries is barely considered by most companies. The 31 most downloaded libraries are closely examined. It turns out that 37% of the 1,261 versions of those libraries contain known vulnerabilities. Even more interesting, security-related libraries turn out to be 20% more likely to have reported security vulnerabilities than, say, a web framework. These libraries are expected to simply have more reported vulnerabilities due to their nature; they receive more attention and scrutiny from researchers and hackers.
Finally, it is found that larger organizations on average have downloaded 19 of the 31 most popular Java
libraries. Smaller organizations downloaded a mere 8 of these libraries. The functionality offered by some of
these libraries overlaps with functionality in other libraries. This is a concern because this indicates that larger
organizations have not standardized on using a small set of trusted libraries. More libraries used means more
third-party code is included in a system, and more code leads to a higher chance of security vulnerabilities
being present.
The authors conclude that deriving metrics indicating what libraries are in use and how far out-of-date
and out-of-version they are would be a good practice. They recommend establishing systems and processes
to lessen the exposure to known security vulnerabilities introduced by third-party dependencies as the use
of dependency management tools has caused a significant increase in the number of libraries involved in a
typical application.
2.4 Impact assessment for vulnerabilities in open-source software
libraries
Plate et al. (2015)
Due to the increased inclusion of open source components in systems, each vulnerability discovered in a
bundle of dependencies potentially jeopardizes the security of the whole application. After a vulnerability
is discovered, its impact on a system has to be assessed. Current decision-making is based on high-level vulnerability descriptions and expert knowledge, which is not ideal given the manual effort required and its proneness to error. In this paper a more pragmatic approach to assessing this impact is proposed.
Once a vulnerability is discovered, the dependencies of a system will sometimes still not be updated to
neutralize the risk of exposure. In the enterprise context this can be attributed to the fact that these systems
are mission-critical. Therefore, downtime has to be minimized. The problem with updating dependencies is
that new issues may be introduced. Enterprises are reluctant to update their dependencies more frequently
for this reason. Consequently, system maintainers need to carefully assess whether an application requires an urgent patch or whether the update can be applied during the application’s regular release cycle. The question that arises is whether it can be determined that a vulnerability found in a dependency originates from parts of the dependency’s API that are actually used by the system. In this paper a possible approach to assessing this is described.
The following assumption is made: Whenever an application incorporates a library known to be vulnerable
and executes a fragment of the library that contains the vulnerable code, there is a significant risk that the
vulnerability can be exploited. The authors collect execution traces of applications, and compare those with
changes that would be introduced by the security patches of known vulnerabilities in order to detect whether
critical library code is executed. Coverage is measured by calculating the intersection between programming
constructs that are both present in the security patch and that are, directly or indirectly, executed in the
context of the system.
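The intersection step can be sketched as a straightforward set operation. The construct names below are hypothetical; in the actual approach, the two sets are derived from security patches and from collected execution traces.

```java
import java.util.*;

// Sketch of the coverage idea in Plate et al. (2015): the constructs considered
// "at risk" are those both touched by the security patch and executed by the
// application. Construct names here are hypothetical.
public class PatchCoverage {

    static Set<String> intersect(Set<String> patchedConstructs, Set<String> executedConstructs) {
        Set<String> atRisk = new HashSet<>(patchedConstructs);
        atRisk.retainAll(executedConstructs);            // keep only the overlap
        return atRisk;
    }

    public static void main(String[] args) {
        Set<String> patched = Set.of("lib.Ssl.handshake", "lib.Ssl.readRecord");
        Set<String> executed = Set.of("lib.Ssl.handshake", "lib.Util.log");
        System.out.println(intersect(patched, executed)); // [lib.Ssl.handshake]
    }
}
```

A non-empty intersection indicates that the application executes code that the security patch changes, i.e. that the vulnerability is likely relevant.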
Practical problems arise due to the use of different sources, such as VCS repositories and CVE databases. This is mainly attributed to non-standardized naming of libraries and their versions.
The authors state that once a vulnerability is discovered, its impact on a system has to be assessed. Their intended approach is similar to ours: look at the VCS repositories of dependencies and try to determine the changes that have occurred after the vulnerable version was released, up to the point where the vulnerability was patched. However, manual effort is needed to connect CVE entries to VCS repositories.
A key problem that their approach faces is how to reliably relate CVE entries with the affected software
products and the corresponding source code repository, down to the level of accurately matching vulnerability
reports with the code changes that provide a fix for them. This information was apparently unavailable or went unnoticed when their research was conducted; our preliminary investigation shows that VCS links are often referenced directly in the CVE entry, so there is no need to manually provide this information for each dependency.
2.5 Measuring Dependency Freshness in Software Systems
Bouwers et al. (2015)
Prioritizing dependency updates often proves to be difficult since the use of outdated dependencies can
be opaque. The goal of this paper is making this usage more transparent by devising a metric to quantify
how recent the versions of the used dependencies are in general. The metric is calibrated by basing the
thresholds on industry benchmarks. The usefulness of the metric in practice is evaluated. In addition, the
relation between outdated dependencies and security vulnerabilities is determined.
In this paper, the term “freshness” is used to denote the difference between the used version of a dependency
and the desired version of a dependency. In this research the desired situation equates to using the latest
version of the dependency. The freshness values of all dependencies are aggregated to the system-level using
a benchmark-based approach.
A study is conducted to investigate the prevalence of the usage of outdated dependencies among 75 Java
systems. Maven POM files are used to determine the dependencies that are used in systems. When consider-
ing the overall state of dependency freshness using a version sequence number metric, the authors conclude
that only 16.7% of the dependencies display no update lag at all, i.e. the most recent version of a dependency
is used. Over 50% of the dependencies have an update lag of at least 5 versions. The version release date
distance paints an even worse picture. The large majority (64.1%) of the dependencies have an update lag
of over 365 days, with a tail up to 8 years. Overall, the authors conclude that apparently it is not common
practice to update dependencies on a regular basis.
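The two measurements can be sketched as follows; the release history below is hypothetical and the version ordering is assumed to be given.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.List;

// Sketch of the two dependency-freshness measurements discussed above:
// version sequence lag (how many releases behind) and version release date
// distance (how many days behind). All release data below is hypothetical.
public class Freshness {

    // Number of releases between the used version and the latest version.
    static int versionLag(List<String> releasesInOrder, String usedVersion) {
        return releasesInOrder.size() - 1 - releasesInOrder.indexOf(usedVersion);
    }

    // Days between the release of the used version and the latest release.
    static long releaseDateLag(LocalDate usedReleased, LocalDate latestReleased) {
        return ChronoUnit.DAYS.between(usedReleased, latestReleased);
    }

    public static void main(String[] args) {
        List<String> releases = List.of("1.0", "1.1", "1.2", "2.0");
        System.out.println(versionLag(releases, "1.1"));  // 2 versions behind
        System.out.println(releaseDateLag(LocalDate.of(2014, 1, 1),
                                          LocalDate.of(2016, 1, 1))); // 730 days
    }
}
```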
Given the measurement of freshness on the dependency level, a system level metric can be defined by
aggregating the lower level measurements. This aggregation method works with a so-called risk profile that
in this case describes which percentage of dependencies falls into one of four risk categories.
To determine the relationship between the dependency freshness rating and security vulnerabilities the
authors calculate the rating for each system and determine how many of the dependencies used by a system
have known security vulnerabilities.
The experiment points out that systems with a high median dependency freshness rating show a lower
number of dependencies with reported security vulnerabilities. The opposite also holds. Moreover, systems
with a low dependency freshness score are more than four times as likely to incorporate dependencies with
known security vulnerabilities.
This study relates to our topic because it shows there is a relation between outdated dependencies and security vulnerabilities. The tool we propose can justify the importance of updating dependencies by showing the vulnerabilities the system would otherwise be exposed to; the use of outdated dependencies is no longer opaque.
2.6 Monitoring Software Vulnerabilities through Social Networks
Analysis
Trabelsi et al. (2015)
Security vulnerability information is spread over the Internet and it requires manual effort to track all
these sources. Trabelsi et al. (2015) noticed that the information in these sources is frequently aggregated
on Twitter. Therefore, Twitter can be used to find information about software vulnerabilities. This can even
include information about zero-day exploits that are not yet submitted to CVE databases. The authors propose
a prototype tool to index this information.
First, a clustering algorithm for social media content is devised, grouping all information regarding the
same subject matter, which is a prerequisite for distinguishing known from new security information.
The system consists of two subsystems: a data collection part and a data processing part. The data collection part stores information including common security terminology such as “vulnerability” or “exploit”
combined with names of software components such as “Apache Commons”. Apart from Twitter information,
a local mirror of a CVE database, such as NVD, is stored. This database is used to categorize security in-
formation obtained from Twitter, in particular to distinguish new information from the repetition of already
known vulnerability information. The data processing part identifies, evaluates and classifies the security information retrieved from Twitter. The data is processed using data-mining algorithms, each implemented by a so-called analyzer. Another element of this system is a pre-processor that filters out duplicate
tweets or content not meeting certain criteria.
To detect zero-day vulnerability information, the authors identify clusters of information that relate to the same issue in some software component and contain specific vulnerability keywords.
The prototype tool conducts a Twitter search by identifying information matching the regular expression
“CVE-*-” to obtain all the messages dealing with CVEs. After this, the messages are grouped by CVE identifier
in order to obtain clusters of messages dealing with the same CVE. From these clusters the authors extract
the common keywords in order to identify the manifestation of the vulnerability.
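This extraction and grouping step can be sketched as follows. The sketch uses the standard CVE identifier pattern rather than the literal expression quoted above, and the messages are invented for illustration.

```java
import java.util.*;
import java.util.regex.*;

// Sketch of the clustering step described above: extract CVE identifiers from
// message texts and group the messages by identifier. Tweets are hypothetical.
public class CveClustering {

    // Standard CVE identifier shape: "CVE-<year>-<4+ digits>".
    static final Pattern CVE = Pattern.compile("CVE-\\d{4}-\\d{4,}");

    static Map<String, List<String>> clusterByCve(List<String> messages) {
        Map<String, List<String>> clusters = new HashMap<>();
        for (String msg : messages) {
            Matcher m = CVE.matcher(msg);
            while (m.find()) {
                clusters.computeIfAbsent(m.group(), k -> new ArrayList<>()).add(msg);
            }
        }
        return clusters;
    }

    public static void main(String[] args) {
        List<String> tweets = List.of(
            "Heartbleed (CVE-2014-0160) affects OpenSSL 1.0.1",
            "Patch released for CVE-2014-0160",
            "New advisory: CVE-2015-7501");
        // Prints the two distinct CVE identifiers found in the sample tweets.
        System.out.println(clusterByCve(tweets).keySet());
    }
}
```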
Furthermore, the result of an empirical study that compares the availability of information published through social media (e.g. Twitter) and classical sources (e.g. the NVD) is presented. The authors have conducted two studies that compare the freshness of the collected data to that of the traditional sources. The first study compares the publication date of CVEs in the NVD with the publication date on social media: 41% of the CVEs were discussed on Twitter before they were listed in the NVD. The second study investigates the publication date of zero-day vulnerabilities on social media relative to the publication date of the related CVE in the NVD: 75.8% of the CVE vulnerabilities were disclosed on social media before their official disclosure in the NVD.
The research conducted by Trabelsi et al. (2015) relates to our topic because we might also want to use un-
conventional (i.e. not CVE databases) sources to either obtain new vulnerability information or complement
existing vulnerability data.
2.7 An Analysis of Dependence on Third-party Libraries in Open
Source and Proprietary Systems
Raemaekers et al. (2012a)
At present there is little insight into the actual usage of third-party libraries in real-world applications as opposed to general download statistics. The authors of this paper seek to identify the frequency of use of
third-party libraries among proprietary and open source systems. This information is used to derive a rating
that reflects the frequency of use of specific libraries and the dependence on third-party libraries. The rating
can be employed to estimate the amount of exposure to possible security risks present in these libraries.
To obtain the frequency of use of third-party libraries, import and package statements are extracted from a
set of Java systems. After processing the import and package statements, a rating is calculated for individual
third-party libraries and the systems that incorporate these libraries. The rating for a specific library consists
of the number of different systems it is used in divided by the total number of systems in the sample system
set. The rating for a system as a whole is the sum of all ratings of the libraries it contains, divided by the
square of the number of libraries.
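These two definitions can be sketched directly; the usage counts below are hypothetical.

```java
import java.util.List;

// Sketch of the ratings defined above: a library's rating is the fraction of
// systems in the corpus that use it; a system's rating is the sum of its
// libraries' ratings divided by the square of the number of libraries.
public class UsageRating {

    static double libraryRating(int systemsUsingLibrary, int totalSystems) {
        return (double) systemsUsingLibrary / totalSystems;
    }

    static double systemRating(List<Double> libraryRatings) {
        double sum = libraryRatings.stream().mapToDouble(Double::doubleValue).sum();
        int n = libraryRatings.size();
        return sum / (n * n);
    }

    public static void main(String[] args) {
        double popular = libraryRating(80, 100);  // used in 80 of 100 systems
        double obscure = libraryRating(2, 100);   // used in 2 of 100 systems
        // (0.8 + 0.02) / 2^2, approximately 0.205
        System.out.println(systemRating(List.of(popular, obscure)));
    }
}
```

The division by the square of the number of libraries penalizes systems with many dependencies, consistent with the observation below that more third-party code implies more exposure.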
The authors hypothesize that when a library is shown to be incorporated frequently in multiple systems
there must have been a good reason to do so. The reasoning behind this is that apparently a large number of
teams deems the library safe enough to use and therefore have made a rational decision to prefer this library
over another library offering similar functionality. It is assumed that people are risk-averse in their choice
of third-party libraries and that people therefore tend to prefer safer libraries to less safe ones. The authors
thus exploit the collective judgment in the rating.
Raemaekers et al. (2012a) also assume that the more third-party library dependencies a system has, the
higher the exposure to risk in these libraries becomes. The analysis shows that frequency of use and the
number of libraries used can give valuable insight in the usage of third-party libraries in a system.
The final rating devised ranks more common third-party libraries higher than less common ones, and
systems with a large number of third-party dependencies get rated lower than systems with less third-party
dependencies.
This paper relates to our topic because the derived rating may correlate with the security of a library or system as a whole; if a system uses many obscure dependencies, it could be considered less safe. However, this assumption does not necessarily hold in all cases, because a popular library may attract more attention from hackers and thus be a more attractive target to exploit than less commonly used libraries.
2.8 Exploring Risks in the Usage of Third-Party Libraries
Raemaekers et al. (2011)
Using software libraries may be tempting but we should not ignore the risks they can introduce to a system.
These risks include lower quality standards or security risks due to the use of dependencies with known
vulnerabilities. The goal of this paper is to explore to what extent the risks involved in the use of third-
party libraries can be assessed automatically. A rating based on frequency of use is proposed to assess this.
Moreover, various library attributes that could be used as risk indicators are examined. The authors also
propose an isolation rating that measures the concentration and distribution of library import statements in
the packages of a system. Another goal of this paper is to explore methods to automatically calculate such a
rating based on static source code analysis.
First, the frequency of use of third-party libraries in a large corpus of open source and proprietary software
systems is analyzed. Secondly, the authors investigate additional library attributes that could serve as an
indicator for risks in the usage of third-party libraries. Finally, the authors investigate ways to improve
this rating by incorporating information on the distribution and concentration of third party library import
statements in the source code. The result is a formula by which one can calculate the rating based on
the frequency of use, the number of third-party libraries that a system uses and the encapsulation of calls to
these libraries in sub-packages of a system.
The rating for a specific library that the authors propose in this paper is the number of different systems
it is used in divided by the total number of systems in the data set. The rating for a system is the average of
all ratings of the libraries it contains, divided by the number of libraries.
Risks in the usage of third party libraries are influenced by the way a given system is using a specific
library. In particular, the usage can be well encapsulated in one dedicated component (which would isolate
the risk), or scattered through the entire system (which would distribute risk to multiple places and makes it
costly to replace the library).
When a library is imported frequently in a single package but rarely in other packages, this results in an array of per-package frequencies with a high ’inequality’ relative to each other. Ideally, a third-party library should be imported only in specific packages dedicated to dealing with it, thus reducing the amount of code ’exposed’ to possible risks in this library.
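One way to quantify this ’inequality’ of per-package import counts is the Gini coefficient; this particular measure is our choice for illustration, not necessarily the exact formula used by the authors.

```java
import java.util.Arrays;

// The paper measures 'inequality' of per-package import frequencies; the Gini
// coefficient used here is one possible inequality measure, chosen for
// illustration (an assumption, not necessarily the paper's exact formula).
public class ImportConcentration {

    // Gini coefficient over non-negative import counts: 0 = evenly spread,
    // values near 1 = imports concentrated in few packages (well isolated).
    static double gini(int[] importCounts) {
        int[] c = importCounts.clone();
        Arrays.sort(c);                                  // ascending order
        long weighted = 0, total = 0;
        int n = c.length;
        for (int i = 0; i < n; i++) {
            weighted += (long) (i + 1) * c[i];           // rank-weighted sum
            total += c[i];
        }
        return (2.0 * weighted) / (n * total) - (double) (n + 1) / n;
    }

    public static void main(String[] args) {
        System.out.println(gini(new int[]{5, 5, 5, 5}));   // 0.0: evenly scattered
        System.out.println(gini(new int[]{20, 0, 0, 0}));  // 0.75: well isolated
    }
}
```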
This paper describes an approach to use the frequency of use of third-party libraries to assess risks present
in a system. With this data, an organization can gain insight into the risks present in libraries and consider the measures or actions needed to reduce this risk.
This paper relates to our topic because the API usage is used as a proxy for potential vulnerability risk. In
the system we propose we seek to determine whether vulnerable APIs are called.
2.9 Measuring Software Library Stability Through Historical Ver-
sion Analysis
Raemaekers et al. (2012b)
Vendors of libraries and users of the same libraries have conflicting concerns. Users seek backward com-
patibility in libraries while library vendors want to release new versions of their software to include new
features, improve existing features or fix bugs. The library vendors are constantly faced with a trade-off be-
tween keeping backward compatibility and living with mistakes from the past. The goal of this paper is to
introduce a way to measure interface and implementation stability.
By means of a case study, several issues with third-party library dependencies are illustrated:
• It is shown that maintenance debt accumulates when updates of libraries are deferred.
• The authors show that when a moment in the future arrives where there is no choice but to update to a
new version a much larger effort has to be put in than when smaller incremental updates are performed
during the evolution of the system.
• It is shown that the transitive dependencies libraries bring along can increase the total amount of work
required to update to a new version of a library, even if an upgrade of these transitive dependencies
was originally not intended.
• The authors show that a risk of using deprecated and legacy versions of libraries is that they may
contain security vulnerabilities or critical bugs.
The authors propose four metrics that provide insight on different aspects of implementation and interface
stability. Library (in)stability is the degree to which the public interface or implementation of a software library changes over time in such a way that it potentially requires users of the library to rework their implementations due to these changes.
This study illustrates one of the reasons a system's dependencies are often not kept up to date. We may utilize these metrics in our research to indicate how much a dependency's interface has changed between the currently used version and a new version containing security improvements. This indication provides an estimate of the amount of time needed to update to a newer release of a dependency.
2.10 An Empirical Analysis of Exploitation Attempts based on Vul-
nerabilities in Open Source Software
Ransbotham (2010)
Open source software has the potential to be more secure than closed source software due to the large
number of people that review the source code who may find vulnerabilities before they are shipped in the
next release of a system. However, when considering vulnerabilities identified after the release of a system,
malicious persons might abuse the openness of its source code. These individuals can use the source code to
learn about the details of a vulnerability to fully exploit it; the shadow side of making source code available
to anyone.
Open source software presents two additional challenges to post-release security. First, the open nature of the source code eliminates any benefit of private disclosure. Because changes to the source code are visible, they are publicly disclosed by definition, making it easy for attackers to figure out how to defeat the security measures.
Second, many open source systems are themselves used as components in other software products. Hence, not
only must the vulnerability be fixed in the initial source, it must be propagated through derivative products,
released and installed. These steps give attackers more time, further increasing the expected benefits for the
attacker.
In conclusion, when compared to proprietary dependencies, open source dependencies have a greater risk
of exploitation, diffuse earlier and wider and have greater overall volume of exploitation attempts.
Using open source libraries brings along additional security risks due to their open character. Vulnerabili-
ties in these libraries, even when they are patched, propagate to other systems incorporating these libraries.
Since the effort to exploit a system decreases due to the availability of the source code, it is paramount that
early warnings are issued and distributed upon discovery of a vulnerability. The latter can be accomplished
by the tool we propose. This way, owners can limit the exploitability of their system. This research therefore emphasizes why our area of research is so important.
2.11 Understanding API Usage to Support Informed Decision Mak-
ing in Software Maintenance
Bauer and Heinemann (2012)
The use of third-party libraries has several productivity-related advantages but it also introduces risks —
such as exposure to security vulnerabilities — to a system. In order to be able to make informed decisions, a
thorough understanding of the extent and nature of the dependence upon external APIs is needed.
Risks include that:
• APIs keep evolving, often introducing new functionality or providing bug fixes. Migrating to the latest
version is therefore often desirable. However, depending on the amount of changes — e.g. in case of a
major new release of an API — backward-compatibility might not be guaranteed.
• An API might not be completely mature yet. Thus, it could introduce bugs into a software system that
may be difficult to find and hard to fix. In such scenarios it would be beneficial to replace the current
API with a more reliable one as soon as it becomes available.
• The provider of an API might decide to discontinue its support, such that users can no longer rely on
it for new functionality and bug fixes.
• The license of a library or a project might change, making it impossible to continue the use of a par-
ticular API for legal reasons.
These risks are beyond the control of the maintainers of a system that are using these external APIs but
they do need to be taken into account when making decisions about the maintenance options of a software
system. Tool support is therefore required to provide this information in an automated fashion. Bauer and
Heinemann (2012) devise an approach to automatically extract information about library usage from the
source code of a project and visualize it to support decision-making during software maintenance. The goal
is determining the degree of dependence on the used libraries.
This paper is related to our topic in the sense that the tool we will devise could be used to provide insight
to the effort required to update a vulnerable dependency to a newer version once it has been discovered.
Chapter 3
Research method
In this chapter we explain the research method we will employ during our research.
The goal of this chapter is to explain our instantiation of Technical Action Research.
3.1 Introduction
In this thesis TAR will be employed as proposed by Wieringa and Morali (2012). TAR is a research method in
which a researcher evaluates a technique by solving problems in practice employing the technique. Findings
can be generalized to unobserved cases that show similarities to the studied case.
In TAR, a researcher fulfills three roles:
I Artifact designer
II Client helper
III Empirical researcher
The technique is first tested on a small scale in an idealized “laboratory” setting and is then tested in increas-
ingly realistic settings within the research context, eventually finishing by making the technique available
for use in other contexts to solve real problems.
Before a suitable technique can be developed, improvement problems should be solved and knowledge
questions answered. An improvement problem in this case could be: “How can we assess actual exposure to
vulnerabilities in an automated fashion?”. Knowledge questions are of the form “Why is it necessary to
determine actual exposure to vulnerabilities?” or “What could be the effect of utilizing this technique in practice?”.
To solve an improvement problem we can design treatments. A treatment is something that solves a prob-
lem or reduces the severity of it. Each plausible treatment should be validated and one should be selected
and implemented. A treatment consists of an artifact interacting with a problem context. This treatment will
be inserted into a problem context, with which it will start interacting. In our case the treatment consists
of a tool incorporating the technique proposed earlier, used to fulfill a specific goal. Treatments can be
validated by looking at their expected effects in context, the evaluation of these effects, and their expected
trade-offs and sensitivities.
It is necessary to determine actual exposure to vulnerabilities because the maintainers of a system often
neglect to keep their dependencies updated due to a presumed lack of threat. A tool that points out to
complacent maintainers that their perceived sense of security is false would stimulate them to take action;
after all, once they know of the threat, so do large numbers of others with less honorable intentions.
The effect of this would be that a system’s dependencies are kept up to date better, which should lead
to improved security. It is also expected to lead to improved maintainability of a system. This can be
substantiated by arguing that the more time has passed since a dependency was last updated, the more effort
it takes to upgrade. The reason is that the public API of a dependency evolves, and as more time passes and
more updates are released, the API might have changed so dramatically that it is almost impossible to keep up.
Generalization of solutions in TAR is achieved by distinguishing between particular problems and problem
classes. A particular problem is a problem in a specific setting. When abstracted away from this setting, a
particular problem may indicate the class of problems it belongs to. This is important because the aim of
conducting this research is to accumulate general knowledge rather than case-specific knowledge that does
not apply in a broader context.
In the next sections we will explain our instantiation of three cycles, each one belonging to a specific role
(client helper, empirical researcher, artifact designer) the researcher fulfills.
3.2 Client helper cycle
3.2.1 Problem investigation
SIG offers security-related services to its clients. As part of this value proposition, the Vulnerability Alert
Service (VAS) tool has been devised. Although the tool is considered to be useful, it also generates a lot of
false positives. More importantly, SIG consultants need to manually verify each reported vulnerability to see
whether the vulnerability could impact the system of the client. This is based on the consultant’s knowledge
of the part of the dependency the vulnerability is contained in and how this dependency is used in the system.
An issue is that this assessment is not foolproof due to the fact that it relies on the consultant’s knowledge of
the system, which may be incomplete. A better option would be to completely automatically assess whether
vulnerable code may be executed without the involvement of humans.
SIG also provides its clients with services to assess the future maintainability of a system. When depen-
dencies are not frequently updated to newer versions it will require considerably more effort in the future
to integrate with newer versions of the dependency due to API changes. As discussed in the introduction,
the reason for not updating may be attributed to the anxiety of introducing new bugs when doing so. If
any of the used dependencies are known to have security vulnerabilities, the maintainers of such systems
have to be convinced of the urge to update to a newer version to mitigate the vulnerability. Maintainers may
think that they are not affected by a known vulnerability based on their judgement. This judgement may
be poor. Automatic tooling could be employed to convince these maintainers of the urge to update when it
can be shown that vulnerable code is likely executed. If the tool indicates the system is actually exposed to
the vulnerability, the dependency will likely be updated, which may improve the long-term maintainability
of the system because the distance between the latest version of the dependency and the used dependency
decreases. In turn, this makes it easier to keep up to date with breaking API changes when they occur rather
than letting them accumulate. Hence, our tool might also be useful from a maintainability perspective.
We have identified an approach that could be used to fulfill this need. We will design a tool that incorporates
such functionality and appraise whether this tool can be exploited in useful ways for SIG. Table 3.1 shows
the stakeholders that are involved in the SIG context along with their goals and criteria.
Stakeholder: SIG
Goals: Add value for clients by actively monitoring exposure to known vulnerabilities.
Criteria: The tool should aid in system security assessments conducted by consultants at SIG. The number of false positives reported should be minimized, as this may lead to actual threats going unnoticed in the noise. Clients should consider any findings of the tool useful and valuable.

Stakeholder: SIG’s clients
Goals: The tool allows clients to take action as soon as possible when new threats emerge.
Criteria: Less exposure to security threats. Improved maintainability of the system.

Table 3.1: Stakeholders in the SIG context, their goals and criteria.
3.2.2 Treatment design
Using the artifact (proof-of-concept tool) and the context (SIG) we can devise multiple treatments:
I Tool indicates actual exposure to vulnerability in library → client updates to newer version of depen-
dency → security risk lowered and dependency lag reduced. This treatment contributes to the goals in
that the security risk of that specific system is lowered and the maintainability of the system is improved.
II Tool indicates actual exposure to vulnerability in library → client removes the dependency on the library
or replaces it with another library offering the same functionality. This treatment might lessen the
immediate security risk, but the replacement library might carry risks of its own. The dependency lag with
a new dependency could remain stable, but it could also change positively or negatively depending on the
dependency lag of the new dependency.
3.2.3 Design validation
The effect we expect our tool to accomplish is improved awareness of exposure to vulnerabilities on the part
of both stakeholders. The resulting value for the client is that they are able to take action and therefore
improve the security of the system. Awareness leads to reduced dependency lag and thus leads to improved
maintainability. Even if the use case of the tool shifts within SIG, the artifact is still useful because it can be
used in both security-minded contexts and maintainability-minded contexts.
3.2.4 Implementation and Implementation evaluation
The proof-of-concept is used to analyze a set of client systems. We will investigate one client system for which
a security assessment is ongoing and schedule an interview with the involved SIG consultants to discover
whether our tool supports their work and ultimately adds value for the client.
3.3 Research cycle
3.3.1 Research problem investigation
The research population consists of all clients of SIG having systems with dependencies as well as SIG con-
sultants responsible for these systems.
The research question we seek to answer by using TAR is: “Can the results of a tool implementing the
proposed technique be exploited in useful ways by SIG?”. Useful in this case denotes that the results add
value for SIG and its clients.
We know that the VAS tool currently used at SIG was already considered to be useful when it was
delivered. Therefore, it is most relevant to assess what makes our tool more useful than VAS.
3.3.2 Research design
The improvement goal in the research context is to extend or supplement the current VAS tool to assess
actual exposure to vulnerabilities, then monitor the results and improve them if possible. We have chosen to
proceed with the first (I) treatment (refer to client helper cycle). This treatment is preferred as it satisfies two
goals at the same time as opposed to the second (II) treatment.
The research question will be answered in the context of SIG. Data is collected by first obtaining analysis
results from the tool we propose, then discussing analysis results with SIG consultants or clients. Based on
this data we seek to assess which components contribute to the perceived usefulness.
The results are expected to be useful from at least a maintainability and a security perspective. Hence,
it is expected that in other contexts the results are deemed useful as well, in these or other respects.
3.3.3 Research design validation
We expect that our tool can serve various purposes in different contexts. It should be noted that a human
would also be able to assess actual exposure to vulnerabilities. However, as the average number of
dependencies used in a system increases, manual examination is only feasible for systems with few
dependencies.

Stakeholder: Maintainers of systems with dependencies
Goals: Improve system maintainability and security by actively monitoring exposure to known vulnerabilities.
Criteria: Use of the tool should lead to reduced dependency lag and thus fewer maintainability-related problems. Not too many false positives reported.

Stakeholder: Companies/entities with internal systems
Goals: Lessen the security risk of these internal systems.
Criteria: Not too many missed vulnerabilities (false negatives) leading to a false sense of security.

Stakeholder: Researchers
Goals: Utilize actual vulnerability exposure data in research in order to draw conclusions based on this data.
Criteria: Accuracy of reported exposure to vulnerabilities.

Stakeholder: Third-party service providers
Goals: Deliver a security-related service to clients.
Criteria: Scalability and versatility of the solution.

Table 3.2: Stakeholders in the general context and their goals and criteria.
The research design allows us to answer the research question as the tool can be used by consultants at
SIG in real client cases. As these consultants actually use the tool to aid in an assessment, they are likely to
provide meaningful feedback.
We have identified the following potential risks that may threaten the results obtained in the research
cycle:
• SIG clients’ systems use uncommon libraries (no CVE data available).
• SIG clients’ systems use only proprietary libraries (no CVE data available).
• Perceived usefulness significantly varies per case.
• There is no perceived usefulness. However, in that case we could look at which elements do not con-
tribute to the usefulness and try to change them.
• The VAS system we rely on for CVE detection does not report any vulnerabilities while those are
present in a certain library (false negatives).
3.3.4 Analysis of results
We will execute the client helper cycle. Then, we evaluate the observations and devise explanations for un-
expected results. Generalizations to other contexts are hypothesized and limitations noted. We will dedicate
a separate chapter to this.
3.4 Design cycle
3.4.1 Problem investigation
The tooling currently available to detect known vulnerabilities in the dependencies of a system does not
assess actual exposure to these vulnerabilities. We plan to develop a tool that is actually able to do this. In
Table 3.2 we list a number of stakeholders that could potentially be users of this tool in external contexts.
By observing the following phenomena, we can conclude that there is a need for tooling to aid in the
detection of dependencies that have known vulnerabilities:
• Up to 80 percent of code in modern systems originates from dependencies (Williams and Dabirsiaghi,
2012).
• Research from 2011 shows that 37% of the 1,261 versions of 31 libraries studied contain at least one
vulnerability (Williams and Dabirsiaghi, 2012).
• Plate et al. (2015) indicate that once a vulnerability in a system’s dependencies is discovered companies
often still do not update them.
• There is a need to carefully assess whether an application requires an urgent patch or whether the patch
can be applied during the regular release cycle.
3.4.2 Artifact design
We will design and implement a proof-of-concept tool incorporating this functionality.
3.4.3 Design validation
We expect that the tool we propose can be useful in multiple contexts. The results achieved after exe-
cuting the research cycle will provide evidence whether it is deemed useful in at least the one context that is
researched. We also expect that there will be limitations that impact the usefulness in certain contexts. We
will note these limitations and try to accommodate them, or else propose alternative approaches that may
be used in the future to reduce these limitations.
Different types of users can use the prototype tool to find known vulnerabilities in dependencies.
This information can be used for multiple purposes. We have listed some potential stakeholders of this kind
of information in the table at the beginning of this section. Thus, the tool should be considered useful in
multiple contexts.
The exposure to known vulnerabilities could also be assessed manually. After a list of vulnerabilities
potentially affecting the system is obtained, a human could try to determine whether vulnerable code is
potentially executed. The disadvantage is that this would require manual effort. The advantage is that there
would be fewer false negatives, i.e. a human is able to determine the vulnerable methods regardless of the
source of this information. However, the manual effort required may be very time consuming and thus this
approach does not scale, while the approach we suggest — using automatic tooling to do this — does.
To this point we have assumed that all vulnerabilities originate from vulnerable code at the method level.
However, it should be noted that vulnerabilities can also be the result of misconfiguration. For instance,
a property in a configuration file may be set to a value that makes a system less secure. In such cases our
approach would not yield any results. Our tool could be changed to accommodate this, but in our experience
it would be very hard to find out which settings make a system insecure; there is little structured information
available about such misconfigurations, and furthermore these vulnerabilities tend to be user configuration
errors rather than vulnerabilities present in the dependencies themselves.
3.4.4 Implementation and Implementation evaluation
Ordinarily, we would release the source code of the proof-of-concept tool after our research ends. This would
allow the tool to be used in other contexts. Unfortunately, at this time our host company cannot open-source
the tool for intellectual property reasons.
Chapter 4
Designing a proof of concept tool
In this chapter we explain how we will construct our prototype tool, including the technical choices we have
made. We will first give the research context and a high-level overview of the components involved in
realizing automatic assessment of exposure to vulnerabilities, followed by a more in-depth explanation of
these components.
The goal of this chapter is to provide insight into how a prototype tool could be constructed, including the
implementation choices made and the difficulties faced.
4.1 Research context
SIG is interested in expanding their product offering with new security-related products. For this purpose,
SIG has developed a tool called VAS in the past. This tool extracts information from a POM file, which is an
artifact used in the Maven build system. Maven facilitates easy management of dependencies, e.g. installing
and updating dependencies. Users can simply declare a list of libraries they require in the POM file and Maven
will download them and/or update to a newer version. The VAS tool uses the information in this file to derive
the list of dependencies of an arbitrary system. VAS will then download a local copy of the NVD1 and search
for CVE entries affecting any used versions of the dependencies. A report is made if there are known
vulnerabilities listed for a specific version of a dependency that is used. The CVE entries contain CPE
identifiers that reflect the platforms affected by the vulnerability; for example, cpe:/a:apache:log4j:1.2.17
identifies version 1.2.17 of Apache Log4j. Formally, CPE is defined as a “naming specification [that] defines
standardized methods for assigning names to IT product classes”2.
For the purpose of this thesis, an extension to the current VAS tool, Assessed Exposure Vulnerability Alert
Service (AEVAS), will be developed. For a given system, the existing VAS tool produces a list of CVE identifiers
for all known vulnerabilities present in the system’s dependencies. VAS will then prompt AEVAS to conduct
additional analysis by passing along the list of CVEs.
4.2 High-level overview of the tool
Conceptually, the approach that allows us to assess actual exposure to known vulnerabilities for a given
system works as follows:
I The dependencies of a system are identified. We store the specific versions of the dependencies that are
used.
II We download the executables containing these dependencies.
III We gather all CVE entries affecting any of the identified dependencies. Furthermore, we process the
references listed in the CVE entries. These references may refer to VCS artifacts, such as a link to a
commit on GitHub.
1https://web.nvd.nist.gov
2https://cpe.mitre.org/specification/
IV We establish which library methods are vulnerable. If a reference links to a specific VCS artifact we can
identify which code was changed. More specifically, we are interested to know which methods had their
implementation changed.
V We determine which library methods are invoked.
VI We ascertain whether one of the library methods invoked is a method we identified to be vulnerable
earlier in the process. If that is the case, we assume that the system in question is vulnerable to that
specific vulnerability.
Figure 4.1 provides an overview of the steps involved. We will describe these steps in detail in the next
section.
Figure 4.1: A high-level overview of the steps involved.
4.2.1 Gathering and downloading dependencies of a system
We look at the POM file used by Maven to identify the dependencies of a system. In this file, the dependencies
are listed in a structured way. We then try to download the dependencies from the Maven Central Repository.
Some dependencies might be proprietary; in that case we cannot download them through the Maven Central
Repository. We exclude these dependencies from the rest of our analysis. This is not a major concern because
CVE data is usually not available for proprietary or internal dependencies.
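To illustrate, a dependency declaration in a POM file takes roughly the following form (a generic example; the coordinates shown are not taken from any client system):

```xml
<project>
  <dependencies>
    <!-- Each dependency is identified by group, artifact and version coordinates. -->
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-core</artifactId>
      <version>2.0</version>
    </dependency>
  </dependencies>
</project>
```

These coordinates are exactly what is needed to locate the corresponding artifact in the Maven Central Repository.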
4.2.2 Gathering CVE data relevant to included dependencies
We need to determine the vulnerabilities that potentially impact a system. There are several ways to assess
this, but the most straightforward approach would be to obtain this from VAS, the current vulnerability
monitoring system used at SIG. VAS exposes a REST API. Similarly to our tool, VAS extracts dependency
information from a system’s POM file and looks for known vulnerabilities in the included dependencies, as
depicted in Figure 4.2. We can query this API, and a list of CVEs for the dependencies of any given system is
returned.
Once we have a list of CVE identifiers, additional information relating to the CVE from various sources
is retrieved, such as the CVSS score that indicates the severity and potential impact of the vulnerability. In
particular, we are interested in the list of references included in a CVE entry. References, as their name
implies, are links to additional sources offering information related to some aspect of the CVE. In some cases,
links to issue tracking systems and links to a commit or some other VCS artifact are given.
Figure 4.2: Systems have dependencies, which frequently have known vulnerabilities
4.2.3 Establishing vulnerable methods
In line with our assumptions, as stated in Section 1.4, we expect that the commits identified in the references
of a CVE entry contain the fix for the vulnerability. More specifically, the methods changed in the fix were the
ones that contained the vulnerable code before it was fixed. The process of gathering the vulnerable methods
from patches in commits is visualized in Figure 4.3.
Figure 4.3: In a CVE entry we try to find a VCS reference, which potentially allows us to identify the vulnerable
methods.
4.2.4 Ascertaining which library methods are invoked
Furthermore, we need to confirm that the system in question actually invokes any of these vulnerable methods
directly or indirectly. We derive a list of called methods by conducting call graph analysis.
4.2.5 Identifying vulnerable methods that are invoked
Finally, to determine if the system in question is exposed to a vulnerability we take the intersection between
the set of dependency API methods that are invoked and the set of vulnerable dependency methods. If the
result of this intersection is not empty, we can conclude that the system in question is actually vulnerable.
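The final check described above boils down to a set intersection. The following sketch illustrates the idea (the method names are illustrative; this is not the actual AEVAS code):

```java
import java.util.HashSet;
import java.util.Set;

public class ExposureCheck {
    // Returns the vulnerable library methods that the system actually invokes:
    // the intersection of the invoked-method set and the vulnerable-method set.
    static Set<String> exposedMethods(Set<String> invoked, Set<String> vulnerable) {
        Set<String> exposed = new HashSet<>(invoked);
        exposed.retainAll(vulnerable); // keep only methods present in both sets
        return exposed;
    }

    public static void main(String[] args) {
        Set<String> invoked = new HashSet<>();
        invoked.add("org.example.lib.Parser.parse");   // hypothetical method names
        invoked.add("org.example.lib.Util.escape");
        Set<String> vulnerable = new HashSet<>();
        vulnerable.add("org.example.lib.Util.escape");
        // A non-empty intersection means the system is assumed to be exposed.
        System.out.println(exposedMethods(invoked, vulnerable));
    }
}
```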
4.3 Detailed approach for automatically assessing exposure to known
vulnerabilities
We have implemented the proof of concept tool in Java 8. We chose to implement it in this programming
language because the majority of client systems at SIG are written in this language. Because we will use
these client systems in our analysis to determine the usefulness of such a tool, and because we need to
create a call graph for these systems, we need a call graph library that can handle Java code. We did not
find any suitable call graph library for Java systems written in a language other than Java, which we require
so that we can invoke it programmatically. Therefore, we chose to implement the proof of concept
tool in Java. The next sections describe how the steps mentioned in the previous section are implemented to
arrive at the final goal of assessing the actual exposure to vulnerabilities.
4.3.1 Determining vulnerable methods
The existing VAS system will pass a list of CVEs to AEVAS. These CVEs are all the CVEs affecting the specific
versions of libraries that are used by a given system.
Finding references to Version Control System artifacts
First of all, more information relating to each CVE is obtained. This information includes a list of references.
These references are simple URLs pointing to a resource that has more information on the vulnerability in
any form. A reference could simply refer to a CVE website or a blog post describing the vulnerability in more
detail. We acquire this additional CVE data by using the open-source vFeed3 tool, which downloads
information from public CVE databases and stores it in a local database.
For each reference, we assess if it is a link that contains information related to a version control repository.
For example, a link may refer to a specific commit on GitHub. In our prototype implementation we will solely
use Git artifacts. One might ask why we chose Git as opposed to any other VCS, such as Subversion or
Mercurial. The reason is that the number of Git references simply outnumbers the number of references
related to any other VCS. Figure 4.4 provides a graphical depiction of the number of references found in the
NVD CVE database for each distinct VCS.
Figure 4.4: The number of VCS related references found in the NVD CVE database grouped by VCS.
Using regular expressions we check if a reference is a valid link to a specific commit. Listing 1 shows
how this check has been implemented. The extractGitArtifactsFromReferences method first determines which
regular expression should be applied, based on certain keywords (such as GitHub, GitLab and Bitbucket)
in the reference. The method tryToExtractGitPlatformArtifacts shows how this is implemented for one of
three types of Git URLs we take into account. The methods tryToExtractCgitPlatformArtifacts and
tryToExtractGenericGitURLArtifacts are very similar; they only differ in the regular expressions used to
extract the information needed. We have implemented it this way so that it is relatively straightforward to support any
other platform in the future.
Determining vulnerable methods
Once a reference to a specific commit has been obtained, we analyze the changes contained in the patches of
that specific commit. As mentioned earlier (refer to Section 1.4) our assumption is that any method whose
implementation has changed was a method that contained the vulnerable code.
If we have a reference to a specific commit, we usually also know the (likely) clone URL of the repository
containing the source code. Note that we say likely: if we have a URL such as
“https://github.com/netty/netty/commit/2fa9400a59d0563a66908aba55c41e7285a04994”, we know that the
URL to clone the repository will be “https://github.com/netty/netty.git”. In the case of a GitHub, GitLab or
Bitbucket URL, we can determine the clone URL with certainty, since the clone URL adheres to a predictable
pattern. For other
3https://github.com/toolswatch/vFeed
protected void extractGitArtifactsFromReferences() throws NoGitArtifactsFoundException {
    for (String gitReference : inputReferences) {
        if (gitReference.contains(CGIT)) {
            tryToExtractCgitPlatformArtifacts(gitReference);
        } else if (gitReference.contains(GITHUB) || gitReference.contains(GITLAB)
                || gitReference.contains(BITBUCKET)) {
            tryToExtractGitPlatformArtifacts(gitReference);
        } else {
            tryToExtractGenericGitURLArtifacts(gitReference);
        }
    }

    if (commitShaList.isEmpty() || repositoryLocation == null) {
        throw new NoGitArtifactsFoundException();
    }
}

protected void tryToExtractGitPlatformArtifacts(String gitReference) {
    String gitPlatformRegex = String.format(
            "(https?://(?:(?:(?:%s|%s)\\.%s)|%s\\.%s)/[\\w\\-~]+/[\\w\\-~]+)/%s?/(\\b[0-9a-f]{5,40}\\b)",
            GITHUB, GITLAB, TLD_COM, BITBUCKET, TLD_ORG, COMMIT_ARTIFACT_PLURAL);

    Pattern gitPlatformPattern = Pattern.compile(gitPlatformRegex);
    Matcher gitPlatformURLMatch = gitPlatformPattern.matcher(gitReference);

    if (gitPlatformURLMatch.find()) {
        log.info(String.format("Reference is git platform reference: %s", gitReference));

        if (gitPlatformURLMatch.groupCount() == 2) {
            repositoryLocation = gitPlatformURLMatch.group(1);
            commitShaList.add(gitPlatformURLMatch.group(2));
        }
    }
}

Listing 1: The methods in the class GitURLArtifactExtractor responsible for extracting VCS artifact information from a reference URL.
types of VCS URLs, such as URLs to custom cgit4 installations, this proves to be more difficult. In some cases,
the clone URL has been customized and thus does not follow a predictable pattern. In those cases, we simply
cannot retrieve any patch information.
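For the predictable GitHub-style URLs mentioned above, the clone-URL derivation can be sketched as follows (our own simplified illustration, not the AEVAS implementation; only GitHub commit URLs are handled here):

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CloneUrlDeriver {
    // Group 1 captures the repository URL, group 2 the commit SHA
    // (5-40 hexadecimal characters, as in the extraction regex above).
    private static final Pattern COMMIT_URL = Pattern.compile(
            "(https?://github\\.com/[\\w.-]+/[\\w.-]+)/commit/([0-9a-f]{5,40})");

    static Optional<String> deriveCloneUrl(String reference) {
        Matcher m = COMMIT_URL.matcher(reference);
        if (m.find()) {
            // GitHub clone URLs are the repository URL with a ".git" suffix.
            return Optional.of(m.group(1) + ".git");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(deriveCloneUrl(
                "https://github.com/netty/netty/commit/2fa9400a59d0563a66908aba55c41e7285a04994"));
    }
}
```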
In the cases in which we do have a clone URL, we clone the repository locally using JGit5, a Java
implementation of the Git VCS.
We can programmatically acquire the contents of all Java files that have changes according to the commit
information. In addition, we also acquire the contents of those files in the state of the previous commit (i.e.
before they were changed). We then parse both revisions, representing the code before and after the commit
was applied, and compare the content of each method (i.e. the lines in its body) between the two revisions. If
the bodies are not equal, the method’s implementation has been changed in the commit and thus we assume
this method to be vulnerable.
One might ask why we implemented it this way instead of simply using the raw patch contents. The
reason is that the approach we have chosen is easier to implement. If operating at the level of the patch
itself, all lines starting with “+” and “-” signs would need to be extracted using some regular expression.
Furthermore, we would need to extract the lines that did not change and integrate those parts to obtain a file
with the new state and a file with the old state. Such an implementation is much more difficult and prone to
errors. Thus, we have opted for the current approach.
Our implementation is given in Listing 2. For the sake of brevity we omit the implementation of the method
calculateChangedMethodsBetweenFiles here. It involves comparing the lines of code in the body of the same
method between two revisions.
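A simplified sketch of such a comparison could look as follows (our own illustration, not the omitted thesis implementation; it assumes method bodies have already been extracted into maps keyed by method signature):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class MethodDiff {
    // Given the method bodies of a file before and after a commit (keyed by
    // method signature), return the methods whose body text changed.
    // Methods that only exist in the new revision are skipped, mirroring how
    // newly added files are skipped in the listing above.
    static Set<String> changedMethods(Map<String, String> oldBodies, Map<String, String> newBodies) {
        Set<String> changed = new HashSet<>();
        for (Map.Entry<String, String> entry : newBodies.entrySet()) {
            String oldBody = oldBodies.get(entry.getKey());
            if (oldBody != null && !oldBody.equals(entry.getValue())) {
                changed.add(entry.getKey()); // body differs between the two revisions
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<String, String> before = new HashMap<>();
        before.put("escape(String)", "return input;");
        before.put("parse(String)", "return new Node(input);");
        Map<String, String> after = new HashMap<>();
        after.put("escape(String)", "return sanitize(input);"); // changed: assumed vulnerable before the fix
        after.put("parse(String)", "return new Node(input);");
        System.out.println(changedMethods(before, after));
    }
}
```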
4https://git.zx2c4.com/cgit/about/
5https://eclipse.org/jgit
protected void generateDiff(String commitSha) {
    try {
        List<DiffEntry> diffEntries = GitUtils.obtainDiffEntries(gitRepository, commitSha);
        processDiffEntries(diffEntries, commitSha);
    } catch (IOException exception) {
        log.error("Could not generate diff", exception);
    }
}

protected void processDiffEntries(List<DiffEntry> diffEntries, String commitSha) throws IOException {
    for (DiffEntry diffEntry : diffEntries) {
        boolean fileIsJavaFile = StringUtils.endsWith(diffEntry.getNewPath(), ".java");

        if (diffEntry.getChangeType() == DiffEntry.ChangeType.ADD || !fileIsJavaFile) {
            continue;
        }

        String rawFileContents = GitUtils.fetchFileContentsInCommit(
                gitRepository.getRepository(), commitSha, diffEntry.getNewPath());

        ObjectId parentSha = GitUtils.parentCommitForCommit(commitSha, gitRepository);
        String rawFileContentsPreviousCommit = GitUtils.fetchFileContentsInCommit(
                gitRepository.getRepository(), parentSha, diffEntry.getOldPath());

        calculateChangedMethodsBetweenFiles(rawFileContents, rawFileContentsPreviousCommit);
    }

    log.debug(String.format("Changed methods: %s", changedMethods));
}

Listing 2: The methods in the class GitDiff responsible for determining which methods were changed in a commit.
4.3.2 Extracting dependency information
Before we can create a call graph, we need to obtain the JAR files of all libraries used. These JAR files contain
Java bytecode. First, we extract the list of dependencies used, along with the specific versions used, including
any transitive dependencies that may be present. In our implementation, we collect the required information
by programmatically invoking Maven’s dependency:tree goal. The extractDependencyTreeInformation method
in the aptly named MavenDependencyExtractor class is responsible for this. The implementation is given in
Listing 3. The “--debug” flag is added to the command so that Maven still outputs a dependency tree even if a
single dependency cannot be resolved. A dependency cannot be resolved when, for example, a proprietary
dependency is listed that is not available in the Maven Central Repository. Adding the flag ensures that
unrelated or partial failures do not lead to no information being extracted at all. The
filterDependenciesUsedFromRawOutput method (not shown here) uses regular expressions to filter the
relevant output, since the flag also causes a lot of information to be output that we do not care about.
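The filtering that filterDependenciesUsedFromRawOutput performs can be approximated with a single regular expression over the raw tree output. The sketch below assumes dependency lines follow Maven’s canonical groupId:artifactId:packaging:version:scope shape; the class and method names are illustrative, not the actual implementation:

```java
import java.util.*;
import java.util.regex.*;

public class DependencyTreeParserSketch {

    // Matches tree lines such as:
    //   [INFO] +- org.apache.logging.log4j:log4j-core:jar:2.3:compile
    //   [INFO] |  \- commons-io:commons-io:jar:2.4:compile
    // The root artifact line carries no scope and no tree prefix, so it is skipped.
    private static final Pattern DEPENDENCY_LINE = Pattern.compile(
            "\\[INFO\\]\\s+[+\\\\|\\s-]+([\\w.-]+):([\\w.-]+):[\\w-]+:([\\w.-]+):\\w+");

    // Extracts "groupId:artifactId:version" coordinates from raw mvn output.
    static List<String> parseDependencies(String rawOutput) {
        List<String> coordinates = new ArrayList<>();
        for (String line : rawOutput.split("\\R")) {
            Matcher m = DEPENDENCY_LINE.matcher(line);
            if (m.find()) {
                coordinates.add(m.group(1) + ":" + m.group(2) + ":" + m.group(3));
            }
        }
        return coordinates;
    }

    public static void main(String[] args) {
        String output = String.join("\n",
                "[INFO] --- maven-dependency-plugin:2.8:tree ---",
                "[INFO] com.example:demo:jar:1.0-SNAPSHOT",
                "[INFO] +- org.apache.logging.log4j:log4j-core:jar:2.3:compile",
                "[INFO] |  \\- commons-io:commons-io:jar:2.4:compile");
        System.out.println(parseDependencies(output));
    }
}
```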
4.3.3 Creating a call graph
The next step in our analysis involves determining which methods in those vulnerable dependencies are
called by a given system, either directly or indirectly. For example, method E in class A of the system may
call method F of class B contained within a library. In turn, this method F in class B may call method G
of class C in the same library. There is therefore a call path from method E to method G. To determine these
relations programmatically, we use the WALA call graph library6, originally developed by IBM.
6 http://wala.sourceforge.net/wiki/index.php/Main_Page
The call graph library
protected void extractDependencyTreeInformation(String pomFilePath) {
    currentPomFile = pomFilePath;
    MavenInvocationRequest request = new MavenInvocationRequest(currentPomFile);

    // we use the debug flag to continue outputting the tree even if a single dependency can not be resolved
    String command = String.format("dependency:tree --debug -Dmaven.repo.local=%s", MVN_REPO_PATH);
    request.addGoal(command);

    log.info(String.format("Invoking mvn %s for pom file %s", command, pomFilePath));
    String output = request.invoke();
    filterDependenciesUsedFromRawOutput(output);
}
Listing 3: The method in the class MavenDependencyExtractor that extracts information from the dependency
tree.
can use JAR (Java Archive) files containing bytecode to conduct analysis. The resulting information provides
insight into which methods of the libraries are called by the system under investigation.
Figure 4.5: A graphical depiction of how we determine whether vulnerable library methods are invoked.
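Stripped of the WALA specifics, the check depicted in Figure 4.5 is a reachability question: starting from the system’s own methods, does any chain of call edges reach a method known to be vulnerable? The following is an illustrative sketch over a hand-built adjacency map; the actual tool traverses WALA call graph nodes instead:

```java
import java.util.*;

public class VulnerableReachabilitySketch {

    // Returns the set of vulnerable methods reachable from the given entry
    // points, following call edges breadth-first.
    static Set<String> reachableVulnerableMethods(Map<String, List<String>> callGraph,
                                                  Set<String> entryPoints,
                                                  Set<String> vulnerableMethods) {
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>(entryPoints);
        Set<String> exposed = new LinkedHashSet<>();
        while (!queue.isEmpty()) {
            String method = queue.poll();
            if (!visited.add(method)) {
                continue; // already explored this method
            }
            if (vulnerableMethods.contains(method)) {
                exposed.add(method);
            }
            for (String callee : callGraph.getOrDefault(method, List.of())) {
                queue.add(callee);
            }
        }
        return exposed;
    }

    public static void main(String[] args) {
        // The example from the text: A.E calls B.F, which calls C.G inside
        // the same library. D.H is vulnerable but never reached.
        Map<String, List<String>> callGraph = Map.of(
                "A.E", List.of("B.F"),
                "B.F", List.of("C.G"));
        Set<String> vulnerable = Set.of("C.G", "D.H");
        System.out.println(reachableVulnerableMethods(callGraph, Set.of("A.E"), vulnerable));
    }
}
```

The design point the sketch captures is that mere presence of a vulnerable method in a dependency is not enough to report exposure; only methods on some call path from the system count.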
Using raw source code as input
Source code of clients’ projects is frequently uploaded to SIG. SIG does not require executable binaries to be
present in the upload. Ordinarily, static analysis is used at SIG to analyze all source code; SIG never executes
client code to perform any form of analysis. However, open-source call graph libraries rarely support creating
a call graph from source code alone. The call graph library we use can only analyze source code after
it has been translated to bytecode by a Java front-end. Conveniently, this kind of tooling is provided by
the maintainers of the call graph library we use. Since we also want to trace calls occurring in third-party
libraries, executable files (e.g. JAR files) containing the libraries are needed. We can obtain these from the
Maven repository. This does not work for proprietary libraries, since these are not publicly available.
We exclude these libraries from the call graph analysis for this reason. In addition, there is often very little
vulnerability data on proprietary libraries, so excluding them has little impact. Thus, in the
context of SIG such a tool should be able to process raw source code to be considered useful. It is realistic
to assume that this is also useful in other contexts, for instance if executable binaries are not available for
E.M._Poot

  • 1. Master’s Thesis Automatically assessing exposure to known security vulnerabilities in third-party dependencies Edward M. Poot edwardmp@gmail.com July 2016, 55 pages Supervisors: dr. Magiel Bruntink Host organisation: Software Improvement Group, https://www.sig.eu Universiteit van Amsterdam Faculteit der Natuurwetenschappen, Wiskunde en Informatica Master Software Engineering http://www.software-engineering-amsterdam.nl
  • 2. Abstract Up to 80 percent of code in modern software systems originates from the third-party components used by a system. Software systems incorporate these third-party components (’dependencies’) to preclude reinventing the wheel when common or generic functionality is needed. For example, Java systems often incorporate logging libraries like the popular Log4j library. Usage of such components is not without risk; third-party software dependencies frequently expose host systems to their vulnerabilities, such as the ones listed in publicly accessible CVE (vulnerability) databases. Yet, a system’s dependencies are often still not updated to versions that are known to be immune to these vulnerabilities. A risk resulting from this phenomenon when the dependency is not updated timely after the vulnerability is disclosed is that persons with malicious intent may try to compromise the system. Tools such as Shodan∗ have emerged that can identify servers running a specific version of a vulnerable component, for instance the Jetty webserver version 4.2† , that is known to be vulnerable‡ . Once a vulnerability is disclosed publicly, finding vulnerable systems is trivial using such tooling. This risk is often overlooked by the maintainers of a system. In 2011 researchers discovered that 37% of the 1,261 versions of 31 popular libraries studied contain at least one known vulnerability. Tooling that continuously scans a systems’ dependencies for known vulnerabilities can help mitigate this risk. A tool like this, Vulnerability Alert Service (’VAS’), is already developed and in active use at the Software Improvement Group (’SIG’) in Amsterdam. The vulnerability reports generated by this tool are generally considered helpful but there are limitations to the current tool. 
VAS does not report whether the vulnerable parts of the dependency are actually used or potentially invoked by the system; VAS only reports whether a vulnerable version of a dependency is used but not the extent to which this vulnerability can actually be exploited in a system. Links to a specific Version Control System revision (’commit’) of a system’s code-base are frequently in- cluded in so-called CVE entries. CVE entries are bundles of meta-data related to a specific software vulner- ability that has been disclosed. By using this information, the methods whose implementations have been changed can be determined by looking at the changes contained within a commit. These changes reveal which methods were involved in the conception of the vulnerability. These methods are assumed to con- tain the vulnerability. By tracing which of these vulnerable methods is invoked directly or indirectly by the system we can determine the actual exposure to a vulnerability. The purpose of this thesis is to develop a proof-of-concept tool that incorporates such an approach to assessing the exposure known vulnerabilities. As a final step, the usefulness of the prototype tool will be validated. This is assessed by first using the tool in the context of SIG and then determining to what extent the results can be generalized to other contexts. We will show why tools like the one proposed are assumed to be useful in multiple contexts. Keywords: software vulnerability, vulnerability detection, known vulnerabilities in dependencies, CVE, CPE, CPE matching, call graph analysis ∗https://www.shodan.io †https://www.shodan.io/search?query=jetty+4.2 ‡https://www.cvedetails.com/cve/CVE-2004-2478
  • 3. Contents 1 Introduction 1 1.1 Problem analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Research questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.4 Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.5 Research method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.6 Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.7 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2 Related work 7 2.1 Tracking Known Security Vulnerabilities in Proprietary Software Systems . . . . . . . . . . 7 2.2 Tracking known security vulnerabilities in third-party components . . . . . . . . . . . . . . 8 2.3 The Unfortunate Reality of Insecure Libraries . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.4 Impact assessment for vulnerabilities in open-source software libraries . . . . . . . . . . . . 9 2.5 Measuring Dependency Freshness in Software Systems . . . . . . . . . . . . . . . . . . . . . 10 2.6 Monitoring Software Vulnerabilities through Social Networks Analysis . . . . . . . . . . . . 10 2.7 An Analysis of Dependence on Third-party Libraries in Open Source and Proprietary Systems 11 2.8 Exploring Risks in the Usage of Third-Party Libraries . . . . . . . . . . . . . . . . . . . . . . 12 2.9 Measuring Software Library Stability Through Historical Version Analysis . . . . . . . . . . 12 2.10 An Empirical Analysis of Exploitation Attempts based on Vulnerabilities in Open Source Soft- ware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
13 2.11 Understanding API Usage to Support Informed Decision Making in Software Maintenance . 13 3 Research method 15 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 3.2 Client helper cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 3.2.1 Problem investigation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 3.2.2 Treatment design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.2.3 Design validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.2.4 Implementation and Implementation evaluation . . . . . . . . . . . . . . . . . . . . . 17 3.3 Research cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.3.1 Research problem investigation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.3.2 Research design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.3.3 Research design validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.3.4 Analysis of results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 3.4 Design cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 3.4.1 Problem investigation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 3.4.2 Artifact design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.4.3 Design validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 3.4.4 Implementation and Implementation evaluation . . . . . . . . . . . . . . . . . . . . . 19 4 Designing a proof of concept tool 20 4.1 Research context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 4.2 High-level overview tool . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . 20
  • 4. CONTENTS 4.2.1 Gathering and downloading dependencies of a system . . . . . . . . . . . . . . . . . 21 4.2.2 Gathering CVE data relevant to included dependencies . . . . . . . . . . . . . . . . . 21 4.2.3 Establishing vulnerable methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 4.2.4 Ascertaining which library methods are invoked . . . . . . . . . . . . . . . . . . . . 22 4.2.5 Identifying vulnerable methods that are invoked . . . . . . . . . . . . . . . . . . . . 22 4.3 Detailed approach for automatically assessing exposure to known vulnerabilities . . . . . . 22 4.3.1 Determining vulnerable methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 4.3.2 Extracting dependency information . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 4.3.3 Creating a call graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 4.3.4 Determining actual exposure to vulnerable methods . . . . . . . . . . . . . . . . . . 29 4.3.5 External interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 5 Evaluation 32 5.1 Conducting analysis on client projects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 5.1.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 5.1.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 5.1.3 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 5.2 Finding known vulnerabilities without using CVE databases . . . . . . . . . . . . . . . . . . 35 5.2.1 Implementing retrieval of data from another source . . . . . . . . . . . . . . . . . . . 35 5.2.2 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 5.2.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 5.2.4 Interpretation . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . 39 5.3 Finding vulnerabilities through GitHub that are not listed in CVE databases . . . . . . . . . 41 5.3.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 5.3.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 5.3.3 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42 5.3.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 5.4 Evaluating usefulness with security consultants . . . . . . . . . . . . . . . . . . . . . . . . . 45 5.4.1 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 5.4.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 5.4.3 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 5.5 Reflection on usefulness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 5.5.1 Result analysis research cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 5.5.2 Implementation evaluation of the design cycle . . . . . . . . . . . . . . . . . . . . . . 48 5.6 Threats to validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 5.6.1 Conclusion validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 5.6.2 Construct validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 5.6.3 External validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 6 Conclusion and future work 50 6.1 Answering the research questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 6.1.1 To what extent is it possible to automatically determine whether vulnerable code in dependencies can potentially be executed? . . . . . . . 
. . . . . . . . . . . . . . . . . 50 6.1.2 How can we generalize the usefulness of the prototype tool based on its usefulness in the SIG context? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 6.2 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 Bibliography 53 Acronyms 55
  • 5. Preface Before you lies the result of five months of hard work. Although I am the one credited for this work, this thesis could not have been produced without the help of several people. First of all I would like to thank Mircea Cadariu for his reflections on the research direction I should pursue. My gratitude goes out to Theodoor Scholte for his input on the tool I developed. I would also like to acknowledge Reinier Vis for connecting me with the right persons. Special thanks to Marina Stojanovski, Sanne Brinkhorst and Brenda Langedijk for participating in interviews or facilitating them. I want to give a shout-out to Wander Grevink for setting up the technical infrastructure used during my research. I sincerely appreciate the advice and guidance of my supervisor Magiel Bruntink during this period. Fur- thermore, I would like to express my gratitude to anyone else in the research department at Software Im- provement Group (SIG) for their input — Xander Schrijen, Haiyun Xu, Baŕbara Vieira and Cuiting Chen. I would also like to thank all the other interns at SIG for their companionship during this period. Finally, I would like to thank everybody else at SIG for providing me with the opportunity to write my thesis here. Edward Poot Amsterdam, The Netherlands July 2016
  • 6. Chapter 1 Introduction 1.1 Problem analysis In April of 2014, the cyber-security community came to know of a severe security vulnerability unprecedented in scale and severity. The vulnerability, quickly dubbed as ’Heartbleed’, was found in OpenSSL, a popular cryptography library that implements the Transport Layer Security (TLS) protocol. OpenSSL is incorporated in widely used web-server software like Apache, which powers the vast majority of websites found on the internet today. The library is also used by thousands of other systems requiring cryptographic functionality. After the disclosure of this vulnerability, security researchers identified at least 600.000 systems connected to the public Internet that were exploitable due to this vulnerability1 . This specific security incident makes it painfully clear that there is a shadow side to the use of open-source software. The widespread adoption of open-source software has made such systems easy victims. Once a vulnerability is disclosed, it can be trivial for malicious persons to exploit thousands of affected systems. Contrary to popular belief, analysis done by Ransbotham (2010) corroborates that, when compared to pro- prietary systems, open source systems have a greater risk of exploitation, diffuse earlier and wider and know a greater overall volume of exploitation attempts. The OWASP Top Ten2 exposes the most commonly occur- ring security flaws in software systems. Using components with known vulnerabilities is listed as number nine in the list of 2013. The emergence of dependency management tools has caused a significant increase in the number of libraries involved in a typical application. In a report of Williams and Dabirsiaghi (2012), in which the prevalence of using vulnerable libraries is investigated, it is recommended that systems and processes for monitoring the usage of libraries are established. 
The SIG analyses the maintainability of clients’ software systems and certifies systems to assess the long- term maintainability of such systems. Security is generally considered to be related to the maintainability of the system. Use of outdated dependencies with known vulnerabilities provides a strong hint that maintain- ability is not a top priority in the system. Furthermore, IT security is one of the main themes of the work SIG fulfills for its clients. The systems of SIG’s clients typically depend on third-party components for common functionality. However, as indicated before this is not without risk. In security-critical applications, such as banking systems, it is crucial to minimize the time between the disclosure of the vulnerability and the appli- cation of a patch to fix the vulnerability. Given the increasing number of dependencies used by applications, this can only be achieved by employing dedicated tooling. In 2014 an intern at SIG, Mircea Cadariu (see Cadariu (2014); Cadariu et al. (2015)), modified an existing tool to be able to scan the dependencies of a system for vulnerabilities as part of his master’s thesis. The tool was modified to support indexing Project Object Model (POM)3 files, in which dependencies of a system are declared when the Maven dependency management system is used. Interviews with consultants at SIG revealed that they would typically consider the vulnerability reports to be useful, even though false positives would frequently be reported. The interviewees mentioned that typically they would consider whether the vulnerability description could be linked to functionality in dependencies that the client uses. However, a consultant may mistakenly think that the vulnerable code is never executed since this kind of manual 1http://blog.erratasec.com/2014/04/600000-servers-vulnerable-to-heartbleed.html 2https://www.owasp.org/index.php/Top_10_2013-Top_10 3https://maven.apache.org/pom.html 1
verification is prone to human error. Furthermore, the need for manual verification by humans means that the disclosure of a critical and imminent threat to the client may be delayed. We propose to create a prototype tool that will automatically indicate the usage of vulnerable functionality.

Plate et al. (2015) have published a paper in which a technique is proposed to identify vulnerable code in dependencies based on references to Common Vulnerabilities and Exposures (CVE) identifiers in the commit messages of a dependency. CVE identifiers are assigned to specific vulnerabilities when they are disclosed. The issue with this approach was that CVE identifiers were rarely referenced in commit messages, at least not in a structured way. In addition, manual effort was required to match Version Control System (VCS) repositories to specific dependencies. Moreover, Plate et al. (2015) indicate that once a vulnerability is confirmed to be present in one of a system's dependencies, the dependency is regularly still not updated to mitigate the risk of exposure. In the enterprise context this can be attributed to the fact that these systems are presumed to be mission-critical; hence, downtime has to be minimized. The reluctance to update dependencies is caused by the belief that updating will introduce new issues. Because of such beliefs, there is a need to carefully assess whether a system requires an urgent patch to avert exposure to a vulnerability or whether the patch can be applied during the application's regular release cycle: a vulnerability that is actually exploitable and can be used to compromise the integrity of the system requires immediate intervention, while updating a library with a known vulnerability in untouched parts can usually be postponed. Bouwers et al. (2015) state that prioritizing dependency updates proves to be difficult because the use of outdated dependencies is often opaque.
The authors have devised a metric ('dependency freshness') to indicate whether recent versions of dependencies are generally used in a specific system. After calculating this metric for 75 systems, the authors conclude that only 16.7% of the dependencies incorporated in systems display no update lag at all. The large majority (64.1%) of the dependencies used in a system show an update lag of over 365 days, with a tail of up to 8 years. Overall, it is determined that updating dependencies on a regular basis is not common practice in most systems. It is also discovered that the freshness rating has a negative correlation with the number of dependencies that contain known security vulnerabilities. More specifically, systems with a high median dependency freshness rating show a lower number of dependencies with reported security vulnerabilities, and vice versa. However, these metrics do not take into account how the dependency is actually used by the system. The tool we propose would be able to justify the need to update dependencies by showing that a system is actually vulnerable; the risk of using outdated dependencies is no longer opaque.

Raemaekers et al. (2011) sought to assess the frequency of use of third-party libraries in both proprietary and open-source systems. Using this information, a rating is derived based on the frequency of use of particular libraries and on the dependence on third-party libraries in a software system. This rating can be used to indicate the exposure to potential security risks introduced by these libraries. Raemaekers et al. (2012a) continue this inquiry in another paper, the goal of which was to explore to what extent risks involved in the use of third-party libraries can be assessed automatically. The authors hypothesize that risks in the usage of third-party libraries are influenced by the way a given system uses a specific library.
They do not rely on CVE information, but the study does look at Application Programming Interface (API) usage as an indicator of risk.

We can conclude from the existing literature that vulnerabilities introduced in a system by its dependencies are a prevalent threat in today's technological landscape. Various tools have been developed aiming to tackle this problem. However, to our knowledge a tool that tries to determine the actual usage of the API units introducing the vulnerable behavior is currently lacking. Therefore, the problem we seek to solve is how to automatically determine actual exposure to vulnerabilities introduced by a system's dependencies, rather than hypothetical exposure alone. A proof-of-concept tool will be created to demonstrate the feasibility of this approach. We will evaluate this tool in the context of our host company (SIG). Furthermore, we will generalize the usefulness of a tool featuring such functionality to multiple contexts.

1.2 Research questions

Research question 1: To what extent is it possible to automatically determine whether vulnerable code in dependencies can potentially be executed?

– How can we retrieve all CVEs relevant to a specific dependency?
– How can we determine which methods of a dependency are called directly or indirectly?
– How do we determine which code was changed to fix a CVE?
– How can we validate the correctness of the prototype tool we will design?

Research question 2: How can we generalize the usefulness of the prototype tool based on its usefulness in the SIG context?

– In what ways can the tool implementing the aforementioned technique be applied in useful ways at SIG?
– In what ways is the SIG use case similar to other cases?

1.3 Definitions

First, we will establish some common vocabulary that will be used in the remainder of this thesis. An overview of the acronyms we use is also provided at the end of this thesis.

Software vulnerabilities. According to the Internet Engineering Task Force (IETF) [4], a software vulnerability is defined as "a flaw or weakness in a system's design, implementation, or operation and management that could be exploited to violate the system's security policy". For the purpose of this thesis, we are primarily concerned with known vulnerabilities: vulnerabilities that have been disclosed in the past through some public channel.

CVE. CVE is the abbreviated form of the term Common Vulnerabilities and Exposures. Depending on the context, it can have a slightly different meaning, but in all circumstances CVE relates to known security vulnerabilities in software systems. First of all, CVE can be used to refer to an identifier assigned to a specific security vulnerability. When a vulnerability is disclosed, it is assigned an identifier of the form "CVE-YYYY-1234": the CVE prefix, followed by the year the vulnerability was discovered in, followed by a number that is unique among the vulnerabilities of that year. This identifier serves as a mechanism through which different information sources can refer to the same vulnerability.
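The identifier scheme just described can be sketched as a small validator. This is an illustration of ours, not part of the thesis tooling; note that since 2014 the sequence part of real identifiers may have more than four digits.

```python
import re

# Hypothetical helper (ours): validate a CVE identifier of the form
# CVE-YYYY-NNNN and split it into its year and sequence number.
CVE_PATTERN = re.compile(r"^CVE-(\d{4})-(\d{4,})$")

def parse_cve_id(identifier):
    """Return (year, sequence_number) for a valid CVE identifier, else None."""
    match = CVE_PATTERN.match(identifier)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

print(parse_cve_id("CVE-2014-0160"))  # the Heartbleed identifier
```

A tool consuming heterogeneous vulnerability feeds can use such a normalizer to treat every mention of the same identifier as the same vulnerability.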
Secondly, CVE can refer to a bundle of meta-data related to a vulnerability identified by a CVE identifier, which we will refer to as a CVE entry. For instance, a score indicating the severity of the vulnerability (its CVSS score) is assigned, as well as a description indicating how the vulnerability manifests itself. Moreover, a list of references is attached, which is essentially a collection of links to other sources that have supplementary information on the vulnerability. Finally, CVE is sometimes used synonymously with the databases containing the CVE entries; we will refer to these as CVE databases from now on. The National Vulnerability Database (NVD) is a specific database that we will use.

CPE. CPE is an acronym for Common Platform Enumeration. One or more CPEs can be found in a CVE entry. CPEs are identifiers that identify the platforms affected by a specific vulnerability.

VCS. VCS is an abbreviation for Version Control System. This refers to a class of systems used to track changes in source code over time. Version Control Systems use the notion of revisions: the initial source code that is added is known as revision one, and after the first change is made the code is in revision two. As of 2016, the most popular VCS is Git. Git is a distributed VCS, in which the source code may be dispersed over multiple locations. Git has the concept of repositories, in which such a copy of the source code is stored. The website GitHub is currently the most popular platform for hosting these repositories. In Git, revisions are called commits. Moreover, Git and GitHub introduce other meta-data concepts such as tags and pull requests respectively. We will commonly refer to such pieces of meta-data as VCS artifacts. GitHub also introduces the notion of issues, through which problems related to a system can be discussed.

[4] https://tools.ietf.org/html/rfc2828
Dependencies. Software systems often incorporate third-party libraries that provide common functionality, to preclude developing such functionality in-house and thereby reinventing the wheel. The advantages of using such libraries include shortened development times and cost savings due to not having to develop and maintain such components. Since a system now depends on these libraries to function, we call these libraries the dependencies of a system. New versions of libraries containing bug-fixes and security improvements may be released by the maintainers. To aid in the process of keeping these dependencies up-to-date, dependency management systems have emerged. One of the most popular dependency management systems is Maven, a dependency management system for applications written in the Java programming language. In Maven, the dependencies are declared in an XML file referred to as the Project Object Model file, or POM file for short.

1.4 Assumptions

Based on initial analysis, we have established the following assumptions about known security vulnerabilities:

Assumption 1: It is becoming increasingly likely that CVE entries refer to VCS artifacts.
Assumption 2: The commits referred to in CVE entries contain the fix for the vulnerability.
Assumption 3: The methods whose implementation has been changed, as indicated by the commit, contain the fix for a vulnerability.

We will substantiate each assumption in the following paragraphs.

It is becoming increasingly likely that CVE entries contain references to VCS artifacts. The approach we envision to assess the actual exposure to vulnerabilities relies heavily on the presence of VCS references in CVE entries. The percentage of CVE entries having at least one VCS reference is still quite low (6.48% to be precise [5]), but over the years we observe a positive trend. Figure 1.1 provides a graphical depiction of this trend.
With the notable exception of the year 2015, the absolute number of CVE entries having at least one VCS reference increases year over year. The year 2015 deviates from this trend probably simply because the absolute number of CVEs in that year is lower than in other years.

Figure 1.1: The absolute number of CVE entries in the NVD database having at least one VCS reference increases almost every year.

[5] Relative to all CVE entries in the NVD database.
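The classification behind this count can be sketched as follows; the set of URL patterns treated as "VCS references" is our own assumption for illustration, not the heuristic actually used in the thesis.

```python
# Toy illustration (ours): classify a CVE entry as having a VCS reference
# when any of its reference URLs points at a known VCS host or commit page.
VCS_PATTERNS = ("github.com", "gitlab.com", "bitbucket.org", "gitweb", "/commit/")

def has_vcs_reference(reference_urls):
    """True if any reference URL looks like a link to a VCS artifact."""
    return any(p in url for url in reference_urls for p in VCS_PATTERNS)

def vcs_reference_share(entries):
    """Fraction of CVE entries (id -> list of reference URLs) with a VCS link."""
    with_vcs = sum(1 for urls in entries.values() if has_vcs_reference(urls))
    return with_vcs / len(entries)

entries = {
    "CVE-2014-0160": ["https://git.openssl.org/gitweb/?p=openssl.git;a=commitdiff;h=96db902"],
    "CVE-2013-0001": ["https://example.com/advisory"],
}
print(vcs_reference_share(entries))
```

Running the same classification over a full NVD export is how a percentage such as the 6.48% above could be reproduced.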
The commits referred to in CVE entries contain the fix for the vulnerability. Based on manual examination of several CVE entries, it appears that when there is a reference to a commit or other VCS artifact, the code changes included in that commit encompass the fix for the vulnerability. There are corner cases where this does not apply; we already encountered a commit link that referred to an updated change-log file indicating that the problem was solved, instead of the actual code change remedying the problem. This does not matter in our case, since we only take source code into account.

The methods whose implementation has been changed as indicated by the commit contain the fix for a vulnerability. We have analyzed a number of patches. Regularly, when a vulnerability is disclosed publicly, only certain method implementations are changed to fix the vulnerability. A helpful illustration is the commit containing the fix for the now infamous Heartbleed vulnerability (CVE-2014-0160) in the OpenSSL library mentioned at the beginning of this chapter. After investigating the related CVE entry, we observe that there is indeed a link to the commit containing the fix, as expected. When looking at the modifications in the respective commit [6], we can observe that, apart from added comments, only a single method implementation was changed: the one containing the fix for the vulnerability.

1.5 Research method

We will employ Action Research to evaluate the usefulness of a prototype tool that can automatically assess exposure to known vulnerabilities. More specifically, we employ Technical Action Research (TAR). Our instantiation of TAR is presented in Chapter 3. Action Research is a form of research in which researchers seek to combine theory and practice (Moody et al., 2002; Sjøberg et al., 2007). The tool will be created in the context of our host company, the Software Improvement Group (SIG), located in Amsterdam.
First, the usefulness of such a tool is determined in the context of this company; later on, we will try to determine the components that contribute to this perceived usefulness and hypothesize whether they would also contribute to usefulness in other contexts. During the initial study of the usefulness of the prototype tool in the context of the host organization, potential problems threatening the usefulness of the tool can be solved.

1.6 Complexity

There are a lot of moving parts involved in the construction of the prototype tool that need to be carefully aligned to obtain meaningful results. These complexities include working with a multitude of vulnerability sources and third-party libraries. We need to interact with local and remote Git repositories, retrieve information using the GitHub API, invoke Maven commands programmatically, conduct call graph analysis, work with existing vulnerability sources and parse source code.

Limitations of using CVEs. CVE databases can be used, but they are known to have certain limitations. A limitation we are aware of is that the correct matching between information extracted from dependency management systems and CPE identifiers is not always possible due to ambiguities in naming conventions. Heuristics can be employed to overcome some of these limitations.

Working with the APIs of GitHub/Git. We could use the GitHub API to retrieve the patches included in a specific commit. However, not all open-source dependencies use GitHub; they may also serve Git through private servers. Fortunately, we can also clone a remote repository locally using JGit [7] to obtain patch information. In addition, the GitHub API for issues can be used to obtain other meta-data that could be of interest to us.

Call graph analysis. Once we have retrieved the relevant patches for a library and derived a list of methods that are expected to be vulnerable, we need to determine whether these methods are executed directly or indirectly by the parent system.
This can be achieved using a technique known as call graph analysis. Call graph analysis tools are available for virtually any programming language, and there is a large body of research explaining the currently used methods, static and dynamic analysis, in detail.

[6] https://git.openssl.org/gitweb/?p=openssl.git;a=commitdiff;h=96db902
[7] https://eclipse.org/jgit
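The reachability question above can be illustrated with a toy static call graph; the method names and edges here are invented for illustration and do not come from any real system.

```python
from collections import deque

# Toy static call graph (invented names): each method maps to the methods
# it invokes. The question from the text: can an entry point of the system
# reach, directly or indirectly, a method that a security patch changed?
CALL_GRAPH = {
    "app.Main.main": ["app.Service.handle"],
    "app.Service.handle": ["lib.Parser.parse", "lib.Util.log"],
    "lib.Parser.parse": ["lib.Parser.readChunk"],
    "lib.Util.log": [],
    "lib.Parser.readChunk": [],
}

def reachable_methods(graph, entry_point):
    """All methods reachable from entry_point via breadth-first search."""
    seen = {entry_point}
    queue = deque([entry_point])
    while queue:
        for callee in graph.get(queue.popleft(), []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen

def calls_vulnerable_code(graph, entry_point, patched_methods):
    """True if any patched (presumed vulnerable) method is reachable."""
    return not reachable_methods(graph, entry_point).isdisjoint(patched_methods)

print(calls_vulnerable_code(CALL_GRAPH, "app.Main.main", {"lib.Parser.readChunk"}))
```

Real call graph construction is far harder (virtual dispatch, reflection), but the final exposure check reduces to exactly this kind of reachability query.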
We also need to know the limitations of these tools. All call graph tools identified for Java have issues processing source code as opposed to JAR files containing bytecode. Therefore, a different method needs to be devised to trace the initial method call within a system's source code to a library method. Based on evaluating various tools that generate call graphs, we expect that we can reliably determine this under normal circumstances. By normal circumstances we mean that method invocation through reflection is usually not traced by call graph libraries; nonetheless, we do not expect that systems extensively use reflection to interact with third-party libraries.

1.7 Outline

The rest of this thesis is structured as follows. We first examine related work. This is followed by an explanation of our instantiation of TAR. Then, we describe both the high-level design and the low-level implementation of our prototype tool. This is followed by an evaluation of the usefulness of the tool. Finally, we answer the research questions in the conclusion.
Chapter 2 Related work

In this chapter we review related work on the topic of known vulnerabilities in third-party components. The goal of the chapter is to provide insight into the prevalence of the problem and the research that has been conducted on this topic so far.

2.1 Tracking Known Security Vulnerabilities in Proprietary Software Systems (Cadariu et al., 2015)

Software systems are often prone to security vulnerabilities that are introduced by the third-party components of a system. Therefore, it is crucial that these components are kept up to date, by providing early warnings when new vulnerabilities for those dependencies are disclosed, allowing appropriate action to be taken. A high-level description of an approach for creating a tool that provides such early warnings is given. In modern build environments, dependency managers, such as Maven for Java projects, are used. These tools process the information about the dependencies to be included from a structured XML file; for Maven systems this is the POM file. This file can be used to gather a list of dependencies used by the project, as opposed to other strategies, such as looking at import statements in Java code. This approach can easily be extended to dependency managers for other programming languages that use similar configuration files, such as Python (PyPI), Node.js (npm), PHP (Composer) and Ruby (Gems). As a source of vulnerability data, existing CVE databases are used. Common Platform Enumeration (CPE) identifiers contained within CVE reports uniquely identify the affected platforms. An existing system, OWASP Dependency-Check, that already features some of the requested functionality is employed and extended to support retrieving dependencies from POM files. A matching mechanism is devised to match dependency names retrieved from Maven with CPE identifiers.
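One way such a matching heuristic can be sketched is shown below; the tokenization and comparison rules are our own illustration of the general idea, not the algorithm used in the paper.

```python
# Sketch (ours) of matching a Maven coordinate against a CPE identifier by
# comparing product and version, with the vendor checked against the group id.
# Real matching needs extra heuristics for naming ambiguities.
def parse_maven(coordinate):
    """Split 'groupId:artifactId:version' into its components."""
    group_id, artifact_id, version = coordinate.split(":")
    return {"group": group_id, "product": artifact_id, "version": version}

def parse_cpe(cpe):
    """Split a 'cpe:/a:vendor:product:version' identifier into its components."""
    _, _, vendor, product, version = cpe.split(":")
    return {"vendor": vendor, "product": product, "version": version}

def matches(maven_coordinate, cpe):
    dep, plat = parse_maven(maven_coordinate), parse_cpe(cpe)
    same_product = plat["product"] == dep["product"]
    vendor_in_group = plat["vendor"] in dep["group"]
    return same_product and vendor_in_group and plat["version"] == dep["version"]

print(matches("org.mortbay.jetty:jetty:6.1.20", "cpe:/a:mortbay:jetty:6.1.20"))
```

Even this simple comparison hints at why precision suffers in practice: product names collide across vendors, and vendors rarely appear verbatim in Maven group ids.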
For example, a specific Maven dependency can be identified as "org.mortbay.jetty:jetty:6.1.20", while the corresponding CPE is "cpe:/a:mortbay:jetty:6.1.20". False positive and false negative rates are determined by calculating precision and recall: 50 matches are sampled randomly and inspected for relevance. Precision turns out to be quite low (14%), while recall is higher (80%). The prevalence of the known-vulnerabilities-in-dependencies phenomenon in practice is then assessed. A total of 75 client systems available at SIG are used to test the prototype tool. The majority of them, 54, have at least one vulnerable dependency, while the maximum is seven vulnerable dependencies. Finally, technical consultants working at the host company evaluate the usefulness of such a system in practice. Interviews with consultants working at SIG are held to discuss the analysis results. Without the system, respondents would not have considered outdated dependencies and their impact on the security of the system. One specific customer was informed and was very appreciative of the detection of this vulnerability
in his system.

The problem investigated is partially similar to the topic we are researching. The difference between this approach and ours is that the tool proposed in this paper does not report whether an identified vulnerability really affects the system, e.g. to what extent the reported vulnerable methods or classes are actually used. In addition, like this research, we are also interested in evaluating the usefulness of such a security tool.

2.2 Tracking known security vulnerabilities in third-party components (Cadariu, 2014)

The paper "Tracking Known Security Vulnerabilities in Proprietary Software Systems" described previously is based on this prior research, a master's thesis. The thesis expands on several topics; the information is largely the same but somewhat more detailed. The goal of the thesis is to propose a method to continuously track known vulnerabilities in third-party components of software systems and to assess its usefulness in a relevant context. All potential publicly available sources of vulnerability reports (CVEs) are considered. Eventually the NVD is chosen, because it appeared to be the only source at that time offering XML feeds listing the vulnerabilities. Finally, interviews with consultants at SIG are conducted to assess the usefulness of the prototype tool developed during the course of this research. The evaluation shows that the method produces useful security-related alerts, consistently reflecting the presence of known vulnerabilities in third-party libraries of software projects. This study has shown the NVD to be the most useful vulnerability database for this kind of research, due to its adequacy for the research goal and its convenient data export features. This database contains known vulnerabilities that have been assigned a standardized CVE identifier.
However, for a vulnerability to be known, it does not necessarily need to go through the process that leads to a CVE assignment. Some security vulnerabilities are public knowledge before receiving a CVE identifier, for example when users of open-source projects signal them. Ideally, tracking known vulnerabilities would mean indexing every possible source that publishes information regarding software security threats; this has not been investigated in that research. In our research we will keep in mind that CVE databases are not the only data source for vulnerabilities, in case we run into problems with these traditional sources of vulnerability information.

2.3 The Unfortunate Reality of Insecure Libraries (Williams and Dabirsiaghi, 2012)

This article shows the prevalence and relevance of the problem of using libraries with known vulnerabilities. The authors show that there are significant risks associated with the use of libraries. A significant majority of the code found in modern applications originates from third-party libraries and frameworks. Organizations place strong trust in these libraries by incorporating them in their systems. However, after analyzing nearly 30 million downloads from the Maven Central dependency repository, the authors discover that almost 30% of the downloaded dependencies contain known vulnerabilities. The authors conclude that this proves that most organizations are unlikely to have a strong policy in place for keeping libraries up to date, to prevent systems becoming compromised by the known vulnerabilities in the dependencies used. The security of in-house developed code is normally given proper attention; in contrast, the possibility that risk comes from third-party libraries is barely considered by most companies. The 31 most downloaded libraries are closely examined. It turns out that 37% of the 1,261 versions of those libraries contain known vulnerabilities.
Even more interesting is that security-related libraries turn out to be 20% more likely to have reported security vulnerabilities than, say, a web framework. It is expected that these libraries simply
have more reported vulnerabilities due to the nature of the library: they receive more attention and scrutiny from researchers and hackers. Finally, it is found that larger organizations have on average downloaded 19 of the 31 most popular Java libraries, while smaller organizations downloaded a mere 8 of these libraries. The functionality offered by some of these libraries overlaps with functionality in other libraries. This is a concern because it indicates that larger organizations have not standardized on a small set of trusted libraries. More libraries used means more third-party code included in a system, and more code leads to a higher chance of security vulnerabilities being present. The authors conclude that deriving metrics indicating which libraries are in use and how far out-of-date and out-of-version they are would be a good practice. They recommend establishing systems and processes to lessen the exposure to known security vulnerabilities introduced by third-party dependencies, as the use of dependency management tools has caused a significant increase in the number of libraries involved in a typical application.

2.4 Impact assessment for vulnerabilities in open-source software libraries (Plate et al., 2015)

Due to the increased inclusion of open-source components in systems, each vulnerability discovered in the set of dependencies potentially jeopardizes the security of the whole application. After a vulnerability is discovered, its impact on a system has to be assessed. Current decision-making is based on high-level vulnerability descriptions and expert knowledge, which is not ideal, due to the effort that needs to be exercised and its proneness to error. In this paper a more pragmatic approach to assessing the impact is proposed. Once a vulnerability is discovered, the dependencies of a system will sometimes still not be updated to neutralize the risk of exposure.
In the enterprise context this can be attributed to the fact that these systems are mission-critical; therefore, downtime has to be minimized. The problem with updating dependencies is that new issues may be introduced, and enterprises are reluctant to update their dependencies more frequently for this reason. Due to these convictions, system maintainers need to carefully assess whether an application requires an urgent patch or whether the update can be applied during the application's regular release cycle. The question that arises is whether it can be determined if a vulnerability found in a dependency originates from parts of the dependency's API that are used by the system. In this paper a possible approach to assess this is described. The following assumption is made: whenever an application incorporates a library known to be vulnerable and executes a fragment of the library that contains the vulnerable code, there is a significant risk that the vulnerability can be exploited. The authors collect execution traces of applications and compare those with the changes that would be introduced by the security patches of known vulnerabilities, in order to detect whether critical library code is executed. Coverage is measured by calculating the intersection between programming constructs that are present in the security patch and constructs that are, directly or indirectly, executed in the context of the system. Practical problems arise due to the use of different sources such as VCS repositories and CVE databases; this is mainly attributed to the use of non-standardized methods of referring to a certain library and its versions. The authors state that once a vulnerability is discovered, its impact on a system has to be assessed. Their intended approach is similar to ours: look at the VCS repositories of dependencies and try to determine the changes that have occurred after the vulnerable version was released, up to the point the vulnerability was patched.
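The intersection test at the heart of this approach can be sketched as follows; the construct names are invented for illustration, and this is our formulation of the idea, not the authors' implementation.

```python
# Sketch (ours): a system is flagged as at risk when at least one
# programming construct touched by the security patch also appears in the
# collected execution traces of the application.
def patch_constructs(patch):
    """Constructs (e.g. fully qualified methods) changed by a security patch."""
    return set(patch)

def exercised_vulnerable_constructs(executed_constructs, patch):
    """Intersection of executed constructs and patch-changed constructs."""
    return set(executed_constructs) & patch_constructs(patch)

def at_risk(executed_constructs, patch):
    return bool(exercised_vulnerable_constructs(executed_constructs, patch))

traces = ["lib.Tls.heartbeat", "lib.Tls.handshake", "app.Main.main"]
patch = ["lib.Tls.heartbeat"]
print(at_risk(traces, patch))
```

The hard part is producing the two input sets reliably; the decision itself is a simple set intersection.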
However, manual effort is needed to connect CVE entries to VCS repositories. A key problem their approach faces is how to reliably relate CVE entries to the affected software products and the corresponding source code repositories, down to the level of accurately matching vulnerability reports with the code changes that fix them. This information was apparently unavailable or went unnoticed when their research was conducted; our preliminary investigation shows that VCS links are often referenced in the CVE entry itself, so there is no need to manually provide this information for each dependency.
2.5 Measuring Dependency Freshness in Software Systems (Bouwers et al., 2015)

Prioritizing dependency updates often proves to be difficult, since the use of outdated dependencies can be opaque. The goal of this paper is to make this usage more transparent by devising a metric that quantifies how recent the versions of the used dependencies generally are. The metric is calibrated by basing its thresholds on industry benchmarks. The usefulness of the metric in practice is evaluated, and the relation between outdated dependencies and security vulnerabilities is determined. In this paper, the term "freshness" denotes the difference between the used version of a dependency and the desired version, where the desired situation equates to using the latest version of the dependency. The freshness values of all dependencies are aggregated to the system level using a benchmark-based approach. A study is conducted to investigate the prevalence of outdated dependencies among 75 Java systems; Maven POM files are used to determine the dependencies used in these systems. When considering the overall state of dependency freshness using a version sequence number metric, the authors conclude that only 16.7% of the dependencies display no update lag at all, i.e. the most recent version of the dependency is used. Over 50% of the dependencies have an update lag of at least 5 versions. The version release date distance paints an even worse picture: the large majority (64.1%) of the dependencies have an update lag of over 365 days, with a tail of up to 8 years. Overall, the authors conclude that it is apparently not common practice to update dependencies on a regular basis. Given the measurement of freshness on the dependency level, a system-level metric can be defined by aggregating the lower-level measurements.
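A system-level aggregation of this kind can be sketched over per-dependency update lag; the category names and day thresholds below are our own illustration, whereas the paper calibrates its thresholds against an industry benchmark.

```python
# Illustrative aggregation (ours): bucket each dependency's update lag in
# days into a risk category, then report the share of dependencies per
# category as a system-level profile.
CATEGORIES = [("low", 90), ("moderate", 365), ("high", 730), ("very high", float("inf"))]

def risk_category(lag_days):
    """First category whose upper bound (in days) covers the given lag."""
    for name, upper_bound in CATEGORIES:
        if lag_days <= upper_bound:
            return name

def risk_profile(lags):
    """Percentage of dependencies falling into each risk category."""
    counts = {name: 0 for name, _ in CATEGORIES}
    for lag in lags:
        counts[risk_category(lag)] += 1
    return {name: 100.0 * count / len(lags) for name, count in counts.items()}

print(risk_profile([10, 200, 400, 3000]))
```

Such a profile makes the distribution of outdatedness visible at a glance instead of hiding it behind a single average.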
This aggregation method works with a so-called risk profile, which in this case describes the percentage of dependencies that falls into each of four risk categories. To determine the relationship between the dependency freshness rating and security vulnerabilities, the authors calculate the rating for each system and determine how many of the dependencies used by a system have known security vulnerabilities. The experiment points out that systems with a high median dependency freshness rating show a lower number of dependencies with reported security vulnerabilities; the opposite also holds. Moreover, systems with a low dependency freshness score are more than four times as likely to incorporate dependencies with known security vulnerabilities. This study relates to our topic because it shows there is a relation between outdated dependencies and security vulnerabilities. The tool we propose can justify the importance of updating dependencies by showing the vulnerabilities the system is otherwise exposed to; the use of outdated dependencies is no longer opaque.

2.6 Monitoring Software Vulnerabilities through Social Networks Analysis (Trabelsi et al., 2015)

Security vulnerability information is spread over the Internet, and it requires manual effort to track all these sources. Trabelsi et al. (2015) noticed that the information in these sources is frequently aggregated on Twitter. Therefore, Twitter can be used to find information about software vulnerabilities; this can even include information about zero-day exploits that have not yet been submitted to CVE databases. The authors propose a prototype tool to index this information. First, a clustering algorithm for social media content is devised, grouping all information regarding the same subject matter, which is a prerequisite for distinguishing known from new security information. The system comprises two subsystems: a data collection part and a data processing part.
The data collection part stores information including common security terminology such as “vulnerability” or “exploit” combined with names of software components such as “Apache Commons”. Apart from Twitter information, a local mirror of a CVE database, such as the NVD, is stored. This database is used to categorize security information obtained from Twitter, in particular to distinguish new information from the repetition of already known vulnerability information. The data processing part identifies, evaluates and classifies the security
information retrieved from Twitter. The data is processed using data-mining algorithms; each algorithm is implemented by a so-called analyzer. An element of this system is a pre-processor that filters out duplicate tweets or content not meeting certain criteria. To detect zero-day vulnerability information, the authors identify clusters of information that relate to the same issue of some software component and contain specific vulnerability keywords. The prototype tool conducts a Twitter search by identifying information matching the regular expression “CVE-*-” to obtain all the messages dealing with CVEs. After this, the messages are grouped by CVE identifier in order to obtain clusters of messages dealing with the same CVE. From these clusters the authors extract the common keywords in order to identify the manifestation of the vulnerability.

Furthermore, the paper presents the results of an empirical study that compares the availability of information published through social media (e.g. Twitter) and classical sources (e.g. the NVD). The authors conducted two studies that compare the freshness of the collected data to the traditional sources. The first study compares the publication date of CVEs in the NVD with their publication date on social media: 41% of the CVEs were discussed on Twitter before they were listed in the NVD. The second study investigates the publication date of zero-day vulnerabilities on social media relative to the publication date of the related CVE in the NVD: 75.8% of these vulnerabilities were disclosed on social media before their official disclosure in the NVD.

The research conducted by Trabelsi et al. (2015) relates to our topic because we might also want to use unconventional sources (i.e. other than CVE databases) to either obtain new vulnerability information or complement existing vulnerability data.
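The grouping of messages by CVE identifier described above can be sketched with a regular expression over the message texts. The tweets below are invented examples; the exact pattern used by the authors is our reading of their “CVE-*-” search.

```python
import re
from collections import defaultdict

# Standard CVE identifier shape: CVE-<year>-<sequence number>.
CVE_ID = re.compile(r"CVE-\d{4}-\d{4,}")

def cluster_by_cve(messages):
    """Group messages into clusters keyed by the CVE identifiers they mention."""
    clusters = defaultdict(list)
    for msg in messages:
        for cve in CVE_ID.findall(msg):
            clusters[cve].append(msg)
    return dict(clusters)

tweets = [
    "PoC released for CVE-2015-7501 (commons-collections deserialization)",
    "Patch your servers: CVE-2015-7501 exploited in the wild",
    "New advisory CVE-2016-0714 affects Apache Tomcat session persistence",
]
clusters = cluster_by_cve(tweets)
print(len(clusters["CVE-2015-7501"]))  # 2: two messages about the same CVE
```

Keyword extraction over each cluster, as the authors do, would then run on these per-CVE message groups.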
2.7 An Analysis of Dependence on Third-party Libraries in Open Source and Proprietary Systems
Raemaekers et al. (2012a)

At present there is little insight into the actual usage of third-party libraries in real-world applications as opposed to general download statistics. The authors of this paper seek to identify the frequency of use of third-party libraries among proprietary and open source systems. This information is used to derive a rating that reflects the frequency of use of specific libraries and the dependence on third-party libraries. The rating can be employed to estimate the amount of exposure to possible security risks present in these libraries.

To obtain the frequency of use of third-party libraries, import and package statements are extracted from a set of Java systems. After processing the import and package statements, a rating is calculated for individual third-party libraries and the systems that incorporate these libraries. The rating for a specific library is the number of different systems it is used in divided by the total number of systems in the sample system set. The rating for a system as a whole is the sum of all ratings of the libraries it contains, divided by the square of the number of libraries.

The authors hypothesize that when a library is shown to be incorporated frequently in multiple systems, there must have been a good reason to do so. The reasoning behind this is that apparently a large number of teams deems the library safe enough to use and has therefore made a rational decision to prefer this library over another library offering similar functionality. It is assumed that people are risk-averse in their choice of third-party libraries and therefore tend to prefer safer libraries to less safe ones. The authors thus exploit this collective judgment in the rating. Raemaekers et al.
(2012a) also assume that the more third-party library dependencies a system has, the higher the exposure to risk in these libraries becomes. The analysis shows that frequency of use and the number of libraries used can give valuable insight into the usage of third-party libraries in a system. The final rating ranks more common third-party libraries higher than less common ones, and systems with a large number of third-party dependencies are rated lower than systems with fewer third-party dependencies.

This paper relates to our topic because the rating derived may correlate with the secureness of a library or system as a whole: if a system uses many obscure dependencies, it could be considered less safe. However, this assumption does not necessarily hold in all cases, because a popular library may attract more attention from hackers and is thus a more attractive target to exploit than less commonly used libraries.
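The ratings defined in this paper follow directly from their definitions: a library's rating is the fraction of sample systems using it, and a system's rating is the sum of its libraries' ratings divided by the square of the number of libraries. The system-to-library mapping below is invented for illustration.

```python
def library_rating(library, systems):
    """Fraction of systems in the sample set that use the library."""
    return sum(library in libs for libs in systems.values()) / len(systems)

def system_rating(system, systems):
    """Sum of the ratings of the system's libraries, divided by the
    square of the number of libraries it uses."""
    libs = systems[system]
    return sum(library_rating(l, systems) for l in libs) / len(libs) ** 2

systems = {  # hypothetical sample set: system -> third-party libraries used
    "shop":    {"log4j", "junit", "obscure-lib"},
    "billing": {"log4j", "junit"},
    "crm":     {"log4j"},
}
print(library_rating("log4j", systems))          # 1.0: used by every system
print(round(system_rating("shop", systems), 3))  # 0.222: many libs, one obscure
```

Note how "shop" is penalized twice, once for using an uncommon library and once for the squared library count, matching the intuition that more dependencies mean more exposure.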
2.8 Exploring Risks in the Usage of Third-Party Libraries
Raemaekers et al. (2011)

Using software libraries may be tempting, but we should not ignore the risks they can introduce to a system. These risks include lower quality standards or security risks due to the use of dependencies with known vulnerabilities. The goal of this paper is to explore to what extent the risks involved in the use of third-party libraries can be assessed automatically. A rating based on frequency of use is proposed to assess this. Moreover, various library attributes that could be used as risk indicators are examined. The authors also propose an isolation rating that measures the concentration and distribution of library import statements in the packages of a system. Another goal of this paper is to explore methods to automatically calculate such a rating based on static source code analysis.

First, the frequency of use of third-party libraries in a large corpus of open source and proprietary software systems is analyzed. Secondly, the authors investigate additional library attributes that could serve as indicators for risks in the usage of third-party libraries. Finally, the authors investigate ways to improve this rating by incorporating information on the distribution and concentration of third-party library import statements in the source code. The result is a formula by which one can calculate the rating based on the frequency of use, the number of third-party libraries that a system uses and the encapsulation of calls to these libraries in sub-packages of a system. The rating for a specific library that the authors propose in this paper is the number of different systems it is used in divided by the total number of systems in the data set. The rating for a system is the average of all ratings of the libraries it contains, divided by the number of libraries.
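The isolation idea — imports of a library concentrated in a few dedicated packages versus scattered across the whole system — can be quantified with any inequality measure over per-package import counts. A Gini coefficient is one common choice; this is our illustration, not necessarily the exact formula the paper uses.

```python
def gini(counts):
    """Gini coefficient of non-negative counts: 0 means imports are spread
    evenly over packages; values near 1 mean they are concentrated."""
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    if total == 0:
        return 0.0
    # Standard formula based on the order statistics of the sorted values.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n

# Imports of one library counted per package of a hypothetical system.
isolated  = [0, 0, 0, 12]   # all imports in one dedicated package
scattered = [3, 3, 3, 3]    # imports spread across the system
print(gini(isolated))   # 0.75: high inequality, well isolated
print(gini(scattered))  # 0.0: maximally scattered
```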
Risks in the usage of third-party libraries are influenced by the way a given system uses a specific library. In particular, the usage can be well encapsulated in one dedicated component (which would isolate the risk), or scattered through the entire system (which would distribute risk to multiple places and make it costly to replace the library). When a library is imported frequently in a single package but not frequently imported in other packages, this results in an array of frequencies with a high ’inequality’ relative to each other. Ideally, third-party imports should be confined to specific packages dealing with the library in question, thus reducing the amount of code ’exposed’ to possible risks in this library.

This paper describes an approach that uses the frequency of use of third-party libraries to assess risks present in a system. With this data, an organization can gain insight into the risks present in libraries and contemplate the measures or actions needed to reduce this risk. This paper relates to our topic because API usage is used as a proxy for potential vulnerability risk. In the system we propose, we seek to determine whether vulnerable APIs are called.

2.9 Measuring Software Library Stability Through Historical Version Analysis
Raemaekers et al. (2012b)

Vendors of libraries and users of the same libraries have conflicting concerns. Users seek backward compatibility in libraries, while library vendors want to release new versions of their software to include new features, improve existing features or fix bugs. Library vendors are constantly faced with a trade-off between keeping backward compatibility and living with mistakes from the past. The goal of this paper is to introduce a way to measure interface and implementation stability. By means of a case study, several issues with third-party library dependencies are illustrated:

• It is shown that maintenance debt accumulates when updates of libraries are deferred.
• The authors show that when a moment arrives at which there is no choice but to update to a new version, a much larger effort is required than when smaller incremental updates are performed during the evolution of the system.
• It is shown that the transitive dependencies libraries bring along can increase the total amount of work required to update to a new version of a library, even if an upgrade of these transitive dependencies was originally not intended.

• The authors show that a risk of using deprecated and legacy versions of libraries is that they may contain security vulnerabilities or critical bugs.

The authors propose four metrics that provide insight into different aspects of implementation and interface stability. Library (in)stability is the degree to which the public interface or implementation of a software library changes as time passes, in such a way that it potentially requires users of this library to rework their implementations due to these changes.

This study illustrates one of the reasons a system’s dependencies are often not kept up to date. We may utilize these metrics in our research to indicate how much a dependency’s interface has changed between the currently used version and a newer version containing security improvements. This indication provides an estimate of the amount of time needed to update to a newer release of a dependency.

2.10 An Empirical Analysis of Exploitation Attempts based on Vulnerabilities in Open Source Software
Ransbotham (2010)

Open source software has the potential to be more secure than closed source software due to the large number of people who review the source code and may find vulnerabilities before they are shipped in the next release of a system. However, when considering vulnerabilities identified after the release of a system, malicious persons might abuse the openness of its source code. These individuals can use the source code to learn about the details of a vulnerability in order to fully exploit it; the shadow side of making source code available to anyone. Open source software presents two additional challenges to post-release security.
First and foremost, the open nature of the source code eliminates any benefits of private disclosure. Because changes to the source code are visible, they are publicly disclosed by definition, making it easy for attackers to figure out how to defeat the security measures. Secondly, many open source systems are themselves used as components in other software products. Hence, not only must the vulnerability be fixed in the initial source; the fix must also be propagated through derivative products, released and installed. These steps give attackers more time, further increasing the expected benefits for the attacker. In conclusion, when compared to proprietary dependencies, open source dependencies have a greater risk of exploitation, diffuse earlier and wider, and have a greater overall volume of exploitation attempts.

Using open source libraries brings along additional security risks due to their open character. Vulnerabilities in these libraries, even when they are patched, propagate to other systems incorporating these libraries. Since the effort to exploit a system decreases due to the availability of the source code, it is paramount that early warnings are issued and distributed upon discovery of a vulnerability. The latter can be accomplished by the tool we propose. This way, owners can limit the exploitability of their system. Therefore, this research emphasizes why our area of research is so important.

2.11 Understanding API Usage to Support Informed Decision Making in Software Maintenance
Bauer and Heinemann (2012)

The use of third-party libraries has several productivity-related advantages, but it also introduces risks — such as exposure to security vulnerabilities — to a system. In order to be able to make informed decisions, a thorough understanding of the extent and nature of the dependence upon external APIs is needed. Risks include that:
• APIs keep evolving, often introducing new functionality or providing bug fixes. Migrating to the latest version is therefore often desirable. However, depending on the amount of changes — e.g. in case of a major new release of an API — backward compatibility might not be guaranteed.

• An API might not be completely mature yet. Thus, it could introduce bugs into a software system that may be difficult to find and hard to fix. In such scenarios it would be beneficial to replace the current API with a more reliable one as soon as it becomes available.

• The provider of an API might decide to discontinue its support, such that users can no longer rely on it for new functionality and bug fixes.

• The license of a library or a project might change, making it impossible to continue the use of a particular API for legal reasons.

These risks are beyond the control of the maintainers of a system who are using these external APIs, but they do need to be taken into account when making decisions about the maintenance options of a software system. Tool support is therefore required to provide this information in an automated fashion. Bauer and Heinemann (2012) devise an approach to automatically extract information about library usage from the source code of a project and visualize it to support decision-making during software maintenance. The goal is to determine the degree of dependence on the used libraries. This paper is related to our topic in the sense that the tool we will devise could be used to provide insight into the effort required to update a vulnerable dependency to a newer version once a vulnerability has been discovered.
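Extracting library usage from source code, as Bauer and Heinemann do for Java, can be approximated by scanning for import statements. The sketch below counts imports per top-level package as a rough proxy for library usage; the mapping from package prefix to library is a simplification we assume for illustration.

```python
import re
from collections import Counter

# Matches Java import statements, including static imports.
IMPORT = re.compile(r"^\s*import\s+(?:static\s+)?([\w.]+)\s*;", re.MULTILINE)

def count_imported_packages(java_source):
    """Count imports per two-segment package prefix (e.g. org.apache)."""
    counts = Counter()
    for fqn in IMPORT.findall(java_source):
        counts[".".join(fqn.split(".")[:2])] += 1
    return counts

source = """
import org.apache.log4j.Logger;
import org.apache.commons.io.FileUtils;
import java.util.List;
"""
print(count_imported_packages(source))
# Counter({'org.apache': 2, 'java.util': 1})
```

A real analysis would map prefixes to concrete library artifacts and aggregate over all source files of a project.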
Chapter 3
Research method

In this chapter we explain the research method we will employ during our research. The goal of this chapter is to explain our instantiation of Technical Action Research (TAR).

3.1 Introduction

In this thesis TAR will be employed as proposed by Wieringa and Morali (2012). TAR is a research method in which a researcher evaluates a technique by solving problems in practice using that technique. Findings can be generalized to unobserved cases that show similarities to the studied case. In TAR, a researcher fulfills three roles:

I Artifact designer
II Client helper
III Empirical researcher

The technique is first tested on a small scale in an idealized “laboratory” setting and is then tested in increasingly realistic settings within the research context, eventually finishing by making the technique available for use in other contexts to solve real problems.

Before a suitable technique can be developed, improvement problems should be solved and knowledge questions answered. An improvement problem in this case could be: “How can we assess actual exposure to vulnerabilities in an automated fashion?”. Knowledge questions are of the form “Why is it necessary to determine actual exposure to vulnerabilities?” or “What could be the effect of utilizing this technique in practice?”.

To solve an improvement problem we can design treatments. A treatment is something that solves a problem or reduces its severity. Each plausible treatment should be validated, and one should be selected and implemented. A treatment consists of an artifact interacting with a problem context. This treatment will be inserted into a problem context, with which it will start interacting. In our case the treatment consists of a tool incorporating the technique proposed earlier, used to fulfill some goal. Treatments can be validated by looking at their expected effects in context, the evaluation of these effects, expected trade-offs and sensitivities.
It is necessary to determine actual exposure to vulnerabilities because the maintainers of a system often neglect to keep their dependencies updated due to a presumed lack of threat. A tool that points out to complacent maintainers that their perceived sense of security is false would stimulate them to take action; after all, once they know of the threat, so do large numbers of others with less honorable intentions. The effect of this would be that a system’s dependencies are better kept up to date, which should lead to improved security. This is also expected to lead to improved maintainability of a system. This can be substantiated by arguing that the more time has passed since a dependency was last updated, the more effort it takes to upgrade. The reason is that the public API of a dependency evolves, and as more time passes and more updates are released, the API might have changed so dramatically that it is almost impossible to keep up.
Generalization of solutions in TAR is achieved by distinguishing between particular problems and problem classes. A particular problem is a problem in a specific setting. When abstracted away from this setting, a particular problem may indicate the class of problems it belongs to. This is important because the aim of conducting this research is to accumulate general knowledge rather than case-specific knowledge that does not apply in a broader context. In the next sections we will explain our instantiation of the three cycles, each one belonging to a specific role (client helper, empirical researcher, artifact designer) the researcher fulfills.

3.2 Client helper cycle

3.2.1 Problem investigation

SIG offers security-related services to its clients. As part of this value proposition, the Vulnerability Alert Service (VAS) tool has been devised. Although the tool is considered to be useful, it also generates a lot of false positives. More importantly, SIG consultants need to manually verify each reported vulnerability to see whether it could impact the system of the client. This is based on the consultant’s knowledge of the part of the dependency the vulnerability is contained in and how this dependency is used in the system. An issue is that this assessment is not foolproof, because it relies on the consultant’s knowledge of the system, which may be incomplete. A better option would be to assess fully automatically whether vulnerable code may be executed, without the involvement of humans.

SIG also provides its clients with services to assess the future maintainability of a system. When dependencies are not frequently updated to newer versions, it will require considerably more effort in the future to integrate with newer versions of the dependency due to API changes. As discussed in the introduction, the reason for not updating may be attributed to the anxiety of introducing new bugs when doing so.
If any of the used dependencies are known to have security vulnerabilities, the maintainers of such systems have to be convinced of the urgency of updating to a newer version to mitigate the vulnerability. Maintainers may think that they are not affected by a known vulnerability based on their judgement. This judgement may be poor. Automatic tooling could be employed to convince these maintainers of the urgency of updating when it can be shown that vulnerable code is likely executed. If the tool indicates the system is actually exposed to the vulnerability, the dependency will likely be updated, which may improve the long-term maintainability of the system because the distance between the latest version of the dependency and the used version decreases. In turn, this makes it easier to keep up with breaking API changes when they occur rather than letting them accumulate. Hence, our tool might also be useful from a maintainability perspective.

We have identified an approach that could be used to fulfill this need. We will design a tool that incorporates such functionality and appraise whether this tool can be exploited in useful ways for SIG. Table 3.1 shows the stakeholders that are involved in the SIG context along with their goals and criteria.

Stakeholder: SIG
Goals: Add value for clients by actively monitoring exposure to known vulnerabilities.
Criteria: The tool should aid in system security assessments conducted by consultants at SIG. The number of false positives reported should be minimized, as this may lead to actual threats going unnoticed in the noise. Clients should consider any findings of the tool useful and valuable.

Stakeholder: SIG’s clients
Goals: The tool allows clients to take action as soon as possible when new threats emerge.
Criteria: Less exposure to security threats. Improved maintainability of the system.

Table 3.1: Stakeholders in the SIG context, their goals and criteria.
3.2.2 Treatment design

Using the artifact (proof-of-concept tool) and the context (SIG) we can devise multiple treatments:

I Tool indicates actual exposure to a vulnerability in a library → client updates to a newer version of the dependency → security risk lowered and dependency lag reduced. This treatment contributes to the goals in that the security risk of that specific system is lowered and the maintainability of the system is improved.

II Tool indicates actual exposure to a vulnerability in a library → client removes the dependency on the library or replaces it with another library having the same functionality. This treatment might lessen the immediate security risk, but the other library might carry other risks. The dependency lag with a new dependency could remain stable, but it can also change negatively or positively depending on the dependency lag of the new dependency.

3.2.3 Design validation

The effect we expect our tool to accomplish is improved awareness of exposure to vulnerabilities on the part of both stakeholders. The resulting value for the client is that they are able to take action and therefore improve the security of the system. Awareness leads to reduced dependency lag and thus to improved maintainability. Even if the use case of the tool shifts within SIG, the artifact is still useful because it can be used in both security-minded contexts and maintainability-minded contexts.

3.2.4 Implementation and implementation evaluation

The proof-of-concept is used to analyze a set of client systems. We will investigate one client system for which a security assessment is ongoing and schedule an interview with the involved SIG consultants to discover whether our tool supports their work and ultimately adds value for the client.

3.3 Research cycle

3.3.1 Research problem investigation

The research population consists of all clients of SIG having systems with dependencies, as well as the SIG consultants responsible for these systems.
The research question we seek to answer by using TAR is: “Can the results of a tool implementing the proposed technique be exploited in useful ways by SIG?” Useful in this case denotes that the results will add value for SIG and its clients. The VAS tool currently used at SIG was already considered useful when it was delivered. Therefore, it is most relevant to assess what makes our tool more useful than VAS.

3.3.2 Research design

The improvement goal in the research context is to extend or supplement the current VAS tool to assess actual exposure to vulnerabilities, then monitor the results and improve them if possible. We have chosen to proceed with the first (I) treatment (refer to the client helper cycle). This treatment is preferred as it satisfies two goals at the same time, as opposed to the second (II) treatment. The research question will be answered in the context of SIG. Data is collected by first obtaining analysis results from the tool we propose, then discussing these results with SIG consultants or clients. Based on this data we seek to assess which components contribute to the perceived usefulness. The results are expected to be useful from at least a maintainability and a security perspective. Hence, it is expected that in other contexts the results are deemed useful as well, in these or other perspectives.

3.3.3 Research design validation

We expect that our tool can serve various purposes in different contexts. It should be noted that a human would also be able to assess actual exposure to vulnerabilities. However, as the average number of dependencies used in a system increases, manual examination is only feasible for systems with few dependencies. The research design allows us to answer the research question as the tool can be used by consultants at SIG in real client cases. As these consultants actually use the tool to aid in an assessment, they are likely to provide meaningful feedback.

We have identified the following potential risks that may threaten the results obtained in the research cycle:

• SIG clients’ systems use uncommon libraries (no CVE data available).
• SIG clients’ systems use only proprietary libraries (no CVE data available).
• Perceived usefulness significantly varies per case.
• There is no perceived usefulness. However, in that case we could look at which elements do not contribute to the usefulness and try to change them.
• The VAS system we rely on for CVE detection does not report any vulnerabilities while those are present in a certain library (false negatives).

Stakeholder: Maintainers of systems with dependencies
Goals: Improve system maintainability and security by actively monitoring exposure to known vulnerabilities.
Criteria: Use of the tool should lead to reduced dependency lag and thus fewer maintainability-related problems. Not too many false positives reported.

Stakeholder: Companies/entities with internal systems
Goals: Lessen the security risk of these internal systems.
Criteria: Not too many missed vulnerabilities (false negatives) leading to a false sense of security.

Stakeholder: Researchers
Goals: Utilize actual vulnerability exposure data in research in order to draw conclusions based on this data.
Criteria: Accuracy of reported exposure to vulnerabilities.

Stakeholder: Third-party service providers
Goals: Deliver a security-related service to clients.
Criteria: Scalability and versatility of the solution.

Table 3.2: Stakeholders in the general context and their goals and criteria.

3.3.4 Analysis of results

We will execute the client helper cycle.
Then, we evaluate the observations and devise explanations for unexpected results. Generalizations to other contexts are hypothesized and limitations noted. We will dedicate a separate chapter to this.

3.4 Design cycle

3.4.1 Problem investigation

The tooling currently available to detect known vulnerabilities in the dependencies of a system does not assess actual exposure to these vulnerabilities. We plan to develop a tool that is actually able to do this. In Table 3.2 we list a number of stakeholders that could potentially be users of this tool in external contexts. By observing the following phenomena we can conclude that there is a need for tooling to aid in the detection of dependencies that have known vulnerabilities:

• Up to 80 percent of code in modern systems originates from dependencies (Williams and Dabirsiaghi, 2012).
• In 2011, researchers found that 37% of the 1,261 versions of the 31 libraries studied contain at least one vulnerability (Williams and Dabirsiaghi, 2012).

• Plate et al. (2015) indicate that once a vulnerability in a system’s dependencies is discovered, companies often still do not update them.

• There is a need to carefully assess whether an application requires an urgent patch or whether the patch can be applied during the regular release cycle.

3.4.2 Artifact design

We will design and implement a proof-of-concept tool incorporating this functionality.

3.4.3 Design validation

We expect that the tool we propose can be useful in multiple contexts. The results achieved after executing the research cycle will provide evidence of whether it is deemed useful in at least the one context that is researched. We also expect that there will be limitations that impact the usefulness in certain contexts. We will note these limitations and try to accommodate them, or else propose alternative approaches that may be used in the future to reduce these limitations.

Different types of users can use the prototype tool to find known vulnerabilities in dependencies. This information can be used for multiple purposes. We have listed some potential stakeholders of this kind of information in the table at the beginning of this section. Thus, the tool should be considered useful in multiple contexts. The exposure to known vulnerabilities could also be assessed manually: after a list of vulnerabilities potentially affecting the system is obtained, a human could try to determine whether vulnerable code is potentially executed. The disadvantage is that this would require manual effort. The advantage is that there would be fewer false negatives, i.e. a human is able to determine the vulnerable methods regardless of the source of this information.
However, the manual effort exerted may be very time consuming and thus this approach is not scalable, while the approach we suggest — using automatic tooling — is. To this point we have assumed that all vulnerabilities originate from vulnerable code at the method level. However, it should be noted that vulnerabilities can also be the result of wrong configuration. For instance, a property in a configuration file may be set to a value that makes a system less secure. In such cases our approach would not yield any results. Our tool could be changed to accommodate this, but in our experience it would be very hard to find out which settings make a system insecure; there is little structured information available about such misconfigurations, and furthermore these vulnerabilities tend to be user configuration errors rather than vulnerabilities present in the dependencies themselves.

3.4.4 Implementation and implementation evaluation

Ordinarily, we would release the source code of the proof-of-concept tool after our research ends. This would allow the tool to be used in other contexts. Unfortunately, at this time our host company cannot open-source the tool for intellectual property reasons.
Chapter 4
Designing a proof of concept tool

In this chapter we explain how we will construct our prototype tool, including the technical choices we have made. We will first give the research context and a high-level overview of the components involved in realizing automatic assessment of exposure to vulnerabilities, followed by a more in-depth explanation of these components. The goal of this chapter is to provide insight into how a prototype tool could be constructed, including the implementation choices made and the difficulties faced.

4.1 Research context

SIG is interested in expanding their product offering with new security-related products. For this purpose, SIG has developed a tool called VAS in the past. This tool extracts information from a POM file, which is an artifact used in the Maven build system. Maven facilitates easy management of dependencies, e.g. installing and updating dependencies. Users can simply declare a list of libraries they require in the POM file and Maven will download them and/or update to a newer version. The VAS tool uses the information in this file to derive the list of dependencies of an arbitrary system. VAS will then download a local copy of the NVD1 and search for CVEs affecting any used versions of the dependencies. A report is made if there are known vulnerabilities listed for a specific version of a dependency that is used. The CVE entries contain CPE identifiers that reflect the platforms affected by the vulnerability. Formally, CPE is defined as a “naming specification [that] defines standardized methods for assigning names to IT product classes”2.

For the purpose of this thesis, an extension to the current VAS tool, the Assessed Exposure Vulnerability Alert Service (AEVAS), will be developed. For a given system, the existing VAS tool produces a list of CVE identifiers for all known vulnerabilities present in the system’s dependencies. VAS will then prompt AEVAS to conduct additional analysis by passing it the list of CVEs.
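The dependency extraction from a POM file that VAS (and our extension) relies on can be sketched with a standard XML parser. The pom.xml fragment below is a minimal invented example; the actual VAS implementation is not described at this level of detail.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://maven.apache.org/POM/4.0.0"}

def dependencies(pom_xml):
    """Extract (groupId, artifactId, version) triples from a POM document."""
    root = ET.fromstring(pom_xml)
    deps = []
    for dep in root.findall(".//m:dependencies/m:dependency", NS):
        deps.append(tuple(dep.findtext(f"m:{tag}", namespaces=NS)
                          for tag in ("groupId", "artifactId", "version")))
    return deps

pom = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>log4j</groupId><artifactId>log4j</artifactId>
      <version>1.2.17</version>
    </dependency>
  </dependencies>
</project>"""
print(dependencies(pom))  # [('log4j', 'log4j', '1.2.17')]
```

A production version would also resolve version properties and dependency management sections, which Maven evaluates before the effective dependency list is known.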
4.2 High-level overview of the tool

Conceptually, the approach that allows us to assess actual exposure to known vulnerabilities for a given system works as follows:

I. The dependencies of a system are identified. We store the specific versions of the dependencies that are used.

II. We download the executables containing these dependencies.

III. We gather all CVE entries affecting any of the identified dependencies. Furthermore, we process the references listed in the CVE entries. These references may refer to VCS artifacts, such as a link to a commit on GitHub.
IV. We establish which library methods are vulnerable. If a reference links to a specific VCS artifact, we can identify which code was changed. More specifically, we are interested in which methods had their implementation changed.

V. We determine which library methods are invoked.

VI. We ascertain whether one of the invoked library methods is a method we identified as vulnerable earlier in the process. If so, we assume that the system in question is vulnerable to that specific vulnerability.

Figure 4.1 provides an overview of the steps involved. We will describe these steps in detail in the next section.

Figure 4.1: A high-level overview of the steps involved.

4.2.1 Gathering and downloading dependencies of a system

We look at the POM file used by Maven to identify the dependencies of a system. In this file, the dependencies are listed in a structured way. We then try to download the dependencies from the Maven Central Repository. Some dependencies might be proprietary; in that case we can not download them through the Maven Central Repository, and we exclude them from the rest of our analysis. This is not a major concern because CVE data is usually not available for proprietary or internal dependencies.

4.2.2 Gathering CVE data relevant to included dependencies

We need to determine the vulnerabilities that potentially impact a system. There are several ways to assess this, but the most straightforward approach is to obtain this information from VAS, the current vulnerability monitoring system used at SIG. VAS exposes a REST API. Similarly to our tool, VAS extracts dependency information from a system’s POM file and looks for known vulnerabilities in the included dependencies, as depicted in Figure 4.2. We can query this API, and a list of CVEs for the dependencies of any given system is returned.
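To make the dependency identification of step 4.2.1 concrete: the POM file lists each dependency's coordinates in a structured way, so they can be read with any XML parser. The following is a minimal, hypothetical sketch (the actual VAS/AEVAS code is not public; class and method names here are our own):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not the VAS/AEVAS implementation) showing how the
// structured dependency list of a POM file could be read with a DOM parser.
class PomDependencyReader {

    /** Returns "groupId:artifactId:version" for each <dependency> element. */
    static List<String> readCoordinates(String pomXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(pomXml.getBytes(StandardCharsets.UTF_8)));
        List<String> coordinates = new ArrayList<>();
        NodeList dependencies = doc.getElementsByTagName("dependency");
        for (int i = 0; i < dependencies.getLength(); i++) {
            Element dep = (Element) dependencies.item(i);
            coordinates.add(textOf(dep, "groupId") + ":"
                    + textOf(dep, "artifactId") + ":"
                    + textOf(dep, "version"));
        }
        return coordinates;
    }

    // First matching child element's text, or the empty string when absent.
    private static String textOf(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() > 0 ? nodes.item(0).getTextContent().trim() : "";
    }
}
```

The resulting coordinates are exactly what is needed to fetch the corresponding JAR files from the Maven Central Repository in step II.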
Once we have a list of CVE identifiers, additional information relating to each CVE is retrieved from various sources, such as the CVSS score that indicates the severity and potential impact of the vulnerability. In particular, we are interested in the list of references included in a CVE entry. References, as their name implies, are links to additional sources offering information related to some aspect of the CVE. In some cases, links to issue tracking systems and links to a commit or some other VCS artifact are given.
Figure 4.2: Systems have dependencies, which frequently have known vulnerabilities.

4.2.3 Establishing vulnerable methods

In line with our assumptions, as stated in Section 1.4, we expect that the commits identified in the references of a CVE entry contain the fix for the vulnerability. More specifically, the methods changed in the fix are the ones that contained the vulnerable code before it was fixed. The process of gathering the vulnerable methods from patches in commits is visualized in Figure 4.3.

Figure 4.3: In a CVE entry we try to find a VCS reference, which potentially allows us to identify the vulnerable methods.

4.2.4 Ascertaining which library methods are invoked

Furthermore, we need to confirm that the system in question actually invokes any of these vulnerable methods, directly or indirectly. We derive a list of called methods by conducting call graph analysis.

4.2.5 Identifying vulnerable methods that are invoked

Finally, to determine whether the system in question is exposed to a vulnerability, we take the intersection of the set of dependency API methods that are invoked and the set of vulnerable dependency methods. If the result of this intersection is not empty, we conclude that the system in question is actually vulnerable.

4.3 Detailed approach for automatically assessing exposure to known vulnerabilities

We have implemented the proof of concept tool in Java 8. We chose this programming language because the majority of client systems at SIG are written in it. Because we use these client systems in our analysis to determine the usefulness of such a tool, and because we need to create a call graph for these systems, we need a call graph library that can handle Java code. We did not find any suitable call graph library that can handle Java systems, can be invoked programmatically, and is written in a language other than Java.
Therefore, we chose to implement the proof of concept tool in Java. The next sections describe how the steps mentioned in the previous section are implemented to arrive at the final goal of assessing the actual exposure to vulnerabilities.
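The decisive check from Section 4.2.5, intersecting invoked methods with vulnerable methods, can be sketched in a few lines. This is a minimal illustration, not the AEVAS implementation; representing methods as plain strings is an assumption made for brevity:

```java
import java.util.HashSet;
import java.util.Set;

// Minimal sketch of the exposure check from Section 4.2.5: a system is flagged
// as exposed when at least one invoked library method is also vulnerable.
// Representing methods as plain strings is an assumption for illustration.
class ExposureCheck {

    static Set<String> exposedMethods(Set<String> invokedLibraryMethods,
                                      Set<String> vulnerableMethods) {
        Set<String> intersection = new HashSet<>(invokedLibraryMethods);
        intersection.retainAll(vulnerableMethods); // keep methods present in both sets
        return intersection;
    }

    static boolean isExposed(Set<String> invoked, Set<String> vulnerable) {
        return !exposedMethods(invoked, vulnerable).isEmpty();
    }
}
```

A non-empty intersection is the signal that a known vulnerability is actually reachable from the system's own code.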
  • 28. CHAPTER 4. DESIGNING A PROOF OF CONCEPT TOOL 4.3.1 Determining vulnerable methods The existing VAS system will pass a list of CVEs to AEVAS. These CVEs are all the CVEs affecting the specific versions of libraries that are used by a given system. Finding references to Version Control System artifacts First of all, more information relating to each CVE is obtained. This information includes a list of references. These references are simple URLs pointing to a resource that has more information on the vulnerability in any form. A reference could simply refer to a CVE website or a blog post describing the vulnerability in more detail. We acquire this additional CVE data by using the open-source vFeed3 tool that downloads information from public CVE databases and stores it in a local database. For each reference, we assess if it is a link that contains information related to a version control repository. For example, a link may refer to a specific commit on GitHub. In our prototype implementation we will solely use Git artifacts. One might ask why we choose Git here opposed to any other VCS, such as Subversion or Mercurial. The reason is that the volume of Git references simply outnumbers the amount of references related to any other VCS. Figure 4.4 provides a graphical depiction of the number of references found in the NVD CVE database for each distinct VCS. Figure 4.4: The number of VCS related references found in the NVD CVE database grouped by VCS. Using regular expressions we check if a reference is a valid link to a specific commit. Listing 1 shows how this check has been implemented. The extractGitArtifactsFromReferences method first determines which regular expression should be applied, based on certain keywords (such as GitHub, GitLab and Bitbucket) in the reference. The method tryToExtractGitPlatformArtifacts shows how this is implemented for one of three types of Git URLs we take into account. 
The methods tryToExtractCgitPlatformArtifacts and tryToExtractGenericGitURLArtifacts are very similar; they differ only in the regular expressions used to extract the information needed. We have implemented it this way so that it is relatively straightforward to support other platforms in the future.

    protected void extractGitArtifactsFromReferences() throws NoGitArtifactsFoundException {
        for (String gitReference : inputReferences) {
            if (gitReference.contains(CGIT)) {
                tryToExtractCgitPlatformArtifacts(gitReference);
            } else if (gitReference.contains(GITHUB) || gitReference.contains(GITLAB)
                    || gitReference.contains(BITBUCKET)) {
                tryToExtractGitPlatformArtifacts(gitReference);
            } else {
                tryToExtractGenericGitURLArtifacts(gitReference);
            }
        }

        if (commitShaList.isEmpty() || repositoryLocation == null) {
            throw new NoGitArtifactsFoundException();
        }
    }

    protected void tryToExtractGitPlatformArtifacts(String gitReference) {
        String gitPlatformRegex = String.format(
                "(https?://(?:(?:(?:%s|%s)\\.%s)|%s\\.%s)/[\\w-~]+/[\\w-~]+)/%s?/(\\b[0-9a-f]{5,40}\\b)",
                GITHUB, GITLAB, TLD_COM, BITBUCKET, TLD_ORG, COMMIT_ARTIFACT_PLURAL);
        Pattern gitPlatformPattern = Pattern.compile(gitPlatformRegex);
        Matcher gitPlatformURLMatch = gitPlatformPattern.matcher(gitReference);

        if (gitPlatformURLMatch.find()) {
            log.info(String.format("Reference is git platform reference: %s", gitReference));

            if (gitPlatformURLMatch.groupCount() == 2) {
                repositoryLocation = gitPlatformURLMatch.group(1);
                commitShaList.add(gitPlatformURLMatch.group(2));
            }
        }
    }

Listing 1: The methods in the class GitURLArtifactExtractor responsible for extracting VCS artifact information from a reference URL.

Determining vulnerable methods

Once a reference to a specific commit has been obtained, we analyze the changes contained in the patches of that specific commit. As mentioned earlier (refer to Section 1.4), our assumption is that any method whose implementation has changed was a method that contained the vulnerable code. If we have a reference to a specific commit, we usually also know the (likely) clone URL of the repository containing the source code. Note that we say likely, because if we have a URL that looks like “https://github.com/netty/netty/commit/2fa9400a59d0563a66908aba55c41e7285a04994”, we know that the URL to clone the repository will be “https://github.com/netty/netty.git”. In the case of a GitHub, GitLab or Bitbucket URL, we can determine the clone URL with certainty, since the clone URL adheres to a predictable pattern. For other types of VCS URLs, such as URLs to custom cgit (https://git.zx2c4.com/cgit/about/) installations, this proves to be more difficult. In some cases, the clone URL has been customized and thus does not follow a predictable pattern. In those cases, we simply can not retrieve any patch information. In the cases in which we do have a clone URL, we clone the repository locally by using JGit (https://eclipse.org/jgit), a Java implementation of the Git VCS. We can programmatically acquire the contents of all Java files that were changed in the commit.
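For GitHub-style references, the clone-URL derivation described above is mechanical: strip the commit part and append “.git”. The following is an illustrative sketch only, not the AEVAS extractor (which uses the more elaborate pattern of Listing 1):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch (not the AEVAS code): derive the likely clone URL from a
// GitHub/GitLab/Bitbucket-style commit reference, as in the netty example above.
class CloneUrlDeriver {

    // host / owner / repository, followed by /commit/<sha> or /commits/<sha>
    private static final Pattern COMMIT_URL = Pattern.compile(
            "(https?://[^/]+/[^/]+/[^/]+)/commits?/[0-9a-f]{5,40}");

    /** Returns the probable clone URL, or null when the reference does not match. */
    static String deriveCloneUrl(String reference) {
        Matcher m = COMMIT_URL.matcher(reference);
        return m.find() ? m.group(1) + ".git" : null;
    }
}
```

Returning null for non-matching references mirrors the fact that customized clone URLs can not be derived reliably.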
In addition to the contents of the changed files at the commit itself, we also acquire their contents as of the previous commit (i.e. before they were changed). We then parse both files and compare the two revisions, representing the code before the commit and the code after the commit was applied. We compare the content of a method (i.e. the lines in its body) between the two revisions. If they are not equal, the method’s implementation has been changed in the commit, and we therefore assume this method to be vulnerable. One might ask why we implemented it this way instead of simply using the raw patch contents. The reason is that the approach we have chosen is easier to implement. If operating at the level of the patch itself, all lines marked with “+” and “-” signs would need to be extracted using regular expressions. Furthermore, we would need to extract the lines that did not change and combine these parts to obtain a file with the new state and a file with the old state. Such an implementation is much more difficult and prone to errors. Thus, we have opted for the current approach. Our implementation is given in Listing 2. For the sake of brevity we omit the implementation of the method calculateChangedMethodsBetweenFiles here; it involves comparing the lines of code in the body of the same method between two revisions.
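The essence of that omitted comparison can be sketched as follows. This simplified stand-in assumes the two file revisions have already been parsed into maps from method signature to body text (the parsing itself is elided); it is our illustration, not the thesis implementation:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified sketch of the comparison behind calculateChangedMethodsBetweenFiles.
// Parsing the two file revisions into (method signature -> body) maps is assumed
// to have happened already; only the comparison step is shown. Newly added
// methods are ignored, mirroring how files with change type ADD are skipped.
class MethodDiff {

    static Set<String> changedMethods(Map<String, String> oldBodies,
                                      Map<String, String> newBodies) {
        Set<String> changed = new HashSet<>();
        for (Map.Entry<String, String> entry : newBodies.entrySet()) {
            String oldBody = oldBodies.get(entry.getKey());
            // changed = existed in the old revision and the body text differs
            if (oldBody != null && !oldBody.equals(entry.getValue())) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }
}
```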
    protected void generateDiff(String commitSha) {
        try {
            List<DiffEntry> diffEntries = GitUtils.obtainDiffEntries(gitRepository, commitSha);
            processDiffEntries(diffEntries, commitSha);
        } catch (IOException exception) {
            log.error("Could not generate diff", exception);
        }
    }

    protected void processDiffEntries(List<DiffEntry> diffEntries, String commitSha) throws IOException {
        for (DiffEntry diffEntry : diffEntries) {
            boolean fileIsJavaFile = StringUtils.endsWith(diffEntry.getNewPath(), ".java");

            if (diffEntry.getChangeType() == DiffEntry.ChangeType.ADD || !fileIsJavaFile) {
                continue;
            }

            String rawFileContents = GitUtils.fetchFileContentsInCommit(
                    gitRepository.getRepository(), commitSha, diffEntry.getNewPath());

            ObjectId parentSha = GitUtils.parentCommitForCommit(commitSha, gitRepository);
            String rawFileContentsPreviousCommit = GitUtils.fetchFileContentsInCommit(
                    gitRepository.getRepository(), parentSha, diffEntry.getOldPath());

            calculateChangedMethodsBetweenFiles(rawFileContents, rawFileContentsPreviousCommit);
        }

        log.debug(String.format("Changed methods: %s", changedMethods));
    }

Listing 2: The methods in the class GitDiff responsible for determining which methods were changed in a commit.

4.3.2 Extracting dependency information

Before we can create a call graph, we need to obtain the JAR files of all libraries used. These JAR files contain Java bytecode. First, we extract the list of dependencies used, along with the specific versions used, including any transitive dependencies that may be present. In our implementation, we collect the required information by programmatically invoking Maven’s dependency:tree command. The extractDependencyTreeInformation method in the aptly named MavenDependencyExtractor class is responsible for this. The implementation is given in Listing 3.
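The filtering step that accompanies this extraction, filterDependenciesUsedFromRawOutput, is not shown in the thesis; conceptually it pulls Maven coordinates out of the raw dependency:tree output with a regular expression. A rough sketch over the standard output format (one `groupId:artifactId:packaging:version:scope` coordinate per tree node) could look like this; the class name and exact pattern are our own:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Rough sketch of filtering "mvn dependency:tree" output; the thesis omits the
// actual filterDependenciesUsedFromRawOutput implementation. Maven prints one
// coordinate per tree node, e.g. "[INFO] +- org.slf4j:slf4j-api:jar:1.7.25:compile".
class DependencyTreeParser {

    // groupId : artifactId : packaging : version : scope
    private static final Pattern COORDINATE = Pattern.compile(
            "([\\w.\\-]+):([\\w.\\-]+):[\\w\\-]+:([\\w.\\-]+):(?:compile|runtime|provided|test)");

    /** Returns "groupId:artifactId:version" for every coordinate found in the output. */
    static List<String> parse(String rawOutput) {
        List<String> dependencies = new ArrayList<>();
        Matcher m = COORDINATE.matcher(rawOutput);
        while (m.find()) {
            dependencies.add(m.group(1) + ":" + m.group(2) + ":" + m.group(3));
        }
        return dependencies;
    }
}
```

Matching on the scope suffix is what separates real coordinates from the surrounding debug output.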
The “--debug” flag is added to the command so that Maven still outputs a dependency tree even if a single dependency can not be resolved. A dependency can not be resolved when, for example, a proprietary dependency is listed that is not available in the Maven Central Repository. Adding the flag ensures that unrelated or partial failures will not lead to no information being extracted at all. The filterDependenciesUsedFromRawOutput method (not shown here) uses regular expressions to filter the relevant output, since the “--debug” flag also causes a lot of information to be output that we do not care about.

    protected void extractDependencyTreeInformation(String pomFilePath) {
        currentPomFile = pomFilePath;
        MavenInvocationRequest request = new MavenInvocationRequest(currentPomFile);

        // we use the debug flag to continue outputting the tree even if a single dependency can not be resolved
        String command = String.format("dependency:tree --debug -Dmaven.repo.local=%s", MVN_REPO_PATH);
        request.addGoal(command);

        log.info(String.format("Invoking mvn %s for pom file %s", command, pomFilePath));
        String output = request.invoke();
        filterDependenciesUsedFromRawOutput(output);
    }

Listing 3: The method in the class MavenDependencyExtractor that extracts information from the dependency tree.

4.3.3 Creating a call graph

The next step in our analysis involves determining which methods in the vulnerable dependencies are called by a given system, either directly or indirectly. For example, method E in class A of the system may call method F of class B contained in a library. In turn, method F in class B may call method G of class C in the same library. Therefore, there is a path from method E to method G. To determine these relations programmatically, we use the WALA call graph library (http://wala.sourceforge.net/wiki/index.php/Main_Page), originally developed by IBM. The call graph library can use JAR (Java Archive) files containing bytecode to conduct its analysis. The resulting information provides insight into which methods of the libraries are called by the system under investigation.

Figure 4.5: A graphical depiction of how we determine whether vulnerable library methods are invoked.

Using raw source code as input

Source code of clients’ projects is uploaded to SIG frequently. SIG does not require the presence of executable binaries in the upload. Ordinarily, static analysis is used at SIG to analyze all source code; SIG never executes client code to perform any form of analysis. However, open-source call graph libraries rarely support creating a call graph from source code alone. The call graph library we use can only analyze source code after it has been translated to bytecode by a Java front-end. Conveniently, this kind of tooling is provided by the maintainers of the call graph library we use. Since we also want to trace calls occurring in third-party libraries, executable files (e.g. JAR files) containing the libraries are needed. We can obtain these from the Maven repository. This does not work for proprietary libraries, since these are not publicly available.
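The “directly or indirectly” invocation described at the start of Section 4.3.3 is, at its core, a reachability question on the call graph. Independent of WALA's actual API, the idea can be sketched with a plain adjacency map and a breadth-first traversal (the string method identifiers are illustrative):

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of the reachability idea behind the call graph step: starting from the
// system's own methods, every method reachable over call edges counts as invoked,
// covering indirect calls such as E -> F -> G. This is a plain breadth-first
// traversal over an adjacency map, not WALA's actual API.
class CallReachability {

    static Set<String> reachableFrom(Map<String, Set<String>> callGraph,
                                     Set<String> entryMethods) {
        Set<String> visited = new HashSet<>(entryMethods);
        Deque<String> worklist = new ArrayDeque<>(entryMethods);
        while (!worklist.isEmpty()) {
            String caller = worklist.poll();
            for (String callee : callGraph.getOrDefault(caller, Collections.emptySet())) {
                if (visited.add(callee)) { // add() returns false when already visited
                    worklist.add(callee);
                }
            }
        }
        return visited;
    }
}
```

The set of reachable library methods is what gets intersected with the vulnerable methods in the final step.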
For this reason, we exclude these libraries from the call graph analysis. In addition, there is often very little vulnerability data on proprietary libraries, so it does not matter much that we do not include them. Thus, in the context of SIG such a tool should be able to process raw source code to be considered useful. It is realistic to assume that this is also useful in other contexts, for instance if executable binaries are not available for