Extracting Authoring Information Based on Keywords andSemant.docx

Extracting Authoring Information Based on Keywords and
Semantic Search
Faisal Alkhateeb, Amal
Alzubi, Iyad Abu Doush
Computer Sciences
Department
Yarmouk University, Irbid,
Jordan
{alkhateebf,iyad.doush}@yu.edu.jo
Shadi Aljawarneh
Faculty of Science and
Information Technology
Al-Isra University, Amman,
Jordan
[email protected]
Eslam Al Maghayreh
Computer Sciences
Department
Yarmouk University, Irbid,
Jordan
[email protected]
ABSTRACT
Many people, in particular researchers, are interested in

searching and retrieving authoring information from online
authoring databases to be cited in their research projects.
In this paper, we propose a novel approach for retrieving
authoring information that combines keyword and semantic-
based approaches. In this approach, the user is interested
only in retrieving authoring information considering some
specified keywords and ignore how the internal semantic
search is being processed. Additionally, this approach ex-
ploits the semantics and relationships between different re-
sources for a better knowledge-based inference.
Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: Search pro-
cess
Keywords
Semantic web, RDF, SPARQL, Authoring Information, Key-
word Search, Semantic Search
1. INTRODUCTION
The world wide web (or simply the web) has become the
first source of knowledge for all life domains. It can be seen
as an extensive information system that allows exchanging
the resources as well as documents. The semantic web is an
evolving extension of the web aiming at giving well defined
forms and semantics to the web resources (e.g., content of
an HTML web page) [4].
Due to the growth of the semantic web, semantic search
became an attracting approach. The term refers to meth-
ods of searching web documents beyond the syntactic level
of matching keywords. Exposing metadata is an essential
point for a semantic search approach associated with the
semantic web. The most important recent development is

Permission to make digital or hard copies of all or part of this
work for
personal or classroom use is granted without fee provided that
copies are
not made or distributed for profit or commercial advantage and
that copies
bear this notice and the full citation on the first page. To copy
otherwise, to
republish, to post on servers or to redistribute to lists, requires
prior specific
permission and/or a fee.
ISWSA’10, June 14–16, 2010, Amman, Jordan.
Copyright 2010 ACM 978-1-4503-0475-7/0 /2010 ...$10.00.
in the area of embedding metadata directly into web doc-
uments. RDF (Resource Description Framework) [15] is a
knowledge representation language dedicated to the annota-
tion of resources within the Semantic web. Currently, many
documents are annotated via RDF due to its simple data
model and its formal semantics. For example, it is embed-
ded in (X)HTML web pages using the RDFa language [1],
in SMIL documents [7] using RDF/XML [3], etc. SPARQL
[17] is a W3C recommendation language developed in order
to query RDF knowledge bases, e.g., retrieving nodes from
RDF graphs.
Another approach that is found in search engines is based
on using keywords. More precisely, both queries and docu-
ments are typically treated at a word or gram level (simi-
lar to Information Retrieval). The search engine is missing
a semantic-level understanding of the query and can only
understand the content of a document by picking out docu-
ments with the most commonly occurring keywords.
The objective of this paper is to provide a novel approach
for retrieving authoring information that combines keyword-

based and semantic-based approaches. In this approach,
the user is interested only in retrieving authoring informa-
tion considering some specified keywords and ignores how
the internal semantic search is being processed. In particu-
lar, the user is interested in searching authoring information
from online authoring information portals (such as DBLP1,
ACM2, IEEE3, etc). For instance, show me all documents
of the author ”faisal alkhateeb” or the author ”jerome eu-
zenat” with a title containing ”SPARQL”. In the proposed
approach, keywords are used for collecting authoring infor-
mation about the authors, which are then filtered with se-
mantic search (using RDF and SPARQL) based on the se-
mantic relations of the query.
The remainder of the paper is organized as follows: we
introduce the research background in Section 2. The com-
bined approach is presented in Section 3 as well as a testcase
illustrating the proposed approach. A review of related work
is discussed in Section 4. Discussion issues drawn from this
study are presented in Section 5.
2. RESEARCH BACKGROUND
This section provides an overview of the elements that
are necessary for presenting the proposed approach namely:
1
http://www.informatik.uni-trier.de/~ley/db/
2
http://portal.acm.org/portal.cfm
3
http://www.ieee.org/portal/site6

BibTeX, RDF, and SPARQL.
2.1 BibTeX
BibTeX4[16, 10] is a tool and a file format which are used
to describe and process lists of references, mostly in conjunc-
tion with LaTeX documents. BibTeX makes it easy to cite
sources in a consistent manner, by separating bibliographic
information from the presentation of this information. Bib-
TeX uses a style-independent text-based file format for lists
of bibliography items, such as articles, books, thesis. Each
bibliography entry contains some subset of standard data
entries: author, booktitle, number, organization, pages,
title, type, volume, year, institution, and others. Bib-
liography entries included in a .bib file are split by types.
The following types are understood by virtually all BibTeX
styles: article, book, booklet, conference, inproceed-
ings, phdthesis, etc.
Example 1. The following is an instance of a BibTeX
element:
@article{DBLP:AlkhateebBE09,
author = {Faisal Alkhateeb and Jean-Francois
Baget and Jerome Euzenat},
title = {Extending SPARQL with regular exp-
ression patterns (for querying RDF)
},
journal = {J. web Sem.},

volume = {7},
number = {2},
year = {2009},
pages = {57-73},
}
2.2 RDF
RDF is a language for describing resources. In its abstract
syntax, an RDF document is a set of triples of the form
<subject, predicate, object>.
Example 2. The assertion of the following RDF triples:
{ <ex:person1 foaf:name "Faisal Alkhateeb">,
<ex:document1 BibTeX:author ex:person1>,
<ex:document1 rdf:type BibTeX:inproceedings>,
<ex:document1 BibTeX:title "PSPARQL">,
<ex:person1 foaf:knows ex:person2>,
<ex:person2 foaf:name "Jerome Euzenat">,
<ex:document1 BibTeX:author ex:person2>,
}
means that there exists an inproceedings document, which
is coauthored by two persons named ”Faisal Alkhateeb” and
”Jerome Euzenat”, whose title is ”PSPARQL”.
An RDF document can be represented by a directed la-
beled graph, as shown in Figure 1, where the set of nodes is
the set of terms appearing as a subject or object in a triple

and the set of arcs is the set of predicates (i.e., if <s, p, o>
is a triple, then s
p
−→ o).
2.3 SPARQL
SPARQL is the query language developed by the W3C for
querying RDF graphs. A simple SPARQL query is expressed
using a form resembling the SQL SELECT query:
4
http://www.bibtex.org/
ex:person1 ex:person2
ex:document1
BibTeX:inproceedings
”Faisal Alkhateeb” ”Jerome Euzenat”
”PSPARQL”
foaf:knows
foaf:name foaf:name
BibTeX:author BibTeX:author
rdf:typeBibTeX:title
Figure 1: An RDF graph.
SELECT ~B FROM u WHERE P

where u is the URL of an RDF graph G to be queried, P is
a SPARQL graph pattern (i.e., a pattern constructed over
RDF graphs with variables) and ~B is a tuple of variables
appearing in P . Intuitively, an answer to a SPARQL query
is an instantiation of the variables of ~B by the terms of the
RDF graph G such that the substitution of the values to the
variables of P yields to a subset of the graph G.5
Example 3. Consider the RDF graph of Figure 1 repre-
senting some possible authoring information. For instance
the existence of the following triples {〈ex:document1, rdf:type,
BibTeX:inproceedings〉, 〈ex:document1, BibTeX:title,
"PSPARQL"〉}
asserts that there exists an inproceedings document whose ti-
tle is ”PSPARQL”.
The following SPARQL query modeling this information:
SELECT *
FROM <Figure1>
WHERE {
?document BibTeX:author ?author .
?document BibTeX:title "PSPARQL" .
?author foaf:name ?name .
}
could be used, when evaluated against the RDF graph of Fig-
ure 1, to return the following answers:
# ?document ?author ?name
1 ex:document1 ex:person1 ”F aisalAlkhateeb”
2 ex:document1 ex:person2 ”JeromeEuzenat”

In RDF there exists a set of reserved words (called RDF
Schema or simply RDFS [6]), designed to describe the re-
lationships between resources and properties, e.g., classA
subClassOf classB. It adds additional constraints to the re-
sources associated to the RDFS terms, and thus permitting
more consequences (reasoning).
Example 4. Using the RDF graph presented in Figure 1,
we can deduce the following triple <ex:document1 rdf:type
BibTeX:proceedings> from the following triples <ex:document1
rdf:type BibTeX:inproceedings> and <BibTeX:inproceedings
rdfs:subClassOf BibTeX:publications>. Hence, the following
SPARQL query :
SELECT *
FROM <Figure1>
WHERE {
?document rdf:type BibTeX:publications .
5When using RDFS semantics [6], this intuitive definition
is irrelevant and one could apply RDFS reasoning rules to
calculate answers over RDFS documents.
}
returns the same set of answers described in Example 1 be-
cause a inproceedings is a subclass of publications.
SPARQL provides several result forms other than SE-

LECT that can be used for formating the query results. For
example, a CONSTRUCT query can be used for building
an RDF graph from the set of answers to the query. More
precisely, an RDF graph pattern (i.e., an RDF involving
variables) is specified in the CONSTRUCT clause that will
be constructed. For each answer to the query, the variable
values are substituted in the RDF graph pattern and the
merge of the resulting RDF graphs is computed.6 This fea-
ture can be viewed as rules over RDF permitting to build
new relations from the linked data.
Example 5. The following CONSTRUCT query:
CONSTRUCT {?author BibTeX:coauthorof ?document .}
FROM <Figure1>
WHERE {
}
constructs the RDF graph (containing the coauthor relation)
by substituting for each located answer the values of the vari-
ables ?author and ?document to have the following graph (as
done for SPARQL, we encode the resulting graph in the Tur-
tle language7):
@prefix ex: <http://ex.org/> .
ex:person1 BibTeX:coauthorof ex:document1 .
ex:person2 BibTeX:coauthorof ex:document1 .
3. METHODOLOGY
The Extracting Authoring Information system which we

have implemented is used to achieve the following:
Given: - A user query in the form of textual keywords.
Find: - A set of BibTeX elements that are relevant to the
query.
The proposed methodology consists of the following ma-
jor phases: connecting to Google search engine, connecting
to DBLP page and extracting BibTeX elements, convert-
ing BibTeX to RDF and keywords to SPARQL query, and
then evaluate the SPARQL query against the RDF docu-
ment. The first two phases deal with extracting author in-
formation based on keyword search, while the third and the
fourth represent the semantic search. In the following, we
present the basic work flow of the system as well as its main
components.
3.1 System Work Flow
As shown in Figure 2, the system works as follows: the
user firstly enters the keywords to be searched such as key-
words from author name, title of the paper title, year of
publication, etc. Then, uses goolge search engine to cor-
rect misspelled entered keywords (in particular, names of
the authors) as well as finding the pages for the corrected
6A definition of RDF merge operation can be found at http:
//www.w3.org/TR/2001/WD-rdf-mt-20010925/#merging.
7
http://www.dajobe.org/2004/01/turtle/.
entered keywords (for instance, DBLP pages of the author).
After that, BibTeX elements will be extracted and these
BibTeX elements will be converted to RDF document. The
corrected keywords will be transformed to a SPARQL query

to be used for querying the RDF document corresponding
to the extracted BibTeX elements.
Figure 2: The Basic Flow of the System.
3.2 System Components
The following are the main components of the system:
• Google Search: after entering the keywords in the
corresponding positions, they will be passed to a com-
ponent that connects to Google engine. That is, the
magic URL ”http://www.google.com/search?hl=ar&q=”
+”searchParameters” of Google search engine will be
used to search for the specified keywords. To this end,
there could be two cases returned from this search ei-
ther:
– correct author name; or
– misspelled author name.
In the second, the new search path ”did you mean
structure” will be used to reconnect to the Google search
engine. This process is repeated until finding the corre-
sponding author page in the specified authoring database
(DBLP, ACM, IEEE, etc).
• BibTeX extractor: this component is responsible for
extracting the BibTeX elements and save them in a file
for later usage. it should be noticed that this compo-
nent contains several methods, each of them is specific
to a bibliography database. This is due to the fact
that each bibliography database has its own style to
include BibTeX elements in the authoring web pages.
Therefore, we suggest to include BibTeX elements in
web pages as an RDFa annotations8.

8
http://www.w3.org/TR/xhtml-rdfa-primer/
Figure 3: The user interface of the system as well as the found
results.
• BibTeX parser: BibTeX elements are then converted
to RDF documents using results from the BibTeX parser
that we have implemented in the system. Note that if
the RDFa is used to annotate BibTeX elements, then
there is no need for this parser. In this case, the on-
line RDF distiller9 could be used to extract RDF doc-
uments corresponding to the annotated BibTeX ele-
ments from web pages. In addition to the RDF triples
that correspond to the BibTeX entries, RDF triples
corresponding to RDFS relationships (such as <BibTeX:in-
proceedings rdfs:subClassOf BibTeX:proceedings> and
<BibTeX:booklet rdfs:subClassOf BibTeX:book>) are added
to the RDF document to allow reasoning more results.
• Keywords to SPARQL query: the entered key-
words are also used to build a SPARQL query auto-
matically. The query will be then used to filter the
results obtained in search based on keywords. More
precisely, when entering keywords, the user selects the
type of the data entry to be entered such as ”Title”,
”Author”, ”Publication”, ”Pages”, and so on. Note
that, the user can enter multiple authors. If the key-
word begins with underscore ” ”, this means that the
entered keyword is part of the BibTeX data entry. In
this case, the ”regex” function can be used in the Filter
constraint to build the SPARQL query. Otherwise, it
is considered to be an exact search for such keyword.

Moreover, the user can specify the relationship be-
tween the entered keywords (i.e., ”or” or ”and”). When
building the SPARQL query, these relationships corre-
9
http://www.w3.org/2007/08/pyRDFa/
spond to the ”UNION” and ”AND” SPARQL query
graph patterns.
• Query evaluator: this component is used to evaluate
the SPARQL query (i.e., the query obtained from the
entered keywords) against the RDF document (i.e., the
RDF document obtained from the file containing the
BibTeX elements) to find and construct the precise
results. Any query evaluator could be used at this
stage10, but we have used jena11.
It should be noticed that DBLP provides the capability of
searching by allowing users to pose keyword-based queries
over only its bibliography dataset. For instance, one can
pose the query ”alkhateeb|jerome euzenat” that searches for
documents matching the keyword ”alkhateeb” or ”jerome eu-
zenat”. The search process in DBLP offers good features
such as a search is triggered after each keystroke with in-
stant times if the network connections is not lame and case-
insensitive search [2]. However, a misspelled keyword such
as ”alkhateb” has no hits while ”alkhateeb” returns five doc-
uments. Additionally, the semantic relations are neither
fully preserved nor well defined. In particular, one can pose
the query ”alkhateeb|euzenat” providing 79 documents while
putting a space after the pipe ”alkhateeb| euzenat” provides
only 2 documents. The semantic reasoning is not provided
(see Example 4). We avoided these limitations in the pro-
posed methodology.

3.3 Test Case
10
http://esw.w3.org/topic/SparqlImplementations
11
http://jena.sourceforge.net/
Suppose that the user had entered ”faisal alkhateb” as an
author, ”jerome euzenat” as an another author, and ” sparql”
as a title in the interface shown in Figure 3 and selected
DBLP as a search database as well ”or” and ”and” connec-
tions between the authors and the title keywords. Then the
query equation will be as: ((Author or Author)) and Title)
= ((faisal alkhateeb or jerome euzenat) and sparql).
The search will be done in Google to check if the author
name exists in DBLP or not. In this testcase, the Google
engine corrects the misspelled author name ”faisal alkhateb”
and uses ”faisal alkhateeb” instead to connect to the DBLP
with the correct name. Then the BibTeX elements cor-
responding to the keywords ”faisal alkhateb”, ”jerome eu-
zenat”, and ”sparql” are extracted from DBLP:
@article{DBLP:AlkhateebBE09,
author = {Faisal Alkhateeb and Jean-Francois
Baget and Jerome Euzenat},
title = {Extending SPARQL with regular expre-
ssion patterns (for querying RDF)},
journal = {J. web Sem.},
volume = {7},
number = {2},
year = {2009},

pages = {57-73},}
...
The BibTeX elements will be then converted to an RDF
document such as the one in Example 2. Also, the cor-
rected entered keywords will be used to build the following
SPARQL query used to filter the results:
CONSTRUCT{?doc BibTeX:author "Faisal Alkhateeb"
?doc BibTeX:author "Jerome Euzenat"
...
}
FROM<RDF document corresponds to BibTeX>
WHERE{{{?doc BibTeX:author "Faisal Alkhateeb".
?doc BibTeX:title ?title.
?doc BibTeX:year ?year.
?doc BibTeX:pages ?pages. }
Union { ?doc BibTeX:author "Jerome Euzenat".
?doc BibTeX:title ?title.
?doc BibTeX:year ?year.
?doc BibTeX:pages ?pages. }}
{ ?doc BibTeX:title ?title }
FILTER(regex(?title, "^sparql"))}
}
Note that the keyword ” sparql” begins with underscore
” ” and so it is considered to be part of the title while other
keywords such ”‘faisal alkhateeb” do not and considered to
be the full author names. Note that the user can specify a
range for the publishing years. For instance, show me the
authoring information between ”2004” and ”2008”. In this

case, she/he must can enter ”2004-2008” in the year field,
which in tern converted to the following part of a SPARQL
query:
?document BibTeX:hasyear ?year .
FILTER ((?year >=2004) && (?year <= 2008))
4. RELATED WORK
The literature on combining the keyword search with the
semantic search is rich; in this section we provide a brief
overview of some relevant proposals.
Semantic web languages (i.e., RDF and OWL) can be used
for knowledge encoding and can be used by services, tools,
and applications [11]. The semantic web will not enable
only human to process web contents, but also machines will
be able to process web contents. This can help in creating
intelligent services, customized web, and have more powerful
search engines [9].
Traditional search engines use keywords as their search
basis. Semantic search applies semantic processing on key-
words for a better retrieval search. Hybrid search utilizes the
keyword search from regular search along with the ability to
use semantic search to query and reason using Meta data.
Using ontologies the search engines can find pages that have
different syntax, but similar semantics [9].
The hybrid search provided users with more capabilities
for searching and reasoning to get better results. According
to Bhagdev et al. [5] there are three types of queries that
are possible using hybrid search:
• Semantic search using the defined Meta data and the

relations between instances.
• Regular search using keywords.
• Search for keywords within specific contents.
Kiryakov et al. [14] proposed a system in which the user
can select between keyword based search or ontology based
search, but s/he cannot merge them to obtain search results
using the two approaches together.
Another work by Bhagdev et al. [5] introduced a search
method that combines ontology and keyword search based
methods. The research results shows that the use of hybrid
search gives a better performance over keyword search or
semantic search in real world cases.
Rocha et al. [18] combined ontology based information
retrieval with regular search in a semantic search technique.
They used spread activation algorithm to get activation value
of the relevance of search results with keywords. The links
in the ontology are given weights according to certain prop-
erties. The proposed method do not identify promptly the
unique concepts and relations.
In another work Gilardoni et al. [12] provided integration
of keyword based search with ontology search, but with no
capability for Boolean queries.
Hybrid search is implemented by some large companies in
the industry. Google Product Search12is a semantic search
service from Google which searches for products by linking
between different attributes in the knowledge base to re-
trieve a product. Sheth et al. [19] use keyword query to
apply multi-domain search by automatically classifying and
extracting information along with ontology and meta data
information.

Guha et al. [13] used a semantic search that uses an ap-
proach which combines traditional search and other data
from distributed sources to answer the user query in more
details. In the work of Davies et al. [8] QuizRDF is in-
troduced. A system that combines the traditional search
method with the ability to query and navigate RDF. The
system shortcoming when there is a chaining in the query.
5. DISCUSSION
We have presented in this paper, an approach for search-
ing and extracting authoring information. The approach is
based on keyword and semantic search approaches. In the
keyword search part, the entered keywords are used to col-
lect authoring information. In this part, the Google search
12
http://www.google.com/products
engine is used to correct the misspelled keywords, in particu-
lar the author’s name, which allows to give more results. ad-
ditionally, ad-hoc routines are used to extract bibliography
elements from online databases. So, we suggest to include
BibTeX elements in web pages as an RDFa annotations so
that standard methods can be exploited. In the semantic
part, the SPARQL query obtained from entered keywords is
queried against the metadata corresponding to the author-
ing information, which allows to give more precise results.
6. REFERENCES
[1] Adida, B., and Birbeck, M. RDFa primer - bridging
the human and data webs. Working draft, W3C, 2008.
http://www.w3.org/TR/xhtml-rdfa-primer/.

[2] Bast, H., Mortensen, C. W., and Weber, I.
Output-sensitive autocompletion search. Inf. Retr. 11,
4 (2008), 269–286.
[3] Beckett, D., and McBride, B. RDF/XML syntax
specification (revised). Recommendation, W3C, 2004.
http://www.w3.org/TR/rdf-syntax-grammar/.
[4] Berners-Lee, T., Hendler, J., and Lassila, O.
The semantic web, 2001.
http://www.sciam.com/article.cfm?articleID=
00048144-10D2-1C70-84A9809EC588EF21.
[5] Bhagdev, R., Chapman, S., Ciravegna, F.,
Lanfranchi, V., and Petrelli, D. Hybrid search:
Effectively combining keywords and semantic searches.
In ESWC (2008), pp. 554–568.
[6] Brickley, D., and Guha, R. RDF vocabulary
description language 1.0: RDF schema.
Recommendation, W3C, 2004.
http://www.w3.org/TR/rdf-schema/.
[7] Bulterman, D., Grassel, G., Jansen, J.,
Koivisto, A., Layäıda, N., Michel, T.,
Mullender, S., and Zucker, D. Synchronized
Multimedia Integration Language (SMIL 2.1).
http://www.w3.org/TR/SMIL/.
[8] Davies, J., and Weeks, R. Quizrdf: Search
technology for the semantic web. In HICSS ’04:
Proceedings of the Proceedings of the 37th Annual
Hawaii International Conference on System Sciences

(HICSS’04) - Track 4 (Washington, DC, USA, 2004),
IEEE Computer Society, p. 40112.
[9] Decker, S., Melnik, S., van Harmelen, F.,
Fensel, D., Klein, M., Broekstra, J., Erdmann,
M., and Horrocks, I. The semantic web: the roles
of XML and RDF. 63–73.
[10] Fenn, J. Managing citations and your bibliography
with BibTeX. The PracTeX Journal 4 (2006).
http://www.tug.org/pracjourn/2006-4/fenn/.
[11] Finin, T., and Ding, L. Search Engines for Semantic
Web Knowledge. In Proceedings of XTech 2006:
Building Web 2.0 (May 2006).
[12] Gilardoni, L., Biasuzzi, C., Ferraro, M., Fonti,
R., and Slavazza, P. Lkms - a legal knowledge
management system exploiting semantic web
technologies. In International Semantic Web
Conference (2005), Y. Gil, E. Motta, V. R. Benjamins,
and M. A. Musen, Eds., vol. 3729 of Lecture Notes in
Computer Science, Springer, pp. 872–886.
[13] Guha, R., McCool, R., and Miller, E. Semantic
search. In WWW ’03: Proceedings of the 12th
international conference on World Wide Web (New
York, NY, USA, 2003), ACM, pp. 700–709.
[14] Kiryakov, A., Popov, B., Terziev, I., Manov, D.,
and Ognyanoff, D. Semantic annotation, indexing,
and retrieval. Web Semantics: Science, Services and
Agents on the World Wide Web 2, 1 (2004), 49 – 79.
[15] Manola, F., and Miller, E. RDF primer.

http://www.w3.org/TR/rdf-primer/.
[16] Patashnik, O. Bibtexing, 1988.
http://ftp.ntua.gr/mirror/ctan/biblio/bibtex/
contrib/doc/btxdoc.pdf.
[17] Prud’hommeaux, E., and Seaborne, A. SPARQL
query language for RDF. Recommendation, W3C,
January 2008.
http://www.w3.org/TR/rdf-sparql-query/.
[18] Rocha, C., Schwabe, D., and Aragao, M. P. A
hybrid approach for searching in the semantic web. In
WWW ’04: Proceedings of the 13th international
conference on World Wide Web (New York, NY, USA,
2004), ACM, pp. 374–383.
[19] Sheth, A., Bertram, C., Avant, D., Hammond,
B., Kochut, K., and Warke, Y. Managing semantic
content for the web. IEEE Internet Computing 6, 4
(2002), 80–87.
contributed articles
m a r c h 2 0 1 0 | v o l . 5 3 | n o . 3 | c o m m u n i c
at i o n s o f t h e a c m 121
d o i : 1 0 . 1 1 4 5 / 1 6 6 6 4 2 0 . 1 6 6 6 4 5 2
by fabio arduini and Vincenzo morabito
S i n c e t h e S e p t e m b e r 1 1 t h a t ta c k S on the

World
Trade Center,8 tsunami disaster, and hurricane
Katrina, there has been renewed interest in emergency
planning in both the private and public sectors. In
particular, as managers realize the size of potential
exposure to unmanaged risk, insuring “business
continuity” (BC) is becoming a key task within all
industrial and financial sectors (Figure 1).
Aside from terrorism and natural disasters, two
main reasons for developing the BC approach in the
finance sector have been identified as unique to it:
regulations and business specificities.
Regulatory norms are key factors for all financial
sectors in every country. Every organization is required
to comply with federal/national law in addition to
national and international governing bodies. Referring
to business decisions, more and more organizations
recognize that Business Continuity could be and
should be strategic for the good of the business. The
finance sector is, as a matter of fact, a sector in which
the development of information technology (IT) and
information systems (IS) have had a dramatic effect
upon competitiveness. In this sector, organizations
have become dependent upon tech-
nologies that they do not fully compre-
hend. In fact, banking industry IT and
IS are considered production not sup-
port technologies. As such, IT and IS
have supported massive changes in the
ways in which business is conducted
with consumers at the retail level. In-
novations in direct banking would have
been unthinkable without appropriate

IS. As a consequence business continu-
ity planning at banks is essential as the
industry develops in order to safeguard
consumers and to comply with interna-
tional regulatory norms. Furthermore,
in the banking industry, BC planning
is important and at the same time dif-
ferent from other industries, for three
other specific reasons as highlighted
by the Bank of Japan in 2003:
Maintaining the economic activity of ˲
residents in disaster areas2 by enabling
the continuation of financial services
during and after disasters, thereby sus-
taining business activities in the dam-
aged area;
Preventing widespread payment and ˲
settlement disorder2 or preventing sys-
temic risks, by bounding the inability
of financial institutions in a disaster
area to execute payment transactions;
Reduce managerial risks ˲ 2 for example,
by limiting the difficulties for banks
to take profit opportunities and lower
their customer reputation.
Business specificities, rather than
regulatory considerations, should be
the primary drivers of all processes.
Even if European (EU) and US markets
differ, BC is closing the gap. Progres-
sive EU market consolidation neces-
sitates common rules and is forcing

major institutions to share common
knowledge both on organizational and
technological issues.
The financial sector sees business
continuity not only as a technical or
risk management issue, but as a driver
towards any discussion on mergers
and acquisitions; the ability to manage
BC should also be considered a strate-
gic weapon to reduce the acquisition
timeframe and shorten the data center
business
continuity and
the banking
industry
122 c o m m u n i c at i o n s o f t h e a c m | m a r c h 2
0 1 0 | v o l . 5 3 | n o . 3
differences in preparing and imple-
menting strategies that enhance busi-
ness process security. Two approaches
seem to be prevalent. Firstly, there are
those disaster recovery (DR) strate-
gies that are internally and hardware-
focused9 and secondly, there are those
strategies that treat the issues of IT and
IS security within a wider internal-ex-
ternal, hardware-software framework.
The latter deals with IS as an integrat-

ing business function rather than as a
stand-alone operation. We have labeled
this second type of business continuity
approach (BCA).
As a consequence, we define BCA as
a framework of disciplines, processes,
and techniques aiming to provide
continuous operation for “essential
business functions” under all circum-
stances.
More specifically, business continu-
ity planning (BCP) can be defined as “a
collection of procedures and informa-
tion” that have been “developed, com-
piled and maintained” and are “ready
to use - in the event of an emergency
or disaster.”6 BCP has been addressed
by different contributions to the litera-
ture. Noteworthy studies include Julia
Allen’s contribution on Cert’s Octave
methoda1 the activities of the Business
Continuity Institute (BCI) in defining
certification standards and practice
guidelines, the EDS white paper on
Business Continuity Management4 and
merge, often considered one of the top
issues in quick wins and information
and communication technology (ICT)
budget savings.
business continuity concepts
The evolution of IT and IS have chal-
lenged the traditional ways of conduct-

ing business within the finance sector.
These changes have largely represented
improvements to business processes
and efficiency but are not without their
flaws, in as much as business disrup-
tion can occur due to IT and IS sources.
The greater complexity of new IT and IS
operating environments requires that
organizations continually reassess how
best they may keep abreast of changes
and exploit those for organizational ad-
vantage. In particular, this paper seeks
to investigate how companies in the fi-
nancial sector understand and manage
their business continuity problems.
BC has become one of the most im-
portant issues in the banking industry.
Furthermore, there still appears to be
some discrepancy as to the formal defi-
nitions of what precisely constitutes a
disaster and there are difficulties in as-
sessing the size of claims in the crises
and disaster areas.
One definition of what constitutes
a disaster is an incident that leads to
the formal invocation of contingency/
continuity plans or any incident which
leads to a loss of revenue; in other
words it is any accidental, natural or
malicious event which threatens or dis-
rupts normal operations or services, for
as long a time as to significantly cause
the failure of the enterprise. It follows
then that when referring to the size of

claims in the area of organizational cri-
ses and disasters, the degree to which
a company has been affected by such
interruptions is the defining factor.
The definition of these concepts is
important because 80% of those orga-
nizations which face a significant crisis
without either a contingency/recovery
or a business continuity plan, fail to
survive a further year (Business Con-
tinuity Institute estimate). Moreover,
the BCI believes that only a small num-
ber of organizations have disaster and
recovery plans and, of those, few have
been renewed to reflect the changing
nature of the organization.
In observing Italian banking indus-
try practices, there seems to be major
finally, referring to banking, Business
Continuity Planning at Financial Insti-
tutions by the Bank of Japan.2 This last
study illustrates the process and activi-
ties for successful business continuity
planning in three steps:
1. Formulating a framework for robust
project management, where banks
should:
a. develop basic policy and guidelines
for BC planning (basic policy);
b. Develop a study firm-wide aspects

(firm-wide control section);
c. Implement appropriate progress
control (project management pro-
cedures)
2. Identifying assumptions and condi-
tions for business continuity plan-
ning, where banks should:
a. Recognize and identify the poten-
tial threats, analyze the frequency
of potential threats and identify
the specific scenarios with mate-
rial risk (Disaster scenarios);
b. Focus on continuing prioritized
critical operations (Critical opera-
tions);
c. Target times for the resumption of
operations (Recovery time objec-
tives);
3. Introducing action plans, where
banks should:
a. Study specific measures for busi-
ness continuity planning (BC
measures);
b. acquire and maintain back-up
data (Robust back-up data);
c. Determine the managerial re-
sources and infrastructure avail-

ability capacity required (Procure-
ment of managerial resources);
figure 1. 2004 top business priorities in industrial and financial
sectors (source Gartner)
a The Operationally Critical Threat, Asset, and Vulnerability
Evaluation Method of CERT. CERT is a center of Internet
security expertise, located at the Software Engineering
Institute, a federally funded research and development
center operated by Carnegie Mellon University.
d. Determine strong time con-
straints, a contact list and a means
of communication on emergency
decisions (Decision-making pro-
cedures and communication ar-
rangements);
e. Realize practical operational pro-
cedures for each department and
level (Practical manual)
4. Implement a test/training program
on a regular basis (Testing and re-
viewing).
business continuity aspects
The business continuity approach has

three fundamental aspects that can be
viewed in a systemic way: technology,
people and process.
Firstly, technology refers to the re-
covery of mission-critical data and
applications contained in the disas-
ter recovery plan (DRP). It establishes
technical and organizational measures
in order to face events or incidents with
potentially huge impact that in a worst
case scenario could lead to the unavail-
ability of data centers. Its development
ought to ensure IT emergency proce-
dures intervene and protect the data in
question at company facilities. In the
past, this was, whenever it even existed,
the only part of the BCP.
Secondly, people refers to the recov-
ery of the employees and physical work-
space. In particular, BCP teams should
be drawn from a variety of company
departments including those from per-
sonnel, marketing and internal consul-
tants. Also the managers of these teams
should possess general skill and they
should be partially drawn from busi-
ness areas other than IT departments.
Nowadays this is perceived as essential
to real survival with more emphasis on
human assets and value rather than on
those hardware and software resources
that in most cases are probably protect-
ed by backup systems.

Finally, the term process here refers
to the development of a strategy for the
deployment, testing and maintenance
of the plan. All BCP should be regularly
updated and modified in order to take
into consideration the latest kinds of
threats, both physical as well as tech-
nological.
Whereas a simple DR approach aims
at salvaging those facilities that are sal-
vageable, a BCP approach should have
different foci. One of these ought to be
treating IT and IS security with a wider
internal-external, hardware-software
framework where all processes are nei-
ther in-house nor subcontracted-out
but are a mix of the two so as to be an
integrating business function rather
than a stand alone operation. From
this point of view the BCP constitutes
a dual approach where management
and technology function together.
In addition, the BCP as a global ap-
proach must also consider all existing
relationships, thus giving value to cli-
ents and suppliers considering the to-
tal value chain for business and to pro-
tect business both in-house and out.
The BCP proper incorporates the di-
saster recovery (DR) approach but rejects
its exclusive focus upon facilities. It de-
fines the process as essentially business-

wide and one which enables competitive
and/or organizational advantages.
it focus Versus business
focus as a starting Point
The starting point for planning pro-
cesses that an organization will use as
its BCP must include an assessment of
the likely impact different types of ‘in-
cidents’ will/would make on the busi-
ness. As far as financial companies are
concerned, IT focus is critical since, as
mentioned, new technologies continue
to become more and more integral to
on going financial activities. In addition
to assessing the likely impact upon the
entire organization, banks must con-
sider the likely effects upon their differ-
ent business areas. The “vulnerability
& business impact matrix” (Figure 2) is
a tool that can be used to summarize
the inter-linkages between the various
information system services, their vul-
nerability and the impact on business
activities. It is useful in different ways.
To start, the BC approach doesn’t fo-
cus solely upon IT problems but rather
uses a business-wide approach. Given
the strategic focus of BCP, an under-
standing of the relationships between
value-creating activities is a key deter-
minant of the effectiveness of any such
process. In this way we can define cor-
rect BC perimeter (Figure 2) by trying to
extract the maximum value from BCP

within a context of bounded rationality
and limited resources. What the BCP
teams in these organizations have done
is focus upon how resources were uti-
lized and how they were added to value-
creation rather than merely being “sup-
port activity” which consumes financial
resources unproductively. In addition,
the convergence of customer with client
technologies also demands that those
managing the BCP process are aware of
the need to “... expand the contingency
role to not merely looking inward but
actually looking out.” Such a dual focus
uncovers the linkages between customer
and client which create competitive ad-
vantage. Indeed, in cases where clients’
business fundamentally depends upon
information exchange, for instance
many banks today provide online equity
brokerage services, it might be argued
that there is a ‘virtual value chain’ which
the BCP team protects thereby provid-
ing the ‘market-space’ for value creation
to take place. Finally, another benefit is
that vulnerability and business impact
can aid the prioritization of particular
key areas.
figure 2. Vulnerability & business impact matrix
124 c o m m u n i c at i o n s o f t h e a c m | m a r c h 2
0 1 0 | v o l . 5 3 | n o . 3

player, yet their functions are just as
vital to achieving the overall objectives
of the football team. The value chain
provides an opportunity to examine
the connection between the exciting
and the hum drum links that deliver
customer value. The evolution of crisis
preparations from the IT focused di-
saster recovery (DR) solutions towards
the BC approach reflects a growing un-
derstanding that business continuity
depends upon the maintenance of all
elements which provide organizational
efficiency-effectiveness and customer
value, whether directly or indirectly.
Prevention focus of
business continuity
A final key characteristic of the BC ap-
proach concerns its primary role in
prevention. A number of authors have
identified that the potential for crises
is normal for organizations.7,11 Crisis
avoidance requires a strategic approach
and requires a good understanding of
both the organization’s operating pro-
cesses, systems and the environment
in which it operates.
In the BC approach, a practice orga-
nization should develop a BCP culture
to eliminate the barriers to the develop-
ment of crisis prevention strategies. In
particular, these organizations should

recognize that incidents, such as the
New York terrorist attach or the City of
London bombings are merely triggered
by external technical causes and that
their effects are largely determined by
internal factors that were within the
control of their organizations. In these
cases a cluster of crises should be iden-
new and obsolete technologies
Today’s approach to BCP is focused on
well-structured process management and
business-driven paradigms. Even if some
technology systems seem to be “business
as usual,” some considerations must be
made to avoid any misleading conjecture
from an analytical side.
When considering large institutions
with systemic impact- not only on their
own but on clients businesses as well-
two key objectives need to be consid-
ered when facing an event. These have
been named RPO (Recovery Point Ob-
jective) and RTO (Recovery Time Ob-
jective) as shown in Figure 3. RPO deals
with how far in the past you have to go
to resume a consistent situation; RTO
considers how long it takes to resume a
standard or regular situation. The defi-
nitions of RPO and RTO can change ac-
cording to data center organization and
how high a level a company wants to its
own security and continuity to be.
For instance a dual site recovery sys-

tem organization must consider and
evaluate three points of view (Figure
3). These are: application’s availability,
BC process and data perspective.
Data are first impacted (RTO) before
the crisis event (CE) due to the closest
“consistent point” from which to re-
start. The crisis opening (CO) or decla-
ration occurs after the crisis event (CE).
“RTO_s,” or computing environ-
ment restored point, considers the
length of time the computing environ-
ment needs in order to be restored (for
example, when servers, network etc.
are once again available); “RTO_rc,” or
mission critical application restarted
point, indicates the “critical or vital ap-
plications” (in rank order) are working
once again; “RTO_r,” or applications
and data restored point, is the point
from which all applications and data
are restored, but (and it is a big but)
“RTO_end,” or previous environment
restored point, is the true end point
when the previous environment is fully
restored (all BC solutions are properly
working). Of the utmost importance
is that during the period between
“RTO_r” and “RTO_end” a second di-
saster event could be fatal!
Natural risks are also increasing in
scope and frequency, both in terms of
floods (central Europe 2002) and hurri-

canes (U.S. 2005), thus the coining of an
actual geographical recovery distance,
today considered more than 500 miles.
Such distance is forcing businesses and
institutions alike to consider a new tech-
nological approach and to undertake
critical discussion on synchronous-asyn-
chronous data replication: their intervals
and quality. Therefore, more complex
analysis about RPO and RTO is required.
However the most important issue,
from a business point of view when
faced with an imminent and unfore-
seen disaster, is how to reduce restore
or restart time, trying to shrink this win-
dow to mere seconds or less. New push-
ing technologies (SATA – Serial ATA
and MAID – Massive Arrays Inexpen-
sive Disk) are beginning to make some
progress in reducing the time problem.
business focus Versus
Value chain focus
The business area selected by the “vul-
nerability and business impact analy-
sis matrix” should be treated in accor-
dance with the value chain and value
system. In addition to assessing the
likely disaster impact upon IT depart-
ments, organizations should consider
disaster impacts over all company de-
partments and their likely effects upon
customers. Organizations should avoid
the so-called Soccer Star Syndrome.6

In drawing an analogy with the football
industry, one recognizes that greater
management attention is often focused
on the playing field rather than the un-
glamorous, but very necessary, locker
room and stadium management sup-
port activities. Defenders and goalkeep-
ers, let alone the stadium manager, do
not get paid at the same level as the star
figure 3. rPo & rto
tified. Such clusters should be catego-
rized along the axis of internal-external
and human/social-technical/economic
causes and effects. By adopting a strate-
gic approach, decisions could be made
about the extent of exposure in particu-
lar product markets or geographical
sites. An ongoing change management
program could contribute to real com-
mitment from middle managers who,
from our first investigation, emerged
as key determinants of the success of
the BC approach.
management support
and sponsorship
BCP success requires the commitment

of middle managers. Hence manag-
ers need to avoid considering BCP as
a costly, administrative inconvenience
that diverts time away from money-
making activities. All organizational
levels should be aware of the fact that
BCP was developed in partnership be-
tween the BCP team and front line op-
eratives. As a result, strategic business
units should own BCP plans. In addi-
tion, CEO involvement is key in rallying
support for the BCP process.
Two other key elements support
the BC approach. Firstly, there is the
recognition that responsibility for the
process rests with business managers
and this is reinforced through a formal
appraisal and other reward systems.
Secondly, peer pressure is deemed im-
portant in getting laggards to assume
responsibility and so affect a more re-
ceptive culture.
Finally, BCP teams need to regard
BCP as a process rather than as a spe-
cific end-point.
conclusion
Although the risk of terrorism and
regulations are identified as two key
factors for developing a business con-
tinuity perspective, we see that orga-
nizations need to adopt the BC ap-
proach for strategic reasons. The trend
to adopt a BC approach is also a proxy

for organizational change in terms of
culture, structure and communica-
tions. The BC approach is increasingly
viewed as a driver to generate competi-
tive advantage in the form of resilient
information systems and as an impor-
tant marketing characteristic to attract
and maintain customers.
Referring to organizational change
and culture, the BC approach should
be a business-wide approach and not
an IT-focused one. It needs supportive
measures to be introduced to encour-
age managers to adhere to the BC idea.
Management as a whole should also be
confident that the BC approach is an
ongoing process and not only an end
point that remains static upon comple-
tion. It requires changes of key assump-
tions and values within the organiza-
tional structure and culture that lead to
a real cultural and organizational shift.
This has implications for the role that
the BC approach has to play within the
strategic management processes of the
organization as well as within the levels
of strategic risk that an organization
may wish to undertake in its efforts to
secure a sustainable competitive or so
called first mover advantage.
References
1. Allen J.H. CERT® Guide to System and Network

Security Practices. Addison Wesley Professional, 2001.
2. Bank of Japan, Business Continuity Planning at
Financial Institutions, July 2003. http://www.boj.or.jp/
en/type/release/zuiji/kako03/fsk0307a.htm
3. Cerullo V. and Cerullo, J. Business continuity planning:
A comprehensive approach. Informtion System
Management Journal, Summer 2004.
4. Decker A. Business continuity management: A model
for survival. EDS White Paper, 2004.
5. Dhillon, G. The challenge of managing information
security. In International Journal of Information
Management 1, 1(2004), 243–244.
6. Elliott D. and Swartz E. Just waiting for the next big
bang: Business continuity planning in the uk finance
sector. Journal of Applied Management Studies 8, 1
(1999), 45-60.
7. Greiner, L. Evolution and revolution as organisations
grow. In Harvard Business Review (July/August)
reprinted in Asch, D. & Bowman, C. (Eds) (1989)
Readings in Strategic Management (London,
Macmillan), 373-387.
8. Lam, W. Ensuring business continuity. IT Professional
4, 3 (2002), 19 - 25
9. Lewis, W. and Watson, R.T. Pickren A. An empirical
assessment of IT disaster risk. Comm. ACM 46, 9
(2003), 201-206.
10. McAdams, A.C. Security and risk management:

A fundamental business issue. Information
Management Journal 38, 4 (2004), 36–44.
11. Pauchant, T.C. and Mitroff, I. Crisis prone versus crisis
avoiding organisations: is your company’s culture its
own worst enemy in creating crises?. Industrial Crisis
Quarterly 2, 4 (1998), 53-63.
12. Quirchmayr, G. Survivability and business continuity
management. In Proceedings of the 2nd Workshop on
Australasian Information Security, Data Mining and
Web Intelligence, and Software Internationalisation.
ACSW Frontiers (2004).
Vincenzo Morabito ([email protected])
is assistant professor of Organization and Information
System at the Bocconi University in Milan where he
teaches management information system, information
management and organization. He is also Director of the
Master of Management Information System System at
the Bocconi University.
Fabio Arduini ([email protected]) is
responsible for IT architecture and Business Continuity
for defining the technological and business continuity
statements for the Group according to the ICT
department.
© 2010 ACM 0001-0782/10/0300 $10.00
The Anti-Forensics Challenge

Kamal Dahbur
[email protected]
Bassil Mohammad
[email protected]
School of Engineering and Computing Sciences
New York Institute of Technology
Amman, Jordan
ABSTRACT
Computer and Network Forensics has emerged as a new field in
IT that is aimed at acquiring and analyzing digital evidence for
the purpose of solving cases that involve the use, or more
accurately misuse, of computer systems. Many scientific
techniques, procedures, and technological tools have been
evolved and effectively applied in this field. On the opposite
side, Anti-Forensics has recently surfaced as a field that aims at
circumventing the efforts and objectives of the field of
computer
and network forensics. The purpose of this paper is to highlight
the challenges introduced by Anti-Forensics, explore the
various
Anti-Forensics mechanisms, tools and techniques, provide a
coherent classification for them, and discuss thoroughly their
effectiveness. Moreover, this paper will highlight the challenges
seen in implementing effective countermeasures against these
techniques. Finally, a set of recommendations are presented
with
further seen research opportunities.
Categories and Subject Descriptors
K.6.1 [Management of Computing and Information
Systems]: Projects and People Management – System Analysis
and Design, System Development.

General Terms
Management, Security, Standardization.
Keywords
Computer Forensics (CF), Computer Anti-Forensics (CAF),
Digital Evidence, Data Hiding.
1. INTRODUCTION
The use of technology is increasingly spreading
covering various aspects of our daily lives. An equal increase, if
not even more, is realized in the methods and techniques created
with the intention to misuse the technologies serving varying
objectives being political, personal or anything else. This has
clearly been reflected in our terminology as well, where new
terms like cyber warfare, cyber security, and cyber crime,
amongst others, were introduced. It is also noticeable that such
attacks are getting increasingly more sophisticated, and are
utilizing novel methodologies and techniques. Fortunately,
these
attacks leave traces on the victim systems that, if successfully
recovered and analyzed, might help identify the offenders and
consequently resolve the case(s) justly and in accordance with
applicable laws. For this purpose, new areas of research
emerged
addressing Network Forensics and Computer Forensics in order
to define the foundation, practices and acceptable frameworks
for scientifically acquiring and analyzing digital evidence in to
be presented in support of filed cases. In response to Forensics
efforts, Anti-Forensics tools and techniques were created with
the main objective of frustrating forensics efforts, and taunting
its credibility and reliability.
This paper attempts to provide a clear definition for Computer
Anti-Forensics and consolidates various aspects of the topic. It

also presents a clear listing of seen challenges and possible
countermeasures that can be used. The lack of clear and
comprehensive classification for existing techniques and
technologies is highlighted and a consolidation of all current
classifications is presented.
Please note that the scope of this paper is limited to Computer-
Forensics. Even though it is a related field, Network-Forensics
is
not discussed in this paper and can be tackled in future work.
Also, this paper is not intended to cover specific Anti-Forensics
tools; however, several tools were mentioned to clarify the
concepts.
After this brief introduction, the remainder of this paper is
organized as follows: section 2 provides a description of the
problem space, introduces computer forensics and computer
anti-forensics, and provides an overview of the current issues
concerning this field; section 3 provides an overview of related
work with emphasis on Anti-Forensics goals and classifications;
section 4 provides detailed discussion of Anti-Forensics
challenges and recommendations; section 5 provides our
conclusion, and suggested future work.
2. THE PROBLEM SPACE
Rapid changes and advances in technology are impacting every
aspect of our lives because of our increased dependence on such
systems to perform many of our daily tasks. The achievements
in the area of computers technology in terms of increased
capabilities of machines, high speeds communication channels,
and reduced costs resulted in making it attainable by the public.
The popularity of the Internet, and consequently the technology
associated with it, has skyrocketed in the last decade (see Table
1 and Figure 1). Internet usage statistics for 2010 clearly show
the huge increase in Internet users who may not necessary be
computer experts or even technology savvy [1].

Permission to make digital or hard copies of all or part of this
work for
personal or classroom use is granted without fee provided that
copies are
not made or distributed for profit or commercial advantage and
that
copies bear this notice and the full citation on the first page. To
copy
otherwise, or republish, to post on servers or to redistribute to
lists,
requires prior specific permission and/or a fee.
ISWSA’11, April 18–20, 2011, Amman, Jordan.
Copyright 2011 ACM 978-1-4503-0474-0/04/2011…$10.00.
WORLD INTERNET USAGE AND POPULATION
STATISTICS
World Regions
Population
(2010 Est.)
Internet Users
Dec. 31, 2000
Internet Users
Latest Data

Growth
2000-2010
Africa 1,013,779,050 4,514,400 110,931,700 2357%
Asia 3,834,792,852 114,304,000 825,094,396 622%
Europe 813,319,511 105,096,093 475,069,448 352%
Middle East 212,336,924 3,284,800 63,240,946 1825%
North America 344,124,450 108,096,800 266,224,500 146%
Latin America/
Caribbean
592,556,972 18,068,919 204,689,836 1033%
Oceania/Australia 34,700,201 7,620,480 21,263,990 179%
WORLD TOTAL 6,845,609,960 360,985,492 1,966,514,816
445%
Table 1. World Internet Usage – 2010 (Reproduced from [1]).
Figure 1. World Internet Usage–2010 (Based on Data from [1])
Unfortunately, some of the technology users will not use it in a
legitimate manner; instead, some users may deliberately misuse
it. Such misuse can result in many harmful consequences
including, but not limited to, major damage to others systems or
prevention of service for legitimate users. Regardless of the
objectives that such “bad guys” might be aiming for from such

misuse (e.g. personal, financial, political or religious purposes),
one common goal for such users is the need to avoid detection
(i.e. source determination). Therefore, these offenders will exert
thought and effort to cover their tracks to avoid any liabilities
or
accountability for their damaging actions. Illegal actions (or
crimes) that involve a computing system, either as a mean to
carry out the attack or as a target, are referred to as
Cybercrimes
[2]. Computer crime or Cybercrime are two terms that are being
used interchangeably to refer to the same thing. A Distributed
Denial of Service attack (DDoS) is a good example for a
computer crime where the computing system is used as a mean
as well as a target. Fortunately, cybercrimes leave fingerprints
that investigators can collect, correlate and analyze to
understand what, why, when and how a crime was committed;
and consequently, and most importantly, build a good case that
can bring the criminals to justice. In this sense, computers can
be
seen as great source of evidence. For this purpose Computer
Forensics (CF) emerged as a major area of interest, research and
development driven by the legislative needs of having scientific
reliable framework, practices, guidelines, and techniques for
forensics activities starting from evidence acquisition,
preservation, analysis, and finally presentation. Computer
Forensics can be defined as the process of scientifically
obtaining, examining and analyzing digital information so that
it
can be used as evidence in civil, criminal or administrative
cases
[2]. A more formal definition of Computer Forensics is “the
discipline that combines elements of law and computer science
to collect and analyse data from computer systems, networks,
wireless communications, and storage devices in a way that is
admissible as evidence in a court of law” [3].

To hinder the efforts of Computer Forensics, criminals work
doggedly to instigate, develop and promote counter techniques
and methodologies, or what is commonly referred to as Anti-
Forensics. If we adopt the definition of Computer Forensics
(CF) as scientifically obtaining, examining, and analysing
digital
information to be used as evidence in a court of law, then Anti-
Forensics can be defined similarly but in the opposite direction.
In Computer Anti-Forensics (CAF) scientific methods are used
to simply frustrate Forensics efforts at all forensics stages. This
includes preventing, impeding, and/or corrupting the acquiring
of the needed evidence, its examination, its analysis, or its
credibility. In other words, whatever necessary to ensure that
computer evidence cannot get to, or will not be admissible in, a
court of law.
The use of Computer Anti-Forensics tools and techniques is
evident and far away from being an illusion. So, criminals’
reliance on technology to cover their tracks is not a claim, as
clearly reflected in recent researches conducted on reported and
investigated incidents. Based on 2009-2010 Data Breach
Investigations Reports [4][5], investigators found signs of anti-
forensics usage in over one third of cases in 2009 and 2010 with
the most common forms being the same for both years. The
results show that the overall use of anti-forensics remained
relatively flat with slight movement among the techniques
themselves. Figure [2] below shows the types of anti-Forensic
techniques used (data wiping, data hiding and data corruption)
by percentage of breaches. As shown in Figure [2] below, data
wiping is still the most common, because it is supported by
many commercial off-the-shelf products that are available even
as freeware that are easy to install, learn and use; while data
hiding and data corruption remain a distant behind.

Figure 2 Types of Anti-Forensics – 2010 (Reproduced from [5])
It is important to note that the lack of understanding on what
CAF is and what it is capable of may lead to underestimating or
probably overlooking CAF impact on the legitimate efforts of
CF. Therefore, when dealing with computer forensics, it is
important that we address the following questions, among
others, that are related to CAF: Do we really have everything?
Are the collected evidences really what were left behind or they
are only just those intentionally left for us to find? How to
know
if the CF tool used was not misleading us due to certain
weaknesses in the tool itself? Are these CF tools developed
according to proper secure software engineering methodologies?
Are these CF tools immune against attacks? What are the recent
CAF methods and techniques? This paper attempts to provide
some answers to such questions that can assist in developing the
proper understanding for the issue.
3. RELATED WORK, CAF GOALS AND
CLASSIFICATIONS
Even though computer forensics and computer ant-forensics are
tightly related, as if they are two faces of the same coin, the
amount of research they received was not the same. CF received
more focus over the past ten years or so because of its relation
with other areas like data recovery, incident management and
information systems risk assessment. CF is a little bit older, and
therefore more mature than CAF. It has consistent definition,
well defined systematic approach and complete set of leading
best practices and technology.
CAF on the other side, is still a new field, and is expected to get
mature overtime and become closer to CF. In this effort, recent

research papers attempted to introduce several definitions,
various classifications and suggest some solutions and
countermeasures. Some researchers have concentrated more on
the technical aspects of CF and CAF software in terms of
vulnerabilities and coding techniques, while others have
focused
primarily on understanding file systems, hardware capabilities,
and operating systems. A few other researchers chose to address
the issue from an ethical or social angle, such as privacy
concerns. Despite the criticality of CAF, it is hard to find a
comprehensive research that addresses the subject in a holistic
manner by providing a consistent definition, structured
taxonomies, and an inclusive view of CAF.
3.1. CAF Goals
As stated in the previous section, CAF is a collection of tools
and techniques that are intended to frustrate CF tools and CF’s
investigators efforts. This field is growingly receiving more
interest and attention as it continues to expose the limitations of
currently available computer forensics techniques as well as
challenge the presumed reliability of common CF tools. We
believe, along with other researchers, that the advancements in
the CAF field will eventually put the necessary pressure on CF
developers and vendors to be more proactive in identifying
possible vulnerabilities or weaknesses in their products, which
consequently should lead to enhanced and more reliable tools.
CAF can have a broad range of goals including: avoiding
detection of event(s), disrupting the collection of information,
increasing the time an examiner needs to spend on a case,
casting doubt on a forensic report or testimony. In addition,
these goals may also include: forcing the forensic tool to reveal
its presence, using the forensic tool to attack the organization in
which it is running, and leaving no evidence that an anti-
forensic

tool has been run [6].
3.2. CAF Classifications
Several classifications for CAF have been introduced in the
literature. These various taxonomies differ in the criteria used
to
do the classification. The following are the most common
approaches used:
1. Categories Based on the Attacked Target
• Attacking Data: The acquisition of evidentiary data in
the forensics process is a primary goal. In this
category CAFs seek to complicate this step by
wiping, hiding or corrupting evidentiary data.
• Attacking CF Tools: The major focus of this category
is the examination step of the forensics process. The
objective of this category is to make the examination
results questionable, not trustworthy, and/or
misleading by manipulating essential information
like hashes and timestamps.
• Attacking the Investigator: This category is aimed at
exhausting the investigator’s time and resources,
leading eventually to the termination of the
investigation.
2. CAF Techniques vs. Tactics
This categorization makes a clear distinction
between the terms anti-forensics and counter-forensics
[7], even though the two terms have been used
interchangeably by many others as the emphasis is
usually on technology rather than on tactics.

• Counter-Forensics: This category includes all
techniques that target the forensics tools directly to
cause them to crash, erase collected evidence,
and/or break completely (thus disallowing the
investigator from using it). Compression bombs
are good example on this category.
• Anti-Forensics: This category includes all
technology related techniques including
encryption, steganography, and alternate data
streams (ADMs).
3. Traditional vs. Non-Traditional
• Traditional Techniques: This category includes
techniques involving overwriting data,
Cryptography, Steganography, and other data hiding
approaches beside generic data hiding techniques.
• Non-Traditional Techniques: As opposed to
traditional techniques, these techniques are more
creative and impose more risk as they are harder to
detect. These include:
o Memory injections, where all malicious
activities are done on the volatile memory area.
o Anonymous storage, utilizes available web-
based storage to hide data to avoid being found
on local machines.
o Exploitation of CF software bugs, including
Denial of Service (DoS) and Crashers, amongst
others.

4. Categories Based on Functionality
This categorization includes data hiding, data
wiping and obfuscation. Attacks against CF processes
and tools is considered a separate category based on
this scheme
4. CAF CHALLENGES
Because Computer Anti-Forensics (CAF) is a relatively new
discipline, the field faces many challenges that need considered
and addressed. In this section, we have attempted to identify the
most pressing challenges surrounding this area, highlight the
research needed to address such challenges, and attempt to
provide perceptive answers to some the concerns.
4.1. Ambiguity
Aside from having no industry-accepted definition for CAF,
studies in this area view anti-forensics differently; this leads to
not having a clear set of standards or frameworks for this
critical
area. Consequently, misunderstanding may be an unavoidable
end result that could lead to improperly addressing the
associated concerns. The current classification schemes, stated
above, which mostly reflect the author’s viewpoint and probably
background, confirm as well as contribute to the ambiguity in
this field. A classification can only be beneficial if it must has
clear criteria that can assist not only in categorizing the current
known techniques and methodologies but will also enable
proper
understanding and categorization of new ones. The attempt to
distinguish between the two terms, anti-forensics and counter-
forensics based on technology and tactics is a good initiative
but
yet requires more elaboration to avoid any unnecessary

confusions.
To address the definition issue, we suggest to adopt a definition
for CAF that is built from our clear understanding of CF. The
classification issue can be addressed by narrowing the gaps
amongst the different viewpoints in the current classifications
and excluding the odd ones.
4.2. Investigation Constraints
A CF investigation has three main constraints/challenges,
namely: time, cost and resources. Every CF investigation case
should be approached as separate project that requires proper
planning, scoping, budgeting and resources. If these elements
are not properly accounted for, the investigation will eventually
fail, with most efforts up to the point of failure being wasted. In
this regard, CAF techniques and methodologies attempt to
attack
the time, cost and resources constraints of an investigation
project. An investigator may not able to afford the additional
costs or allocate the additional necessary resources. Most
importantly, the time factor might play a critical role in the
investigation as evidentiary data might lose value with time,
and/or allow the suspect(s) the opportunity to cover their tracks
or escape. Most, if not all, CAF techniques and methodologies
(including data wiping, data hiding, and data corruption)
attempt
to exploit this weakness. Therefore, it proper project
management is imperative before and during every CF
investigation.
4.3. Integration of Anti-Forensics into Other
Attacks
Recent researches show an increased adoption of CAF
techniques into other typical attacks. The primary purposes of

integrating CAF into other attacks are undetectability and
deletion of evidence. Two major areas for this threatening
integration are Malware and Botnets [8][9]. Malwares and
Botnets when armed with these techniques will make the
investigative efforts labour and time intensive which can lead to
overlooking critical evidence, if not abandoning the entire
investigation.
4.4. Breaking the Forensics Software
CF tools are, of course, created by humans, just like other
software systems. Rushing to release their products to the
market before their competition, companies tend to,
unintentionally, introduce vulnerabilities into their products. In
such cases, software development best practices, which are
intended to ensure the quality of the product, might be
overlooked leading to the end product being exposed to many
known vulnerabilities, such as buffer overflow and code
injection. Because CF software is ultimately used to present
evidence in courts, the existence of such weaknesses is not
tolerable. Hence, all CF software, before being used, must be
subjected to thorough security testing that focuses on robustness
against data hiding and accurate reproduction of evidence.
The Common Vulnerabilities and Exposures (CVE) database is
a great source for getting updates on vulnerabilities in existing
products [10]. Some studies have reported several weaknesses
that may result in crashes during runtime leaving no chance for
interpreting the evidence [11]. Regardless of the fact that some
of these weaknesses are still being disputed [12], it is important
to be aware that these CF tools are not immune to
vulnerabilities, and that CAF tools would most likely take
advantage of such weaknesses. A good example of a common
technique that can cause a CF to fail or crash is the
“Compression Bomb”; where files are compressed hundreds of
times such that when a CF tool tries to decompress, it will use
up so many resources causing the computer or the tool to hang

or crash.
4.5. Privacy Concerns
Increasingly, users are becoming more aware of the fact that
just
deleting a file does not make it really disappear from the
computer and that it can be retrieved by several means. This
awareness is driving the market for software solutions that
provide safe and secure means for files deletion. Such tools are
marketed as “privacy protection” software and claim to have the
ability to completely remove all traces of information
concerning user’s activity on a system, websites, images and
downloaded files. Some of these tools do not only provide
protection through secure deletion; but also offer encryption
and
compression. Moreover, these tools are easy use, and some can
even be downloaded for free. WinZip is a popular tool that
offers encryption, password protection, and compression. Such
tools will most definitely complicate the search for and
acquiring of evidence in any CF investigation because they
make the whole process more time and resources consuming.
Privacy issues in relation to CF have been the subject of
detailed
research in an attempt to define appropriate policies and
procedures that would maintain users’ privacy when excessive
data is acquired for forensics purposes [13].
4.6. Nature of Digital Evidence
CF investigations rely on two main assumptions to be

successful: (1) the data can be acquired and used as evidence,
and (2) the results of the CF tools are authentic, reliable, and
believable. The first assumption highlights the importance of
digital evidence as the basis for any CF investigation; while the
second assumption highlights the critical role of the
trustworthiness of the CF tools in order for the results to stand
solid in courts.
Digital evidence is more challenging than physical evidence
because of its more susceptible to being altered, hidden,
removed, or simply made unreadable. Several techniques can be
utilized to achieve such undesirable objectives that can
complicate the acquisition process of evidentiary digital data,
and thus compromise the first assumption.
CF tools rely on many techniques that can attest to their
trustworthiness, including but limited to: hashing; timestamps;
and signatures during examination, analyses and inspection of
source files. CAF tools can in turn utilize new advances in
technology to break such authentication measures, and thus
comprise the second assumption..
The following is a brief explanation of some of the techniques
that are used to compromise these two assumptions:
• Encryption is used to make the data unreadable. This is one
of the most challenging techniques, as advances in
encryption algorithms and tools empowered it to be applied
on entire hard drive, selected partitions, or specific
directories and files. In all cases, an encryption key is
usually needed to reverse the process and decrypt the
desired data, which is usually unknown to an investigator,
in most cases. To complicate matters, decryption using
brute-force techniques becomes infeasible when long keys
are used. More success in this regard might be achieved
with keyloggers or volatile memory content acquisition.

• Steganography aims at hiding the data, by embedding it
into another digital form, such as images or videos.
Commercial Steganalysis tools, that can detect hidden data,
exist and can be utilized to counter Steganography.
Encryption and Steganography can be combined to obscure
data and make it also unreadable, which can extremely
complicate a CF investigation.
• Secure-Deletion removes the target data completely from
the source system, by overwriting it with random data, and
thus rendering the target data unrecoverable. Fortunately,
most of the available commercial secure-deletion tools tend
to underperform and thus miss some data [14]. More
research is needed in this area to understand the weaknesses
and identify the signatures of such tools. Such information
is needed to detect the operations and minimize the impact
of these tools.
• Hashing is used by CF tools to validate the integrity of
data. A hashing algorithm accepts a variable-size input,
such as a file, and generates a unique fixed-size value that
corresponds to the given input. The generated output is
unique and can be used as a fingerprint for the input file.
Any change in the original file, no matter how minor, will
result in considerable change in the hash value produced by
the hashing algorithm. A key feature in hashing algorithms
is “Irreversibility” where having the hash value in hand will
not allow the recovery of the original input. Another key
feature is “Uniqueness” which basically means that the
hash values of two files will be equal if and only if the files
are absolutely identical. Many hashing algorithms have
developed, and some have been already infiltrated or
cracked. Other algorithms like MD5, MD6, Secure Hashing
Algorithms (SHA), SHA-1, SHA-2, amongst others, are
harder to break. However, all are vulnerable to being

infiltrated as technology and research advance [15].
Research is also necessary in the other direction to enhance
the capabilities of CF tools in this regard and maintain their
credibility.
• Timestamps are associated with files and are critical for the
task of establishing the chain of events during a CF
investigation. The time line for the events is contingent on
the accuracy of timestamps. CAF tools have provided the
capability to modify timestamps of files or logs, which can
mislead an investigation and consequently coerce the
conclusion. Many tools currently exist on the market, some
are even freely available, that make it easy to manipulate
the timestamps, such as Timestamp Modifier and
SKTimeStamp [16].
• File Signatures, also known as Magic Numbers, are
constant known values that exist at the beginning of each
file to identify the file type (e.g. image file, word
document, etc.). Hexadecimal editors, such as WinHex, can
be used to view and inspect these values. Forensics
investigators rely on these values to search for evidence of
certain type. When a file extension is changed, the actual
type file is not changed, and thus the file signature remains
unchanged. ACF tools intentionally change the file
signatures in their attempt to mislead the investigations as
some evidence files are overlooked or dismissed. Complete
listing of file signatures or magic numbers can be found on
the web in [17].
• CF Detection is simply the capability of ACF tools to
detect the presences of CF software and their activities or
functionalities. Self-Monitoring, Analysis and Reporting
Technology (SMART) built into most hard drives reports
the total number of power cycles (Power_Cycle_Count),
the total time that a hard drive has been in use

(Power_On_Hours or Power_On_Minutes), a log of high
temperatures that the drive has reached, and other
manufacturer-determined attributes. These counters can be
reliably read by user programs and cannot be reset.
Although the SMART specification implements a
DISABLE command (SMART 96), experimentation
indicates that the few drives that actually implement the
DISABLE command continue to keep track of the time-in-
use and power cycle count and make this information
available after the next power cycle. CAF tools can read
SMART counters to detect attempts at forensic analysis and
alter their behavior accordingly. For example, a dramatic
increase in Power_On_Minutes might indicate that the
computer’s hard drive has been imaged [18].
• Business Needs: Cloud Computing (CC) is a business
model typically suited for small and medium enterprises
(SME) that do not have enough resources to invest in
building their own IT infrastructure. Hence, they tend to
outsource this to third parties who will in turn lease their
infrastructure and probably applications as services. This
new model introduces more challenges to CF investigations
due to mainly the fact that the data is on the cloud (i.e.
hosted somewhere in the Internet space), being transferred
across countries with different regulations, and most
importantly might reside on a machine that hosts other data
instances of other enterprises. In some instances, the data
for the same enterprise might even be stored across
multiple data centres [19][20]. These issues complicate the
CF’s primary functions (i.e. data acquisition, examination,
and analyses) needed to build a good case extremely hard.
4.7 Recommendations

Based on our findings, we see room for improvement in the
field
of ACF that can address some of the issues surrounding this
field. We believe that such recommendations, when adopted
and/or implemented properly, can add value and consolidate the
efforts for advancing this field. Below is a list and brief
explanation of the recommendations:
a) Spend More Efforts to Understand ACF
More efforts should be spent in order to reach an agreed
upon comprehensive definition for ACF that would assist in
getting better understanding of the concepts in the field.
These efforts should also extend to develop acceptable best
practices, procedures and processes that constitute the
proper framework, or standard, that professionals can use
and build onto. ACF classifications also need to be
integrated, clarified, and formulated on well-defined
criteria. Such fundamental foundational efforts would
eventually assist researchers and experts in addressing the
issues and mitigating the associated risks.
Awareness of AFC techniques and their capabilities will
prevent, or at least reduce, their success and consequently
their impact on CF investigations. Knowledge in this area
should encompass both techniques and tactics. Continued
education and research are necessary to stay atop of latest
developments in the field, and be ready with appropriate
countermeasures when and as necessary.
b) Define Laws that Prohibit Unjustified Use of ACF
Existence of strict and clear laws that detail the obligations
and consequences of violations can play a key deterrent
role for the use of these tools in a destructive manner.
When someone knows in advance that having certain ACF
tools on one’s machine might be questioned and possibly
pose some liabilities, one would probably have second

thoughts about installing such tools.
Commercial non-specialized ACF tools, which are more
commonly used, always leave easily detectable fingerprints
and signatures. They sometimes also fail to fulfil their
developers’ promises of deleting all traces of data. This can
later be used as evidence against a suspected criminal and
can lead to an indictment. The proven unjustified use of
ACF tools can be used as supporting incriminatory
evidence in courts in some countries [21].
To address the privacy concerns, such as users needs to
protect personal data like family pictures or videos, an
approved list of authorized software can be compiled with
known fingerprints, signatures and special recovery keys.
Such information, especially recovery keys, would then be
safe-guarded in possession of the proper authorities. It
would strictly be used to reverse the process of AFC tools,
through the appropriate judicial processes.
c) Utilize Weaknesses of ACF Software
In some cases, digital evidence can still be recovered if a
data wiping tool is poorly used or is functioning
improperly. Hence, each AFC software must be carefully
examined and continuously analyzed in order to fully
understand its exact behaviour and determine its
weaknesses and vulnerabilities [14][22]. This can help to
develop the appropriate course of actions given the
different possible scenarios and circumstances. This could
prove to be valuable in saving time and resources during an
investigation.
d) Harden CF Software
CAF and CF thrive on the weaknesses of each other. To
ensure justice CF must always strive to be more advanced

than its counterpart. This can be achieved by conducting
security and penetration tests to verify the software is
immune to external attacks. Also, it is imperative not to
submit to market pressure and demand for tools by rapidly
releasing products without proper validation. The best
practices of software development must not be overlooked
at any rate. When vulnerabilities are identified, proper fixes
and patches must be tested, verified and deployed promptly
in order to avoid zero-day attacks.
5. CONCLUSION AND FUTURE WORK
5.1. Conclusion
Computer Anti-Forensics (CAF) is an important developing area
of technology. Because CAF success means that digital
evidence
will not be admissible in courts, Computer Forensics (CF) must
evaluate its techniques and tactics very carefully. Also, CF
efforts must be integrated and expedited to narrow the current
exiting gap with CAF. It is important to agree on an acceptable
definition and classification for CAF which will assist in
implementing proper countermeasures. Current definitions and
classifications all seem to concentrate on specific aspects of
CAF without truly providing the needed holistic view.
It is very important to realize that CAF is not only about tools
that are used to delete, corrupt, or hide evidence. CAF is a
blend
of techniques and tactics that utilize technological
advancements
in areas like encryption and data overwriting amongst other
techniques to obstruct investigators’ efforts.
Many challenges exist and need to be carefully analyzed and

addressed. In this paper we attempted to identify some of these
challenges and suggested some recommendations that might, if
applied properly, mitigate the risks.
5.2. Future Work
This paper provides solid foundation for future work that can
further elaborate on the various highlighted areas. It suggests a
definition for CAF that is closely aligned with CF and presents
several classifications that we deem acceptable. It also
discusses
several challenges that can be further addressed in future
research. CAF technologies, techniques, and tactics need to
receive more attention in research, especially in the areas that
present debates on hashes, timestamps, and file signatures.
Research opportunities in Computer Forensics, Network
Forensics, and Anti-Forensics can use the work presented in this
paper as a base. Privacy concerns and other issues related to the
forensics field introduce a raw domain that requires serious
consideration and analysis. Cloud computing, virtualization, and
related laws and regulations concerns are topics that can be
considered in future research.
6. REFERENCES
[ 1 ] Corey Thuen, University of Idaho: “Understanding
Counter-Forensics to Ensure a Successful Investigation”.
DOI=http://citeseerx.ist.psu.edu/viewdoc/summary?doi=
10.1.1.138.2196
[ 2 ] Internet Usage Statistics, “The Internet Big Picture,
World Internet Users and Population Stats”. DOI=
http://www.internetworldstats.com/stats.htm

[ 3 ] Bill Nelson, Amelia Phillips, and Steuart, “Guide to
Computer Forensics and Investigations”, pp 2-3, 4
th
Edition.
[ 4 ] US-Computer Emergency Readiness Team, CERT, a
government organization, “Computer Forensics”, 2008.
[ 5 ] Verizon Business, “2009 Data Breach Investigations
Report”. A study conducted by the Verizon RISK Team
in cooperation with the United States Secret Service.
DOI=http://www.verizonbusiness.com/about/news/podca
sts/1008a1a3-111=129947--
Verizon+Business+2009+Data+Breach+Investigations+
Report.xml
[ 6 ] Verizon Business, “2010 Data Breach Investigations
Report”. A study conducted by the Verizon RISK Team
in cooperation with the United States Secret Service.
DOI=http://www.verizonbusiness.com/resources/reports/
rp_2010-data-breach-
report_en_xg.pdf?&src=/worldwide/resources/index.xml
&id=
[ 7 ] Simson Garfinkel, “Anti-Forensics: Techniques,
Detection and Countermeasures”, 2
nd
International
Conference in i-Warefare and Security, pp 77, 2007
[ 8 ] W.Matthew Hartley, “Current and Future Threats to

Digital Forensics”, ISSA Journal, August 2007
[ 9 ] Murray Brand, (2007), “Forensics Analysis Avoidance
Techniques of Malware”, Edith Cowan University,
Australia.
[ 10 ] “Security 101: Botnets”. DOI=
http://www.secureworks.com/research/newsletter/2008/0
5/
[ 11 ] Common Vulnerabilities and Exposures (CVE) database,
http://cve.mitre.org/
[ 12 ] Tim Newsham, Chris Palmer, Alex Stamos, “Breaking
Forensics Software: Weaknesses in Critical Evidence
Collection”, iSEC Partners http://www.isecpartners.com,
2007
[ 13 ] Guidance Software: Computer Forensics
Solution
s and
Digital Investigations
(http://www.guidancesoftware.com/)
[ 14 ] S. Srinivasan, “Security and Privacy vs. Computer
Forensics Capabilities”, ISACA Online Journal, 2007

[ 15 ] Matthew Geiger, Carnegie Mellon University,
“Evaluating Commercial Counter-Forensic Tools”,
Digital Forensic Research Workshop (DFRWS), 2005
[ 16 ] Xiaoyun Wang and Hongbo Yu, Shandong University,
China, “How to Break MD5 and Other Hash Functions”,
EUROCRYPT 2005, pp.19-35, May, 2005
[ 17 ] How to Change TimeStamp of a File in Windows. DOI=
http://www.trickyways.com/2009/08/how-to-change-
timestamp-of-a-file-in-windows-file-created-modified-
and-accessed/.
[ 18 ] File Signature Table. DOI=
http://www.garykessler.net/library/file_sigs.html,
[ 19 ] McLeod S, “SMART Anti-Forensics”, DOI=
http://www.forensicfocus.com/smart-anti-forensics, .
[ 20 ] Stephen Biggs and Stilianos, “Cloud Computing
Storms”, International Journal of Intelligent Computing
Research (IJICR), Volume 1, Issue 1, MAR, 2010
[ 21 ] U Gurav, R Shaikh, “Virtualization – A key feature of

cloud computing”, International Conference and
Workshop on Emerging Trends in technology (ICWET
2010), Mumbai, India
[ 22 ] U.S .v .Robert Johnson - Child Pornography Indictment.
DOI=http://news.findlaw.com/hdocs/docs/chldprn/usjhns
n62805ind.pdf
[ 23 ] United States of America v. H. Marc Watzman. DOI=
http://www.justice.gov/usao/iln/.../2003/watzman.pdf
[ 24 ] Mark Whitteker, “Anti-Forensics: Breaking the
Forensics Process”, ISSA Journal, November, 2008
[ 25 ] Gary C. Kessler,“Anti-Forensics and the Digital
Investigator”, Champlain College, USA
[ 26 ] Ryan Harris, “Arriving at an anti-forensics consensus:
examining how to define and control the anti-forensics
problem”, DOI= www.elsevier.com/locate/dinn.
Appendix A: Anti-Forensics Tools

The following is a list of some commercial CAF software
packages available on the market. The tools listed below are
intended as examples; none of these tools were purchased or
tested as part of this paper work.
Category Tool Name
Privacy and Secure Deletion Privacy Expert; SecureClean;
PrivacyProtection; Evidence
Eliminator; Internet Cleaner
File and Disk Encryption TruCrypt, PointSec; Winzip 14
Time stamp Modifiers SKTimeStamp; Timestamp
Modifier; Timestomp
Others The Defiler’s Toolkit – Necrofile
and Klimafile; Metasploit Anti-
Forensic Investigation Arsenal
(known affectionately as MAFIA)
Download and read the following articles available in the ACM

Digital Library:
Arduini, F., & Morabito, V. (2010, March). Business continuity
and the banking industry. Communications of the ACM, 53(3),
121-125
Dahbur, K., & Mohammad, B. (2011). The anti-forensics
challenge. Proceedings from ISWSA '11: International
Conference on Intelligent Semantic Web-Services and
Applications. Amman, Jordan.
Write a five to seven (5-7) page paper in which you:
1. Consider that Data Security and Policy Assurance methods
are important to the overall success of IT and Corporate data
security.
a. Determine how defined roles of technology, people, and
processes are necessary to ensure resource allocation for
business
continuity.
b. Explain how computer security policies and data retention
policies help maintain user expectations of levels of business
continuity that could be achieved.
c. Determine how acceptable use policies, remote access
policies, and email policies could help minimize any anti-
forensics
efforts. Give an example with your response.
2. Suggest at least two (2) models that could be used to ensure
business continuity and ensure the integrity of corporate

forensic
efforts. Describe how these could be implemented.
3. Explain the essentials of defining a digital forensics process
and provide two (2) examples on how a forensic recovery and
analysis
plan could assist in improving the Recovery Time Objective
(RTO) as described in the first article.
4. Provide a step-by-step process that could be used to develop
and sustain an enterprise continuity process.
5. Describe the role of incident response teams and how these
accommodate business continuity.
6. There are several awareness and training efforts that could be
adopted in order to prevent anti-forensic efforts.
a. Suggest two (2) awareness and training efforts that could
assist in preventing anti-forensic efforts.
b. Determine how having a knowledgeable workforce could
provide a greater level of secure behavior. Provide a rationale
with
your response.
c. Outline the steps that could be performed to ensure
continuous effectiveness.
7. Use at least three (3) quality resources in this assignment.
Note: Wikipedia and similar Websites do not qualify as quality
resources.
Your assignment must follow these formatting requirements:

· Be typed, double spaced, using Times New Roman font (size
12), with one-inch margins on all sides; citations and references
must follow APA or school-specific format. Check with your
professor for any additional instructions.
· Include a cover page containing the title of the assignment, the
student’s name, the professor’s name, the course title, and the
date. The cover page and the reference page are not included in
the required assignment page length.
The specific course learning outcomes associated with this
assignment are:
· Describe and apply the 14 areas of common practice in the
Department of Homeland Security (DHS) Essential Body of
Knowledge.
· Describe best practices in cybersecurity.
· Explain data security competencies to include turning policy
into practice.
· Describe digital forensics and process management.
· Evaluate the ethical concerns inherent in cybersecurity and
how these concerns affect organizational policies.
· Create an enterprise continuity plan.
· Describe and create an incident management and response
plan.
· Describe system, application, network, and
telecommunications security policies and response.
· Use technology and information resources to research issues in

cybersecurity.
· Write clearly and concisely about topics associated with
cybersecurity using proper writing mechanics and technical
style conventions.

Extracting Authoring Information Based on Keywords andSemant.docx

Empfohlen

Empfohlen

Weitere ähnliche Inhalte

Ähnlich wie Extracting Authoring Information Based on Keywords andSemant.docx

Ähnlich wie Extracting Authoring Information Based on Keywords andSemant.docx (20)

Mehr von mydrynan

Mehr von mydrynan (20)

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

Extracting Authoring Information Based on Keywords andSemant.docx