Secure File Management Using the Public Cloud
A Masters in Cybersecurity Practicum Project
Cecil Thornhill
ABSTRACT
The Project explores the history and evolution of document management
tools through the emergence of cloud computing and documents the
development of a basic cloud computing web based system for secure
transmission and storage of confidential information on a public cloud
following guidance for federal computing systems.
Secure File Management Using the Public Cloud
Masters of Cybersecurity Practicum Project, ISM 6905 – Cecil Thornhill
Masters Project CThornhill v2 final.docx, 7/13/16
Introduction
Background of the Driving Problem – Ur to the Cloud
The Cloud in Context – A New Way to Provide IT
Cloud Transformation Drivers
The Federal Cloud & the Secure Cloud Emerge
Designing a Project to Demonstrate Using the Cloud
Planning the Work and Implementing the Project Design
Findings, Conclusions and Next Steps
References
Source Code Listings
Test Document
Introduction
This paper describes the design and development of a system to support the
encrypted transfer of confidential and sensitive Personally Identifiable Information
(PII) and Personal Healthcare Information (PHI) to a commercial cloud based object
storage system. This work was undertaken as a Practicum project for the Masters in
Cybersecurity program, and as such was implemented within the time limits of a
semester session and was completed by a single individual. The prototype is a basic
web-based system implemented on a commercial cloud object storage service, and it
demonstrates an approach suitable for use by government or private business for the
collection of data subject to extensive regulation, such as HIPAA/HITECH healthcare
data or critical financial data.
A general review of the context of the subject area and history of document
management are provided below, along with a review of the implementation efforts.
Findings and results are provided both for the implementation efforts as well as the
actual function of the system. Due to the restricted time available for this project,
the scope was limited to fit the schedule. Only basic features were implemented per
the design guidance documented below. To explore future options for expansion of
the project, several experiments designed to further analyze the system capacity and
performance are outlined below. These options represent potential future
directions to further explore this aspect of secure delivery of information
technology functions using cloud-based platforms.
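The core operation the prototype demonstrates, sending a document over an encrypted channel into cloud object storage, can be sketched as follows. This is a minimal illustration assuming Amazon S3 and the boto3 SDK; the function names and parameters here are illustrative placeholders, not the project's actual code.

```python
# A minimal sketch of uploading a confidential document to cloud object
# storage, assuming Amazon S3 and the boto3 SDK. TLS protects the object in
# transit; server-side encryption protects it at rest.

def build_upload_args(bucket: str, key: str, data: bytes) -> dict:
    """Assemble put_object arguments, requesting AES-256 server-side
    encryption so the object is stored encrypted at rest."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": data,
        "ServerSideEncryption": "AES256",
    }

def upload_document(bucket: str, key: str, data: bytes) -> None:
    import boto3  # AWS SDK for Python; S3 connections use HTTPS (TLS) by default
    s3 = boto3.client("s3")
    s3.put_object(**build_upload_args(bucket, key, data))
```

A production system would add authentication of the submitting user and audit logging around this call, as discussed in the security controls reviewed later in this paper.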
Background of the Driving Problem – Ur to the Cloud
The need to exchange documents containing important information between
individuals and enterprises is universal in any organized human society.
Since the earliest highly organized human cultures, information about both private
and government activities has been recorded on physical media and exchanged
between parties1. Various private and government couriers were used to exchange
documents in the ancient and classical world. In the West, this practice of private
courier service continued after the fall of Rome. The Catholic Church acted as a
primary conduit for document exchange and was itself a prime consumer of
document exchange services2.
In the West, after the Renaissance, the growth of the modern nation state and
the emergence of early commerce and capitalism were both driven by and
supportive of the growth of postal services open to private interests. The needs of
commerce quickly came to dominate the traffic and shape the evolution of
document exchange via physical media3. In the early United States the critical role of
publicly accessible document exchange was widely recognized by the founders of
the new democracy. The Continental Congress in 1775 established the US Postal
Service to provide document communications services to the emerging new
government prior to the declaration of independence4.
As a new and modern nation, cost-effective, efficient document exchange services
from the new post office were essential to the growth of the US economy5. The
growth of the US as a political and economic power unfolded in parallel with the
Industrial Revolution in England and Europe, as well as the overall transition of the
Western world to what can be described as modern times. New science, new
industry and commerce, and new political urgencies all drove the demand for the
transmission of documents and messages in ever faster and more cost-effective
forms6.
It is within this accelerating technical and commercial landscape that the digital age
was born in the US, when Samuel Morse publicly introduced the telegraph to the
world in 1844 with the famous question “What Hath God Wrought?” sent from the US
Capitol to the train station in Baltimore, Maryland7. Morse’s demonstration was the
result of years of experiment and effort by hundreds of people in scores of countries,
but it has come to represent the singular moment of creation for the digital era and
marks the beginning of the struggle to understand and control the issues stemming
from document transmission in the digital realm. All of the issues we face emerge
from this time forward, such as:
• Translation of document artifacts created by people into digital formats and
the creation of human readable documents from digital intermediary formats.
• The necessity to authenticate the origin of identical digital data sets and to
manage the replication of copies.
• The need to enforce privacy and security during the transmission process
across electronic media.
Many of these problems have similar counterparts in the physical document
exchange process, but some, such as the issue of an indefinite number of identical
copies, were novel, and all of these issues require differing solutions in a physical or
digital environment8. The telegraph was remarkably successful due to its compelling
commercial, social and military utility. As Du Boff and Yates note in their research:
“By 1851, only seven years after the inauguration of the pioneer Baltimore-to-
Washington line, the entire eastern half of the US up to the Mississippi River was
connected by a network of telegraph wires that made virtually instantaneous
communication possible. By the end of another decade, the telegraph had reached
the west coast, as well9, 10 “.
The reach of the telegraph soon went well beyond the borders of the US, or even the
shores of any one continent. In 1858 Queen Victoria sent President
Buchanan a congratulatory telegram to mark the successful completion of the
Anglo-American transatlantic cable project11. Digital documents now had global
scope, and the modern era of document exchange and management had truly
arrived.
The US Civil War would be largely shaped by the technical impact of the telegraph
and railroad. Both the North and South ruthlessly exploited advances in
transportation and communication during the conflict12. Centralization of
information management and the need for confidentiality, integrity, and availability
all emerged as issues. Technical tools like encryption rapidly became standard
approaches to meeting these needs13.
The patterns of technical utilization during the war provided a model for future civil
government and military use of digital communications and for digital document
transmission. The government’s use patterns then became a lesson in the potential
for commercial use of the technology. Veterans of the war went on to utilize the
telegraph as an essential tool in post war America’s business climate. Rapid
communication and a faster pace in business became the norm as the US scaled up
its industry in the late 19th century. Tracking and managing documents became an
ever-increasing challenge, along with other aspects of managing the growing and
geographically diverse business enterprises then emerging.
By the turn of the 20th century the telegraph provided a thriving and vital
alternative to the physical transmission of messages and documents. Most messages
and documents to be sent by telegraph were either entered directly as digital signals
sent originally by telegraphy, or transcribed by a human who read and re-entered
the data from the document. However, all of the modern elements of digital
document communication existed and were in some form of use, including the then
under-utilized facsimile apparatus14.
As the 20th century progressed, two more 19th-century technologies that would
come to have a major impact on document interchange and management continued
to evolve in parallel with the telegraph: mechanical/electronic computation and
photography. Mechanical computation, tracing its origin from Babbage’s Analytical
Engine, would come to be indispensable in tabulating and managing the data needed
to run an increasingly global technical and industrial society15. Photography not
only provided a new and accurate record of people and
events, but with the development of fine grained films in the 20th century, microfilm
would come to be the champion of high density document and hence information
storage media. Despite some quality drawbacks, the sheer capacity and over
100-year shelf life of microfilm made it very attractive as a document storage tool. By the
1930’s microfilm had become the bulk document storage medium of choice for
publications and libraries as well as the federal government16.
The experience with early electronic computers in World War II and familiarity with
microfilm made merging the two technologies appear as a natural next step to
forward thinkers. In 1945 Vannevar Bush, the wartime head of the Office of
Scientific Research and Development (OSRD) would propose the Memex. Memex
was designed as an associative information management device combining
electronic computer-like functions with microfilm storage, but was not fully digital
nor was it networked17. In many ways this project pointed the way to modern
information management tools that were introduced in the 1960’s but not fully
realized until the end of the 20th century.
The commercial release and rapid adoption of modern computer systems such as
the groundbreaking IBM 360 in the 1960’s, and of mini-computer systems in the
1970’s such as the DEC VAX, greatly expanded the use of digital documents and
created the modern concept of a searchable database filled with data from these
documents. The development of electronic document publishing systems in the
1980’s created a “feedback loop” that allowed digital data to go back into printed
documents, generating a need to manage these new documents with the computers
used to generate them from the data and user input. The growth of both electronic
data exchange and document scanning in the 1990’s began to replace microfilm.
Many enterprises realized the need to eliminate paper and only work with
electronic versions of customer documents. The drive for more efficient and
convenient delivery of services as well as the need to reduce the cost of managing
paper records continues to drive the demand for electronic document management
tools. By the 1990’s large-scale document management and document search
systems such as FileNet and its competitors began to emerge into the commercial
market. The emergence of fully digital document management systems in widespread
use by the turn of the 21st century brings the story of document management
into the present day. We now see a predominance of electronic document systems,
and an expectation of quick and universal access to both the data and the documents
as artifacts in every aspect of life, including private and commercial activities and
interactions with the government.
As the demand for large electronic document management infrastructures grew,
the scale of these systems and related IT infrastructure continued to expand, placing
significant cost stress on the enterprise. There was a boom in the construction of
data centers to house the infrastructure. At the same time that the physical data
centers for enterprises were expanding, a new model of enterprise computing was
being developed: Cloud Computing.
The Cloud in Context – A New Way to Provide IT
In 1999 Salesforce popularized the idea of providing enterprise applications
infrastructure via a website, and by 2002 Amazon started delivering computation
and storage to enterprises via the Amazon Web Services platform. Google, Microsoft
and Oracle, as well as a host of other major IT players, quickly followed with their
own versions of cloud computing options. These new cloud services offered the
speed and convenience of web based technology with the features of a large data
center. An enterprise could lease and provision cloud resources with little lead time
and no up-front investment in procurement of system hardware. By 2009
options for cloud computing were plentiful, but there was as yet little generally
accepted evidence about the reasons for the shift or even the risk and benefits18.
What made cloud systems different from earlier timeshare approaches and data
center leasing of physical space? Why were they more compelling than renting or
leasing equipment? While a detailed examination of all the concepts and
considerations leading to the emergence of cloud computing is outside the scope of
this paper, there is a broad narrative that can be suggested based on prior historical
study of technological change from steam to electricity and then to centralized
generation systems. While the analogies may not all be perfect, they can be useful
tools in contextualizing the question of "why cloud computing now?"
In the 19th century, the development of practical steam power drove a revolution in
technical change. The nature of mechanical steam power was such that the steam
engine was intrinsically local, as mechanical power is hard to transmit across
distance19. When electrical generation first emerged at the end of the 19th century,
the first electrical applications tended to reproduce this pattern. Long distance
distribution of power was hard to achieve, and so many facilities used generators
for local power production20.
The nature of electricity was quite different from mechanical power, and so
breakthroughs in distribution were rapid. Innovators such as Tesla and Westinghouse
quickly developed long distance transmission of electricity. This electrical power
distribution breakthrough allowed the rapid emergence of very large centralized power
stations; the most significant of these early centers was the Niagara hydroelectric station21.
Today, most power is generated in large central stations. Power is transmitted via a
complex national grid system. The distribution grid is an amalgam of local and regional
grids22. However, this was not the end of the demand for local generators. In fact, more use
of electricity led to more demand for local generators, but for non-primary use cases
such as emergency power, or for alternate use cases such as remote or temporary power
supplies23, 24.
The way local generation was used changed with the shift to the power grid in ways that
parallel the shift from local data centers to cloud based data center
operations. While it is true that early computers were more centralized, since the mid-70's
emergence of the mini-computer, and then the micro-computer that came to
prominence in the 80's, a much more distributed pattern emerged. The mainframe and
mini-computer became the nucleus of emerging local data centers in every enterprise. As
Local Area Networks emerged they reinforced the role of the local data center as a hub
for the enterprise. Most enterprises in the 1980’s and 90’s had some form of local data
center, in a pattern not totally dissimilar to that of early electric generators.
As the networks grew in scale and speed, they began to shift the patterns of local
computing to emphasize connectivity and wider geographic area of service. When the
commercial Internet emerged in the 1990's the stage was set for a radical change, in much
the same way that the development of efficient electrical distribution across a grid
changed the pattern of an earlier technical system. Connectivity became the driving
necessity for an enterprise competing to reach its supply chain and customers via the new
network tools.
By the turn of the 21st century, firms like Google and Amazon were experimenting with
what they came to consider a new type of computer, the Warehouse Scale Computer. By
2009 this was a documented practical new tool, as noted in Google’s landmark paper
“The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale
Machines”, Luiz André Barroso and Urs Hölzle, Google Inc. 2009. This transition can be
considered as similar to the move to centrally generated electrical power sent out via the
grid. In a similar manner it will not erase local computer resources but will alter their
purpose and use cases25.
As was the case for the change to more centralized electrical generation, by the early
21st century there was considerable pressure on IT managers to consider moving
from local data centers to cloud based systems. For both general computing and for
document management systems this pressure tends to come from two broad source
categories: Technical/Process drivers and Cost drivers. Technical drivers include
the savings in deployment time for servers and systems at all points in the systems
development lifecycle, and cost drivers are reflected in the reduced operational
costs provided by cloud systems26.
Cloud Transformation Drivers
Technical and Process drivers also include considerations such as functional performance
and flexible response to business requirements. The need to be responsive in short time
frames as well as to provide the latest trends in functional support for the enterprise
business users and customers favors the quick start up times of cloud based IT services.
The wide scope of the business use case drivers goes beyond the scope of this paper, but
is important to note.
Cost drivers favoring cloud based IT services are more easily understood in the
context of document management as discussed in this paper. Moving to cloud based
servers and storage for document management systems represents an opportunity
to reduce the Total Cost of Ownership (TCO) of the IT systems. These costs include
not only the cost to procure the system components but also the cost to operate
them in a managed environment controlled by the enterprise. Even if it appears there
is no compelling functional benefit to be obtained by the use of cloud based systems,
the cost factors alone are typically compelling as a driver for the decision to move
document management systems from local servers and storage to the cloud.
As an example of the potential cost drivers, Amazon and other vendors offer a
number of TCO comparison tools that illustrate the case for cost savings from cloud-
based operations. While the vendors clearly have a vested interest in promotion of
cloud based operations, these tools provide a reasonable starting point for an
“apples to apples” estimate of costs for local CPU and storage vs. cloud CPU and
storage options. Considering that the nature of document systems is not especially
CPU intensive, but is very demanding of storage subsystems, this cost comparison is a
good starting point, as it tends to reduce the complexity of the pricing model.
For purposes of comparison here the Amazon TCO model will be discussed below to
examine the storage costs implications for a small (1TB) document store. The
default model from Amazon starts with an assumption of 1 TB of data that requires
“hot” storage (fast access for on-demand application support), full plus incremental
backup, and grows by 1 TB per month in size27. This is a good fit for a modest
document storage system and can be considered a “ballpark” baseline.
Amazon’s tool estimates this storage to cost about $308,981 per year for local SAN
backed up to tape. The tool estimates the same storage using the cloud option costs
about $37,233 for a year. The cost of hot storage alone is estimated at
$129,300 for local storage and $29,035 for Amazon S3 storage. Based on the author’s past
experience in federal IT document management systems, these local storage costs
are generally within what could be considered reasonably relevant and accurate for
private or federal data center storage TCO cost ranges. Processing cost estimates
for servers required in the storage solution are also within the range of typical
mid-size to large data center costs, based on the author’s experience over the past 8
years with federal and private data center projects. Overall, the Amazon tool does appear
to produce estimates of local costs that can be considered reasonably viable for
planning purposes.
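As a quick check on the scale of the savings, the ratios implied by the tool’s estimates can be computed directly. The dollar figures below are the 2016 estimates quoted above, used here only for illustration, not current pricing.

```python
# Savings implied by the Amazon TCO estimates quoted above (2016 figures).
local_total = 308_981   # local SAN with tape backup, per year
cloud_total = 37_233    # equivalent cloud option, per year
local_hot = 129_300     # local "hot" storage alone, per year
s3_hot = 29_035         # Amazon S3 hot storage alone, per year

def savings_ratio(local_cost: float, cloud_cost: float) -> float:
    """Fraction of the local cost saved by the cloud estimate."""
    return 1 - cloud_cost / local_cost

print(f"Total TCO savings:   {savings_ratio(local_total, cloud_total):.0%}")  # about 88%
print(f"Hot storage savings: {savings_ratio(local_hot, s3_hot):.0%}")         # about 78%
```

These ratios, roughly 75 to 90 percent, are what drive the "half to one-quarter the cost" observation discussed below.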
This rough and quick analysis from the Amazon TCO tool gives a good impression of
the level of cost savings possible with cloud-based systems. It serves as an example
of some of the opportunities presented to IT managers faced with a need to control
budgets and provide more services for less cost. The potential to provide the same
services for one-half to one-quarter the normal cost of local systems is very
interesting to most enterprises. When added to the cloud based flexibility to rapidly
deploy and the freedom to scale services up and down, these factors help to explain the
increased preference for cloud based IT deployment. This preference for cloud
computing now extends beyond the private sector to government enterprises
seeking the benefits of the new computing models offered by cloud vendors.
The Federal Cloud & the Secure Cloud Emerge
For the federal customer the transition to Warehouse Scale Computing and the public
cloud can be dated to 2011, when the FedRAMP initiative was established. The
FedRAMP program is based on policy guidance from President Barack Obama’s 2011
paper titled “International Strategy for Cyberspace”28, as well as the “Cloud First” policy
authored by US CIO Vivek Kundra29 and the “Security Authorization of Information
Systems in Cloud Computing Environments”30 memo from Federal Chief Information
Officer, Steven VanRoekel. Together these documents framed the proposed revamp of all
federal Information Technology systems.
In the introduction to his 2011 cloud security memo, VanRoekel provides some concise
notes on the compelling reasons for the federal move to cloud computing:
“Cloud computing offers a unique opportunity for the Federal Government to take
advantage of cutting edge information technologies to dramatically reduce procurement
and operating costs and greatly increase the efficiency and effectiveness of services
provided to its citizens. Consistent with the President’s International Strategy for
Cyberspace and Cloud First policy, the adoption and use of information systems operated
by cloud service providers (cloud services) by the Federal Government depends on
security, interoperability, portability, reliability, and resiliency.30”
Collectively, these three documents and the actions they set in motion have
transformed the federal computing landscape since 2011. As the private sector’s
use of local computing has begun a rapid shift to the cloud, driven by competition
and the bottom line, in the short space of 5 years the entire paradigm for IT in the
federal government of the US has shifted radically. It is not unreasonable to expect
that by 2020, cloud computing will be the norm, not the exception for any federal IT
system. This transition offers huge opportunities, but brings massive challenges to
implement secure infrastructure in a public cloud computing space.
Functionally, the conversion from physical to electronic documents has a number of
engineering requirements, but above and beyond this, there are legal and security
considerations that make any document management system more complex to
implement than earlier databases of disparate facts. Documents as an entity are
more than a collection of facts. They represent social and legal relationships and
agreements. As such the authenticity, integrity, longevity and confidentiality of the
document as an artifact matter. The security and privacy implications of the
continued expansion of electronic exchange of data in consumer and commercial
financial transactions were incorporated into the rules, regulations and policy
guidance included in the Gramm-Leach-Bliley Act of 199931.
A good example of the wide swath of sensitive data that needs to be protected in
both physical and electronic transactions is shown in the “Sensitive Data:
Your Money AND Your Life” web page that is part of the Safe Computing Pamphlet
Series from MIT. As the page notes:
“Sensitive data encompasses a wide range of information and can include: your
ethnic or racial origin; political opinion; religious or other similar beliefs;
memberships; physical or mental health details; personal life; or criminal or civil
offences. These examples of information are protected by your civil rights.
Sensitive data can also include information that relates to you as a consumer, client,
employee, patient or student; and it can be identifying information as well: your
contact information, identification cards and numbers, birth date, and parents’
names.32”
Sensitive data also includes core identity data, aside from the information about any
particular event, account or transaction, personal preferences, or self-identified
category. Most useful documents supporting interactions between people and
business or government enterprises contain Personally Identifiable Information
(PII), which is defined by the Government as:
"...any information about an individual maintained by an agency, including any
information that can be used to distinguish or trace an individual’s identity, such as
name, Social Security number, date and place of birth, mother’s maiden name,
biometric records, and any other personal information that is linked or linkable to
an individual.33"
Identity data is a special and critical subset of sensitive data, as identity data is
required to undertake most of the other transactions, and to interact with essential
financial, government or healthcare services. As such this data must be protected
from theft or alteration to protect individuals and society as well as to ensure the
integrity of other data in any digital system34. In order to protect this PII data, the
Government, through the National Institute of Standards and Technology (NIST),
defines a number of best practices and security controls that form the basis for
sound management of confidential information35. These controls include such concepts as:
• Identification and Authentication - uniquely identifying and authenticating
users before accessing PII
• Access Enforcement - implementing role-based access control and
configuring it so that each user can access only the pieces of data necessary
for the user‘s role.
• Remote Access Control - ensuring that the communications for remote
access are encrypted.
• Event Auditing - monitoring events that affect the confidentiality of PII, such as
unauthorized access to PII.
• Protection of Information at Rest - encrypting the stored information on the
storage disks.
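To make two of these controls concrete, the sketch below illustrates Access Enforcement via a role-to-permission map and Event Auditing by logging every access attempt. The role names and permission sets are hypothetical examples, not NIST-mandated values; a production system would pair this with vetted authentication and encryption components.

```python
import logging
from datetime import datetime, timezone

# Hypothetical role-to-permission map for a small document system.
ROLE_PERMISSIONS = {
    "clerk":   {"read"},
    "officer": {"read", "write"},
    "admin":   {"read", "write", "delete"},
}

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("pii_audit")

def access_pii(user: str, role: str, action: str) -> bool:
    """Access Enforcement: allow only actions permitted for the user's role.
    Event Auditing: record every attempt, authorized or not."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    audit_log.info("time=%s user=%s role=%s action=%s allowed=%s",
                   datetime.now(timezone.utc).isoformat(),
                   user, role, action, allowed)
    return allowed

print(access_pii("alice", "clerk", "read"))    # True: clerks may read
print(access_pii("alice", "clerk", "delete"))  # False: denied, and audited
```

Note that the denied attempt is still written to the audit log; recording unauthorized access attempts is precisely the Event Auditing control described above.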
In addition to these considerations, many enterprises also need to handle
documents that contain both PII and medical records or data from medical records,
or Protected Health Information (PHI). Medical records began to be stored
electronically in the 1990’s. By the early part of the 21st century this growth in
electronic health records resulted in a new set of legislation designed both to
encourage the switch to electronic health records and to set up guidelines and policy
for managing and exchanging these records. The Health Insurance Portability and
Accountability Act (HIPAA) of 1996 creates a set of guidelines and regulations for
how enterprises must manage PHI36.
Building on HIPAA, the American Recovery and Reinvestment Act of 2009 and the
Health Information Technology for Economic and Clinical Health Act (HITECH) of
2009 added additional policy restrictions, and security requirements as well as
penalties for failure to comply with the rules37. These regulations for PHI both
overlap and add to the considerations for data and documents containing PII.
The HITECH law increased the number of covered organizations or “entities” beyond
those under the control of the HIPAA legislation:
“Previously, the rules only applied to "covered entities," including such healthcare
organizations as hospitals, physician group practices and health insurers. Now, the
rules apply to any organization that has access to "protected health information. 38”
HITECH also added considerable detail and clarification as well as new complexity
and even more stringent penalties for lack of compliance or data exposure or
“breaches”. Under HITECH a breach is defined as:
"…the unauthorized acquisition, access, use or disclosure of protected health
information which compromises the security or privacy of such information, except
where the unauthorized person to whom such information is disclosed would not
reasonably have been able to retain such information. 38"
The result of the considerations needed to manage documents that might contain
Sensitive Data, PII or PHI or any combination of these elements is that any
document management system implemented in private or public data centers must
implement a wide range of technical and procedural steps to operate in a secure
manner. Protection of the security, privacy and integrity of the documents and data
in those documents becomes a major part of the challenge to designing, building and
operating any information system. These engineering efforts are essential to
business operations however they also become part of the cost for any system, and
as such can be a considerable burden on the budget of any enterprise.
Designing a Project to Demonstrate Using the Cloud
It is within this context of providing a secure system leveraging cloud-based benefits
that the practicum project described in this paper was designed. The goal of the
project was to demonstrate a viable approach to following the policy guidance as
provided for federal IT systems. To achieve this goal, the first step was to
understand the context as outlined in the discussion above. The next step was to
design a system that followed sound cybersecurity principles and the relevant
policy guidance.
Based on the demand for electronic document management in both private and
government enterprise, a basic document management system was selected as the
business case for the prototype to be developed. Document management provides
an opportunity to implement some server side logic for the operation of the user
interface and for the selection and management of storage systems. Document
management also provides a driving problem that allows for clear utilization of
storage options, and thus can demonstrate the benefits of the cloud based storage
options that feature prominently in the consideration of cloud advantages of both
speed of deployment and lower TCO. These considerations were incorporated in the
decision to implement a document management system as the demonstration
project.
The scope of the system was also a key consideration. Given the compressed time
frame and limited access to developer resources that are intrinsic to a practicum
project, the functional scope of the document management system would need to be
constrained. As a solo developer, the range of features that can be implemented
would need to be limited to the basic functions needed to show proof of concept for
the system. In this case, these were determined to be:
1. The system would be implemented on the Amazon EC2 public cloud for the
compute tier of the demonstration.
2. The system would utilize Amazon S3 object storage as opposed to block
storage.
3. The system would be implemented using commercially available Amazon
provided security features for ensuring Confidentiality, Integrity and
Availability39.
4. The servers used for the project would all be Linux based.
5. The system would feature a basic web interface to allow demonstration of
the ability to store documents.
6. The system would use Public Key Infrastructure certificates generated
commercially to meet the need to support encryption for both web and
storage components.
7. The web components of the prototype would use HTTPS to enforce secure
connections to the cloud based servers and storage.
8. The system would utilize a commercial web server infrastructure suitable for
scaling up to full-scale operation but only a single instance would be
implemented in the prototype.
9. The web components would be implemented in a language and framework
well suited to large-scale web operations with the ability to handle large
concurrent loads.
10. Only a single demonstration customer/vendor would be implemented in the
prototype.
11. The group and user structure would be developed and implemented using
the Amazon EC2 console functions.
12. Only the essential administrative and user groups would be populated for the
prototype.
13. The prototype would feature configurable settings for both environment and
application values set by environment, files, and Amazon settings tools. The
current prototype phase would not introduce a database subsystem expected
to be used to manage configuration in a fully production ready version of the
system.
14. Data files used in the prototype would be minimal versions of XML files
anticipated to be used in an operational system, but would only contain
structure and minimal ID data not full payloads.
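A minimal XML test file of the kind described in item 14, containing only structure and ID data, might look like the following sketch (element and attribute names here are hypothetical assumptions, not the project's actual schema):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical minimal test document: structure and IDs only, no payload -->
<document id="TEST-0001">
  <vendor id="VENDOR-001"/>
  <payload/>
</document>
```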
In the case of a narrowly scoped prototype such as this demonstration project, it is
equally critical to determine what functionality is out of scope. For this system the
list included the following:
• The web interface would be left in a basic state to demonstrate proof of
function only. Elaboration and extension of the GUI would be outside the
scope of the work for this prototype project.
• There would be no restriction on the documents to be uploaded. Filtering
vendor upload would be outside the scope of work for this prototype.
• Testing uploads with anti-virus/malware tools would be outside the scope of
this prototype project.
• Security testing or restriction of the client would be outside the scope of this
project. The URL to access the upload function would be open for the
prototype and the infrastructure for user management would not be
developed in the prototype.
• Load testing and performance testing of the prototype would be outside the
scope of this phase of the project.
• No search capacity would be implemented to index the data stored in the S3
subsystem in the prototype project.
Proof of concept was thus defined as:
A) The establishment of the cloud based infrastructure to securely store
documents.
B) The implementation of the required minimal web and application servers
with the code required to support upload of documents.
C) The successful upload of test documents to the prototype system using a
secure web service.
While the scope of the project may appear modest and the restrictions for the phase
to be implemented in the practicum course period numerous, these scope limitations
proved vital to completion of the project in the anticipated period.
The subtle challenges to implementation of this proof of concept feature set proved
more than adequate to occupy the time available and provided considerable scope
for learning and valuable information for future projects based on cloud computing,
as detailed in the subsequent sections of this paper.
Planning the Work and Implementing the Project Design
To move to implementation, the next phase of the Software Development Lifecycle
(SDLC), the requirements and scope limitations listed above were used to develop a
basic project plan consisting of two main phases:
A) The technical implementation of the infrastructure and code through to proof of
concept.
B) The documentation of the project work and production of this report/paper.
Project management of the implementation process is a critical success factor for
any enterprise, no matter how large or small. This is especially true for cloud
computing projects, as they often represent a significant departure from an
enterprise's existing IT systems and processes. This was the case in this project as
well.
While no formal Gantt or PERT chart was developed for the project plan, as there
was no need to transmit the plan to multiple team members, an informal breakdown
was used to guide the technical implementation in an attempt to keep it on
schedule:
Week 1: Establish the required Amazon EC2 accounts and provision a basic
server with a secure management account for remote administration
of the cloud systems.
Week 2: Procure the required PKI certificates and then configure the
certificates needed to secure access to the servers, and any S3 storage
used by the system. Configure the S3 Storage.
Week 3: Obtain and install the required commercial web server and
application server to work together and utilize a secure HTTP
configuration for system access. Implement any language framework
needed for application code development.
Week 4: Research and develop the required application code to demonstrate
file upload and reach proof of concept. Create any required data files
for testing.
Weeks 5-8: Document the project and produce the final report/paper.
In practice this proposed 8 week schedule would slip by about 4 weeks due to about
2 weeks of extra work caused by the complexity and unexpected issues found in the
system and code development implementation and about 2 weeks of delays in the
write up caused by the author’s relocation to a new address. These delays in
schedule are not atypical of many IT projects. They serve to illustrate the
importance of both planning and anticipation of potential unexpected factors when
implementing new systems that are not well understood in advance by the teams
involved. Allowing slack in any IT schedule, and especially those for new systems is
key to a successful outcome as it allows flexibility to deal with unexpected aspects of
the new system.
The very first task to be undertaken in the execution of the project plan was to
establish the required Amazon Elastic Compute Cloud (EC2) accounts. EC2 is the
basic cloud infrastructure service provided by Amazon. This
service provides user management, security, system provisioning, billing and
reporting features for Amazon’s cloud computing platform. It is the central point for
administration of any hosted project such as the prototype under discussion in this
paper40.
Because the author was an existing Amazon customer with prior EC2 accounts, the
existing identification and billing credentials could be used for this project as well.
Both identity and billing credentials are critical components for this and any other
cloud based project on Amazon or any other cloud vendor. It is axiomatic that the
identity of at least one responsible party, either an individual or institution, must be
known for the cloud vendor to establish systems and accounts in its infrastructure.
This party acts as the “anchor” for any future security chain to be established. The
primary account will act as the ultimate system owner and will be responsible for
the system’s use or abuse and for any costs incurred. Below is an example home
screen for the author’s project on EC2:
Responsibility for costs is the other key aspect of the primary EC2 account. While
cloud computing may offer cost savings benefits, it is by no means a free service.
Every aspect of the EC2 system is monetized and tracked in great detail to ensure
correct and complete billing for any features used by an account holder. Some basis
for billing must be provided at the time any account is established. In the case of this
project all expenses for the EC2 features used would be billed back to the author’s
credit account previously established with Amazon.
In any cloud project it is vital that each team member committing to additional
infrastructure have the understanding that there will be a bill for each feature used.
Amazon and most cloud vendors offer a number of planning and budgeting tools for
projecting the costs of features before making a commitment. This is helpful, but is
not a substitute for clearly communicating and planning for costs in advance among
the development team members and project owners, stakeholders and managers. In
the case of this project, while the author did reference the budgeting tools to note
costs estimates, communication and decisions were simple due to the singular team
size. Below is an example of the billing report console:
Establishment of the basic account for the project was, as indicated, simple due to
the author having an existing EC2 account. To provision a server, it was necessary to
determine the configuration most appropriate for the project’s needs, and then
determine the Amazon Availability Zone where the server should be located. The
server configuration would be decided by estimating the required performance
characteristics needed to host the required software and execute the application
features for the anticipated user load.
In this case, all these parameters were scoped to be minimal for the prototype to be
created, reducing the capacity of virtual server required. Based on the author’s
experience with Linux servers a small configuration would meet the needs of the
project. Using the descriptive materials provided by Amazon detailing the server
performance, a modest configuration of server was selected to host the project:
• t2.micro: 1 GiB of memory, 1 vCPU, 6 CPU Credits/hour, EBS-only, 32 bit or
64-bit platform41
When the server was provisioned, Red Hat was selected as the OS. Other Linux
distributions and even Windows operating systems were available from Amazon
EC2. Red Hat was selected in order to maintain maximum compatibility with the
federal systems currently approved for production use, per the author's personal
experience. Use of Red Hat Linux also makes
getting support and documentation of any open source tools from the Internet
easier as this is a popular distribution for web based systems. Below is a release
description from the virtual instance as configured on EC2 for this project:
By default the server was provisioned in the same zone as the author’s prior EC2
instances, which was us-west-2 (Oregon). An Availability Zone (zone) is the Amazon
data center used to host the instance. Availability zones are designed to offer
isolation from each other in the event of service disruption in any one zone. Each
zone operates to the published Service Level Agreement provided by Amazon42.
Understanding the concept of zone isolation and the key provisions of the SLA
provided by a cloud vendor are important to the success of any cloud based project.
Highly distributed applications or those needing advanced fault tolerance and load
balancing might choose to host in multiple zones.
For the purposes of this project a single zone and the SLA offered by Amazon were
sufficient for successful operation. However, the default zone allocation was
problematic and was the first unexpected implementation issue. Almost all EC2
features are offered in the main US zones, but us-east-1 (N. Virginia) does have a few
more options available than us-west-2 (Oregon). In order to explore the
implications and effort needed to migrate between zones and ensure access to all
potential features, the author decided to migrate the project server to the us-east-1
zone.
Migration involved a backup of the configured server, which appeared to be a
prudent operational activity anyway. Following the backup, the expectation was that
the instance could be restored directly in the desired location and then the old
instance could be removed. In general this expectation proved to be sound, but the
exact steps were not so direct. Some of the complexity was strictly due to needing to
allow for replication time. Some of the complexity proved to be due to the use of an
Elastic IP address that creates a public IP address for the server.
An AWS Elastic IP provides a static public IP that can then be associated with any
instance on EC2, allowing public DNS configuration to then be re-mapped as needed
to any collection of EC2 servers. The author had a prior Elastic IP and expected to
just re-use it for this project, but as noted in the AWS EC2 documentation “An Elastic
IP address is for use in a specific region only43”. This created an issue when the
instance was migrated across zones.
Once the problem was understood, the solution was to release the old Elastic IP and
generate a new Elastic IP that could be mapped using DNS. This new Elastic IP could
be associated with the servers now restored to the us-east-1 (N. Virginia). This step
wound up taking quite a bit of time to debug and fix in the first week, and led to the
next unexpected issue with DNS.
None of this work was so complex as to put the project at risk. This required IP
change does illustrate the fact that understanding the SLA and restrictions of each
cloud feature is critical. Small issues like requiring a change of IP address can have
big implications for other work in a project. Decisions to provision across zones are
easy in the cloud, but can have unintended consequences, such as this IP address
change and the subsequent work in DNS that generated. All of these issues take
resources and cost time in a project schedule.
An existing domain, Juggernit.com, already registered to the author was the
expected target domain. Since one of the requirements for the project was to obtain
a PKI certificate for the project site, it was essential to have a publicly registered
Internet domain to use for the PKI. Once the public IP was re-established in the new
us-east-1 zone, and connectivity was confirmed by accessing the instance over SSH,
the next unexpected task was moving the DNS entries for the instance from the current
registrar. This would also include learning to configure the Amazon Elastic Load
Balancer and then map the domain to it.
The load balancer forwards any HTTP or HTTPS traffic to the HTTPS secure instance.
The HTTPS instance is the final target for the project. Amazon Elastic Load
Balancing is a service that both distributes incoming application traffic across
multiple Amazon EC2 instances, and allows for complex forwarding to support
forcing secure access to a domain. In this instance, while the project would not have
many servers in the prototype phase, the use of load balancing would reflect the "to
be" state of a final production instance and allow secure operations even in the
development and preliminary phases of the project used for the practicum scope.
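The forced-HTTPS behavior can be sketched at the application level as well. The following is a hypothetical illustration (the function name and request shape are assumptions, not the project's code) of how an app behind a load balancer typically detects plain-HTTP traffic via the X-Forwarded-Proto header the balancer sets, and redirects it:

```javascript
// Hypothetical sketch: behind a load balancer, the original scheme arrives
// in the X-Forwarded-Proto header; plain-HTTP requests get a 301 redirect.
function forceHttps(req) {
  // req is assumed to be a plain object: { headers, host, url }
  if (req.headers['x-forwarded-proto'] !== 'https') {
    return { status: 301, location: 'https://' + req.host + req.url };
  }
  return null; // already secure; fall through to the route handler
}

console.log(forceHttps({
  headers: { 'x-forwarded-proto': 'http' },
  host: 'juggernit.com',
  url: '/upload',
}).location); // https://juggernit.com/upload
```

In the prototype this enforcement was handled by the load balancer itself, so application code like this serves only to make the forwarding behavior concrete.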
The load balancer configuration would require a domain record of the form:
juggerload1-123781548.us-east-1.elb.amazonaws.com (A Record)
As noted in the Amazon web site, you should not actually use an “A Record” in your
DNS for a domain under load balancing:
Because the set of IP addresses associated with a LoadBalancer can change over
time, you should never create an "A record” with any specific IP address. If you want
to use a friendly DNS name for your load balancer instead of the name generated by
the Elastic Load Balancing service, you should create a CNAME record for the
LoadBalancer DNS name, or use Amazon Route 53 to create a hosted zone. For more
information, see Using Domain Names With Elastic Load Balancing44.
The Juggernit.com domain was being managed by Network Solutions. Unfortunately
the GUI used by Network Solutions did not allow entry of the CNAME record format
needed for EC2. This required moving the domain out of the control of
Network Solutions and into the Amazon Route53 domain management service. The
Route 53 service has a variety of sophisticated options, but most critically, it
interoperates well with other Amazon EC2 offerings including the load balancing
features45.
Route 53 is a good example not only of an unexpected issue that must be overcome
to migrate to the cloud, but of how the nature of the cloud platform creates a small
"ecosystem" around the cloud vendor. Even when striving for maximum standards
compliance and openness, cloud platform offerings such as load balancing tend to
create interoperation issues with older Internet offerings like the DNS services
from Network Solutions, which date from the origin of the commercial Internet. The
author had used Network Solutions DNS since the late
1990’s, but in this instance there was no quick path to a solution other than
migration to the Amazon Route 53 offering.
The Juggernit.com domain would need to be linked to the public IP of the instance,
and pragmatically this was only achievable via Route 53 services. Once the situation
was analyzed after consultation with both Network Solutions and Amazon support,
the decision to move to Route 53 was made. The changes were relatively quick and
simple using the Network Solutions and Amazon web consoles. Waiting for the DNS
changes to propagate imposed some additional time, but as with the zone migration,
the delay was not critical to the project schedule.
With the server, public IP address and DNS issues resolved PKI certificate
generation could be attempted. The author was relatively experienced in generation
and use of PKI credentials, but once again the continued evolution of the Internet
environment and of cloud computing standards was to provide unexpected
challenges to the actual implementation experience.
There are many vendors offering certificates suitable for this practicum project,
including Amazon’s own new PKI service. The author selected Network Solutions as
a PKI provider. Using another commercial certificate vendor offered an opportunity
to explore the interoperation of Amazon’s platform with other public offerings.
Network Solutions also has a long history with the commercial Internet and has a
well-regarded if not inexpensive certificate business46.
The certificates were issued in a package including both the typical root certificate
most Internet developers are used to, as well as a number of intermediate
certificates that were less familiar to the author. In most cases inside an enterprise,
certificates are issued for enterprise resources by trusted systems and all the
intermediate certificates are often in place already. This was not the case for the
Amazon EC2 infrastructure for this project. In this instance, not only was the root
certificate needed, but also all the intermediates must be manually bundled into the
uploaded package47. This was a new process for the author and management of
intermediate certificates represented another unexpected task.
The need to include the intermediate certificates in the upload to Amazon was not
immediately apparent and debugging the reason why uploading just the root
certificate did not work (as with prior systems) was going to involve a major
research effort and many hours of support diagnostics with each vendor involved.
To make the issue more complex, there was documentation the Amazon support
team found for some certificate vendors and there was documentation for cloud
service vendors found by Network Solutions support, but neither firm had
documents for working with certificates or cloud services from the other – this was
the one case not documented anywhere.
The Network Solutions certificates were issued using a new naming format that did
not match the older Network Solutions documentation, making it difficult to identify
the proper chaining order. Amazon support was also not certain what order would
constitute a working package. A number of orders had to be tried and tested one at a time and
then the errors diagnosed for clues as to the more correct order needed in the
concatenate command. On top of this, the actual Linux command to concatenate and
hence chain the certificates was not exactly correct when attempted. This was due
to the text format at the end of the issued certificates. Manual editing of the files was
needed to fix the inaccurate number of delimiters left in the resulting text file.
The final command needed for the Amazon load balancer was determined to be:
> amazon_cert_chain.crt; for i in DV_NetworkSolutionsDVServerCA2.crt \
DV_USERTrustRSACertificationAuthority.crt AddTrustExternalCARoot.crt ; do \
cat "$i" >> amazon_cert_chain.crt; echo "" >> amazon_cert_chain.crt; done
This back and forth diagnostic work for certificate chains represented a major
unexpected source of complexity and extra work. Again, this did not disrupt the
execution schedule beyond a recoverable limit. The experience with certificate
chaining was a valuable learning opportunity on the pragmatic use of PKI tools. The
author has subsequently come across a number of federal IT workers encountering
these challenges as more and more systems start to include components from
outside vendors in the internal enterprise infrastructure.
After the installation of the certificates, the next major configuration tasks were the
installation and configuration of the web server and the application server
platforms on the EC2 instance. Nginx is the web server used on the project, and
Node.JS and the Express framework are used as the application server. Each of these
subsystems provided further opportunities for learning as they were installed.
Nginx was selected to provide an opportunity to gain experience with this very
popular commercial platform as well as due to its reputation for high performance
and excellent ability to scale and support very high traffic web sites. Nginx was
designed from the start to address the C10K problem (10,000 concurrent
connections) using an asynchronous, non-blocking, event-driven connection-
handling algorithm48. This is very different from the approach taken by Apache or
many other available web servers. In the author’s experience many web sites that
start out with more traditional web servers such as Apache experience significant
scale issues as they grow due to high volumes of concurrent users. Starting with
Nginx was an attempt to avoid this problem by design, though installation and
configuration of the web server was more complex.
The open source version of Nginx was used for the project, as a concession to cost
management. Downloading the correct code did prove to be somewhat of an issue,
as it was not easy to find the correct repositories for the current package and then it
turned out the application had to be updated before it could function. It was also
critical to verify the firewall status once the system was providing connections.
The Amazon install of Red Hat Linux turns out to disable the default firewall and
instead to use the Amazon built-in firewall for the site. This actually provides a very
feature-rich GUI firewall configuration, but is another non-standard operations
detail for those familiar with typical Red Hat stand-alone server operations. The
firewall was another implementation detail that could not easily be anticipated.
After the firewall was sorted out there remained considerable research to
determine how to configure the Nginx web server to utilize HTTPS based on the
certificates for the domain. Again the issue turned out to be due to the chaining
requirements for the certificate. In this case, Nginx needed a separate and different
concatenated package in this format:
cat WWW.JUGGERNIT.COM.crt AddTrustExternalCARoot.crt \
DV_NetworkSolutionsDVServerCA2.crt DV_USERTrustRSACertificationAuthority.crt \
>> cert_chain.crt
After determining the correct concatenation format needed for Nginx and making
the appropriate uploads of concatenated files, HTTPS services were available end to
end. However, Nginx does not provide dynamic web services. To serve dynamic
content it would be necessary to install and configure the Node.JS Web Application
Server and the Express framework.
Node.JS (Node) is an open source server-based implementation of the JavaScript
language originally developed by Ryan Dahl in 2009 using both original code and
material from the Google V8 JavaScript engine. Most significantly, Node is event-
driven, and uses a non-blocking I/O model. This makes Node both very fast and very
easy to scale. Node is extremely well suited to situations like the C10K problem and
to web sites that must scale quickly and efficiently. Being based on JavaScript, Node
is object-oriented and offers a huge open source support base of modules and
libraries, accessed using the Node Package Manager (NPM).
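The non-blocking model can be illustrated with a toy, synchronous simulation of an event loop (purely illustrative; Node's real event loop lives in its C++ runtime, not user code): a single thread drains a queue of tasks, and a connection waiting on I/O simply enqueues its follow-up work instead of blocking the thread.

```javascript
// Toy simulation of an event-driven loop: one thread drains a task queue,
// and "I/O waits" enqueue follow-up work rather than blocking.
function runEventLoop(queue) {
  const log = [];
  while (queue.length > 0) {
    const task = queue.shift();
    log.push(task.name);
    if (task.next) queue.push(task.next); // follow-up runs after pending tasks
  }
  return log;
}

const log = runEventLoop([
  { name: 'accept connection A', next: { name: 'reply to A after I/O' } },
  { name: 'accept connection B', next: { name: 'reply to B after I/O' } },
]);
console.log(log.join(' -> '));
// accept connection A -> accept connection B -> reply to A after I/O -> reply to B after I/O
```

Connection B is accepted before A's reply completes, which is why one slow client never stalls the others, the essence of the C10K advantage.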
Express is a minimal and flexible Node.js web application framework based on many
of the ideas about web site design and development taken from the Ruby on Rails
framework project. Express offers a set of standard libraries and allows users to mix
in many other NPM tools to create web sites based on the original Ruby on Rails
principle of "convention over configuration" by providing a common structure for
web apps49.
Installation of Node on the server was done using the standard Red Hat Package
Manager tools. Once Node is installed, the Node Package Manager (NPM) system can
be used to bootstrap load any other packages such as the Express framework. In a
production system it is expected that the web server and the application server
would be hosted on separate hardware instances, but since the practicum was to be
subject to only a small load, both servers can run on the same instance of Linux with
little impact.
While Node comes with its own dynamic web server to respond to requests for
dynamic web content, it is not well suited to heavy-duty serving on the front end.
Nginx is designed for the task of responding to high volumes of initial user inquiries.
The combination of a high performance web server (Nginx) and some number (N)
application server instances (such as Node) is a widely accepted pattern that
supports large scale web systems. Implementation of this design pattern was a goal
of the prototype: to pre-test integration of all the constituent components even prior
to any load testing of the system. Deployment and configuration of Nginx and Node to
the single Linux server fulfills this requirement and provides a working model that
can be expanded to multiple servers as needed in the future.
In order to smoothly transfer web browser requests from users to the application
server domain, the web server must act as a reverse proxy for the application server.
To accomplish this with Nginx requires the addition of directives to the Nginx
configuration file inside the “server” section of the configuration file. These
commands will instruct the web server to forward web traffic (HTTPS) request for
dynamic pages targeted at the DNS domain from Nginx to Node.JS. This is a
relatively standard forwarding for Nginx and only requires a small amount of
research to verify the correct server configuration directive as shown in this
example from the Nginx documentation:
server {
    # here is the code to redirect to node on 3000
    location / {
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header Host $http_host;
        proxy_pass "http://127.0.0.1:3000";
    }
}
Note that this is just an example for use on localhost with a Node.JS engine running on port 3000 (any port will suffice). The critical issue is to configure Nginx to act as a reverse proxy to the Node.JS engine. Nginx will then send traffic to the configured port for the Node.JS application instance. Node.JS and Express then use a RESTful approach to route requests to the application logic based on parsing the URL.
The reverse proxy configuration ensures that when traffic comes into the Nginx server in the form “HTTPS://Juggernit.com/someurl” it is handled by the appropriate logic section of the Node.JS application as configured in the Express framework. The Express listener catches the traffic on port 3000 and uses the Express route handler code to parse the URL after the slash and ensure that the proper logic for that route is launched to provide the requested service. This is a well-established RESTful web design pattern, first widely popularized in Ruby on Rails and since adopted by a number of web frameworks for languages such as Java, Node and Python.
Implementing this pattern requires that both Nginx and Node be installed on the server as a prerequisite. In addition, the Express web application framework used by Node must also be loaded to allow at least a basic test of the forwarding process. All of this code is available as open source, so access to the needed components was not a blocker for the project. Each of these components was first loaded onto the Author’s local Unix system (a MacBook Pro running OS X). This allowed for independent and integration testing of the Nginx web server, the Node application server and the Express web framework. By altering the configuration file and adding the appropriate directives as noted above, the reverse proxy configuration and function could also be tested locally against the localhost IP address.
After validating the configuration requirements locally on the Author’s development station, the web server and application server both needed to be installed on the cloud server. As noted above, Nginx had already been loaded on the cloud server earlier to allow for configuration of the domain and HTTPS secure access to the site. This left only the installation of the Node and Express application server components. While conceptually easy, in practice loading Node also presented unexpected challenges. The 7.x Red Hat version of Linux installed on the cloud server supports Node in the RPM package manager system. However, the available RPM version was only a 0.10.xx version. The current version of Node is
4.4.x. The stable development version installed on the Author’s local system was
4.4.5 (provided from the Node web site).
There are substantial syntax and function differences between the earlier version of Node and the current version. This required that the Node install on the cloud server be updated, which proved to require help from the Amazon support team, as following the default upgrade instructions did not work. Again, the delay was not large, but it cost a couple of days between testing, exploration of options, and final correction of the blocking issues. The final install of a current 4.4.x version of Node required a complete uninstall of the default version, as upgrading in place resulted in locked RPM packages.
After cleaning up the old install and loading the new Node version, the cloud server conformed to the required Node version. The Express framework was loaded on the server via the standard command line Node Package Manager (NPM) tool. A simple “Hello World” test web application was created in Express/Node and again the function of both the Nginx and Node servers was validated.
To verify web and application server function, an Amazon firewall change was required to allow Node to respond directly to traffic pointed at the IP address of the server and the port number (3000) of the Node server. This firewall rule addition allowed testing of HTTPS traffic targeted at the domain name, which was served by Nginx. HTTP traffic directed to the IP address and port 3000 could then be tested at the same time, as this traffic was served by the test Node/Express application.
To complete the integration, the next step was to reconfigure the Nginx server to act as a reverse proxy. The Nginx configuration file was backed up, the reverse proxy directives shown above were added, and Nginx was reloaded to reflect the changes. At this point, Nginx no longer provided its default static web page in response to requests sent to HTTPS://Juggernit.com. Instead, Nginx forwarded the HTTPS traffic to the Node application server, still under the secure connection, and Node responded with the default “Hello World” page as configured in the Express test application. This state represented a complete integration of Nginx and Node for the project. The server was backed up, and the next stage of work, implementing the upload logic to store data on the Amazon S3 object store, could continue.
The two major tasks required to finish the site configuration and functional completion of the prototype project were:
• Establishment of an Amazon S3 storage area (known as a “bucket” on Amazon)
• Coding server and client logic to access the S3 storage via HTTPS
The first of these tasks could be accomplished directly via the Amazon EC2 management console. For the prototype there was no requirement for a custom web interface to create S3 storage, and no requirement for any automatic storage assignment or management. In a fully realized production application, application-based management of storage might be desirable, but this is a system feature requirement highly subject to enterprise policy and business case needs. However, even when using the Amazon interface to manage S3 storage as in this project, there was still a need to consider the user and group structure in order to manage access security to the S3 storage.
As discussed earlier in the paper, a default EC2 account assumes that the owner is granted all access to all resources configured by that owner in the Amazon cloud infrastructure. For this reason, it is important to create separate administrative accounts for resources that require finer-grained access and might also require access restrictions. In a fully realized web application hosted on local servers, this user and group management is often done at the application level. For this prototype these considerations were to be managed through the Amazon EC2 interface.
Prior to setting up a storage area on the S3 object storage, an administrator group named “admins” was created, with full permissions to manage the site resources. Another group called “partners”, with access to the S3 storage but not to the other site resources used for management of servers, was created. A user named “testone” was then created and added to the “partners” group. The Author used the primary Amazon identity to build and manage the site, but the administrative group was constructed so that any future web-based management functions could be separated from the user-oriented functions of the prototype web application.
With the users and groups established, the S3 storage called “ctprojectbucketone”
was created using the standard Amazon GUI. Below is a screenshot showing this
bucket:
To manage access rights, the S3 storage was then assigned a Cross-Origin Resource Sharing (CORS) access policy that allowed GET, POST and PUT requests on the S3 storage, as shown below:
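A CORS policy of the kind described, in the XML form the S3 console accepted at the time, would look roughly like the following sketch; the wildcard origin and header are illustrative assumptions, and a production policy would restrict them to trusted origins:

```xml
<CORSConfiguration>
  <CORSRule>
    <!-- Illustrative: a real policy should list only trusted origins -->
    <AllowedOrigin>*</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedMethod>POST</AllowedMethod>
    <AllowedMethod>PUT</AllowedMethod>
    <AllowedHeader>*</AllowedHeader>
  </CORSRule>
</CORSConfiguration>
```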
The “partners” group was assigned access to this storage by providing its members with the resource keys. With the creation of the S3 Object Storage “bucket”, the remaining task to reach functional proof of concept for the prototype project was to construct the JavaScript application code to access the S3 storage bucket securely from the Internet.
To create the logic for bucket access there were a number of prerequisite steps not emphasized so far. The most significant of these was to develop at least a basic familiarity with Node.JS and JavaScript. While the Author possesses several years of experience using JavaScript in a casual manner for other web applications, site development in JavaScript was a very different proposition. Node also has its own “ecosystem” of tools and libraries, much like any emerging open source project. Some understanding of these was also essential to succeed in creating the code required to achieve proof of concept function for the prototype site.
As a starting point the main Node site, https://nodejs.org/en/, provided an essential reference. In addition the Author referenced two very useful textbooks:
• Kiessling, Manuel. The Node Beginner Book. Available at: http://www.nodebeginner.org (2011). [last accessed: 18 March 2013]
• Kiessling, Manuel. The Node Craftsman Book. Available at: https://leanpub.com/nodecraftsman (2015). [last accessed: 25 October 2015]
These proved essential in providing both background on Node and some guidance on the use of the Express application framework. In addition, a number of other small Node library packages were key to creating the required code, specifically:
• Node Package Manager (NPM) – a Node tool for getting and managing Node packages (libraries of functions). https://www.npmjs.com
• Express – a Node library providing an application framework for RESTful web applications based on concepts from Ruby on Rails. https://expressjs.com
• Dotenv – a Node library that loads environment variables from a configuration file with the extension .env. This was used to pass critical values such as security keys for S3 storage in a secure manner from the server to a client. https://www.npmjs.com/package/dotenv
• EJS – a Node library that allows embedded JavaScript in an HTML file. This was used to add the required logic to communicate with the server components of the application and then access the S3 bucket from the client page using values securely passed over HTTPS. https://www.npmjs.com/package/ejs
• AWS-SDK – a Node library provided by Amazon that supports basic functions for accessing the S3 storage service from Node code. https://www.npmjs.com/package/aws-sdk
As a newcomer to Node, the most critical problem in creating this code for the Author was a lack of standard examples of S3 access using a common approach at a sufficiently simple level with clear explanation. There are actually dozens of sample approaches to integrating S3 storage in Node projects, but almost all use very idiosyncratic sets of differing libraries or do not address some critical but basic aspect of the prototype, such as secure access. There are also a number of very sophisticated and complete examples that are almost incomprehensible to the Node novice. This inability to find a clear and functional pattern to learn from caused a major delay of over a week and a half in completion of the final steps of the prototype.
After considerable reading, coding, and searching for reference models, the Author
finally came across a tutorial from Dr. Will Webberly of the Cardiff University
School of Computer Science & Informatics. The author read, studied and analyzed
the example provided. The next step was to create several test programs to adapt
the approach used by Dr. Webberly in the Heroku cloud instance he documented to
a local Node Express project50. After some trial and error and some correspondence
with Dr. Webberly via email, a working set of code emerged.
The final proof of concept was a minimal web application based on the pattern used by Dr. Webberly, running on a cloud-based server as an Express application using local variables on the Amazon EC2 server. The server code provides a RESTful service over HTTPS that allows a client web page executing on a remote PC or device to upload to the S3 storage using HTTPS. Below is a screenshot of some of the server side code:
The upload page logic is provided by the project web site, as is the back end server
logic. Since the client page is running on a remote device, the entire transfer is done
using client resources. The prototype project site provides only context and security
data, but is not used to manage the upload. This frees server side resources from the
work of the transfer and thus creates a higher performance distributed system. The
exchange of logic and credentials is all done over the HTTPS protocol with the client,
as is the subsequent file upload. This provides a secure method of access to the
cloud based S3 storage.
Client side data from the partner is encrypted in transfer and no other parties
besides the partner and the prototype project operations teams have access to the
S3 bucket. For purposes of the prototype only one client identity and one bucket
were produced. In a fully realized system, there could be unique buckets for each
client, subject to the security and business rules required by the use case of the
system.
After establishing that the Node logic was in fact working and successfully uploaded
files to the S3 storage, a small set of sample health records based on the Veterans
Administration Disability Benefits Questionnaires (DBQs)51 were constructed.
Below is a sample of one of these files:
These simulated DBQ records were then uploaded as a test and verified as correct using the Amazon S3 GUI to access the documents. PDF format was used for the test files to make them directly readable via standard viewing tools. Here is a screenshot of the uploaded test files in the Amazon S3 bucket:
This test represents uploading the sort of sensitive and confidential data expected to
be collected and managed in any finished system based on the prototype project.
While basic in function, the creation and upload of these documents provided the final steps in the implementation of this phase of the prototype project. Below is a screen shot showing the selection of a DBQ for upload using the client side web page:
Storing these files represents the completion of the major design goals of the project
and the completion of the implementation phase, and the prototype project itself.
Findings, Conclusions and Next Steps
While achieving the successful secure upload of the test documents to the prototype meets the objectives set out for this project, it represents only the first milestone in extending the system to a more full-featured platform and exploring additional topics of interest in this area. The architecture implemented offers a good example of the latest non-blocking, asynchronous approach to serving web content. These designs exploit CPU resources in very different ways than traditional code and web frameworks, and there is ample room for scale and load testing to measure the actual capacity of these systems to perform on 64-bit architectures.
The asynchronous and distributed client-controlled approach to storage access also provides an opportunity to test the capacity of the S3 interface to support concurrent access. The results should provide tuning direction on the number of S3 storage areas and their partition rules. A larger scale simulation with many more virtual clients would be a natural approach to measuring the capacity of this use pattern.
The web site functions also offer an opportunity to expand the functionality of the system and demonstrate more advanced fine-grained access controls supported by the user and group model. At a minimum, a database of administrators and partners can be created both to lock the site down from casual access and to explore the minimal levels of access needed to still meet all functional needs. Driving each role to the absolute lowest level of privilege will likely require trial and error, but should be a benefit in assuring the site has a minimal profile to any potential attackers.
In addition to these operations oriented future areas of research, once a larger data
set is simulated the ability of the S3 storage to support search indexing on the XML
data is a rich area of exploration. There is emerging federal guidance on the best
practice for meta-data tagging of PII and PHI data, and this prototype would allow
for an easy way to create versions of S3 buckets with a variety of meta-data patterns
and then determine the most efficient search and index options for each with a
higher volume of simulated data. An expanded prototype could act as a test platform
for future production systems, revealing both physical and logical performance
metrics.
Each of these future options provides scope to expand the project, but the basic
implementation also provides some important benefits:
• The implementation of the system shows that it is pragmatic to store sensitive data on a public cloud based system using PKI infrastructure to protect the data from both external and cloud vendor access.
• The design of the prototype shows that modest cloud resources can in fact be
used to host a site with the capacity to provide distributed workload using
HTTPS to secure the data streams and leverage client resources to support
data upload, not just central server capacity.
• The prototype shows that it is relatively easy to use Object Storage to acquire
semi-structured data such as XML. This validates use of an Object Store as a
form of document management tool beyond block storage.
• The establishment of the project in only a few weeks with limited staff hours shows the cost and speed advantages of the cloud as opposed to local physical servers.
• The experience with both the cloud and new web servers and languages
demonstrates the importance of flexible scheduling and allowing for the
unexpected. Even on projects that leverage many off the shelf components
unexpected challenges often show up and consume time and resources.
The prototype produced as a result of this project does meet the guidance for building secure projects on a public infrastructure. It allows PII and PHI data to be transferred to an enterprise via secure web services, and demonstrates an approach that can satisfy many enterprises and the guidelines for HIPAA and HITECH data handling. The architecture used demonstrates how a scalable web service model can be implemented using a cloud infrastructure by a small team in a limited time. The model provides only a basic proof of concept but offers easy opportunities to expand and explore a number of additional questions. As such, the resulting site can be considered a success at meeting its design goals, and the information generated in the site development can be employed by both the Author and others for future work in cloud computing implementation for secure digital document storage.
References
1. Oppenheim, A. L. (Ed.). (1967). Letters from Mesopotamia: Official business,
and private letters on clay tablets from two millennia. University of Chicago
Press. Page 1-10
2. Fang, I. (2014). Alphabet to Internet: Media in Our Lives. Routledge. Page
90-91
3. Noam, E. M. (1992). Telecommunications in Europe (pp. 363-368). New York:
Oxford University Press. Page 15-17
4. Moroney, R. L. (1983). History of the US Postal Service, 1775-1982 (Vol. 100).
The Service.
5. John, R. R. (2009). Spreading the news: The American postal system from
Franklin to Morse. Harvard University Press. Page 1-25
6. Johnson, P. (2013). The birth of the modern: world society 1815-1830.
Hachette UK.
7. Currie, R. (2013, May 29). HistoryWired: A few of our favorite things.
Retrieved May 15, 2016, from http://historywired.si.edu/detail.cfm?ID=324
8. Standage, T. (1998). The Victorian Internet: The remarkable story of the
telegraph and the nineteenth century's online pioneers. London: Weidenfeld
& Nicolson.
9. Yates, J. (1986). The telegraph's effect on nineteenth century markets and
firms. Business and Economic History, 149-163.
10. Du Boff, R. B. (1980). Business Demand and the Development of the
Telegraph in the United States, 1844–1860. Business History Review, 54(04),
459-479.
11. Gordon, J. S. (2002). A thread across the ocean: the heroic story of the
transatlantic cable. Bloomsbury Publishing USA.
12. Ross, C. D. (2000). Trial by fire: science, technology and the Civil War. White
Mane Pub.
13. Bates, D. H. (1995). Lincoln in the telegraph office: recollections of the United
States Military Telegraph Corps during the Civil War. U of Nebraska Press.
14. Coopersmith, J. (2015). Faxed: The Rise and Fall of the Fax Machine. JHU
Press.
15. Cortada, J. W. (2000). Before the computer: IBM, NCR, Burroughs, and
Remington Rand and the industry they created, 1865-1956. Princeton
University Press.
16. Smith, E. (2016, June 14). The Strange History of Microfilm, Which Will Be
With Us for Centuries. Retrieved June 22, 2016, from
http://www.atlasobscura.com/articles/the-strange-history-of-microfilm-
which-will-be-with-us-for-centuries
17. Bush, V. (1945). As we may think. The Atlantic Monthly, 176(1), 101-108.
18. Mohamed, A. (2015, November). A history of cloud computing. Retrieved July
07, 2016, from http://www.computerweekly.com/feature/A-history-of-
cloud-computing
19. Electric Light and Power System - The Edison Papers. (n.d.). Retrieved July 13,
2016, from http://edison.rutgers.edu/power.htm
20. The discovery of electricity - CitiPower and Powercor. (n.d.). Retrieved July 13, 2016, from https://www.powercor.com.au/media/1251/fact-sheet-electricity-in-early-victoria-and-through-the-years.pdf
21. Powering A Generation: Power History #1. (n.d.). Retrieved July 13, 2016,
from http://americanhistory.si.edu/powering/past/prehist.htm
22. Electricity - Switch Energy Project Documentary Film and ... (n.d.). Retrieved
July 13, 2016, from
http://www.switchenergyproject.com/education/CurriculaPDFs/SwitchCur
ricula-Secondary-Electricity/SwitchCurricula-Secondary-
ElectricityFactsheet.pdf
23. Tita, B. (2012, November 6). A Sales Surge for Generator Maker - WSJ.
Retrieved July 13, 2016, from
http://www.wsj.com/articles/SB100014241278873248941045781033340
72599870
24. Residential Generators, 3rd Edition - U.S. Market and World Data. (n.d.).
Retrieved July 13, 2016, from
https://www.giiresearch.com/report/sbi227838-residential-generators-
3rd-edition-us-market-world.html
25. Barroso, L. A., Clidaras, J., & Hölzle, U. (2013). The datacenter as a computer:
An introduction to the design of warehouse-scale machines. Synthesis
lectures on computer architecture, 8(3), 1-154.
26. West, B. C. (2014). Factors That Influence Application Migration To Cloud
Computing In Government Organizations: A Conjoint Approach.
27. Total Cost of Ownership. (2016). Retrieved July 06, 2016, from
http://www.backuparchive.awstcocalculator.com/
28. United States. White House Office, & Obama, B. (2011). International Strategy
for Cyberspace: Prosperity, Security, and Openness in a Networked World.
White House.
29. Kundra, V. (2011). Federal cloud computing strategy.
30. VanRoekel, S. (2011, December 8). MEMORANDUM FOR CHIEF
INFORMATION OFFICERS. Retrieved July 13, 2016, from
https://www.fedramp.gov/files/2015/03/fedrampmemo.pdf
31. Gramm-Leach-Bliley Act of 1999, Pub. L. No. 106-102.
32. What is Sensitive Data? Protecting Financial Information ... (2008). Retrieved
June 19, 2016, from
http://ist.mit.edu/sites/default/files/migration/topics/security/pamphlets/
protectingdata.pdf
33. Government Accountability Office (GAO) Report 08-343, Protecting
Personally Identifiable Information, January 2008,
http://www.gao.gov/new.items/d08343.pdf
34. Wilshusen, G. C., & Powner, D. A. (2009). Cybersecurity: Continued efforts are needed to protect information systems from evolving threats (No. GAO-10-230T). Government Accountability Office, Washington DC.
35. McCallister, E., Grance, T., & Scarfone, K. (2010, April). Guide to Protecting the
Confidentiality of Personally ... Retrieved July 13, 2016, from
http://csrc.nist.gov/publications/nistpubs/800-122/sp800-122.pdf
36. Health Insurance Portability and Accountability Act of 1996, Pub. L. No. 104-191.
37. Graham, C. M. (2010). HIPAA and HITECH Compliance: An Exploratory Study
of Healthcare Facilities Ability to Protect Patient Health Information.
Proceedings of the Northeast Business & Economics Association.
38. Anderson, H. (2010, February 8). The Essential Guide to HITECH Act.
Retrieved June 19, 2016, from
http://www.healthcareinfosecurity.com/essential-guide-to-hitech-act-a-
2053
39. Dimov, I. (2013, June 20). Guiding Principles in Information Security - InfoSec
Resources. Retrieved July 09, 2016, from
http://resources.infosecinstitute.com/guiding-principles-in-information-
security/
40. Amazon Web Services (AWS) - Cloud Computing Services. (n.d.). Retrieved
July 10, 2016, from https://aws.amazon.com/
41. EC2 Instance Types – Amazon Web Services (AWS). (2016). Retrieved July 10,
2016, from https://aws.amazon.com/ec2/instance-types/
42. Regions and Availability Zones. (2016, January). Retrieved July 13, 2016,
from http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-
regions-availability-zones.html
43. Elastic IP Addresses. (2016). Retrieved July 10, 2016, from
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/elastic-ip-
addresses-eip.html
44. AWS | Elastic Load Balancing - Cloud Network Load Balancer. (2016).
Retrieved July 10, 2016, from
https://aws.amazon.com/elasticloadbalancing/
45. AWS | Amazon Route 53 - Domain Name Server - DNS Service. (2016).
Retrieved July 10, 2016, from https://aws.amazon.com/route53/
46. SSL Security Solutions. (2016). Retrieved July 10, 2016, from
http://www.networksolutions.com/SSL-certificates/index.jsp
47. What is the SSL Certificate Chain? (2016). Retrieved July 10, 2016, from
https://support.dnsimple.com/articles/what-is-ssl-certificate-chain/
48. Ellingwood, J. (2015, January 28). Apache vs Nginx: Practical Considerations |
DigitalOcean. Retrieved July 10, 2016, from
https://www.digitalocean.com/community/tutorials/apache-vs-nginx-
practical-considerations
49. Node.js Introduction. (2016). Retrieved July 10, 2016, from
http://www.tutorialspoint.com/nodejs/nodejs_introduction.htm
50. Webberly, W. (2016, May 23). Direct to S3 File Uploads in Node.js | Heroku
Dev Center. Retrieved July 12, 2016, from
https://devcenter.heroku.com/articles/s3-upload-node#summary
51. Compensation. (2013, October 22). Retrieved July 12, 2016, from
http://www.benefits.va.gov/compensation/dbq_disabilityexams.asp
Source Code Listings
App.js – this is the server side logic for the project:
/*
Cecil Thornhill
5/26/2016
Based on code examples and samples from Will Webberly and Amazon for S3
uploads
*/
/*
In learning how to interface to S3 via Node JS and JavaScript I started with code
from a tutorial provided by Dr. Will Webberly, who was a computer science lecturer
at Cardiff University and is now CTO at Simply Di Ideas. Will was kind enough to
correspond with me and address questions on the concepts and use cases involved
in my project. The original article I referenced is at:
https://devcenter.heroku.com/articles/s3-upload-node#initial-setup
*/
/*
This is the main logic for the server side of the proof of concept demo for my project.
The code here supports the features required to allow the client to securely load a
file to the S3 storage site. The simple proof pages and this core logic do not attempt
to implement any user authentication, authorization or administration of the site.
Those functions are pre-selected via the structure of the users and groups built in
the S3 interface for this demo. All these aspects would be expected in a more full
featured site design, but are not required to establish the functional proof of
concept for the main secure upload of files functionality.
*/
/*
Licensed under the Apache License, Version 2.0 (the "License"); you may not use
this file except in compliance with the License. You may obtain a copy of the License
at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed
under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
*/
/*
* Import required packages.
* Packages should be installed with "npm install".
*/
/*
CT - I am using local variables for the development versions of this demo site.
Below I require dotenv to allow local config management, so this demo can
run without setting environment variables on the server, which is the more
correct final operations configuration practice on a deployed system to prevent
exposing the values in the open production environment. Of course it is much
easier to manage local values from this resource file in the development phase,
so that is the way I went for the current demo code.
*/
var dotenv = require('dotenv');
dotenv.load();
/*
To ensure that we got the values we expected I also show the variables now in
process.env - now with the values from the .env added - on the console.
Of course this is not something to do in the final production system.
*/
console.log(process.env)
const express = require('express');
const aws = require('aws-sdk');
/*
* Set-up and run the Express app.
CT - note we are running on port 3000 in this case. It is important to forward your
web traffic from the NGINX server to the proper port via setting up the reverse
proxy configuration in the NGINX server, so that traffic gets through from the web
server to the application server.
*/
const app = express();
app.set('views', './views');
app.use(express.static('./public'));
app.engine('html', require('ejs').renderFile);
app.listen(process.env.PORT || 3000);
/*
* Load the S3 information from the environment variables.