http://dx.doi.org/10.1109/CLOUD.2012.68
A modern field science such as archaeology is heavily data-driven using various kinds of state-of-the-art measurement instruments. It requires sophisticated computer infrastructure to manage large amounts of heterogeneous data. The concept of cloud computing provides a flexible cyber infrastructure for large-scale data management, which is being deployed at university campuses. A problem unique to field research is that researchers often work at remote field sites with limited computer and network resources. For a data management system that has to work in the campus cloud and under vastly different field conditions, portability of computer infrastructure and common data access methods are essential requirements. This paper explores the portability of cloud infrastructure and illustrates the portable data management system that we used in a recent archaeological expedition.
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
Portable Data Management Cloud for Field Science
1. PORTABLE DATA MANAGEMENT CLOUD
FOR FIELD SCIENCE
UC San Diego, Calit2
Yuma Matsui, Aaron Gidding, Thomas E. Levy, Falko Kuester, Thomas A. DeFanti
IEEE Cloud 2012, 6/24/2012
2. CONTENTS
Managing Big Data in Archaeology
Heterogeneous data
Need for data management system
Portability of Data Management Cloud
System in the Wild
5. PORTABLE DATA
MANAGEMENT CLOUD
Need data management system that runs both on
Campus: powerful computers, high-speed network
Field sites: small computers, limited network
6. PORTABLE DATA
MANAGEMENT CLOUD
Need data management system that runs both on
Campus: powerful computers, high-speed network
Field sites: small computers, limited network
Portable data management infrastructure
between field sites and campus with cloud!
Cloud provides flexible computer infrastructure
virtualized environment, ease of deployment, scalability
7. Managing Big Data in Archaeology
Portability of Data Management Cloud
Virtualized environment
Data access
System in the Wild
8. PORTABILITY IN THE
SYSTEM
Goal: streamline data processes over field sites and
campus
Data collection
Data management
Data analysis
What is portability?
Portability of whole system environment
Portability of collected data
Data Collection
Data
Management
Data Analysis
and
Visualization
Field Sites
Campus
DatacenterPortability
9. PORTABLE SYSTEM WITH
CLOUD
IaaS
Fully controllable virtualized environment
Makes whole environment (data and programs) portable
Suitable for our field science needs
PaaS
SaaS
10. DATA ACCESS
Structured data: artifact/site metadata, artifact inventory data, and
total station geo-data
Stored in a database
Accessible with JSON REST API
Raw measurement data: Photos, XRF (X-Ray Fluorescence), FTIR
(Fourier Transform Infrared Spectroscopy), and LiDAR
Stored in an object storage
Accessible with S3-compatible REST API
11. Web-based data management application
All data are accessible with the web application or
REST API. This makes data portable and consumable.
12. Managing Big Data in Archaeology
Between Cloud and Ground
System in the Wild
System components
System workflow
14. SYSTEM WORKFLOW
Field sites
Various data are collected with insruments.
Structured data are put into the database through the web application.
Raw file data are temporarily stored in network-attached storage.
Campus
Data and programs from fields are moved to campus cloud infrastructure.
Data analyses and visualizations are executed with the collected data on high-
performance computers.
Synchronize environments
(VM copy and object storage registration)
register
Total
Station
LIDAR
Artifact
Data
copy
Network
Attached
Storage
Cloud Storage
Photos
Field Sites
Campus
Datacenter
Visualization
Facility
(CAVE,
OptIPortal)
Small Server
Web
App
DB
Virtualization
IaaS Cloud
Web
App
DB
Virtualization
15.
16. CONCLUSION AND
FUTURE WORK
We developed a portable data management infrastructure
for digital archaeology.
It is based on IaaS virtualized hosting environments and
equipped with unified data access methods.
We used the system in an excavation in 2011.
Integration of the system with large-scale analysis and
visualization is in progress.