2. 2
OUTLINE
1. Problems and Our Way Out
2. HydroDaVE Database & Modules
3. Analysis Tools
4. Applications
3. 3 1. Problems and Our Way Out
Problems: Big Challenges of Big Data
4. 4 1. Problems and Our Way Out
Problems: Big Challenges of Big Data
5. 5 1. Problems and Our Way Out
Our Wish and Way Out
A cloud-connected platform to manage, share, visualize, and
analyze proprietary and public water resources data,
including
Geospatial Information:
• Maps.
• Location of wells and monitoring stations.
Temporal Information:
• Groundwater quality, groundwater level elevation,
pumping.
• Surface water discharge and quality.
• Climatic data – weather observations, modeled or
observational raster datasets.
• Modeling results – groundwater models, global
circulation models.
6. 6 1. Problems and Our Way Out
HydroDaVE Managed Service Platform
HydroDaVE Server
Databases, Flat Files Storage, Reporting Services
Internet
HydroDaVE Web Services
Secure Connection with Clients
HydroDaVE Explorer (HDX)
Visualization, Analysis, Reporting
HydroDaVE Manager (HDM)
Data Upload and Management
Hydrologic Database and Visual Explanations
7. 7 1. Problems and Our Way Out
HydroDaVE Managed Service Platform &
Public Data Portals
HydroDaVE Server
Databases, Flat Files Storage, Reporting Services
Internet
HydroDaVE Web Services
Secure Connection with Clients
HydroDaVE Explorer (HDX)
Visualization, Analysis, Reporting
Modules to access public data
HydroDaVE Manager (HDM)
Data Upload and Management
Public Data Portals
For example, USGS NWIS, NWQMC
Public Data Web Services
8. 8 1. Problems and Our Way Out
Web Services
A web service is a service offered
by an electronic device to
another electronic device,
communicating with each other
via the World Wide Web. In a web
service, web technology … is
utilized … for transferring
machine readable file formats
such as XML and JSON.
9. 9 1. Problems and Our Way Out
Example: USGS EPQ Web Service
Request
Response
User
Interface
10. 10 1. Problems and Our Way Out
How HDX utilizes Web Services
The mouse cursor
coordinates are sent to the
USGS EPQ Web Service.
The response from the USGS
EPQ Web Service is decoded
and then displayed.
11. 11 2. HydroDaVE Database & Modules
HydroDaVE Database & Modules
• The HydroDaVE database is a relational database
based on the Structured Query Language (SQL)
and consists of a number of tables.
• The relationships between the tables are
presented by Entity-Relationship-Diagrams (ERD).
• SQL constraints are used to ensure the integrity
and reliability of the data stored in the tables. For
example, the unique constraint prevents duplicate
data.
• A number of tables and a set of HDX/HDM
functionalities, that deal with a specific data type,
are grouped together and referred to as a
module.
12. 12 2. HydroDaVE Database & Modules
A Simple ERD with Two Tables
Country Table
• The unique constraint on Name ensures that no duplicate country
name may be entered.
State Table
• The foreign key constraint on CountryID of the State table prevents
an invalid ID being inserted, because it has to be one of the country
IDs contained in the Country table.
• The unique constraint on the combination of CountryID and Name
ensures that no duplicate State name may be defined for any given
country.
13. 13 2. HydroDaVE Database & Modules
HydroDaVE Modules
• Project (users, map contents, and security).
• Data Tracking (upload status, logs, and original files).
• Reference (online reference files).
• Surface Water (discharge and quality time-series).
• Climate (time-series of weather data and gridded
datasets from NWS, NEXRAD, PRISM, CMIP3/5, etc.).
• Live Link to Public Data Portals (NWIS, NWQMC, etc.).
• Groundwater
14. 14 2. HydroDaVE Database & Modules
Data Types of Groundwater Module
• Wells
• Coordinates,
• Reference elevations,
• Well casing, lithology and geophysical logs, and
• Attributes (such as reference files, well use, well type, owner, etc.)
• Time-series
• Groundwater level elevation,
• Groundwater quality, and
• Production.
• Lookup tables
• Water quality standards,
• Analytes, etc.
17. 17 3. Analysis Tools
Display of Wells
1. Displays and
symbolizes wells
based on various
attributes.
2. Accesses reference
files of individual
wells.
3. Displays time-
series charts.
4. Creates cross-
sections.
5. Displays Piper and
Stiff diagrams.
18. 18 3. Analysis Tools
Multivariate Chart
1. Displays arbitrary
combinations of
groundwater level
elevation,
groundwater quality,
and production time
series charts.
2. Exports time-series
data.
3. Horizontal axes (time)
of all charts are
synced; vertical axes
of individual charts
can be scaled
independently.
19. 19 3. Analysis Tools
Geological Cross-sections
1. Prerequisites
a. Wellsite/Borehole location
and ground surface elevation
b. Lithology logs
2. Optional Data
a. Geophysical logs
b. Well casing information
3. Additional Data
a. Ground Surface Elevation
from the USGS EPQ Web
Service.
b. Predefined lithology symbols
4. Output in bitmap or scalable
vector graphics.
20. 20 3. Analysis Tools
Piper Diagram at Individual Wells
1. Visually
presents the
cation and
anion
compositions of
individual
samples.
2. Visualizes the
trend of water
quality over
time at
individual wells.
21. 21 3. Analysis Tools
Piper Diagram of Multiple Wells
1. Visually presents the cation
and anion compositions of
many samples from
multiple wells on a single
graph.
2. Allows the major groupings
or trends in the data to be
discerned visually.
3. Facilitates the
characterization or
classification of waters.
Grouping of waters on the
Piper Diagram suggests a
common composition and
origin.
4. For details, see
http://training.usgs.gov/tel/wqprinciples/lesson13-freeze.pdf
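The cation and anion compositions plotted on a Piper diagram are usually expressed as percentages of milliequivalents per liter. The following is a minimal sketch of that conversion, not HDX's actual implementation: the function name and the sample concentrations are made up for illustration, and the equivalent weights are rounded values (molar mass divided by ionic charge).

```python
# Equivalent weights in mg per meq (molar mass / ionic charge), rounded.
EQ_WEIGHT = {
    "Ca": 20.04, "Mg": 12.15, "Na": 22.99, "K": 39.10,
    "Cl": 35.45, "SO4": 48.03, "HCO3": 61.02, "CO3": 30.00,
}

def piper_percentages(sample_mg_l):
    """Return (cation %, anion %) in meq/L terms for one water sample.

    sample_mg_l maps ion names to concentrations in mg/L; missing ions
    are treated as zero. Assumes each group has a nonzero total.
    """
    meq = {ion: sample_mg_l.get(ion, 0.0) / w for ion, w in EQ_WEIGHT.items()}
    cations = {"Ca": meq["Ca"], "Mg": meq["Mg"], "Na+K": meq["Na"] + meq["K"]}
    anions = {"Cl": meq["Cl"], "SO4": meq["SO4"],
              "HCO3+CO3": meq["HCO3"] + meq["CO3"]}

    def as_percent(group):
        total = sum(group.values())
        return {name: 100.0 * value / total for name, value in group.items()}

    return as_percent(cations), as_percent(anions)

# Made-up sample concentrations in mg/L, for illustration only.
sample = {"Ca": 40.08, "Mg": 12.15, "Na": 45.98,
          "Cl": 70.90, "SO4": 48.03, "HCO3": 122.04}
cation_pct, anion_pct = piper_percentages(sample)
print(cation_pct)
print(anion_pct)
```

Each triple of percentages fixes one point in the cation or anion triangle; samples with similar percentages fall close together, which is what makes groupings visible.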
22. 22 3. Analysis Tools
Plotting Stiff Diagrams on Map
1. Displays Stiff diagrams at
well locations within a
specified area based on
groundwater quality
measured within a given
time period.
2. The shape of a Stiff
diagram indicates the
relative proportions of the
different ions, and the size
of the Stiff diagram
represents the total ion
concentrations.
3. Stiff Pattern maps allow
similarities and differences
between different waters
to be seen at a glance.
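To illustrate how shape and size arise, here is a minimal sketch that turns one sample into the six vertices of a Stiff polygon. It assumes one common layout (cations to the left, anions to the right, with rows Na+K/Cl, Ca/HCO3+CO3, Mg/SO4); the layout HDX uses may differ, and the function name and sample values are hypothetical.

```python
# Equivalent weights in mg per meq (molar mass / ionic charge), rounded.
EQ_WEIGHT = {
    "Na": 22.99, "K": 39.10, "Ca": 20.04, "Mg": 12.15,
    "Cl": 35.45, "HCO3": 61.02, "CO3": 30.00, "SO4": 48.03,
}

def stiff_polygon(sample_mg_l):
    """Return (vertices, total_meq) for one sample given in mg/L.

    Vertices are (x, y) pairs in meq/L: cations at negative x, anions at
    positive x, listed so that joining them in order draws the outline.
    """
    meq = {ion: sample_mg_l.get(ion, 0.0) / w for ion, w in EQ_WEIGHT.items()}
    rows = [
        (meq["Na"] + meq["K"], meq["Cl"]),
        (meq["Ca"], meq["HCO3"] + meq["CO3"]),
        (meq["Mg"], meq["SO4"]),
    ]
    levels = (2, 1, 0)  # top row to bottom row
    left = [(-cation, y) for (cation, _), y in zip(rows, levels)]
    right = [(anion, y) for (_, anion), y in zip(rows, levels)]
    vertices = left + right[::-1]  # down the left side, back up the right
    return vertices, sum(meq.values())

# Made-up sample concentrations in mg/L, for illustration only.
sample = {"Na": 45.98, "Ca": 40.08, "Mg": 12.15,
          "Cl": 70.90, "HCO3": 122.04, "SO4": 48.03}
vertices, total_meq = stiff_polygon(sample)
print(vertices)
print(total_meq)
```

The relative lengths of the six arms give the shape, while the total in meq/L controls how large the polygon appears.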
23. 23 3. Analysis Tools
Scatter Maps
1. A Scatter Map displays
the statistics of water
quality or production
data at the wells located
within a specified area
and measured within a
given time frame.
2. Statistics include the
minimum, maximum,
and average values of
the selected properties
of individual wells.
3. Statistics are presented
as markers on the map with
specified sizes and
colors.
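The per-well statistics behind a Scatter Map can be sketched as follows. This is an illustrative outline only: the record layout, the function name, and the sample values are assumptions, and the spatial filter ("within a specified area") is omitted to keep the example short.

```python
from datetime import date
from statistics import mean

# Made-up sample records: (well ID, sample date, measured value).
samples = [
    ("W-1", date(2010, 3, 1), 4.0),
    ("W-1", date(2011, 6, 15), 6.0),
    ("W-1", date(2013, 1, 10), 50.0),  # outside the time frame used below
    ("W-2", date(2010, 9, 30), 1.5),
]

def well_statistics(records, start, end):
    """Return {well: (minimum, maximum, average)} for start <= date <= end."""
    by_well = {}
    for well, when, value in records:
        if start <= when <= end:
            by_well.setdefault(well, []).append(value)
    return {well: (min(vals), max(vals), mean(vals))
            for well, vals in by_well.items()}

stats = well_statistics(samples, date(2010, 1, 1), date(2011, 12, 31))
for well, (lo, hi, avg) in stats.items():
    print(well, lo, hi, avg)
```

Each resulting value could then be mapped to a marker size or color at the well location.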
24. 24 4. Applications
Applications
• Perform Hydrologic and Hydrogeologic Studies,
• Design Monitoring Programs,
• Develop Hydrologic Conceptual Models,
• Facilitate Development and Visualization of
Models and Their Results,
• Resolve Disputes regarding Source of
Contamination (Example: Chino Airport).
WEI has designed and implemented sustainable groundwater management plans since 1990. We found that data collection is the key component of most, if not all, water resource projects. Almost all projects start with data collection. So we asked ourselves: why don’t we integrate data into a platform to avoid repetitive data collection efforts?
Today, I would like to give you an overview of a platform that we have developed over the past 10 years, organized around these four major points.
In the past, before the Internet existed, our challenges were to find data.
Today, thanks to the Internet, it is fairly easy to find a great collection of websites that provide various kinds of data.
For example, here are 9 websites that provide public water data related to California. We get Climate data, Surface Water data, Groundwater Data, Data of sites with potential contamination, etc.
However, it remains a challenge to identify data that are relevant to a given water management project.
Not only are those data located in different silos, they also come in different formats hidden behind different user interfaces.
And proprietary private datasets come on top of those public datasets.
Our challenge was how to integrate these data into a more user-friendly platform that spares us from the daunting data collection tasks.
To design the platform, we started collecting ideas and wishes more than 10 years ago, at a time when Google Maps, and Google itself, were in their infancy. We realized that the platform must hold georeferenced information and must have a map-based user interface so that we can display wells and stations on top of georeferenced maps. We also realized that not only are groundwater and surface water time-series data important, but climatic data and model results are of great importance as well.
So, this slide shows our wishes and hopefully the way out of the data jungle.
How do we fulfill those wishes? We started with a conceptual design of the platform. The name HydroDaVE was invented later.
We decided to store all proprietary data in a centralized place – HydroDaVE Server – that hosts databases, stores flat files (such as PDFs, documents, and photos), and also provides reporting services that can automatically generate table-based reports (such as water quality exceedance reports). We also planned to create web services that act as the middleman between the Server and the client applications.
And, we added two client-side applications to the design.
The green one is responsible for data QA/QC and upload. It allows administrators to pull out and correct uploaded datasets in case erroneous data are found. It also allows admins to manage lookup tables and users.
The yellow one, the HydroDaVE Explorer or HDX, provides the user interface for visualization, data analysis, and access to the reporting services.
But what about the public data?
We simply add to HDX the ability to access public data web services. For example, when we want to display wells that are stored in NWIS, HDX accesses the USGS water services instead of the HydroDaVE Web Services.
This slide shows the completed conceptual design. When necessary, we can add additional client-side applications, such as mobile apps, to the design.
You may wonder how web services interact with the client-side applications and the server.
In tech jargon, a web service is a ….
Here is an example.
A web browser displays the user interface of the USGS EPQ web service.
We enter the data in the provided boxes.
When we hit the Get Elevation button, the browser sends out a REQUEST consisting of the user-specified parameters and the address of the EPQ web service.
When the EPQ web service receives the request, it obtains the result (either by itself or by asking another procedure on the server to do it) and sends the result back as a RESPONSE.
In this example, the RESPONSE is encoded in XML (Extensible Markup Language).
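The request/response exchange can be sketched in a few lines of Python. This is a minimal illustration and not the actual EPQ interface: the endpoint URL, the parameter names (`x`, `y`, `units`), and the XML element names are hypothetical placeholders, and the response is a hard-coded sample rather than a live reply.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint; the real EPQ address and parameters may differ.
BASE_URL = "https://example.org/epqs/query"

def build_request(lon, lat, units="Feet"):
    """Combine the service address with the user-specified parameters."""
    params = {"x": lon, "y": lat, "units": units}
    return f"{BASE_URL}?{urlencode(params)}"

def parse_response(xml_text):
    """Decode an XML RESPONSE into (elevation, units)."""
    root = ET.fromstring(xml_text)
    return float(root.findtext("Elevation")), root.findtext("Units")

# Made-up sample response with an assumed structure, for illustration only.
SAMPLE_RESPONSE = """<?xml version="1.0"?>
<ElevationQuery>
  <Elevation>712.5</Elevation>
  <Units>Feet</Units>
</ElevationQuery>"""

request_url = build_request(-117.64, 33.98)
elevation, units = parse_response(SAMPLE_RESPONSE)
print(request_url)
print(elevation, units)
```

In a real client, the URL would be sent over HTTP and the returned text passed to the parser; that step is left out here so the sketch runs without network access.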
HDX performs the query in the same way while providing visual references, without asking the user to enter the coordinates. Whenever you move the mouse cursor and stop for a fraction of a second, HDX gets the elevation at the cursor. HDX also performs on-the-fly coordinate transformation and displays Lng/Lat and UTM coordinates. It is simply more elegant than a text-based user interface.
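As a small illustration of one step in such a coordinate transformation, the UTM zone number can be derived from longitude alone, because standard zones are 6° wide starting at 180° W. The sketch below covers only that step (not the full projection to easting and northing) and ignores the special zones around Norway and Svalbard; the function names are hypothetical and this is not necessarily how HDX does it.

```python
import math

def utm_zone(lon_deg):
    """Return the standard UTM zone number (1 to 60) for a longitude in degrees."""
    # Normalize to [-180, 180) so that a longitude of 180 maps to zone 1.
    lon = ((lon_deg + 180.0) % 360.0) - 180.0
    return int(math.floor((lon + 180.0) / 6.0)) + 1

def hemisphere(lat_deg):
    """Return 'N' or 'S' as the hemisphere designator."""
    return "N" if lat_deg >= 0 else "S"

# Example longitude and latitude in Southern California (approximate).
print(utm_zone(-117.64), hemisphere(33.98))
```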
A relational database is a collection of data items organized as a set of formally-described tables from which data can be accessed or reassembled in many different ways without having to reorganize the database tables.
In SQL, we have the following constraints:
NOT NULL - Indicates that a column cannot store a NULL value
UNIQUE - Ensures that each row has a unique value in the column
PRIMARY KEY - A combination of NOT NULL and UNIQUE. Ensures that a column (or a combination of two or more columns) has a unique identity, which helps to find a particular record in a table more easily and quickly
FOREIGN KEY - Ensures referential integrity by requiring that values in one table match values in another table
CHECK - Ensures that the value in a column meets a specific condition
DEFAULT - Specifies a default value for a column
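A few of these constraints can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch: the `Well` table and its columns are hypothetical and are not the actual HydroDaVE schema, and SQLite stands in for whatever database engine the server uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Well (
        WellID  INTEGER PRIMARY KEY,          -- identifies each row
        Name    TEXT NOT NULL,                -- a value is required
        DepthFt REAL CHECK (DepthFt > 0),     -- must meet a condition
        WellUse TEXT DEFAULT 'Unknown'        -- used when no value is given
    )
""")

def try_insert(name, depth_ft):
    """Attempt an INSERT and report whether the constraints accepted it."""
    try:
        conn.execute("INSERT INTO Well (Name, DepthFt) VALUES (?, ?)",
                     (name, depth_ft))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

results = [
    try_insert("W-1", 350.0),  # valid row
    try_insert(None, 200.0),   # violates NOT NULL
    try_insert("W-2", -5.0),   # violates CHECK
]
for outcome in results:
    print(outcome)

# The DEFAULT value was filled in for the accepted row.
rows = conn.execute("SELECT Name, WellUse FROM Well").fetchall()
print(rows)
```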
SQL databases differ from Excel tables in their defined relationships (ERD), constraints, and stringent data types, among other things, and as a result support fast queries.
The Country table has a unique constraint defined by Name to ensure that no duplicate country name may be stored.
The State table has a unique constraint defined by the combination of CountryID and Name to ensure that no duplicate State name may be defined for any given country.
The State table has a foreign key constraint defined by the CountryID to ensure that a CountryID must exist in the Country table before it can be stored in the State table.
The County table has a unique constraint defined by the combination of StateID and Name to ensure that no duplicate County name may be defined for any given state.
The County table has a foreign key constraint defined by the StateID to ensure that a StateID must exist in the State table before it can be stored in the County table.
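The constraints described for the Country, State, and County tables can be sketched with Python's built-in `sqlite3` module. The table and key column names follow the notes; the remaining details (column types, the `CountyID` column, SQLite as the engine, the sample rows) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Country (
        CountryID INTEGER PRIMARY KEY,
        Name      TEXT NOT NULL UNIQUE
    );
    CREATE TABLE State (
        StateID   INTEGER PRIMARY KEY,
        CountryID INTEGER NOT NULL REFERENCES Country(CountryID),
        Name      TEXT NOT NULL,
        UNIQUE (CountryID, Name)
    );
    CREATE TABLE County (
        CountyID  INTEGER PRIMARY KEY,
        StateID   INTEGER NOT NULL REFERENCES State(StateID),
        Name      TEXT NOT NULL,
        UNIQUE (StateID, Name)
    );
""")

def attempt(sql, params):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

ADD_COUNTRY = "INSERT INTO Country (CountryID, Name) VALUES (?, ?)"
ADD_STATE = "INSERT INTO State (StateID, CountryID, Name) VALUES (?, ?, ?)"
ADD_COUNTY = "INSERT INTO County (CountyID, StateID, Name) VALUES (?, ?, ?)"

ok_country = attempt(ADD_COUNTRY, (1, "USA"))
dup_country = attempt(ADD_COUNTRY, (2, "USA"))              # duplicate name
ok_state = attempt(ADD_STATE, (1, 1, "California"))
dup_state = attempt(ADD_STATE, (2, 1, "California"))        # duplicate within country
orphan_state = attempt(ADD_STATE, (3, 99, "Nowhere"))       # CountryID 99 does not exist
ok_county = attempt(ADD_COUNTY, (1, 1, "San Bernardino"))
orphan_county = attempt(ADD_COUNTY, (2, 42, "Orphan"))      # StateID 42 does not exist

print(ok_country, dup_country, ok_state, dup_state,
      orphan_state, ok_county, orphan_county)
```

Because the unique constraints on State and County cover a combination of columns, the same state or county name remains allowed under a different parent record.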
Here is a list of modules.
… CMIP: Coupled Model Intercomparison Project
For this presentation, I will focus on the Groundwater module. If you are interested in knowing more details, please feel free to talk to me after the presentation.
These are the analysis tools that we will cover.
Should explain how to
Add/Remove Data Maps: For WQ, the class intervals are defined based on the MCL of the selected analyte.
Export Data
Solve problems by combining various data layers.
This slide shows the southwest corner of San Bernardino County and the southwest corner of the Chino Basin. This map shows the location of wells at the California Institution for Men (CIM), a maximum-security prison, and the Chino Airport, owned by San Bernardino County. The prison and the airport have been in existence since the 1940s. This map also shows the location of wells with detections of TCE. The TCE concentration data are shown by class interval, with blue ranging from half the MCL to the MCL (2.5 to 5.0 µg/L), green ranging from the MCL to 2 x MCL, yellow ranging from 2 x MCL to 4 x MCL, and so on. Also shown are directional vectors that show the direction of groundwater flow for the spring of 2011.
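The MCL-based class intervals can be sketched as a simple lookup. This is an illustration under stated assumptions: the function name is hypothetical, the handling of values exactly on a boundary (lower bound inclusive) is assumed, the white class for results below half the MCL is taken from the notes on the next map, and classes above 4 x MCL are left unnamed because the notes only say "and so on".

```python
TCE_MCL_UG_L = 5.0  # consistent with the blue class of 2.5 to 5.0 µg/L

def mcl_class(concentration, mcl):
    """Map a concentration to a class label relative to the MCL.

    Boundary handling (lower bound inclusive) is an assumption.
    """
    ratio = concentration / mcl
    if ratio < 0.5:
        return "white"       # non-detect to half the MCL
    if ratio < 1.0:
        return "blue"        # half the MCL to the MCL
    if ratio < 2.0:
        return "green"       # MCL to 2 x MCL
    if ratio < 4.0:
        return "yellow"      # 2 x MCL to 4 x MCL
    return "above 4 x MCL"   # colors beyond yellow are not given in the notes

# Made-up concentrations in µg/L, for illustration only.
for value in (1.0, 3.0, 5.0, 12.0, 25.0):
    print(value, mcl_class(value, TCE_MCL_UG_L))
```

Because the classes are defined as multiples of the MCL, the same function applies to any analyte once its MCL is supplied.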
The airport was issued a Cleanup and Abatement Order by the Santa Ana Regional Water Quality Control Board in 2008 that required the County to investigate the source of TCE and to develop a plan to remediate the plume. The County’s consultant, upon their first round of sampling at existing wells, claimed that the TCE observations southwest of the Chino Airport were likely due to the industrial operations at the prison to the northwest of the airport. Their claim was based on one round of TCE sampling and groundwater level measurements taken near the airport and on printed reports describing the remediation efforts at the prison. The Chino Basin Watermaster, our client, asked us to investigate the airport consultant’s claim.
The first thing WEI did was to explore the universe of TCE data in the southwestern part of the Chino Basin. This slide is identical to the prior slide except that it shows the TCE concentration data for all wells in the map area. Note that there are white dots on the map. The white dots show the locations of wells where TCE was sampled, with results ranging from non-detect to half the MCL. If the prison were the source of the TCE southwest of the airport, there should be TCE detections at wells between the prison and the airport; there were none.
The time required in HD to add the TCE data to the map was about 15 seconds.
This slide is similar to the prior slide except that it shows the PCE concentration at wells instead of the TCE concentration. Note that the primary contaminant at the prison was PCE, with relatively low TCE concentrations. PCE has not been detected at the airport or southwest of it. If the prison were the source of the contamination, there would be significant PCE levels at the airport and between the prison and the airport.
The time required in HD to replace the TCE with PCE was about 15 seconds.
So why are the plumes at the prison and the airport aligned the way they are, given the groundwater flow direction observed by the County’s consultant?
This slide provides the answer. This slide is identical to Slide 2 except that the directional vectors represented herein are for 1977 and are more representative of historical conditions in this area. The consultant’s claim was based solely on limited modern groundwater level data that include the impacts of a massive new well field that became operational in the early 2000s. The directional vectors shown herein are based on groundwater level data and groundwater modeling work that show that a predominantly northeast-to-southwest flow direction existed here from the 1940s through 2000.
The time required to replace the directional vectors was about 10 seconds.
That brings me to the end of my presentation. Thank you for your kind attention.