2. 2
Data Systems Overview
PIMS (Permit Information Management System)
• Southeast Regional Office (St. Petersburg, FL)
• PostgreSQL Database Management System
VMS (Vessel Monitoring System)
• Office of Law Enforcement (Silver Spring, MD)
• Oracle Database Management System
Gulf Shrimp Observer Program
• Southeast Fisheries Science Center (Galveston, TX)
• Microsoft Access
GIS (Geospatial Information System)
• Multiple Sources
• Various Storage Formats (Shapefiles, Grids, Excel files, MS Access
databases, Oracle, JPEG images, Binary & ASCII Raster)
3. 3
Permit Information Management
PIMS (Permit Information Management System)
• PostgreSQL Database Management System
• 146 Transactional Tables
• Access via a local replicated database (Disaster Recovery Server)
• MS Access configured to read and extract data (ODBC)
• Migrated data tables to Oracle 11g RDBMS
• Primary Tables Include:
1. TBL_REQMIT (Permits and Requests)
2. TBL_VESSELS (Vessel Characteristics)
3. TBL_FISHERY_TYPE (Fishing Industry)
4. TBL_REQMIT_STATUS (Permit Status)
4. 4
Vessel Monitoring System
VMS
• Oracle RDBMS
• 108 Transactional Tables
• Access via a local replicated database and directly via DBLink
• Receive nightly updates (~80,000 records) with a 4-day lag
• FMC_POS is the primary table of interest containing:
• ID
• LAT_LON (SDO_GEOMETRY)
• UTC_DATE
• COURSE
• SPEED
• TRACK (SDO_GEOMETRY)
• RADIO (VESSEL IDENTIFIER)
5. 5
Gulf Shrimp and Reef Fish Observer Program
Observer Data
• Microsoft Access RDBMS
• Data manipulated to create TRIPS and TOWS tables:
  • The TRIPS table documents when the trips started and ended. This information is used to extract the locations from the warehouse.
  • The TOWS table identifies when trawling is occurring, which is the target variable. This is used to assign this behavior to the locations previously extracted.

TRIPS                     TOWS
Vessel Official Number    Vessel Official Number
Trip Number               Trip Number
Trip Start                Tow Number
Trip End                  Time In
Number of Days            Time Out
Number of Tows/Sets       Location
7. 7
Geospatial Data Warehouse
1. Assign fishery permit to each VMS location (Vessel_ID and
Date)
2. Spatially join bathymetry, distance from shore, and direction to
shore to each VMS location (Raster Cell Value)
3. Organize facts and dimensions based on the data warehouse
design.
4. Populate materialized view containing relevant data elements in
one master table
5. Identify which locations pertain to each observer trip. Assign
target variable (FISHING) a value of 1 for each location within
the TIME_IN and TIME_OUT window. All others receive 0.
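
Step 5 can be sketched in a few lines of Python. This is a minimal illustration, not the warehouse's actual routine: it assumes the positions have already been restricted to observer trips, that the tow window is inclusive at both ends, and that the field names (vessel_id, utc_date, time_in, time_out) and sample records are hypothetical.

```python
from datetime import datetime


def label_fishing(positions, tows):
    """Return copies of the positions with a FISHING flag added.

    FISHING is 1 when the position's timestamp falls inside any tow
    window (TIME_IN to TIME_OUT, inclusive) for the same vessel, else 0.
    """
    labeled = []
    for pos in positions:
        fishing = 0
        for tow in tows:
            same_vessel = tow["vessel_id"] == pos["vessel_id"]
            if same_vessel and tow["time_in"] <= pos["utc_date"] <= tow["time_out"]:
                fishing = 1
                break
        labeled.append({**pos, "FISHING": fishing})
    return labeled


# Hypothetical sample data: one tow and three VMS positions.
tows = [
    {"vessel_id": "V1",
     "time_in": datetime(2010, 6, 1, 8),
     "time_out": datetime(2010, 6, 1, 12)},
]
positions = [
    {"vessel_id": "V1", "utc_date": datetime(2010, 6, 1, 7)},   # before the tow
    {"vessel_id": "V1", "utc_date": datetime(2010, 6, 1, 9)},   # during the tow
    {"vessel_id": "V2", "utc_date": datetime(2010, 6, 1, 9)},   # different vessel
]

labeled = label_fishing(positions, tows)
print([p["FISHING"] for p in labeled])  # [0, 1, 0]
```

In a database setting the same logic would more likely be a join between the location table and the TOWS table on vessel and time range; the loop above only makes the rule explicit.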
13. 13
Predictive Analytics
1. Upload training data for Shrimp (trawling) and import into SAS Enterprise Miner.
2. Partition the data into training and validation segments based on their original distributions:

   BEHAVIOR       VALUE    SHRIMP
   FISHING        1        43.69%
   NOT FISHING    0        56.31%

3. Develop Regression and Decision Tree models to predict fishing behavior. The Auto-Neural Network model was not selected for this project because the resulting variable coefficients must be interpretable.
4. Compare the models to determine which is most effective at predicting fishing behavior.
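
Partitioning "based on their original distributions" is commonly done with a stratified split, which keeps the FISHING / NOT FISHING proportions the same in both segments. The sketch below is a plain-Python illustration, not the SAS partitioning step itself. The 70/30 split, the function name, and the synthetic 10,000-record data set (built only to match the 43.69% / 56.31% shares) are all assumptions.

```python
import random


def stratified_split(records, target, train_fraction, seed=0):
    """Split records into (train, valid), preserving the class mix of `target`."""
    rng = random.Random(seed)
    by_class = {}
    for rec in records:
        by_class.setdefault(rec[target], []).append(rec)

    train, valid = [], []
    for rows in by_class.values():
        rows = rows[:]          # copy so the caller's list is not reordered
        rng.shuffle(rows)
        cut = round(len(rows) * train_fraction)
        train.extend(rows[:cut])
        valid.extend(rows[cut:])
    return train, valid


# Synthetic records matching the reported shares: 43.69% fishing, 56.31% not.
records = [{"FISHING": 1} for _ in range(4369)] + [{"FISHING": 0} for _ in range(5631)]

train, valid = stratified_split(records, "FISHING", train_fraction=0.7)


def fishing_share(rows):
    return sum(r["FISHING"] for r in rows) / len(rows)


print(len(train), len(valid))                                      # 7000 3000
print(round(fishing_share(train), 3), round(fishing_share(valid), 3))  # 0.437 0.437
```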
14. 14
Model Pathway
Additional data were not scored because of the relatively high
misclassification rate of the regression model (0.38551). The
decision tree model had a similar misclassification rate
(0.38636). The models must be refined before they are applied in
an operational context.
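
One way to put these rates in context is to compare them with a naive baseline. The sketch below assumes the validation partition keeps the 43.69% / 56.31% class split reported for the shrimp training data, so that always predicting NOT FISHING would misclassify 43.69% of locations. The function name and the five-record example are illustrative only.

```python
def misclassification_rate(actual, predicted):
    """Fraction of records whose predicted class differs from the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


# Tiny illustrative example: two of five predictions are wrong.
actual = [1, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 0]
print(misclassification_rate(actual, predicted))  # 0.4

# Reported rates compared with the assumed majority-class baseline.
baseline = 0.4369      # always predict NOT FISHING (assumption, see lead-in)
regression = 0.38551   # reported for the regression model
tree = 0.38636         # reported for the decision tree model

gain_regression = baseline - regression
gain_tree = baseline - tree
print(round(gain_regression, 5), round(gain_tree, 5))  # 0.05139 0.05054
```

Under that assumption, both models improve on the baseline by only about five percentage points, which is consistent with the decision not to score additional data yet.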
15. 15
Trawling Regression Model
1. The regression model established that the following variables were most useful in predicting
shrimp trawling behavior.
Parameter       DF   Estimate    Standard Error   Wald Chi-Square   Pr > ChiSq   Standard Estimate
Intercept       1    -2.7513     0.6571           17.53             <0.0001      0.064
ADW             1    -0.5844     0.0574           103.75            <0.0001      0.557
Bathymetry      1    -0.00663    0.00105          40.12             <0.0001      -0.1403
Freezer         1    0.3899      0.0584           44.64             <0.0001      1.477
Fuel Capacity   1    -0.00004    7.32E-6          23.12             <0.0001      -0.1666
Gross Weight    1    -0.00490    0.00236          4.31              0.0378       -0.0630
Longitude       1    -0.0355     0.00643          30.44             <0.0001      -0.1276
RS              1    0.2542      0.0922           7.60              0.0058       1.289
Steel Hull      1    -0.1832     0.0590           9.64              0.0019       0.833
WRK             1    0.7395      0.1003           54.38             <0.0001      2.095
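
In a logistic regression, exponentiating an estimate gives the multiplicative change in the odds of fishing associated with that parameter. The sketch below applies this to six of the estimates in the table. For these rows the result matches the value in the table's final column to three decimals; that is an observation about the numbers, not something the slide states, and how each variable is coded is not given in the source.

```python
import math

# Estimates copied from the regression table (Intercept and indicator-style rows).
estimates = {
    "Intercept": -2.7513,
    "ADW": -0.5844,
    "Freezer": 0.3899,
    "RS": 0.2542,
    "Steel Hull": -0.1832,
    "WRK": 0.7395,
}

# exp(estimate): values above 1 raise the odds of fishing, values below 1 lower them.
odds_ratios = {name: round(math.exp(b), 3) for name, b in estimates.items()}

for name, ratio in odds_ratios.items():
    print(f"{name:<12} {ratio}")
```

Read this way, the WRK estimate roughly doubles the odds of fishing (2.095), while the ADW estimate roughly halves them (0.557).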
20. 20
Decision Tree Explained
1. If the LATITUDE is >= 35.165, there is a 60.7% chance that the vessel is fishing.
2. If LATITUDE is < 35.165, there is a 40.1% chance that the vessel is fishing.
3. If LATITUDE is < 35.165 and LONGITUDE < -81.045, there is a 44.7% chance that the vessel is fishing.
Furthermore, if the vessel has a KM permit, there is a 67.3% chance that the vessel is fishing as opposed
to a 43.4% chance if the vessel does not have a KM permit.
4. If LATITUDE is < 35.165 and LONGITUDE >= -81.045, there is a 32.1% chance that the vessel is fishing. If
the NET_WEIGHT of the vessel is less than 69.5 tons, there is a 41.5% chance that the vessel is fishing. In
addition, if the vessel’s speed is >= 0.105 knots, then there is a 47.2% chance that it is fishing. If the speed
is < 0.105 knots, then the LONGITUDE must be greater than -79.955 degrees for there to be an 83.3% chance
that the vessel is fishing.
5. On the other hand, if the NET_WEIGHT of the vessel is >= 69.5 tons, there is a 24.3% chance that the
vessel is fishing. In addition, if the HOLD_CAPACITY of the vessel is less than 14,000 pounds, there is a
52.0% chance that the vessel is fishing. Furthermore, if the DISTANCE to the closest shore is < 7,394
meters, then it is 100% likely that the vessel is fishing, as opposed to 40.0% likely if the distance is greater
than or equal to 7,394 meters.
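
The rules above can be read as nested conditions. The sketch below is one interpretation of the slide text, not an export of the fitted tree. It assumes that item 5 belongs to the same branch as item 4, that complementary splits use >= where the slide gives only one side, and that where the slide does not state a leaf value the parent node's percentage is returned. The function and argument names are hypothetical.

```python
def fishing_probability(latitude, longitude, has_km_permit, net_weight,
                        speed, hold_capacity, distance_to_shore):
    """Return the stated chance of fishing for the matching branch of the tree."""
    if latitude >= 35.165:
        return 0.607
    # latitude < 35.165 from here on
    if longitude < -81.045:
        return 0.673 if has_km_permit else 0.434
    # longitude >= -81.045 from here on
    if net_weight < 69.5:                 # tons
        if speed >= 0.105:                # knots
            return 0.472
        if longitude > -79.955:
            return 0.833
        return 0.415                      # leaf not stated; parent value (assumption)
    # net_weight >= 69.5 tons
    if hold_capacity < 14000:             # pounds
        return 1.0 if distance_to_shore < 7394 else 0.40   # meters
    return 0.243                          # leaf not stated; parent value (assumption)


# A slow, light vessel east of -79.955 degrees longitude.
print(fishing_probability(latitude=30.0, longitude=-79.5, has_km_permit=False,
                          net_weight=50.0, speed=0.05, hold_capacity=10000,
                          distance_to_shore=5000))  # 0.833
```

A 100% leaf (rule 5) usually indicates a node with very few training records, which is one reason such a tree would need refinement before operational use.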
27. 27
Next Steps
1. Develop observer data warehouse
2. Link VMS/Permit and Observer data warehouses
3. Use the observer data to determine fishing vs non-fishing
locations for all programs (pelagics, reef fish, shrimp, sharks)
4. Develop, test, and validate program specific models
5. Incorporate model output into operational scoring routine
6. Use validated models to quantify fishing effort