In this project, the group members explore daily rainfall data collected at 535 stations along the Gulf Coast from 1949 to 2017. The purposes of this exercise are:
1) to give students an idea of a typical climate data set (spatio-temporal data) and some associated scientific questions (e.g. how rainfall extremes vary in space and time, and how that might be affected by other factors such as greenhouse gases or temperature).
2) to get students familiar with data analysis in R, including data manipulation, data visualization, and data summary.
3) to introduce some statistical methods (e.g. time series analysis, spatial statistics, extreme value analysis) for analyzing this kind of data to "answer" (perform statistical inference on) the questions of interest.
Group members: Lin Ge, Jianan Jiang, Jessica Robinson, Erin Song, Seth Temple, Adam Wu
Undergraduate Modeling Workshop - Southeastern US Rainfall Working Group Final Presentation, May 25, 2018
1. Gulf Coast Rainfall Data Analysis
Lin Ge, Whitney Huang, Jianan Jiang, Jessica Robinson,
Erin Song, Seth Temple, Adam Wu
SAMSI Undergraduate Workshop:
Applied Mathematics and Statistics in Climate
May 25th, 2018
2. The Data
535 weather stations positioned throughout the Gulf Coast area
Daily rainfall data collected from 1949 to 2017
Source: Global Historical Climatology Network (GHCN),
provided by Ken Kunkel from NC State University/NOAA
3. Questions
How can we describe extreme maximum rainfall
data?
We considered two methods to answer this question.
4. Method 1: Block Maxima Approach
For each station, separate the data into "blocks". In this case,
one block is one year.
Find the maximum daily precipitation for each block.
Fit the Generalized Extreme Value Distribution (GEVD) to
annual maximum rainfall events.
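The block maxima step can be sketched in a few lines. The group worked in R, so the Python below is only an illustration of the idea, not their code; the function name `annual_maxima`, the record layout, and the sample values are assumptions made for this sketch.

```python
def annual_maxima(records):
    """Return {year: largest daily rainfall} for one station.

    `records` is an iterable of (date, rainfall) pairs, where `date` is an
    ISO "YYYY-MM-DD" string and `rainfall` is None when the day is missing.
    (This layout is hypothetical, chosen only for the example.)
    """
    maxima = {}
    for date, rain in records:
        if rain is None:
            continue  # skip missing observations
        year = int(date[:4])  # one block = one calendar year
        if year not in maxima or rain > maxima[year]:
            maxima[year] = rain
    return maxima


# Made-up daily values (inches), for illustration only.
sample = [
    ("1949-06-01", 0.4), ("1949-09-12", 3.1), ("1949-12-30", None),
    ("1950-03-08", 1.7), ("1950-08-21", 5.2), ("1950-08-22", 2.0),
]
print(annual_maxima(sample))  # {1949: 3.1, 1950: 5.2}
```

The resulting yearly maxima are the values to which a GEV distribution would then be fitted.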
5. Method 2: Peak Over Threshold Approach
Choose a value to function as the threshold.
Fit a Generalized Pareto Distribution (GPD) to rainfall events
at or above the chosen threshold.
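As a rough illustration of the data preparation behind this approach, the sketch below keeps the events that reach the threshold and records how far each one goes past it. It is not the group's code (they worked in R); the function name `threshold_excesses` and the sample values are assumptions. The comparison includes the threshold itself, following the slide, although some implementations use a strict inequality instead.

```python
def threshold_excesses(daily_rain, threshold):
    """Return the amounts by which observations reach past `threshold`.

    Missing days are given as None and skipped. Values equal to the
    threshold are kept (excess 0.0), matching the slide's wording.
    """
    return [x - threshold for x in daily_rain
            if x is not None and x >= threshold]


# Made-up daily values (inches), for illustration only.
sample = [0.0, 0.3, 2.5, None, 4.1, 1.0, 3.0]
excesses = threshold_excesses(sample, 2.5)
print(len(excesses), "events at or above the threshold")
```

A GPD would then be fitted to the excesses rather than to the raw rainfall amounts.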
8. Which Method Worked?
We chose to continue with the block maxima approach and fit
the Generalized Extreme Value Distribution to the data because:
Choosing an appropriate threshold for each of the 535
stations is difficult.
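To give a sense of why this is laborious: one common diagnostic in extreme value practice is to compute the mean excess over a range of candidate thresholds and look for the region where it becomes roughly linear in the threshold, which is what a GPD model implies. The slides do not say which diagnostic the group considered, so the sketch below is only an assumed example; `mean_excess` and the sample values are hypothetical.

```python
def mean_excess(values, threshold):
    """Average amount by which observations exceed `threshold`.

    Returns None when no observation exceeds the threshold.
    """
    excesses = [x - threshold for x in values
                if x is not None and x > threshold]
    if not excesses:
        return None
    return sum(excesses) / len(excesses)


# Made-up daily values (inches), for illustration only.
sample = [0.2, 1.0, 2.0, 3.0, 4.0, 6.5]
for u in (1.0, 2.0, 3.0):
    print(u, mean_excess(sample, u))
```

Judging where such a curve becomes linear is usually done by eye, station by station, which does not scale easily to 535 stations.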
11. Animations
[Figure: map of 20 Year Return Levels. Axes: Longitude (−105 to −85) and Latitude (26 to 36). Color scale: Max Rainfall (inches), 2 to 20.]
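A 20-year return level is the rainfall amount that the annual maximum exceeds with probability 1/20 in any given year. Under a fitted GEV distribution with location mu, scale sigma, and shape xi, it is the 0.95 quantile of that distribution. The sketch below shows the calculation in Python; it is not the group's code, and the parameter values are invented for illustration rather than taken from any station.

```python
import math


def gev_cdf(z, mu, sigma, xi):
    """GEV distribution function with location mu, scale sigma, shape xi."""
    if abs(xi) < 1e-12:  # Gumbel limit as xi -> 0
        return math.exp(-math.exp(-(z - mu) / sigma))
    t = 1.0 + xi * (z - mu) / sigma
    if t <= 0.0:  # z lies outside the support
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def gev_return_level(mu, sigma, xi, period):
    """Level exceeded by the annual maximum with probability 1/period."""
    y = -math.log(1.0 - 1.0 / period)
    if abs(xi) < 1e-12:  # Gumbel limit as xi -> 0
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


# Hypothetical parameters (inches); not estimates from the rainfall data.
mu, sigma, xi = 3.0, 1.0, 0.1
level = gev_return_level(mu, sigma, xi, period=20)
print(round(level, 2), round(gev_cdf(level, mu, sigma, xi), 2))
```

Plugging the return level back into the distribution function should give 0.95, which is a quick check that the two functions agree.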
12. Challenges
Lack of time to choose an appropriate
threshold for each individual station.
Limited data range
Few covariates
13. Further Research
Make a model to forecast major rainfall events
Data mine for more covariates
Re-frame to study total yearly precipitation
Observe seasonal changes (modify block)
Create a tutorial using rainfall data to educate
future statistics students about extreme value
theory
14. Many Thanks To...
Whitney Huang
Doug Nychka
Thomas Gehrmann
Chris Jones
Statistical and Applied Mathematical Sciences Institute
North Carolina State University
15. Citations
Original S functions written by Janet E. Heffernan with R port
and R documentation provided by Alec G. Stephenson.
(2018). ismev: An Introduction to Statistical Modeling of
Extreme Values. R package version 1.42.
https://CRAN.R-project.org/package=ismev.
Hadley Wickham, Romain François, Lionel Henry and Kirill
Müller (2018). dplyr: A Grammar of Data Manipulation. R
package version 0.7.5.
https://CRAN.R-project.org/package=dplyr.
Douglas Nychka, Reinhard Furrer, John Paige and Stephan
Sain (2017). “fields: Tools for spatial data.” doi:
10.5065/D6W957CT (URL:
http://doi.org/10.5065/D6W957CT), R package version 9.6,
www.image.ucar.edu/~nychka/Fields.
Eric Gilleland, Richard W. Katz (2016). extRemes 2.0: An
Extreme Value Analysis Package in R. Journal of Statistical
Software, 72(8), 1-39. doi:10.18637/jss.v072.i08