{"id":83,"date":"2010-11-05T15:23:24","date_gmt":"2010-11-05T19:23:24","guid":{"rendered":"http:\/\/web.colby.edu\/mgimond\/?p=83"},"modified":"2011-10-26T12:13:30","modified_gmt":"2011-10-26T16:13:30","slug":"analysis-of-precipitation-trends-in-the-us-using-r","status":"publish","type":"post","link":"https:\/\/web.colby.edu\/mgimond\/2010\/11\/05\/analysis-of-precipitation-trends-in-the-us-using-r\/","title":{"rendered":"R: Analysis of precipitation trends in the US"},"content":{"rendered":"<p>Data files used with the following script were downloaded from: ftp:\/\/ftp.ncdc.noaa.gov\/pub\/data\/ghcn\/daily\/.<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n# Author: Manuel Gimond\r\n# Date: 10\/29\/2010\r\n#\r\n# Purpose: Extract data from DAILY GLOBAL HISTORICAL CLIMATOLOGY NETWORK for\r\n# a select time period. Only stations that satisfy the minimum NA count threshold\r\n# for the time period are kept.\r\n# The script converts the aggregated monthly data to z values, then generates trend\r\n# values for each station using linear regression model. The original inspiration\r\n# for this script was to reproduce fig.1 of Balling and Goodrich's &quot;Spatial analysis\r\n# of variations in precipitation intensity in the USA&quot; (Theor Appl Climatol, Oct 2010)\r\n# Libraries\r\n# library(MASS) # Used if robust regression is chosen\r\n#\r\n# User variables\r\nmy.path  = &quot;ghcnd_hcn&quot; # Directory where raw files are stored\r\nfile.ext = &quot;.dly&quot; # Extension of filenames\r\nmet.par  = &quot;PRCP&quot; # Meteorological paramater to extract\r\nmet.conv = 0.1 # Factor to convert raw number to real number (e.g. precip = 0.1)\r\nna.cap   = 0.05 # Percent of no data accepted for analysis (stored as a fraction)\r\nyr.st    = 1975 # Start year (inclusive)\r\nyr.en    = 2009 # End year (inclusive)\r\nrobust.z = F # Use robust z scores? True (T) or False (F)\r\n#\r\n######################### Functions #########################################\r\n# FUNCTION: Computes standardized z values using conventional or robust methods\r\n# Note: all data values are converted including year\r\n# Table format should follow:\r\n# |Year,sta1,sta2,sta3,...|\r\nZval   = function(x.in,Robust) {\r\nx.out  = x.in # Copy input data frame to output data frame\r\nx.rows = nrow(x.in)\r\nx.cols = ncol(x.in)\r\n#\r\nif(Robust){ # quartile-based estimators\r\n  m.x   = apply(x.in, 2, median, na.rm = T)\r\n  iqr.x = apply(x.in, 2, IQR, na.rm = T) # Normalized IQR\r\n  s.x   = iqr.x\/(qnorm(0.75) - qnorm(0.25))\r\n}else{ # moments-based estimators\r\n  m.x   = apply (x.in, 2, mean, na.rm=T)\r\n  s.x   = apply (x.in, 2, sd, na.rm=T)\r\n}\r\n#\r\nfor (i in 1:x.cols) {\r\n if ( s.x[i] != 0){\r\n   x.out[,i] = (x.in [,i] - m.x[i])\/s.x[i] # z-score\r\n }else{\r\n   x.out[,i] = 0\r\n }\r\n}\r\nreturn(x.out)\r\n}\r\n#\r\n# FUNCTION: Computes linear trend for each station across\r\n# entire time period. Appends coefficients to the dataframe.\r\n# Choice is between robust regression (Robust=T) or non-robust\r\n# regression (Robust=F).\r\n# Table format should follow:\r\n# |Year,sta1,sta2,sta3,...|\r\nStaReg = function(x,Robust) {\r\na = &quot;intercept&quot;\r\nb = &quot;slope&quot;\r\nR = &quot;adj_R2&quot; # Adjusted R2\r\nsta.lst = names(x)\r\nfor (i in 2:length(sta.lst)){\r\n if(Robust){\r\n   lm.res = rlm(get(sta.lst[i]) ~ get(sta.lst[1]), data=x, na.action=na.omit)\r\n }else{\r\n   lm.res = lm(get(sta.lst[i]) ~ get(sta.lst[1]), data=x, na.action=na.omit)\r\n }\r\n a[i] = lm.res$coefficients[1] # Intercept\r\n b[i] = lm.res$coefficients[2] # Slope\r\n R[i] = summary(lm.res)$adj.r.squared\r\n}\r\n# Append coefficients to existing table\r\nx = rbind(x,a,b,R)\r\nreturn(x)\r\n}\r\n######################### END of Functions #########################################\r\n#\r\n# Internal variables\r\nfile.stat = data.frame(name = numeric(0), expected = numeric(0),values = numeric(0),\r\n            null = numeric(0), frac=numeric(0)) # Create an empty list to hold data stats\r\nfile.lst = list.files(path=my.path,pattern=file.ext) # Get list of files\r\nyr.days  = (yr.en - yr.st +1) * 12 * 31 # Number of expected records for each station\r\n#\r\n# Internal variables specific to precipiation data\r\nall.Total = as.data.frame(seq(yr.st,yr.en,1))\r\ncolnames(all.Total) = &quot;Year&quot;\r\nall.INTNS = as.data.frame(seq(yr.st,yr.en,1))\r\ncolnames(all.INTNS) = &quot;Year&quot;\r\nall.N50  = as.data.frame(seq(yr.st,yr.en,1))\r\ncolnames(all.N50) = &quot;Year&quot;\r\n#\r\nfor (i in 1:length(file.lst)){\r\n#for (i in 1:30){\r\nprint( c(&quot;Processing&quot;,file.lst[i]))\r\nfile.stat[i,1] = file.lst[i]\r\n# Read data into columns (the '-1' value removes the flags)\r\nsub1 = read.fwf(paste(my.path,'\/',file.lst[i],sep=&quot;&quot;),na.strings = &quot;-9999&quot;,\r\n                 width=c(11,4,2,4,rep(c(5,-1,-1,-1),31)))\r\n# Grab only meteo parameter of interest\r\nmet.data = sub1[sub1$V4==met.par,]\r\n# Grab only years of interest\r\nmet.data = met.data[met.data$V2&gt;=yr.st &amp; met.data$V2&lt;=yr.en,]\r\n# Compute the percent NA (make sure to ignore the header columns -&gt; e.g. use seq(-1...))\r\n# Note that for months with fewer than 31 days, missing days are also counted as NA. This\r\n# should not influence the results significantly.\r\n#\r\nmet.data.na = sum(as.numeric(is.na(met.data[,seq(-1,-4,-1)]))) # Total NA\r\nmet.data.num = sum(as.numeric(abs(is.na(met.data[,seq(-1,-4,-1)])-1)))\r\n# Some files have missing dates; we need to add the missing data to NA\r\nmet.data.na = (yr.days - (met.data.num + met.data.na)) + met.data.na\r\nmet.data.na.frac = met.data.na \/ yr.days\r\n# Output file stats to list\r\nfile.stat[i,] = c(file.lst[i],yr.days,met.data.num,met.data.na,met.data.na.frac)\r\nif (met.data.na.frac &lt;= na.cap) {\r\nmet.data.conv = cbind(met.data[2],met.data[3],met.data[,seq(-1,-4,-1)] * met.conv)\r\n                       * met.conv # Apply conversion factor to raw data &amp; remove superfluous columns\r\n# The following is parameter specific (precipitation in this example)\r\n# Note the use of 'seq' (e.g. [,seq(-1,-2,-1)]) to exclude columns (e.g. 1 through 2)\r\nmet.data.conv = cbind(met.data[2],met.data[3],met.data[,seq(-1,-4,-1)] * met.conv)\r\nprec.cnt.mon  = cbind(met.data.conv[1],met.data.conv[2],\r\n                      rowSums((met.data.conv[,seq(-1,-2,-1)] &gt; 0),na.rm=T))\r\nprec.cnt.yr = aggregate(prec.cnt.mon[,seq(-1,-2,-1)], by = prec.cnt.mon[1],\r\n                         FUN = &quot;sum&quot;) # total yearly count of rain events\r\nprec.tot.mon = cbind(met.data.conv[1],met.data.conv[2],rowSums(met.data.conv[,seq(-1,-2,-1)],na.rm=T))\r\nprec.tot.yr = aggregate(prec.tot.mon[,seq(-1,-2,-1)], by = prec.tot.mon[1], FUN = &quot;sum&quot;)\r\nprec.INTNS.yr = cbind(prec.tot.yr[1], prec.tot.yr[2] \/ prec.cnt.yr[2])\r\nprec.N50.mon = cbind(met.data.conv[1],met.data.conv[2],\r\n                      rowSums((met.data.conv[,seq(-1,-2,-1)] &gt;= 50),na.rm=T))\r\nprec.N50.yr = aggregate(prec.N50.mon[,seq(-1,-2,-1)], by = prec.N50.mon[1],\r\n               FUN = &quot;sum&quot;) # total yearly count of rain events above 50mm\r\nprec.Total.yr = prec.tot.yr\r\ncolnames(prec.Total.yr) = c(&quot;Year&quot;,gsub(file.ext,&quot;&quot;,file.lst[i]))\r\ncolnames(prec.INTNS.yr) = c(&quot;Year&quot;,gsub(file.ext,&quot;&quot;,file.lst[i]))\r\ncolnames(prec.N50.yr)   = c(&quot;Year&quot;,gsub(file.ext,&quot;&quot;,file.lst[i]))\r\nall.Total = merge(all.Total, prec.Total.yr, by=&quot;Year&quot;, all=T)\r\nall.INTNS = merge(all.INTNS, prec.INTNS.yr, by=&quot;Year&quot;, all=T)\r\nall.N50   = merge(all.N50, prec.N50.yr, by=&quot;Year&quot;, all=T)\r\n}\r\n}\r\n#\r\n# Convert rain indices to z-scores (normalized value) for each station (i.e. by column)\r\nall.Total.z = Zval(all.Total,robust.z)\r\nall.INTNS.z = Zval(all.INTNS,robust.z)\r\nall.N50.z   = Zval(all.N50,robust.z)\r\n#\r\n# Compute temporal trend for each station\r\nall.Total.coef   = StaReg(all.Total,F)\r\nall.Total.z.coef = StaReg(all.Total.z,F)\r\nall.INTNS.z.coef = StaReg(all.INTNS.z,F)\r\nall.N50.z.coef   = StaReg(all.N50.z,F)\r\n#\r\n# Rename the first column (Year) to original value if it was converted to a z-score\r\nall.Total.z.coef[,1] = all.Total.coef[,1]\r\nall.INTNS.z.coef[,1] = all.Total.coef[,1]\r\nall.N50.z.coef[,1]   = all.Total.coef[,1]\r\n#\r\n# Transpose new data subset so that the stations are listed as records\r\nall.Total.coef.t   = t(all.Total.coef)\r\nall.Total.z.coef.t = t(all.Total.z.coef)\r\nall.INTNS.t = t(all.INTNS)\r\nall.INTNS.z.coef.t = t(all.INTNS.z.coef)\r\nall.N50.t = t(all.N50)\r\nall.N50.z.coef.t  = t(all.N50.z.coef)\r\n#\r\n# Rename header name and remove redundant header\r\ncolnames(all.Total.coef.t) = all.Total.coef.t[1,]\r\ncolnames(all.Total.z.coef.t) = all.Total.z.coef.t[1,]\r\ncolnames(all.INTNS.t) = all.INTNS.t[1,]\r\ncolnames(all.INTNS.z.coef.t) = all.INTNS.z.coef.t[1,]\r\ncolnames(all.N50.t) = all.N50.t[1,]\r\ncolnames(all.N50.z.coef.t) = all.N50.z.coef.t[1,]\r\nall.Total.coef.t = data.frame(all.Total.coef.t)[-1,]\r\nall.Total.z.coef.t = data.frame(all.Total.z.coef.t)[-1,]\r\nall.INTNS.t = data.frame(all.INTNS.t)[-1,]\r\nall.INTNS.z.coef.t = data.frame(all.INTNS.z.coef.t)[-1,]\r\nall.N50.t = data.frame(all.N50.t)[-1,]\r\nall.N50.z.coef.t = data.frame(all.N50.z.coef.t)[-1,]\r\n#\r\n# Export to CSV file\r\nwrite.table(all.Total.coef.t,file=&quot;precip_Total_coef.csv&quot;, sep=&quot;,&quot;,\r\n            quote=F,col.names=NA,row.names=T,na=&quot;&quot;) # Set NA to blank\r\nwrite.table(all.Total.z.coef.t,file=&quot;precip_Total_z.csv&quot;, sep=&quot;,&quot;,\r\n            quote=F,col.names=NA,row.names=T,na=&quot;&quot;) # Set NA to blank\r\n<\/pre>\n<p><code style=\"font-size: 11pt\"><br \/>\n<\/code><\/p>\n<p>The following figure shows the standardized regression coefficients from a Universal Kriging analysis of the Precip_total_coef_all_z.csv file (the values are in fact correlation coefficients). The time period spans 1975 &#8211; 2009.<\/p>\n<p><a href=\"http:\/\/web.colby.edu\/mgimond\/files\/2010\/11\/1975_2007_Total_precip_z_regr.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-94\" src=\"http:\/\/web.colby.edu\/mgimond\/files\/2010\/11\/1975_2007_Total_precip_z_regr-300x231.jpg\" alt=\"\" width=\"300\" height=\"231\" srcset=\"https:\/\/web.colby.edu\/mgimond\/files\/2010\/11\/1975_2007_Total_precip_z_regr-300x231.jpg 300w, https:\/\/web.colby.edu\/mgimond\/files\/2010\/11\/1975_2007_Total_precip_z_regr-1024x791.jpg 1024w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Data files used with the following script were downloaded from: ftp:\/\/ftp.ncdc.noaa.gov\/pub\/data\/ghcn\/daily\/. The following figure shows the standardized regression coefficients from a Universal Kriging analysis of the Precip_total_coef_all_z.csv file (the values are in fact correlation coefficients). The time period spans 1975 &#8211; 2009.<\/p>\n","protected":false},"author":1199,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0,"footnotes":""},"categories":[12807,12808],"tags":[],"_links":{"self":[{"href":"https:\/\/web.colby.edu\/mgimond\/wp-json\/wp\/v2\/posts\/83"}],"collection":[{"href":"https:\/\/web.colby.edu\/mgimond\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/web.colby.edu\/mgimond\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/web.colby.edu\/mgimond\/wp-json\/wp\/v2\/users\/1199"}],"replies":[{"embeddable":true,"href":"https:\/\/web.colby.edu\/mgimond\/wp-json\/wp\/v2\/comments?post=83"}],"version-history":[{"count":22,"href":"https:\/\/web.colby.edu\/mgimond\/wp-json\/wp\/v2\/posts\/83\/revisions"}],"predecessor-version":[{"id":100,"href":"https:\/\/web.colby.edu\/mgimond\/wp-json\/wp\/v2\/posts\/83\/revisions\/100"}],"wp:attachment":[{"href":"https:\/\/web.colby.edu\/mgimond\/wp-json\/wp\/v2\/media?parent=83"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/web.colby.edu\/mgimond\/wp-json\/wp\/v2\/categories?post=83"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/web.colby.edu\/mgimond\/wp-json\/wp\/v2\/tags?post=83"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}