This study examines a trend in the 7-year seasonal (i.e., trimonthly) rainfall amounts (September 2001 to August 2008) observed by the TRMM Precipitation Radar (PR). The PR-observed rainfall amounts averaged over the range of 36°S to 36°N tend to increase slightly over this period. This tendency can be caused not only by natural variations but also by sampling errors (random errors) due to the PR's sparse observations in time. To separate the natural variations from the sampling errors, we developed a method of evaluating the sampling errors by a bootstrap method using the actual data observed by the PR. This method enables an evaluation of the regional differences and characteristics of the sampling error in 5° grid boxes. As an application of the simulated sampling errors, we tested the significance of the trends in the averaged rainfall amounts by using Akaike's Information Criterion (AIC) and concluded that the positive trends were significant in all three cases: the trend in rainfall amounts averaged over the entire observed area (0.283 mm (3 months)⁻²), that over land (0.518 mm (3 months)⁻²), and that over the ocean (0.208 mm (3 months)⁻²). We also tested the significance of the trends in the individual 5° grid boxes during this period and found that the regions with the largest trends correspond to those in which earlier studies have found the precipitation patterns caused by the El Niño-Southern Oscillation (ENSO).
1. Introduction
From the viewpoint of climate change detection, there has been increasing interest in the long term trend of global rainfall [e.g., Gu et al., 2007; Allan and Soden, 2007; Allan and Soden, 2008; Hilburn and Wentz, 2008; Allan et al., 2010]. The Precipitation Radar (PR) aboard the Tropical Rainfall Measuring Mission (TRMM) satellite [e.g., Simpson et al., 1988; Kummerow et al., 1998; Kozu et al., 2001] has an advantage that it can provide homogeneous rainfall observations in space with the horizontal resolution of about 5 km. For more than 11 years from the beginning of the mission in December 1997 to the instrument anomaly on 29 May 2009, all components of the PR including the 128 elements of the transmitting and receiving system had shown no instrument issues. (The PR was restarted using redundant components in mid-June 2009.) The long-term hardware stability of the PR is verified from the small monthly variation (less than 0.05 dB) of sea surface scattering cross sections in no rain conditions (K. Okamoto et al., Long term trend of ocean surface normalized radar cross section observed by TRMM precipitation radar, paper presented at XXVIIth General Assembly, Int. Union of Radio Sci., Maastricht, Netherlands, 2002).
In this study, seasonal (i.e., trimonthly) rainfall amounts from September 2001 to August 2008 observed by the TRMM/PR are analyzed. We focus our analysis on the PR data set only after the TRMM satellite altitude change, since the quality of data has changed after the TRMM satellite altitude was raised from 350 km to 402.5 km in August 2001 [Takahashi and Iguchi, 2004; Shimizu et al., 2009; Nakazawa and Rajendran, 2009]. The seasonal rainfall amounts are analyzed here because the TRMM satellite observes areas between roughly 36°S and 36°N at various hours of the day over the course of 46 days from a non-Sun-synchronous orbit. In other words, it takes 46 days to cover the diurnal cycle. Thus if there is a diurnal cycle in rainfall, a bias of the observation may remain in the monthly average, but it can be minimized in the seasonal average.
We find that the seasonal rainfall amounts averaged in the PR observation range tend to increase slightly over this period, as shown later. This trend can be due to not only natural variations such as the El Niño-Southern Oscillation (ENSO) but also sampling errors in the rainfall amount data. The seasonal rainfall products include large sampling errors due to the PR's sparse observations in time because of the narrow observation swath (245 km) of the PR.
This study develops a method of evaluating the sampling errors in PR-observed seasonal rainfall amounts by a bootstrap technique [e.g., Efron and Tibshirani, 1993]. As an application of the simulated sampling errors, we test the statistical significance of the trend in the time series of seasonal rainfall amounts during the 7 years using PR 2A25 Version 6 near-surface rain data [Iguchi et al., 2000, 2009].
Most previous studies of satellite sampling errors have used rainfall amount data obtained by surface-based rain gauges and/or radar [e.g., Laughlin, 1981; McConnell and North, 1987; Shin and North, 1988; Bell et al., 1990; Kedem et al., 1990; Seed and Austin, 1990; Graves et al., 1993; North et al., 1993; Oki and Sumi, 1994; Soman et al., 1995, 1996; Li et al., 1996; Steiner, 1996; Bell and Kundu, 1996, 2000, 2003; Steiner et al., 2003; Iida et al., 2006; Fisher, 2007]. Although a surface-based rainfall data set has high accuracy and high resolution, it can be obtained only in limited regions where surface-based observation equipment can be established. Therefore the sampling errors have been evaluated in such limited regions where the surface-based rainfall amount data set exists. To overcome this problem, some attempts have been undertaken to evaluate global sampling errors using a global rainfall amount data set with high temporal resolution derived from a numerical model and a multisatellite product [e.g., Lin et al., 2002; Nesbitt and Anders, 2009].
Lin et al. [2002] estimated the relative sampling errors using an hourly 2.25° gridded rainfall amount data set derived from a numerical model, the Colorado State University General Circulation Model (CSU GCM). Since the statistics of precipitation generated by a GCM are unlikely to represent real rainfall well [e.g., Waliser et al., 2003], the accuracy of relative sampling error estimates derived from the GCM output is questionable.
Nesbitt and Anders estimated the relative sampling errors using 3-hourly 0.25° gridded rainfall amounts by the TRMM Multisatellite Precipitation Analysis 3B42 data set [Huffman et al., 2007]. The 3B42 data set is a merged product from various instrument sources: microwave-based estimates from the TRMM Microwave Imager (TMI) aboard the TRMM satellite and the Special Sensor Microwave Imager (SSM/I) aboard the multiple Defense Meteorological Satellite Program (DMSP) satellites, infrared (IR) rainfall estimates from geostationary satellites, and surface rain gauges. Thus the 3B42 data set is considered to represent a more realistic rainfall compared with the GCM output. However, it has several important issues: calibration problems of the satellite sensors, inhomogeneous qualities of the rainfall estimates derived from multiple sources, and errors due to the sparse rain gauges in the Tropics.
[Figure omitted, see PDF]
Figure 1. Conceptual diagram of the bootstrap as it is applied to estimation of the sampling error in a PR-observed seasonal rainfall amount averaged over a 5° by 5° grid box.
[Figure omitted, see PDF]
Figure 2. Scatterplot of R against N observed by the PR during the 3 months from September to November 2001 in the 5° grid boxes in the PR observation range.
The bootstrap was introduced by Efron in 1979 as a computer-based method for evaluating the accuracy of an estimator, e.g., the standard error of a sample mean (see Figure 1 and Figure 2 in section 2). Its advantage is that it enables the estimation of the sampling error of a space-time averaged rainfall amount using the limited observation data that are available. PR observation data have the merit of nearly global coverage and high spatial resolution but are temporally very sparse. If we can estimate the precipitation distribution on the basis of PR data, we can draw samples from the estimated distribution with as many repetitions as we want and approximate the true distribution of the estimator calculated from the sample, which in our case is the space-time averaged rainfall amount. The key idea of the bootstrap is that the empirical distribution function calculated from the sample is assumed to be the true probability distribution function of the unknown population. We can then evaluate the sampling error as the standard deviation of the space-time averaged rainfall amounts derived from random samples drawn from the empirical distribution function.
In section 2, we develop a method of evaluating sampling errors of the PR-observed seasonal rainfall amounts by a bootstrap. Section 3 provides the results of sampling error evaluation using the current method for the PR data. As an application of the simulated sampling errors, section 4 shows tests of the statistical significance of the trend in the seasonal rainfall amounts. Section 5 gives the summary and conclusions.
2. Development of a Sampling Error Evaluation Method by a Bootstrap
2.1. Core Idea
We consider the true rainfall, that is, what fell on the ground over a 5° by 5° grid box (area A) during a particular season (period T), as indicated in Figure 1. We do not know the true rainfall in the space-time domain. The true rainfall rate at point (x, y) at time t is denoted by r(x, y, t). Here r(x, y, t) is a continuous function of x, y, and t and we cannot observe it. We do not consider r(x, y, t) as a random variable. We define the true rainfall amount Rt in the space-time domain as
$$R_t = \frac{1}{A}\int_{0}^{T}\!\!\int_{A} r(x, y, t)\,dx\,dy\,dt,$$
where the spatial integral is an integral over the area A. Rt is not an observable quantity. One of the goals of satellite rainfall observations is to estimate Rt as accurately as possible by the sample mean of the satellite rainfall observations.
The PR observes the rainfall in a 5° box n times during the season period T. An event that any portion of the PR swath occurs within the 5° box is referred to as a visit. The PR data provide the number of PR pixels N and the mean rainfall rate R over the box at each visit. We can obtain the PR-observed seasonal rainfall amount averaged in the space-time domain as
$$R_s = 24 \times 30 \times 3 \times \frac{\sum_{i=1}^{n} N(i)\,R(i)}{\sum_{i=1}^{n} N(i)}. \qquad (1)$$
Here, we assume that R(i) is given in mm/h and that 1 month consists of 30 days. We consider Rs as an unbiased estimate of Rt.
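As a small numerical illustration of equation (1), the sketch below assumes that the equation is the pixel-weighted mean of the per-visit rain rates multiplied by the number of hours in a 90-day season. The function name `seasonal_rainfall` and the sample values are hypothetical and are not taken from the PR data.

```python
HOURS_PER_SEASON = 24 * 30 * 3  # 3 months of 30 days each, as assumed in the text


def seasonal_rainfall(n_pixels, mean_rates):
    """Seasonal rainfall amount in mm (3 months)^-1 for one grid box.

    n_pixels[i]   -- number of PR pixels N(i) inside the box at visit i
    mean_rates[i] -- mean rain rate R(i) in mm/h over those pixels
    Assumed form: hours per season times the pixel-weighted mean rate.
    """
    if len(n_pixels) != len(mean_rates):
        raise ValueError("N and R must have one entry per visit")
    total_pixels = sum(n_pixels)
    if total_pixels == 0:
        raise ValueError("no pixels observed in the box")
    weighted_rate = sum(n * r for n, r in zip(n_pixels, mean_rates)) / total_pixels
    return HOURS_PER_SEASON * weighted_rate


# Two hypothetical visits: a dry partial overpass and a wetter, fuller overpass.
example_rs = seasonal_rainfall([100, 300], [0.0, 0.2])
print(example_rs)
```

Weighting by N(i) means that a visit covering only a small fraction of the box contributes proportionally less to the seasonal estimate.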
We are interested in how closely Rs represents the true rain amount Rt. We define the sampling error as
$$\varepsilon = R_s - R_t.$$
Many of the papers on satellite sampling errors (including Iida et al. [2006]) describe attempts to estimate the variance of this sampling error.
We can use the PR observation data themselves to estimate the sampling error in the rainfall totals by the bootstrap method. Figure 1 illustrates a conceptual diagram of the bootstrap [e.g., Efron and Tibshirani, 1993]. We assume that N and R are random variables and independent of each other. The independence of N and R is justified as described later. On the left side of the diagram is the real world. The PR gives the random samples N = (N(1), N(2), ..., N(n)) and R = (R(1), R(2), ..., R(n)) drawn from unknown probability distributions F and G, respectively:
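$$F \rightarrow \mathbf{N} = (N(1), N(2), \ldots, N(n)), \qquad G \rightarrow \mathbf{R} = (R(1), R(2), \ldots, R(n)).$$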
We define the probability distribution P in Figure 1 as a pair of the distributions F and G:
$$P = (F, G).$$
Once P is given, we can draw samples N and R from it and calculate Rs by substituting them into equation (1). If we denote the expectation of a random variable with respect to P by EP, the true rainfall amount Rt in the space-time domain is represented by
$$R_t = E_P[R_s],$$
and the sampling error of our interest can be evaluated as
$$\sigma = \sqrt{E_P\left[(R_s - R_t)^2\right]}.$$
The true value Rt and the value of σ are not known in the real world.
The bootstrap requires each element of the random sample (including the bootstrap sample) drawn from the unknown distribution P to be independent. In reality, however, the PR observation data are correlated with each other spatially and temporally, and in practice we need to decide how to deal with this rainfall correlation. We ignore the temporal rainfall correlation between different visits of PR observations, since the time intervals between the PR's successive visits are generally much longer than the temporal rainfall correlation time at almost all latitudes except for the midlatitudes around 34°N and 34°S. The spatial correlations among rainfall rates at individual pixels (the PR's instantaneous footprints of about 5 km) in a single visit are difficult to estimate. Therefore we use the averages R of individual visits over the 5° box, which can be treated as independent in the bootstrap samples, in analogy to the idea of the moving blocks bootstrap [e.g., Efron and Tibshirani, 1993; Zwiers, 1990; Wilks, 1997]. Using the spatial average R of the rainfall rates has the merit that the spatial rainfall correlation within the PR's fractional coverage of the box is taken into account implicitly.
The core idea of the bootstrap is to estimate the unknown distribution P by the empirical distribution function calculated from the observed random samples, which is known as the plug-in principle. The black wide arrow process in Figure 1 indicates the calculation of the empirical probability distribution function from the observed samples. Conceptually, this is the crucial step in the bootstrap. We estimate the unknown distributions F and G by the empirical distributions $\hat{F}$ and $\hat{G}$ calculated from N and from R, respectively. This idea is valid if the sample size n is large (the Glivenko-Cantelli theorem). It allows us to draw two bootstrap samples N* and R* by random sampling, with as many repetitions as we want, from the empirical distribution $\hat{P} = (\hat{F}, \hat{G})$ in the bootstrap world (the right side of Figure 1), that is,
$$\hat{F} \rightarrow \mathbf{N}^* = (N^*(1), N^*(2), \ldots, N^*(n)), \qquad \hat{G} \rightarrow \mathbf{R}^* = (R^*(1), R^*(2), \ldots, R^*(n)).$$
This is a major advantage of the bootstrap. By substituting N* and R* randomly drawn from $\hat{P}$ into equation (1), we can estimate Rs. We denote the realization of Rs obtained in this manner as Rs*. When the sample size n is large, we can approximate the true distribution P by $\hat{P}$:
$$P \approx \hat{P}.$$
Therefore, when we denote the expectation of a random variable with the distribution $\hat{P}$ as $E_{\hat{P}}$, the true rainfall amount Rt in the space-time domain can be approximated as
$$R_t \approx E_{\hat{P}}\left[R_s^*\right],$$
and the sampling error of our interest can be evaluated as the standard deviation of Rs*,
$$\sigma \approx \sqrt{E_{\hat{P}}\left[\left(R_s^* - E_{\hat{P}}[R_s^*]\right)^2\right]}.$$
We can directly calculate the value of the sampling error in the bootstrap world.
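The bootstrap-world calculation can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it resamples directly from the observed values (the plain plug-in principle) rather than from a smoothed histogram, it assumes that equation (1) is a pixel-weighted mean rate multiplied by the hours in a season, and the names `rs` and `bootstrap_box` as well as the input values are hypothetical.

```python
import random
import statistics

HOURS_PER_SEASON = 24 * 30 * 3  # 1 month = 30 days, as assumed in the text


def rs(n_pixels, mean_rates):
    """Assumed form of equation (1): pixel-weighted mean rate times hours."""
    weighted = sum(n * r for n, r in zip(n_pixels, mean_rates))
    return HOURS_PER_SEASON * weighted / sum(n_pixels)


def bootstrap_box(n_pixels, mean_rates, repetitions=1000, seed=0):
    """Return (mean, standard deviation) of the simulated Rs* for one box.

    Every entry of n_pixels is assumed to be positive (a visit always
    contains at least one pixel), so the denominator in rs() is never zero.
    """
    rng = random.Random(seed)
    n = len(n_pixels)
    simulated = []
    for _ in range(repetitions):
        n_star = rng.choices(n_pixels, k=n)    # N* drawn with replacement
        r_star = rng.choices(mean_rates, k=n)  # R* drawn independently of N*
        simulated.append(rs(n_star, r_star))
    return statistics.fmean(simulated), statistics.stdev(simulated)


# Hypothetical visits to one box: pixel counts and mean rain rates (mm/h).
mean_rs, sampling_error = bootstrap_box([120, 80, 200, 150], [0.0, 0.4, 0.0, 1.2])
print(mean_rs, sampling_error, sampling_error / mean_rs)
```

The standard deviation of the simulated values plays the role of the sampling error, and its ratio to the mean plays the role of the relative sampling error.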
N is not truly a random variable, since the set of N is highly predictable from the orbit calculation, and N and R may not seem to be independent of each other. However, actual PR data show that we can regard N and R as independent. Figure 2 shows the scatterplot of R against N observed by the PR during the 3 months from September to November 2001 in the 5° grid boxes in the PR observation range. The correlation coefficient between N and R is 0.05, so N and R are effectively uncorrelated with each other. No obvious structure can be seen in the distribution of the data points in Figure 2. Therefore we conclude that we can safely assume N and R to be independent.
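The correlation check described here can be reproduced with a plain Pearson coefficient. The sketch below is only an illustration: the function name `pearson_correlation` and the sample values are hypothetical. A near-zero value indicates the absence of a linear relationship, which is the evidence cited in the text alongside the lack of visible structure in the scatterplot.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-visit pixel counts N and mean rain rates R (mm/h).
n_values = [120, 80, 200, 150, 60, 180]
r_values = [0.0, 0.4, 0.1, 1.2, 0.0, 0.3]
print(pearson_correlation(n_values, r_values))
```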
The estimation of sampling errors in seasonal rainfall amounts averaged in the PR observation range of 36°S to 36°N starts from the assumption that the probability distribution functions inherent to the individual 5° grid boxes in the PR observation range are independent of each other. The distribution in each box is estimated from the random samples N and R of size n observed by the PR in the corresponding box. The PR-observed seasonal rainfall amount Rs*(k) in 5° box k (k = 1, 2, ..., 16 × 72) is a random variable with the distribution $\hat{P}(k)$. Rs*(k) in the mth simulation in the total of M simulations is denoted by Rs*(k, m) (m = 1, 2, ..., M). The Rs*(k, m) (k = 1, 2, ..., 16 × 72) for the 5° boxes are independent of each other in the individual simulations. The seasonal rainfall amounts $\bar{R}_s^*(m)$ averaged in the PR observation range can be represented as
$$\bar{R}_s^*(m) = \frac{1}{16 \times 72}\sum_{k=1}^{16 \times 72} R_s^*(k, m)$$
by simple averaging of Rs*(k, m) in the simulation m. When we think of the probability distribution function $\hat{P}_G = \{\hat{P}(k) \mid k = 1, 2, \ldots, 16 \times 72\}$ in the PR observation range, we can consider that $\bar{R}_s^*$ is a random variable with the probability distribution $\hat{P}_G$. We estimate the true distribution $P_G$ in the PR observation range by the empirical distribution $\hat{P}_G$. When the sample size n in the individual 5° boxes is large, we can approximate the true distribution $P_G$ as
$$P_G \approx \hat{P}_G.$$
Therefore, when we denote the expectation of a random variable with the distribution $\hat{P}_G$ as $E_{\hat{P}_G}$, the true space-time rainfall amount $\bar{R}_t$ averaged in the PR observation range can be approximated as
$$\bar{R}_t \approx E_{\hat{P}_G}\left[\bar{R}_s^*\right].$$
We can define the sampling error of our interest as
$$\bar{\varepsilon} = \bar{R}_s^* - \bar{R}_t.$$
The sampling error can be evaluated as the standard deviation of $\bar{R}_s^*$,
$$\bar{\sigma} \approx \sqrt{E_{\hat{P}_G}\left[\left(\bar{R}_s^* - E_{\hat{P}_G}[\bar{R}_s^*]\right)^2\right]}.$$
We can directly calculate the value of the sampling error in the bootstrap world.
2.2. Rainfall Amount in the 5° by 5° Grid Box
[Figure omitted, see PDF]
Figure 3. A flowchart of the estimation of the sampling error in an average rainfall amount by the bootstrap.
[Figure omitted, see PDF]
Figure 4. Zonal average of the number of PR visits (i.e., sample size n) over 5° by 5° grid box during the 3 months from September to November 2001. A visit is referred to as an event that a portion of the PR swath occurs within the grid box on its partial coverage.
We subdivide the PR observation range (36°S to 36°N) into 5° by 5° latitude/longitude grid boxes. Before we estimate the sampling error in the seasonal rainfall amount averaged in the PR observation range, we investigate the regional differences and characteristics of the sampling error in the 5° boxes (see Figure 3). For this purpose, we simulate the PR-observed seasonal rainfall amounts over 5° grid boxes from 40°S to 40°N using the bootstrap. The PR observes rainfall in a 5° box n times per 3 months. Note that the PR sometimes observes only a fraction of a grid box and that the sampling time interval is not always uniform. Figure 4 shows the zonal average of the number of PR observations n, which is calculated using the PR data during the 3 months from September 2001 to November 2001. The sample size n is almost independent of longitude. The zonal average of the sample size n ranges widely from 120 to 330 depending on the latitude of the grid box center. Similar results are found for the other 3-month periods. The PR-observed seasonal rainfall amount Rs (mm (3 months)⁻¹) averaged over the 5° box can be obtained from equation (1), where N(i) is the number of PR pixels over the box at the ith visit, and R(i) is the mean rainfall rate (mm h⁻¹) of the PR pixels at the ith visit. Thus, the simulation of Rs by equation (1) requires resampling N* and R* for n visits from the estimated populations of N and R, respectively. Here, we assume that N* and R* are independent of each other.
[Figure omitted, see PDF]
Figure 5. Probability distribution functions (PDFs) of N and R estimated from the PR 2A25 data during the 3 months from September to November 2001. (left) PDF of the number of PR pixels N [r = $\hat{F}(N)$] and (right) the PDF of the mean rainfall rate of the PR pixels R [r = $\hat{G}(R)$].
We estimate the probability distribution function (PDF) $\hat{F}(N)$ over the 5° grid box from the seasonal histogram constructed from the numbers of PR pixels {N(i), i = 1, 2, ..., n} actually obtained from the PR data. N(i) is obtained by counting the number of footprints in the region of overlap between the grid box and the PR swath at the ith visit. Figure 5 (left) shows an example of the estimated $\hat{F}(N)$, which is obtained using the PR data during the 3 months from September 2001 to November 2001 over the 5° grid box centered at (7.5°S, 177.5°E). The population with $\hat{F}(N)$ over the 5° grid box is assumed to be the true population. We resample n independent samples of N* from the population with $\hat{F}(N)$ using the relationship N* = $\hat{F}^{-1}$(r1*), where r1* is a uniform random number in the range (0, 1).
Similarly, for the mean rainfall rate R we assume a population with PDF $\hat{G}(R)$ over the 5° grid box. $\hat{G}(R)$ is estimated from the seasonal histogram constructed from the mean rainfall rates of the PR pixels {R(i), i = 1, 2, ..., n} calculated from the PR data (see Figure 5, right). As shown in Figure 5, the PDF of R* is estimated by linear interpolation on a logarithmic scale. We sample R* from the population with $\hat{G}(R)$ using the relationship R* = $\hat{G}^{-1}$(r2*), where r2* is a uniform random number that is independent of r1*. We assign the realization R* of R a value of zero when r2* is less than or equal to n0/n, that is, $\hat{G}(0)$, where n0 is the frequency of the R = 0 case. In this way we resample the n independent samples R*.
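One possible reading of the resampling step for R is sketched below as inverse-transform sampling with a point mass at zero. The details are assumptions: the sketch interpolates the inverse empirical distribution linearly in log R between the sorted positive observations, which may differ from the histogram-based interpolation used in the study, and the name `make_rate_sampler` and the sample values are hypothetical.

```python
import bisect
import math
import random


def make_rate_sampler(mean_rates):
    """Build an inverse-CDF sampler for the per-visit mean rain rate R."""
    n = len(mean_rates)
    positive = sorted(r for r in mean_rates if r > 0.0)
    n0 = n - len(positive)  # number of rain-free visits
    p_zero = n0 / n         # empirical probability of R = 0
    log_rates = [math.log(r) for r in positive]
    # Cumulative probability reached at each sorted positive rate.
    cdf = [(n0 + j + 1) / n for j in range(len(positive))]

    def inverse_cdf(u):
        if not positive or u <= p_zero:
            return 0.0                  # zero-rain branch: u <= n0/n
        j = bisect.bisect_left(cdf, u)  # first knot with cdf[j] >= u
        if j == 0:
            return positive[0]          # below the first positive knot
        j = min(j, len(cdf) - 1)
        frac = (u - cdf[j - 1]) / (cdf[j] - cdf[j - 1])
        # Interpolate linearly in log(R) between neighbouring knots.
        return math.exp(log_rates[j - 1] + frac * (log_rates[j] - log_rates[j - 1]))

    return inverse_cdf


# Hypothetical box with two dry visits and two rainy visits (mm/h).
sampler = make_rate_sampler([0.0, 0.0, 1.0, 100.0])
rng = random.Random(0)
r_star = [sampler(rng.random()) for _ in range(4)]  # one bootstrap sample of R*
print(r_star)
```

Handling the dry visits as a separate branch keeps the logarithm well defined and reproduces the observed fraction of rain-free visits on average.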
By substituting the simulated N*(i) and R*(i) (i = 1, 2, ..., n) into equation (1), an estimate of the seasonal rainfall amount Rs* can be obtained. The set of Rs* values simulated by 1000 repetitions of this process is assumed to be the true population of the PR-observed seasonal rainfall amount. Estimating a confidence interval requires approximately 1000 repetitions, whereas estimating a standard deviation requires only 100 to 200 repetitions [Efron and Tibshirani, 1993]. This study uses 1000 repetitions to estimate the standard deviation. The sampling error is calculated as the standard deviation of the 1000 simulated seasonal rainfall amounts Rs*. Since we are unable to know the true population of the seasonal rainfall amounts, the true value Rt is assumed to be equal to the average of the 1000 simulated seasonal rainfall amounts. We regard the relative standard deviation, that is, the standard deviation divided by Rt, as a measure of the relative sampling error.
A limitation of the current method lies in the construction of the PDF. Here, the PDF is calculated from the PR data during 3 months, while the PR takes 46 days to cover the diurnal cycle and its observation swath is narrow, as noted in section 1. The PDF can therefore be affected by seasonal and longer variability. On the other hand, intraseasonal and shorter variability cannot be fully described because of the sparse sampling of the PR. However, Nakazawa and Rajendran [2009] indicated that the annual cycle and interannual variation are dominant in the monthly PR product. Therefore we emphasize that our target is the rainfall variability with time scales longer than 3 months.
2.3. Rainfall Amount Averaged in the PR Observation Range of 36°S to 36°N
[Figure omitted, see PDF]
Figure 6. Distribution of the 1000 simulated rainfall anomalies [mm (3 months)⁻¹] of the seasonal rainfall amounts averaged in the range of 40°S to 40°N from September to November 2001.
Figure 3 illustrates a flowchart of the estimation of sampling errors in seasonal rainfall amounts averaged in the PR observation range of 36°S to 36°N. We simulate the 1000 seasonal rainfall amounts Rs* over each of the 5° boxes, then average them over all the 5° grid boxes to obtain 1000 seasonal rainfall amounts averaged in the PR observation range. We consider that the seasonal rainfall amounts Rs* over the 5° grid boxes are not correlated with each other. For each season period, we obtain the distribution of the 1000 simulated averaged rainfall anomalies, that is, the deviation of the mth simulated average (m = 1, 2, ..., 1000) from the average of the 1000 values. Figure 6 demonstrates that the simulated rainfall anomalies tend to follow the normal distribution. We regard the standard deviation of the 1000 simulated averages as a measure of the sampling error and the relative standard deviation, that is, the standard deviation divided by the average of the 1000 values, as a measure of the relative sampling error.
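The averaging step can be sketched as follows, under the same assumptions as the text (boxes simulated independently, simple unweighted averaging over boxes) plus two of our own: equation (1) is taken to be a pixel-weighted mean rate multiplied by the hours in a season, and resampling is done directly from the observed values. The names `rs` and `simulate_domain_average` and the input values are hypothetical.

```python
import random
import statistics

HOURS_PER_SEASON = 24 * 30 * 3  # 1 month = 30 days, as assumed in the text


def rs(n_pixels, mean_rates):
    """Assumed form of equation (1): pixel-weighted mean rate times hours."""
    weighted = sum(n * r for n, r in zip(n_pixels, mean_rates))
    return HOURS_PER_SEASON * weighted / sum(n_pixels)


def simulate_domain_average(boxes, repetitions=1000, seed=0):
    """Bootstrap the box-averaged seasonal rainfall.

    boxes -- list of (n_pixels, mean_rates) tuples, one per 5-degree box
    Returns (mean, sampling_error, anomalies) of the simulated averages.
    """
    rng = random.Random(seed)
    averages = []
    for _ in range(repetitions):
        per_box = []
        for n_pixels, mean_rates in boxes:
            n = len(n_pixels)
            n_star = rng.choices(n_pixels, k=n)    # each box resampled
            r_star = rng.choices(mean_rates, k=n)  # independently of the others
            per_box.append(rs(n_star, r_star))
        averages.append(statistics.fmean(per_box))  # simple average over boxes
    mean = statistics.fmean(averages)
    anomalies = [a - mean for a in averages]
    return mean, statistics.stdev(averages), anomalies


# Hypothetical domain made of 16 boxes that share the same observed visits.
box = ([100, 150, 200, 120, 180, 90], [0.0, 0.0, 0.3, 0.0, 1.5, 0.1])
mean_16, error_16, anomalies_16 = simulate_domain_average([box] * 16, repetitions=300, seed=1)
print(mean_16, error_16 / mean_16)
```

Because the boxes are treated as independent, the sampling error of the average shrinks as more boxes are included, which is consistent with the small relative errors reported for the whole observation range.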
3. Results of Sampling Error Evaluation
3.1. Sampling Errors in the 5° by 5° Grid Boxes
Figure 7a. Results of estimated relative sampling errors in 5° grid boxes for the 3 months, i.e., boreal fall season period, from September to November 2001. (top) Mean [mm (3 months)⁻¹] and (bottom) Std.dev./Mean [%].
Using the current method of sampling error evaluation, we estimated the sampling errors in the seasonal rainfall amounts over 5° grid boxes for the 7 years from September 2001 to August 2008. For each season period, we simulated the 1000 seasonal rainfall amounts in each 5° grid box using equation (1) and calculated the mean value and the standard deviation. Figures 7a-7d show the estimated results for the four season periods from September 2001 to August 2002. The top panels show the mean values (mm (3 months)⁻¹), and the bottom panels show the relative standard deviations (%). In regions of low (high) rainfall, relative standard deviations are generally found to be large (small). This tendency is consistent with the theoretical prediction by Bell and Kundu [2000]. We found that the relative standard deviations (%) over the oceanic grid boxes tend to be smaller than those over the land grid boxes. Steiner et al. [2003] reached a similar conclusion by comparing the relative sampling errors in oceanic rainfall and rainfall over land, which were obtained in previous studies using radar and/or rain gauges. They pointed out that a larger relative sampling error over land than over ocean is consistent with a larger variability and shorter correlation time of rainfall over land than over ocean.
Figure 7b. As in Figure 7a except for the boreal winter season period from December 2001 to February 2002.
Figure 7c. As in Figure 7a except for the boreal spring season period from March to May 2002.
Figure 7d. As in Figure 7a except for the boreal summer season period from June to August 2002.
The current method enables a near-global evaluation of regional differences and characteristics of sampling errors. Figures 7a-7d also demonstrate regional differences in the relative sampling errors in the seasonal rainfall amounts. Figure 7a shows that the relative sampling errors are relatively large over the Amazon and inland Australia, and near the equator in Africa, in spite of the large seasonal rainfall amounts Rt over those regions. Conversely, the relative sampling error over the southern Pacific and the southern Atlantic Ocean is not as large as expected from Bell and Kundu, in spite of the small seasonal rainfall amounts there. Bell and Kundu [1996, 2000] indicated theoretically that the relative sampling error depends on the rainfall variability and the space-time correlation length of rainfall in addition to the mean rainfall amounts and the sampling frequency. The discrepancy between the sampling errors in this study and the theoretical prediction by Bell and Kundu is possibly due to differences in the rainfall variability and the space-time correlation length of point rainfall in those regions. These discrepancies are also found in the other three season periods (see Figures 7b, 7c, and 7d). This issue needs further investigation.
Figures 7a-7d also show the seasonal variations in sampling errors. The seasonal differences in relative sampling errors are pronounced in the Asian monsoon regions, Central Asia, South Africa, northwestern and central Australia, the northeastern Pacific Ocean, South America, and the southern Atlantic Ocean. In particular, the differences in relative sampling errors between the boreal winter and boreal summer season periods are large in those regions. The seasonal differences are mainly due to seasonal rainfall amounts, as predicted by Bell and Kundu [2000]. In these regions, however, we also find discrepancies between the relative sampling errors in this study and the theoretical prediction by Bell and Kundu [2000].
The magnitudes of the relative sampling errors evaluated by the current method are compared to those obtained from the empirical power law scaling formula of Iida et al. [2006]. We tested it over two 5° grid boxes ([30°-35°N, 130°-135°E] and [30°-35°N, 135°-140°E]) around Japan during the boreal winter (December to February) from 2001 to 2008. Iida et al. simulated relative sampling errors by direct sampling of hourly radar-AMeDAS analysis data (rain gauge-calibrated radar data) around Japan during 3 years (1998-2000) using realistic observational patterns with the swath width of the TRMM Microwave Imager (TMI). They formulated the estimated relative sampling errors as a function of the space scale (grid box size) from 0.1° to 5°, the time scale (rainfall averaging time interval) from 1 to 30 days, and the space-time average rainfall rate (mm h⁻¹). The formula is applied to the seasonal mean rainfall amount Rt in this study over grid boxes with a spatial scale of 5° for each year from 2001 to 2008, where we extrapolate to the time scale of 90 days. The relative sampling errors of the TMI are smaller than those of the PR, since the number of TMI observations per 3 months over the grid box is approximately 3 times that of the PR. Therefore we translate the TMI results to relative sampling errors of the PR using the relationship of Bell and Kundu, which indicates that the relative sampling error is inversely proportional to the square root of the effective number of observations. By this translation, relative sampling errors of 20%-25% were obtained. These values are comparable to the results of this study, which gives relative standard deviations of 17%-24% for the corresponding boxes. The current method ignores the temporal rainfall correlation between PR visits, whereas the empirical formula of Iida et al. considers it. Because of the shorter time intervals between PR visits at the midlatitudes around 34°N and 34°S, some of the PR visits may be temporally correlated with each other, which may explain the slightly smaller error estimates in this study.
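The translation from TMI to PR sampling errors uses only the inverse square-root dependence on the effective number of observations. A minimal sketch of that scaling is given below; the function name `translate_relative_error` and the input numbers are hypothetical and are not values from the study.

```python
import math


def translate_relative_error(rel_error_ref, n_ref, n_target):
    """Rescale a relative sampling error between two sampling frequencies.

    Assumes the relative error is proportional to 1 / sqrt(effective number
    of observations), so fewer observations give a larger relative error.
    """
    return rel_error_ref * math.sqrt(n_ref / n_target)


# A hypothetical 12% relative error for a sensor with 3 times as many visits.
print(translate_relative_error(12.0, n_ref=300, n_target=100))
```

With a threefold difference in the number of observations, the relative error is multiplied by the square root of 3, that is, by roughly 1.7.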
3.2. Sampling Error in the Rainfall Amount Averaged in the PR Observation Range
[Figure omitted, see PDF]
Figure 8. Results of estimated relative sampling errors (Std.dev./Mean [%]) for the 3 months from September 2001 to August 2008 (square symbol: the range of 40°S to 40°N [total: land and ocean], cross symbol: over land, triangle symbol: over ocean).
We estimated the sampling errors in the seasonal rainfall amounts averaged in the PR observation range for the 7 years from September 2001 to August 2008. Figure 8 shows the relative sampling errors estimated from the 1000 simulated seasonal rainfall amounts. For the 3 months from September to November 2001, the standard deviation was estimated to be 1.82 mm (3 months)⁻¹, which is approximately 0.8% of the mean value of 230.6 mm (3 months)⁻¹.
Figure 8 also shows the relative sampling errors in the rainfall amounts averaged over land and ocean separately. The estimated relative sampling errors over the ocean are smaller than those over the land during the whole period. The errors in the total domain are the lowest. This is considered to be due to the difference of the rainfall amount and the number of the 5° boxes (i.e., averaging grid size), as shown in the previous studies [e.g., Bell and Kundu, 2000; Steiner et al., 2003]. As the rainfall amount and the averaging grid size are larger, the relative sampling error becomes smaller.
4. Significance Test of a Trend in Seasonal Rainfall Amounts
[Figure omitted, see PDF]
Figure 9. Results of the fit of two lines (first-order model: red line, zero-order model: blue line) to the time series of the 7-year seasonal rainfall amounts weighted with the simulated sampling errors. (a) Average in the range of 40°S to 40°N [land and ocean], (b) average over land, and (c) average over ocean. Error bars (green) attached to black lines show the sampling errors (standard deviations).
In this section, we test the statistical significance of the trend in the time series of 7-year seasonal rainfall amounts from September 2001 to August 2008 observed by the PR, as an application of the simulated sampling errors in section 3. One of the purposes of this paper is to quantify the trend of the seasonal rainfall amounts considering the sampling error only and to determine whether the trend is statistically significant. Figure 9 shows the time series of the seasonal rainfall amounts averaged in the PR observation range from 40°S to 40°N. We plot the seasonal rainfall amounts yi (i = 0, 1, 2, ..., 27) as a function of the index of the 3-month period xi = i (i = 0, 1, 2, ..., 27). The error bar attached to each seasonal rainfall amount yi indicates the simulated sampling error (standard deviation σi). We assume two models and examine which model better explains the time series: a first-order linear regression model (y = ax + b, a ≠ 0) and a zero-order linear regression model (y = b, a = 0). We fit the two lines to the time series of seasonal rainfall amounts weighted with the simulated sampling errors (standard deviations σi, i = 0, 1, 2, ..., 27) and calculate the coefficients.
The selection of an appropriate model is judged quantitatively by Akaike's Information Criterion (AIC) [e.g., Akaike, 1974]: we consider the model with the smaller AIC value to be the better model. The absolute value of the AIC has no meaning in itself; only the difference between the AIC values of the two models is meaningful. When the difference between the AIC values of the two models is larger than 1, the difference is considered to be statistically significant [e.g., Shimizu, 1993]. The use of the AIC is related to hypothesis testing in that the difference of the AIC between two statistical models is distributed as the chi-square statistic (χ²) with s degrees of freedom, where s is the difference in the number of parameters between the two models [e.g., Sexton et al., 2003]. When the first-order linear regression model is more suitable than the zero-order linear regression model, we conclude that the trend is significant.
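As a worked relation under standard assumptions (it is not taken from the paper's own equations), the usual definition of the AIC makes the link to the chi-square statistic explicit. The symbols \hat{l}_m (maximized log likelihood of the order-m model) and k_m (its number of estimated parameters, here k_0 = 1 and k_1 = 2) are introduced only for this illustration:

$$
\mathrm{AIC}(m) = -2\,\hat{l}_m + 2k_m, \qquad
\mathrm{AIC}(0) - \mathrm{AIC}(1) = 2\,(\hat{l}_1 - \hat{l}_0) - 2s, \quad s = k_1 - k_0 = 1 .
$$

The term 2(\hat{l}_1 - \hat{l}_0) is the likelihood ratio statistic, which under the zero-order model follows a χ² distribution with s degrees of freedom. Read this way, the AIC difference is that statistic shifted by the penalty 2s for the extra parameter.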
The calculation of the AIC involves the estimation of the model parameters (a and b) by the maximum likelihood method. We assume a normal distribution model that has the mean value axi + b and the standard deviation σi (the simulated sampling error) for each 3-month period xi (i = 0, 1, 2, …, 27):
$$
f(y_i \mid a, b) = \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left[-\frac{(y_i - a x_i - b)^2}{2\sigma_i^2}\right],
$$
since the simulated rainfall anomalies in Figure 5 tend to follow the normal distribution. The model with a ≠ 0 is the first-order linear regression model (y = ax + b) and the model with a = 0 is the zero-order linear regression model (y = b). The first-order linear regression model has two parameters, a and b, whereas the zero-order model has one parameter, b. We assume that the seasonal rainfall amounts yi (i = 0, 1, 2, …, 27) are independent random variables. We estimate the model parameters by maximizing the log likelihood:
$$
l(a, b) = -\frac{1}{2}\sum_{i=0}^{27} \log\left(2\pi\sigma_i^2\right) - \frac{1}{2}\sum_{i=0}^{27} \frac{(y_i - a x_i - b)^2}{\sigma_i^2} .
$$
Thus the maximum likelihood method reduces to the weighted least squares method, which minimizes the last term of the equation (omitting the coefficient 1/2), with each point weighted by its sampling error.
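The fitting and model selection described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `fit_and_aic`, the synthetic time series, and the constant sampling error are hypothetical, and the AIC is computed with the standard definition (−2 × maximized log likelihood + 2 × number of parameters), which the text does not spell out.

```python
import numpy as np

def fit_and_aic(x, y, sigma, order):
    """Fit y = a*x + b (order=1) or y = b (order=0) with known sigma_i.

    Returns the fitted coefficients and the AIC, assuming independent
    normal errors with standard deviation sigma_i at each point.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    if order == 1:
        # Columns: x (slope a) and 1 (intercept b)
        design = np.column_stack([x, np.ones_like(x)])
    else:
        # Single column of ones (constant b only)
        design = np.ones((x.size, 1))
    # Weighted least squares: divide each row by sigma_i, then solve
    inv_sigma = 1.0 / sigma
    coef, *_ = np.linalg.lstsq(design * inv_sigma[:, None], y * inv_sigma, rcond=None)
    resid = y - design @ coef
    # Log likelihood of the normal model at the fitted parameters
    loglik = (-0.5 * np.sum(np.log(2.0 * np.pi * sigma**2))
              - 0.5 * np.sum(resid**2 / sigma**2))
    n_params = design.shape[1]
    aic = -2.0 * loglik + 2.0 * n_params
    return coef, aic

# Hypothetical data: 28 seasons, made-up trend and sampling error
rng = np.random.default_rng(0)
x = np.arange(28, dtype=float)
sigma = np.full(28, 3.0)
y = 300.0 + 0.5 * x + rng.normal(0.0, sigma)

coef1, aic1 = fit_and_aic(x, y, sigma, order=1)
coef0, aic0 = fit_and_aic(x, y, sigma, order=0)
print("slope a:", coef1[0], "AIC(0) - AIC(1):", aic0 - aic1)
```

Because the σi are treated as known, the log(2πσi²) term is the same for both models and cancels in AIC(0) − AIC(1), so only the weighted residual sums and the parameter counts affect the comparison.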
Table 1 summarizes the slopes a and AIC values of the first-order and zero-order linear regression models. Figure 9 shows the fit of the two lines to the 7-year seasonal rainfall amounts with the simulated sampling errors. Since the difference between the AIC values of the two models (AIC(0) − AIC(1)) is larger than 1, we consider that the first-order linear regression model is significantly more suitable than the zero-order linear regression model and that the trend is significant. In addition, the slope a of the first-order linear regression model is positive. Thus we conclude that the trend in the seasonal rainfall amounts during this period is significantly positive and that this trend is not due to the sampling error.
Moreover, we test the statistical significance of the trends in the seasonal rainfall amounts averaged over land and ocean separately during this period. Table 1 and Figure 9 summarize the results of the significance test considering the simulated sampling errors. The magnitude of the slope a over the ocean is close to that over the total domain (land and ocean), and the magnitude of the slope a over the land is larger than that over the total domain. Since the difference between the AIC values of the two models (AIC(0) − AIC(1)) is larger than 1 both over land and over ocean, we conclude that the first-order linear regression model with a positive slope is selected (that is, the trend is significantly positive) in both cases. The trend over land is about 2.5 times that over the ocean.
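The comparisons in the two preceding paragraphs can be checked directly from the values in Table 1. The short sketch below only reproduces that arithmetic; the dictionary layout and variable names are illustrative.

```python
# Values copied from Table 1: slope a, AIC(1), AIC(0)
table1 = {
    "Total": {"a": 0.283, "aic1": 189, "aic0": 237},
    "Land": {"a": 0.518, "aic1": 214, "aic0": 245},
    "Ocean": {"a": 0.208, "aic1": 187, "aic0": 201},
}

# AIC(0) - AIC(1) for each domain; the text uses a threshold of 1
deltas = {name: row["aic0"] - row["aic1"] for name, row in table1.items()}
for name, delta in deltas.items():
    print(f"{name}: AIC(0) - AIC(1) = {delta}, first-order model selected: {delta > 1}")

# Ratio of the land trend to the ocean trend
ratio = table1["Land"]["a"] / table1["Ocean"]["a"]
print(f"land/ocean slope ratio = {ratio:.2f}")
```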
[Figure omitted, see PDF]
Figure 10. The 5° boxes with significant trends in the time series of 7-year seasonal rainfall amounts.
Finally, we tested the statistical significance of the trend in the seasonal rainfall amounts in individual 5° grid boxes. Figure 10 shows the 5° grid boxes in which the trend is statistically significant. These boxes account for about 55% of the total number of 5° grid boxes. The grid boxes with positive and negative trends account for about 27% and 28% of the total, respectively. The significant trends range widely, from −19 mm (3 months)⁻² to 18 mm (3 months)⁻². The largest positive trends are found over the western Pacific Ocean near the equator and the Atlantic Ocean near the equator, whereas the largest negative trends are found over the central Pacific Ocean near the equator and the Indian Ocean near Malaysia. These regions with the largest trends correspond to those in which earlier studies [e.g., Ropelewski and Halpert, 1987; Dai and Wigley, 2000; Curtis and Adler, 2003; Nakazawa and Rajendran, 2009] have found the precipitation patterns caused by the ENSO. The distributions of the largest trends are therefore possibly due to the ENSO. In fact, in this period, an El Niño event was observed from boreal summer 2002 to boreal winter 2002/2003 and La Niña events were observed from boreal fall 2005 to boreal spring 2008. Over the Pacific Ocean, we can find two diagonal bands: large positive trends from [0°, 135°E] to [40°S, 135°W] and large negative trends from [0°, 175°E] to [40°S, 95°W]. Large negative trends are also found over the central South American continent, near the Rio de la Plata of South America, over the Mississippi River area, and over central Africa.
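A per-box version of the test might look like the sketch below. It assumes that the same AIC criterion (AIC(0) − AIC(1) > 1) is applied to each 5° grid box independently, which the text implies but does not state explicitly. The function name `classify_trends`, the array shapes, and the three-box example are hypothetical.

```python
import numpy as np

def classify_trends(y, sigma, threshold=1.0):
    """Classify the trend in each grid box.

    y, sigma: arrays of shape (n_boxes, n_seasons) holding the seasonal
    rainfall amounts and their sampling errors (standard deviations).
    Returns the slope a, AIC(0) - AIC(1), and a label per box:
    +1 significant positive, -1 significant negative, 0 not significant.
    """
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2
    x = np.arange(y.shape[1], dtype=float)
    # Closed-form weighted least squares sums, one value per box
    s = w.sum(axis=1)
    sx = (w * x).sum(axis=1)
    sy = (w * y).sum(axis=1)
    sxx = (w * x * x).sum(axis=1)
    sxy = (w * x * y).sum(axis=1)
    det = s * sxx - sx**2
    a = (s * sxy - sx * sy) / det          # first-order slope
    b1 = (sxx * sy - sx * sxy) / det       # first-order intercept
    b0 = sy / s                            # zero-order constant
    chi2_1 = (w * (y - a[:, None] * x - b1[:, None]) ** 2).sum(axis=1)
    chi2_0 = (w * (y - b0[:, None]) ** 2).sum(axis=1)
    # The log(2*pi*sigma^2) terms cancel; the extra parameter costs 2
    delta_aic = chi2_0 - chi2_1 - 2.0
    label = np.where(delta_aic > threshold, np.sign(a), 0.0).astype(int)
    return a, delta_aic, label

# Three made-up boxes: rising, falling, and flat
x = np.arange(28, dtype=float)
y_boxes = np.vstack([10.0 + 2.0 * x, 50.0 - 2.0 * x, np.full(28, 10.0)])
sigma_boxes = np.ones_like(y_boxes)
slopes, delta_aic, labels = classify_trends(y_boxes, sigma_boxes)
print(labels)
```

The fractions of boxes with positive and negative significant trends would then follow from counting the labels, for example `np.mean(labels == 1)`.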
5. Summary and Conclusion
This study examined a trend in the 7-year seasonal rainfall amounts (September 2001 to August 2008) observed by the TRMM/PR. We found that the PR-observed rainfall amounts averaged in the range of 40°S to 40°N (PR observation range) tend to increase slightly over this period. This tendency can result not only from natural variations but also from sampling errors caused by the PR's sparse observations in time. Therefore, this study evaluated the sampling errors by applying a bootstrap technique to the actual data observed by the PR and tested the statistical significance of the positive trend in the PR-observed averaged rainfall amounts during the 7 years, considering the simulated sampling errors.
The main results in this study are summarized as follows:
1. We developed a method of evaluating a sampling error by a bootstrap using the PR data.
2. We showed that this method enables an evaluation of regional differences and characteristics of sampling errors over 5° grid boxes in the PR observation range. We found that the relative sampling errors in the grid boxes over land tend to be larger than those in oceanic grid boxes. In addition, the relative sampling errors tend to be smaller for larger seasonal rainfall amounts, as predicted by Bell and Kundu [2000]. However, discrepancies between the estimated relative sampling error and the prediction by Bell and Kundu were found in some regions.
3. We evaluated relative sampling errors in rainfall amounts averaged in the total domain, over land, and over ocean separately. We found that the estimated relative sampling errors over the ocean are smaller than those over the land for the 7 years. The errors of the total domain were the lowest.
4. The positive trend of 0.283 mm (3 months)⁻² was significant in the 7-year seasonal rainfall amounts averaged in the PR observation range. Moreover, we investigated trends in the seasonal rainfall amounts averaged over land and ocean separately during this period. The trends were significantly positive both over land and ocean. The significant trend was 0.518 mm (3 months)⁻² over land and 0.208 mm (3 months)⁻² over ocean.
5. We tested the significance of the trend in the 7-year seasonal rainfall amounts in 5° grid boxes in the PR observation range. The number of the grid boxes in which we conclude that the trends are statistically significant was about 55% of the total number of grid boxes.
6. The distributions of the largest positive trends were found over the western Pacific Ocean near the equator and the Atlantic Ocean near the equator, whereas the distributions of the largest negative trends were found over the central Pacific Ocean near the equator and the Indian Ocean near Malaysia. These regions with the largest trends correspond to those in which the earlier studies have found the precipitation patterns caused by the ENSO.
We conclude that the trends in the PR-observed seasonal rainfall amounts averaged over the total domain, over the land, and over the ocean during this period are significantly positive, and these trends are not due to the sampling error. We also showed the 5° grid boxes in which we conclude that the trends are statistically significant.
The PR has an advantage that it can provide a homogeneous quality of data in space and time. This study indicated that the seasonal rainfall amount data set derived from the PR data is useful for climate analysis such as the long-term trend detection.
Note that some of the issues related to retrieval error in the PR algorithm that may contribute to these trends are not dealt with here. Slowly varying retrieval errors might hamper interpretation of the trends presented in this paper. For example, Kozu et al. [2009] have recently shown that the PR has some skill in estimating DSD parameters in a climatological sense. One could expect that the ENSO cycle has some influence on the DSD that is not fully handled by the PR rain retrieval algorithm. Variations of this kind might lead to nonrandom measurement errors that result in an apparent trend, so the trend estimates may be affected by measurement errors in addition to the sampling errors considered here.
Acknowledgments
The authors would like to thank N. Kashiwagi of the Institute of Statistical Mathematics and three anonymous reviewers of the manuscript for their valuable comments. The paper was carefully proofread by D. Short of the National Institute of Information and Communications Technology.
Citation: Iida, Y., T. Kubota, T. Iguchi, and R. Oki (2010), Evaluating sampling error in TRMM/PR rainfall products by the bootstrap method: Estimation of the sampling error and its application to a trend analysis, J. Geophys. Res., 115, D22119, doi:10.1029/2010JD014257.
Copyright 2010 by the American Geophysical Union.
Table 1. Slopes a and AIC Values of the First-Order and Zero-Order Linear Regression Models(a)
| Domain | a | AIC(1) | AIC(0) |
| Total | 0.283 | 189 | 237 |
| Land | 0.518 | 214 | 245 |
| Ocean | 0.208 | 187 | 201 |
(a) Here a is the slope of the first-order linear regression model y = ax + b [mm (3 months)⁻²], AIC(1) is the AIC of the first-order linear regression model, and AIC(0) is the AIC of the zero-order linear regression model.
Akaike, H. (1974), A new look at the statistical model identification, IEEE Trans. Autom. Control, 19, 716-723, doi:10.1109/TAC.1974.1100705.
Allan, R. P., and B. J. Soden (2007), Large discrepancy between observed and simulated precipitation trends in the ascending and descending branches of the tropical circulation, Geophys. Res. Lett., 34, L18705, doi:10.1029/2007GL031460.
Allan, R. P., and B. J. Soden (2008), Atmospheric warming and the amplification of precipitation extremes, Science, 321, 1481-1484, doi:10.1126/science.1160787.
Allan, R. P., B. J. Soden, V. O. John, W. Ingram, and P. Good (2010), Current changes in tropical precipitation, Environ. Res. Lett., 5, 025205, doi:10.1088/1748-9326/5/2/025205.
Bell, T. L., and P. K. Kundu (1996), A study of the sampling error in satellite rainfall estimate using optimal averaging of data and a stochastic model, J. Clim., 9, 1251-1268, doi:10.1175/1520-0442(1996)009<1251:ASOTSE>2.0.CO;2.
Bell, T. L., and P. K. Kundu (2000), Dependence of satellite sampling error on monthly averaged rain rates: Comparison of simple models and recent studies, J. Clim., 13, 449-462, doi:10.1175/1520-0442(2000)013<0449:DOSSEO>2.0.CO;2.
Bell, T. L., and P. K. Kundu (2003), Comparing satellite rainfall estimates with rain gauge data: Optimal strategies suggested by a spectral model, J. Geophys. Res., 108(D3), 4121, doi:10.1029/2002JD002641.
Bell, T. L., A. Abdullah, R. L. Martin, and G. R. North (1990), Sampling errors for satellite-derived tropical rainfall: Monte Carlo study using a space-time stochastic model, J. Geophys. Res., 95(D3), 2195-2205, doi:10.1029/JD095iD03p02195.
Curtis, S., and R. F. Adler (2003), Evolution of El Niño-precipitation relationships from satellites and gauges, J. Geophys. Res., 108(D4), 4153, doi:10.1029/2002JD002690.
Dai, A., and T. M. L. Wigley (2000), Global patterns of ENSO-induced precipitation, Geophys. Res. Lett., 27(9), 1283-1286, doi:10.1029/1999GL011140.
Efron, B., and R. J. Tibshirani (1993), An Introduction to the Bootstrap, Chapman and Hall, London.
Fisher, B. L. (2007), Statistical error decomposition of regional-scale climatological precipitation estimates from the Tropical Rainfall Measuring Mission (TRMM), J. Appl. Meteorol. Climatol., 46, 791-813, doi:10.1175/JAM2497.1.
Graves, C. E., J. B. Valdés, S. S. P. Shen, and G. R. North (1993), Evaluation of sampling errors of precipitation from spaceborne and ground sensors, J. Appl. Meteorol., 32, 374-385, doi:10.1175/1520-0450(1993)032<0374:EOSEOP>2.0.CO;2.
Gu, G., R. F. Adler, G. J. Huffman, and S. Curtis (2007), Tropical rainfall variability on interannual-interdecadal and longer time scales derived from the GPCP monthly product, J. Clim., 20, 4033-4046, doi:10.1175/JCLI4227.1.
Hilburn, K. A., and F. J. Wentz (2008), Intercalibrated passive microwave rain products from the Unified Microwave Ocean Retrieval Algorithm (UMORA), J. Appl. Meteorol. Climatol., 47, 778-794, doi:10.1175/2007JAMC1635.1.
Huffman, G. J., R. F. Adler, D. T. Bolvin, G. Gu, E. J. Nelkin, K. P. Bowman, Y. Hong, E. F. Stocker, and D. B. Wolff (2007), The TRMM multi-satellite precipitation analysis: Quasi-global, multi-year, combined-sensor precipitation estimates at fine scale, J. Hydrometeorol., 8, 38-55, doi:10.1175/JHM560.1.
Iguchi, T., T. Kozu, R. Meneghini, J. Awaka, and K. Okamoto (2000), Rain-profiling algorithm for the TRMM Precipitation Radar, J. Appl. Meteorol., 39, 2038-2052, doi:10.1175/1520-0450(2001)040<2038:RPAFTT>2.0.CO;2.
Iguchi, T., T. Kozu, J. Kwiatkowski, R. Meneghini, J. Awaka, and K. Okamoto (2009), Uncertainties in the rain profiling algorithm for the TRMM Precipitation Radar, J. Meteorol. Soc. Jpn., 87A, 1-30, doi:10.2151/jmsj.87A.1.
Iida, Y., K. Okamoto, T. Ushio, and R. Oki (2006), Simulation of sampling error of average rainfall rates in space and time by five satellites using radar-AMeDAS composites, Geophys. Res. Lett., 33, L01816, doi:10.1029/2005GL024910.
Kedem, B., L. S. Chiu, and G. R. North (1990), Estimation of mean rain rate: Application to satellite observations, J. Geophys. Res., 95(D2), 1965-1972, doi:10.1029/JD095iD02p01965.
Kozu, T., T. Kawanishi, H. Kuroiwa, M. Kojima, K. Oikawa, H. Kumagai, K. Okamoto, M. Okumura, H. Nakatsuka, and K. Nishikawa (2001), Development of precipitation radar onboard the Tropical Rainfall Measuring Mission Satellite, IEEE Trans. Geosci. Remote Sens., 39, 102-116, doi:10.1109/36.898669.
Kozu, T., T. Iguchi, T. Kubota, N. Yoshida, S. Seto, J. Kwiatkowski, and Y. N. Takayabu (2009), Feasibility of raindrop size distribution parameter estimation with TRMM Precipitation Radar, J. Meteorol. Soc. Jpn., 87A, 53-66, doi:10.2151/jmsj.87A.53.
Kummerow, C., W. Barnes, T. Kozu, J. Shiue, and J. Simpson (1998), The Tropical Rainfall Measuring Mission (TRMM) sensor package, J. Atmos. Oceanic Technol., 15, 809-817, doi:10.1175/1520-0426(1998)015<0809:TTRMMT>2.0.CO;2.
Laughlin, C. R. (1981), On the effect of temporal sampling on the observation of mean rainfall, in Precipitation Measurements From Space, Workshop Rep., pp. D59-D66, NASA Goddard Space Flight Cent., Greenbelt, Md.
Li, Q., R. L. Bras, and D. Veneziano (1996), Analysis of Darwin rainfall data: Implications on sampling strategy, J. Appl. Meteorol., 35, 372-385, doi:10.1175/1520-0450(1996)035<0372:AODRDI>2.0.CO;2.
Lin, X., L. D. Fowler, and D. A. Randall (2002), Flying the TRMM satellite in a general circulation model, J. Geophys. Res., 107(D16), 4281, doi:10.1029/2001JD000619.
McConnell, A., and G. R. North (1987), Sampling errors in satellite estimates of tropical rain, J. Geophys. Res., 92(D8), 9567-9570, doi:10.1029/JD092iD08p09567.
Nakazawa, T., and K. Rajendran (2009), Interannual variability of tropical rainfall characteristics and the impact of the altitude boost from TRMM PR 3A25 data, J. Meteorol. Soc. Jpn., 87A, 317-338, doi:10.2151/jmsj.87A.317.
Nesbitt, S. W., and A. M. Anders (2009), Very high resolution precipitation climatologies from the Tropical Rainfall Measuring Mission precipitation radar, Geophys. Res. Lett., 36, L15815, doi:10.1029/2009GL038026.
North, G. R., S. S. P. Shen, and R. Upson (1993), Sampling errors in rainfall estimates by multiple satellites, J. Appl. Meteorol., 32, 399-410, doi:10.1175/1520-0450(1993)032<0399:SEIREB>2.0.CO;2.
Oki, R., and A. Sumi (1994), Sampling simulation of TRMM rainfall estimation using radar-AMeDAS composites, J. Appl. Meteorol., 33, 1597-1608, doi:10.1175/1520-0450(1994)033<1597:SSOTRE>2.0.CO;2.
Ropelewski, C. F., and M. S. Halpert (1987), Global and regional scale precipitation patterns associated with the El Niño/Southern Oscillation, Mon. Weather Rev., 115, 1606-1626, doi:10.1175/1520-0493(1987)115<1606:GARSPP>2.0.CO;2.
Seed, A., and G. L. Austin (1990), Variability of summer Florida rainfall and its significance for the estimation of rainfall by gages, radar, and satellite, J. Geophys. Res., 95(D3), 2207-2215, doi:10.1029/JD095iD03p02207.
Sexton, D. M. H., H. Grubb, K. P. Shine, and C. K. Folland (2003), Design and analysis of climate model experiments for the efficient estimation of anthropogenic signals, J. Clim., 16, 1320-1336.
Shimizu, K. (1993), A bivariate mixed lognormal distribution with an analysis of rainfall data, J. Appl. Meteorol., 32, 161-171, doi:10.1175/1520-0450(1993)032<0161:ABMLDW>2.0.CO;2.
Shimizu, S., R. Oki, T. Tagawa, T. Iguchi, and M. Hirose (2009), Evaluation of the effects of the orbit boost of the TRMM satellite on PR rain estimates, J. Meteorol. Soc. Jpn., 87A, 83-92, doi:10.2151/jmsj.87A.83.
Shin, K. S., and G. R. North (1988), Sampling error study for rainfall estimate by satellite using a stochastic model, J. Appl. Meteorol., 27, 1218-1231, doi:10.1175/1520-0450(1988)027<1218:SESFRE>2.0.CO;2.
Simpson, J., R. F. Adler, and G. R. North (1988), Proposed tropical rainfall measuring mission (TRMM) satellite, Bull. Am. Meteorol. Soc., 69, 278-295, doi:10.1175/1520-0477(1988)069<0278:APTRMM>2.0.CO;2.
Soman, V. V., J. B. Valdés, and G. R. North (1995), Satellite sampling and the diurnal cycle statistics of Darwin rainfall data, J. Appl. Meteorol., 34, 2481-2490, doi:10.1175/1520-0450(1995)034<2481:SSATDC>2.0.CO;2.
Soman, V. V., J. B. Valdés, and G. R. North (1996), Estimation of sampling errors and scale parameters using two- and three-dimensional rainfall data analyses, J. Geophys. Res., 101(D21), 26,453-26,460, doi:10.1029/96JD01387.
Steiner, M. (1996), Uncertainty of estimates of monthly areal rainfall for temporally sparse remote observations, Water Resour. Res., 32(2), 373-388, doi:10.1029/95WR03396.
Steiner, M., T. L. Bell, Z. Zhang, and E. F. Wood (2003), Comparison of two methods for estimating the sampling-related uncertainty of satellite rainfall averages based on a large radar dataset, J. Clim., 16, 3759-3778, doi:10.1175/1520-0442(2003)016<3759:COTMFE>2.0.CO;2.
Takahashi, T., and T. Iguchi (2004), Estimation and correction of beam mismatch of the Precipitation Radar after an orbit boost of the Tropical Rainfall Measuring Mission satellite, IEEE Trans. Geosci. Remote Sens., 42, 2362-2369, doi:10.1109/TGRS.2004.837334.
Waliser, D. E., et al. (2003), AGCM simulations of intraseasonal variability associated with the Asian summer monsoon, Clim. Dyn., 21, 423-446, doi:10.1007/s00382-003-0337-1.
Wilks, D. S. (1997), Resampling hypothesis tests for autocorrelated fields, J. Clim., 10, 65-82, doi:10.1175/1520-0442(1997)010<0065:RHTFAF>2.0.CO;2.
Zwiers, F. W. (1990), The effect of serial correlation on statistical inferences made with resampling procedures, J. Clim., 3, 1452-1461, doi:10.1175/1520-0442(1990)003<1452:TEOSCO>2.0.CO;2.