ABSTRACT
This paper shows theoretically and with examples that climatological means derived from spectral methods predict independent data with less error than climatological means derived from simple averaging. Herein, "spectral methods" indicates a least squares fit to a sum of a small number of sines and cosines that are periodic on annual or diurnal periods, and "simple averaging" refers to averages computed while holding the phase of the annual or diurnal cycle constant. The superiority of spectral methods over simple averaging can be understood as a straightforward consequence of overfitting, once one recognizes that simple averaging is a special case of the spectral method. To illustrate these results, the two methods are compared in the context of estimating the climatological mean of sea surface temperature (SST). Cross-validation experiments indicate that about four harmonics of the annual cycle are adequate, which requires estimation of nine independent parameters (one mean plus a cosine and a sine coefficient for each harmonic). In contrast, simple averaging of daily SST requires estimation of 366 parameters, one for each day of the year, which is about a factor of 40 more. Consistent with this greater number of parameters, simple averaging predicts samples that were not included in the estimation of the climatological mean more poorly than the spectral method does. In addition to being more accurate, the spectral method accommodates leap years and missing data simply, results in a greater degree of data compression, and automatically produces smooth time series.
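The comparison described above can be sketched on synthetic data. The following is a minimal illustration, not the procedure used in the paper: the function names (`harmonic_design`, `fit_harmonics`, `fit_daily_means`), the synthetic "SST" series, its noise level, and the 365-day year without leap days are all assumptions made for the example. It fits four annual harmonics by least squares (nine parameters), computes the per-day simple average (365 parameters here), and compares the two by leave-one-year-out cross-validation.

```python
import numpy as np

PERIOD = 365  # days per year; leap days are ignored in this synthetic sketch


def harmonic_design(t, n_harmonics, period=PERIOD):
    """Design matrix: a constant column plus a cos/sin pair per harmonic."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        phase = 2.0 * np.pi * k * t / period
        cols.append(np.cos(phase))
        cols.append(np.sin(phase))
    return np.column_stack(cols)


def fit_harmonics(t, y, n_harmonics=4, period=PERIOD):
    """Least squares fit of the harmonic model; returns 1 + 2*n_harmonics coefficients."""
    X = harmonic_design(t, n_harmonics, period)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def fit_daily_means(day, y, period=PERIOD):
    """Simple averaging: one mean per calendar day (one parameter per day)."""
    sums = np.bincount(day, weights=y, minlength=period)
    counts = np.bincount(day, minlength=period)
    return sums / counts


# Hypothetical data: a smooth annual cycle plus white noise (not real SST).
rng = np.random.default_rng(0)
n_years = 10
t = np.arange(n_years * PERIOD)
day = t % PERIOD
year = t // PERIOD
truth = (20.0
         + 5.0 * np.cos(2.0 * np.pi * (day - 30) / PERIOD)
         + 1.0 * np.cos(4.0 * np.pi * day / PERIOD))
sst = truth + rng.normal(0.0, 1.0, size=t.size)

# Leave-one-year-out cross-validation: fit on nine years, predict the tenth.
mse_spectral, mse_simple = [], []
for held_out in range(n_years):
    train = year != held_out
    test = ~train
    coef = fit_harmonics(t[train], sst[train], n_harmonics=4)
    pred_spectral = harmonic_design(t[test], 4) @ coef
    daily = fit_daily_means(day[train], sst[train])
    pred_simple = daily[day[test]]
    mse_spectral.append(np.mean((sst[test] - pred_spectral) ** 2))
    mse_simple.append(np.mean((sst[test] - pred_simple) ** 2))

cv_spectral = float(np.mean(mse_spectral))
cv_simple = float(np.mean(mse_simple))
print(f"parameters: spectral={coef.size}, simple averaging={PERIOD}")
print(f"cross-validated MSE: spectral={cv_spectral:.3f}, simple averaging={cv_simple:.3f}")
```

In this synthetic setting both estimators are unbiased for the underlying cycle, so the gap in cross-validated error comes from estimation variance: the per-day means each rest on nine samples, whereas the nine harmonic coefficients are shared across all training days.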
(ProQuest: ... denotes formulae omitted.)
1. Introduction
Traditionally, the climatology is defined as the distribution of a random draw from the system conditioned on the phase of the annual or diurnal cycle. This definition is appropriate for cyclostationary systems (that is, systems whose statistics are invariant with respect to a shift in time equal to an integral multiple of a specified period) but may not be appropriate for systems exhibiting trends or other secular variations. Under this definition, the climatological distribution depends on time, or more precisely on the phase of the annual or diurnal cycle. In any case, the climatological distribution is not known and therefore must be estimated from finite samples. There are at least two approaches to estimating the mean of the climatological distribution. The first, which we call simple averaging, is to average the state with respect to...





