a.1 Monte Carlo SSA

The particular Monte Carlo approach to signal-to-noise (S/N) separation introduced by Allen (1992) has become known as Monte Carlo SSA (MC-SSA); a detailed description can be found in Allen and Smith (1996; AS hereafter). See also the review papers of Ghil and Yiou (1996) and Ghil and Taricco (1997).

Monte Carlo SSA can be used to establish whether a given time series is linearly distinguishable from any well-defined process, including the output of a deterministic chaotic system, but we will focus on testing against the linear stochastic processes which are normally considered as ``noise''. ``Red noise'' is often used to refer to any linear stochastic process in which power declines monotonically with increasing frequency, but we prefer to use the term to refer specifically to a first-order auto-regressive, or AR(1), process $X(t)$ whose value at a time $t$ depends on the value at time $t-1$ only,

\[
X(t) = a_1 [X(t-1) - X_0] + \sigma \xi(t) + X_0, \tag{5}
\]

where $\xi(t)$ is a Gaussian-distributed white-noise process, for which each value is independent of all previous values, $X_0$ is the process mean and $a_1$ and $\sigma$ are constant coefficients.
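The AR(1) recursion above can be illustrated with a short simulation. The following is a minimal sketch, not part of the toolkit: the function name `ar1_series` and its arguments are hypothetical, with `a1` standing for the lag-one coefficient, `sigma` for the white-noise amplitude and `x0` for the process mean. It assumes $|a_1| < 1$ so that the process is stationary.

```python
import random

def ar1_series(n, a1, sigma, x0=0.0, seed=None):
    """Return n values of X(t) = a1*(X(t-1) - x0) + sigma*xi(t) + x0.

    xi(t) is unit-variance Gaussian white noise. Assumes |a1| < 1.
    """
    rng = random.Random(seed)
    # Draw the first value from the stationary distribution, whose
    # standard deviation is sigma / sqrt(1 - a1^2), to avoid a spin-up.
    x = x0 + rng.gauss(0.0, sigma / (1.0 - a1 * a1) ** 0.5)
    series = []
    for _ in range(n):
        series.append(x)
        # Each new value depends only on the previous value and fresh noise.
        x = a1 * (x - x0) + sigma * rng.gauss(0.0, 1.0) + x0
    return series

# Illustrative parameter values only.
red = ar1_series(500, a1=0.7, sigma=1.0, x0=0.0, seed=1)
```

With `a1 = 0` the series reduces to white noise about `x0`; as `a1` approaches 1 the variance concentrates at low frequencies, which is the ``red'' behaviour described in the text.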

When testing against a red-noise null hypothesis, the first step in MC-SSA is to determine the red-noise coefficients $a_1$ and $\sigma$ from the time series $X(t)$ using a maximum-likelihood criterion. Estimators are provided by Allen (1992) and AS which are asymptotically unbiased in the limit of large series length $N$ and close to unbiased for shorter series, provided the series length is at least an order of magnitude longer than the timescale of decay of autocorrelation, $-1/\log a_1$. Based on these coefficients, an ensemble of surrogate red-noise data can be generated and, for each realization, a covariance matrix $C_R$ is computed. These covariance matrices are then projected onto the eigenvector basis $E_X$ of the original data,

\[
\Lambda_R = E_X^T C_R E_X. \tag{6}
\]

Since (6) is not the SVD of that realization, the matrix $\Lambda_R$ is not necessarily diagonal, but it measures the resemblance of a given surrogate set with the data set. This resemblance can be quantified by computing the statistics of the diagonal elements of $\Lambda_R$. The statistical distribution of these elements, determined from the ensemble of Monte Carlo simulations, gives confidence intervals outside which a time series can be considered to be significantly different from a generic red-noise simulation. For instance, if an eigenvalue $\lambda_k$ lies outside a 90% noise percentile, then the red-noise null hypothesis for the associated EOF (and PC) can be rejected with this confidence. Otherwise, that particular SSA component of the time series cannot be considered as significantly different from red noise.
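The steps just described (fit the red-noise coefficients, generate surrogates, project each surrogate covariance matrix onto the data eigenvectors, and collect percentiles of the diagonal elements) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the toolkit's implementation: all function names are hypothetical, the AR(1) fit is a naive lag-one moment estimate and not the estimators of Allen (1992) or AS, the covariance matrix is a simple Toeplitz estimate for window length `M`, and the two-sided 5th/95th percentile band is one possible reading of a 90% interval.

```python
import numpy as np

def lag_covariance(x, M):
    """M x M Toeplitz lag-covariance matrix of a 1-D series (mean removed)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    N = x.size
    c = np.array([x[: N - k] @ x[k:] / (N - k) for k in range(M)])
    idx = np.abs(np.subtract.outer(np.arange(M), np.arange(M)))
    return c[idx]

def ar1_surrogate(N, a1, sigma, rng):
    """One zero-mean AR(1) realization started from its stationary state."""
    x = np.empty(N)
    x[0] = rng.normal(0.0, sigma / np.sqrt(1.0 - a1**2))
    noise = rng.normal(0.0, sigma, size=N)
    for t in range(1, N):
        x[t] = a1 * x[t - 1] + noise[t]
    return x

def mc_ssa(x, M, n_surr=200, seed=0, pct=(5.0, 95.0)):
    """Return data eigenvalues and surrogate percentile bounds per EOF."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # ASSUMPTION: naive moment fit of the AR(1) coefficients (needs |a1| < 1).
    c = lag_covariance(x, 2)
    a1 = c[0, 1] / c[0, 0]
    sigma = np.sqrt(c[0, 0] * (1.0 - a1**2))
    # Eigen-decomposition of the data covariance, sorted by decreasing variance.
    lam, E_X = np.linalg.eigh(lag_covariance(x, M))
    order = np.argsort(lam)[::-1]
    lam, E_X = lam[order], E_X[:, order]
    diag = np.empty((n_surr, M))
    for i in range(n_surr):
        C_R = lag_covariance(ar1_surrogate(x.size, a1, sigma, rng), M)
        # Project onto the DATA eigenvectors; keep only the diagonal.
        diag[i] = np.diag(E_X.T @ C_R @ E_X)
    lo, hi = np.percentile(diag, list(pct), axis=0)
    return lam, lo, hi
```

A data eigenvalue `lam[k]` that falls above `hi[k]` (or below `lo[k]`) is then a candidate for rejecting the red-noise null hypothesis for that EOF, subject to the caveats on parameter estimation and test multiplicity discussed in this section.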

Allen (1992) stresses that care must be taken at the parameter-estimation stage because, for the results of a test against AR(1) noise to be interesting, we must ensure we are testing against that specific AR(1) process which maximises the likelihood that we will fail to reject the null hypothesis. Only if we can reject this ``worst case'' red-noise process can we be confident of rejecting all other red-noise processes at the same or higher confidence level. A second important point is test multiplicity: comparing $M$ data eigenvalues with $M$ confidence intervals computed from the surrogate ensemble, we expect $M/10$ to lie above the 90th percentiles even if the null hypothesis is true. Thus a small number of excursions above a relatively low percentile should be interpreted with caution. AS discuss this problem in detail, and propose a number of possible solutions.
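The multiplicity argument can be made concrete with a small calculation. The sketch below is only illustrative and is not one of the solutions proposed by AS: it treats the $M$ comparisons as independent tests, which is an idealization, since eigenvalues computed from the same series are not independent. The function name `prob_at_least` and the value `M = 40` are hypothetical.

```python
from math import comb

def prob_at_least(k, M, p=0.1):
    """P(at least k of M independent tests exceed a bound with tail prob p)."""
    return sum(comb(M, j) * p**j * (1 - p) ** (M - j) for j in range(k, M + 1))

M = 40                     # hypothetical window length (number of eigenvalues)
expected = M * 0.1         # M/10 excursions above the 90th percentile on average
p_any = prob_at_least(1, M)    # chance of at least one excursion under the null
p_many = prob_at_least(8, M)   # chance of eight or more excursions under the null
print(expected, p_any, p_many)
```

Under this idealization a single excursion above the 90th percentile is close to certain when the null hypothesis is true, whereas a number of excursions well above $M/10$ is much less likely, which is why isolated excursions above a low percentile carry little weight.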

The MC-SSA algorithm described above can be adapted to eliminate known periodic components and test the residual against noise. This adaptation provides sharper insight into the dynamics captured by the data, since known periodicities (like orbital forcing on the Quaternary time scale or seasonal forcing on the intraseasonal-to-interannual one) often generate much of the variance at the lower frequencies manifest in a time series and alter the rest of the spectrum. Allen and Smith (1992) and AS describe this refinement of MC-SSA, which consists of restricting the projections to the EOFs that do not account for known periodic behavior. This refinement is implemented in the SSA-MTM toolkit, where it is referred to as the ``Hybrid Basis'' (see the Toolkit Demonstration section below).