The particular Monte Carlo approach to S/N separation introduced by Allen (1992) has become known as Monte Carlo SSA (MC-SSA), and a detailed description can be found in Allen and Smith (1996; AS hereafter). See also the review papers of Ghil and Yiou (1996) and Ghil and Taricco (1997).
Monte Carlo SSA can be used to establish whether a given timeseries is
linearly distinguishable from any well-defined process,
including the output of a deterministic chaotic system, but we will
focus on testing against the linear
stochastic processes which are normally considered as ``noise''.
``Red noise'' is often used to refer to any linear
stochastic process in which power declines monotonically with
increasing frequency, but
we prefer to use the term to refer specifically
to a first-order auto-regressive, or AR(1), process whose
value at time $t$ depends on the value at time $t-1$ only,
\[
u(t) = \gamma\,[u(t-1) - u_0] + \alpha\,\xi(t) + u_0 ,
\]
where $\xi(t)$ is a Gaussian-distributed white-noise process, for which
each value is independent of all previous values,
$u_0$ is the process mean, and
$\gamma$ and $\alpha$
are constant
coefficients.
When testing against a red-noise null hypothesis,
the first step in MC-SSA is to determine the red-noise
coefficients $\gamma$ and $\alpha$
from the time series $X(t)$ using a
maximum-likelihood criterion. Allen (1992) and AS
provide estimators
that are asymptotically unbiased
in the limit of large $N$ and close to unbiased for shorter series,
provided the series length is at least an order of magnitude longer
than the timescale of decay of autocorrelation,
$-1/\log\gamma$.
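As an illustration of the AR(1) model and of the parameter-estimation step, the following Python sketch generates a red-noise series and recovers its coefficients. It assumes NumPy is available. The function names (`simulate_ar1`, `fit_ar1_naive`, `decay_timescale`) and the argument names `gamma` (lag-one coefficient), `alpha` (noise amplitude) and `u0` (process mean) are hypothetical choices made for this example. The estimator shown is the plain lag-one moment estimator, not the maximum-likelihood estimators of Allen (1992) and AS, so it is expected to be biased for short series.

```python
import numpy as np

def simulate_ar1(n, gamma, alpha, u0=0.0, rng=None):
    """Generate n values of u(t) = gamma*(u(t-1) - u0) + alpha*xi(t) + u0."""
    rng = np.random.default_rng() if rng is None else rng
    xi = rng.standard_normal(n)          # Gaussian white noise
    u = np.empty(n)
    # Draw the first value from the stationary distribution,
    # whose variance is alpha^2 / (1 - gamma^2).
    u[0] = u0 + alpha / np.sqrt(1.0 - gamma**2) * xi[0]
    for t in range(1, n):
        u[t] = gamma * (u[t - 1] - u0) + alpha * xi[t] + u0
    return u

def fit_ar1_naive(x):
    """Moment estimates of (gamma, alpha, u0); NOT the AS estimators."""
    x = np.asarray(x, dtype=float)
    u0 = x.mean()
    d = x - u0
    c0 = np.dot(d, d) / len(d)            # lag-0 covariance
    c1 = np.dot(d[1:], d[:-1]) / len(d)   # lag-1 covariance
    gamma = c1 / c0
    alpha = np.sqrt(c0 * (1.0 - gamma**2))
    return gamma, alpha, u0

def decay_timescale(gamma):
    """Decay time of the autocorrelation, -1/log(gamma), for 0 < gamma < 1."""
    return -1.0 / np.log(gamma)
```

For example, `fit_ar1_naive(simulate_ar1(20000, 0.7, 1.0))` should return values close to 0.7 and 1.0 for the two coefficients, since that series is several thousand times longer than its decay timescale of about 2.8 time steps.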
Based on these coefficients, an ensemble of
surrogate red-noise data can be generated
and, for each realization, a
covariance matrix $C_R$
is computed. These covariance matrices
are then projected onto the eigenvector basis
$E_X$ of the
original data,
\[
\Lambda_R = E_X^{T}\, C_R\, E_X . \qquad (6)
\]
Since (6) is not the SVD of that realization, the matrix
$\Lambda_R$ is not necessarily diagonal,
but it measures the resemblance of a given
surrogate set with the data set.
This resemblance can be quantified by computing the
statistics of the diagonal elements of
$\Lambda_R$. The statistical
distribution of these elements, determined
from the ensemble of Monte Carlo simulations, gives confidence
intervals outside which a time series can be
considered to be significantly different from a generic red-noise
simulation. For instance,
if an eigenvalue $\lambda_k$ of the data
lies outside a 90% noise percentile,
then the red-noise null hypothesis for the associated EOF (and
PC) can be rejected with this confidence. Otherwise, that particular
SSA component of the time series
cannot be considered as significantly different from red noise.
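The procedure described above (fit red noise, generate surrogates, project each surrogate covariance matrix onto the data EOFs, and compare percentiles of the diagonal elements with the data eigenvalues) can be sketched as follows. This is a minimal illustration under stated assumptions, not the SSA-MTM toolkit implementation: the function names `lag_covariance` and `mc_ssa`, the Toeplitz covariance estimate, the 5th and 95th percentile bounds, and the simple lag-one fit of the AR(1) coefficients are all choices made for this example, and the AS estimators are not used.

```python
import numpy as np

def lag_covariance(x, m):
    """m x m Toeplitz lag-covariance matrix of a series (mean removed)."""
    d = np.asarray(x, dtype=float) - np.mean(x)
    n = len(d)
    c = np.array([np.dot(d[: n - k], d[k:]) / (n - k) for k in range(m)])
    idx = np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
    return c[idx]

def mc_ssa(x, m, n_surr=200, lo=5.0, hi=95.0, seed=0):
    """Return data eigenvalues and surrogate percentile bounds per EOF."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)

    # Eigen-decomposition of the data covariance, sorted by decreasing variance.
    lam, E = np.linalg.eigh(lag_covariance(x, m))
    order = np.argsort(lam)[::-1]
    lam, E = lam[order], E[:, order]

    # Simple lag-one AR(1) fit (assumption: stands in for the AS estimators).
    d = x - x.mean()
    gamma = np.dot(d[1:], d[:-1]) / np.dot(d, d)
    alpha = np.sqrt(d.var() * (1.0 - gamma**2))

    proj = np.empty((n_surr, m))
    for i in range(n_surr):
        xi = rng.standard_normal(n)
        s = np.empty(n)
        s[0] = alpha / np.sqrt(1.0 - gamma**2) * xi[0]
        for t in range(1, n):
            s[t] = gamma * s[t - 1] + alpha * xi[t]
        C_R = lag_covariance(s, m)
        # Diagonal of the surrogate covariance projected onto the data EOFs.
        proj[i] = np.diag(E.T @ C_R @ E)

    lower, upper = np.percentile(proj, [lo, hi], axis=0)
    return lam, lower, upper
```

A data eigenvalue `lam[k]` that falls above `upper[k]` or below `lower[k]` marks an EOF for which this red-noise null hypothesis would be rejected at the chosen level, subject to the caveats on parameter estimation and test multiplicity.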
Allen (1992) stresses that care must be taken at the parameter-estimation stage because, for the results of a test against AR(1) noise to be interesting, we must ensure we are testing against that specific AR(1) process which maximises the likelihood that we will fail to reject the null hypothesis. Only if we can reject this ``worst-case'' red-noise process can we be confident of rejecting all other red-noise processes at the same or higher confidence level. A second important point is test multiplicity: comparing M data eigenvalues with M confidence intervals computed from the surrogate ensemble, we expect M/10 to lie above the 90th percentiles even if the null hypothesis is true. Thus a small number of excursions above a relatively low percentile should be interpreted with caution. AS discuss this problem in detail and propose a number of possible solutions.
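The arithmetic behind the multiplicity warning can be made concrete with a short Python sketch. It treats the M eigenvalue comparisons as independent tests, which is a simplifying assumption made only for this illustration (the comparisons are generally not independent, and AS treat the problem properly); the function names are hypothetical.

```python
from math import comb

def expected_excursions(m, p=0.10):
    """Expected number of eigenvalues above the (1-p) percentile under the null."""
    return m * p

def prob_at_least(k, m, p=0.10):
    """Binomial probability that at least k of m independent tests exceed."""
    return sum(comb(m, j) * p**j * (1.0 - p) ** (m - j) for j in range(k, m + 1))
```

With M = 40, for instance, `expected_excursions(40)` gives 4 excursions above the 90th percentile by chance alone, and under the independence assumption `prob_at_least(1, 40)` equals 1 - 0.9^40, so at least one excursion is almost certain even when the null hypothesis holds.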
The MC-SSA algorithm described above can be adapted to eliminate known periodic components and test the residual against noise. This adaptation provides sharper insight into the dynamics captured by the data, since known periodicities (like orbital forcing on the Quaternary time scale or seasonal forcing on the intraseasonal-to-interannual one) often generate much of the variance at the lower frequencies manifest in a time series and alter the rest of the spectrum. Allen and Smith (1992) and AS describe this refinement of MC-SSA, which consists of restricting the projections to the EOFs that do not account for known periodic behavior. This refinement is implemented in the SSA-MTM toolkit, where it is referred to as the ``Hybrid Basis'' (see the Toolkit Demonstration section below).