This paper presents preliminary work
in using data mining techniques to find interesting spatio-temporal
patterns from Earth Science data. The data consists of time series
measurements for various Earth Science and climate variables (e.g., soil
moisture, temperature, and precipitation), along with additional data
from existing ecosystem models (e.g., Net Primary Production). The
ecological patterns of interest include associations, clusters,
predictive models, and trends. In this work, we discuss some of the
challenges involved in preprocessing and analyzing the data, and also
consider techniques for handling some of the spatio-temporal issues.
Earth Science data has strong seasonal components that need to be
removed prior to pattern analysis, as Earth scientists are primarily
interested in patterns that represent deviations from normal seasonal
variation, such as anomalous climate events (e.g., El Niño) or trends
(e.g., global warming). We compare several alternatives (including
singular value decomposition (SVD), discrete Fourier transform (DFT),
"monthly" Z score, and moving average) with respect to their
effectiveness in removing seasonality. We describe the different kinds
of association analysis that can be performed on such data. Our current
technique for finding associations transforms the time series into
transactions and then applies existing algorithms traditionally used
for market-basket data. Some of the transformations lead to dense
columns in the transaction matrices, causing exponential growth in
computational requirements. Furthermore, no single interestingness
measure accurately reflects the quality of the derived patterns.
Indeed, we argue that existing approaches for mining association rules
and sequential patterns may not be able to capture all the interesting
patterns due to the spatio-temporal nature of this data.
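
To make the "monthly" Z score mentioned above concrete, the following is a
minimal sketch of one way such a transformation could be computed for a
single time series of monthly values. It is an illustration under stated
assumptions, not the authors' implementation: the function name
monthly_z_score and the period parameter are hypothetical, the population
standard deviation is assumed, and mapping zero-variance months to 0.0 is
an arbitrary choice.

```python
from statistics import mean, pstdev


def monthly_z_score(series, period=12):
    """Standardize each value against the other values from the same month.

    Assumes `series` is a list of evenly spaced observations whose first
    element falls in month 0 of the seasonal cycle (hypothetical layout).
    """
    z = [0.0] * len(series)
    for month in range(period):
        # Indices of every observation taken in this calendar month.
        idx = range(month, len(series), period)
        values = [series[i] for i in idx]
        if not values:
            continue
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation (assumption)
        for i in idx:
            # A month with no variation carries no anomaly signal here.
            z[i] = (series[i] - mu) / sigma if sigma > 0 else 0.0
    return z


if __name__ == "__main__":
    # Two "years" of a toy two-month cycle: values differ greatly between
    # months, but the standardized series keeps only year-to-year deviations.
    print(monthly_z_score([1, 10, 3, 20], period=2))
```

Because each month is centered and scaled separately, a value is judged
only against other values from the same month, so the regular seasonal
cycle cancels and what remains is the deviation from normal seasonal
variation.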