Innovative Techniques to Combine Data and Improve Environmental Modeling
Scientific studies have shown associations between air
pollutants and negative human health effects, as well as
detrimental ecological impacts. However, the interplay of
diverse emission sources and complex atmospheric processes
makes the air pollution problem difficult to understand.
Recently, Battelle developed a hierarchical Bayesian
approach to statistical spatial modeling for an air quality
assessment project to estimate spatial gradients, or variation,
of air pollutants for the U.S. Environmental Protection
Agency (EPA). (The term "hierarchical" in this context
means that large numbers of variables are not modeled
simultaneously, but instead dealt with in layers that build
upon one another.) Such models also can define the spatial
areas that episodes of unhealthy air quality will affect, or
illuminate relationships between different air pollutants.
This innovative approach can be applied in most situations
where two or more sources of data, which may differ
in bias and precision, are collected to study one process or
situation. As one example, Battelle scientists considered
monitoring data and model predictions as two spatial
representations of fine particulate matter. EPA's Community
Multi-Scale Air Quality (CMAQ) Modeling System, which
supplied the model predictions, produces estimates at many
locations in the study area, but its predictions were
considered biased and imprecise - i.e., spatially dense but
inaccurate. The monitoring data were drawn from the
nationwide Federal Reference Method (FRM) monitoring
network, the 'gold standard' of measured air quality
information. Unfortunately,
this network of high-quality monitoring measurements
provides data from relatively few locations - i.e.,
spatially sparse but accurate information. By combining
both sources, the hierarchical Bayesian approach takes
advantage of the complementary strengths of each one.
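
To make the idea of combining a biased, imprecise source with a sparse, accurate one concrete, the sketch below works through a single location in Python. It is only an illustration under simplifying assumptions and is not Battelle's model: errors are assumed Gaussian with known variances, the model bias is assumed additive and constant, and all function names and numbers are hypothetical.

```python
import numpy as np

def estimate_additive_bias(monitor_values, model_values_at_monitors):
    """Average (model - monitor) difference at co-located sites.

    Assumes the model bias is additive and constant over the study area.
    """
    monitor_values = np.asarray(monitor_values, dtype=float)
    model_values_at_monitors = np.asarray(model_values_at_monitors, dtype=float)
    return float(np.mean(model_values_at_monitors - monitor_values))

def fuse(prior_mean, prior_var, obs, obs_var):
    """Conjugate normal update: a precision-weighted average of two sources."""
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (prior_mean / prior_var + obs / obs_var)
    return post_mean, post_var

# Hypothetical PM2.5 values (micrograms per cubic meter) at three monitor sites.
monitors = [12.0, 15.0, 9.0]
model_at_monitors = [9.0, 12.5, 6.5]  # the model reads low at every site

bias = estimate_additive_bias(monitors, model_at_monitors)

# At a new location, treat the bias-corrected model value as the prior
# (large variance: imprecise) and a nearby monitor reading as the
# observation (small variance: accurate).
corrected_model = 10.0 - bias
post_mean, post_var = fuse(corrected_model, prior_var=9.0, obs=14.0, obs_var=1.0)
```

Because the monitor variance is much smaller than the model variance, the fused estimate sits close to the monitor reading, and its variance is smaller than that of either source alone.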
The graphics below illustrate ambient fine particulate matter
(PM2.5) concentrations over a large portion of the eastern
United States for a two-week period in January 2000. The
Bayesian surface is generally higher than the CMAQ surface,
indicating that the monitoring data were used to adjust for
an inherent bias in the CMAQ model. Also note that the
numerous local peaks in the CMAQ surface, which the
relatively sparse monitoring data could not capture on their
own, still appear in the Bayesian surface, illustrating that
the CMAQ information also has been fully incorporated.
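
The behavior described above, where the level of a dense surface shifts while its local peaks are kept, can be imitated with a much simpler technique than the hierarchical Bayesian model: kernel-smoothed residual correction on a one-dimensional grid. The Python sketch below is a stand-in for illustration only; the function name, the length scale, and every number in it are hypothetical.

```python
import numpy as np

def adjust_surface(grid_x, model_surface, monitor_x, monitor_values, length_scale=2.0):
    """Shift a dense model surface toward sparse monitor readings.

    Residuals (monitor - model) are computed at the monitor sites, spread
    over the grid with Gaussian kernel weights, and added to the model
    surface. Fine-scale structure of the model is left untouched.
    Assumes every grid point is within a few length scales of a monitor.
    """
    grid_x = np.asarray(grid_x, dtype=float)
    model_surface = np.asarray(model_surface, dtype=float)
    monitor_x = np.asarray(monitor_x, dtype=float)
    monitor_values = np.asarray(monitor_values, dtype=float)

    # Model value at each monitor site, by linear interpolation on the grid.
    model_at_monitors = np.interp(monitor_x, grid_x, model_surface)
    residuals = monitor_values - model_at_monitors

    # Gaussian kernel weights, shape (n_grid, n_monitors).
    distance = grid_x[:, None] - monitor_x[None, :]
    weights = np.exp(-0.5 * (distance / length_scale) ** 2)
    smoothed_residual = (weights @ residuals) / weights.sum(axis=1)
    return model_surface + smoothed_residual

# Hypothetical dense model surface: flat background with one local peak at x = 5.
grid = np.arange(0.0, 11.0)
model = 8.0 + 4.0 * np.exp(-0.5 * ((grid - 5.0) / 0.7) ** 2)

# Two hypothetical monitors, both far from the peak, both reading higher than the model.
adjusted = adjust_surface(grid, model, monitor_x=[1.0, 9.0], monitor_values=[11.0, 11.0])
```

In this toy case the adjusted surface lies above the model surface everywhere, yet the peak at x = 5, which neither monitor observed, is still present.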
Left: Plot of the CMAQ model (surface) and the monitoring observations (spheres).
Right: Plot of the Bayesian predictions (surface) and the monitoring observations (spheres).
Other applications of this modeling technique include
predicting how air pollutants will behave within defined
geographic areas over a given period of time and combining
information from more than two inputs, among several other
promising extensions.
For more information, contact Dr. Steve Bortnick,
(614) 424-7487, bortnick@battelle.org, or
Dr. David Wendt, (614) 424-7653,
wendtd@battelle.org.