Environ. Sci. Technol. 2004, 38, 1253-1261

Exploratory Data Analysis of the Multilevel Anthropogenic Copper Cycle

T. E. GRAEDEL,* M. BERTRAM, A. KAPUR, B. RECK, AND S. SPATARI†

Center for Industrial Ecology, School of Forestry and Environmental Studies, Yale University, New Haven, Connecticut 06511

A comprehensive multilevel contemporary cycle for stocks and flows of copper is analyzed by the tools of exploratory data analysis (EDA). The analysis is performed at three discrete spatial levels: country (56 countries or country groups that comprise essentially all anthropogenic stocks and flows of copper), eight world regions, and the planet as a whole. Among the most interesting results are the following: (1) EDA is employable and valuable for use in the analysis of material flows, especially those across multiple spatial levels; (2) All distributions of country-level stock and flow data are highly skewed, a few countries having large magnitudes, many having small magnitudes; (3) Rates of fabrication of copper-containing products for the countries are poorly correlated with rates of extraction, reflecting the fact that many countries that extract copper do not fabricate products from copper to any significant degree and vice versa; (4) Virtually all countries are adding copper to stock (in pipe, wire, etc.); these rates of addition are highly correlated with rates of copper entering use in all regions and are higher in regions under vigorous development; (5) With weak confidence, the rate of copper landfilling by regions is about one-half the rate of copper discarded; (6) The statistical distributions of both country-level and regional-level copper cycle parameters have successively lower standard deviations at later life stages; and (7) Copper flow distributions at different life stages tend to reflect those of lower spatial level extreme values, but Asia’s and Europe’s regional patterns are much more reflective of country-level distributions as a whole.

* Corresponding author phone: (203) 432-9733; fax: (203) 432-5556; e-mail: [email protected].
† Current address: Department of Civil Engineering, University of Toronto, Toronto, Canada.
10.1021/es0304345 CCC: $27.50. © 2004 American Chemical Society. Published on Web 01/16/2004.

The Potential of Cross-Level Analysis

One of the definitions of industrial ecology has traditionally been that it is the study of the interactions between technology and the environment. The interactions have almost entirely been addressed at either very small spatial levels (e.g., the factory) or very large spatial levels (e.g., the planet). From a temporal perspective, industrial ecologists have almost always dealt with the contemporary situation, though a few scholars have looked at emissions to the environment or the use of energy over time periods of a few decades. These approaches strongly resemble the tendencies in biological ecology to restrict one’s studies to a single
restricted temporal and spatial level, i.e., to this season’s vernal pools or half-hectare ecosystems or landscapes, thus avoiding the challenges of studying (in Princeton ecologist Simon Levin’s words) “how the signatures of actions at one level manifest themselves at levels higher and lower” (1). Despite the reticence of industrial ecologists to address multilevel issues, there is substantial evidence that these issues are important, perhaps even crucial. Environmental, resource, and technology issues clearly cross levels, as when energy use in rural Alabama contributes to the potential for global climate change or when the rate of extraction of metal ores is dramatically changed by the migration of population to rapidly evolving cities. We address these challenges in this paper by employing the tools of exploratory data analysis (EDA). Exploratory analysis is designed to find out ‘what the data are telling us’. Its basic intent is to search for interesting relationships and structures in a body of data and to exhibit the results in such a way as to make them recognizable. This process involves “summarization”, perhaps in the form of a few simple statistics (e.g., mean and variance of a set of data) or perhaps in the form of a simple plot (such as a scatterplot). It also involves “exposure”, that is, the presentation of the data so as to allow one to see both anticipated and unexpected characteristics of the data. EDA was originated by Tukey (2), and the techniques have been described in detail by Cleveland (3) and Wiehs (4). The Stocks and Flows (STAF) project at Yale chose copper as the initial material for EDA. In a companion paper (5, hereafter “Paper I”), we present multilevel copper cycle data for the 56 countries or country groups, 9 regions, and the planet as a whole. All data are for ca. 1994. 
The present paper draws on those data (with the exception of the Antarctic region, a unique entity due to its abundantly supported but tiny population) to explore the statistical attributes of the multilevel copper cycle.
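
As a minimal sketch of the "summarization" step described above, the following Python fragment computes the mean, median, and standard deviation of a small data set. The values are hypothetical and are not taken from Paper I; the function name `summarize` is likewise an assumption made for illustration.

```python
import statistics


def summarize(values):
    """Return the mean, median, and sample standard deviation of a data set."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


# Hypothetical flow rates in Gg Cu/year (illustrative only, not Paper I data).
rates = [0, 0, 2, 5, 11, 40, 1200]
summary = summarize(rates)

# A mean far above the median is a quick signal of a long right tail.
right_skewed = summary["mean"] > summary["median"]
```

Comparing the mean with the median in this way is only a first screen; the box plots and quantile-quantile plots used later in the paper expose the shape of a distribution more fully.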

Application of Exploratory Data Analysis to Copper Cycles

To form a consistent set of country-level data, the data reported in Paper I for the two European country groups (Benelux: Belgium, Netherlands, and Luxemburg; and “STAF Scandinavia”: Denmark, Finland, Norway, and Sweden) are allocated to the individual countries in proportion to their relative populations. The result is a set of 61 country cycles that can be subjected to analysis.

The anthropogenic copper cycle in countries and regions can be regarded as having six major flows, as shown in the lower panel of Figure 1: the rate of extraction of copper in ore, the rate of copper fabrication and manufacture, the copper use rate, the rate of copper additions to in-use stock, the copper discard rate, and the copper landfill rate. In addition, the overall balance between imports and exports is of interest. Since there are 61 country cycles, the six flows and the import/export balance constitute seven data sets of 61 values each. Similarly, the eight regional cycles provide seven data sets of eight values each. These comprise the data sets that are explored herein.

EDA is particularly suitable for the multilevel copper data because no similar composite data set has been available for analysis, and the important information contained in the data is not necessarily easy to anticipate. The analyses reported herein utilize the S Plus 6.1 software (6).

It is important to note that the copper cycle system we study here, as represented by Figure 1a of Paper I, is “resource transformational” in the same sense as a multiprocess transformational system in manufacturing. Accordingly, it can be addressed by the mathematics of chemical process analysis (e.g., 7). This latter framework incorporates material balance constraints perfectly appropriate for our present purposes. From that perspective, the 61 country cycles represent 61 unique experiments with the same process network. We anticipate that process analysis approaches will be particularly valuable in future dynamics analyses as we characterize the copper cycle over time as well as space.

VOL. 38, NO. 4, 2004 / ENVIRONMENTAL SCIENCE & TECHNOLOGY

TABLE 1. Statistical Data for Country-Level Copper Cycle Parameters (61 Countries)

parameter, Gg Cu/year             mean   median   standard deviation
1. extraction rate                 175       11                  423
2. fabrication rate                202       62                  386
3. use rate                        189       51                  424
4. rate of addition to stock       127       33                  269
5. discard rate                     64       14                  175
6. landfill rate                    29        6                   86
7. import/export rate               15       14                  395

fitted relationship                                      slope   intercept     R²
fabrication rate as a function of rate of extraction      0.38         135   0.18
rate of addition to stock as a function of use rate       0.61          11   0.98
rate of landfilling as a function of rate of discard      0.49          -2   0.98

FIGURE 1. (bottom) Principal reservoirs and flows in the anthropogenic copper cycle: P = processing; F = fabrication and manufacture, including both domestic and imported inputs; U = entry into use; W = waste management; L = landfill; I/E = (import-export) summed for all reservoirs. (top) Comparative box plots for several selected copper cycle parameters for the 61 countries. In this approach, most easily seen in the red display at the far right, the magnitudes of the upper and lower quartiles of the data define the upper and lower boundaries of the box, the median is a line within the box, and the horizontal lines outside the box are the “inner fences”, which are data located no more than 1.5 times the interquartile range beyond the box top and bottom. Values beyond the inner fences are plotted individually. All rates are in units of Gg Cu/year. The statistics for the data distributions are given in Table 1.

Country-Level Copper Cycle Statistics

Copper stocks and flows at the country level strongly reflect the state of development of the country, the wealth of its inhabitants, and the customs and practices of its society. For the STAF-copper project, we developed information for all countries or country groups of the world with higher than minimal rates of extraction, fabrication, and/or use of copper. Complete copper cycle details for each country in our study were given in Paper I and its Supporting Information. In this paper we present and discuss data for countries representative of upper and lower extremes and of means in the data, and we give in Table 1 statistical information for each parameter in the complete country data set. By grouping and analyzing the data in this way, we produce representations of country-level cycle parameters that are not overly influenced by the data for any single country.

Figure 2 displays extremes and mean values for the seven different parameters of country-level copper cycles together with box plots (3, 8) of the entire data distributions. Figure 2a, for example, is constructed from data for the rate of extraction of virgin copper ore. This rate depends on the occurrence of the mineral resource, the deployment of appropriate technology to extract it, and the transportation and processing infrastructure. Chile and the United States are seen to be the two biggest producers. Portugal and Mongolia are typical of moderate extractors; the magnitudes of these data are nonetheless rather low because only a few countries produce most of the world’s copper. Twenty-four countries at the bottom of the data set mine no copper at all. This very uneven distribution in extraction rates is reflected by the box plot for the entire data set (shown on the right), with a few extreme upper values and most of the data clustered at very low values.

The rate of copper fabrication, shown in Figure 2b, combines cathode copper from in-country processing and imported copper semi-products. It is reflective of the deployment of technology within a country, its labor costs, and its proximity to markets. With the realization of global trade in copper metal, it has become unnecessary for a country to have a high processing rate in order to have a high fabrication rate, and proximity to market is a significant factor only when transportation costs are important.
The world’s two leading fabrication countries, in fact, demonstrate that high fabrication rates can occur with either of the two principal sources of purified copper: domestic refineries (United States) and import (Japan). Several other countries with relatively high fabrication rates (not shown in this diagram) are also large importers (e.g., China and Taiwan), while others combine high rates of importing and recycling (e.g., Germany and Italy). Turkey and India are typical of fabricating countries near the mean of the distribution. Eight countries do little or no copper fabrication.

The copper use rate for a country is a function of its wealth and of its cultural and social norms and preferences. Every country, of course, is nonetheless a user of copper to some degree. As one would anticipate, the highly developed countries tend to be at the top of this list and the largely undeveloped countries at the bottom. Figure 2c demonstrates that the United States and Japan are the world leaders, as was the case with fabrication, but many countries are clustered around rates of a few hundred Gg Cu/year. Hong Kong and Malaysia are typical of this group. Only a few small, poor countries such as Botswana and Papua New Guinea use essentially no copper.

In nearly all countries, the amount of copper entering use exceeds that being discarded. Data for the difference, the addition to in-use stock, are shown in Figure 2d. Whereas Japan was the second highest country for rate of copper use, it is not the second highest for additions to in-use stock. This can be understood by the comparison of use and discard rates for the United States, Japan, and China (shown as Figure 2h in the Supporting Information). The United States and Japan, with relatively mature industrial and infrastructural systems, discard nearly one-half as much copper as enters into use. China, in contrast, is in the early stages of rapidly increasing its in-use copper stock, most likely in building plumbing and wiring and in telecommunications, so its proportional additions are higher. In contrast, most of the countries near the mean of the distribution are increasing their stock of copper at a rate of only tens or hundreds of Gg Cu/year; Brazil and Australia are examples. Only two countries, India and Russia, were characterized as having significant losses from stock. In the case of India, we suspect that the data may not capture such features as large amounts of informal recycling and copper in ocean-going ships sent to India for dismantling. In the case of Russia, the loss from stock in this 1994 characterization probably represents the upheaval of the Russian economy following the breakup of the Soviet Union.

The two countries with the largest copper use rates, the United States and Japan, are also those with the highest rates of discard (Figure 2e). These two countries also lead the world in the rates at which they landfill copper (Figure 2f).
The recycling rates (the difference between discard and landfill rates) are different, however: about 65% for Japan and 50% for the United States. Countries near the means of the distributions of discard and landfill rates have rates that are about 5% of those of the United States, as was the case for copper entering use. Obviously, countries that use very little copper, such as Botswana or Papua New Guinea, discard and landfill very little as well. Rates of (import-export) of copper (in concentrate, blister, and cathode forms and in semi-products and products) are shown in Figure 2g. This diagram, similar to those devised by Kesler (9), reflects a preponderance of extractive activity for exporters and of fabrication and utilization for importers. China and Japan are the countries with the largest import copper flow (Germany is close behind). Chile is by far the largest exporter; Russia is second, but at a much lower annual rate, and the Russian export copper presumably comes from the decrease in in-use stock. Box plots for the complete data sets are compared in the upper panel of Figure 1, the stages of the copper life cycle moving from left to right (except for [import-export], which involves all life stages). The distribution of extraction rates, reflecting copper ore occurrences and processing activity or lack thereof, is highly skewed, with the median value near zero. The extreme upper bound in the next five data sets decreases left to right as does the standard deviation (Table 1), that is, from earlier to later life stages. This indicates that the dominance of a small number of countries in early lifecycle stages related to industrial capacity is less important at the later life stages that are more related to individual human actions. 
The mean value for the rate of landfilling is the lowest of the nonextraction data sets, reflecting the fact that a large fraction of copper in in-use stock entered use in the last few decades and has not yet reached end-of-life. [Import-export] is the most symmetric of the distributions, indicating that copper-importing countries and copper-exporting countries are roughly in balance in both number and flow magnitude.

Relationships among the data sets can be further explored by scatterplotting one set against another. This is done in Figure 3 for three pairs of country-level copper rates, the statistical information being given in Table 1. Consider first Figure 3a, the rate of copper fabrication as a function of the rate of copper extraction. Three countries stand out on this plot: the United States (high rate of extraction, high rate of fabrication), Chile (high rate of extraction, low rate of fabrication), and Japan (low rate of extraction, high rate of fabrication). Many countries do a little fabrication and/or a little extraction, but Table 1 demonstrates that the correlation between the two activities is not very high.

In Figure 3b1, the rate of copper addition to in-use stock is plotted as a function of the rate of copper use. The entire data set demonstrates high correlation throughout, with R² = 0.98. China is notable as a country falling above the regression line, reflecting the large fraction of China’s copper use rate that is being added to stock in buildings and infrastructure. We find that a line fit to data for only the Asian countries has a steeper slope than the fitted line for all countries, emphasizing the enhanced rate of copper addition to in-use stock in that region. The same relationship holds for Europe, which appears to have been most highly influenced by intensive infrastructure development in reunified Germany in the mid-1990s. Figure 3b2 is an expanded view to examine countries near the middle portions of the data sets. The data for many individual countries can be seen in these diagrams, and the higher slope of the Asian and European fitted lines is clearly due to only three countries in each case: China, South Korea, and Taiwan for Asia and Germany, the United Kingdom, and Italy for Europe.
A further expanded view to delineate countries in the lower portion of the scatterplot appears as Figure 3b3 in the Supporting Information.

Figure 3c1 plots the rate of landfilling of copper as a function of the rate of discarded copper. Overall, the landfill rate is about one-half of the discard rate (slope 0.49 in Table 1). There is a very high correlation between these two parameters (R² = 0.97). We regard this result as of low significance, however, because data on copper discards and landfilling are sparse in many countries (Paper I), and our informed estimates for those countries inherently tend to produce a correlation between the two parameters. An expanded view of the lower portion of the scatterplot appears as Figure 3c2 in the Supporting Information.

It is important to note that the information in Table 1 and Figures 2 and 3 reflects the situation for ca. 1994, the latest year for which comprehensive data were available when we began this work. The copper cycle demonstrates a much higher rate of extraction and entry into use than of discard and dissipation (Paper I), and the cycle parameters are a snapshot at one point in time for this evolving system. They thus provide perspective for thinking about other time periods, but it is realistic to anticipate that the data distributions for any particular epoch will be unique.
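
The slope, intercept, and R² values of the kind reported in Table 1 come from ordinary least-squares fits of one flow against another. The paper's fits were made in S Plus; the Python fragment below is only a sketch of the same calculation, and the function name `least_squares` and the data values are assumptions chosen for illustration (they are not country data from Paper I).

```python
def least_squares(x, y):
    """Fit y = slope * x + intercept by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical rates in Gg Cu/year; the additions are set to exactly
# 0.6 times the use rate so that the expected fit is known in advance.
use_rate = [10.0, 50.0, 200.0, 400.0, 1000.0]
addition_to_stock = [6.0, 30.0, 120.0, 240.0, 600.0]

slope, intercept, r_squared = least_squares(use_rate, addition_to_stock)
```

Fitting a regional subset, as is done for the Asian countries in Figure 3b, would amount to calling the same function on only the rows that belong to that region.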

FIGURE 2. Country-level copper cycle parameters. Each display shows data for the two countries among the 61 characterized with the highest rates of flow, the two nearest the mean, and the two lowest. In the case of the mean and low countries, if there are several with identical rates, typical examples are given. A box plot for the entire 61-country data set completes each display. In this approach, most easily seen in Figure 2g, the magnitudes of the upper and lower quartiles of the data define the upper and lower boundaries of the box, the median is a line within the box, the horizontal lines outside the box are the “inner fences”, which are data located no more than 1.5 times the interquartile range beyond the box top and bottom, and extreme values outside the inner fences are plotted individually. The parameters are defined in Table 1 and discussed in the text. All rates are in units of Gg Cu/year: (a) Copper extraction rate (24 countries extract no copper), (b) Copper product fabrication rate (8 countries fabricate no copper semi-products), (c) Rate of copper use, (d) Rate of copper addition to in-use stock, (e) Copper discard rate, (f) Copper landfill rate, (g) Copper (import-export) rate.

Regional-Level Copper Cycle

Regional-level copper cycles are the aggregates of information from the appropriate country-level cycles; the result is nine regions that encompass the entire world. Cycles for those regions and for the planet as a whole appear in Paper I. In Table 2, we present statistical information from the regional-level cycles for the same seven cycle parameters as for the country-level cycles of Table 1. The distributions of the data sets are shown as box plots in Figure 4.

The regional data sets of Figure 4 consist of only eight values each (we have omitted Antarctica in this analysis because of its uniquely tiny population and correspondingly low rate of copper use). In every case, except import/export, the distributions do not approach normality but rather tend to consist of three or four high values and four or five low values, reflecting the dominance for all data sets of the high copper flows in Asia, Europe, and North America (and for extraction rate of Latin America and the Caribbean). The median value for extraction is the highest of any of the data sets. The low end of the extraction distribution is notable; this is the Middle East region, which extracts almost no copper. Even more drastic is the extreme low value for (import-export): this is South America, almost entirely due to the large rate of export of Chilean copper.
As with the country-level distributions, the standard deviation decreases from the copper use rate data set to that for copper landfilling (Table 2).

Scatterplots of pairs of distributions for the regions, similar to those for countries in Figure 3, are shown in Figure 5, with associated statistical data in Table 2. The rate of addition of copper to in-use stock is plotted as a function of the rate of copper use in Figure 5a. Three regions are high and five are low in both parameters. North America’s position is dominated by the activities of the United States. In the case of Asia, China and Japan are the most important actors; for Europe, Germany’s contribution is crucial. For both Asia and Europe, however, a number of other countries also play significant roles.

The rate of fabrication as a function of the rate of extraction is given in Figure 5b in the Supporting Information. Four regions stand out on that plot: Asia, Europe, North America, and Latin America and the Caribbean. The locations of the latter two regions are largely the consequence of the copper cycle regional dominance of the United States and Chile, respectively. For Asia and Europe, in contrast, a number of countries contribute to the regional behavior.

Figure 5c, also in the Supporting Information, plots the regional copper landfill rate as a function of the regional copper discard rate. The pattern and its dependencies are reminiscent of Figure 5a.
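
The box-plot construction described in the captions of Figures 1 and 2 (quartiles for the box, inner fences at the most extreme data within 1.5 times the interquartile range, and individually plotted values beyond them) can be sketched as follows. This is an illustrative reading of the captions and not the S Plus routine used for the figures; the quartile convention (`method="inclusive"`), the function name `box_plot_stats`, and the eight data values are all assumptions.

```python
import statistics


def box_plot_stats(values):
    """Return quartiles, inner fences, and outlying values for a box plot."""
    data = sorted(values)
    # Quartile conventions differ between packages; "inclusive" is one choice.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_limit = q1 - 1.5 * iqr
    high_limit = q3 + 1.5 * iqr
    inside = [v for v in data if low_limit <= v <= high_limit]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Inner fences: most extreme data still within 1.5 * IQR of the box.
        "lower_fence": inside[0],
        "upper_fence": inside[-1],
        # Values beyond the inner fences are plotted individually.
        "outliers": [v for v in data if v < low_limit or v > high_limit],
    }


# Eight hypothetical regional values in Gg Cu/year (not Table 2 data).
regional_rates = [10, 20, 30, 40, 50, 60, 70, 500]
stats = box_plot_stats(regional_rates)
```

With only eight values per regional data set, the quartiles depend noticeably on the interpolation convention, which is one reason the regional box plots are best read qualitatively.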

FIGURE 3. Scatterplots and least-squares fitted lines of three paired sets of country-level copper cycle parameters: (a) Fabrication rate as a function of rate of extraction; note the two outlier points in upper and lower right corners; (b) Rate of addition to in-use stock as a function of use rate; and (c) Rate of landfilling as a function of rate of discard. To demonstrate regional differences, the countries for each region are indicated by distinctive symbols. The second diagram is shown at two magnitude scales and with a dashed fitted line that applies to Asian countries only, plus a third magnitude scale in the Supporting Information. The third diagram is shown at one magnitude scale here and another in the Supporting Information.

TABLE 2. Statistical Data for Regional-Level Copper Cycle Parameters (8 Regions(a))

parameter, Gg Cu/year             mean   median   standard deviation
1. extraction rate                1342      845                 1122
2. fabrication rate               1568      355                 1886
3. use rate                       1451      313                 1664
4. rate of addition to stock       957      180                 1153
5. discard rate                    499      190                  550
6. landfill rate                   218       34                  283
7. import/export rate             -124      435                 1725

fitted relationship                                      slope   intercept     R²
fabrication rate as a function of rate of extraction      0.54         838    0.1
rate of addition to stock as a function of use rate       0.69         -39   0.98
rate of landfilling as a function of rate of discard      0.51         -36   0.97

(a) Without Antarctica.

FIGURE 4. Comparative box plots for several selected copper cycle parameters at the regional level. (CIS is the Commonwealth of Independent States, the former Soviet Union). The statistics for the data distributions are given in Table 2. Data for Antarctica are not included, as they are insignificantly low compared to the other regions.

FIGURE 5. Scatterplots of three paired sets of regional-level copper cycle parameters: (a) Rate of addition to in-use stock as a function of use rate; (b) Fabrication rate as a function of rate of extraction; and (c) Rate of landfilling as a function of rate of discard. The statistics of fit are given in Table 2. As with Figure 4, the Antarctic region is not included. Figure 5b and c appear in the Supporting Information.

Multilevel Statistics for Copper Cycles

The challenge of exploring how the detail in data at one spatial level is reflected at other levels can be approached with the information described above. Our method is as follows. First, we utilize the country-level statistical sets for the various cycle parameters, as shown in Figure 1. Second, we utilize the similar set of statistics for the regional-level cycles of Figure 4. Finally, we compare the two statistical distributions for each of the parameters with the global value for the parameter, taken from the “global best estimate anthropogenic copper cycle”, given in Figure 6 of Paper I. Since the global cycle eliminates considerations of import and export, this operation can be done with flows 1-6 of Figure 1 but not with flow 7.

Data sets are statistically analyzed most conveniently if they are normal in character. To investigate the normality of an example of our data sets, we plot in Figure 6a the quantiles of the copper use rate at the country level against the quantiles of a normal distribution. On this quantile-quantile (Q-Q) plot, a normal distribution of empirical data would lie on a straight line (8, 10). It is immediately apparent that the country copper use rate data are not normally distributed. It is also clear, because of the upward curvature of the data in the right side of the plot, that the data have a much longer right tail than does a normal distribution. Physically, this indicates that the copper use rate of a small number of countries is higher than would be the case if use rates were normally distributed.

We can apply power transformations to the data set of Figure 6a in order to arrive at a data set that is approximately normal. Q-Q plots for two such transformations are illustrated: a logarithmic (base 10) transformation in Figure 6b and a square root transformation in Figure 6c. The latter is clearly inappropriate. The former passes the Kolmogorov-Smirnov goodness-of-fit test (p = 0.712; refs 8 and 11), provided we delete a handful of very small values for the rate of copper use from the data set. These deletions refer to countries with negligible employment of copper and with sparse data, and we can safely ignore them in an overall assessment. We therefore choose the logarithmic transformation as appropriate to the data. Similar results (not shown) are applicable to the other data sets of Figures 1 and 4.

To compare distributions at different spatial levels, it is necessary to normalize them (because regional-level values are far larger than nearly all country-level ones). Accordingly, we perform a second transformation on the data sets of Figures 1 and 4 by setting the mean value for each data set at unity and computing the other values as ratios of the mean value. The global value is unity by definition. The results of these operations are shown in Figure 7.

Consider first Figure 7c, where the distributions of rates of copper use at different spatial levels are compared. On the diagram, the two normalized distributions are shown as box plots, together with the normalized global data point. The country-level distribution is nearly symmetric around the median, as predicted by Figure 6b. At the regional level, the distribution is significantly more asymmetric.
Extreme values at both ends of the distribution are clearly important in reflecting the country-level signature at the regional level, while only the extreme upper values are important in reflecting the regional-level signature at the global level.

The pattern is repeated with modest variations for the rates of copper extraction, fabrication, and addition to in-use stock. Extreme values become increasingly less dominant as one moves to later stages of the copper life cycle. At the discard and landfill stages, the entire distributions are close to log-normal at the country level and approaching log-normal at the regional level. Overall, the results again demonstrate the increase in standard deviation from end-of-life stages to those stages more closely tied to industrial activity and individual country affluence.

FIGURE 6. Quantile-quantile plots of the country copper use rate data set against a normal distribution: (a) Untransformed country-level data; (b) Logarithmically transformed country-level data; (c) Square root-transformed country-level data.

These results constitute, we believe, the first coordinated multilevel statistical analysis of a material resource. We find that the magnitude of a cycle parameter at one spatial level is not necessarily simply related to the values of that parameter at smaller levels. In some cases, such as the rate of copper being discarded, the logarithmically transformed and normalized distributions flow relatively smoothly from level to level. In others, such as the rate of copper use, a few dominant high values at one level obviously have a strong influence on the values at the next higher level. This is, of course, a function of the distributions of parameter values, which in a number of instances are asymmetric, with a long high-value tail.
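
The transformation and comparison steps of this section (a base-10 logarithmic transformation, a quantile-quantile comparison against a normal distribution, and normalization to the data-set mean) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the S Plus analysis used in the paper: the plotting positions (i + 0.5)/n, the use of the Q-Q correlation coefficient as a measure of straightness, the `floor` parameter for dropping very small values, and the data values are all hypothetical choices. The Kolmogorov-Smirnov test itself is not reproduced here.

```python
import math
from statistics import NormalDist, mean


def log10_transform(values, floor=0.0):
    """Log10-transform values above `floor`; smaller values are dropped."""
    return [math.log10(v) for v in values if v > floor]


def normalize_to_mean(values):
    """Express each value as a ratio to the data-set mean (mean becomes 1)."""
    m = mean(values)
    return [v / m for v in values]


def qq_points(values):
    """Pair each sorted value with a standard-normal quantile at (i + 0.5) / n."""
    data = sorted(values)
    n = len(data)
    std_normal = NormalDist()
    return [(std_normal.inv_cdf((i + 0.5) / n), v) for i, v in enumerate(data)]


def qq_correlation(values):
    """Pearson correlation of the Q-Q points; 1.0 is a perfectly straight line."""
    points = qq_points(values)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical right-skewed use rates in Gg Cu/year (not Paper I data).
use_rates = [1, 2, 5, 10, 20, 50, 100, 200, 500, 1000, 2000, 5000]

r_raw = qq_correlation(use_rates)
log_rates = log10_transform(use_rates)
r_log = qq_correlation(log_rates)

# Second transformation: ratios to the mean, so the mean is unity.
normalized = normalize_to_mean(log_rates)
```

For data with a long right tail, the Q-Q points of the logarithmically transformed values lie much closer to a straight line than those of the raw values, which is the qualitative behavior shown in Figure 6a and b.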

Implications for Resources and the Environment

It is immediately apparent from the results of this research that the tools of EDA can be usefully applied to detailed studies of resource stocks and flows. The most obvious benefit is that the statistical distributions of the data sets as well as the influence of extreme values can be quickly recognized. Perhaps more beneficial is that large sets of tabular data can be explored in order to uncover their relationships, some of which may be hidden and unexpected. Finally, EDA approaches are natural, efficient ways in which to compare the characteristics of linked data sets in order to examine the ways in which linkage occurs.

FIGURE 7. Comparative box plots for several selected copper cycle parameters at different spatial levels: country (on the left), region (center), and planet (on the right). (a) Extraction rate, (b) Fabrication rate, (c) Rate of use, (d) Rate of addition to in-use stock, (e) Discard rate, (f) Landfill rate. All rates are logarithmically transformed and then expressed as ratios to the median value within each distribution.

A number of features of the stocks and flows of copper are demonstrated by this analysis: (1) All distributions of country-level stock and flow data are highly skewed, a few countries having large magnitudes, many having small magnitudes; (2) Rates of fabrication of copper-containing products for the countries are poorly correlated with rates of extraction, reflecting the fact that many countries that extract copper do not fabricate products from copper to any significant degree and vice versa; (3) Virtually all countries are adding copper to in-use stock (in pipe, wire, etc.). These rates of addition are highly correlated with rates of copper use in all regions, but the relationship differs with level of development; (4) With weak confidence, the rate of copper landfilling by countries appears to be highly correlated with the rate of discard; (5) The statistical distributions of both country-level and regional-level copper cycle parameters have successively lower standard deviations at later life stages.

The most significant (and unique) result is the multilevel analysis of Figure 7. In this case, we have a clear demonstration of the ways in which the signatures of actions at one level manifest themselves at higher levels. The principal conclusion is that the statistical distributions of most of the copper cycle parameters are approximately log-normal at the country level, but this behavior is approached at the regional level only at the end-of-life stages. The signatures of lower levels that are manifested at higher levels are sometimes, but not always, highly dependent on lower level extreme values rather than those of the distributions as a whole.

This result permits us to speculate about the distributions of copper flows at spatial levels lower than those we analyzed: fractions of countries, for example. We intuitively anticipate that those fractions would be a mixture of cities, suburbs, agricultural areas, mountain areas, etc., and that rates of copper use would show enormous variation. With some modest confidence, we suggest that were data at a subcountry level available, the distributions would be even broader than those shown for countries in Figure 7 but still roughly log-normal. The accuracy of these purported distributions could be tested, of course, by characterizing the data at the subcountry level, but the magnitude of such a task is large, and much of the necessary data might be unavailable in any case.

The use of these EDA cross-level approaches is not limited to material flow analysis. One other area that appears promising is that of biological ecology, where one could study distributions of data on resource stocks, resource flows, species populations, and so forth, at different spatial levels. Energy use statistics across spatial levels might also be usefully analyzed in these ways.

In summary, we have performed what we believe to be the first coordinated multilevel analysis of a material resource using data for stocks and flows of copper in countries, regions, and the world and employing the techniques of EDA. The results reveal unanticipated characteristics of the copper cycles and open a set of new and useful tools for employment in material flow analysis.
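The scaling described in the Figure 7 caption (logarithmic transformation followed by division by the median of each distribution) can be illustrated with a minimal sketch. It takes the caption literally, as the ratio of each log value to the median log value; the function names, the choice of base-10 logarithms, and the example rates are assumptions made for illustration and are not taken from the paper or from the software the authors used. The sketch also computes the five-number summary that a box plot displays and the standard deviation of the log values, the measure of spread compared across life stages.

```python
import math
import statistics


def log_ratio_to_median(rates):
    """Log10-transform positive rates, then divide each log by the median log.

    This is one literal reading of the Figure 7 caption (an assumption).
    """
    if any(r <= 0 for r in rates):
        raise ValueError("rates must be positive to take logarithms")
    logs = [math.log10(r) for r in rates]
    median_log = statistics.median(logs)
    if median_log == 0:
        raise ValueError("median of the log values is zero; ratio undefined")
    return [x / median_log for x in logs]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the quantities a box plot displays."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, q2, q3, max(values))


# Hypothetical flow rates in arbitrary units (NOT data from the paper).
example_rates = [1.0, 10.0, 100.0, 1000.0, 10000.0]

scaled = log_ratio_to_median(example_rates)
summary = five_number_summary(scaled)
# Spread of the log-transformed values (sample standard deviation).
log_spread = statistics.stdev([math.log10(r) for r in example_rates])

print("scaled values:", scaled)
print("box-plot summary:", summary)
print("std. dev. of log10 values:", round(log_spread, 3))
```

Applying the same two functions to each life stage at each spatial level would give summaries that can be compared side by side, which is the kind of cross-level comparison Figure 7 presents. The ratio is undefined when the median log value is zero, which the sketch reports as an error rather than guessing at a convention.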

Acknowledgments

This research was funded by the U.S. National Science Foundation under grant BES-9818788.

Supporting Information Available

Three figures that compare the copper flows of large-user countries, provide details of country-level scatterplots, and display regional-level scatterplots. This material is available free of charge via the Internet at http://pubs.acs.org.

Literature Cited

(1) Levin, S. A.; Grenfall, B.; Hastings, A.; Perelson, A. S. Science 1997, 275, 334-343. We subscribe in the present paper to the definitions of scale, a spatial, temporal, quantitative, or analytical dimension used to measure and study a phenomenon (example, political or geographical entity), and level, the location along a scale (example: country, region, and planet), of Gibson, C. C.; Ostrom, E.; Allen, T. K. Ecol. Econ. 2000, 32, 217-239 and have modified the quotation from Levin et al. accordingly.
(2) Tukey, J. W. Exploratory Data Analysis; Addison-Wesley: Reading, MA, 1977.
(3) Cleveland, W. S. The Elements of Graphing Data; Wadsworth: Monterey, CA, 1985.
(4) Weihs, C. J. Chemometr. 1993, 7, 131-142.
(5) Graedel, T. E.; van Beers, D.; Bertram, M.; Fuse, K.; Gordon, R. B.; Gritsinin, A.; Kapur, A.; Klee, R. J.; Lifset, R. J.; Memon, L.; Rechberger, H.; Spatari, S.; Vexler, D. Environ. Sci. Technol. 2004, 38, 1242-1252.
(6) S-Plus 6.1 for Windows; Insightful Corp.: Seattle, WA, 2002.
(7) Hangos, K.; Cameron, I. Process Modelling and Model Analysis; Academic Press: San Diego, CA, 2001.
(8) Freund, R. J.; Wilson, W. J. Statistical Methods, 2nd ed.; Academic Press: San Diego, CA, 2003.
(9) Kesler, S. E. Mineral Resources, Economics, and the Environment; Macmillan: New York, 1994.
(10) Kleiner, B.; Graedel, T. E. Rev. Geophys. Space Phys. 1980, 18, 699-717.
(11) Conover, W. J. Practical Nonparametric Statistics; John Wiley & Sons: New York, 1980.

Received for review April 16, 2003. Revised manuscript received November 20, 2003. Accepted November 21, 2003. ES0304345
