Environ. Sci. Technol. 2008, 42, 7309–7314
Analyzing Beijing’s In-Use Vehicle Emissions Test Results Using Logistic Regression
CHENG CHANG AND LEONARD ORTOLANO*
Department of Civil and Environmental Engineering, Stanford University, Jerry Yang & Akiko Yamazaki Environment & Energy Building, Room 249, Stanford, California 94305-4020
Received October 18, 2007. Revised manuscript received June 1, 2008. Accepted July 8, 2008.
A logistic regression model was built using vehicle emissions test data collected in 2003 for 129 604 motor vehicles in Beijing. The regression model uses vehicle model, model year, inspection station, ownership, and vehicle registration area as covariates to predict the probability that a vehicle fails an annual emissions test on the first try. Vehicle model is the most influential predictor variable: some vehicle models are much more likely to fail emissions tests than an “average” vehicle. Five out of 14 vehicle models that performed the worst (out of a total of 52 models) were manufactured by foreign companies or by their joint ventures with Chinese enterprises. These 14 vehicle model types may have failed at relatively high rates because of design and manufacturing deficiencies, and such deficiencies cannot be easily detected and corrected without further efforts, such as programs for in-use surveillance and vehicle recall.
1. Introduction

Air pollution related to motor vehicles has increased significantly in Chinese cities, which are among the most polluted in the world. Between 1997 and 2003, the number of vehicles in Beijing went from 1 million to 2 million. In 1999, Beijing became the first Chinese city to implement Euro I emission standards, which were the standards the EU employed between 1992 and 1998. In 2001, the Euro I emission standards were extended to cover all of China, and in 2005, the more rigorous Euro II standards were adopted.

China’s vehicle emission control regulations contain only some of the elements found in comparable regulations in Europe and the United States. Although China has emission standards and a program for annual inspection and maintenance (I/M) to test in-use vehicles for compliance with standards, it lacks an in-use surveillance program and a recall system, both of which are important in detecting systematic failures in vehicle design and manufacturing.

The analysis herein centers on a sample of annual I/M data collected at 16 Beijing inspection stations during several months in 2003. Annual inspections in Beijing are conducted at 19 urban stations and 19 suburban stations operated by the Public Security Bureau of Beijing Municipal Government. Owners of in-use vehicles are free to select the station at which they would like to have their vehicles tested.

* Corresponding author phone: 650-723-2937; fax: 650-725-3164; e-mail: [email protected]. 10.1021/es702636a CCC: $40.75
Published on Web 08/27/2008
2008 American Chemical Society
The data, which consist of emissions test results for 129 604 motor vehicles in Beijing, were analyzed using a logistic regression model. Regression modeling is widely used in analyzing vehicle emissions test data, and comprehensive reviews of relevant statistical modeling studies are available in the literature (1). Traditionally, statistical analyses of vehicle emissions test data have used several covariates to predict emission values of specific pollutants. This is typically done with multivariate linear regression models, sometimes with logarithmic transformations to account for the nonnormality of data (2). Recently, several studies have used logistic regression to model how a group of predictor variables contributes to whether a vehicle failed or passed an emissions test, and that is the approach followed herein (3, 4).

On the basis of the literature (2-5), factors usually considered in statistical analyses of I/M test data include
• vehicle technology (such as whether vehicles have carburetors or three-way catalytic converters);
• vehicle make and model, for which manufacturer is sometimes used as an indicator for vehicle model (3, 5);
• specific characteristics of a vehicle, such as weight, engine displacement, number of cylinders, and so forth;
• fuel quality (e.g., use of oxygenated additives or fuels without additives);
• vehicle model year, age, and mileage;
• vehicle maintenance and evidence of misuse;
• testing conditions (e.g., room temperature of inspection station); and
• socioeconomic parameters, such as household income.

The choice of variables is based on data availability, and no single modeling study has used all of these variables. For the data set examined herein, all vehicles were required to meet Euro I emission standards, and therefore, each used a three-way catalytic converter.
The modeling study reported on below includes the following variables: vehicle model and model year, inspection station, vehicle registration area (urban vs suburban), and category of ownership. Inspection station location was examined because I/M programs in Beijing are conducted only at authorized centralized stations. The inclusion of vehicle registration area turned out to be fruitless because most of the vehicles in the data set were registered in urban areas. Data on other factors cited in the literature, such as specific vehicle characteristics, were not collected as part of the data set, and thus, they were not included in the modeling. Of all the variables considered, vehicle model had the greatest effects on emissions test results, and those are the modeling results emphasized herein.

Although our analysis method is not novel, the results are nonetheless notable for a number of reasons. Wide agreement exists on the international significance of the domain: vehicle emissions in China. In addition, the analysis adds new knowledge about Beijing’s I/M programs; very few people either within or outside of China have conducted in-depth analyses of Chinese emissions inspection data, and we are not aware of any researchers who have analyzed a Chinese vehicle emissions database that includes results for both vehicles that passed emissions tests on the first try and those that required two or more tests to pass. Moreover, because of inadequacies in China’s vehicle certification procedures, new vehicle models having excessively high failure rates because of problems with design and manufacturing may reach the market. Heretofore, neither government officials nor researchers have investigated this matter in a systematic fashion.
2. Data and Methods

Tsinghua University in Beijing has assembled a substantial set of vehicle emissions test data from inspection stations in Beijing, and this paper employs a portion of that data to investigate whether particular vehicle types have excessively high failure rates. The Beijing I/M program uses the acceleration simulation mode (ASM) (6) method, a steady-state loaded mode test. The cutpoints in the standards vary by vehicle model year, type, and weight. In designing the Beijing ASM standards, experts in Beijing referred to the following documents: U.S. EPA-AA-RSPD-IM-96-2 (1996) and the ASM section in California’s BAR97 Regulation. The rationale behind the Beijing ASM standards is similar to the U.S. ASM standards, but specifics vary on the basis of particular circumstances faced in Beijing. For example, there are differences in the number of weight categories and specific cut points.

The testing procedure used in Beijing allows fast-pass. If the emissions of pollutants (i.e., CO, HC, NOx) of a vehicle pass initial tests in a high load (50%) and low driving speed (24 km/h) testing mode, then the vehicle does not need to be tested in low load (25%) and high driving speed (40 km/h) mode. However, if a vehicle fails for any of the three pollutants in low-speed mode, it must go through a high-speed mode testing process. A vehicle must satisfy conditions for all three pollutants to pass the test. (The Supporting Information contains further details on testing procedures.)

Emissions testing stations in Beijing are managed by contractors, but overseen by the Beijing Environmental Protection Bureau (EPB). Testing facilities are not permitted to repair vehicles that have failed I/M testing; vehicles are to be repaired by independent service stations. The focus of our analysis is vehicle performance (i.e., pass or fail) on initial tests of vehicles during 2003, and the issue of vehicle repairs is not considered.
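The fast-pass rule described above can be sketched in a few lines of Python. This is only an illustration: the function names, the dictionary layout, the cut point values, and the "at or below the cut point" comparison are assumptions, not taken from the Beijing standards. Because the text does not state how the two modes combine into a final result when the high-speed mode is run, the sketch reports per-mode outcomes only.

```python
POLLUTANTS = ("CO", "HC", "NOx")


def meets_cutpoints(readings, cutpoints):
    """True only if every pollutant reading is at or below its cut point.

    The <= comparison is an assumption; the text says only that all three
    pollutants must satisfy the conditions.
    """
    return all(readings[p] <= cutpoints[p] for p in POLLUTANTS)


def run_asm(low_speed, high_speed, cut_low, cut_high):
    """Apply the fast-pass rule.

    low_speed:  readings in the 50% load / 24 km/h mode
    high_speed: readings in the 25% load / 40 km/h mode (may be None
                when the fast-pass applies and the mode is never run)
    """
    if meets_cutpoints(low_speed, cut_low):
        # Fast-pass: the high-speed mode is skipped entirely.
        return {"fast_pass": True,
                "modes_run": ["low_speed"],
                "high_speed_ok": None}
    # Any low-speed exceedance sends the vehicle to the high-speed mode.
    return {"fast_pass": False,
            "modes_run": ["low_speed", "high_speed"],
            "high_speed_ok": meets_cutpoints(high_speed, cut_high)}


# Hypothetical cut points and readings, for illustration only.
CUT = {"CO": 1.0, "HC": 100.0, "NOx": 1000.0}
clean = {"CO": 0.5, "HC": 50.0, "NOx": 500.0}
high_hc = {"CO": 0.5, "HC": 150.0, "NOx": 500.0}

print(run_asm(clean, None, CUT, CUT)["modes_run"])
print(run_asm(high_hc, clean, CUT, CUT)["modes_run"])
```

In the second call a single pollutant (HC) exceeding its cut point is enough to trigger the high-speed mode, which mirrors the "any of the three pollutants" wording.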
During our data cleaning process, we observed that some vehicles failed multiple times, but our data set only includes results from initial tests. The ASM testing method and standards were implemented in early 2003 in Beijing, and the resulting data is the first set of emissions test results in China based on the ASM testing method and standards. All vehicles in the data set were registered during or after 1999, and all were required to meet Euro I standards. Standards for older vehicles are much less stringent than Euro I. In China, vehicles that fail an annual I/M emissions test on the first try must take additional tests (presumably after making changes) until the vehicle passes. If a vehicle does not eventually pass, it cannot operate legally.

The data set contains results from tests conducted at 15 urban inspection stations and 1 suburban station from June to October of 2003. Although the ASM testing was officially supposed to begin in early 2003, it did not start in a serious way until June of 2003. The main reason for the delay in starting the program on time at full scale was the SARS outbreak. Among the approximately 8000 data points we collected for vehicles that failed their first emissions tests, only about 100 of those vehicles failed during the period from January to May 2003. Under the circumstances, we used only 2003 data from June to October.

The data are for 52 vehicle models. Only light duty gasoline vehicles (e.g., sedans, SUVs, minivans, and light duty trucks) are included in the data set. The number of vehicles tested per model ranged from over 24 000 to 103. This paper investigates whether some models have failure rates that are significantly higher than average. Since vehicle model is only one of several variables affecting emissions test results, other variables must also be considered.

2.1. Data Structure and Variables. A single observation in the raw data set is for a vehicle’s first test in 2003. For each
vehicle, the data contains basic information about the vehicle (model and model year) as well as the identity of the inspection station and results from emissions testing. Since the primary goal was to examine failure rates of vehicle models, numerical emissions test results were not analyzed in the regression study. Numerical results were generated by testing machines and were compared with relevant standards by computers to determine whether the vehicle passed or failed the test. The dependent variable, whether a vehicle failed or passed the test during the first time the vehicle was tested in a given year, was recoded as a binary variable (a value of one indicates a failure and a zero indicates a pass). The status of a vehicle in a test was determined by the testing machines at a station. For vehicles that failed their first test in the year, both cut points and the vehicle’s actual emissions values for the three pollutants were printed on paper sheets. On the basis of these data, we were able to check the consistency between the cut points and the standards. The raw data were also used to create “independent” (or predictor) variables, also called covariates. Vehicle model name and manufacturer were used to create the variable “MODEL,” which has 52 categories. Models were numbered consecutively based on sample size, beginning with M1 for the largest sample size. For any particular value of MODEL, a variable called “CLASS” was determined, and it has three categories: luxury, economy, and truck/van/minivan. Apart from a few exceptions described by Chang (7), it is not difficult to classify automobile models in terms of class. As of 2003, over 100 auto makers sold cars in China, and most were either domestic firms or foreign joint ventures. The number of foreign imports was small. With the exception of several large conglomerates, most manufacturers’ product lines were short. 
The variable “DOMESTIC” indicates whether a vehicle was made by a domestic manufacturer, a foreign company, or a joint venture involving both a Chinese and a foreign firm. (Because most imports were difficult to distinguish from foreign models produced in China, all foreign models were categorized as “foreign” in the data set, regardless of whether they were imported or produced by joint ventures involving foreign partners.) By definition herein, cars categorized as “domestic” are not linked to foreign vehicle manufacturers. The CLASS and DOMESTIC variables were both coded from information used to define MODEL; therefore, they are highly collinear with MODEL and are not included in the regression model. They are introduced here because they play a role in interpreting model outcomes.

A binary variable called “SUBURBAN” has a value of 1 to signify registration at a suburban office, and a value of zero to indicate registration at an urban office. The “OWNERSHIP” variable has three categories: government (including public schools and other public service units), business, and individual. A categorical variable called “SITE” has 16 values, one for each of the inspection stations at which vehicles in the data set were tested. The Supporting Information contains a table summarizing the predictor variables.

The first year in which a vehicle was registered is treated as the vehicle’s model year (MY). There are four model years because only vehicles having model years after 1998 must meet Euro I standards. Vehicles with earlier model years were not included in this data set because emission standards for those vehicles are much less stringent.

The dependent variable in this analysis is a binary indicator of whether a particular vehicle failed the first time it was tested in 2003. The focus is on “first failure.” If a vehicle fails an emissions test more than once, the data set includes only test results from the first failure.
The fact that the vehicle needed multiple tests to pass is not reflected in the data.

2.2. Threats to Data Quality. Threats to data quality from inspection stations can result from incompetence or misconduct in the context of any I/M program. No one program is unique in this regard. Although we took measures to ensure the quality of data used in the model building process, it was not feasible to eliminate every conceivable threat. In the interests of completeness, we indicate potential difficulties. However, delineating possible threats to data quality is not equivalent to saying that our data suffers from these problems. Moreover, data quality difficulties have been reported even for I/M programs in OECD countries (8). Below, we indicate actions taken to avoid gathering inaccurate data.

Inconsistencies in equipment and procedures among testing stations can be a problem, and Beijing EPB took steps to avoid such difficulties by using its new ASM testing method in small-scale trials before full-scale implementation in 2003. In addition, the EPB trained inspection station staff. We found no evidence of inconsistencies in testing equipment or procedures. However, technical deficiencies in testing procedures may have been overlooked or remained unidentified during the summer of 2003 when crowding occurred at stations because the SARS outbreak caused postponement of tests for vehicles originally slated for testing in spring 2003. Staff were required to work longer hours than usual in the summer because stations remained open later to accommodate increased demand for testing (7).

Corruption or other misconduct is always a possibility with I/M programs. In 2001, Beijing EPB identified misconduct at a few inspection stations, but not among stations in our database (7). The EPB responded by imposing sanctions for the misconduct, temporarily halting operations, and having station managers make changes to prevent future misconduct. That the problems were dealt with quickly demonstrated the EPB’s intent to have a meaningful vehicle emission control program based on high quality testing.
In creating the database for this research, we took our own steps to control data quality, particularly on test results for vehicles failing their first tests during 2003; these data were generated by measuring instruments but recorded on paper sheets instead of in computers. The Supporting Information describes procedures used to ensure high levels of quality in data entry for test results involving vehicle failures. Most information associated with passing emissions tests was computer-generated and posed fewer challenges.

Another data quality threat is the improper use of pretest conditioning procedures, for example, taking sufficient time between tests to allow testing equipment to self-adjust before continuing. Because we spent limited time during visits to inspection stations, we cannot claim that proper pretest conditioning procedures were always followed. However, they were carefully followed during our visits.

Under the circumstances, we believe the data is as accurate as it can be and arguably the best of its type for Beijing, and possibly for all of China. If our analysis leads some to challenge the quality of the data, such challenges could provide a basis for motivating Chinese officials to improve vehicle emissions data gathering efforts.

2.3. Logistic Regression Model. The response variable of interest is whether a vehicle fails in its first emissions test. Logistic regression models, rather than linear regression models, are the most appropriate for a binary response variable of this type (9, 10). In logistic regression models of the type used herein, the response variable Y is binary (0 or 1), and it is not modeled directly. Instead, the probabilities that the response takes on a 0 or 1 are modeled. The relationship between the probability π and covariates (or predictor variables) x = (x1, x2, ..., xk) is represented by a logistic response function,

$$
\pi = \Pr(Y = 1 \mid X = x) = \frac{e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_i x_i + \cdots + \beta_k x_k}}{1 + e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_i x_i + \cdots + \beta_k x_k}} \qquad (1)
$$
where π is the probability of an event occurring given a set of predictor variables; Y is 1 if an event occurs, 0 if the event does not occur; x is the vector of predictor variables; e is the base of the natural logarithm; xi is the value of the ith predictor variable (where i = 1, 2, ..., k); βi is the coefficient of xi; and k is the number of covariates.

The function in equation 1 can be transformed into a linear function by taking the natural log of the ratio of two probabilities as follows:

$$
g(x_1, \ldots, x_k) = \ln\left(\frac{\pi}{1 - \pi}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_i x_i + \cdots + \beta_k x_k \qquad (2)
$$

The ratio of the probabilities [i.e., π/(1 - π)] is the probability of an event’s happening (Y = 1) over the probability of the event’s not happening (Y = 0). This ratio is referred to as the odds of the event (11). When two groups of outcomes are compared, it is common to use this terminology and work with an odds ratio (OR, a ratio of the odds for the two events). The logarithm of the odds is called the log odds or logit. In logistic regression, the relationship between logit and the covariates is linear (see eq 2).

An odds ratio is often used to assess the influence a covariate can have on the probability of an event if the covariate changes a small amount (usually one unit) and all other covariates remain constant. For a dichotomous covariate, an odds ratio compares the odds of a pair of outcomes on the response variable (Y) on the basis of the corresponding pair of categories for the covariate. For example, consider a dichotomous predictor variable, xi; suppose two groups (or categories) for xi are vehicle models A and B, and the goal is to compare the influence of each category on the response variable. In this case, the odds ratio comparing group A (OddsiA) and group B (OddsiB) is

$$
\mathrm{OR}_i(A\ \mathrm{vs}\ B) = \frac{\mathrm{Odds}_{iA}}{\mathrm{Odds}_{iB}} = \frac{\pi_{iA} / (1 - \pi_{iA})}{\pi_{iB} / (1 - \pi_{iB})} \qquad (3)
$$
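As a brief worked step connecting the log-odds form of the model to the odds ratio for groups A and B: suppose xi is coded 1 for group A and 0 for group B, with all other covariates held fixed (the coding is the only assumption made here). Subtracting the two logits cancels every term except the coefficient of xi, so that coefficient is the log of the odds ratio.

$$
\ln \mathrm{OR}_i(A\ \mathrm{vs}\ B)
= \ln\frac{\pi_{iA}}{1 - \pi_{iA}} - \ln\frac{\pi_{iB}}{1 - \pi_{iB}}
= g(x_i = 1) - g(x_i = 0)
= \beta_i
\quad\Longrightarrow\quad
\mathrm{OR}_i(A\ \mathrm{vs}\ B) = e^{\beta_i}
$$

A positive βi therefore means higher odds of failure for group A than for group B, and βi = 0 means the odds are equal.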
In logistic regression analysis, it is common to use the reference category of a variable as group B, and this reference category is, by definition, not included in the regression (11). In this case, the reference category approach assigns 1 to the xi in group A and 0 to the xi in group B (the reference group). The reason for dropping one category from the variable is to prevent collinearity or multicollinearity.

The process of model building involves estimating the coefficients in eq 1 (i.e., βi, i = 1, ..., k) using a standard procedure, the maximum likelihood method. Once the values of βi are known, eqs 2 and 3 can be used to compute odds ratios, and this is how results are normally presented. However, as demonstrated by Chang (7), these odds ratios, together with information on the probability of failure for variables used as reference categories, make it possible to present results in terms of the probability of failure for each of the other covariates. Because failure probabilities are easier to interpret, they are used herein.

Standard methods were used to construct the logistic regression model. First a “main effects” model was constructed; it involved no interactions among the variables. After examining the main effects model and the distribution of different vehicle model years, it became apparent that interactions between MY and MODEL required special attention. Therefore, the final model that was built involved the above-mentioned variables, but with separate terms for interactions between MODEL and MY. Details of the model building process are given by Chang (7).
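To make the maximum likelihood step concrete, the sketch below treats the simplest possible case: one dichotomous covariate with reference-category coding (x = 1 for group A, x = 0 for reference group B). In that special case the maximum likelihood estimates have a closed form, and the code checks that the gradient of the log-likelihood (the score) vanishes there. The counts are hypothetical and the function names are illustrative. The paper's actual model has many covariates and would be fitted numerically with statistical software, which this sketch does not attempt.

```python
import math


def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1.0 - p))


def logistic(z):
    """Logistic response function, the inverse of logit."""
    return 1.0 / (1.0 + math.exp(-z))


def closed_form_mle(fail_a, n_a, fail_b, n_b):
    """MLE for logit(pi) = b0 + b1*x, x = 1 for group A, 0 for reference B."""
    b0 = logit(fail_b / n_b)        # intercept: log odds of the reference group
    b1 = logit(fail_a / n_a) - b0   # coefficient: log odds ratio, A vs B
    return b0, b1


def score(b0, b1, fail_a, n_a, fail_b, n_b):
    """Gradient of the binomial log-likelihood with respect to (b0, b1)."""
    pi_a = logistic(b0 + b1)
    pi_b = logistic(b0)
    d_b0 = (fail_a - n_a * pi_a) + (fail_b - n_b * pi_b)
    d_b1 = fail_a - n_a * pi_a
    return d_b0, d_b1


# Hypothetical counts: 60 failures in 300 tests (A), 50 in 1000 tests (B).
FAIL_A, N_A, FAIL_B, N_B = 60, 300, 50, 1000

b0, b1 = closed_form_mle(FAIL_A, N_A, FAIL_B, N_B)
odds_ratio = math.exp(b1)
print(round(odds_ratio, 2))  # (0.20/0.80) / (0.05/0.95) = 4.75
print(score(b0, b1, FAIL_A, N_A, FAIL_B, N_B))  # both components near zero
```

Dropping the reference category is what makes the intercept interpretable here: b0 carries the reference group's log odds, so adding a separate dummy for group B would duplicate it.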
FIGURE 1. Test failure rates for different vehicle models.
3. Results and Discussion

3.1. Model Type and Failure Rate. Of all the variables, MODEL has the strongest correlation with whether a vehicle fails or passes an emissions test the first time the vehicle is tested. The failure rate for a vehicle model is defined as the number of vehicles of a particular model that fail on the first test divided by the total number of observations for the model. The average failure rate for the 52 vehicle models in Figure 1 is 6.2%. The numbering sequence of the models corresponds to a descending order of the total number of observations for each model. (Because the data could be obtained only on the condition that vehicle models would not be identified by name, all models are delineated by number.) For vehicle models numbered between 1 and 20, sample sizes for models are above 1000; for vehicle models numbered above 20, sample sizes are between 100 and 1000.

Many models have failure rates well above the 6.2% average, and nearly 30% of the 52 vehicle models have failure rates more than double the average. With two exceptions, these vehicle models have fewer than 1000 vehicles in the data set. One possible explanation for this pattern is that small automakers, which are associated with many of the vehicle models with small sample sizes, may have significant problems with quality control because small auto manufacturers in China often have limited technological and innovative capabilities. A second possible explanation, which may account for the exceptionally high failure rate for model 52, is that relatively small samples may yield biased estimates of failure rates because the influence of data input errors is relatively large and the samples may not be representative.

3.1.1. Exact Binomial Confidence Intervals of Failure Rates.
The data set includes more than 90% of all vehicles tested at 16 of the 38 inspection stations in Beijing during the period from June to October 2003, and thus, the vehicles tested for each vehicle model represent samples out of the entire fleet (population) of most vehicle models in Beijing in 2003. The 129 604 vehicles in the data set represent about a quarter of the population (7). Confidence intervals at the 95% level were calculated, and the vehicle models were grouped by CLASS and DOMESTIC (see Figure 2).
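One common way to compute an exact binomial confidence interval is the Clopper-Pearson construction. The text does not name the specific exact method used, so treating it as Clopper-Pearson is an assumption. The sketch below finds each bound by bisection on the binomial tail probability using only the Python standard library; the counts (86 failures in 103 tests) are illustrative and are not taken from the data set.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(math.comb(n, i) * p**i * (1.0 - p)**(n - i)
               for i in range(k + 1))


def _bisect_root(f, lo=0.0, hi=1.0, iters=60):
    """Root of an increasing function f on (lo, hi), by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_binomial_ci(k, n, conf=0.95):
    """Clopper-Pearson interval for k failures observed in n tests."""
    alpha = 1.0 - conf
    if k == 0:
        lower = 0.0
    else:
        # Lower bound: the p at which P(X >= k) equals alpha / 2.
        lower = _bisect_root(
            lambda p: (1.0 - binom_cdf(k - 1, n, p)) - alpha / 2.0)
    if k == n:
        upper = 1.0
    else:
        # Upper bound: the p at which P(X <= k) equals alpha / 2.
        upper = _bisect_root(lambda p: alpha / 2.0 - binom_cdf(k, n, p))
    return lower, upper


# Illustrative small sample with a high failure rate.
K, N = 86, 103
low, high = exact_binomial_ci(K, N)
print(f"rate={K / N:.3f}  lower={low:.3f}  upper={high:.3f}")
```

For a failure rate far from 50%, the two bounds sit at unequal distances from the point estimate, which is the asymmetry visible in some intervals in Figure 2. Direct summation is adequate for samples of a few hundred vehicles; for models with tens of thousands of observations a numerically stable routine from a statistics library would be a better choice.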
Most luxury vehicle models are foreign, and of those, only 7 out of 19 models have average failure rates below the full sample average of 6.2%. Moreover, 11 foreign luxury vehicles have lower bounds on their confidence intervals above 6.2%. The results for foreign luxury cars are striking since foreign manufacturers either exporting cars to China or producing cars in China through joint ventures with domestic companies would be expected to have access to technologies far more advanced than required to meet Euro I standards. Results for the two domestic luxury models are notable. M44 has among the lowest failure rates of all luxury models (i.e., foreign plus domestic). However, M14 has a failure rate (13%) higher than the average for foreign luxury models (11%).

Some confidence intervals in Figure 2 are not symmetric. This occurs as a result of our using the exact confidence interval approach based on the binomial distribution; this method, as opposed to the commonly used method based on the normal distribution, is appropriate for instances in which the sample size is small.

Figure 2 shows that almost every economy class vehicle model in our data set has an upper bound on its confidence interval of less than 6.2%, regardless of whether the vehicle is a domestic or a foreign model. The poor performance of models M38 and M45, which are both foreign, is notable, with the greater than 40% average failure rate for M38 being particularly striking.

The average failure rate for all truck models is 12.7%, whereas the average failure rate for all sedans (both luxury and economy) is only 4.9%. The average failure rate for luxury sedans is 7.1%, and that for economy sedans is 4.1%. One possible explanation for these dramatic differences in performance between trucks and sedans could be as follows: Trucks were designed to a less stringent certification standard than cars (see the Euro I new vehicle certification standards in Table S2), yet the Beijing ASM (Table S3) and U.S.
ASM (Table S4) cut points are very similar for cars and trucks. (The differences consist of small deviations based on different vehicle test weights.) This does not provide a complete explanation for the differences in failure rates, however, because the Euro I new vehicle certification standards are based on total mass of pollutants emitted, but the Beijing and U.S. ASM cut points are based on concentration (percentage or ppm). Therefore, even though a truck emits much more mass per mile (or second), the concentration of emissions from trucks could be low because the displacements of truck engines are usually larger than those of sedan engines. The larger emissions in mass from a truck engine can thus be diluted by the relatively large volume of air associated with the bigger engine size, and thus, similar cut points in concentration between sedans and trucks do not necessarily mean that the standards for trucks are less stringent than those for sedans.

FIGURE 2. Failure rates with confidence intervals grouped by CLASS and DOMESTIC.

Foreign vehicles in the truck/van/minivan class perform relatively well. In contrast, the domestic vehicles, with some notable exceptions, perform poorly, with 5 of the 19 domestic vehicles in this category having lower bounds on their confidence intervals of greater than 25%. For both luxury and economy classes, about 95% of the vehicles are in the foreign category; in contrast, for trucks, vans, and minivans, about 95% of the vehicles are in the domestic category.

3.1.2. Fourteen Worst-Performing Vehicle Models. Vehicle model plays a key role in the overall analysis, and thus, the focus herein is on vehicle models in our data set with relatively poor performance. The process of identifying poorly performing models relied on the main effects model. The final model could not be used to identify models with the highest failure probabilities because each vehicle model has four values of probabilities (one for each of the four model years).
Using the main effects model, it was possible to identify the worst performing vehicle models, defined as the models having statistically significant values of βi greater than 1.5, which corresponds to an estimated probability of failure of 23.2% relative to the 6.35% failure probability of the reference category for MODEL (i.e., M2) in the main effects model. (The Supporting Information contains modeling details.) There were 14 such vehicle models, and they are listed in Table 1, along with corresponding average failure rates and estimated probabilities of failure.
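The conversion from a coefficient to a failure probability can be illustrated with a short sketch. It assumes the coefficient multiplies the odds of the reference category by e^β, as the odds-ratio formulation implies; the function name is illustrative. With β = 1.5 and a reference probability of 6.35% it gives about 23%, in line with the 23.2% quoted above. The small difference presumably reflects rounding of the reference probability.

```python
import math


def failure_probability(beta, p_ref):
    """Failure probability implied by coefficient beta, given the
    failure probability p_ref of the reference category."""
    odds_ref = p_ref / (1.0 - p_ref)    # odds for the reference category
    odds = odds_ref * math.exp(beta)    # odds ratio exp(beta) scales the odds
    return odds / (1.0 + odds)          # convert odds back to a probability


P_REF = 0.0635  # reference-category failure probability quoted in the text

p = failure_probability(1.5, P_REF)
print(f"{p:.3f}")  # 0.233
```

A coefficient of zero returns the reference probability itself, which is a quick sanity check on the conversion.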
CLASS and DOMESTIC for the 14 Worst Models. In analyzing the 14 worst vehicle models, consider the two variables that can be derived on the basis of vehicle models: vehicle class and type of manufacturer (i.e., domestic or foreign). As shown in Table 1, nine of the worst performing vehicle models are within the class category of truck, van, and minivan, and they are all domestic. Of the five foreign vehicles in Table 1, four fall into the luxury class, and one is in the economy class. A cross tabulation between CLASS and DOMESTIC shows that foreign manufacturers produced or sold very few trucks, vans, and minivans (only 1.2% out of the total vehicle population in the data set). In contrast, domestic automakers focus primarily on the production of trucks, vans, and minivans. It is surprising that many foreign luxury models had large probabilities of failure. As explained below, the appearance of the nine domestic truck/van/minivan models is not a surprise.

TABLE 1. Summary of Fourteen Worst Performing Vehicle Models

vehicle model   estimated prob. of failure (a)   av failure rate (%)   vehicle class   domestic or foreign
M52             93.7                             83.5                  truck           domestic
M29             68.8                             51.7                  truck           domestic
M25             65.3                             35.8                  truck           domestic
M28             61.5                             52.2                  truck           domestic
M39             49.2                             34.9                  truck           domestic
M38             41.9                             42.0                  economy         foreign
M20             41.9                             20.0                  truck           domestic
M43             40.1                             12.4                  truck           domestic
M34             34.0                             7.3                   truck           domestic
M30             30.0                             9.2                   truck           domestic
M51             29.4                             23.4                  luxury          foreign
M41             29.1                             28.8                  luxury          foreign
M47             27.9                             24.5                  luxury          foreign
M40             27.5                             16.0                  luxury          foreign

(a) All estimated probabilities are relative to the average 6.35% failure probability of the reference category for MODEL (i.e., M2).

During interviews with government officials and auto manufacturer managers conducted for this research in Beijing, Shanghai, Changchun, Wuhu, and Ningbo between
2002 and 2004, many interviewees noted that the technological capabilities of domestic manufacturers that produce light-duty trucks, vans, and minivans lag far behind those of foreign joint ventures producing comparable types of vehicles. In addition to the less-advanced manufacturing technologies used in the production of trucks, vans, and minivans, the emission control equipment purchased by domestic manufacturers of these vehicle types often consisted of “low-end” technologies produced inexpensively by domestic equipment manufacturers. These observations help explain why the nine domestic vehicle models in Table 1 have relatively high probabilities of failure. However, the results for foreign manufacturers in the table are more difficult to comprehend because foreign manufacturers and foreign joint ventures have ready access to technologies needed to meet Euro I standards.

Results in Table 1 suggest that even vehicles produced by foreign manufacturers and joint ventures may have systematic defects in terms of emission controls. The commonly expressed view that foreign companies (including joint ventures) produce vehicles with higher quality may not reflect reality. However, without an in-use surveillance program designed to detect systematic shortfalls in the design and manufacturing of vehicle models, the findings suggesting systemic defects (based on these I/M testing data) are open to question.

Problems in performance of vehicle models made by multinational corporations are not unique to China. Stewart et al. (12), in reporting on vehicle emissions data in British Columbia, Canada, indicated that multinational auto companies might have sometimes sold cars in Canada with systematic shortcomings in terms of meeting emissions requirements. Stewart et al. found that some Japanese, American, and European vehicle models (with model years from 1998 to 2002) tested in British Columbia had very high failure rates.
(Examples are given in the Supporting Information.) This comparison has its limitations, since Canada does not have a domestic auto industry like China’s, and the definition of “foreign” used herein includes vehicles imported to China as well as those manufactured by foreign companies or joint ventures in China.
3.2. Environmental Implications. The linkage between vehicle model type and probability of emissions test failure has strong implications for policy. China has neither an in-use surveillance program nor a vehicle recall program. A surveillance program would need to be put in place to determine whether particular vehicle models have design and manufacturing deficiencies that make them fail at significantly higher rates than other vehicle model types. A recall program would provide a way to eliminate design and manufacturing deficiencies that make vehicles fail at significantly higher than average rates. However, before any testing and recall program is implemented, a comprehensive understanding of the I/M program in Beijing should be developed through more thorough data collection and analysis. Another strategy, one followed in Sweden, would be to provide incentives for manufacturers to improve vehicle emissions system performance by disclosing to the public results showing average failure rates for particular makes and models (13).
Questions can be raised about the feasibility and cost of in-use surveillance and recall programs for China. At this point, after several years of solid experience with vehicle emission standards and I/M programs, Chinese environmental authorities have the technical capabilities and experience to design and implement in-use surveillance and recall programs. Regarding resources needed for program implementation, an important issue is whether the monetary benefits are likely to exceed the costs. 
For a perspective on this question, consider that the monetary costs of air pollution in China have been estimated to be 1-7% of GDP (14, 15). And although it is difficult to isolate the influence on ambient air quality of vehicles with
systematic defects in vehicle emission control systems, it is often the case that a small number of high emitters account for the majority of total emissions from vehicles. Moreover, given that the number of vehicles in Chinese cities is increasing at a very rapid rate, air pollution from vehicles is likely to increase in significance. Apart from these cost-benefit considerations, the need for a program to identify vehicle model types with defective vehicle emission control systems can be framed in terms of fairness. Under the current system, vehicle owners are responsible for any costs incurred if a vehicle fails the annual emissions test. Without a program to identify systematically defective vehicle models, consumers would continue to be responsible for repair costs that could have been avoided by better vehicle design and manufacturing practices.
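The cross tabulation between CLASS and DOMESTIC discussed in Section 3.1 can be illustrated with a minimal sketch. This is not the authors' code and does not use the Beijing data: the records below are invented toy values, and the function names `cross_tab` and `cell_share` are hypothetical. The sketch only shows how a cell share, analogous to the 1.2% figure for foreign trucks, vans, and minivans, would be computed from per-vehicle records.

```python
from collections import Counter

# Hypothetical toy records of (vehicle class, manufacturer origin).
# These values are invented for illustration; they are NOT the Beijing I/M data.
records = [
    ("truck/van/minivan", "domestic"),
    ("truck/van/minivan", "domestic"),
    ("truck/van/minivan", "domestic"),
    ("truck/van/minivan", "foreign"),
    ("luxury", "foreign"),
    ("luxury", "foreign"),
    ("economy", "domestic"),
    ("economy", "foreign"),
]


def cross_tab(rows):
    """Count the records in each (CLASS, DOMESTIC) cell."""
    return Counter(rows)


def cell_share(table, vehicle_class, origin):
    """Return the share of all records that fall in one cell."""
    total = sum(table.values())
    return table[(vehicle_class, origin)] / total


table = cross_tab(records)
share = cell_share(table, "truck/van/minivan", "foreign")
print(f"foreign truck/van/minivan share: {share:.1%}")
```

With real inspection records, the same two-way count would show whether a vehicle class is dominated by domestic or foreign manufacturers, which is the pattern the discussion of Table 1 relies on.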
Acknowledgments The authors thank Paul Switzer, Lynn Hildemann, Jean Oi, and James Sweeney of Stanford University and Lixin Fu, Jiming Hao, and Kebin He of Tsinghua University for their help and support during the process of data gathering, model building, and data analysis. We also thank our ES&T Editor, Armistead Russell, and two anonymous reviewers for their many suggestions for improving the paper.
Supporting Information Available Additional information on statistical model variables, vehicle emissions testing procedures, data quality control methods, main effects model, and final model. This material is available free of charge via the Internet at http://pubs.acs.org.
Literature Cited
(1) J. Transp. Stat. 2000, 3 (Special issue on the statistical analysis and modeling of automotive emissions).
(2) Wenzel, T.; Singer, B. C.; Slott, R. Some issues in the statistical analysis of vehicle emissions. J. Transp. Stat. 2000, 3 (Special issue on the statistical analysis and modeling of automotive emissions).
(3) Beydoun, M.; Guldmann, J.-M. Vehicle characteristics and emissions: Logit and regression analyses of I/M data from Massachusetts, Maryland, and Illinois. Transp. Res., Part D 2006, 11, 59–76.
(4) Bin, O. A logit analysis of vehicle emissions using Inspection and Maintenance testing data. Transp. Res., Part D 2003, 8, 215–227.
(5) Washburn, S.; Seet, J.; Mannering, F. Statistical modeling of vehicle emissions from I/M testing data: an exploratory analysis. Transp. Res., Part D 2001, 11, 21–36.
(6) National Research Council (Committee on Vehicle Emission Inspection and Maintenance Programs-Board on Environmental Studies and Toxicology-Transportation Research Board). Evaluating Vehicle Emissions Inspection and Maintenance Programs; The National Academies Press: Washington, D.C., 2001.
(7) Chang, C. Automobile Pollution Control in China: Enforcement of and Compliance with Vehicle Emission Standards; Ph.D. dissertation, Stanford University, 2006.
(8) Hubbard, T. An empirical examination of moral hazard in the vehicle inspection market. Rand J. Economics 1998, 29, 406–426.
(9) Agresti, A. Categorical Data Analysis; 2nd ed.; Wiley-Interscience Press, 2002.
(10) Kleinbaum, D. G.; et al. Applied Regression Analysis and Other Multivariable Methods; 3rd ed.; Duxbury Press: Pacific Grove, 1998.
(11) Chatterjee, S.; Hadi, A. S.; Price, B. Regression Analysis by Example; John Wiley & Sons, Inc.: New York, 2000.
(12) Stewart, S.; Wong, J. In Proceedings of the 15th CRC On-road Vehicle Emissions Workshop; 2005.
(13) The National Swedish Board for Consumer Policies. Cars-Strong & Weak Points; 4th ed.; The Swedish Motor Vehicle Inspection Company, 1996.
(14) U.S. Embassy, Beijing. The Cost of Environmental Degradation in China 2000. Available at http://www.usembassy-china.org.cn/sandt/Costofpollution-web.html.
(15) World Bank. “Clear Water, Blue Skies”; 1997.