Safety Assessment of Food and Feed from GM Crops in Europe


Article

Safety Assessment of Food and Feed from GM Crops in Europe: Evaluating EFSA's Alternative Framework for the Rat 90-day Feeding Study Bonnie Hong, Yingzhou Du, Pushkor Mukerji, Jason M. Roper, and Laura Marie Appenzeller J. Agric. Food Chem., Just Accepted Manuscript • Publication Date (Web): 02 Jun 2017 Downloaded from http://pubs.acs.org on June 7, 2017


Journal of Agricultural and Food Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Safety Assessment of Food and Feed from GM Crops in Europe: Evaluating EFSA's Alternative Framework for the Rat 90-day Feeding Study

Bonnie Hong a, Yingzhou Du a,b, Pushkor Mukerji c, Jason M. Roper c, Laura M. Appenzeller a,*

a Pioneer Hi-Bred International, Inc., Johnston, IA, USA

b Iowa State University, Snedecor Hall, Ames, IA, USA

c DuPont Haskell Global Centers for Health and Environmental Sciences, Newark, DE, USA

*Corresponding Author: Laura M. Appenzeller, Email: [email protected]; Phone: +1 515 535 5748; Fax: +1 515 535 7279

Title Running Header: Evaluating EFSA’s Rat 90-day Feeding Study Design


ABSTRACT

Regulatory-compliant rodent subchronic feeding studies are compulsory regardless of a hypothesis to test, according to recent EU legislation for the safety assessment of whole food/feed produced from genetically modified (GM) crops containing a single genetic transformation event (European Union Commission Implementing Regulation No. 503/2013). The Implementing Regulation refers to guidelines set forth by the European Food Safety Authority (EFSA) for the design, conduct, and analysis of rodent subchronic feeding studies. The set of EFSA recommendations was rigorously applied to a 90-day feeding study in Sprague-Dawley rats. After study completion, the appropriateness and applicability of these recommendations were assessed using a battery of statistical analysis approaches including both retrospective and prospective statistical power analyses as well as variance-covariance decomposition. In the interest of animal welfare considerations, alternative experimental designs were investigated and evaluated in the context of informing the health risk assessment of food/feed from GM crops.

KEYWORDS: EFSA, experimental design, genetically modified, rat, statistical power


1. INTRODUCTION

International regulatory recommendations for the pre-commercialization safety assessment of foods and feeds from genetically modified (GM) crops with a single event require a scientifically rigorous, systematic assessment to identify potential hazards arising not only from expression and activity of non-endogenous proteins introduced to confer a phenotypic trait, but also from potential transformation-induced unintended pleiotropic alterations.1-18 Intended and unintended modifications are likely to be detected through the comprehensive comparison of agronomic, phenotypic, molecular, and compositional characteristics of the GM crop and derived food/feed with those of near-isogenic and other conventional non-GM varieties conducted as part of the systematic assessment. Historically, a whole food/feed dietary subchronic (also referred to as 13-week or 90-day) toxicological evaluation in rats has been recommended when potential hazards have been identified or uncertainties remain following the comprehensive comparative assessment.8,12,19-24 Its purpose is to characterize intended changes to, and detect potential pleiotropic alterations in, the food/feed matrix with respect to their possible detrimental impact on human/animal health.3,7,8,15,22,25-27 In the absence of a formal guideline describing whole-food dietary toxicity studies, the internationally-harmonized Organisation for Economic Co-operation and Development (OECD) 408 chemical testing guideline28 was adapted from a dose-response study into a comparative limit-test design.

The inherent flexibility of the comparative limit test29-55 enables variable dietary incorporation of feed macroingredients such as maize, soybean meal, rice, and canola meal to accommodate the dietary tolerance of the test system. A single dietary inclusion level is common, although studies have been published in which two or three dietary inclusion levels have been used. The majority of these subchronic studies were conducted in the Sprague-Dawley rat; three studies were found in which the Wistar rat was used, and two studies used mice (BALB/c or C57BL/6J). The number of animals per sex per group varied between six and twenty; most studies used ten to twelve. Animals were housed singly (16 studies), in groups of five (6 studies), or in pairs (4 studies). Statistical analysis was performed separately for each sex, and only three studies employed an adjustment for multiple testing. This subchronic testing paradigm has been applied to the safety assessment of food and feed from GM crops for over a decade with no evidence of adverse effects reported to date.29-55

Recent changes to European Union (EU) legislation have made compulsory the rodent subchronic toxicological evaluation of food/feed produced from new GM crops containing a single event, independent of potential hazards identified through prior investigations (preamble [11] of the European Union Commission Implementing Regulation No. 503/2013, of 3 April 2013).56 The Implementing Regulation refers to specific recommendations described in the European Food Safety Authority (EFSA) Scientific Committee's guidance document for the design, conduct, statistical analysis, and data presentation for the rodent subchronic feeding study (part 1.4.4.1 in Section II of Annex II of the Implementing Regulation).23 The alternative study framework retains basic OECD 408 metrics, such that a comprehensive toxicological evaluation of animal health can be made in accordance with accepted scientific principles.28,57,58 However, EFSA's recommendations include blinding of treatment, randomization and blocking, paired-housing, justification of sample size calculation based on a pre-specified effect size of toxicological relevance, inclusion of multiple dose levels, combined-gender statistical analysis, reporting of data on a standardized effect size scale, and multiplicity adjustment to control Type I error rate.23 These modifications complicate the existing design and statistical analysis, but were intended to minimize potential sources of experimental bias, to maximize the power of the study to detect any possible toxicological effects in test groups, and to enable assessment of a dose-response relationship for observed biological effects.

This alternative framework was applied to a 13-week feeding study in Sprague-Dawley rats examining potential health effects related to consumption of grain from an insect-protected and herbicide-tolerant GM maize in development, a novel molecular stack containing a single event comprising four genes of microbial origin. Event characterization and systematic comparative assessments were conducted according to international recommendations1-18 prior to initiation of the feeding study. The study referenced herein, described in greater detail in the Supplemental Information, incorporated elements of study design, conduct, analysis, and data presentation in accordance with the recommendations of the EFSA guidance.23

Data collected in this study were used to assess the inherent variability of the EFSA design using retrospective statistical power analyses as well as variance-covariance decomposition at the cage and individual-animal levels. The recommended statistical analysis approach, including elements not typically applied in rodent toxicology studies such as combined-gender analysis and presentation of data on the standardized effect size scale, is also evaluated. Lastly, statistical power analysis was used to explore the comparative robustness of alternative study designs in the interest of animal welfare considerations (number and size of experimental groups, single vs. social housing).

2. MATERIALS AND METHODS

2.1. Test, control, and reference maize grain and characterization: Supplemental Information, Appendix A

2.2. Experimental diet formulation and administration

Six experimental diets were formulated by Purina Mills, LLC (St. Louis, MO) to be isonitrogenous and isocaloric with a nutritional profile comparable with that of a common laboratory chow, PMI Nutrition International, LLC Certified Rodent LabDiet 5002,59 and prepared in meal form by Purina TestDiet (Richmond, IN). Diet characterization is described in Supplemental Information, Appendix A. Test (identified as high dose), control, and three sources of reference maize grain (commercial Pioneer® hybrids, identified as ref1, ref2, and ref3) were each incorporated at 40 % of the diet by weight and fully replaced the bulk maize grain normally sourced. An additional test diet, identified as low dose, incorporated 20 % test grain and 20 % control grain. Total maize incorporation was set to 40 %, slightly above the inclusion rate of maize grain in Rodent LabDiet 5002, to accommodate a low dose of test maize that was higher than typical human daily consumption as required by EFSA.23 This inclusion allowed for formulation of nutritionally balanced diets using the same ingredients normally found in 5002, thus maintaining the optimal texture, palatability, and digestibility of the standard diet. Numeric diet codes were assigned randomly prior to diet manufacture using a web-based random number generator. Experimental diets were stored refrigerated, except during administration. Fresh diet was supplied weekly, and rats were fed their respective experimental diets ad libitum for 92-95 consecutive days. The testing facility technical staff was blind to the identity of the experimental diet assigned to each group until the completion of the in-life phase of the study.

2.3. Animal care and management

This study was performed at DuPont Haskell Global Centers for Health and Environmental Sciences (Newark, DE), a facility accredited by the Association for Assessment and Accreditation of Laboratory Animal Care International (AAALAC). The protocol and study design were reviewed and approved by the DuPont Haskell Institutional Animal Care and Use Committee (IACUC). All procedures complied with the Guide for the Care and Use of Laboratory Animals,60 the US EPA FIFRA (40 CFR part 160) Good Laboratory Practice (GLP) Standards, and the following guidelines for rodent subchronic toxicology studies: OECD, Section 4 (Part 408), Repeated-dose 90-Day Oral Toxicity Study in Rodents, Guideline for the Testing of Chemicals,28 and the EFSA Scientific Committee's Guidance on conducting repeated-dose 90-day oral toxicity study in rodents on whole food/feed.23

Male and female Crl:CD®(SD) (Sprague-Dawley) rats were obtained from Charles River Laboratories, Inc. (Raleigh, NC). The Crl:CD®(SD) rat was selected based on consistently acceptable health status, well-established suitability for use in subchronic toxicity testing, and extensive experience with this strain at DuPont Haskell (robust historical control database). Animals were received as a single shipment consisting of two cohorts differing in age by one week (approximately 4 and 5 weeks old), and housed in a single room. The animal room was maintained at 23 °C ± 3 °C and relative humidity 50 % ± 20 %, with an approximate 12-hour light/dark cycle. Rats were pair-housed within cohort by sex in solid-bottom caging with bedding and environmental enrichment and given tap water (United Water Delaware; Wilmington, DE) ad libitum. During acclimation, rats were fed PMI® Nutrition International, LLC Certified Rodent LabDiet® 5002 ad libitum (Purina Mills, Richmond, IN). During the test period, animals were fed their respective experimental diets ad libitum, except when fasted prior to sacrifice when food, but not water, was withheld for a minimum of 15 hours.

2.4. Sample size determination

Sample size determination defines the number of animals per sex per treatment group needed to detect a pre-specified biologically relevant effect size with a specified statistical power (generally ≥ 80 %) and significance level (commonly set at 0.05) and usually requires a statistical power analysis to be conducted prior to study initiation based on a set of key outcome parameters with known effect sizes and variation as observed in a typical experiment. Because outcome parameters of critical relevance to rat subchronic feeding studies with whole food/feed have not been defined by any regulatory body, a variety of biological alterations purported to represent early indicators of adverse nutritional or toxicological effects were identified in the literature.61-67 These biologic alterations were considered primarily within the context of dose selection for endocrine, chronic, or carcinogenicity studies, and for the purposes of demonstrating either selection or attainment of an appropriate maximum tolerated dose (MTD) in long-term rodent bioassays. The suggested effect sizes for these biological alterations were used as targeted effect sizes for the purpose of statistical power analysis for the rat subchronic feeding study, even though they should not be considered synonymous with biologically or toxicologically relevant effects in subchronic studies. Nine outcome parameters (also referred to as endpoints) were selected based on their likelihood to be impacted by altered test substance palatability, digestibility, or nutrient bioavailability, and their respective targeted effect sizes are provided in Table 1.

Biological variation in a typical experiment is based ideally on historical control data obtained from the same experimental design at the same testing facility using the same age and strain of test animal, although some deviations (e.g., use of data from similar studies conducted at multiple testing facilities) may be acceptable when no substantial change to experimental variance across sites has been demonstrated. Prior to this study, few pair-housed rat subchronic feeding studies had been conducted that met these criteria. Thus, data from five available pair-housed studies (20 groups each of male and female rats, 6 pairs/sex/group) conducted at multiple testing facilities were used to estimate coefficients of variation (CV’s) for males and females. In the absence of sufficient data for paired-housing and a randomized complete block design, a simplified power analysis approach was utilized based on CV’s between experimental units (cages) and assumed a completely randomized design. The power analysis was conducted for males and females separately, using 3 treatment groups (high dose, low dose, control) and 8 cages per group per sex. For each endpoint and sex, expected power was calculated using a level of significance of 0.05 and a two-sided hypothesis test. CV values and results of this prospective power analysis are provided in Table 1.

The power analysis indicated an expected power well over 90 % for both males and females for the majority of endpoints, with the exception of absolute lymphocyte count (ALYM) in females (expected power of 75 %). Therefore, a sample size of 8 cages per group per sex was determined to be appropriate for the study.
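As a rough illustration of the simplified prospective calculation described above (cage as experimental unit, completely randomized design, two-sided test at 0.05), the following Python sketch approximates power from a between-cage CV and a targeted effect size expressed as % change from control. It is not the authors' SAS procedure: it relies on a normal approximation, which is somewhat optimistic for small group sizes, and the function name `approx_power` and the numeric inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_pct, cv_pct, n_per_group, alpha=0.05):
    """Approximate two-sided power for a test-vs-control comparison.

    effect_pct  : targeted effect size, as % change from the control mean
    cv_pct      : between-cage coefficient of variation, in %
    n_per_group : number of experimental units (cages) per group
    Assumption: normal approximation to the two-sample test.
    """
    z = NormalDist()
    # Standard error of the difference between two group means, in % of the mean.
    se_pct = cv_pct * sqrt(2.0 / n_per_group)
    shift = abs(effect_pct) / se_pct
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    # Probability of rejecting in either tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical inputs: a 10 % targeted change and a 6 % between-cage CV.
for n_cages in (4, 6, 8, 12):
    print(n_cages, round(approx_power(10.0, 6.0, n_cages), 3))
```

Under these assumptions, power rises with the number of cages per group and falls as the CV grows relative to the targeted effect, which is why a more variable endpoint can fall short of the power reached by the others at the same sample size.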

Table 1. Selected key outcome parameters, targeted effect sizes, CV values, and expected and attained power

2.5. Randomization and blocking

Due to the size of the study (16 rats/sex/group x 6 groups = 192 rats), half of the animals (one age-matched cohort) were placed on study on each of two start dates, one week apart; animals were approximately 7 weeks old at their respective study initiation dates.

On the first day of experimental diet administration, male and female rats of the same cohort were randomized to sex-matched pairs based on stratification of body weight; pairs were then randomized to diet group and cage, blocked by body weight, with the cage representing the experimental unit. Cage position was randomized on the cage racks, such that cages from the six diet groups were intermingled. Cage racks contained animals of one sex. This is illustrated schematically in Figure 1. Two additional diet groups (different test substance) were also included to take advantage of shared control and reference group animal data, but were not used in any analyses reported herein.

Figure 1. Randomization and blocking scheme
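The pairing and blocking steps can be sketched in Python as follows, for one sex within one cohort. This is an illustrative reading of the description, not the facility's actual procedure: it assumes that animals adjacent in the body weight ranking form a pair and that consecutive pairs form a block, and the function name, animal IDs, and body weights are hypothetical.

```python
import random


def randomize_pairs_to_diets(body_weights, diets, seed=None):
    """Pair rats of one sex by body weight, then assign pairs to diets within blocks.

    body_weights : dict mapping animal ID -> body weight (g)
    diets        : list of diet codes; every block receives each diet once
    Returns a list of (block, diet, (rat_a, rat_b)) tuples, one per cage.
    """
    rng = random.Random(seed)
    ranked = sorted(body_weights, key=body_weights.get)
    if len(ranked) % (2 * len(diets)) != 0:
        raise ValueError("animals must fill whole blocks of pair-housed cages")
    # Assumption: animals adjacent in the weight ranking share a cage.
    pairs = [(ranked[i], ranked[i + 1]) for i in range(0, len(ranked), 2)]
    cages = []
    for start in range(0, len(pairs), len(diets)):
        block_pairs = pairs[start:start + len(diets)]  # pairs of similar weight
        shuffled = list(diets)
        rng.shuffle(shuffled)  # random diet assignment within the block
        block = start // len(diets) + 1
        cages.extend((block, diet, pair) for diet, pair in zip(shuffled, block_pairs))
    return cages


# Hypothetical example: 48 animals of one sex in one cohort, six diet groups.
weights = {f"R{i:02d}": 200 + i for i in range(48)}
diet_codes = ["high", "low", "control", "ref1", "ref2", "ref3"]
layout = randomize_pairs_to_diets(weights, diet_codes, seed=0)
print(layout[:3])
```

Because each block contains one cage per diet group, differences in starting body weight between blocks are separated from the diet comparison in the analysis.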

2.6. Study conduct: Clinical observations, Ophthalmology and neurobehavioral evaluation, Clinical pathology, Anatomic pathology: Supplemental Information, Appendix A

2.7. Comparative analysis of experimental data

Experimental data from each test group (high dose and low dose) and the control group were included in the comparative analysis; data from the reference groups were summarized (mean, standard deviation, range) to assess the biological variability of endpoints evaluated and the biological relevance of statistical differences between test and control groups. The following endpoints were evaluated: body weight and gain, food consumption and efficiency, grip strength, motor activity, quantitative clinical pathology endpoints, and absolute and relative organ weights. All statistical analyses and associated statistical tests were conducted using SAS® software, Version 9.3 (SAS Institute Inc., Cary, NC) at the significance level of 0.05.

Although endpoints with continuous, categorical, and discrete values were assessed, only the statistical analysis approaches for endpoints with continuous values (endpoints with biological responses on a continuum; most endpoints evaluated in this study, including those in Table 1) are discussed. Comparative analysis was conducted using a linear mixed model approach which incorporated the design and treatment structure. Three different linear mixed models were developed, including both sexes when possible to satisfy the required combined-gender analysis. Model (1) was used for endpoints measured on the individual rat basis, such as body weight and neurobehavioral and clinical pathology endpoints; Model (2) was used for endpoints measured on the cage basis, such as food consumption and food efficiency; and Model (3) was used for endpoints involving sex-specific organs for which a combined-gender model is not applicable.

2.7.1. Linear mixed model analysis for endpoints measured on the individual rat basis (Model 1)

Let $y_{ijkl}$ be the response from block $i$, sex $j$, diet group $k$ and rat $l$, where $i = 1, \ldots, 8$; $j = \text{Female}, \text{Male}$; $k = \text{high dose}, \text{low dose}, \text{control}$; $l = 1, 2$. Statistical analysis was conducted using the following linear mixed model.

$$y_{ijkl} = \mu + \alpha_j + \beta_k + (\alpha\beta)_{jk} + \gamma_{ij} + \delta_{ijk} + \varepsilon_{ijkl} \qquad \text{Model (1)}$$

Fixed effects in Model (1) include: $\mu$, the overall mean; $\alpha_j$, the main effect of sex $j$; $\beta_k$, the main effect of diet group $k$; $(\alpha\beta)_{jk}$, the interaction between sex $j$ and diet group $k$. Random effects include: $\gamma_{ij}$, the effect of block $i$ for sex $j$; $\delta_{ijk}$, the effect of the cage assigned to diet group $k$ of sex $j$ in block $i$; $\varepsilon_{ijkl}$, the error term associated with rat $l$ in the cage assigned to diet group $k$ of sex $j$ in block $i$. Model (1) assumes that random effects $\gamma_{ij}$, $\delta_{ijk}$ and $\varepsilon_{ijkl}$ are independent of each other. Random effect of block $\gamma_{ij}$ is assumed to follow a normal distribution with heterogeneous variance by sex:

$$\gamma_{ij} \sim \text{iid } N(0, \sigma^2_{\text{Block},j}).$$

Note: The variance of block is a function of sex $j$. Notation $\sim \text{iid } N(0, \sigma_a^2)$ here indicates a random variable that is identically independently distributed (iid) as normal with zero mean and variance $\sigma_a^2$; subscript $a$ represents the corresponding source of variation.

Random effect caused by pair-housing rats in the experimental unit (cage) is modeled as two correlated normal random variables, $(\delta_{ijk} + \varepsilon_{ijk1})$ and $(\delta_{ijk} + \varepsilon_{ijk2})$, with variance-covariance structure

$$\begin{pmatrix} \delta_{ijk} + \varepsilon_{ijk1} \\ \delta_{ijk} + \varepsilon_{ijk2} \end{pmatrix} \sim \text{iid } N\left( \mathbf{0}, \begin{pmatrix} \sigma^2_{\text{Rat},j} & \sigma_{\text{Cage},j} \\ \sigma_{\text{Cage},j} & \sigma^2_{\text{Rat},j} \end{pmatrix} \right).$$

Note: The variance-covariance structure is a function of sex $j$ (i.e., heterogeneous by sex). Parameter $\sigma_{\text{Cage},j}$ represents the covariance between two rats from the same cage. This parameter setting is commonly known as the compound symmetry structure. For its advantages in statistical modeling, see Littell et al., 2006.68

2.7.2. Linear mixed model analysis for endpoints measured on the cage basis (Model 2)

Let $y_{ijk}$ be the response from a cage associated with block $i$, sex $j$, diet group $k$, where $i = 1, \ldots, 8$; $j = \text{Female}, \text{Male}$; $k = \text{high dose}, \text{low dose}, \text{control}$. Statistical analysis was conducted using the following linear mixed model.

$$y_{ijk} = \mu + \alpha_j + \beta_k + (\alpha\beta)_{jk} + \gamma_{ij} + \delta_{ijk} \qquad \text{Model (2)}$$

Fixed effects in Model (2) include $\mu$, $\alpha_j$, $\beta_k$, and $(\alpha\beta)_{jk}$, with the same definition as those in Model (1). Random effects include $\gamma_{ij}$ and $\delta_{ijk}$, with the same definition as those in Model (1). $\gamma_{ij}$ and $\delta_{ijk}$ are assumed to be independent of each other:

$$\gamma_{ij} \sim \text{iid } N(0, \sigma^2_{\text{Block},j}) \quad \text{and} \quad \delta_{ijk} \sim \text{iid } N(0, \sigma^2_{\text{Cage},j}).$$

2.7.3. Linear mixed model analysis for endpoints involving sex-specific organs (Model 3)

Let $y_{ikl}$ be the response from block $i$, diet group $k$ and rat $l$, where $i = 1, \ldots, 8$; $k = \text{high dose}, \text{low dose}, \text{control}$; $l = 1, 2$. Statistical analysis is conducted using the following linear mixed model.

$$y_{ikl} = \mu + \beta_k + \gamma_i + \delta_{ik} + \varepsilon_{ikl} \qquad \text{Model (3)}$$

Fixed effects in Model (3) include: $\mu$, the overall mean; $\beta_k$, the effect of diet group $k$. Random effects include: $\gamma_i$, the effect of block $i$; $\delta_{ik}$, the effect of the cage assigned to diet group $k$ in block $i$; $\varepsilon_{ikl}$, the error term associated with rat $l$ in the cage assigned to diet group $k$ in block $i$. Random effects $\gamma_i$, $\delta_{ik}$ and $\varepsilon_{ikl}$ are assumed to be independent of each other and

$$\gamma_i \sim \text{iid } N(0, \sigma^2_{\text{Block}}), \qquad \begin{pmatrix} \delta_{ik} + \varepsilon_{ik1} \\ \delta_{ik} + \varepsilon_{ik2} \end{pmatrix} \sim \text{iid } N\left( \mathbf{0}, \begin{pmatrix} \sigma^2_{\text{Rat}} & \sigma_{\text{Cage}} \\ \sigma_{\text{Cage}} & \sigma^2_{\text{Rat}} \end{pmatrix} \right).$$

Again, parameter $\sigma_{\text{Cage}}$ represents the covariance between two rats housed in the same cage.
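To make the compound symmetry structure for pair-housed rats concrete, the Python sketch below builds the 2 x 2 variance-covariance matrix of two cage mates from a cage-level variance component and a rat-level error variance. It assumes a non-negative within-cage covariance, so that the covariance equals the variance of the shared cage effect; the function names and the numeric values are hypothetical and are not estimates from this study.

```python
def compound_symmetry(var_cage, var_resid):
    """Covariance matrix of two pair-housed rats sharing one cage effect.

    var_cage  : variance of the shared cage effect (the within-cage covariance)
    var_resid : variance of the rat-level error term
    Returns (matrix, intra_cage_correlation).
    """
    if var_cage < 0 or var_resid <= 0:
        raise ValueError("sketch assumes var_cage >= 0 and var_resid > 0")
    var_rat = var_cage + var_resid  # total variance of one rat's response
    matrix = [[var_rat, var_cage],
              [var_cage, var_rat]]
    return matrix, var_cage / var_rat


def cage_mean_variance(matrix):
    """Variance of the average response of the two cage mates."""
    return (matrix[0][0] + matrix[1][1] + 2.0 * matrix[0][1]) / 4.0


# Hypothetical variance components, for illustration only.
cov, icc = compound_symmetry(var_cage=4.0, var_resid=12.0)
print(cov, icc, cage_mean_variance(cov))
```

The larger the covariance between cage mates, the less a second rat in the same cage reduces the variance of the cage mean, which is why the cage rather than the individual animal is treated as the experimental unit.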

2.7.4. Statistical comparisons between each of the test groups and the control group

Under Models (1) and (2), across-gender comparisons between test and control diet groups correspond to testing linear algebraic contrasts that involve the diet group effects $\beta_k$; gender-specific comparisons between test and control diet groups correspond to testing linear algebraic contrasts that involve both the $\beta_k$'s and the sex by diet group interaction terms $(\alpha\beta)_{jk}$ at a specific sex $j$. In addition, testing for sex by diet group interaction corresponds to the F test on whether the effect of the term $(\alpha\beta)_{jk}$ is statistically significant. The approximated degrees of freedom for these three types of tests were derived using the Kenward-Roger method.69 When evaluating statistical test results, sex by diet group interaction was first examined to determine whether gender-specific results or across-gender results were appropriate for evaluation. If the sex by diet group interaction was not significant, then across-gender results were evaluated; if the interaction was significant, gender-specific results were evaluated. Under Model (3), comparisons between test and control diet groups correspond to testing linear algebraic contrasts that involve the $\beta_k$'s. Only gender-specific testing results were available for evaluation.
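The relationship between the gender-specific and across-gender comparisons, and the rule for choosing which to evaluate, can be illustrated with a short Python sketch. It assumes that the across-gender estimate weights the two sexes equally, and it works on plain cell means rather than fitted mixed-model estimates; the function names and example values are hypothetical.

```python
def diet_contrasts(cell_means, test, control="control"):
    """Test-minus-control differences from a dict {(sex, diet): mean}.

    Returns the gender-specific differences and their equally weighted
    average, used here as the across-gender difference.
    """
    sexes = sorted({sex for sex, _ in cell_means})
    by_sex = {s: cell_means[(s, test)] - cell_means[(s, control)] for s in sexes}
    across = sum(by_sex.values()) / len(by_sex)
    return by_sex, across


def results_to_evaluate(p_interaction, alpha=0.05):
    """Apply the decision rule: a significant interaction calls for gender-specific results."""
    return "gender-specific" if p_interaction < alpha else "across-gender"


# Hypothetical cell means (e.g., terminal body weight in grams).
means = {
    ("Female", "high dose"): 300.0, ("Female", "control"): 310.0,
    ("Male", "high dose"): 520.0, ("Male", "control"): 540.0,
}
by_sex, across = diet_contrasts(means, test="high dose")
print(by_sex, across, results_to_evaluate(p_interaction=0.20))
```

When the two gender-specific differences diverge, their average can mask an effect present in only one sex, which is the reason the interaction test is examined first.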

The following metrics are reported: for every endpoint, gender-specific estimates of means and across-gender estimate of mean, if applicable, of each diet group (labeled as LS-Means); LS-Means differences between the control and high dose, between the control and low dose, and 95 % confidence intervals (CI) of the differences (labeled as Difference, 95 % CI); for every endpoint measured on both sexes, P-values for testing sex-by-diet group interaction.

2.7.5. Multiplicity adjustment for a large number of statistical tests across endpoints

The false discovery rate (FDR) method of Benjamini and Hochberg70,71 was applied as a post hoc procedure to account for multiple testing (multiplicity) due to independent analysis of a large number of endpoints, and P-values were adjusted accordingly. For each set of pairwise comparisons (all pairwise comparisons conducted between the high or low dose group and the control group), an FDR adjustment was conducted across all endpoints for P-values from the across-sex comparisons. FDR adjustments for sex-specific comparisons were conducted separately for males and females.

2.8. Retrospective power analysis for evaluation of EFSA design

After the completion of the current study conducted under EFSA's alternative framework, a retrospective power analysis was performed to evaluate the variability associated with the recommended experimental design. The purpose of this analysis was several-fold: 1) to compare the attained statistical power for the selected key outcome parameters with the expected power calculated prior to the study; 2) to calculate the detectable effect size for all continuous endpoints; and 3) to evaluate the statistical power of combined-gender analysis compared with traditional separate-gender analyses. The actual experimental design, statistical models employed, and cross-gender comparison were considered, rather than the simplified approach used in the power analysis prior to the study as described in Section 2.4.

One hundred forty-four continuous endpoints, accounting for 94 % of those measured in this study, were included in the retrospective power analysis. The calculation of attained statistical power for selected endpoints identified initially (Table 1) includes consideration of three additional outcome parameters with published targeted effect sizes: 10 % decrease in cumulative body weight gain and 25 % increases in kidney and liver weights relative to terminal body weight.62-66 Attained statistical power was calculated based on the given targeted effect size and the actual variability observed in this study.

Since pre-study power analysis for sample size determination requires pre-specified targeted effect sizes, which were not available for most endpoints, a retrospective analysis considering the observed variability of all continuous endpoints was developed to generate power values for a set of possible effect sizes expressed as % change from the control group (detailed methodology not presented). The smallest effect size value that generated a power value over 80 % was defined as the detectable effect for a given endpoint. The statistical tests were then grouped by the detectable effect size into categories of < 5 %, 5 % ‒ 10 %, 10 % ‒ 20 %, 20 % ‒ 50 %, and > 50 % (Table 4).
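Since the detailed methodology is not presented, the following Python sketch shows only one plausible way to carry out such a search: scan a grid of effect sizes, keep the smallest one whose approximate power exceeds 80 %, and bin it into the categories listed above. The normal-approximation power formula, the grid step, the handling of category boundaries, and all function names are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_pct, cv_pct, n_per_group, alpha=0.05):
    """Two-sided power for a % change from control, by normal approximation."""
    z = NormalDist()
    shift = abs(effect_pct) / (cv_pct * sqrt(2.0 / n_per_group))
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def detectable_effect(cv_pct, n_per_group, target_power=0.80, step=0.1, max_pct=200.0):
    """Smallest grid value (% change) whose power exceeds the target, else None."""
    for k in range(1, int(round(max_pct / step)) + 1):
        effect = k * step
        if approx_power(effect, cv_pct, n_per_group) > target_power:
            return effect
    return None


def effect_category(effect_pct):
    """Bin a detectable effect size; the boundary handling is an assumption."""
    if effect_pct is None or effect_pct > 50:
        return "> 50 %"
    if effect_pct < 5:
        return "< 5 %"
    if effect_pct <= 10:
        return "5 % - 10 %"
    if effect_pct <= 20:
        return "10 % - 20 %"
    return "20 % - 50 %"


# Hypothetical endpoint: 6 % between-cage CV, 8 cages per group.
d = detectable_effect(cv_pct=6.0, n_per_group=8)
print(d, effect_category(d))
```

Endpoints with larger CVs move toward the upper categories, so the grouping gives a quick view of which endpoints the design can resolve at small effect sizes.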

2.9. Retrospective power analysis for evaluation of combined- vs separate-gender analysis

Endpoints representing variance heteroscedasticity by gender (end-of-study body weight) and variance homoscedasticity (absolute lymphocyte count) were selected. The combined-gender model utilized to analyze all continuous endpoints in the current study assumed heterogeneous variance for males and females (Model 1 and Model 2); thus, the retrospective power analysis was conducted under the same model applied to the study data. For lymphocyte count, the retrospective analysis was also conducted under the alternative combined-gender model, which assumed homogeneous variance for males and females, to assess the impact of using a simpler model for combined-gender analysis. Results are provided in Table 5.

2.10. Prospective power analysis for evaluation of alternative experimental designs

Study designs considered animal housing options, number of experimental groups, and whether blocking factors were included (i.e., randomized complete block or completely randomized design). Historical study data from non-test groups were obtained from the testing facility that performed the current study. Data from non-test animals from three randomized complete block pair-housed studies (completed in 2014 and 2015, including the current study; 14 groups each of male and female rats, 8 pairs/sex/group) were utilized to estimate the variability of the EFSA design. Similarly, data from non-test animals from three completely-randomized single-housed studies (completed between 2008 and 2012; 12 groups each of male and female rats, 12 animals/sex/group) were utilized to estimate the variability associated with alternative designs.

The prospective power analyses were conducted for the selected endpoints and targeted effect sizes identified in Table 1, including cumulative body weight gain and relative liver and kidney weights. In contrast to the retrospective power analysis, since sex by diet group interactions cannot be excluded a priori, power analyses were performed separately for the two sexes.72

Measurements were collected on a per rat basis. Therefore, the linear mixed model utilized for separate-gender analysis was as follows, which is the same as Model (3):


$$y_{ikl} = \mu + \beta_k + b_i + c_{ik} + \varepsilon_{ikl}.$$

For a given endpoint, the targeted effect size (change relative to control), denoted as $\Delta$, is incorporated into the alternative hypothesis ($H_1$) in the power analysis as follows:

$$H_0: \beta_a - \beta_b = 0 \quad \text{vs} \quad H_1: \beta_a - \beta_b = \Delta \neq 0,$$

for the statistical comparison between diet groups a and b, and the sampling distribution of the mean difference is

$$\bar{y}_{a..} - \bar{y}_{b..} \sim N\left(\beta_a - \beta_b,\ \frac{2\left(\sigma^2_{Rat} + L\,\sigma_{Cage}\right)}{BL} \triangleq V_{diff}\right),$$

where $\bar{y}_{k..}$ is the mean response of rats fed the $k$th diet, $\beta_k$ is the treatment effect of the $k$th diet, $\sigma^2_{Rat}$ is the variance of residuals, $\sigma_{Cage}$ is the covariance between rats within the same cage, $B$ is the total number of blocks, and $L$ is the total number of rats per cage ($L = 2$). It is apparent that the comparison between diet groups is carried out by controlling block effects and, therefore, the variance of blocks $\sigma^2_{Block}$ is not an element in the standard error of the mean difference.
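A short derivation may help show why the block variance drops out of the variance of the mean difference. It is a sketch under stated assumptions: the model is taken to contain mutually independent block, cage, and residual terms, written here as $b_i$, $c_{ik}$, and $\varepsilon_{ikl}$ (symbols chosen for illustration), with $\operatorname{Var}(c_{ik}) = \sigma_{Cage}$ and $\operatorname{Var}(\varepsilon_{ikl}) = \sigma^2_{Rat}$, and every diet is assumed to appear once in every block.

```latex
\begin{aligned}
\bar{y}_{k..} &= \mu + \beta_k + \frac{1}{B}\sum_{i=1}^{B} b_i
  + \frac{1}{B}\sum_{i=1}^{B} c_{ik}
  + \frac{1}{BL}\sum_{i=1}^{B}\sum_{l=1}^{L}\varepsilon_{ikl}, \\
\bar{y}_{a..} - \bar{y}_{b..} &= (\beta_a - \beta_b)
  + \frac{1}{B}\sum_{i=1}^{B}\left(c_{ia} - c_{ib}\right)
  + \frac{1}{BL}\sum_{i=1}^{B}\sum_{l=1}^{L}\left(\varepsilon_{ial} - \varepsilon_{ibl}\right), \\
\operatorname{Var}\left(\bar{y}_{a..} - \bar{y}_{b..}\right)
  &= \frac{2\,\sigma_{Cage}}{B} + \frac{2\,\sigma^2_{Rat}}{BL}
   = \frac{2\left(\sigma^2_{Rat} + L\,\sigma_{Cage}\right)}{BL}.
\end{aligned}
```

The block terms appear identically in both diet means and cancel in the difference, which is why only the cage and rat components remain.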

The power function for a two-sided t-test to compare two treatment means is:

$$power = P\left(X < t_{\alpha/2,\,df} - \frac{\Delta}{SE}\right) + P\left(X > t_{1-\alpha/2,\,df} - \frac{\Delta}{SE}\right),$$

where $\Delta$ is the targeted effect size, which is defined as a coefficient (percentage value) times the control mean. $SE$ is the standard error of the mean difference, which is $\sqrt{2\left(\hat{\sigma}^2_{Rat} + L\,\hat{\sigma}_{Cage}\right)/(BL)}$. Reasonable estimates of the control mean and variance values ($\hat{\mu} + \hat{\beta}_{control}$, $\hat{\sigma}_{Cage}$, $\hat{\sigma}^2_{Rat}$) can be obtained by analyzing the data of control and reference diet groups from available studies. $df$ is the degrees of freedom associated with the corresponding difference test, which equals $(D - 1)(B - 1)$, where $D$ represents the total number of diet groups to be included in the linear mixed model analysis. The significance level $\alpha$ is set to 0.05.

Given the above, the statistical power values were calculated by setting different numbers of replications or experimental units (B) and different numbers of diet groups (D). For this power analysis comparison, 5 treatment groups (one test group, one control group, and three reference groups) were assumed, and 12 and 16 rats per sex per group (6 and 8 cages per sex per group in the paired-housing design) were evaluated. Mathematically, B = 6, 8 and D = 5 were set. Results are presented in Table 6.
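The calculation described in this section can be sketched as follows, using only the Python standard library. The sketch applies the power function as written in the text, with X treated as a central t variable, which is an assumption because X is not defined here. The Student t CDF and quantile are obtained by simple numerical integration and bisection rather than a statistics package, and the variance components and control mean in the usage lines are hypothetical placeholders, not values from the study.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of the central Student t distribution."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x: float, df: float, steps: int = 2000) -> float:
    """CDF via Simpson's rule on [0, |x|], using symmetry about zero."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = min(0.5, total * h / 3)
    return 0.5 + area if x > 0 else 0.5 - area


def t_quantile(p: float, df: float) -> float:
    """Quantile by bisection on the CDF."""
    lo, hi = -200.0, 200.0
    for _ in range(80):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def se_mean_difference(var_rat: float, cov_cage: float,
                       n_blocks: int, rats_per_cage: int = 2) -> float:
    """SE = sqrt(2 (var_rat + L cov_cage) / (B L)).
    Requires var_rat + L * cov_cage > 0 (cov_cage may be negative)."""
    return math.sqrt(2 * (var_rat + rats_per_cage * cov_cage)
                     / (n_blocks * rats_per_cage))


def degrees_of_freedom(n_diets: int, n_blocks: int) -> int:
    """df = (D - 1)(B - 1)."""
    return (n_diets - 1) * (n_blocks - 1)


def power_two_sided(delta: float, se: float, df: float,
                    alpha: float = 0.05) -> float:
    """Power function as given in the text (shifted central t)."""
    shift = delta / se
    lower = t_quantile(alpha / 2, df)
    upper = t_quantile(1 - alpha / 2, df)
    return t_cdf(lower - shift, df) + (1.0 - t_cdf(upper - shift, df))


# Hypothetical inputs for illustration only (not study estimates).
control_mean, pct_effect = 100.0, 0.10
var_rat, cov_cage = 40.0, 5.0
for n_blocks in (6, 8):  # B = 6, 8 with D = 5, as in the text
    se = se_mean_difference(var_rat, cov_cage, n_blocks)
    df = degrees_of_freedom(5, n_blocks)
    print(n_blocks, df, round(power_two_sided(pct_effect * control_mean, se, df), 3))
```

An exact calculation would normally use the noncentral t distribution; the shifted central t form is kept here because it is the expression stated in the text.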

3. RESULTS AND DISCUSSION

3.1 Characterization of maize grain and experimental diets: Supplemental Information, Appendix B

3.2 In-life toxicology, clinical observations, neurobehavioral assessment, clinical pathology, organ weights, and gross and anatomic pathology: Supplemental Information, Appendix B

Results for the selected endpoints considered in the initial power calculations are presented in Tables 2 and 3 as representative examples of statistical reporting both in traditional format and as preferred under EFSA’s alternative framework; results for all continuous and selected categorical endpoints are summarized in Supplemental Information, Appendix C (summary statistics) and Appendix D (comparative statistics).

Under the conditions of this study, no treatment-related differences were observed in rats fed diets containing insect-protected, herbicide-tolerant genetically modified maize grain at either incorporation rate compared with rats fed diets containing control or commercial reference maize grain. The distribution of qualitative observations and numerical measurements across treatment groups was attributed to normal biological variation between randomly chosen samples from a population of animals based on the following considerations: 1) no statistically significant differences in any parameter were observed when the FDR adjustment was applied and data from both sexes were combined where applicable (text Table 3 and Supplemental Information, Appendix D); 2) data from animals in test groups were consistent with data from animals in the concurrent reference groups and/or historical control data obtained from the same testing facility (text Table 2 and Supplemental Information, Appendix C); and 3) no consistent evidence of biologically relevant effect sizes, dose-dependent relationships, or corroborative observations across related endpoints or across sexes was observed (Supplemental Information, Appendix B).

Table 2. Summary statistics of selected biological parameters in male and female rats (arithmetic mean ± SD; range of individual values)

Table 3. Comparative statistics of selected biological parameters in male and female rats

3.3 Statistical power analyses

The alternative study framework developed by EFSA23 proposes extensive statistical treatment of the data, one goal of which is to maximize statistical power (sensitivity) to detect biologically meaningful differences between test and control groups. This approach may be possible in cases when specific test-substance-related adverse outcomes in any given parameter or set of related parameters have been identified from previous experiments, and are therefore hypothesized to occur in the subchronic study;73 however, this approach is not applicable to GM crops in which no inherent hazards have been identified. In the latter case, the mandatory rat subchronic feeding study is not hypothesis-driven, but rather is exploratory in that any endpoint(s) could be toxicologically relevant. It is particularly challenging to power an exploratory study prospectively to detect potentially meaningful differences in all measured endpoints, considering that biologically-relevant effect sizes differ among and are not defined for all endpoints. Furthermore, it is difficult to define a specific percent change as important for a given endpoint in isolation, because biologically relevant effects typically involve a continuum of change and multiple endpoints, any of which could vary based on the specific toxicological manifestation.

To manage this problem, EFSA23 proposes to define a pre-specified targeted effect size in standard deviation (SD) units (Standardized Effect Size, or SES) and claims that for all endpoints, a difference of one SD or less is of little toxicological relevance. A recent cross-study analysis of data compiled from several rodent subchronic feeding studies follows this framework.74 This approach, however, relies on the same experimental data to provide both the estimate of an absolute effect size and the value against which it is compared. This is clear if the inequality describing the assessment of biological relevance ($\frac{\text{absolute effect}}{SD_{pooled}} \leq 1$) is rearranged to $\text{absolute effect} \leq SD_{pooled}$ by multiplying both sides of the inequality by the pooled standard deviation. The danger of this approach is that standard deviations are estimated with error and their values are influenced by the particulars of an experiment. For example, if the same set of samples is measured by two different labs, then the lab with inferior consistency and thus greater standard deviation will estimate a lower SES (assuming sufficient sample numbers such that the two labs estimate similar absolute effect sizes). In addition, this approach fails to recognize that one standard deviation of a difference for one endpoint may hold greater or lesser biological relevance, if any, with regard to other endpoints. For this reason, it is most appropriate to specify effect size for the targeted endpoints as absolute (in original units74) or as % difference from the control for the purpose of prospective power analysis.
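The two-lab example can be made concrete with a small Python sketch. The pooled SD formula below is one common definition and is an assumption here, since the text does not spell out how the pooled SD is computed; the measurements are invented solely to show the arithmetic.

```python
import math
import statistics
from typing import Sequence


def pooled_sd(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Pooled standard deviation of two independent samples
    (a common textbook definition, assumed for illustration)."""
    na, nb = len(group_a), len(group_b)
    va = statistics.variance(group_a)
    vb = statistics.variance(group_b)
    return math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))


def standardized_effect_size(test: Sequence[float],
                             control: Sequence[float]) -> float:
    """Absolute effect (difference in means) divided by the pooled SD."""
    diff = statistics.fmean(test) - statistics.fmean(control)
    return diff / pooled_sd(test, control)


# Hypothetical measurements: both labs see the same absolute effect (2 units),
# but lab B is less consistent (larger spread).
lab_a_control, lab_a_test = [9, 10, 11], [11, 12, 13]
lab_b_control, lab_b_test = [6, 10, 14], [8, 12, 16]

ses_a = standardized_effect_size(lab_a_test, lab_a_control)
ses_b = standardized_effect_size(lab_b_test, lab_b_control)
print(ses_a, ses_b)
```

With these invented numbers the same 2-unit difference falls above a 1 SD limit in the more consistent lab and below it in the less consistent one, which is the concern raised in the paragraph above.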

A retrospective analysis of study data to determine the attained statistical power of this new study design is presented, as well as what effect sizes (% difference from control) can be detected statistically. "Detectable" does not signify toxicological relevance. Implications of these results are discussed.

3.3.1 Attained statistical power for selected biological endpoints

For the selected biological endpoints with known targeted effect sizes, the attained statistical power was calculated and compared with the expected power calculated prior to the study (Table 1).

Evaluation of the retrospective power for the selected biological endpoints concluded that experimental variation within the study was well-controlled. With the exception of cumulative body weight gain, all of the endpoints attained a retrospective statistical power greater than 95 % in this study. Given that the commonly accepted threshold for sufficient statistical power is 80 %, the experimental design could be considered over-powered such that small, biologically-unimportant differences were likely to be detected as statistically significant.

Compared with the prospective powers calculated prior to the study, the retrospective powers for all endpoints attained in this study were similar or higher. Both analyses used 3 treatment groups and 8 cages (paired housing) per group per sex. The difference in the results is attributed to the following factors: 1) the retrospective power analysis was based on the observed variability in the current study while the prospective power analysis conducted prior to this study was based on the CV values obtained from five available paired-housing studies (GLP- and OECD 408-compliant) performed at multiple testing facilities; and 2) the retrospective power analysis was conducted using the combined-gender model, the same model utilized for analyzing the study data, while the prospective power analysis was conducted using a simplified approach and considered males and females separately.

3.3.2 Detectable effect size for all endpoints

Since appropriate targeted effect sizes necessary for power analysis of a rodent subchronic study were not defined a priori for the majority of endpoints, power values were generated for a set of possible effect size values; the smallest effect size value that generated a power value over 80 % was defined as the detectable effect size for a given parameter (Table 4). The twelve selected outcome parameters are bolded/italicized/underlined.

Table 4. Detectable effect size (% change relative to control) for all parameters via retrospective power analysis

Many biological parameters measured in OECD 408-compliant rodent subchronic toxicity studies can vary considerably with no physiological manifestations, while others (e.g., serum electrolytes) have a very narrow normal range and small differences can be biologically important.58 The results in Table 4 show that for most measurements, the current subchronic study design and recommended statistical analysis23 are capable of detecting very small differences of negligible biological relevance. One notable exception is that the 10 % difference in cumulative body weight gain, considered one of the earliest indicators of potential systemic toxicity, cannot be detected statistically with a power of 80 %. Table 4 indicates that the detectable effect size for this parameter is actually greater than 10 %. As discussed in the literature, a 10 % difference (decrease) in cumulative body weight gain was considered a reasonable targeted effect size for selection of a maximum tolerated dose (MTD) for chemical substances in long-term rodent chronic and carcinogenicity studies.62-66 Thus, this difference manifests over the lifetime of the animal, and may not represent a reliable biological effect size in shorter-term bioassays such as the rodent subchronic feeding study.

To explore this outcome further, a prospective power analysis was conducted using data from this and similar paired-housing rat subchronic studies from the same testing facility. In contrast to the retrospective power analysis, since sex by treatment interactions cannot be excluded a priori, the prospective power analyses were performed separately for the sexes.72 Using the current study design with 8 cages (16 rats) per sex per treatment group, the power to detect a 10 % difference in cumulative body weight gain is 43 % for males and 36 % for females (data not shown). The sample size required to reach 80 % power is 19 cages (38 rats) per treatment group for males and 23 cages (46 rats) per treatment group for females. This places an unrealistic demand on animal use that could be considered unethical28,75 and will necessarily over-power the study to detect small and meaningless differences in nearly all biological endpoints. With 8 cages/sex/group, a 15 % difference in cumulative body weight gain can be detected with a power of 77 % for males and 68 % for females; while this still does not meet the 80 % power criterion for distinguishing a potentially real difference from the background of natural variation, it is perhaps a more realistic targeted effect size of biological relevance to studies of subchronic, rather than chronic or lifetime, duration. It is noteworthy that, particularly for a non-hypothesis-driven study, adverse outcomes are characterized using a weight-of-evidence approach that considers all observations and measurements, independent of statistical significance.23 In that regard, the inability to power adequately for every measured parameter is not exclusionary in the context of informing the risk assessment.
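The sample size search described above (how many cages per group are needed to reach 80 % power for a given difference) can be sketched as a simple loop. To keep the example short it swaps in a normal approximation to the t-based power function, so it is not the calculation used in the study and will tend to be slightly optimistic for small numbers of cages. All input values are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Optional

_STD_NORMAL = NormalDist()


def approx_power(delta: float, var_rat: float, cov_cage: float,
                 n_cages: int, rats_per_cage: int = 2,
                 alpha: float = 0.05) -> float:
    """Two-sided power under a normal approximation (an assumption;
    the text uses a t-based power function)."""
    se = math.sqrt(2 * (var_rat + rats_per_cage * cov_cage)
                   / (n_cages * rats_per_cage))
    z = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(delta) / se
    return _STD_NORMAL.cdf(shift - z) + _STD_NORMAL.cdf(-shift - z)


def min_cages(delta: float, var_rat: float, cov_cage: float,
              target: float = 0.80, max_cages: int = 200) -> Optional[int]:
    """Smallest number of cages per group reaching the target power,
    or None if the target is not reached within max_cages."""
    for n in range(2, max_cages + 1):
        if approx_power(delta, var_rat, cov_cage, n) >= target:
            return n
    return None


# Hypothetical example: control mean 100, 10 % targeted difference,
# residual variance 200, no within-cage covariance.
print(min_cages(delta=10.0, var_rat=200.0, cov_cage=0.0))
```

Because power depends on the ratio of the targeted difference to its standard error, halving the targeted difference roughly quadruples the number of cages required, which is why small targeted effects on a variable endpoint quickly become impractical.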

3.4 Additional statistical considerations of the alternative study framework

The analysis, evaluation, and interpretation of all qualitative observations and numerical measurements collected in this study were executed according to the well-grounded scientific principles outlined above. In addition to the standard reporting of toxicological results (Supplemental Information, Appendix C), the EFSA framework23 requires unconventional statistical treatments not commonly included in guideline rodent toxicology studies. To accommodate these requirements, new statistical treatments were developed and applied. Contribution and relevance of some of the less-familiar concepts are discussed below.

3.4.1 Importance of multiplicity adjustment

Rodent toxicology studies conducted to satisfy regulatory product safety testing requirements typically report statistical significance using unadjusted P-values. This is a conservative assessment. When numerous endpoints are evaluated concurrently and statistical tests are conducted separately for each endpoint, increased probability of Type I, or false positive, errors associated with multiple testing (known as “multiplicity”) is expected. Occurrence of these errors results in spurious incidences of apparent statistical significance, which may be irrelevant from a biological context. In large, complex studies where multiplicity may be problematic, such as the rodent subchronic feeding study, overreliance on statistical significance to identify unknown but potentially biologically-relevant variation in individual endpoints can confound interpretation of results.76-78 It is therefore recommended to employ a multiplicity adjustment procedure to address this issue. The FDR method70,79 was applied in this study as a post hoc statistical procedure to adjust P-values across endpoints; the FDR-adjusted P-values were then used to help differentiate statistically significant differences of potential biological relevance. Use of adjusted P-values to control occurrence of false positives in rodent subchronic feeding studies is also endorsed in the EFSA guidance.
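For readers unfamiliar with FDR adjustment, the sketch below shows one widely used procedure, the Benjamini-Hochberg step-up adjustment. That this is the specific FDR method applied in the study is an assumption; the text refers only to "the FDR method" and its cited references. The P-values in the usage line are made up.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order.

    Each P-value is scaled by m / rank (rank 1 = smallest), then a running
    minimum is taken from the largest rank downward so that the adjusted
    values are monotone in the original P-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical unadjusted P-values for three endpoints.
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

An endpoint would then be flagged as statistically significant when its adjusted P-value falls below the chosen level (0.05 in this study), rather than when its unadjusted P-value does.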


3.4.2 Understanding the combined-gender analysis

In rodent toxicology studies, it is standard practice to analyze data from males and females separately. By contrast, the alternative framework requires combining data from both sexes to maximize the statistical power.23 A recently-published set of rodent subchronic studies from the European Commission's publicly-funded GMO Risk Assessment and Communication of Evidence project (Project GRACE) utilized separate-gender statistical analysis despite reporting that the feeding trials were performed according to the EFSA guidance.23,52 This deviation from EFSA recommendations is not surprising, in that combined-gender analysis raises several conceptual problems: 1) there is no historical basis for evaluating rodent toxicology results in the context of the natural variation of combined-gender data; 2) the potentially enhanced statistical power associated with the across-gender statistical test is unwelcome because it results in overly sensitive tests; and 3) the anticipated increased statistical power could be confounded by the different inherent variability of males and females for some endpoints.80 If this difference in variability between sexes is ignored, the statistical test will be over-sensitive for one sex and under-sensitive for the other. Otherwise, if the difference in variability is taken into account by using an appropriate variance-covariance structure, the complexity of the analysis will be increased and the power advantage of a combined analysis would likely be eliminated.

To more thoroughly understand the advantages and disadvantages of combined- versus separate-gender analysis, a retrospective analysis of two endpoints considered to be representative of variance dissimilarity (heteroscedasticity) or similarity (homoscedasticity) was conducted (Table 5). Variance heteroscedasticity (i.e., variance within the male group is different from variance within the female group) is expected for some endpoints such as end-of-study body weight, while for other endpoints such as absolute lymphocyte count, variance homoscedasticity may be a reasonable assumption.

Table 5. Retrospective power analysis: combined-gender vs separate-gender

As expected, the combined-gender analysis assuming heterogeneous variance in body weight for males and females generated the same statistical power for males and females as the separate-gender analysis. In the lymphocyte count example, the combined-gender analysis assuming homogeneous variance resulted in statistical overpowering of the test in males and underpowering of the test in females. This is because the mean absolute lymphocyte count for females was lower than that for males, making it more difficult to detect an absolute difference of smaller magnitude (for both sexes, the targeted effect size is a 30 % difference relative to control).

These examples illustrate that when the combined-gender analysis is required, a model assuming heterogeneous variance for males and females is recommended. However, other than enabling the required comparison, the combined-gender analysis does not offer meaningful, if any, gain in statistical power compared with the separate-gender analysis. When an adjusted P-value is significant (less than 0.05) for the combined-gender analysis, results are further evaluated separately for males and females; the results for males and females generated from the combined-gender analysis are the same as those generated from the separate-gender analysis, an outcome consistent with traditional evaluation of rodent toxicology studies.

3.4.3 Understanding the standardized effect size

Another statistical metric with limited historical application in rodent toxicology studies is the standardized effect size, a unitless normalization of comparative results across potentially all continuous endpoints. As used here, the SES refers to the observed SES from the study, compared with use of a targeted SES for power analysis (section 3.3). Consistent with current requirements of the alternative study framework, differences between the means of the test groups and those of the control group were converted to the unitless scale of standardized effect size (SES) for selected endpoints,23 which enabled the SES results from multiple endpoints to be plotted together to facilitate a visual presentation of statistical results across the study (Figure 2). Conceptually, this data presentation imparts an equal weight/importance to every endpoint while ignoring the context of biological variation considered normal for individual endpoints. For example, serum electrolyte values are tightly controlled, such that small excursions outside a population mean may be toxicologically relevant, while some serum chemistry markers of systemic and/or organ function, such as aspartate aminotransferase (AST) and alanine aminotransferase (ALT), can be highly variable in the absence of biological or toxicological consequence.58 For this reason, in addition to rationale presented in section 3.3, it is not appropriate to apply a universal equivalence limit such as ± 1 SD to evaluate observed SES for all endpoints as was applied to similar rodent subchronic feeding study data compiled from Project GRACE.74,81

Consistent with accepted toxicological practice, the prospective power and sample size analyses made prior to study initiation were based on parameter-dependent biologically-important effect sizes specified as percent differences from the control mean62-66 and were not based on a standardized effect size. Thus, the SES results (e.g., point estimates and CI) with respect to power are not clear and cannot be interpreted rigorously. Technically, SES determination is not part of the statistical analysis of the data, as the mixed-model analyses (which generate the LS-means and 95 % CI of the difference) and the associated mean comparisons (which generate the P-values) are performed prior to SES analysis. Rather, it is a post hoc transformation procedure82 that expresses the difference between two means in units of the standard deviation (SD). Therefore, other than providing a graphical representation of the statistical results obtained through mixed-model analysis, the value of SES to support data interpretation is limited. SES analysis can provide greater utility for meta-analysis when it is necessary to evaluate patterns of biological effects across multiple independent studies.

Figure 2. Standardized effect size estimates and 95 % confidence interval for selected variables

3.5 Animal welfare considerations and the feasibility of alternative study designs

As the requirement for the rodent subchronic feeding study continues to be mandatory83 for single transgenic events as per the Implementing Regulation, alternative study designs were explored that would reduce and refine animal use yet retain statistical rigor. In Table 6, the results of prospective power analyses for alternative study design scenarios utilizing different options for housing and group size are presented; the cage is still the experimental unit. To enable this evaluation, animal data from non-test groups were extracted from several rodent subchronic studies completed between 2008 and 2015 at the same testing facility used for the current study, as indicated in the footnote to Table 6. The power to detect a pre-specified targeted effect size (as per Table 1) is given for the selected endpoints except cumulative body weight gain, where a 10 % difference cannot be detected without unreasonably large group sizes.

Table 6. Comparative power analysis of alternative study designs

As results in Table 6 indicate, the power to detect targeted effect sizes in all selected endpoints except final body weight is at least 80 % for the more conservative animal use scenario of 12/sex/group, single-housed. Interestingly, statistical power for final body weight was almost equivalent for single-housed males and females, but was higher for females and lower for males under paired-housing conditions, suggesting possible sex-dependent stabilizing or antagonistic relationships between pair-housed animals. To explore this observation further, Table 7 presents the results of covariance analysis performed for weekly body weights of male and female rats from the current study. As early as study week 4, an increasingly negative covariance (COVcage) is observed for body weights of pair-housed males, while covariance for body weights of pair-housed females remains consistently low and positive. The strong negative covariance for pair-housed males is correlated with length of time on study, suggesting emergence of a dominance hierarchy characterized by monopolization of shared resources (e.g., food). These data suggest that pair-housing of male rats may actually confound body weight-dependent data by increasing variability at both the experimental unit level (cage-to-cage variation) and the measurement level (rat-to-rat variation) in toxicology studies exceeding four weeks in duration.

Table 7. Covariance analysis of weekly body weights for male and female rats
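To illustrate how a within-cage covariance can come out negative, the sketch below uses a simple one-way ANOVA (method-of-moments) estimator for pair-housed animals. This is a swapped-in, simplified estimator chosen for illustration: it ignores diet and block effects and is not the mixed-model covariance analysis behind Table 7. The body weights are invented.

```python
from statistics import fmean
from typing import Sequence, Tuple


def within_cage_covariance(cages: Sequence[Tuple[float, float]]) -> float:
    """Method-of-moments estimate of the covariance between cage-mates.

    cov = (MS_between - MS_within) / L with L = 2 rats per cage.
    A negative value means cage-mates differ from each other more than
    expected from the cage-to-cage variation.
    """
    rats_per_cage = 2
    n_cages = len(cages)
    grand_mean = fmean([w for pair in cages for w in pair])
    cage_means = [fmean(pair) for pair in cages]
    ms_between = (rats_per_cage
                  * sum((m - grand_mean) ** 2 for m in cage_means)
                  / (n_cages - 1))
    ms_within = (sum((w - m) ** 2
                     for pair, m in zip(cages, cage_means) for w in pair)
                 / (n_cages * (rats_per_cage - 1)))
    return (ms_between - ms_within) / rats_per_cage


# Hypothetical cages: similar cage-mates vs one heavy and one light rat per cage.
similar_mates = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0)]
dominant_mate = [(10.0, 30.0), (30.0, 10.0), (10.0, 30.0)]
print(within_cage_covariance(similar_mates), within_cage_covariance(dominant_mate))
```

In the second invented data set every cage has the same mean weight but a large gap between cage-mates, the pattern one would expect if one animal monopolizes food, and the estimate is negative.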

3.6 Informing the risk assessment

To our knowledge, this is the first published exploratory rat subchronic study with biotechnology-derived food/feed conducted with rigorous adherence to the recommendations set forth in the EFSA Scientific Committee's guidance document.23 Although substantial study design and analysis modifications were introduced into the alternative study framework, toxicological effects, recognized as test-substance-related, biologically-relevant, adverse impacts on animal health, were not observed (Tables 2 and 3; Supplemental Information).

The study results and statistical evaluations we present here reinforce the main conclusions drawn from Project GRACE (GMO Risk Assessment and Communication of Evidence), a publicly-financed EU 7th Framework Programme for Research project initiated in 2012 by an independent academic consortium, which performed, analyzed, and published52 the results of three rat 90-day and two 1-year feeding studies with biotechnology-derived feed ingredients (http://www.grace-fp7.eu/en/content/reports-study-plans-consultation-documents). Project GRACE failed to find scientific evidence that would justify the necessity of the rat 90-day feeding study in the risk assessment of single transgenic events for which previous systematic comparative studies identified no hazards likely to impact human or animal well-being. The European Commission policy requiring compulsory rodent subchronic dietary toxicity studies regardless of uncertainty surrounding inherent hazard24,56 also contradicts the scientific recommendations of Europe's independent food safety advisory body, EFSA,8,21,72,84 and is inconsistent with existing directives to reduce animal use in research.75,85-87 As such, it remains uncertain whether the mandatory inclusion of a 90-day feeding study serves to better inform the risk assessment for new GM crops or to raise public confidence in the performed risk assessment.

Compared with the internationally-harmonized OECD 408 test guideline,28 the study design recommended by EFSA and referred to in Implementing Regulation 503/2013 neither refines nor reduces animal use, but rather expands study size to maximize statistical power (Section 3.3.1; Table 1) and increases complexity from a logistical and analytical perspective. Addition of a low-dose test group of limited experimental value precludes use of a more-refined and conservative limit-test design requiring a single high-dose test group. The complex design, based on a blinded and blocked randomization scheme (Figure 1), represents good experimental design but encumbers logistics and efficiency of routine animal husbandry procedures and data collection. The experimental unit is predefined as a cage containing two animals such that paired housing is obligatory, yet the basis for and necessity of paired housing for rats remain equivocal (Section 3.5; Tables 6 and 7).60

643
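To make the blocking step concrete, the following Python sketch ranks cages by mean body weight within one sex, cuts the ranked list into blocks (the "subsets" of Figure 1), and randomizes diet codes within each block. It is a textbook randomized complete block assignment written for illustration under stated assumptions, not the procedure used in the study; the function name `assign_blocked`, the cage identifiers, the weights, and the seed are all hypothetical.

```python
import random


def assign_blocked(cage_weights, diet_codes, seed=0):
    """Assign diet codes to cages within body-weight blocks (one sex).

    cage_weights: dict mapping a cage id to its mean cage body weight (g).
    diet_codes:   list of diet/group codes; each block receives every code once.
    Returns a dict mapping cage id -> (block number, diet code).
    """
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    k = len(diet_codes)
    ordered = sorted(cage_weights, key=cage_weights.get)  # lightest cage first
    if len(ordered) % k != 0:
        raise ValueError("number of cages must be a multiple of the number of diets")

    assignment = {}
    for b in range(len(ordered) // k):
        block = ordered[b * k:(b + 1) * k]  # k cages of similar weight
        codes = list(diet_codes)
        rng.shuffle(codes)                  # randomize diets within the block
        for cage, code in zip(block, codes):
            assignment[cage] = (b + 1, code)
    return assignment


if __name__ == "__main__":
    # Hypothetical weights for 32 cages of one sex and 8 diet codes.
    weights = {f"cage{i:02d}": 200 + 3 * i for i in range(32)}
    for cage, (block, diet) in sorted(assign_blocked(weights, list(range(1, 9))).items()):
        print(cage, "block", block, "diet", diet)
```

With 32 cages and 8 diet codes, the sketch yields four blocks running from the lightest to the heaviest cages, which mirrors the structure of subsets 1 through 4 described in the Figure 1 caption.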

From an analytical perspective, the combined-gender analysis does not offer a meaningful gain in statistical power compared with the separate-gender analysis and is therefore not recommended (Section 3.4.2; Table 5). The post-hoc conversion of observed statistical differences to standardized effect sizes neither adds value to data interpretation nor informs the toxicological assessment of animal health (Sections 3.3 and 3.4.3; Figure 2). It is unreasonable either to compare the observed standardized effect sizes with a set value (e.g., ± 1 SD) for all endpoints or to use a fixed standardized effect size (e.g., ± 1 SD) as the targeted effect size for statistical power analysis (Sections 3.3 and 3.4.3). Because of the numerous statistical tests conducted in the EFSA rodent subchronic study, e.g., comparisons of the high- and low-dose test groups with the control group for well over 100 endpoints in both sexes, it is important to acknowledge the multiplicity problem; FDR adjustment, as endorsed by EFSA, is a recommended approach to alleviate the burden of an increasing Type I error rate (Section 3.4.1).
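As an illustration of how an FDR adjustment handles many simultaneous comparisons, the sketch below implements the Benjamini-Hochberg step-up adjustment (reference 70) in plain Python. It is a minimal sketch and not the authors' implementation; the function name `bh_adjust` and the example p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is the minimum of p * m / rank over all ranks at or above it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values for four endpoints
    for p, q in zip(raw, bh_adjust(raw)):
        print(f"raw p = {p:.3f}  adjusted p = {q:.3f}")
```

An endpoint would then be flagged only if its adjusted p-value falls below the chosen FDR level, which keeps the expected proportion of false discoveries in check across all endpoints tested.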

Several provisions of the EFSA rat 90-day study framework, such as larger group size, multiple test groups, use of randomized complete block design, and combined-gender analysis, were intended to maximize statistical power or to assess dose-response of observed adverse effects. While our results demonstrate that these objectives can be attained (Table 1, attained statistical power), the toxicological interpretation of the data remains unchanged (Section 3.2; Tables 2 and 3; Supplemental Information). To address the inconsistency with existing mandates to reduce and refine animal use in research inherent in EFSA’s study framework (23), we validated alternative study designs (Sections 2.10 and 3.5; Tables 6 and 7) that are statistically robust (adequately rather than maximally powered), use fewer animals (one test group; smaller group size), remove confounding stressors (e.g., paired housing), and facilitate routine animal handling and data collection (completely randomized design, which can also be more statistically powerful than the randomized complete block design in the absence of known sources of selection bias).

Subsequent to the completion of the rodent feeding study discussed herein, EFSA published an explanatory statement (72) clarifying the intent of its original guidance (23). The Explanatory Statement included recommendations for implementing mandatory non-hypothesis-driven exploratory rodent subchronic studies for biotechnology-derived food/feed that allow flexibility of study design and conduct. The Explanatory Statement acknowledges: limitations to determining an appropriate sample size in the absence of a test hypothesis, thus acknowledging that the requirement for a "powered" sample size cannot be advocated; the acceptability of a limit test design; and that allowing cages within a treatment group to be arrayed systematically minimizes chances for confusion and error in the animal room. However, the Explanatory Statement maintains advocacy for paired housing, stating without justification that social housing controls inter-individual variability (72). This position from EFSA is consistent with general recommendations of the Guide for the Care and Use of Laboratory Animals, but the Guide specifies that single housing may be warranted for experimental reasons (60). Although our research (Tables 6 and 7) is not intended to comprehensively address the impact of social housing, it does refute the unsupported EFSA claim about inter-individual variability, especially for male rats, and is consistent with the Guide’s explanation that “social incompatibility may be sex biased” (60).

Overall, our results validate simpler, more conservative, yet adequately powered study designs that both reduce and refine animal use and that are in alignment with many provisions for flexibility of design and conduct of rat subchronic studies performed as part of the safety assessment of GM crops as recommended in EFSA's Explanatory Statement (72).

4. ABBREVIATIONS USED

GM, genetically modified; EU, European Union; EFSA, European Food Safety Authority; CV, coefficient of variation; ALYM, absolute lymphocyte count; LS, least squares; CI, confidence interval; FDR, false discovery rate; GRACE, GMO Risk Assessment and Communication of Evidence; SES, standardized effect size; AST, aspartate aminotransferase; ALT, alanine aminotransferase; SD, standard deviation.

5. ACKNOWLEDGEMENTS

The authors express appreciation to scientists and staff of Purina Mills, LLC, St. Louis, MO and Purina TestDiet, Richmond, IN for experimental diet formulation and production, respectively; to scientists at EPL Bio Analytical Services, Niantic, IL for diet composition and contaminant analyses; to statisticians formerly with Pioneer Hi-Bred International, Inc. for statistical modeling supporting the comparative analysis; and to the technicians and facility staff of DuPont Haskell, Newark, DE for expert animal care and management.

6. FUNDING

This study was sponsored by Pioneer Hi-Bred International, Inc.

Notes

The authors declare no competing financial interest.

7. SUPPORTING INFORMATION

PDF file, Supplemental Information – Appendices A, B, C, and D

Supplemental Information, Appendices A and B: additional 90-day rat study details describing study conduct, methods and results not presented in the manuscript; Supplemental Information, Appendix C: summary statistics calculated for quantitative endpoints in all diet groups; Supplemental Information, Appendix D: comparative statistical analysis for quantitative endpoints.

8. REFERENCES

(1) Codex Alimentarius Commission Guideline for the Conduct of Food Safety Assessment of Foods Derived from Recombinant-DNA Plants; Codex Alimentarius: 2003.
(2) Codex Alimentarius Commission Joint FAO/WHO Food Standards Programme, Codex Alimentarius Commission. Report of the Sixth Session of the Codex Ad Hoc Intergovernmental Task Force on Foods Derived from Biotechnology; Food and Agriculture Organization, Rome, Italy: 2007.
(3) EC Commission recommendation of 29 July 1997 concerning the scientific aspects and the presentation of information necessary to support applications for the placing on the market of novel foods and novel food ingredients and the preparation of initial assessment reports under Regulation (EC) No. 258/97 of the European Parliament and of the Council (97/618/EC). Official Journal of the European Commission 1997, L 253, 1-36.
(4) EC Regulation (EC) 1829/2003 of the European Parliament and of the Council of 22 September 2003 on genetically modified food and feed. Official Journal of the European Commission 2003, L268, 1-23.
(5) EC Guidance document for the risk assessment of genetically modified plants and derived food and feed; The Joint Working Group on Novel Foods and GMOs, European Commission: Brussels, Belgium, 2003; p 27.
(6) EC Genetically modified crops in the EU: food safety assessment, regulation, and public concerns; ENTRANSFOOD. European Communities, Luxembourg, Belgium: 2004; p 99.
(7) EFSA Guidance document of the scientific panel on genetically modified organisms for the risk assessment of genetically modified plants and derived food and feed. EFSA J. 2006, 99, 1-100.
(8) EFSA Guidance for risk assessment of food and feed from genetically modified plants. EFSA J. 2011, 9, 2150.
(9) FAO/WHO Biotechnology and food safety; Report of a Joint FAO/WHO Consultation, Rome, Italy, 30 September - 4 October 1996: Rome, 1996; p 34.
(10) FAO/WHO Safety aspects of genetically modified foods of plant origin: Report of a Joint FAO/WHO Expert Consultation on Foods Derived from Biotechnology, 29 May – 2 June 2000; World Health Organization, Geneva: 2000.
(11) Howlett, J.; Edwards, D. G.; Cockburn, A.; Hepburn, P.; Kleiner, J.; Knorr, D.; Kozianowski, G.; Müller, D.; Peijnenburg, A.; Perrin, I.; Poulsen, M.; Walker, R. The safety assessment of Novel Foods and concepts to determine their safety in use. Int. J. Food Sci. Nutr. 2003, 54, S1-S32.
(12) ILSI Task Force Nutritional and Safety Assessments of Foods and Feeds Nutritionally Improved through Biotechnology: An Executive Summary A Task Force Report by the International Life Sciences Institute, Washington, D.C. Compr. Rev. Food Sci. Food Saf. 2004, 3, 35-104.
(13) Jonas, D. A.; Antignac, E.; Antoine, J. M.; Classen, H. G.; Huggett, A.; Knudsen, I.; Mahler, J.; Ockhuizen, T.; Smith, M.; Teuber, M.; Walker, R.; De Vogel, P. The safety assessment of novel foods: Guidelines prepared by ILSI Europe novel food task force. Food Chem. Toxicol. 1996, 34, 931-940.
(14) OECD Safety evaluation of foods produced by modern biotechnology: Concepts and principles; Organisation for Economic Cooperation and Development: Paris, France, 1993; p 74.

(15) OECD In Report of the OECD workshop on the toxicological and nutritional testing of novel foods, 5-8 March 1997; Organisation of Economic Co-operation and Development: Paris, France, 2002; p 57.
(16) OECD Considerations for the Safety Assessment of Animal Feedstuffs Derived from Genetically Modified Plants; Organisation for Economic Co-operation and Development: 2003; p 46.
(17) WHO Strategies for assessing the safety of foods produced by biotechnology. Report of a joint FAO/WHO consultation; World Health Organization: Geneva, Switzerland, 1991.
(18) WHO Application of the Principles of Substantial Equivalence to the Safety Evaluation of Foods or Food Components from Plants Derived by Modern Biotechnology; World Health Organization: 1995.
(19) ADAS Review of the strategies for the comprehensive food and feed safety and nutritional assessment of GM plants per se. EFSA Supporting Publications 2013, 10, EN-480.
(20) Chassy, B.; Egnin, M.; Gao, Y.; Glenn, K.; Kleter, G. A.; Nestel, P.; Newell-McGloughlin, M.; Phipps, R. H.; Shillito, R. Nutritional and safety assessments of foods and feeds nutritionally improved through biotechnology: case studies. J. Food Sci. 2007, 72, R131-R137.
(21) Devos, Y.; Aguilera, J.; Diveki, Z.; Gomes, A.; Liu, Y.; Paoletti, C.; du Jardin, P.; Herman, L.; Perry, J. N.; Waigmann, E. EFSA’s scientific activities and achievements on the risk assessment of genetically modified organisms (GMOs) during its first decade of existence: looking back and ahead. Transgenic Res. 2014, 23, 1-25.
(22) EFSA Safety and nutritional assessment of GM plants and derived food and feed: The role of animal feeding trials; Report of the EFSA GMO Panel Working Group on Animal Feeding Trials. Food Chem. Toxicol. 2008, 46, S2-S70.
(23) EFSA Guidance on conducting repeated-dose 90-day oral toxicity study in rodents on whole food/feed. EFSA J. 2011, 9, 2438.
(24) Herman, R. A.; Ekmay, R. Do whole-food animal feeding studies have any value in the safety assessment of GM crops? Regul. Toxicol. Pharmacol. 2014, 68, 171-174.
(25) Barlow, S. M.; Greig, J. B.; Bridges, J. W.; Carere, A.; Carpy, A. J. M.; Galli, C. L.; Kleiner, J.; Knudsen, I.; Koëter, H. B. W. M.; Levy, L. S.; Madsen, C.; Mayer, S.; Narbonne, J. F.; Pfannkuch, F.; Prodanchuk, M. G.; Smith, M. R.; Steinberg, P. Hazard identification by methods of animal-based toxicology. Food Chem. Toxicol. 2002, 40, 145-191.
(26) Cellini, F.; Chesson, A.; Colquhoun, I.; Constable, A.; Davies, H. V.; Engel, K. H.; Gatehouse, A. M. R.; Kärenlampi, S.; Kok, E. J.; Leguay, J.-J.; Lehesranta, S.; Noteborn, H. P. J. M.; Pedersen, J.; Smith, M. Unintended effects and their detection in genetically modified crops. Food Chem. Toxicol. 2004, 42, 1089-1125.
(27) NAS Safety of Genetically Engineered Foods: Approaches to Assessing Unintended Health Effects. National Academy of Sciences, The National Academies Press: Washington, D.C., 2004; pp 235.
(28) OECD OECD Guideline for the Testing of Chemicals 408: Repeated Dose 90-day Oral Toxicity Study in Rodents; Organisation for Economic Cooperation and Development: Paris, France, 1998; p 10.
(29) Appenzeller, L. M.; Malley, L.; MacKenzie, S. A.; Hoban, D.; Delaney, B. Subchronic feeding study with genetically modified stacked trait lepidopteran and coleopteran resistant (DAS-Ø15Ø7-1xDAS-59122-7) maize grain in Sprague-Dawley rats. Food Chem. Toxicol. 2009, 47, 1512-1520.

(30) Appenzeller, L. M.; Munley, S. M.; Hoban, D.; Sykes, G. P.; Malley, L. A.; Delaney, B. Subchronic feeding study of herbicide-tolerant soybean DP-356Ø43-5 in Sprague-Dawley rats. Food Chem. Toxicol. 2008, 46, 2201-2213.
(31) Appenzeller, L. M.; Munley, S. M.; Hoban, D.; Sykes, G. P.; Malley, L. A.; Delaney, B. Subchronic feeding study of grain from herbicide-tolerant maize DP-Ø9814Ø-6 in Sprague-Dawley rats. Food Chem. Toxicol. 2009, 47, 2269-2280.
(32) Arjó, G.; Capell, T.; Matias-Guiu, X.; Zhu, C.; Christou, P.; Piñol, C. Mice fed on a diet enriched with genetically engineered multivitamin corn show no sub-acute toxic effects and no sub-chronic toxicity. Plant Biotechnol. J. 2012, 10, 1026-1034.
(33) Chukwudebe, A.; Privalle, L.; Reed, A.; Wandelt, C.; Contri, D.; Dammann, M.; Groeters, S.; Kaspers, U.; Strauss, V.; van Ravenzwaay, B. Health and nutritional status of Wistar rats following subchronic exposure to CV127 soybeans. Food Chem. Toxicol. 2012, 50, 956-971.
(34) Delaney, B.; Appenzeller, L. M.; Munley, S. M.; Hoban, D.; Sykes, G. P.; Malley, L. A.; Sanders, C. Subchronic feeding study of high oleic acid soybeans (Event DP-3Ø5423-1) in Sprague-Dawley rats. Food Chem. Toxicol. 2008, 46, 3808-3817.
(35) Delaney, B.; Appenzeller, L. M.; Roper, J. M.; Mukerji, P.; Hoban, D.; Sykes, G. P. Thirteen week rodent feeding study with processed fractions from herbicide tolerant (DP-Ø73496-4) canola. Food Chem. Toxicol. 2014, 66, 173-184.
(36) Delaney, B.; Karaman, S.; Roper, J.; Hoban, D.; Sykes, G.; Mukerji, P.; Frame, S. R. Thirteen week rodent feeding study with grain from molecular stacked trait lepidopteran and coleopteran protected (DP-ØØ4114-3) maize. Food Chem. Toxicol. 2013, 53, 417-427.
(37) Hammond, B.; Dudek, R.; Lemen, J.; Nemeth, M. Results of a 13 week safety assurance study with rats fed grain from glyphosate tolerant corn. Food Chem. Toxicol. 2004, 42, 1003-1014.
(38) Hammond, B.; Lemen, J.; Dudek, R.; Ward, D.; Jiang, C.; Nemeth, M.; Burns, J. Results of a 90-day safety assurance study with rats fed grain from corn rootworm-protected corn. Food Chem. Toxicol. 2006, 44, 147-160.
(39) Hammond, B. G.; Dudek, R.; Lemen, J. K.; Nemeth, M. A. Results of a 90-day safety assurance study with rats fed grain from corn borer-protected corn. Food Chem. Toxicol. 2006, 44, 1092-1099.
(40) He, X.; de Brum, P. A. R.; Chukwudebe, A.; Privalle, L.; Reed, A.; Wang, Y.; Zhou, C.; Wang, C.; Lu, J.; Huang, K.; Contri, D.; Nakatani, A.; de Avila, V. S.; Klein, C. H.; de Lima, G. J. M. M.; Lipscomb, E. A. Rat and poultry feeding studies with soybean meal produced from imidazolinone-tolerant (CV127) soybeans. Food Chem. Toxicol. 2016, 88, 48-56.
(41) He, X. Y.; Huang, K. L.; Li, X.; Qin, W.; Delaney, B.; Luo, Y. B. Comparison of grain from corn rootworm resistant transgenic DAS-59122-7 maize with non-transgenic maize grain in a 90-day feeding study in Sprague-Dawley rats. Food Chem. Toxicol. 2008, 46, 1994-2002.
(42) He, X. Y.; Tang, M. Z.; Luo, Y. B.; Li, X.; Cao, S. S.; Yu, J. J.; Delaney, B.; Huang, K. L. A 90-day toxicology study of transgenic lysine-rich maize grain (Y642) in Sprague–Dawley rats. Food Chem. Toxicol. 2009, 47, 425-432.
(43) Healy, C.; Hammond, B.; Kirkpatrick, J. Results of a 13-week safety assurance study with rats fed grain from corn rootworm-protected, glyphosate-tolerant MON 88017 corn. Food Chem. Toxicol. 2008, 46, 2517-2524.

(44) Liu, P.; He, X.; Chen, D.; Luo, Y.; Cao, S.; Song, H.; Liu, T.; Huang, K.; Xu, W. A 90-day subchronic feeding study of genetically modified maize expressing Cry1Ac-M protein in Sprague–Dawley rats. Food Chem. Toxicol. 2012, 50, 3215-3221.
(45) MacKenzie, S. A.; Lamb, I.; Schmidt, J.; Deege, L.; Morrisey, M. J.; Harper, M.; Layton, R. J.; Prochaska, L. M.; Sanders, C.; Locke, M.; Mattsson, J. L.; Fuentes, A.; Delaney, B. Thirteen week feeding study with transgenic maize grain containing event DAS-Ø15Ø7-1 in Sprague-Dawley rats. Food Chem. Toxicol. 2007, 45, 551-562.
(46) Malley, L. A.; Everds, N. E.; Reynolds, J.; Mann, P. C.; Lamb, I.; Rood, T.; Schmidt, J.; Layton, R. J.; Prochaska, L. M.; Hinds, M.; Locke, M.; Chui, C.-F.; Claussen, F.; Mattsson, J. L.; Delaney, B. Subchronic feeding study of DAS-59122-7 maize grain in Sprague-Dawley rats. Food Chem. Toxicol. 2007, 45, 1277-1292.
(47) Qi, X.; He, X.; Luo, Y.; Li, S.; Zou, S.; Cao, S.; Tang, M.; Delaney, B.; Xu, W.; Huang, K. Subchronic feeding study of stacked trait genetically-modified soybean (3Ø5423 x 40-3-2) in Sprague–Dawley rats. Food Chem. Toxicol. 2012, 50, 3256-3263.
(48) Schrøder, M.; Poulsen, M.; Wilcks, A.; Kroghsbo, S.; Miller, A.; Frenzel, T.; Danier, J.; Rychlik, M.; Emami, K.; Gatehouse, A.; Shu, Q.; Engel, K.-H.; Altosaar, I.; Knudsen, I. A 90-day safety study of genetically modified rice expressing Cry1Ab protein (Bacillus thuringiensis toxin) in Wistar rats. Food Chem. Toxicol. 2007, 45, 339-349.
(49) Song, H.; He, X.; Zou, S.; Zhang, T.; Luo, Y.; Huang, K.; Zhu, Z.; Xu, W. A 90-day subchronic feeding study of genetically modified rice expressing Cry1Ab protein in Sprague–Dawley rats. Transgenic Res. 2015, 24, 295-308.
(50) Tang, M.; Xie, T.; Cheng, W.; Qian, L.; Yang, S.; Yang, D.; Cui, W.; Li, K. A 90-day safety study of genetically modified rice expressing rhIGF-1 protein in C57BL/6J rats. Transgenic Res. 2012, 21, 499-510.
(51) Wang, Z.-h.; Wang, Y.; Cui, H.-r.; Xia, Y.-w.; Altosaar, I.; Shu, Q.-y. Toxicological evaluation of transgenic rice flour with a synthetic cry1Ab gene from Bacillus thuringiensis. J. Sci. Food Agric. 2002, 82, 738-744.
(52) Zeljenková, D.; Ambrušová, K.; Bartušová, M.; Kebis, A.; Kovrižnych, J.; Krivošíková, Z.; Kuricová, M.; Líšková, A.; Rollerová, E.; Spustová, V.; Szabová, E.; Tulinská, J.; Wimmerová, S.; Levkut, M.; Révajová, V.; Sevčíková, Z.; Schmidt, K.; Schmidtke, J.; La Paz, J. L.; Corujo, M.; Pla, M.; Kleter, G. A.; Kok, E. J.; Sharbati, J.; Hanisch, C.; Einspanier, R.; Adel-Patient, K.; Wal, J.-M.; Spök, A.; Pöting, A.; Kohl, C.; Wilhelm, R.; Schiemann, J.; Steinberg, P. Ninety-day oral toxicity studies on two genetically modified maize MON810 varieties in Wistar Han RCC rats (EU 7th Framework Programme project GRACE). Arch. Toxicol. 2014, 88, 2289-2314.
(53) Zhou, X. H.; Dong, Y.; Xiao, X.; Wang, Y.; Xu, Y.; Xu, B.; Shi, W. D.; Zhang, Y.; Zhu, L. J.; Liu, Q. Q. A 90-day toxicology study of high-amylose transgenic rice grain in Sprague–Dawley rats. Food Chem. Toxicol. 2011, 49, 3112-3118.
(54) Zhu, Y.; He, X.; Luo, Y.; Zou, S.; Zhou, X.; Huang, K.; Xu, W. A 90-day feeding study of glyphosate-tolerant maize with the G2-aroA gene in Sprague-Dawley rats. Food Chem. Toxicol. 2013, 51, 280-287.
(55) Zhu, Y.; Li, D.; Wang, F.; Yin, J.; Jin, H. Nutritional assessment and fate of DNA of soybean meal from Roundup Ready or conventional soybeans using rats. Arch. Anim. Nutr. 2004, 58, 295-310.
(56) EC Commission Implementing Regulation (EU) No 503/2013 of 3 April 2013 on applications for authorisation of GM food and feed in accordance with Regulation (EC) No 1829/2003 of the European Parliament and of the Council and amending Commission Regulations (EC) No 641/2004 and (EC) No 1981/2006. Official Journal of the European Commission 2013, L157, 1-52.
(57) Lewis, R. W.; Billington, R.; Debryune, E.; Gamer, A.; Lang, B.; Carpanini, F. Recognition of Adverse and Nonadverse Effects in Toxicity Studies. Toxicol. Pathol. 2002, 30, 66-74.
(58) OECD Guidance Notes for Analysis and Evaluation of Repeat-Dose Toxicity Studies; Organisation for Economic Co-operation and Development: Paris, France, 2002; p 87.
(59) LabDiet® 2012 LabDiet® Technical Update. LabDiet®: 2012. http://www.labdiet.com/cs/groups/lolweb/@labdiet/documents/web_content/ndjf/mtyz/~edisp/36142_163851.pdf (accessed Jan 26, 2017).
(60) NRC Guide for the Care and Use of Laboratory Animals. 8 ed.; The National Academies Press: Washington, D.C., 2011; pp 220.
(61) Borzelleca, J. F. A Proposed Model for Safety Assessment of Macronutrient Substitutes. Regul. Toxicol. Pharmacol. 1996, 23, S15-S18.
(62) US-EPA Endocrine Disruptor Screening Program Test Guidelines OPPTS 890.1500: Pubertal Development and Thyroid Function in Intact Juvenile/Peripubertal Male Rats; United States Environmental Protection Agency: October, 2009; p 28.
(63) OECD Guidance Document 116 on the Conduct and Design of Chronic Toxicity and Carcinogenicity Studies, Supporting Test Guidelines 451, 452 and 453 – 2nd Edition; Organisation of Economic Co-operation and Development: Paris, France, 2012; p 156.
(64) Rhomberg, L. R.; Baetcke, K.; Blancato, J.; Bus, J.; Cohen, S.; Conolly, R.; Dixit, R.; Doe, J.; Ekelman, K.; Fenner-Crisp, P.; Harvey, P.; Hattis, D.; Jacobs, A.; Jacobson-Kram, D.; Lewandowski, T.; Liteplo, R.; Pelkonen, O.; Rice, J.; Somers, D.; Turturro, A.; West, W.; Olin, S. Issues in the Design and Interpretation of Chronic Toxicity and Carcinogenicity Studies in Rodents: Approaches to Dose Selection. Crit. Rev. Toxicol. 2007, 37, 729-837.
(65) US-EPA Hepatocellular hypertrophy; United States Environmental Protection Agency: 2002; p 24.
(66) US-EPA Rodent Carcinogenicity Studies: Dose Selection and Evaluation; United States Environmental Protection Agency: 2003; p 52.
(67) Weil, C. S.; McCollister, D. D. Safety Evaluation of Chemicals, Relationship between Short-and Long-Term Feeding Studies in Designing an Effective Toxicity Test. J. Agric. Food Chem. 1963, 11, 486-491.
(68) Littell, R. C.; Milliken, G. A.; Stroup, W. W.; Wolfinger, R. D.; Schabenberger, O. Example: Variance Component Estimates Equal to Zero. In SAS® for Mixed Models, 2 ed.; SAS Institute Inc.: Cary, NC, 2006; pp 148-154.
(69) Kenward, M. G.; Roger, J. H. Small Sample Inference for Fixed Effects from Restricted Maximum Likelihood. Biometrics 1997, 53, 983-997.
(70) Benjamini, Y.; Hochberg, Y. Controlling the False Discovery Rate: a Practical and Powerful Approach to Multiple Testing. J.R. Statist. Soc. B 1995, 57, 289-300.
(71) Westfall, P. H.; Tobias, R. D.; Rom, D.; Wolfinger, R. D.; Hochberg, Y. Concepts and Basic Methods for Multiple Comparisons and Tests. In Multiple Comparisons and Multiple Tests: Using SAS, SAS Institute Inc.: Cary, NC, 1999; pp 13-40.
(72) EFSA Explanatory statement for the applicability of the Guidance of the EFSA Scientific Committee on conducting repeated-dose-90-day oral toxicity study in rodents on whole food/feed for GMO risk assessment. EFSA J. 2014, 12, 3871.

(73) EFSA Statistical Significance and Biological Relevance. EFSA J. 2011, 9, 2372.
(74) Schmidt, K.; Schmidtke, J.; Schmidt, P.; Kohl, C.; Wilhelm, R.; Schiemann, J.; van der Voet, H.; Steinberg, P. Variability of control data and relevance of observed group differences in five oral toxicity studies with genetically modified maize MON810 in rats. Arch. Toxicol. 2017, 91, 1977-2006.
(75) European Directive Directive 2010/63/EU of the European Parliament and of the Council of 22 September 2010 on the protection of animals used for scientific purposes. Official Journal of the European Commission 2010, L276, 33-79.
(76) Doull, J.; Gaylor, D.; Greim, H. A.; Lovell, D. P.; Lynch, B.; Munro, I. C. Report of an Expert Panel on the reanalysis by Séralini et al. (2007) of a 90-day study conducted by Monsanto in support of the safety of a genetically modified corn variety (MON 863). Food Chem. Toxicol. 2007, 45, 2073-2085.
(77) Hothorn, L. A.; Oberdoerfer, R. Statistical analysis used in the nutritional assessment of novel food using the proof of safety. Regul. Toxicol. Pharmacol. 2006, 44, 125-135.
(78) Poulsen, M.; Schrøder, M.; Wilcks, A.; Kroghsbo, S.; Lindecrona, R. H.; Miller, A.; Frenzel, T.; Danier, J.; Rychlik, M.; Shu, Q.; Emami, K.; Taylor, M.; Gatehouse, A.; Engel, K.-H.; Knudsen, I. Safety testing of GM-rice expressing PHA-E lectin using a new animal test design. Food Chem. Toxicol. 2007, 45, 364-377.
(79) Benjamini, Y.; Yekutieli, D. The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics 2001, 29, 1165-1188.
(80) Carakostas, M. C.; Banerjee, A. K. Interpreting rodent clinical laboratory data in safety assessment studies: Biological and analytical components of variation. Fundam. Appl. Toxicol. 1990, 15, 744-753.
(81) Schmidt, K.; Schmidtke, J.; Kohl, C.; Wilhelm, R.; Schiemann, J.; van der Voet, H.; Steinberg, P. Enhancing the interpretation of statistical P values in toxicology studies: implementation of linear mixed models (LMMs) and standardized effect sizes (SESs). Arch. Toxicol. 2016, 90, 731-751.
(82) Festing, M. F. W. Extending the Statistical Analysis and Graphical Presentation of Toxicity Test Results Using Standardized Effect Sizes. Toxicol. Pathol. 2014, 42, 1238-1249.
(83) EC Summary Report of the Joint Meeting Standing Committee on Plants, Animals, Food and Feed, Section Genetically Modified Food and Feed and Environmental Risk and Regulatory Committee under Directive 2001/18/EC held in Brussels on 27 January 2017; European Commission: 2017.
(84) Devos, Y.; Naegeli, H.; Perry, J. N.; Waigmann, E. 90-day rodent feeding studies on whole GM food/feed. EMBO Rep. 2016, 17, 942-945.
(85) EC Communication from the Commission on the European Citizens' Initiative "Stop Vivisection"; European Commission: Belgium, Brussels, 2015.
(86) EC Press release - Commission replies to "Stop Vivisection" European Citizens' Initiative. European Commission: 2015. http://europa.eu/rapid/press-release_IP-15-5094_en.htm (accessed June 3, 2015).
(87) Russell, W. M. S.; Burch, R. L. The Principles of Humane Experimental Technique. Methuen Publishing: London, 1959; pp 238.

FIGURE CAPTIONS

Figure 1. Randomization and blocking scheme. “Subset” is defined here as a cohort of animals that was processed together (motor activity, necropsy). Each subset contained 2 cages from each diet group, one male and one female. For the first start date, subsets reflected blocking on mean cage body weight, with the lightest 8 cages within a sex assigned to subset 1, progressing to the heaviest in subset 4; for the second start date, cages were assigned in the same manner to subsets 5 through 8. Cage positions within a subset were assigned randomly. Numbers within a subset denote the diet/group code for each cage; unused cage positions are represented by “X”.

Figure 2. Standardized effect size estimates and 95 % confidence interval for selected variables. The standardized scale has no units, which enables simultaneous presentation of normalized effect size estimates across endpoints.
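For readers unfamiliar with the unitless scale of Figure 2, the short Python sketch below shows one common way to compute a standardized effect size: the difference in group means divided by a pooled standard deviation. This is an illustrative simplification and not necessarily the estimator behind Figure 2 (which may be derived from a mixed model with the cage as the experimental unit); the function name `standardized_effect_size` is hypothetical, and the inputs are the male final body weight summary statistics reported in Table 2.

```python
import math


def standardized_effect_size(mean_test, sd_test, mean_ctrl, sd_ctrl):
    """Mean difference divided by the pooled SD (assumes equal group sizes)."""
    pooled_sd = math.sqrt((sd_test ** 2 + sd_ctrl ** 2) / 2)
    return (mean_test - mean_ctrl) / pooled_sd


if __name__ == "__main__":
    # Male final body weight (g), Table 2: test (40 %) 550 ± 66.2 vs control 541 ± 59.7
    ses = standardized_effect_size(550, 66.2, 541, 59.7)
    print(f"SES = {ses:.2f} SD units")
```

Here a 9 g difference corresponds to roughly 0.14 SD units; because every endpoint is rescaled by its own variability, estimates for body weight, organ weight, and clinical pathology can share one axis.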


TABLES

Table 1. Selected key outcome parameters, targeted effect sizes, CV values, and expected and attained power

| Parameter | Targeted effect size (change relative to control) a | Expected power calculated prior to the study b (average CV c): Male | Expected power calculated prior to the study b (average CV c): Female | Retrospective power calculated after the study d: Combined-gender |
|---|---|---|---|---|
| Body weight (final non-fasted) | Decrease 10 % | 97 % (4.9 %) | 97 % (4.9 %) | 96 % |
| Cumulative body weight gain | Decrease 10 % | NA e | NA | 54 % |
| Liver weight, absolute | Increase 25 % | > 99 % (7.9 %) | > 99 % (6.9 %) | > 99 % |
| Liver, % body weight | Increase 25 % | NA | NA | > 99 % |
| Kidney weight, absolute | Increase 25 % | > 99 % (6.4 %) | > 99 % (6.5 %) | > 99 % |
| Kidney, % body weight | Increase 25 % | NA | NA | > 99 % |
| Leukocyte (WBC) count | Decrease/increase 30 % | 93 % (16.6 %) | 81 % (20.3 %) | > 99 % |
| Lymphocyte (ALYM) count | Decrease/increase 30 % | 89 % (18.1 %) | 75 % (21.7 %) | 98 % |
| Cholesterol (CHOL) | Increase 200 % | > 99 % (12.9 %) | > 99 % (16.7 %) | > 99 % |
| Blood urea nitrogen (BUN) | Increase 50 % | > 99 % (10.0 %) | > 99 % (9.9 %) | > 99 % |
| Creatinine (CREA) | Increase 50 % | > 99 % (7.5 %) | > 99 % (7.5 %) | > 99 % |
| Alkaline phosphatase (ALKP) | Increase 100 % | > 99 % (12.8 %) | > 99 % (20.8 %) | > 99 % |

a As identified by U.S. EPA, 2009 (62); OECD, 2012 (63); Rhomberg et al., 2007 (64); U.S. EPA, 2002 (65), 2003 (66)
b Expected statistical power was calculated using 3 treatment groups of 8 cages of pair-housed rats per group per sex under a completely randomized design. Prospective power was not calculated for three biological parameters (cumulative body weight gain, liver to body weight ratio, and kidney to body weight ratio) prior to the study
c CV, coefficient of variation between experimental units (cages). Data were obtained from 5 available paired-housing studies conducted at multiple testing facilities (20 groups each of male and female rats, 6 pairs/sex/group)
d Data obtained from non-test animals (control and reference groups) in current study
e NA, analysis not performed
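The relationship between targeted effect size, CV, and number of cages per group in Table 1 can be explored with a rough power calculation. The Python sketch below uses a two-group, two-sided normal approximation, which is a deliberate simplification: the expected power in Table 1 was calculated for a three-group completely randomized design, so the numbers produced here need not match the table exactly. The function name `approx_power` and the default significance level of 0.05 are assumptions made for illustration.

```python
from statistics import NormalDist


def approx_power(effect_pct, cv_pct, units_per_group, alpha=0.05):
    """Approximate two-sided power to detect a mean shift of effect_pct.

    effect_pct:      targeted change relative to control, in percent.
    cv_pct:          coefficient of variation between experimental units, in percent.
    units_per_group: number of experimental units (cages) per group.
    """
    z = NormalDist()
    se = cv_pct * (2 / units_per_group) ** 0.5  # SE of the difference, in % of the mean
    ncp = abs(effect_pct) / se                  # standardized shift
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Probability of rejecting in either tail under the alternative.
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


if __name__ == "__main__":
    # Inputs taken from Table 1: 8 cages per group.
    print(f"Body weight, 10 % change, CV 4.9 %: {approx_power(10, 4.9, 8):.2f}")
    print(f"WBC (female), 30 % change, CV 20.3 %: {approx_power(30, 20.3, 8):.2f}")
```

The sketch reproduces the qualitative pattern of Table 1: endpoints with a small CV relative to the targeted change are detected almost surely with 8 cages per group, whereas more variable hematology endpoints fall noticeably short of that.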


Table 2. Summary statistics of selected biological parameters a in male and female rats b (arithmetic mean ± SD; range of individual values in parentheses)

Males

| Parameter | control (n = 16) | test (40 %) (n = 16) | test (20 %) (n = 16) | ref 1 (n = 16) | ref 2 (n = 16) | ref 3 (n = 16) |
|---|---|---|---|---|---|---|
| Cumulative body weight gain (g) | 303 ± 54.7 (196 - 434) | 308 ± 54.1 (210 - 418) | 330 ± 60.3 (251 - 481) | 327 ± 53.2 (230 - 403) | 319 ± 35.7 (254 - 365) | 312 ± 31.5 (245 - 363) |
| Final body weight (non-fasted, g) | 541 ± 59.7 (431 - 675) | 550 ± 66.2 (443 - 690) | 570 ± 66.5 (478 - 725) | 564 ± 62.6 (459 - 651) | 557 ± 37.6 (478 - 613) | 553 ± 39.8 (471 - 635) |
| Liver weight (absolute, g) | 12.9 ± 2.39 (10.1 - 20.0) | 13.2 ± 1.91 (9.74 - 17.8) | 13.7 ± 1.95 (10.8 - 18.0) | 13.9 ± 2.40 (11.2 - 18.1) | 14.4 ± 2.86 (11.1 - 23.4) | 13.4 ± 1.66 (11.5 - 16.4) |
| Liver to body weight (%) | 2.51 ± 0.231 (2.22 - 3.13) | 2.52 ± 0.125 (2.28 - 2.70) | 2.53 ± 0.209 (2.26 - 3.01) | 2.59 ± 0.222 (2.17 - 2.90) | 2.71 ± 0.447 (2.32 - 4.16) | 2.53 ± 0.184 (2.28 - 2.83) |
| Kidney weight (absolute, g) | 3.09 ± 0.395 (2.61 - 4.31) | 3.24 ± 0.424 (2.54 - 4.20) | 3.32 ± 0.480 (2.58 - 4.53) | 3.49 ± 0.431 (2.86 - 4.29) | 3.75 ± 1.54 (2.67 - 9.38) | 3.21 ± 0.317 (2.52 - 3.72) |
| Kidney to body weight (%) | 0.603 ± 0.053 (0.525 - 0.713) | 0.619 ± 0.056 (0.515 - 0.731) | 0.613 ± 0.070 (0.541 - 0.827) | 0.656 ± 0.077 (0.518 - 0.806) | 0.710 ± 0.288 (0.518 - 1.76) | 0.611 ± 0.058 (0.511 - 0.718) |
| WBC (×10³/μL) | 8.88 ± 1.83 (5.37 - 11.6) | 8.89 ± 1.52 (5.61 - 11.9) | 9.95 ± 2.07 (7.69 - 15.7) | 9.05 ± 2.05 (5.29 - 12.6) | 10.1 ± 3.42 (5.60 - 18.1) | 9.38 ± 2.06 (6.76 - 14.0) |
| ALYM (×10³/μL) | 6.90 ± 1.54 (4.04 - 9.55) | 6.95 ± 1.24 (4.79 - 9.24) | 7.84 ± 2.16 (5.01 - 13.8) | 7.23 ± 1.87 (4.31 - 10.2) | 7.92 ± 2.59 (4.38 - 13.2) | 7.54 ± 2.01 (5.15 - 11.9) |
| CHOL (mg/dL) | 60.1 ± 13.5 (36 - 80) | 58.7 ± 11.6 (41 - 88) | 56.9 ± 10.5 (42 - 81) | 54.9 ± 13.6 (33 - 73) | 61.7 ± 14.0 (32 - 86) | 57.3 ± 10.5 (46 - 80) |
| BUN (mg/dL) | 13.6 ± 2.96 (10 - 20) | 13.6 ± 1.31 (12 - 16) | 13.4 ± 4.07 (10 - 28) | 12.9 ± 2.03 (9 - 16) | 13.4 ± 1.78 (11 - 19) | 13.4 ± 1.15 (11 - 15) |
| CREA (mg/dL) | 0.334 ± 0.0537 (0.25 - 0.46) | 0.328 ± 0.0349 (0.27 - 0.38) | 0.344 ± 0.0671 (0.30 - 0.58) | 0.324 ± 0.0549 (0.26 - 0.44) | 0.320 ± 0.0468 (0.23 - 0.44) | 0.331 ± 0.0412 (0.26 - 0.41) |
| ALKP (U/L) | 75.8 ± 17.9 (51 - 115) | 71.1 ± 11.6 (53 - 96) | 76.9 ± 18.3 (53 - 128) | 76.7 ± 20.4 (54 - 126) | 79.5 ± 12.9 (62 - 101) | 76.0 ± 15.8 (59 - 129) |

Females

| Parameter | control (n = 16) | test (40 %) (n = 16) | test (20 %) (n = 16) | ref 1 (n = 15) | ref 2 (n = 16) | ref 3 (n = 16) |
|---|---|---|---|---|---|---|
| Cumulative body weight gain (g) | 119 ± 15.7 (93.7 - 144) | 133 ± 19.5 (94.9 - 164) | 131 ± 21.3 (99.8 - 185) | 131 ± 24.8 (96.8 - 183) | 123 ± 26.2 (93.5 - 188) | 119 ± 16.4 (94.9 - 155) |
| Final body weight (non-fasted, g) | 294 ± 21.3 (250 - 328) | 307 ± 25.3 (255 - 342) | 306 ± 25.6 (263 - 358) | 312 ± 33.7 (260 - 380) | 299 ± 30.5 (259 - 364) | 292 ± 24.1 (260 - 349) |
| Liver weight (absolute, g) | 7.21 ± 0.949 c (5.86 - 9.52) | 7.21 ± 0.370 (6.52 - 7.78) | 7.20 ± 0.517 (6.38 - 8.34) | 7.59 ± 0.939 (6.19 - 9.27) | 7.46 ± 0.737 (6.59 - 9.27) | 7.34 ± 0.960 (6.13 - 9.20) |
| Liver to body weight (%) | 2.61 ± 0.252 c (2.30 - 3.24) | 2.54 ± 0.187 (2.32 - 3.01) | 2.51 ± 0.080 (2.30 - 2.65) | 2.61 ± 0.154 (2.45 - 2.98) | 2.69 ± 0.188 (2.40 - 3.09) | 2.69 ± 0.239 (2.34 - 3.05) |
| Kidney weight (absolute, g) | 1.90 ± 0.231 c (1.51 - 2.39) | 1.91 ± 0.158 (1.67 - 2.17) | 1.85 ± 0.180 (1.51 - 2.14) | 1.95 ± 0.197 (1.62 - 2.27) | 1.93 ± 0.240 (1.65 - 2.48) | 1.86 ± 0.208 (1.52 - 2.23) |
| Kidney to body weight (%) | 0.686 ± 0.057 c (0.622 - 0.813) | 0.673 ± 0.072 (0.545 - 0.820) | 0.647 ± 0.063 (0.563 - 0.758) | 0.671 ± 0.053 (0.582 - 0.740) | 0.693 ± 0.071 (0.579 - 0.830) | 0.685 ± 0.059 (0.599 - 0.791) |
| WBC (×10³/μL) | 4.86 ± 1.27 c (2.95 - 7.31) | 4.75 ± 1.54 (2.43 - 7.48) | 5.14 ± 1.45 (3.03 - 8.53) | 4.82 ± 1.26 (3.19 - 8.73) | 4.80 ± 0.921 (3.27 - 6.65) | 4.95 ± 1.68 (2.05 - 8.40) |
| ALYM (×10³/μL) | 3.80 ± 1.09 c (2.32 - 6.12) | 3.75 ± 1.36 (1.97 - 6.01) | 4.08 ± 1.20 (2.55 - 6.86) | 3.90 ± 1.08 (2.44 - 7.25) | 3.88 ± 0.895 (2.45 - 5.81) | 3.94 ± 1.48 (1.64 - 7.60) |
| CHOL (mg/dL) | 71.8 ± 16.8 c (45 - 116) | 75.7 ± 16.0 (50 - 107) | 75.9 ± 15.0 (50 - 101) | 79.1 ± 25.7 (51 - 133) | 72.5 ± 15.3 (52 - 95) | 77.3 ± 20.0 (43 - 115) |
| BUN (mg/dL) | 15.1 ± 2.36 c (12 - 22) | 14.9 ± 1.61 (13 - 18) | 16.3 ± 1.18 (14 - 18) | 15.1 ± 1.55 (13 - 17) | 15.8 ± 2.26 (12 - 20) | 15.3 ± 2.35 (11 - 20) |
| CREA (mg/dL) | 0.369 ± 0.0436 c (0.30 - 0.44) | 0.381 ± 0.0420 (0.33 - 0.45) | 0.390 ± 0.0219 (0.36 - 0.44) | 0.365 ± 0.0354 (0.31 - 0.42) | 0.368 ± 0.0467 (0.28 - 0.45) | 0.369 ± 0.0401 (0.30 - 0.43) |
| ALKP (U/L) | 42.2 ± 15.8 c (19 - 79) | 37.8 ± 8.82 (27 - 55) | 43.6 ± 13.1 (27 - 72) | 36.4 ± 12.9 (21 - 70) | 40.0 ± 11.6 (20 - 56) | 37.7 ± 10.2 (19 - 58) |

a As identified by U.S. EPA, 2009 (ref 62); OECD, 2012 (ref 63); Rhomberg et al., 2007 (ref 64); U.S. EPA, 2002 (ref 65), 2003 (ref 66).
b Data from current study.
c n = 15.


Table 3. Comparative statistics of selected biological parameters a in male and female rats b

Males

| Parameter | control (n = 16) LS-Means | test (40 %) (n = 16) LS-Means | Difference (95 % CI) | test (20 %) (n = 16) LS-Means | Difference (95 % CI) |
|---|---|---|---|---|---|
| Cumulative body weight gain (g) | 303 | 327 | 23.3 (-12.1, 58.7) | 321 | 17.6 (-17.8, 52.9) |
| Final body weight (non-fasted, g) | 541 | 563 | 22.4 (-14.8, 59.5) | 560 | 19.6 (-17.5, 56.7) |
| Liver weight (absolute, g) | 12.9 | 13.5 | 0.503 (-1.12, 2.13) | 13.2 | 0.276 (-1.35, 1.90) |
| Liver weight (% body weight) | 2.51 | 2.49 | -0.0199 (-0.192, 0.152) | 2.49 | -0.0191 (-0.191, 0.153) |
| Kidney weight (absolute, g) | 3.09 | 3.20 | 0.108 (-0.152, 0.368) | 3.18 | 0.0916 (-0.168, 0.351) |
| Kidney weight (% body weight) | 0.603 | 0.599 | -0.00413 (-0.0433, 0.0351) | 0.603 | -0.000688 (-0.0399, 0.0385) |
| WBC (×10³/μL) | 8.88 | 9.72 | 0.840 (-0.173, 1.85) | 8.85 | -0.0275 (-1.04, 0.986) |
| ALYM (×10³/μL) | 6.90 | 7.75 | 0.855 (-0.0235, 1.73) | 6.99 | 0.0869 (-0.792, 0.965) |
| CHOL (mg/dL) | 60.1 | 55.6 | -4.44 (-11.0, 2.08) | 57.8 | -2.25 (-8.77, 4.27) |
| BUN (mg/dL) | 13.3 | 13.3 | -0.00615 (-0.110, 0.0978) | 13.1 | -0.0217 (-0.126, 0.0823) |
| CREA (mg/dL) | 0.334 | 0.348 | 0.0144 (-0.0130, 0.0418) | 0.331 | -0.00313 (-0.0305, 0.0243) |
| ALKP (U/L) | 75.8 | 81.9 | 6.13 (-6.61, 18.9) | 76.1 | 0.375 (-12.4, 13.1) |

Females

| Parameter | Sex x Treatment interaction P-value | control (n = 16) LS-Means | test (40 %) (n = 16) LS-Means | Difference (95 % CI) | test (20 %) (n = 16) LS-Means | Difference (95 % CI) |
|---|---|---|---|---|---|---|
| Cumulative body weight gain (g) | 0.721 | 119 | 131 | 12.1 (-5.35, 29.5) | 123 | 3.36 (-14.1, 20.8) |
| Final body weight (non-fasted, g) | 0.712 | 294 | 310 | 15.7 (-3.94, 35.4) | 297 | 3.41 (-16.3, 23.1) |
| Liver weight (absolute, g) | 0.937 | 7.18 c | 7.40 | 0.214 (-0.351, 0.778) | 7.32 | 0.139 (-0.426, 0.703) |
| Liver weight (% body weight) | 0.825 | 2.60 c | 2.57 | -0.0328 (-0.170, 0.104) | 2.63 | 0.0280 (-0.109, 0.165) |
| Kidney weight (absolute, g) | 0.713 | 1.89 c | 1.89 | -0.00549 (-0.145, 0.134) | 1.92 | 0.0238 (-0.116, 0.163) |
| Kidney weight (% body weight) | 0.397 | 0.687 c | 0.657 | -0.0305 (-0.0629, 0.00202) | 0.690 | 0.00329 (-0.0292, 0.0358) |
| WBC (×10³/μL) | 0.356 | 4.85 c | 4.89 | 0.0382 (-0.640, 0.716) | 4.62 | -0.235 (-0.913, 0.443) |
| ALYM (×10³/μL) | 0.346 | 3.79 c | 3.92 | 0.123 (-0.481, 0.726) | 3.61 | -0.190 (-0.793, 0.414) |
| CHOL (mg/dL) | 0.196 | 71.8 c | 79.6 | 7.79 (-5.04, 20.6) | 78.4 | 6.67 (-6.16, 19.5) |
| BUN (mg/dL) | 0.981 | 15.0 c | 14.8 | -0.0143 (-0.0959, 0.0674) | 14.5 | -0.0336 (-0.115, 0.0481) |
| CREA (mg/dL) | 0.796 | 0.371 c | 0.387 | 0.0157 (-0.0124, 0.0438) | 0.379 | 0.00820 (-0.0199, 0.0363) |
| ALKP (U/L) | 0.469 | 42.1 c | 39.5 | -2.62 (-11.2, 5.97) | 40.5 | -1.62 (-10.2, 6.97) |

Combined-gender

| Parameter | control (n = 32) LS-Means | test (40 %) (n = 32) LS-Means | Difference (95 % CI) | test (20 %) (n = 32) LS-Means | Difference (95 % CI) |
|---|---|---|---|---|---|
| Cumulative body weight gain (g) | 211 | 229 | 17.7 (-1.46, 36.8) | 222 | 10.5 (-8.69, 29.6) |
| Final body weight (non-fasted, g) | 417 | 436 | 19.0 (-1.31, 39.4) | 429 | 11.5 (-8.84, 31.9) |
| Liver weight (absolute, g) | 10.1 d | 10.4 | 0.358 (-0.487, 1.20) | 10.3 | 0.207 (-0.638, 1.05) |
| Liver weight (% body weight) | 2.56 d | 2.53 | -0.0263 (-0.132, 0.0789) | 2.56 | 0.00444 (-0.101, 0.110) |
| Kidney weight (absolute, g) | 2.49 d | 2.54 | 0.0512 (-0.0917, 0.194) | 2.55 | 0.0577 (-0.0852, 0.201) |
| Kidney weight (% body weight) | 0.645 d | 0.628 | -0.0173 (-0.0418, 0.00723) | 0.646 | 0.00130 (-0.0232, 0.0258) |
| WBC (×10³/μL) | 6.87 d | 7.31 | 0.439 (-0.147, 1.03) | 6.74 | -0.131 (-0.718, 0.455) |
| ALYM (×10³/μL) | 5.35 d | 5.84 | 0.489 (-0.0234, 1.00) | 5.30 | -0.0515 (-0.564, 0.461) |
| CHOL (mg/dL) | 65.9 d | 67.6 | 1.68 (-5.31, 8.66) | 68.1 | 2.21 (-4.77, 9.19) |
| BUN (mg/dL) | 14.1 d | 14.0 | -0.0102 (-0.0739, 0.0534) | 13.8 | -0.0276 (-0.0913, 0.0360) |
| CREA (mg/dL) | 0.352 d | 0.368 | 0.0150 (-0.00371, 0.0338) | 0.355 | 0.00254 (-0.0162, 0.0213) |
| ALKP (U/L) | 58.9 d | 60.7 | 1.75 (-5.73, 9.24) | 58.3 | -0.621 (-8.11, 6.86) |

a As identified by U.S. EPA, 2009 (ref 62); OECD, 2012 (ref 63); Rhomberg et al., 2007 (ref 64); U.S. EPA, 2002 (ref 65), 2003 (ref 66).
b Data from current study.
c n = 15.
d n = 31.
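As a reading aid for Table 3, an LS-means difference can be expressed as a percentage of the control LS-mean and set against the targeted effect size for that parameter, and its 95 % CI can be checked for whether it includes zero. The sketch below is plain arithmetic on values reported in Table 3, not part of the study's analysis. The helper names `relative_change_pct` and `ci_includes_zero` are hypothetical, and the comparison is only meaningful for parameters analyzed on the original scale.

```python
def relative_change_pct(difference, control_mean):
    """Difference from control expressed as a percentage of the control mean."""
    return 100.0 * difference / control_mean


def ci_includes_zero(lower, upper):
    """True when a confidence interval for a difference spans zero."""
    return lower <= 0.0 <= upper


if __name__ == "__main__":
    # Male cumulative body weight gain, test (40 %) vs control (Table 3)
    control_lsmean = 303
    difference = 23.3
    ci_lower, ci_upper = -12.1, 58.7

    change = relative_change_pct(difference, control_lsmean)
    print(f"Relative change vs control: {change:.1f} %")
    print(f"95 % CI includes zero: {ci_includes_zero(ci_lower, ci_upper)}")
```

For this example the interval spans zero, and the observed increase is smaller in magnitude than the 10 % change targeted for body weight gain in Table 1.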


Table 4. Detectable effect size (% change relative to control) for all parameters via retrospective power analysis

Effect size

[Column headers and the label of the first surviving row are not recoverable from the source; values are listed in their original order.]

| Parameter | Power values |
|---|---|
| (label not recovered) | 99 %, > 99 %, > 99 %, 99 %, > 99 %, > 99 %, > 99 % |
| Kidney weight (absolute) | 94 %, > 99 %, > 99 %, > 99 %, 98 %, > 99 %, > 99 %, > 99 % |
| Kidney weight (% body weight) | 95 %, > 99 %, > 99 %, > 99 %, 99 %, > 99 %, > 99 %, > 99 % |
| Leukocyte count (absolute) | 78 %, 82 %, 88 %, 82 %, 90 %, 92 %, 95 %, 92 % |
| Lymphocyte count (absolute) | 72 %, 72 %, 83 %, 80 %, 85 %, 85 %, 92 %, 90 % |

a Parameters that achieved the same power for both sexes under all scenarios were not tabulated. Liver weight (% body weight), cholesterol, blood urea nitrogen, creatinine, and alkaline phosphatase all achieved > 99 % power for both sexes under all scenarios. Data were obtained from non-test animals from three recent studies (completed in 2014 and 2015, including the current study; 14 groups each of male and female rats, 16 animals/sex/group) conducted under the current paired-housing design as well as from three older single-housed studies (completed between 2008 and 2012; 12 groups each of male and female rats, 12 animals/sex/group).


Table 7. Covariance analysis of weekly body weights for male and female rats

| Body weight (measured level = rat) | V_B a, male | V_B a, female | COV_cage a, male | COV_cage a, female | V_ε a, male | V_ε a, female |
|---|---|---|---|---|---|---|
| Day 1 (week 0) | 222 | 168 | 35.6 | 13.3 | 15.0 | 15.6 |
| Day 8 (week 1) | 263 | 133 | 58.4 | 38.6 | 111 | 42.1 |
| Day 15 (week 2) | 306 | 167 | 63.1 | 3.07 | 410 | 93.8 |
| Day 22 (week 3) | 368 | 267 | 15.4 | 22.7 | 848 | 70.8 |
| Day 29 (week 4) | 472 | 292 | -119 | 22.1 | 1280 | 124 |
| Day 36 (week 5) | 479 | 243 | -205 | 50.2 | 1760 | 136 |
| Day 43 (week 6) | 596 | 281 | -247 | 2.97 | 2050 | 190 |
| Day 50 (week 7) | 675 | 283 | -300 | 40.0 | 2500 | 158 |
| Day 57 (week 8) | 618 | 318 | -389 | 22.9 | 2780 | 169 |
| Day 64 (week 9) | 585 | 294 | -457 | 13.6 | 3080 | 228 |
| Day 71 (week 10) | 596 | 283 | -487 | 10.6 | 3520 | 250 |
| Day 78 (week 11) | 606 | 292 | -554 | 21.9 | 3710 | 231 |
| Day 85 (week 12) | 633 | 317 | -551 | -20.8 | 4000 | 261 |
| Day 92 (week 13) | 636 | 229 | -476 | 43.7 | 3970 | 328 |

Data obtained from non-test animals in current study.
a V_B, estimated variance within a block; COV_cage, estimated covariance between two rats from the same cage; V_ε, estimated residual variance.
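To illustrate how the cage-level covariance in Table 7 affects the variability of a cage mean, the sketch below works through the arithmetic under an assumed compound-symmetry structure, in which the variance of a single rat is COV_cage + V_ε and the covariance between two cage mates is COV_cage. This structure, the omission of the block term V_B, and the function names are assumptions made for illustration and are not taken from the study's model specification.

```python
def cage_mean_variance(cov_cage, v_resid):
    """Variance of the mean of two rats sharing a cage.

    Assumes compound symmetry: Var(rat) = cov_cage + v_resid and
    Cov(rat1, rat2) = cov_cage, so Var(mean) = cov_cage + v_resid / 2.
    The block variance is ignored in this sketch.
    """
    return cov_cage + v_resid / 2


def within_cage_correlation(cov_cage, v_resid):
    """Correlation between cage mates under the same assumed structure."""
    return cov_cage / (cov_cage + v_resid)


if __name__ == "__main__":
    # Male body weight values from Table 7: (COV_cage, V_e)
    rows = {"Day 1 (week 0)": (35.6, 15.0), "Day 92 (week 13)": (-476, 3970)}
    for day, (cov, v_e) in rows.items():
        print(
            f"{day}: cage-mean variance = {cage_mean_variance(cov, v_e):.1f}, "
            f"within-cage correlation = {within_cage_correlation(cov, v_e):.3f}"
        )
```

Under these assumptions, a negative COV_cage, as estimated for males from week 4 onward, corresponds to a negative correlation between cage mates and a cage-mean variance below V_ε / 2.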


FIGURES

Figure 1. [Cage rack layout diagram, not reproduced. Panels: Male and Female; racks 1 to 3, each laid out by row and by columns 1 to 6; cages grouped into subsets 1 to 8, with positions labeled 1 to 8 or X.]


Figure 2.


GRAPHIC FOR TABLE OF CONTENTS
