This is an open access article published under a Creative Commons Attribution (CC-BY) License, which permits unrestricted use, distribution and reproduction in any medium, provided the author and source are cited.

Article Cite This: J. Chem. Educ. XXXX, XXX, XXX−XXX

pubs.acs.org/jchemeduc

Measuring and Reducing Chemical Spills by Students: A Randomized Controlled Trial of Providing Feedback

Aimilia M. Tsokou, Alix Howells, and Moray S. Stark*

Department of Chemistry, University of York, Heslington, York YO10 5DD, United Kingdom



ABSTRACT: The ability to handle chemicals safely is a key aspect of the learning development of students studying chemistry; however, there have been no previously reported investigations of the quantity of chemicals spilled by students during lab experiments. Therefore, the first part of this article reports the assessment of the volume of chemicals spilled by year 1 undergraduate chemistry students (n = 64) at a U.K. university during an existing chemical analysis practical designed to develop volumetric handling skills. The experiment was carried out on paper liners, which made the students’ spills visible; the stained areas were converted to volumes using a calibration of spill area against known spilled volumes of liquid. The volume spilled by the student group was ca. 1.2% of that handled; however, the amount spilled by individual students ranged widely, from ca. 0.02% to ca. 10% of the volume handled. A feedback tool has been developed to allow laboratory demonstrators to rapidly quantify chemical spillage by individual students. This tool also provides the demonstrators with a framework to communicate the potential safety significance of the volume of chemical a student has spilled. A randomized controlled trial (RCT) was carried out to examine the effect of providing feedback to students on their chemical spillage during a subsequent experiment. From a cohort of 185 year 1 undergraduate students, 150 consented to be randomized (81%), and data was collected for 144 students (96% of those randomized). The Hodges−Lehmann estimator for the median change in volume spilled during the second experiment due to providing feedback on spillage during the first experiment was a 50% decrease in volume spilled (95% confidence range: 0 to 80% decrease, Mann−Whitney U test p = 0.05). The RCT was a waiting list trial, with all students receiving feedback either during or after the RCT, and with blinded assessment by the demonstrators assessing volume spilled for the RCT.
KEYWORDS: First-Year Undergraduate/General, Second-Year Undergraduate, Upper-Division Undergraduate, Laboratory Instruction, Safety/Hazards, Testing/Assessment, Laboratory Management



INTRODUCTION

Spillage resulting from mishandling of chemicals has the potential to cause harm to people working with chemicals,1−3 and exposure through accidental spillage and subsequent skin contact can cause a wide range of serious acute or chronic adverse effects,4 including, in some cases, life-changing or life-threatening harm.5 Personal protective equipment (PPE) such as gloves that are acceptably impervious to the chemicals being handled6 can and should be used; however, the use of PPE is cited at the lower end of hierarchies of control techniques used to manage risk,7,8 due to limitations of its use. Hence, a preferable approach is to aim at minimizing chemical spillage in the first place.

A key aspect of university chemistry education is that graduating students have demonstrated an ability to handle chemicals safely, and accreditation of chemistry degree courses reflects this in their requirements.9,10 Assessment of laboratory skills can be done in a variety of ways; however, the specific skill of being able to handle chemicals without significant spillage is typically not explicitly considered. Instead, outcomes such as an ability to obtain a synthesis product in a certain yield or purity, or achievement of a concentration analysis of an acceptable accuracy, are commonly used as a measure of skill in the lab.11−16

Two aspects of chemical spillage by students are reported here: (1) the development of a simple technique for assessing chemical spillage by individual students during laboratories, and (2) the design of a tool for demonstrators to allow them to provide formative feedback to the students on the potential consequences of the volume of chemicals they have spilled. Using this measurement technique and feedback tool, a randomized controlled trial (RCT) design was used to assess whether providing feedback to students on the volume of chemicals they have spilled has an effect on the volume they subsequently spill.

Received: March 26, 2019
Revised: July 14, 2019

© XXXX American Chemical Society and Division of Chemical Education, Inc.


DOI: 10.1021/acs.jchemed.9b00262 J. Chem. Educ. XXXX, XXX, XXX−XXX

Journal of Chemical Education




PRELIMINARY INVESTIGATION OF CHEMICAL SPILLAGE BY STUDENTS

Student Concentration Determination Experiment

The year 1 chemistry undergraduate students in this study undertake a spectroscopy experiment in their first term to develop their volumetric handling skills; chemical spillage by the students during this experiment is reported here. The students were each randomly assigned two colored copper(II) compounds from four possibilities: copper(II) sulfate, copper(II) chloride, copper(II) nitrate, or copper(II) acetate. For each of the selected chemicals, they prepared a 50 cm³ aqueous stock solution of 0.1 M concentration (or 0.05 M for copper(II) acetate). The chemicals used have no chronic toxicity listed in their safety data sheets, and they show comparatively low acute toxicity (e.g., Acute toxicity, Oral (Category 4)).17,18 Scanning UV−vis spectrometry was used to determine the wavelength of maximum absorption, λmax, of the solution, and four dilutions with a range of concentrations were then prepared by the student from their stock solution. The absorbance at λmax was determined for each concentration. The molar absorption coefficient, determined from the absorbance vs concentration graphs for their calibration samples, was then used to determine the concentration of unknown samples for each of the two chemicals. In total, the volume of liquid handled by each student throughout the day-long experiment (that could give a color if spilled) was approximately 200 cm³ (the students had some discretion in what volumes and dilutions to use), handling ca. 100 cm³ for each of the copper compounds, with one examined in the morning session and the second in the afternoon. The experiment script for the students is available in the Supporting Information.
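The concentration determination described above follows the Beer−Lambert law, A = ε·c·l. A minimal sketch of that calculation is given here, assuming a 1 cm path length and a straight-line fit through the origin; the calibration values are hypothetical illustrations and are not data from the students' experiment or from the experiment script.

```python
def fit_molar_absorption(concs_molar, absorbances, path_cm=1.0):
    """Estimate the molar absorption coefficient (M^-1 cm^-1).

    Uses a least-squares line through the origin for absorbance versus
    concentration (an assumption of this sketch), divided by the path length.
    """
    numerator = sum(c * a for c, a in zip(concs_molar, absorbances))
    denominator = sum(c * c for c in concs_molar)
    return numerator / denominator / path_cm


def concentration_from_absorbance(absorbance, epsilon, path_cm=1.0):
    """Invert the Beer-Lambert law A = epsilon * c * l for the concentration c."""
    return absorbance / (epsilon * path_cm)


# Hypothetical calibration data: four dilutions (mol/dm^3) and their absorbances.
calib_concs = [0.02, 0.04, 0.06, 0.08]
calib_abs = [0.24, 0.48, 0.72, 0.96]

epsilon = fit_molar_absorption(calib_concs, calib_abs)
unknown_conc = concentration_from_absorbance(0.60, epsilon)
print(f"epsilon = {epsilon:.1f} M^-1 cm^-1, unknown = {unknown_conc:.3f} M")
```

A fit with a free intercept would be an equally reasonable choice if the blank absorbance is not negligible.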

Quantification of Chemical Spillage During Analysis Experiment

It was intended that spill measurements would be performed by laboratory demonstrators on large groups of students during the lab. To avoid additional burden on the demonstrators, the technique had to be straightforward. To achieve this, each student had a sheet of paper (60 cm width, 49 cm depth) designed to protect surfaces from chemical spillage (Benchguard BG-50E extra-absorbent paper) placed in their half of the shared fumehoods and on which they carried out all their volumetric handling (weighing of solids for making up stock solutions was carried out elsewhere on communal balances). The paper used has an absorbent side, and when solutions were spilled onto this in a controlled manner from a comparatively low height (a few centimeters, thought to be representative of many of the spills by the students), the initially localized spill soaked in and spread out over a few seconds to give a visible area of contamination (see Figure 1 for examples). To confirm that all spillages of the copper compounds and dilutions would be visible, the stock solutions and the most dilute solution were spilled in a controlled manner onto the paper sheets using a micropipette. All were visible immediately after the spill, and after 1−2 h.

Figure 1. Examples of controlled volumes (left to right, 0.25−1.00 cm³) of 0.05 M copper acetate solution spilled onto lab paper, showing a well-defined edge.

To measure the area of visible stains produced by student spillage, a 2 × 2 cm² grid on an A3 clear plastic sheet was overlaid, and for each student the number of squares in which any stain appeared was counted (by author A.H.). For 25 students, area measurements were also done using a 1 × 1 cm² grid, with the results highly correlated and directly proportional, with the 1 × 1 cm² grid values being ca. 10% lower than for the 2 × 2 cm² grid. Therefore, to speed up evaluation, a 2 × 2 cm² grid was used with a 10% adjustment. To convert the area of spillage into volume spilled, a range of volumes of a 0.05 M solution of copper acetate were dropped onto the paper in a controlled manner using a micropipette, and the resultant areas of spill measured. The relationship between volume spilled and area measured was close to directly proportional, with a standard error of the gradient on the order of 5%. One limitation of this approach is that overlapping spills would only be counted once; hence, this may underestimate spillage.

The chemical spillage of two cohorts of year 1 undergraduate chemistry students has been examined. For the first cohort of students, the amount of spillage during this lab practical was measured for 64 out of 66 students, which was one-third of the cohort undertaking the concentration determination experiment during the week of this preliminary spills investigation. The total volume of chemical spilled during this practical was approximately 148 cm³, which was 1.2% of the total handled, ca. 1.3 × 10⁴ cm³ (64 students handling ca. 200 cm³ of solution each throughout the day-long experiment). The amount spilled by individual students varied over a wide range, from ca. 0.02% to ca. 10% of the volume handled. The distribution of students’ spillages is shown in Figure 2, which highlights that for ca. 90% of the spillages over an amount approaching 2 orders of magnitude, this logarithmic graph is approximately linear; therefore, the distribution of volumes is approximately exponential for most spillages.

Figure 2. Volumes of solution spilled by 64 students as percentage of total handled and in cm³ (ordered left to right, lowest to highest).
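The conversion from counted grid squares to a spilled volume described in this section can be sketched as follows. The calibration numbers are hypothetical (the article does not quote the fitted gradient), and applying the 10% adjustment as a factor of 0.9 on the 2 × 2 cm grid area is this sketch's reading of the text. The last lines reproduce the cohort percentage from the figures quoted above.

```python
def fit_gradient_through_origin(volumes_cm3, areas_cm2):
    """Least-squares gradient (cm^2 of stain per cm^3 spilled) for a line through the origin."""
    numerator = sum(v * a for v, a in zip(volumes_cm3, areas_cm2))
    denominator = sum(v * v for v in volumes_cm3)
    return numerator / denominator


def spill_volume_cm3(n_squares, gradient_cm2_per_cm3, square_side_cm=2.0, adjustment=0.9):
    """Convert a count of stained grid squares into an estimated spilled volume.

    adjustment=0.9 is an assumed implementation of the "10% adjustment":
    the coarse grid is taken to overestimate stained area by about 10%.
    """
    area_cm2 = n_squares * square_side_cm ** 2 * adjustment
    return area_cm2 / gradient_cm2_per_cm3


# Hypothetical calibration: micropipetted volumes and the stain areas they produced.
calib_volumes = [0.25, 0.50, 0.75, 1.00]  # cm^3
calib_areas = [10.0, 20.0, 30.0, 40.0]    # cm^2 (illustrative values only)

gradient = fit_gradient_through_origin(calib_volumes, calib_areas)
example_volume = spill_volume_cm3(5, gradient)

# Cohort-level figure from the text: 148 cm^3 spilled, 64 students x ca. 200 cm^3 handled.
percent_spilled = 100 * 148 / (64 * 200)
print(f"5 squares -> {example_volume:.2f} cm^3; cohort spillage = {percent_spilled:.1f}%")
```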



FEEDBACK TOOL FOR LAB DEMONSTRATORS ON CHEMICAL SPILLAGE BY STUDENTS

When carried out by demonstrators during a lab practical, the technique using a 2 × 2 cm² grid in the preliminary study for measuring spills was judged to be too time-consuming for larger spills. Therefore, it was slightly adapted to make it faster for demonstrators by having a larger 3 × 3 cm² grid, with an insert of 1.5 × 1.5 cm² for smaller spills. Observations of demonstrators using this larger grid indicate a typical time required for counting the spillage squares as ca. 15−20 s per student.

A simple tool (available in the Supporting Information) was developed to allow demonstrators (all Ph.D. students) to give prompt, in-lab formative feedback to students on the volume of chemicals they have spilled. This included set phrases on the potential consequences of the volume they spilled, with reference to a small number of well-known chemicals (ethanol, ethyl acetate, hexane, and 1 M potassium cyanide). For these chemicals the volumes corresponding to the derived no effect level (DNEL) for long-term dermal exposure were used,19 along with a worst-case scenario assumption that all spilled chemical came in contact with the students’ skin. These chemicals were chosen to give nominal threshold safe spillage volumes covering the range of spillages noted in the preliminary study; the volumes were calculated assuming a 66 kg worker20 and are shown in Table 1 (the DNELs quoted are the mass of chemical in milligrams of dermal exposure per kilogram of body weight per day, for a worker).
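The nominal safe volumes in Table 1 can be reproduced from the quoted DNELs and the 66 kg body mass. The sketch below assumes the conversion from mass to volume uses handbook liquid densities for the solvents and the mass of KCN per cm³ of a 1 M solution; these mass concentrations are assumptions of this sketch, as the article does not state how the conversion was done.

```python
BODY_MASS_KG = 66.0  # worker body mass quoted in the text


def safe_volume_cm3(dnel_mg_per_kg_day, mass_conc_mg_per_cm3, body_mass_kg=BODY_MASS_KG):
    """Volume (cm^3/day) containing the DNEL mass for a person of the given body mass."""
    return dnel_mg_per_kg_day * body_mass_kg / mass_conc_mg_per_cm3


# (DNEL in mg/kg bw/day from Table 1, assumed mass of chemical per cm^3 of liquid in mg)
chemicals = {
    "Ethanol": (343, 789.0),       # assumed density 0.789 g/cm^3
    "Ethyl acetate": (63, 902.0),  # assumed density 0.902 g/cm^3
    "Hexane": (10.3, 655.0),       # assumed density 0.655 g/cm^3
    "KCN 1 M": (0.14, 65.12),      # 1 mol/dm^3 x 65.12 g/mol = 65.12 mg/cm^3
}

safe = {name: safe_volume_cm3(dnel, conc) for name, (dnel, conc) in chemicals.items()}
for name, volume in safe.items():
    print(f"{name}: {volume:.2f} cm^3 per person per day")
```

With these assumed densities the results round to the Table 1 values of 28.7, 4.6, 1.04, and 0.14 cm³.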

Table 1. Derived No Effect Levels, Nominal Safe Long-Term Exposure Limits, and Equivalent Number of 3 × 3 cm² Squares for the Four Selected Threshold Chemicals

Chemical        DNEL(a) for Dermal Exposure    Nominal Safe Limit per Person(b)    Number of 3 × 3 cm² Squares
                /mg/kg bw/Day                  per Day/cm³                         Corresponding to Safe Volume
Ethanol         343                            28.7                                146
Ethyl acetate   63                             4.6                                 27
Hexane          10.3                           1.04                                9
KCN 1 M         0.14                           0.14                                2

(a) For more on DNELs (derived no effect levels), see ref 19.
(b) Assuming an average of a 66 kg person, see ref 20.

Utilizing the thresholds from Table 1, set phrases were developed for the demonstrators to give to the students on the basis of the amount of chemical spilled by students (Table 2). The numbers of squares were slightly rounded to make it easier for demonstrators to rapidly provide feedback to students. This feedback was formative, and the students were informed that measuring spillage was not summatively assessed.21 The demonstrators’ tool also had indicative descriptive grades (A−E) that they could give to the students, to provide a shorthand measure of how well they were handling the chemicals without spillage.

Table 2. Feedback Phrases Given by Demonstrators to Students Based on the Number of Squares Spilled for Each Student

Number of (3 × 3 cm²) Squares with Chemical Spilled
150

Descriptive Grade    Feedback: “If you spilled this volume of chemicals routinely, then you would”:
A                    “be able to handle high hazard chemicals safely, such as 1 M potassium cyanide”
B                    “be able to handle high hazard solvents, such as hexane, safely but not more hazardous chemicals, such as cyanides”
C                    “be able to handle routinely hazardous chemicals safely, such as ethyl acetate, but not more hazardous chemicals, such as hexane or cyanides”
D                    “be able to handle low hazard chemicals safely (such as ethanol), but not more hazardous chemicals”
E                    “not be able to handle even low hazard chemicals safely, such as ethanol”

RANDOMIZED CONTROLLED TRIAL OF EFFECT OF FEEDBACK ON CHEMICAL SPILLAGE

RCT Methodology

The effect of this demonstrators’ spillage feedback tool on the amount of chemicals subsequently spilled by students was investigated using a randomized controlled trial (RCT) with a second cohort of students. As the students undertook the experiment twice, once in the morning and once in the afternoon, they were split into intervention and control groups, with the former receiving feedback after the morning session on the amount of chemicals they spilled and possible consequences of this. The volume of chemical spilled during the afternoon session was measured, and a comparison of the amounts spilled was made for the control (no feedback) and intervention (received feedback) groups.

This was a parallel group randomized controlled trial, with 1:1 allocation to intervention:control. This was also a waiting list randomized controlled trial as feedback was given to all students, just not at the same time (nomenclature derived from its origin in medical RCTs): students allocated to the control group received feedback at the end of the afternoon experiment, to help minimize any potential ethical concerns related to different students receiving a different educational experience.22,23 The sole eligibility criterion for participants was that they were first-year undergraduate chemistry students at the University of York, U.K., who gave consent. The intervention group received feedback on their spillage for the morning experiment before carrying out the second experiment; the control group did not receive feedback prior to the afternoon experiment. The primary outcome of the RCT was the volume spilled during the afternoon experiments. There have been no previous estimates upon which to base a sample size calculation; therefore, the sample comprised those students from the year 1 Chemistry cohort of 185 who consented to take part in this RCT. No interim analyses or stopping criteria were put in place for this RCT.

It is good practice in the design of RCTs to take into account prognostic factors (factors that are known to affect an outcome) which may, by chance, not be equally distributed across groups. For instance, the randomization may be stratified to ensure that the control and intervention groups are similar in respect of people with these factors (e.g., age or gender). However, as there is no prior evidence on which factors can affect the amount of chemicals spilled by individuals, no stratified randomization was attempted.

Allocation of the students to the control and intervention groups was achieved by giving a random number between 0 and 1 to each student who had given consent using the Excel RAND function, then ordering the students by this random number, with the lowest 50% allocated to the control group, and the highest 50% to the intervention group. Using a computer for the randomization process decreases the possibility of bias affecting the outcome.24
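The 1:1 allocation used in this trial (a random number per consenting student, students ordered by that number, lower half to control and upper half to intervention) can be sketched as below. The trial itself used the Excel RAND function; this Python version, the student identifiers, and the seed are illustrative assumptions only.

```python
import random


def allocate(student_ids, seed=None):
    """Split students 1:1 into (control, intervention) by ordering on a random number.

    Each student gets a uniform random number in [0, 1); the lower half of the
    ordered list goes to control and the upper half to intervention.
    """
    rng = random.Random(seed)
    keyed = sorted((rng.random(), sid) for sid in student_ids)
    half = len(keyed) // 2
    control = [sid for _, sid in keyed[:half]]
    intervention = [sid for _, sid in keyed[half:]]
    return control, intervention


# Hypothetical identifiers for the 150 consenting students.
students = [f"S{i:03d}" for i in range(1, 151)]
control, intervention = allocate(students, seed=2018)
print(len(control), len(intervention))
```

Fixing the seed makes the allocation reproducible for audit; leaving it as None gives a fresh allocation on each run.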


Figure 3. CONSORT flow diagram for the transparent reporting of trials (consolidated standards of reporting trials).31

To implement the allocation, a list was given to the demonstrators overseeing the morning experiment with the names of the students assigned to the control and intervention groups. The control/intervention allocation list was not revealed to any students. The generation of the random allocation sequence, enrollment of participants, and assignment of participants to the intervention and control groups were carried out by an author (M.S.S.). Assessment was blinded, as demonstrators overseeing the afternoon experiment were different from the morning demonstrators and they were not given the group allocation list (they were also requested to not ask the morning demonstrators or the students whether they had received feedback during the morning session). Therefore, the demonstrators assessing how much chemical the student spilled in the afternoon did not know who received feedback in the morning.

For the primary outcome, two nonparametric statistical tools were prespecified in the trial protocol (available in the Supporting Information). As the data obtained was non-normally distributed, nonparametric statistical tools were used to measure the difference in volume spilled, with the prespecified summary statistic for the effect size being the Hodges−Lehmann estimator and associated 95% confidence intervals, and the primary inferential statistic being the Mann−Whitney U test with a prespecified significance level of 0.05. The former is a measure of the median difference between the intervention and control groups which returns the median of all possible differences between the control and intervention groups (here, 5092 = 67 × 76) and gives a more reliable measure for non-normal data than a simple comparison of the medians of the two groups.25,26 This approach also allows for confidence intervals to be evaluated,27 while the latter is a two-sided test of significance for independent observations used here to examine whether the effect of providing the feedback is statistically significant.28 The prespecified null hypothesis investigated is that providing feedback has no effect on the volume of chemical subsequently spilled by the students. The Mann−Whitney U test is typically used for ordinal outputs, so it is used here with rankings of volume spilled by the students. No other hypotheses were prespecified or tested during this trial, to avoid the need to adjust the significance level for the problem of multiple comparisons.29 There were no deviations or alterations between the statistical methods proposed in the trial protocol and those implemented in the trial.
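The two prespecified statistics can be illustrated with a short standard-library sketch. It computes the Hodges−Lehmann shift as the median of all pairwise (intervention minus control) differences and the Mann−Whitney U statistic by direct pair counting; it does not compute the p value or confidence interval, which the authors obtained from Excel and SPSS. The spill volumes are hypothetical, and the 0.001 cm³ floor for recorded zeros follows the value quoted in the results.

```python
import math
from statistics import median

FLOOR_CM3 = 0.001  # substituted for recorded zero spills so that a logarithm exists


def log10_volume(volume_cm3):
    """log10 of a spill volume, with zero readings replaced by the floor value."""
    return math.log10(max(volume_cm3, FLOOR_CM3))


def hodges_lehmann_shift(intervention, control):
    """Median of all pairwise differences (intervention value minus control value)."""
    return median(x - y for x in intervention for y in control)


def mann_whitney_u(intervention, control):
    """U statistic for the intervention group: pairs where it is larger, ties count 0.5."""
    u = 0.0
    for x in intervention:
        for y in control:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# Hypothetical afternoon spill volumes in cm^3 (not the trial data).
control_cm3 = [0.02, 0.09, 0.40, 1.50, 0.00]
intervention_cm3 = [0.01, 0.045, 0.20, 0.75]

log_control = [log10_volume(v) for v in control_cm3]
log_intervention = [log10_volume(v) for v in intervention_cm3]

shift = hodges_lehmann_shift(log_intervention, log_control)
u_stat = mann_whitney_u(log_intervention, log_control)
n_pairs = len(log_intervention) * len(log_control)
print(f"log10 shift = {shift:.3f}, U = {u_stat}, pairs = {n_pairs}")
```

For the trial's group sizes the same pairing rule gives 67 × 76 = 5092 differences.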

Results of RCT of Effect of Feedback on Chemical Spillage

Of the 185 students in the year 1 cohort, 150 consented to take part in this trial (81% of cohort), with 75 allocated to each of the control and intervention groups. Five students did not attend the practical, while five were provided an intervention contrary to that indicated by randomization (two received feedback when they were allocated to the control group, while three allocated to the intervention group did not receive feedback). The consent form for student participation in the randomized trial was circulated from the 24th of September 2018 until the 12th of October 2018. The trial ran from the 22nd of October until the 9th of November 2018. The average (mean) A-level tariff of the students was 177 (approximately 3 A’s at A-level), and the male/female ratio was 55/45;30 the average age of students was 18−19 years old. These data are not available separately for the control and intervention groups. There were 67 students who received the intervention, i.e., feedback on spillage, and 76 students who did not receive the intervention. This is shown in Figure 3 as a CONSORT (consolidated standards of reporting trials) flow diagram; CONSORT is a set of benchmark guidelines designed to allow for more straightforward replication and subsequent synthesis with future findings into a combined result using a meta-analysis of more than one study on a topic.31 There were 5 absences from the lab, and data was not collected for 2 participants. As prespecified in the trial protocol, results for these individuals were excluded from the study.


All calculations were performed both in Excel and SPSS.32 The statistical analysis was performed using a per protocol approach33 (i.e., analysis was performed on the basis of what the students actually received), and also an intention to treat analysis (i.e., analysis based on which group the student was allocated to).34 However, when the demonstrators were interviewed about the five students who received a different treatment from that allocated by randomization, they indicated that this was not due to any conscious decision by them but due to oversight, and it was thought to be random; therefore, per protocol statistics are given as the primary choice of analysis.

The volumes of chemicals spilled by the control and intervention groups are shown in Figure 4 and are also displayed in Figure 5 with a logarithmic vertical axis. Also included on the secondary vertical axes is the approximate percentage of the volume handled that was spilled (the volume handled was ca. 100 cm³ per student for this afternoon session). For the intervention group the median spillage postintervention was 0.071 cm³; the interquartile range (IQR) was 0.018−0.161 cm³ (range 0.00−4.99 cm³). For the control group the median spillage was 0.089 cm³, IQR 0.018−1.48 cm³ (range 0.00−13.8 cm³). The median is reported due to the skewed nature of the data.

Figure 4. Volumes of solution spilled by 67 students receiving the feedback intervention and the 76 students not receiving this feedback (ordered left to right, lowest to highest).

Figure 5. Volumes of solution, displayed with logarithmic vertical axes, spilled by 67 students receiving the feedback intervention and the 76 students not receiving this feedback (ordered left to right, lowest to highest).

The primary hypothesis testing statistical analysis performed was the Mann−Whitney U test, which gave a p value of 0.05, on the threshold of accepting or rejecting the null hypothesis given the prespecified 95% confidence limits stated in the trial protocol.28 There is no assumption for normality of the data for this test.35

The effect size was estimated using the Hodges−Lehmann estimator, which is a measure of the median difference between all the different possible pairings between control and intervention groups (67 × 76 = 5092 pairings). It does not assume normal distribution for the control and intervention data sets but does assume an asymptotic approach to normal distribution of the differences.36

Two approaches were used in determining the Hodges−Lehmann estimator; in one the analysis was performed using absolute differences of volume spilled between control and intervention groups, while in the second, the analysis was performed using differences of the log10(volume). Both volume and log volume analyses are presented where there is a noticeable difference between the two. It is worth noting that the analysis performed using differences of the log10(volume) spilled is equivalent to the ratio of the amount spilled by the control group over the intervention group, so it is easily expressible as a percentage change in amount spilled by giving the feedback intervention. Furthermore, the log volume data distribution of the differences gives a symmetrical bell-shaped distribution, while the volume data does not, so since near normality of the distribution of the differences is an assumption of the Hodges−Lehmann method, analysis by log volume is cited with preference here.25,37 This is described in more detail in the Supporting Information.

Demonstrators recorded that a small number of students spilled no chemical. By contrast, in the preliminary investigation the volume spilled was measured by an author (A.H.) after the experiment was completed, so the measurement was not under the time pressure the demonstrators were under, and no zero spillages were recorded. Hence, it is possible that spillage measurements carried out by demonstrators may have missed very small, dilute spillages. It is estimated that, in practice, a minimum spill amount observable by the demonstrators given their limited time would be ca. 0.001 cm³, and that value was used instead of 0 to allow a logarithm for this analysis to be obtained. To permit a sensitivity analysis on this lower limit, calculations were redone assuming a minimum value spilled a factor of 10 higher at 0.01 cm³, and it was found that the Hodges−Lehmann estimator and the confidence intervals were unaffected.

The Hodges−Lehmann estimator for the median difference between intervention and control logged volumes is −0.301, which is equivalent to those receiving the intervention spilling approximately half the volume spilled by the control students, a 50% decrease in chemical spillage due to receiving feedback. The 95% confidence interval for the Hodges−Lehmann estimator of the median difference for the logged data was 0 to −0.699, which is equivalent to those receiving the intervention spilling between the same volume spilled by the control students and 0.2 of that spilled by the control students, which alternatively represents a decrease in chemical spillage due to receiving feedback of between zero and a factor of 5.

A sensitivity analysis was also performed using both per protocol analysis and the intention to treat (ITT) method. In ITT analysis, students that were initially allocated to the intervention group but did not receive feedback were still included in the intervention group and vice versa. The ITT Hodges−Lehmann median was −0.301, with an upper limit of 0 and lower limit of −0.602, similar to the per protocol analysis (−0.301, 0, and −0.699, respectively). The p value for the Mann−Whitney test was somewhat different at 0.09 using the ITT approach.

There were no important harms or unintended effects noted in either group for this trial.
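The logged effect sizes quoted in these results convert to ratios and percentage decreases by back-transformation, since a difference of log10 volumes is the log10 of a ratio of volumes. The short sketch below shows the arithmetic; the function names are illustrative.

```python
def ratio_from_log10_shift(shift):
    """Ratio of intervention to control volume implied by a log10 difference."""
    return 10.0 ** shift


def percent_decrease_from_log10_shift(shift):
    """Percentage decrease in volume spilled implied by a log10 difference."""
    return (1.0 - ratio_from_log10_shift(shift)) * 100.0


# Values quoted in the text: per protocol estimate and limits, and the ITT lower limit.
for label, shift in [("estimate", -0.301), ("upper limit", 0.0),
                     ("lower limit (per protocol)", -0.699), ("lower limit (ITT)", -0.602)]:
    print(f"{label}: ratio {ratio_from_log10_shift(shift):.2f}, "
          f"decrease {percent_decrease_from_log10_shift(shift):.0f}%")
```

On this arithmetic, −0.301 corresponds to a ratio of about 0.5, −0.699 to about 0.2 (a factor of 5), and the ITT lower limit of −0.602 to about 0.25.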

DISCUSSION

This randomized controlled trial measured the effect of providing feedback about spillage to undergraduate chemistry students on the amount of chemical subsequently spilled. The result showed a (Hodges−Lehmann) median decrease of 50% in chemical spillage by students receiving feedback. The Mann−Whitney test result is on the threshold of classifying this change as statistically significant, given the measured p value of 0.05 and the 95% confidence limit prespecified in the trial protocol. There has been some discussion in the literature on the relative merit of quoting p values or effect sizes and corresponding confidence intervals, particularly when close to confidence limits for rejecting the null hypothesis; both are quoted here.38−41

Potential limitations of this RCT study include that the calculation of the Hodges−Lehmann estimator in terms of volume or log volume was not prespecified in the trial protocol. As the distribution of differences in volume is not close to normally distributed, while that of the logged data is near to a normal distribution and the latter gives a relative effect size straightforwardly, analysis using logged volumes is given preference here. Another potential limitation is that the randomization of students and the analysis of the data were done by the authors of this paper and not by an independent third party.42 This is due to the fact that only two people were conducting the trial. An attempt was made to mitigate this issue by having a prespecified protocol, and by having the primary analysis and the randomization carried out by different authors (A.M.T. and M.S.S., respectively).

Despite randomized controlled trials being able to reduce or eliminate selection bias,24 some sources of bias might be present, for instance, baseline differences between control and intervention groups.42 However, baseline data were not available for the control and intervention groups, and also there is insufficient knowledge of which factors, if any, may affect chemical handling skills. There is also a possibility that members of the control group could have discussed the feedback with a member of the intervention group before the afternoon experiments, or overheard this feedback being given to someone else. This potential contamination bias could have the effect of causing the observed effect size to be underestimated, but this is difficult to avoid or quantify in the typical lab setting. A per protocol analysis has been adopted in this trial as the preferred method to report; however, it has been observed that per protocol and intention to treat analysis can give different results to an extent, and so results using both methods are reported here.43

These potential limitations notwithstanding, this trial was designed to be a pragmatic RCT and so seeks to be representative of possible implementation elsewhere.44 Nothing significant was altered in the procedure for the practical for this feedback on spillage to be given. The chemicals spilled were measured by demonstrators who were already assigned to oversee this practical. This approach could be applicable to other institutions’ laboratories carrying out undergraduate volumetric analysis experiments, and it fulfills 11 out of 12 eligibility criteria for pragmatic trials (excepting intention to treat analysis).45

This randomized controlled trial is also original for the field as no other RCTs have been conducted on teaching techniques aimed at reducing the amount of chemicals the students spill. A literature review indicates limited RCTs in the field of education research,46 including in chemistry education research on classroom-based teaching (for example, on the effect of peer-led team learning47) and none in the field of teaching methods to improve laboratory practical skills. The closest parallels to this study are in RCTs on other motor skills, for instance, in assessment of different approaches to teaching surgery skills,48 assessment of different approaches to motor skill development in preschool children,49 or school-based driver education for preventing accidents.50

A survey was carried out of those with U.K. university chemical safety roles to ask what percentage reduction in spillage they would consider sufficiently meaningful to make it worthwhile to adapt their lab practicals. Although there was only a low response rate of six respondents, the mean worthwhile reduction reported was 40%, so the reduction in chemical spillage of 50% found in this trial may be judged of sufficient size to be worthwhile for implementation of this technique. The RCT result does bear some uncertainty, as the confidence intervals range from 0 to 80%. However, as there were no harms in this trial and only minimal expense was needed (the cost of the paper, ca. 1 $/£/€, and ca. 1 min per student), the benefits may be judged likely to outweigh the uncertainty for this trial and could make it applicable to other educational institutions for replication or possibly implementation as part of the undergraduate curricula.



CONCLUSIONS

A technique is reported for measuring the volume of chemical spilled during undergraduate laboratory practical classes and has been used to evaluate for the first time the distribution of spillage among a class of 64 students, showing a very wide range in volumes spilled, ca. 3 orders of magnitude. This measuring technique has been adapted into a reliable tool for laboratory demonstrators, allowing them to rapidly measure spillage during practical classes and provide individual feedback to students on possible consequences of this and future spillages. A randomized controlled trial has been carried out on 150 consenting year 1 undergraduate chemistry students, to evaluate the effects of providing feedback to students on the volume of chemical they spilled by measuring its impact on subsequent spillage of chemical. The (Hodges−Lehmann estimator for the) median change in volume spilled due to providing feedback was a 50% lower volume spilled (95% confidence range: 0 to 80% decrease, Mann−Whitney U test, p = 0.05). As this spills feedback was implemented in a pragmatic manner, using lab demonstrators and volumetric experiments that the students routinely carry out, and as the relative cost per student of providing feedback on spillage is comparatively low (ca. 1 $/£/€ and ca. 1 min per student), and the median reduction in spillage was potentially significant, there is merit in this spills measurement and feedback being evaluated in other institutions, either as an educational tool, or using the waiting list RCT structure described here to provide improved confidence limits on the size of the effect measured here.

This study highlights that although there have been well-discussed issues with carrying out RCTs in the chemistry education field,51 the use of the waiting list RCT design offers a methodology to investigate the effects of educational techniques that is perceived as fairer for the participants while having the benefits of randomization for reducing bias.52 Another issue highlighted in chemical education research is in achieving a sufficient number of subjects for a statistically useful result to be achieved.51 This study demonstrates that, where an educational approach is delivered to the individual, a cohort in a typical degree course at a single university can provide sufficient subjects to allow the detection of effect sizes of a meaningful magnitude, helped here by having a strong sign-up rate when seeking consent of ca. 80%, possibly benefited by using the waiting list trial design. Therefore, this format of RCT could help to address some of the issues of replication of findings, discussed recently in the chemistry education sector,53 following well-documented concerns raised about the reproducibility of reported findings in other fields with human subjects.54 To assist with any potential future studies on this topic, the results from this RCT are presented, as far as possible, using the CONSORT guidelines for reporting RCTs.31

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available on the ACS Publications website at DOI: 10.1021/acs.jchemed.9b00262.

Trial protocol, including details on consent; further details on the distribution of differences resulting from calculating the Hodges−Lehmann estimator for volume and log10(volume) data, for comparison with normal distributions; the experiment script provided to the undergraduate students; and the information given to the lab demonstrators to provide feedback, and also regarding taking part in this RCT (PDF, DOCX)

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

ORCID

Moray S. Stark: 0000-0002-2175-2055

Notes

The authors declare no competing financial interest. This trial was not registered and was lodged with a colleague at the University of York not otherwise involved in the project in advance of the start of data collection. The trial design was submitted to the University of York Faculty of Science Ethics Committee and approved. Participants agreed to be randomized through a consent form that was completed online prior to the trial. This work was unfunded, except for a small consumables budget provided by the University of York, and the authors declare no conflicts of interest in this work.

ACKNOWLEDGMENTS

The authors thank the year 1 students and graduate demonstrators for agreeing to take part in this trial; Helen Burrell, Liza Binnington, Phil Helliwell, and Scott Hicks for support during these undergraduate lab practicals; David Pugh and Nick Wood for developing and running the analysis experiment used for this spillage investigation and for the student script for the experiment; and Andrea Nelson and Anna Riach for statistical & methodological advice.

REFERENCES

(1) Brückner, S.; Marendaz, J.-L.; Meyer, T. Using very toxic or especially hazardous chemical substances in a research and teaching institution. Safety Science 2016, 88, 1−15.
(2) Mannan, M. S.; O’Connor, T. M.; Keren, N. Patterns and trends in injuries due to chemicals based on OSHA occupational injury and illness statistics. J. Hazard. Mater. 2009, 163, 349−356.
(3) Le, L. M. M.; Reitter, D.; He, S.; Bonle, F.; Launois, A.; Martinez, D.; Prognon, P.; Caudron, E. Safety analysis of occupational exposure of healthcare workers to residual contaminations of cytotoxic drugs using FMECA security approach. Sci. Total Environ. 2017, 599−600, 1939−1944.
(4) Rim, K.-T. Reproductive Toxic Chemicals at Work and Efforts to Protect Workers’ Health: A Literature Review. Saf. Health Work 2017, 8, 143−150.
(5) Nierenberg, D. W.; Nordgren, R. E.; Chang, M. B.; Siegler, R. W.; Blayney, M. B.; Hochberg, F.; Toribara, T. Y.; Cernichiari, E.; Clarkson, T. Delayed cerebellar disease and death after accidental exposure to dimethylmercury. N. Engl. J. Med. 1998, 338, 1672−1676.
(6) Blayney, M. B. The Need for Empirically Derived Permeation Data for Personal Protective Equipment: The Death of Dr. Karen E. Wetterhahn. Appl. Occup. Environ. Hyg. 2001, 16, 233−236.
(7) Manuele, F. A. Risk Assessment & Hierarchies of Control. Professional Safety 2005, 50, 33−39.
(8) Health and Safety Executive. Working with Substances Hazardous to Health: A Brief Guide to Control of Substances Hazardous to Health; HSE: United Kingdom, 2012; p 5.
(9) American Chemical Society. ACS Guidelines and Evaluation Procedures for Bachelor’s Degree Programs; ACS: Washington, DC, 2015; p 13; https://www.acs.org/content/dam/acsorg/about/governance/committees/training/2015-acs-guidelines-for-bachelorsdegree-programs.pdf (accessed Jul 12, 2019).
(10) Royal Society of Chemistry. Accreditation of Degree Programmes; RSC: United Kingdom, pp 9−14; http://www.rsc.org/images/Accreditation%20criteria%202017-%20update%20july%2017_tcm18151306.pdf (accessed Jul 12, 2019).
(11) Hofstein, A.; Lunetta, V. N. The laboratory in science education: Foundations for the twenty-first century. Sci. Educ. 2004, 88, 28−54.
(12) Reid, N.; Shah, I. The role of laboratory work in university chemistry. Chem. Educ. Res. Pract. 2007, 8, 172−185.
(13) McDonnell, C.; O’Connor, C.; Seery, M. K. Developing practical chemistry skills by means of student-driven problem based learning mini-projects. Chem. Educ. Res. Pract. 2007, 8, 130−139.
(14) Meester, M. A. M.; Maskill, R. First-year chemistry practicals at universities in England and Wales: organizational and teaching aspects. Int. J. Sci. Edu. 1995, 17, 705−719.
(15) Kelly, O. C.; Finlayson, O. E. Providing solutions through problem-based learning for the undergraduate 1st year chemistry laboratory. Chem. Educ. Res. Pract. 2007, 8, 347−361.
(16) George-Williams, S. R.; Ziebell, A. L.; Kitson, R. R. A.; Coppo, P.; Thompson, C. D.; Overton, T. L. ‘What do you think the aims of doing a practical chemistry course are?’ A comparison of the views of students and teaching staff across three universities. Chem. Educ. Res. Pract. 2018, 19, 463−473.
(17) Sigma Aldrich SDS Search. https://www.sigmaaldrich.com/united-kingdom.html (accessed Nov 27, 2018).
(18) United Nations. Globally Harmonized System of Classification and Labelling of Chemicals; 2003; http://www.unece.org/trans/danger/publi/ghs/ghs_rev00/00files_e.html (accessed Jul 12, 2019).
(19) European Chemicals Agency. https://echa.europa.eu (accessed Jul 12, 2019).

(20) Serlachius, A.; Hamer, M.; Wardle, J. Stress and Weight Change in University Students in the United Kingdom. Physiol. Behav. 2007, 92, 548−553.
(21) Wiliam, D.; Black, P. Meanings and Consequences: a basis for distinguishing formative and summative functions of assessment? British Educational Research Journal 1996, 22, 537−548.
(22) Miller, F.; Brody, H. What makes placebo-controlled trials unethical? American Journal of Bioethics 2002, 2, 3−9.
(23) Resnik, D. Environmental Health Research Involving Human Subjects: Ethical Issues. Journal of Environmental Health 2008, 70, 28−30.
(24) Torgerson, C. J.; Torgerson, D. J. Avoiding Bias in Randomised Controlled Trials in Educational Research. British Journal of Educational Studies 2003, 51, 36−45.
(25) Rosenkranz, G. A note on the Hodges-Lehmann estimator. Pharmaceutical Statistics 2009, 9, 162−167.
(26) Hodges, J.; Lehmann, E. Estimates of Location Based on Rank Tests. Ann. Math. Stat. 1963, 34, 598−611.
(27) Bonett, D. G.; Price, R. M. Statistical inference for a linear function of medians: Confidence intervals, hypothesis testing, and sample size requirements. Psychological methods 2002, 7 (3), 370−383.
(28) Mann, H.; Whitney, D. On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other. Ann. Math. Stat. 1947, 18, 50−60.
(29) Smith, G. D.; Ebrahim, S. Data dredging, bias, or confounding: they can all get you into the BMJ and the Friday papers. British Medical Journal 2002, 325, 1437−1438.
(30) University League Tables. The Guardian; 2019; https://www.theguardian.com/education/ng-interactive/2018/may/29/universityleague-tables-2019 (accessed Jul 12, 2019).
(31) Schulz, K. F.; Altman, D. G.; Moher, D. and the CONSORT Group. CONSORT 2010 Statement: updated guidelines for reporting parallel group randomised trials. Trials 2010, 11 (32), 1−8.
(32) IBM SPSS Software; https://www.ibm.com/uk-en/analytics/spss-statistics-software (accessed Jul 12, 2019).
(33) Porta, N.; Bonet, C.; Cobo, E. Discordance between reported intention-to-treat and per protocol analyses. Journal of Clinical Epidemiology 2007, 60, 663−669.
(34) Gupta, S. K. Intention-to-treat concept: A review. Perspectives in clinical research 2011, 2, 109−112.
(35) Wayne, W. D. Applied Nonparametric Statistics, 2nd ed.; Houghton Mifflin Company: Boston, 1978; p 82.
(36) Wolfe, H. Nonparametric Statistical Methods, 1st ed.; Wiley: New York, 1973; p 34.
(37) Nakagawa, S.; Cuthill, I. C. Effect size, confidence interval and statistical significance: a practical guide for biologists. Biological Reviews 2007, 82, 591−605.
(38) Sterne, J. A. C.; Smith, G. D. Sifting the evidence - what’s wrong with significance tests? Physical Therapy 2001, 81, 1464−1469.
(39) Gardner, M. J.; Altman, D. G. Confidence intervals rather than P values: estimation rather than hypothesis testing. Br Med. J. (Clin Res. Ed) 1986, 292, 746−750.
(40) Altman, D. G.; Bland, J. M. Uncertainty beyond sampling error. British Medical Journal 2014, 349, g7064−g7064.
(41) Altman, D. G.; Bland, J. M. Absence of evidence is not evidence of absence. British Medical Journal 1995, 311, 485−485.
(42) Lewis, S.; Warlow, C. How to spot bias and other potential problems in randomised controlled trials. J. Neurol., Neurosurg. Psychiatry 2004, 75, 181−187.
(43) Spieth, P. M.; Kubasch, A. S.; Penzlin, A. I.; Illigens, B. M.; Barlinn, K.; Siepmann, T. Randomized controlled trials - a matter of design. Neuropsychiatr. Dis. Treat. 2016, 12, 1341−1349.
(44) Torgerson, C. J. Randomised controlled trials in education research: a case study of an individually randomised pragmatic trial. Education 2009, 37 (4), 313−321.
(45) Patsopoulos, N. A. A pragmatic view on pragmatic trials. Dialogues in Clinical Neuroscience 2011, 13, 217−224.
(46) Torgerson, D. J.; Torgerson, C. J. The Need for Randomised Controlled Trials in Educational Research. British Journal of Educational Studies 2001, 49, 316−328.
(47) Chan, J. Y. K.; Bauer, C. F. Effect of Peer-Led Team Learning (PLTL) on Student Achievement, Attitude, and Self-Concept in College General Chemistry in Randomized and Quasi Experimental Designs. J. Res. Sci. Teach. 2015, 52 (3), 319−346.
(48) Moulton, C. A.; Dubrowski, A.; MacRae, H.; Graham, B.; Grober, E.; Reznick, R. Teaching surgical skills: what kind of practice makes perfect?: a randomized, controlled trial. Annals of Surgery 2006, 244, 400−409.
(49) Hestbaek, L.; Andersen, S.; Skovgaard, T.; Olesen, L.; Elmose, M.; Bleses, D.; Andersen, S.; Lauridsen, H. Influence of motor skills training on children’s development evaluated in the Motor skills in PreSchool (MiPS) study-DK: study protocol for a randomized controlled trial, nested in a cohort study. Trials 2017, 18 (400), 1−11.
(50) Roberts, I. G.; Kwan, I. School-based driver education for the prevention of traffic crashes. Cochrane Database of Systematic Reviews 2001, 2−4.
(51) Cooper, M. M.; Stowe, R. L. Chemistry Education Research - From Personal Empiricism to Evidence, Theory, and Informed Practice. Chem. Rev. 2018, 118, 6053−6087.
(52) Ronaldson, S.; Adamson, J.; Dyson, L.; Torgerson, D. Waiting list Randomized controlled trial within a case-finding design: Methodological considerations. Journal of Evaluation in Clinical Practice 2014, 20, 601−605.
(53) Cooper, M. M. The Replication Crisis and Chemistry Education Research. J. Chem. Educ. 2018, 95 (1), 1−2.
(54) Ioannidis, J. P. A. Why Most Published Research Findings Are False. PLoS Med. 2005, 2 (8), No. e124.
