Chapter 3

Crossing Boundaries in Electronic Learning:


Combining Fragmented Test Data for a New Perspective on Students’ Learning

Sebastian Hedtrich and Nicole Graulich*

Institute of Chemistry Education, Justus-Liebig-University Gießen, Heinrich-Buff-Ring 17, 35392 Gießen, Germany
*E-mail: [email protected].

Blended-learning has become a well-known and widely used scaffolding approach to support students’ learning processes. It offers the opportunity to profit from best practices of both face-to-face instruction and computer-based learning. Nevertheless, teachers mostly use the computer-based part as a black box: “test passed” or “test not (yet) passed” is generally the only information they receive. Hence, the influence of the computer-based learning on the face-to-face lessons is often reduced to simple access control or a digitally assisted homework check. Our idea for extending the benefit of both learning worlds is a software solution that offers a more detailed insight into students’ learning during the computer-based part. With it, teachers can provide face-to-face lessons that better meet the needs of individual students, and new types of automatically delivered feedback become possible.

Blended-learning has become a widespread technique to support and supplement traditional face-to-face instruction in higher education during the last few years, and it is gaining increasing influence in the secondary sector as well. Students’ preparation for laboratory classes is improved by offering electronic supporting material, especially video demonstrations, via a Learning



Management System (LMS) (1). Flipped classroom approaches, as one special form of blended-learning, have demonstrated how to improve students’ learning in chemistry lectures (2), as well as their preparation for laboratory courses (3). In a traditional blended-learning scenario, students work actively with the electronic learning material between the face-to-face lessons (Fig. 1). The computer-based part is used to support students in preparing for the upcoming instruction or in revising the topics of the past one. Beyond that, there is usually no strong connection between the two parts. As a result, the function of the computer-based part is often reduced to a tool that helps students prepare and lets teachers check whether the homework has been done correctly.

Figure 1. Blended-learning occurs in two more or less distinct worlds.

Students spend a huge amount of time using the LMS; face-to-face lessons and digital learning at home are sometimes equally time-consuming. During their time in the LMS, students leave extensive data trails. Considering this large data pool, the information delivered back to students and teachers by the LMS is astonishingly small. Students only receive help on the level of single, isolated tasks, and teachers can supervise the homework (Fig. 2). These possibilities for supervision are limited to the percentage of points each student earned in the whole assignment. There are, for instance, no options in commonly used LMSs, such as ILIAS, Moodle, or openOLAT, to monitor how students’ strengths and weaknesses develop.


Figure 2. Different views of assessments in blended-learning.

Therefore, learning in blended-learning scenarios can best be described as learning in two separate worlds. Every approach to maximize the benefit of blended-learning should aim at making connections between both worlds, as this offers the greatest potential for improving learning. Although students can roughly estimate their current performance in an assessment, they often fail to realize how much still needs to be done to master the learning objective or to acquire a competency in a satisfactory manner (4). This shortcoming can be addressed by bringing the learning world of the LMS closer to the face-to-face lessons. Our idea was to create a software solution that establishes stronger connections between both worlds: software that offers educators a deeper insight into the learning processes taking place in the LMS. In this way, face-to-face lessons can be designed around individual students’ needs. Additionally, new automatic information systems become possible, such as a feedback system that informs students directly about their strengths and weaknesses.

Try To Make Connections

Big Data in Education – Educational Data Mining – Learning Analytics

Students generate a huge data pool while they are working within a blended-learning scenario. One idea for handling this vast amount of data is to look at other disciplines in the IT sector that deal with similar amounts of data. Every person surfing the World Wide Web leaves large trails of digital data. Handling these masses of data has become big business, and companies are making money by working with this big data. The term “Educational Data Mining” (EDM) describes the application of big data mining methods in educational contexts (5). Traditional data mining techniques consist of methods and algorithms from statistics, data modeling, and visualization, as well as machine learning. In EDM, those techniques must be revised and supplemented for


educational purposes (6). Techniques from psychometrics, for instance, also play a role in EDM. The main influence of data mining in EDM is in clustering and classifying students into different groups (7). This offers the possibility of detecting critical behavior in the LMS and of informing educators about students who are at risk of failing (8). Within the last few years, EDM has focused strongly on log data from learning management systems (5). Morris et al. derived eight variables from the access logs of LMSs (9); these variables can explain 31 % of the variability in achievement. Given that most educators are not familiar with data mining tools, visualizing the results is an important task in EDM. Good visualizations can help educators understand the learning process itself and can even help people who are not versed in clustering, statistics, and other data mining techniques to obtain information about students’ learning (10).

Another discipline which deals with logged educational data is Learning Analytics (LA). LA tries to find information about learners and the context in which their learning occurs; this information provides an understanding of learning and opportunities for optimization (10). It is not possible to separate the two disciplines completely and to give distinct definitions. Both LA and EDM make extensive use of data mining techniques, and LA additionally uses social network and discourse analysis, among others, to inspect the context of learning at the course or department level (11). Thus, LA takes a wider view of the data material than EDM. Learning Analytics offers diagnostic tools for educators that allow them to improve their teaching or their teaching material. Learning Analytics, for instance, offers help in test construction: abductive machine learning decreases the number of tasks in electronic testing while the accuracy of the exam remains nearly constant (12). Consequently, educators have less work in creating new test items and the students’ workload during testing is reduced, without the test accuracy suffering. By contrast, the predominant aim of EDM is to provide ready-to-use solutions.

Both disciplines depend on data access, especially access to LMS data. This dependence is often a constraint for the implementation of any new educational diagnostic tool. Educators at universities are usually not assigned to the LMS’s administration and, thus, cannot implement new diagnostic software within the LMS. Software that runs on personal computers at home instead of within an LMS is strongly dependent on the data material available. Therefore, educational diagnostic software must take this narrow availability into account.

Ways of “Data Mining” Educators Can Do

Most universities currently use an LMS to offer their students electronic learning material. Educators need access to the data stored in the LMS to use their own diagnostic software. The type of LMS has a strong impact on the availability of additional data access. Two types of LMS can be found: locally and remotely hosted.


“Locally hosted” means that the LMS runs on servers under university control. The university can decide which software is installed and has complete access to all data. “Remotely hosted” means that the university rents the LMS as a web service from a specialized commercial provider. The LMS is ready to use, and maintenance is covered; conversely, access to the stored data is limited to the products that are available for sale. In the remotely hosted case, the question of obtaining additional logged data to run a diagnostic instrument is easily answered: if the LMS’s provider offers special diagnostic instruments or further access to more stored data, it is possible to buy them; if not, there is no way to change the LMS or to gather more data than those already collected. In the locally hosted case, the university owns the LMS’s data completely, so reuse of the stored data is easily possible. Furthermore, most common LMSs are open source software and can, thus, be modified easily. Hence, in this case, there are options to implement new diagnostic tools directly within the LMS. However, the LMS’s data is stored primarily for operating purposes. Information about the performance in different assessments is spread over the whole database and organized mainly to guarantee a fast and stably running LMS. Consequently, it is more than helpful to reorganize the stored data before any diagnostic reuse (13). This requires a great deal of laborious work with the database before any extraction of new information can start, and educators can hardly shoulder this work alone. Additionally, access to the raw database is restricted for ethical and privacy reasons. To sum up, educators realistically cannot expect to receive additional data access in either remotely or locally hosted systems.

Consequently, software tools that are intended to help educators benefit from students’ LMS data in their lessons should rely on data that is already accessible to them. In their role as educators, they must be able to supervise their students’ learning. For this reason, almost every LMS provides control opportunities for teachers; for example, the LMS presents pages informing educators about students’ learning progression. Software can grab the information on those pages in two ways. The data material can be collected directly by reading the content of the specific pages. In this way, all the information the educator can see is transferred to the software. Unfortunately, this way of data collection is strongly dependent on the LMS used, and reading the data like this requires changes with almost every new LMS version. In contrast to this direct way of data mining, there are also ways of indirect data collection within the LMS. Almost every LMS offers export functionalities. The export features are usually intended to provide backups for the teacher’s records, but the export files can also be used for data mining purposes. They are an enormously useful data basis that is well-structured and, thereby, ready to use. In contrast to the educator’s information pages within the LMS, the export files’ structure does not usually change at all, even if there are major changes in the LMS. That is why export files are reliable data sources, but they are less detailed than the educator’s direct view into the LMS.
For example, answers in assessment export files are reduced to single scores; consequently, the information about which answer was actually given is lost.


For this reason, software that intends to offer diagnostic support for teachers should be able to use different ways of data import and different types of data material. The LMS’s export files constitute the basic data source and the basic functionality, because they can be utilized easily and there is a high chance that they can still be processed after version changes of the LMS. In addition, direct data import should be realized to gather more detailed data. The direct import feature can generate a broader data basis, while the basic import of export files guarantees that the software remains operational even when the LMS installation has changed. These two methods are the kind of data mining that educators themselves can perform.
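To make the export-file route concrete, the following minimal Python sketch reads a hypothetical CSV export of test results into a per-student, per-task score table. The column names (and the CSV format itself) are assumptions for illustration; real export formats differ between ILIAS, Moodle, and openOLAT.

```python
import csv
from collections import defaultdict

def read_export(path):
    """Read a hypothetical LMS export file (CSV) into a nested dict:
    scores[student][task] = earned score. Real export formats differ
    between LMSs, so the column names used here are only placeholders."""
    scores = defaultdict(dict)
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            student = row["student_id"]                     # assumed column name
            task = row["task_id"]                           # assumed column name
            scores[student][task] = float(row["score"])     # assumed column name
    return scores

# Usage (hypothetical file name):
# scores = read_export("test_results_export.csv")
```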

The Learning Management System Analysis Kit

“Personal Data Mining” – Data Generation within the LMS

Our idea was to offer educators possibilities to get deeper insights into their students’ learning progress and to bring them closer to their students’ learning. The only way for us to realize this idea was to develop our own software tool that requires only the preceding ways of data mining that educators can perform. Consequently, we started to work on the LMS Analysis Kit (LMSA Kit). This software should help close the gap between both learning worlds. It should offer information about the learning progress like other diagnostic instruments from LA or EDM, but without the pitfalls created by the need for non-accessible data.

The LMSA Kit uses all available possibilities for data import from the LMS and provides features to analyze the data material for educational purposes. A basic connection to the LMS’s data is provided by import functionalities for the different export file formats offered by LMSs. Hence, basic data material, such as scores earned in assessments or single tasks, can be transferred easily to the LMSA Kit. In future versions, an advanced connection to the LMS’s data will be provided by reading the content of the LMS’s pages directly. This will also allow the import of the single answers given, for instance, to recognize the presence of misconceptions or recurring mistakes. The data connections are established via a plug-in system, which makes the software easily scalable and adaptable. A newly published LMS can be included in the LMSA Kit through a new plug-in, and major changes in a supported LMS do not affect the software as a whole; they only affect the specific plug-in. Thus, it is easy to keep the software up-to-date.

Educators are usually interested in the competencies a student has acquired and the learning objectives that have been mastered. However, the diagnostic tools within the LMS do not offer this type of information easily; in most cases, it is almost impossible to retrieve it with the built-in educational diagnostic tools. They are generally intended to offer a view of complete assessments rather than customized combinations of solved tasks (Fig. 2). Apart from that, a competency cannot be measured directly: its acquisition is measured indirectly by solving different tasks – all aiming at this specific competency – across different tests (14).


The LMSA Kit can close this gap. Because it imports all of the LMS’s assessment data, it can inspect competency acquisition across different tasks and different tests. In the same way, it is also possible to get an insight into the mastery of learning objectives. All tasks solved by students are stored in the LMSA Kit. Consequently, the tasks can be viewed independently of the context of a single test and can, therefore, be rearranged into other combinations. New collections of tasks are composed that describe a competency or a learning objective. It is also possible to combine all tasks of a single topic, giving an overview of the learning progress on that topic. Educators are responsible for the theoretical construct behind the collection of tasks and, accordingly, they also define which criterion is measured (Fig. 3). The LMSA Kit does not depend on any specific learning theory, nor does it prefer specific types of criteria.

Figure 3. Combination of tasks describing one competency.

In contrast to the scores the students have earned, the answers given are usually not exported directly by the LMS’s export functions. In further stages of development, the LMSA Kit should also be able to deal with the answers themselves, instead of only the earned scores, using direct import techniques. There is large potential in the answers given, especially the wrong ones, which hitherto cannot be used; they would allow the elucidation of common misconceptions and learning difficulties. One of our main developmental aims is to realize this soon, but the difficulty of collecting these data excludes this feature at this early stage of development.

Recombination of Test Data for Educational Diagnostics

The LMSA Kit offers an opportunity to measure the learning progress in a criterion. All the tasks the students have solved during classes are stored in one single database. The strong connection to the test in which a task was solved is, thus, broken up. Educators can select the tasks that define a competency or a learning objective (Fig. 4). Subsequently, each student’s performance in the collection of tasks can be seen.
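As a minimal illustration of this recombination step (the function and variable names are made up for this sketch and are not the LMSA Kit’s internals), a compilation can be represented as a list of task IDs drawn from any test, and a student’s raw progress in the criterion as the mean normalized score over those tasks:

```python
def criterion_progress(scores, compilation, max_points):
    """scores: {student: {task: points}} pooled from all imported tests.
    compilation: list of task IDs that, together, describe one criterion.
    max_points: {task: maximum points} used to normalize each task to 0..1.
    Returns {student: mean normalized score over the tasks attempted}."""
    progress = {}
    for student, solved in scores.items():
        normalized = [solved[t] / max_points[t] for t in compilation if t in solved]
        if normalized:                      # skip students without any relevant task
            progress[student] = sum(normalized) / len(normalized)
    return progress
```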


Figure 4. Usage of the LMSA Kit.

Of course, not every combination of tasks is suitable for measuring a criterion. Sometimes there is an unexpected and, therefore, hidden level structure within a competency, or a learning objective can be divided into different sub-learning objectives. To avoid this set of problems, EDM and LA make massive use of machine learning, employing classification algorithms to group identical answer patterns within the whole data material (15). The students are, thus, divided into classes of comparable performance in the criterion. However, these algorithms must learn from existing data, which means that the level of competency acquired, or the progress in learning, is already known for that data. This material is necessary because it allows the algorithms to find the right classifications. A support vector machine or an artificial neural network, for example, is dependent on known data material for learning purposes (16); the more data they assess, the more accurate their future classifications are. Even if these clustering techniques work well in EDM or LA, they are not applicable for use at home as educational diagnostic software for educators. Teachers do not generally have access to a large amount of old data material with which to train machine learning algorithms. Additionally, configuring these algorithms is not easy for educators, who are usually laypeople in machine learning. Another problem is that students do not complete every test in an LMS, and incomplete data is more or less useless for machine learning (16). The algorithms used within software for teachers must cope with this lack of training material. Consequently, machine learning and other classification algorithms cannot be used in the LMSA Kit. One way this can be


solved is to use educators’ professional expertise with the right technical support. On the one hand, educators know which criterion they want their students to reach; on the other hand, they know which criterion a task belongs to. Hence, educators can define collections of tasks that describe a criterion, instead of relying on a machine learning training process. This contrasts with LA, which sees the pedagogical work as lying outside its own domain: the pedagogical work is “coded” into the data material, and educators do pedagogical work while they are working with the information LA offers, but there are no pedagogical decisions within the models of LA at all (17).

However, educators need additional support to combine a set of tasks meaningfully. If the subparts of the learning objective or the competency are too divergent, decreased accuracy and measurement errors, such as false positives, will be the result. Whether the structure of a competency or a learning objective needs to be divided into smaller pieces cannot be seen easily by just comparing the specific properties of the tasks. The LMSA Kit tries to help educators identify such problematic task compilations. Due to the lack of training material, the LMSA Kit cannot mark task compilations for a competency or a learning objective as definitely wrong or as unacceptable regarding accuracy. Nevertheless, it can offer support in identifying problematic task compilations through missing unidimensionality. A high Cronbach’s alpha is a necessary, but not sufficient, condition for unidimensionality: high values of Cronbach’s alpha point toward acceptable task compilations, whereas low values point toward problematic ones. The threshold value is much discussed in the literature, especially in pedagogical contexts with higher learning objectives and more complex competencies; low values of Cronbach’s alpha can be accepted if the compilation of tasks has been carried out conscientiously (18). The sufficient condition for a unidimensional task compilation is that all tasks load onto one factor in a factor analysis. Additionally, if the factor analysis identifies more than one factor, this is a useful hint regarding the groups into which the tasks should be divided. The LMSA Kit offers this analysis as a core feature for identifying problematic task compilations. Furthermore, cross-tables show the inter-item correlations between all tasks of one compilation, so misleading tasks within the compilation can be identified by selecting items with a low overall correlation. Educators, thus, receive as much support as possible while they are defining task collections.

Nevertheless, we hope to improve this process in the future by using old data material to train algorithms. As one result, it could become possible to offer at least reliable intervals of Cronbach’s alpha that identify critical or non-critical task compilations. Another improvement could be to automate the factor analysis so that alternative task compilations are suggested directly; teachers would then only need to verify whether the new compilations are, in fact, different sub-criteria. In other words, we are using our existing data material to train algorithms and to verify their accuracy in order to derive general rules and new algorithms that support educators, as LA does. Thus, educators themselves do not depend on existing old data material.
In contrast to LA, we are using pedagogical decisions

on purpose, to allow our general calculation models to be transferred and applied at home.
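For readers who want the computational core of this check, the short sketch below (plain NumPy, not the LMSA Kit’s own code) computes Cronbach’s alpha and the inter-item correlation matrix for a score matrix whose rows are students and whose columns are the tasks of one compilation; a low alpha or items correlating weakly with the rest flag a problematic compilation.

```python
import numpy as np

def cronbach_alpha(scores):
    """scores: 2-D array, rows = students, columns = tasks of one compilation."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                            # number of tasks
    item_var = scores.var(axis=0, ddof=1).sum()    # sum of per-task variances
    total_var = scores.sum(axis=1).var(ddof=1)     # variance of the total scores
    return k / (k - 1) * (1 - item_var / total_var)

def inter_item_correlations(scores):
    """Pearson correlation matrix between the tasks of one compilation."""
    return np.corrcoef(np.asarray(scores, dtype=float), rowvar=False)

# Example with fabricated scores of 5 students on 3 tasks:
demo = [[1, 2, 1], [3, 3, 2], [0, 1, 0], [2, 2, 2], [3, 2, 3]]
print(cronbach_alpha(demo))
print(inter_item_correlations(demo).round(2))
```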


Online Chess and Students’ Abilities

After the tasks are collected into different compilations, the scores a student earned on all tasks of a compilation are transformed into a single number. This number must represent the progress in the defined criterion. A fundamental assumption in Classical Test Theory (CTT) is that every measurement contains a measurement error, but that this error has an expectation value of zero. Thus, the more often a property is measured, the closer the mean value tends towards the true value. If all tasks in a compilation measure performance in the same criterion, there are enough repetitions of the measurement to obtain a close approximation of the true value.

Unfortunately, this concept has some weak points. It must be ensured that all tasks used to measure the competency acquisition or the mastery of a learning objective really measure the same property. Differences among the tasks can distort the measurement, and such a difference can already result from the varying difficulties of the tasks. For this reason, the compilation of tasks would have to be subdivided into smaller groups of tasks with the same degree of difficulty. However, the resulting subgroups normally consist of only a few tasks, in most cases only a single one, so the measurement error cannot be reduced. Item Response Theory (IRT) can be used to overcome this problem in educational testing. In IRT, the probability that a learner gives the right answer is related not only to his or her ability level, but also to the difficulty of the specific task. In this framework of testing, the difficulties of the tasks and the abilities of the learners are estimated in one single step. Taking this into account, there is no need to subdivide task compilations into groups of tasks with the same difficulty. Conversely, IRT requires complex estimation processes, mostly based on maximum likelihood estimation, which are strongly dependent on complete and well-structured data material; in other words, every student has taken every assessment once and solved every task. This criterion is rarely met in the context of blended-learning. Thus, calculation models based on CTT or IRT cannot be used to estimate students’ performance in a criterion during blended-learning.

The LMSA Kit solves these problems by using matchmaking algorithms to estimate students’ abilities and tasks’ difficulties. Matchmaking is the process of selecting players for a new match in online games. It requires players’ abilities to be estimated validly to avoid boring player constellations in matches (19); the idea is that matches between players with equal abilities are well-balanced and, thereby, motivating. Especially young internet companies that offer web-based online games are interested in well-balanced matches that fascinate players and keep them playing and paying. That is why they are interested in improving this estimation process and in keeping their efforts secret. One exception is Microsoft’s “TrueSkill” rating system (20), which is used in multiplayer games on the Xbox Live platform. Unfortunately, it is well protected by international patents and not free of charge.
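For reference, in the simplest IRT model (the one-parameter, or Rasch, model) the probability that a student with ability θ solves a task of difficulty b is a logistic function of the difference θ − b; the ELO expectation used below has the same logistic shape, only with a different scaling:

$$P(\mathrm{correct} \mid \theta, b) = \frac{e^{\theta - b}}{1 + e^{\theta - b}}$$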


One of the first attempts at matchmaking was made to arrange well-balanced chess matches. The ELO rating algorithm was the first algorithm to estimate the ability of a chess player and is still in use today. The ability of a chess player is expressed by the ELO number; for instance, a chess player must have reached an ELO number higher than 2500 at least once in his or her career to be eligible to become a chess grandmaster (21). The ELO rating algorithm is free to use, and its usage for pedagogical diagnostics is well documented (22). To do so, a student solving a task is regarded as a single match between the student and the task. After enough “matches” have been played, the ability of the student and the “abilities of the opponent tasks,” i.e., the tasks’ difficulties, can be estimated.

We used different matchmaking algorithms while developing the LMSA Kit: the ELO rating algorithm and Glicko V1 and V2, the latter two designed by Mark Glickman to improve the ELO rating algorithm (23). All algorithms were tested with old data material, i.e., scores from final exams. In this way, the matchmaking algorithms could be trained much like the classification algorithms in EDM or LA, because all matchmaking algorithms need some parameters to be set in order to work correctly. The values of these parameters depend on the purpose of the application and the data material used (22), and there are few or no values for them in the literature. Hence, most parameters in the configuration must be determined by testing. Consequently, we performed intensive testing and training of parameter combinations of the algorithms used during the development of the LMSA Kit.

We used the Pearson Product-Moment Correlation Coefficient (PPMCC) to evaluate the accuracy of the estimation process. The PPMCC is used to inspect linear correlations between the two sides of a paired data set; the estimated ability and the ability shown in the same criterion in an examination form such a paired data set. The PPMCC is often used to verify the accuracy of pedagogical classifications (24). Its values vary between -1 and 1, where values near -1 and 1 indicate a linear correlation and values close to 0 indicate its absence. A modified version of the ELO rating algorithm is currently leading the race, but we are still looking for further improvements.

The rating algorithms were tested on two old data sets of undergraduate students taking chemistry as a minor subject. These students must pass a laboratory class with a final exam. During their laboratory class, they are supported by blended-learning and must solve weekly tests aiming at learning objectives and competencies that are tested later in the final exam. The larger data set contains data from 300 students, the smaller one from 120 students. The rating algorithm used within the LMSA Kit shows an average correlation with the exam data of about r = 0.37 for the larger data set and r = 0.27 for the smaller one. For comparison, if two different raters rate the same short answer question, the correlation between the raters is r = 0.59 (24); the best correlation between an automatic scoring system and a human rater is quite close to this value, but with r = 0.52 a little lower (24). The difference between the two groups shows the “learning effect” of the ELO algorithm and that the training is more effective with more people.
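To make the student-versus-task idea concrete, here is a minimal, generic ELO update in Python, followed by a PPMCC check against exam scores. The K-factor, the start rating, and all names are illustrative assumptions; they are not the modified, tuned configuration used in the LMSA Kit.

```python
import numpy as np

def elo_expected(r_student, r_task):
    """Expected score of the student in a 'match' against a task (logistic curve)."""
    return 1.0 / (1.0 + 10 ** ((r_task - r_student) / 400))

def rate_students(attempts, k=32, start=1500):
    """attempts: iterable of (student, task, score) tuples with score in [0, 1],
    processed in chronological order. Returns (student_ratings, task_ratings).
    The K-factor and the start rating are illustrative defaults only."""
    students, tasks = {}, {}
    for student, task, score in attempts:
        rs = students.setdefault(student, start)
        rt = tasks.setdefault(task, start)
        expected = elo_expected(rs, rt)
        students[student] = rs + k * (score - expected)   # student update
        tasks[task] = rt - k * (score - expected)         # task update (zero-sum)
    return students, tasks

def pearson_with_exam(predicted, exam):
    """PPMCC between estimated abilities and exam scores of the same students."""
    common = sorted(set(predicted) & set(exam))
    return np.corrcoef([predicted[s] for s in common],
                       [exam[s] for s in common])[0, 1]
```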


The LMSA Kit does not use matchmaking algorithms as a scoring system for the same tasks that a human rater scores. It estimates the ability in a criterion by using test data from electronic assessments during blended-learning. The educator, as the human rater, tries to estimate the ability in the same criterion, but uses other tasks from the final exam, which usually differ from the digital ones. For this reason, the smaller correlation between the prediction and the final score is hardly surprising. Additionally, the students prepare themselves for the final exam between the assessments in blended-learning and the final exam. Nevertheless, the correlation between prediction and final exam lies within a range that is relevant for educational purposes.

The LMSA Kit offers different matchmaking algorithms to estimate the ability in a criterion. Educators can select and configure them; it is possible to run different algorithms, or the same algorithm with different configurations, so that the varying results are directly comparable in order to improve the specific estimation process and to figure out the right configuration. There are additional features especially for small data sets, such as a “training option,” whereby the data set is reused several times to improve the accuracy of the estimation. After the calculation process, the results are presented as a table (Fig. 5). The columns are the values of the different rating algorithms; the first rows contain the difficulties of the tasks within the task compilation, and the second section of rows contains the students’ abilities. We are currently working on improving the visualization; for example, information displayed as diagrams might be more beneficial. Automatically generated reports can give notice of critical developments in more than one criterion. Educators can identify students at risk at an early stage so that they can take action to support these students in a specific manner. In the same way, weaknesses of the whole course can be revealed and addressed.
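Under the same illustrative assumptions as the ELO sketch above, the “training option” mentioned here can be pictured as simply replaying the attempt list several times before reading off the ratings:

```python
def rate_with_training(attempts, passes=3, k=32):
    """Reuse the same data set several times ('training option' for small data sets).
    The number of passes and the K-factor are illustrative choices, not the
    LMSA Kit's tuned configuration."""
    replayed = list(attempts) * passes      # replay the attempts 'passes' times
    return rate_students(replayed, k=k)
```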

Figure 5. Results of an estimation process.

The LMSA Kit allows the construction of a bridge between both learning worlds. Educators can now see more of their students’ learning when they look at their students through the eyes of the LMSA Kit. The students’ work within the LMS becomes more than just electronically supervised homework. Educators can reveal learning difficulties and identify weakly performing students, and face-to-face lessons can be designed closer to the students’ needs because educators are able to identify those needs.


Automatically Generated Criteria-Based Feedback

Educators can see and supervise their students’ learning progress with the LMSA Kit. However, educators cannot support each student individually all the time; this is nearly impossible, especially in large classes. For this reason, the information generated by the LMSA Kit should also be made available to the students. The students can then at least see their learning progress and may come to a realistic insight into what must still be done for the final exam. This reveals students’ miscalibrations and helps them find a calibration that allows them to estimate the remaining work more realistically (4). Consequently, a more self-regulated learning process becomes feasible (25). On the other hand, this information should contain more than just the scores earned within a compilation of tasks, because feedback with a grading character has been shown to hinder learning (26). Consequently, the feedback must be more formative. Students need information about how close they are to the learning objective in order to estimate the effort required to reach mastery or acquisition (4); further information about the next steps towards mastering the learning objective or acquiring the competency is also necessary. This can be realized by an automatic feedback system.

The LMSA Kit offers the possibility to generate this type of feedback by calculating the students’ abilities in each task compilation and providing text templates for the feedback messages. An editor tool within the LMSA Kit is used to create custom text templates for feedback generation (Fig. 6). Moreover, a coding scheme and a graphical programming interface are implemented in the templates. This allows educators to define their own text templates without any knowledge of programming languages; knowing how to create presentation slides is enough. The educator arranges the text templates and the logical parts of the template on the work area, and the parts that are linked to each other are connected by “digital” wires. Hence, every person who can operate an office application can manage the generation of feedback templates. Less automated feedback systems have already shown that students perceive such template-based feedback as valid feedback (27). Even though the templates do not allow completely individual feedback to be written for each student, the students do not evaluate this feedback as too impersonal; they even feel that this feedback is more objective than grades or other formative resources (28).
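A drastically simplified, hypothetical sketch of such template-based feedback can be written with Python’s string.Template. The real editor in the LMSA Kit is graphical, and the texts and thresholds below are invented, but the underlying idea of slotting estimated progress into prefabricated text blocks is the same:

```python
from string import Template

FEEDBACK = Template(
    "Dear $name,\n"
    "in the criterion '$criterion' you have currently reached about "
    "$percent % of the targeted level.\n"
    "$advice\n"
)

def advice_for(progress):
    """Pick a text block depending on estimated progress (thresholds are made up)."""
    if progress >= 0.8:
        return "You are well on track; keep revising with the weekly tests."
    if progress >= 0.5:
        return "Please rework the example tasks on this topic before the next lab day."
    return "Please contact your instructor and use the additional exercises provided."

def render_feedback(name, criterion, progress):
    return FEEDBACK.substitute(name=name, criterion=criterion,
                               percent=round(100 * progress),
                               advice=advice_for(progress))

# Example: print(render_feedback("Alex", "working safely in the lab", 0.62))
```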


Figure 6. Text template editing tool.

We tested the LMSA Kit in combination with the automatic feedback system with students attending laboratory classes at our institute. The class is supported by blended-learning techniques; the students had to pass electronic tests, which formed the data basis for the LMSA Kit. During the summer semester of 2015, 39 students received a feedback email, and a year later, 41 students got feedback via email. From these cohorts, 19 and 22 students, respectively, participated in the survey (Tables 1 and 2). The feedback dealt with the different topics of the laboratory. Instead of providing feedback based on single laboratory days and the students’ performance on individual electronic tests, it addressed learning objectives and competencies that were interspersed throughout the whole class. Such topics were, for example, correct assembly of the experiments, safety rules and working safely, and the chemical background of the experiments. The feedback text treated all topics in the same structure (Fig. 7). First, it described the topic or the problem and gave supporting examples or exemplary tasks to make the criterion clear to our students. In the second part of such a feedback paragraph, the actual performance and the learning progression in the criterion were presented. Then, the progression aimed for was shown, to allow students to evaluate their learning progress and to plan further steps. In the last part of each section, additional learning opportunities were presented.

Table 1. Results of Feedback Evaluation

                                  2015 (n = 19)    2016 (n = 22)
Accuracy                               2.9              2.2
Transparency                           2.4              1.7
Comprehensibility                      1.9              1.8
Benefit                                1.9              1.8
Benefit in test preparation            3.0              2.7
Request for further feedback           2.0              1.9

Students’ ratings regarding different aspects of the feedback received. 1: highest accordance, 6: highest discordance.

Figure 7. Default feedback template.


The feedback messages were presented via the university’s questionnaire survey system. In this way, the feedback could be presented step-by-step, with every step addressing a different feedback topic. After each part of the feedback, the students were asked to rate their satisfaction with the given information. Additionally, they were asked to rate its benefit and to describe their wishes regarding future feedback. The evaluation of the first year uncovered several weaknesses of our feedback system. One of the main problems was the low rated satisfaction with the perceived accuracy of the estimation. We ran an interview study during the winter semester of 2016 to get a more detailed insight into these problems and into how the feedback was perceived by our students.

Table 2. Overall Accordance with the Feedback

                                        2015               2016               t-test
In total                                -0.5 (p = .008)*    0.3 (p = .030)*    +0.75 (p = .000)
Description of performance progress      0.4 (p = .049)*    0.25 (p = .056)*   +0.64 (p = .006)

Accordance with the feedback in total. The feedback was too bad: -2; the feedback was too good: +2; 0 indicates a proper estimation. * indicates that a t-test for mean = 0 is significant.

Consequently, the feedback is now presented a few days earlier, and it offers more hints for further learning activities. The interview study revealed that the additional learning material did not completely satisfy the students’ needs for further learning opportunities. The biggest change to improve the credibility of the feedback was to implement a final preparation test. This special test has the character of a trial exam and, in contrast to the other electronic tests in the blended-learning material, can only be taken once. Students were thus free to learn with the tests during the course, but they did not appreciate that the feedback messages were based on these tests, which they perceived as “often not seriously done.” Moreover, in addition to learning electronically at home, students also improved their competencies during the lab classes; it is, therefore, not surprising that they did not perceive feedback based on electronic data alone as trustworthy enough to judge their learning progress. The additional trial exam seems to change this feeling in the students’ minds: the trial exam is “seriously done,” and in it they can show what progress they have made.

Thus, only one year later, we could offer improved and more helpful feedback to our students. The results of the evaluation showed that the improved feedback was better accepted by the students and that the perceived accuracy and transparency increased. Consequently, the benefit to our students also increased. Additionally, the attitude towards the delivered feedback changed: the slightly negative attitude was replaced by a moderately positive one. This moderately positive attitude helps students accept the feedback given, and this acceptance is necessary if feedback is intended to change the students’ self-calibration (29).


Conclusion and Outlook

The LMSA Kit is a software solution that allows us to build a bridge between the two learning worlds. It offers educators an insight into the students’ learning within the LMS. The acquisition of a competency or the mastery of a learning objective becomes transparent for the educators. Consequently, the face-to-face lessons can be brought closer to the specific student’s needs. The students’ benefit from working in the LMS is increased because teachers receive more information from their students’ digital homework. The built-in diagnostic support features allow educators, who are usually laypeople in data mining and clustering techniques, to generate applicable educational diagnostic data. They can identify unsuitable combinations of tasks, so that errors in measurement are kept as small as possible. The benefit of EDM and LA thus becomes partially available to educators, and continuing research and development of our software will help us to improve this process further.

Another way to support students in their learning process at home is automatically generated and delivered feedback. The LMSA Kit allows this new type of feedback to be sent to the students. The calculated scores within an acquired competency or a learning objective are not only used to inform educators about their students, but also to inform the students immediately. In combination with a feedback delivery system, students can even profit directly from the work done within the LMS. The additional work for educators is minimal and does not increase with the number of participants; on the other hand, students rate such feedback positively and benefit from it. We are further optimizing the quality of the feedback in our research efforts.

With the LMSA Kit, we have started to create a software tool that allows educators and students to be more aware of the learning progress during blended-learning. Much work must still be done, but the first results make us feel satisfied and confident that we will be able to support educators and students even more in the future.

References

1. Chittleborough, G. D.; Mocerino, M.; Treagust, D. F. J. Chem. Educ. 2007, 84, 884–888.
2. Seery, M. K. Chem. Educ. Res. Pract. 2015, 16, 758–768.
3. Teo, T. W.; Tan, K. Ch. D.; Yan, Y. K.; Teo, Y. Ch.; Yeo, L. W. Chem. Educ. Res. Pract. 2014, 15, 550–567.
4. Hattie, J. J. Learn. Instr. 2013, 24, 62–66.
5. ALMazroui, Y. A. Int. J. Inf. Tech. Comput. Sci. 2013, 7, 8–18.
6. Baker, R. S. J. D.; Kalina, Y. J. Educ. Data Min. 2009, 1, 3–16.
7. Romero, C.; Ventura, S.; Espejo, P. G.; Hervás, C. Mining Algorithms to Classify Students. In Educational Data Mining 2008, International Conference on Educational Data Mining, Montréal, Québec, Canada, June 20−21, 2008; de Baker, R. S. J.; Barnes, T.; Beck, J. E., Eds.; 2008; pp 8−17.
8. Macfadyen, L. P.; Dawson, S. Comput. Educ. 2010, 54, 588–599.
9. Morris, L. V.; Finnegan, C.; Wu, S. Internet Higher Educ. 2005, 8, 221–231.



10. Johnson, L.; Levine, A.; Smith, R.; Stone, S. The 2010 Horizon Report; The New Media Consortium: Austin, TX, 2010; pp 29−32.
11. Long, Ph.; Siemens, G. Educause Rev. 2011, 46, 31–40.
12. El-Alfy, E.; Abdel-Aal, R. E. Comput. Educ. 2008, 51, 1–16.
13. Krüger, A.; Merceron, A.; Wolf, B. A Data Model to Ease Analysis and Mining of Educational Data. In Educational Data Mining 2010, 3rd International Conference on Educational Data Mining, Pittsburgh, PA, June 11−13, 2010; de Baker, R. S. J.; Merceron, A.; Pavlik, P. I., Jr., Eds.; 2010; pp 131−139.
14. Ghanbari, S. A. Competency-Based Learning. In Encyclopedia of the Sciences of Learning; Seel, N. M., Ed.; Springer: New York, 2012; pp 668−671.
15. Nisbet, R.; Elder, J.; Miner, G. Handbook of Statistical Analysis and Data Mining Applications; Academic Press/Elsevier: San Diego, CA, 2009; pp 121−172.
16. Nisbet, R.; Elder, J.; Miner, G. Handbook of Statistical Analysis and Data Mining Applications; Academic Press/Elsevier: San Diego, CA, 2009; pp 235−258.
17. Greller, W.; Drachsler, H. J. Educ. Tech. Soc. 2012, 15, 42–57.
18. Schmitt, N. Psychol. Assess. 1996, 8, 350–353.
19. Coulom, R. Whole-History Rating: A Bayesian Rating System for Players of Time-Varying Strength. In Computers and Games – 6th International Conference, CG 2008, Beijing, China, September 29−October 1, 2008, Proceedings; Herik, H. J., Xu, X., Ma, Z., Winands, M. H. M., Eds.; Springer: Berlin Heidelberg, 2008; pp 113−124.
20. Herbrich, R.; Minka, T.; Graepel, T. TrueSkill: A Bayesian Skill Rating System. In Advances in Neural Information Processing Systems 19, Proceedings of the 2006 Conference, Vancouver, British Columbia, Canada, December 4−7, 2006; Schölkopf, B., Platt, J. C., Hofmann, T., Eds.; 2007; pp 569−576.
21. World Chess Federation. Handbook, FIDE Title Regulations, Article 1.53. https://www.fide.com/fide/handbook.html?id=174&view=article (accessed Nov. 4, 2016).
22. Pelánek, R. Comput. Educ. 2016, 98, 169–179.
23. Glickman, M. E. J. R. Stat. Soc. C 1999, 48, 377–394.
24. Liu, O. L.; Rios, J. A.; Heilman, M.; Gerard, L.; Linn, M. C. J. Res. Sci. Teach. 2016, 53, 215–233.
25. Boekaerts, M. Learn. Instr. 1997, 7, 161–186.
26. Shute, V. J. Rev. Educ. Res. 2008, 78, 153–189.
27. Debuse, J. C. W.; Lawley, M. Br. J. Educ. Technol. 2016, 47, 294–301.
28. Denton, Ph.; Madden, J.; Roberts, M.; Rowe, Ph. Br. J. Educ. Technol. 2008, 39, 486–500.
29. Lundgren, D. C.; Sampson, E. B.; Cahoon, M. B. Psychol. Rep. 1998, 82, 87–93.
