“Current data-based methods of analysis and predictive models are insufficient – big data is able to remedy this.”
– Dr. Alexander Lenk
Pass/fail decisions are mostly based on criteria developed by faculty either before the exam (e.g. Angoff) or after it (e.g. Borderline Regression Analysis). Standard setting is a critical part of educational, licensing and certification testing, but outside the cadre of practitioners this aspect of test development is not well understood. Standard setting is the methodology used to define levels of achievement or proficiency and the cut-scores corresponding to those levels. A cut-score is simply the score that classifies students scoring below it into one level and students scoring at or above it into the next, higher level.
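In code terms, the classification rule above can be sketched as follows (a minimal illustration; the function name is ours, and `cut_score` would come from a standard-setting method such as Angoff or borderline regression):

```python
def classify(score, cut_score):
    """Students scoring below the cut-score fall into the lower level;
    those scoring at or above it fall into the next, higher level."""
    return "pass" if score >= cut_score else "fail"
```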
The Standard Error of Measurement (SEM) indicates the amount of error around an observed score. The observed score, the score we retrieve, store and analyse from an OSCE, is in fact the result of a true score plus error around that true score. If we want a reliable pass/fail decision for a station in, for example, an OSCE, we need to incorporate the SEM in that decision.
The observed score is the true ability (true score) of the student plus the random error around that true score. The error is associated with the reliability, or internal consistency, of the score sheets used in OSCEs. Within our system, Qpercom calculates Cronbach’s alpha as a reliability coefficient indicating how consistently scores are measured, and the Intra-Class Correlation coefficient, which indicates how reliable scores are between the different stations (Silva et al., 2017). These classical psychometric measures of the data can be used to calculate the SEM. An observed score +/- the SEM means that with 68% certainty the ‘true score’ of that station lies between the observed score minus the SEM and the observed score plus the SEM. In principle, one should consider the 95% Confidence Interval, which is the observed score plus or minus 1.96 * SEM (Zimmerman & Williams, 1966).
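The classical test theory relationship described above can be sketched as follows (a minimal sketch with hypothetical numbers; the standard deviation and reliability coefficient would come from the station’s score data):

```python
import math

def sem(sd, reliability):
    # Classical test theory: SEM = SD * sqrt(1 - reliability),
    # where reliability is e.g. Cronbach's alpha for the station.
    return sd * math.sqrt(1.0 - reliability)

def confidence_interval(observed, sd, reliability, z=1.96):
    # z = 1.96 gives the 95% CI; z = 1.0 gives the 68% band
    # described above (observed score +/- one SEM).
    error = z * sem(sd, reliability)
    return observed - error, observed + error
```

For example, a station with a standard deviation of 10 marks and an alpha of 0.84 has an SEM of 4 marks, so an observed score of 60 carries a 95% CI of roughly 52.2 to 67.8.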
Borderline regression analysis (BRA) is an absolute, examinee-centered standard-setting method widely used to standard-set OSCE exams (Yousuf, Violato, & Zuberi, 2015). Candidates are awarded a “global score” for each station in a circuit, based on the examiner’s professional judgment of their ability.
The Borderline Regression Method is illustrated above, with item (checklist) scores on the Y-axis and global ratings on the X-axis: 0 = Fail, 1 = Borderline, 2 = Pass, 3 = Good and 4 = Excellent.
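In the common variant of BRA, checklist scores are regressed on global ratings, and the station cut-score is the predicted checklist score at the ‘Borderline’ rating. A minimal sketch using plain least-squares (the data and function name here are illustrative):

```python
def borderline_regression_cut_score(ratings, scores, borderline=1.0):
    """Fit checklist score = intercept + slope * global rating by
    ordinary least squares, then return the predicted checklist
    score at the 'Borderline' global rating (1 on the 0-4 scale)."""
    n = len(ratings)
    mean_x = sum(ratings) / n
    mean_y = sum(scores) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ratings, scores))
    sxx = sum((x - mean_x) ** 2 for x in ratings)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * borderline
```

With hypothetical data where each step up in global rating adds two checklist marks, the cut-score lands at the checklist score predicted for rating 1.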
For a working example, in Qpercom’s OSCE Management Information System, Observe, three different types of Borderline Regression Analysis are available.
A historically significant paper, I have to say; although not yet cited, it formed the basis of what I have pursued for the last 10 years in the spin-off company, Qpercom. According to the Irish Times, we are “dragging exam assessment out of the dark ages” (Oct. 2016). This overlooked paper forms the basis of what Qpercom has worked to achieve since 2008 with client partners worldwide. A PT clinician by training, I moved into medical education. As clinical researchers, we put a lot of effort into establishing the Smallest Detectable Difference (SDD) that could be detected using a ‘ruler’: measuring maximal mouth opening with a metal ruler is one of the outcome variables in patients with maxillofacial pain. With the newly acquired evidence that a difference in mouth opening of at least 12 mm had to be measured before and after an intervention for it to count as successful in patients with temporomandibular joint disc displacement, I changed jobs and moved into medical/dental education. I was immediately challenged to look into comparable measurements used in oral hygiene training. Probing-depth measurements were used as an example to demonstrate the use of generalisability and decision studies in educational decision-making. Fourteen years after this publication, we are comparing 10 different European universities on the quality-assurance outcomes of their OSCEs. Have a read, use the evidence, and I hope this will help students and staff measure any kind of assessment outcome; plus, this historically significant paper needs citations!
A study was set up to assess the usefulness and acceptability of a method for assessing the professional behaviour of undergraduate dental students.
The first year preclinical course at the Department of Dentistry and Oral Hygiene, University of Groningen, the Netherlands.
Materials and Method
A form with an ordinal scale was developed to assess undergraduate professional behaviour. A standardised assessment procedure was then carried out and subsequently used to give feedback to the students at the end of each of three terms. The students’ self-assessments were then compared to those of the staff.
A great new PhD track was launched with Markus Fischer’s idea to look into situation awareness (SA). The term comes from the aviation industry, where pilots are required to be aware of all critical situations that may occur while flying a plane. More and more evidence is emerging to suggest that SA is also applicable to medicine (particularly emergency medicine and surgery). The question arose, however, whether we can find similarities in OSCE stations that might not be designed to detect SA but which contain item descriptors that could be linked to three different types of SA. Markus Fischer’s first paper provides insight into the literature on situational awareness and OSCEs. A pretty advanced subject that just recently received its first citation; I am sure there are more to come once Dr Fischer’s other papers are published.
This paper was written by two undergraduate medical students and is another good example of student participation in undergraduate medical education research. John and Margaret performed a pilot based on Markus’s initial observation that aspects of Situation Awareness could be measured in OSCEs. The students used the station score-sheets of three freely available OSCE training guides and confirmed what Markus had already proclaimed: all three aspects of SA can be identified in an OSCE score-sheet, even though the sheets were not developed as such. As John and Margaret correctly noted, it was ‘easy to do’, but reliability and validity were still issues, as this type of research had never been done before. It was the step-up for Markus, after his literature review, to design and conduct his PhD track and develop a training programme on how to assess SA using OSCE score-sheets. Unfortunately, these novel findings have proved hard to publish; maybe we are a bit ahead of our time. This paper, however, provides good insight into how important SA is in training and assessing students at an early stage of their curriculum.
In medical education it is extremely helpful to compare outcomes, but comparing communication skills outcomes between students, years of study or institutions is very challenging. If the measurement of particular learning outcomes is not standardised, just as a standardised measuring tape is needed to measure length, you cannot trust the outcome. In this study we attempted to compare communication skills outcomes between groups of students.
Since communication skills assessment forms are not standardised at our School of Medicine within the College of Medicine, Nursing and Health Sciences of the National University of Ireland, Galway, we developed the MAAS-Global proportion (MG-P) as a result of one of our previous studies. If we know how large the MG-P of an assessment form is, we might be able to compare different students, groups of students or years of the curriculum. We therefore introduced the MAAS-Global score, followed by the MAAS-Global proportion and section percentage.
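As a rough sketch of the idea, assuming the MG-P is the share of a form’s marks allocated to MAAS-Global communication-skills items (the function names and inputs here are our illustrative assumptions, not the study’s exact definitions):

```python
def maas_global_proportion(mg_marks, total_marks):
    # Hypothetical helper: fraction of a station form's total marks
    # that map onto MAAS-Global communication-skills items.
    return mg_marks / total_marks

def section_percentage(section_marks, total_marks):
    # Hypothetical helper: one section's share of the form's marks,
    # expressed as a percentage for comparison across forms.
    return 100.0 * section_marks / total_marks
```

Expressing each form’s communication content as a proportion rather than a raw mark count is what would let forms of different lengths be compared across student groups or curriculum years.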
For the Irish and, moreover, the international context, an important paper written by my colleague in the School of Medicine, National University of Ireland, Galway, Dr Maureen Kelly. Once multiple mini interviews were made available electronically, data retrieval, storage and analysis became more accessible than collecting all data from paper score-sheets. International medical students, those attending medical school outside their country of citizenship, account for a growing proportion of medical undergraduates worldwide. This study aimed to establish the fairness, predictive validity and acceptability of the Multiple Mini Interview (MMI) in an internationally diverse student population. The MMI appears to be a welcome addition to the assessment armamentarium for selection, particularly with regard to stakeholder acceptability. Understanding the mediating and moderating influences behind differences in the performance of international candidates is essential to ensure that the MMI complies with the metrics of good assessment practice and the principles of both distributive and procedural justice for all applicants, irrespective of nationality and cultural background.
After the discovery that about 17 different styles of communication skills are used in the field of communication skills training in medical education, it was apparent that we needed to validate the communication skills items included in OSCE checklists. Within our own School of Medicine, in the College of Medicine, Nursing and Health Sciences of the National University of Ireland, Galway, about 280 OSCE station assessment forms, spanning 4 years of the curriculum and 4 different medical specialties, contained a variety of communication skills items. None of these had ever been validated against existing reliable and valid communication skills questionnaires.