Sunday, 7 July 2013

Incroyable, fraude aux examens

Copy of stolen exam
Over the last few weeks the papers have been filled with the largest exam fraud ever in the Netherlands; the news even reached the New York Times. Three high school students were arrested on suspicion of stealing copies of national tests for over 20 courses. The fraud was first detected at the end of May, when a French language test was posted online. To be safe, the Board of Examinations postponed the French test, affecting almost 17,000 students. It is unclear how many students benefited from the stolen exams; investigations are still in progress. As a precaution, all exams for the school at which the exams were stolen have been declared invalid, and its students had to take the examinations again. But will all those who benefited from the stolen exams be found, even if they studied at other schools? Luckily, fraud can be detected using mathematical techniques and sharp thinking.

Fraud and cheating are of all times. As W.C. Fields put it: “if it is worth having, it’s worth cheating for”. In the case of the Dutch exam fraud, since most courses are examined using multiple-choice tests, statistical tests can be applied to find smoking guns for fraud. During an examination, each student fills in a Scantron bubble sheet. These sheets are read by a special device that digitizes each sheet into a string similar to:

012345678           2333324424444441241255352413511441414343113412.....

The string starts with the student ID number, followed by the answer string. The numbers 1, 2, ... map to answers A, B, ... respectively. Comparing this string with the answer key yields the number of correct answers and the overall score. When a student wants (or needs) to cheat to pass the exam, he probably doesn’t want to correct all his wrong answers: that would give away too much, certainly if his past performance is not that good. So he needs to decide which answers to change, for example by making mistakes on purpose. It is reasonable to expect that the more difficult a question is, the more students will get it wrong. Since the fraudulent student doesn’t know which questions are more difficult than others, his answers will deviate from the expected pattern: he will have more of the difficult questions right and more of the simple ones wrong than an honest student with the same score. This deviation can be detected using statistical tests. When students cooperate (or conspire), their answers will show a similar pattern. These patterns can be detected as well, by testing parts of the answer strings for similarity, for example with Cohen’s Kappa statistic.
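To make these checks concrete, here is a minimal Python sketch. It is an illustration under assumptions, not the method used by the investigators: the function names are made up for this post, the record is assumed to be "ID, whitespace, answers", and the difficulty of a question is assumed to be known as the fraction of all students who answered it correctly. The misfit count used here (harder questions right while easier ones are wrong) is usually called a Guttman error count; it is one simple choice among many.

```python
from collections import Counter

def parse_record(line):
    """Split a scanned line into (student_id, answers).
    Assumes whitespace separates the ID from the answer string."""
    student_id, answers = line.split()
    return student_id, answers

def score(answers, key):
    """Number of positions where the answer matches the answer key."""
    return sum(a == k for a, k in zip(answers, key))

def guttman_errors(correct, p_values):
    """Count pairs where a harder question is right but an easier one is wrong.
    correct:  list of booleans, one per question, for one student
    p_values: fraction of all students who got each question right (assumed known)"""
    easiest_first = sorted(range(len(correct)), key=lambda i: -p_values[i])
    errors = 0
    easier_wrong = 0
    for i in easiest_first:
        if correct[i]:
            errors += easier_wrong   # every easier miss is one misfit pair
        else:
            easier_wrong += 1
    return errors

def cohens_kappa(a, b):
    """Cohen's kappa: agreement between two answer strings, corrected for chance."""
    if len(a) != len(b) or not a:
        raise ValueError("answer strings must be non-empty and equally long")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both students gave the same single answer everywhere
    return (observed - expected) / (1 - expected)

student_id, answers = parse_record("012345678           2333324424")
print(student_id, score(answers, "2333314424"))
```

A high misfit count or a high kappa is only a red flag, not proof. Two well-prepared students will agree on many answers simply because both are right, so in practice one would probably look at agreement on the wrong answers, or compare the value against what is typical for all pairs of students.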

Cheating is everywhere, not only in high school exams. Statistical tests can detect it, even when no stolen exams are uploaded to the Internet to act as a red flag. The tests indicate that something might be wrong and can direct further investigation. These tests don't only expose students, but teachers and professors as well. In 2011 Diederik Stapel, professor and dean of the School of Social and Behavioral Sciences at Tilburg University, stepped down after committing research fraud. It was his misuse of statistics that gave him away. The numbers didn’t lie, even though they were made up.