Google Fares Better Than Proprietary Plagiarism Software

Expensive plagiarism detection software from vendors such as Turnitin and SafeAssign proves to be no better than Google at detecting plagiarism. In fact, in past studies, Google has done a better job.

Inside Higher Ed recently reported on a study by Susan E. Schorn, a writing coordinator at the University of Texas at Austin. Schorn first ran a test to determine Turnitin’s efficacy back in 2007, when the university was considering paying for an institutionwide license. Her results initially dissuaded the university from paying a five-figure sum to license the software, she said. A follow-up test, conducted this March, produced similar results.

For the 2007 test, Schorn created six essays that copied and pasted text from 23 different sources, which were chosen after asking librarians and faculty members to give examples of commonly cited works. Examples included textbooks and syllabi, as well as websites such as Wikipedia and free essay repositories. Of the 23 sources, used in ways that faculty members would consider inappropriate in an assignment, Turnitin identified only eight, but produced six other matches that found some text, nonoriginal sources or unviewable content. That means the software missed the remaining nine sources, or 39.1 percent, almost two-fifths of the plagiarized material.

SafeAssign (the product UT-Austin ended up choosing, as it was bundled with the university's learning management system) fared even worse. It missed more than half, or 56.6 percent, of the sources used in the test. Mark Strassman, Blackboard's senior vice president of industry and product management, said the company has since "changed the match algorithms … changed web search providers" and "massively" grown the database of submissions SafeAssign uses.

Google -- which Schorn notes is free and worked the fastest -- trounced both proprietary products. By searching for a string of three to five nouns in the essays, the search engine missed only two sources. Neither Turnitin nor SafeAssign identified the sources Google missed.
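The report does not spell out how the search strings were chosen, so the following is only a rough sketch of the idea in Python. It assumes that the longest non-stopword terms in a passage are a usable stand-in for "three to five nouns"; the `build_query` helper and the tiny stopword list are illustrative inventions, not Schorn's procedure, and real noun detection would need a part-of-speech tagger.

```python
import re

# Deliberately tiny, illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"the", "and", "that", "which", "about", "their", "there",
             "these", "those", "would", "could", "should", "because"}

def build_query(passage, n_terms=4):
    """Return up to n_terms distinctive words from the passage as a search string.

    Heuristic only: word length is used as a crude proxy for "distinctive noun".
    """
    candidates = []
    for word in re.findall(r"[A-Za-z]+", passage):
        lower = word.lower()
        # Keep longer, non-stopword terms, each only once.
        if len(lower) > 4 and lower not in STOPWORDS and lower not in candidates:
            candidates.append(lower)
    # Take the longest terms, then restore their order of appearance.
    chosen = sorted(candidates, key=len, reverse=True)[:n_terms]
    chosen.sort(key=candidates.index)
    return " ".join(chosen)

query = build_query(
    "The mitochondria is the powerhouse of the cell and regulates metabolism", 3
)
print(query)
```

The resulting string can be pasted into any search engine; the point is that a handful of distinctive terms from a suspicious passage is often enough to surface the source.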

A more recent test shows that the results have not improved much since 2007. As UT-Austin recently replaced its learning management system, it also needed to replace its plagiarism detection software. Schorn therefore ran the Turnitin test again this March. Out of a total of 37 sources, the software fully identified 15, partially identified six and missed 16. That test featured some word deletions and sentence reshuffling -- common tricks students use to cover up plagiarism.
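The miss rates for the two Turnitin tests can be checked from the source counts reported above. This is a minimal sketch in Python, assuming that a "missed" source is one that was neither fully nor partially matched; the `miss_rate` function name is ours.

```python
def miss_rate(total, full, partial):
    """Percentage of sources with no match at all (assumed definition of 'missed')."""
    missed = total - full - partial
    return 100 * missed / total

# 2007 test: 23 sources, 8 identified, 6 other (partial or unusable) matches.
rate_2007 = miss_rate(23, 8, 6)

# March retest: 37 sources, 15 fully identified, 6 partially identified.
rate_retest = miss_rate(37, 15, 6)

print(f"2007 test: {rate_2007:.1f}% of sources missed")
print(f"Retest:    {rate_retest:.1f}% of sources missed")
```

On these counts, the share of sources missed outright is slightly higher in the retest than in 2007, though the retest also used lightly disguised text.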

We must be cognizant of the limitations of these plagiarism detectors. While they are useful, plagiarism detectors are a starting point, and we cannot use them with abandon.
