Similarity Texter

Additional Information

Collusion Test 2012

Collusion Detection System Test Report 2012

Debora Weber-Wulff, Katrin Köhler, Christopher Möller
HTW Berlin


Abstract: The Plagiarism Research Group at the HTW Berlin last tested plagiarism detection systems in 2010. These systems look for plagiarism by searching for similarities between submitted papers and material found on the Internet or stored in databases. Collusion detection systems, by contrast, look for similarities between the papers within a group of papers. This is especially interesting for teachers of large groups of students who want to ensure that no duplicated material has been submitted. In 2012 the group tested collusion detection systems, evaluating 18 systems for effectiveness on text and on programming code. This paper presents the results.

Results overview



0. Introduction

The Plagiarism Research Group at the HTW Berlin has been testing plagiarism detection systems since 2004 [The results of the tests 2004, 2007, 2008, and 2011 are available on this portal]. These systems accept a paper, either uploaded to an online server or checked offline on the teacher’s computer, and compare it with material found on the Internet and/or material stored in diverse databases, looking for similarities or text parallels that could be an indication of plagiarism. This is often useful for checking students’ final theses or term papers, although people’s expectations of what such systems can detect tend to be extremely optimistic.

However, there is another, related kind of cheating that cannot always be detected with such software. This is the case when collusion occurs, that is, when two or more students submit either the same paper or assignment, or slightly modified versions of the same paper, to the teacher. Collusion can happen both with natural language papers as well as in computing assignments, in which one student solves a problem and others make slight changes such as renaming variables or inserting or rearranging comments before handing in the result.

Barrett and Cox investigated the understanding of the terms “collusion” and “collaboration” amongst students in 2005. They note that this apparently happens more often in instructional settings in which there are many students. It can happen intentionally, for example if the students are convinced that the teacher does not actually read the papers that were submitted, or by accident, when students are not aware of the rules governing copying of text or programs, or for material that was co-generated. The difference between permitted collaboration and not permitted collusion is quite unclear, Barrett and Cox found.

The first investigation of the effectiveness of collusion detection systems was done at the HTW Berlin in 2008 [(in German)]. Only four systems were tested at that time, and the testing was rather shallow, using test cases prepared from our normal test material. For the 2012 test, test cases were prepared specifically to probe whether certain kinds of collusion could be detected. After a discussion of these test cases, the results of the test will be presented.

1. Test Cases for Collusion

We prepared two sets of test cases for the collusion detection test, one set of text files and one set of program code files. The text files simulate short student papers; the program code is specifically crafted to determine whether a system can recognize semantics-preserving transformations of the code.

1.1 Text Test Cases

The basis for the text files was seven texts in German and in English that were used in the test of plagiarism detection systems in 2010:

  • 28-Brantenberg is a test case about the Norwegian author Gert Brantenberg that was translated by hand from a Norwegian online source into German.
  • 29-Facebook is a German test case, half of which is taken with permission as a copy & paste plagiarism from a student blog; the rest is original.
  • 30-Rauchverbot is an original paper in German about the smoking ban in public places in Germany.
  • 31-Pickles is a shake & paste (large blocks of text copied from more than one source) plagiarism in English from the Wikipedia and a site called WiseGeek about pickles.
  • 32-Zakumi is a shake & paste plagiarism in English from the Wikipedia and the FIFA homepages about the mascot of the soccer world championships held in 2010 in South Africa (used by permission).
  • 34-Stieg-Larsson is an original essay in English about the Swedish novelist.
  • 35-Perl is an original essay in English about the programming language Perl that includes cross-site scripting code that would insert a blinking red statement into the reports if a system is written in Perl and does not sanitize the input to check for malicious statements.

Each basic text was taken and various numbers of copies were made using three different techniques. The first one involves swapping individual letters with a homoglyph, a letter from a different computer character set that looks the same on the screen but has a different internal representation. For example, the letter ‘e’ was replaced by the Cyrillic letter Je (е), which is represented in Unicode as U+0435.
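The effect of such a swap can be illustrated with a few lines (a Python sketch for illustration only, not part of the test material): the two strings render identically on screen but compare as unequal, which is exactly what trips up naive text matching.

```python
# Replace every Latin 'e' with the Cyrillic homoglyph 'е' (U+0435).
original = "Heidelberg"
disguised = original.replace("e", "\u0435")

print(disguised)                 # displays just like "Heidelberg"
print(original == disguised)     # False: the internal representations differ
print(hex(ord(disguised[1])))    # 0x435, not 0x65 as for the ASCII 'e'
```

A detection system that normalizes such confusable characters before comparing is not fooled; one that compares raw character codes reports the paragraph as different.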

The second technique involves different stages of editing the basic text, and in the third an automatic synonym swapper was used to change the text to various degrees of dissimilarity.

The following changes were made to some of the basic texts:

  1. One of the paragraphs in the paper had all letters ‘e’ exchanged for a homoglyphic ‘e’.
  2. The letters ‘e’ in the entire paper were changed to a homoglyphic ‘e’.
  3. Major editing changes were made on
    • a) the first and the last sentence,
    • b) the first paragraph,
    • c) words throughout the text were substituted with synonyms, and
    • d) the text remained the same, but a different font was used.
  4. Using the Plagiarisma Synonymizer [a system that automatically replaces a given percentage of words in a text with synonyms, found at] the text was automatically changed at 5 %, 15 %, 25 %, 50 %, 75 %, and 100 % of the words. This does not necessarily give a readable text.

This table gives an overview of the 31 test cases constructed:

Which change? | File names | Text used
1 | T01, T01e | 28-Brantenberg
2 | T07, T07e | 31-Pickles without the picture
3 | T02, T02a-d | 29-Facebook
3 | T03, T03a-d | 30-Rauchverbot
3 | T04, T04a-d | 32-Zakumi (d uses Courier New 11)
3 | T05, T05a-d | 34-Stieg-Larsson (d uses Book Antiqua 10)
4 | T06, T06_05, T06_15, T06_25, T06_50, T06_75, T06_100 | 35-Perl

1.2 Program Code Test Cases

In order to properly test collusion in program code, a class set of program code was obtained [Many thanks to Ralph Lano and his SS 2011 course “Informatik 2” at the HTW Berlin for giving us permission to use this material.] from a colleague at the HTW Berlin. There were 35 solutions to the same problem, but for one reason or another only 21 of them were usable. The students were informed in advance that their answers would be used in this manner, stripped of identifying information, and they were offered the option of not participating. Interestingly enough, there was actually a case of collusion in the underlying set of materials, but we did not take any action on the matter.

In all, there were 58 classes in the 21 projects. Since the changed and unchanged files were uploaded at the same time, there were 116 code classes submitted to each of the systems. From various of the program sets the following collusions were constructed to see if the software could specifically identify the particular changes made. Many should be trivial to identify, but others are quite complex.

  • 01_guvs: renaming of a bound variable with a short name
  • 02_guvl: renaming of a bound variable with a long name
  • 03_gums: renaming of a method with a short name
  • 04_guml: renaming of a method with a long name (>= 16 characters)
  • 05_mmove: moving a method to another position
  • 06_icom: insertion of comments, 50 one line comments and 21 multiple line comments
  • 07_dcom: deletion of comments (26)
  • 08_cmove: moving comments (8)
  • 09_form: reformatting the code with indentation structure, multiple lines joined on one line (15 lines affected)
  • 10_lineorder: switching two lines of code
  • 12_insertline: insertion of empty line
  • 13_classname: bound renaming of a class
  • 14_braceadded: braces added around a statement
  • 15_fieldsmoved: fields switched or moved to another location
  • 16_elabloop: elaborate loop inserted that does nothing: for(int i = 0; i < 1; i++) …
  • 17_forwhile: for loop transformed into an equivalent while loop
  • 18_dummyparam: every method gets an extra, unused parameter in the formal parameter list
  • 19_eclref: Eclipse is asked for a refactoring suggestion, and the first one is taken
  • 20_switchorder: the order of the case branches is switched
  • 21_changeexpr: Expression is reformulated but evaluates to the same value
  • 22_restlicheProgrammkollusionen: This test includes many additional files as a test to see if the system can handle a large amount of data.

Before the testing began, a matrix was set up awarding points according to whether each of these editing actions was detected. A maximum of three points was achievable for each test.

2. Systems tested

It was a challenge to obtain an overview of systems that can detect collusion, since not every system advertises this capability. Some do advertise it, but other systems such as Turnitin present themselves only as plagiarism detection systems, although Turnitin is quite strong in collusion detection. 18 systems were selected for the test, generally based on the impression the system’s website made of being a serious product. There were 15 additional systems that either did not respond to an email request for a guest login for testing purposes, or which appeared to be simple difference-finding programs (diff programs). The systems tested are listed here in alphabetical order:

  • AntiCutAndPaste
  • Beyond Compare
  • BOSS2
  • Code Compare
  • CodeSuite
  • Crot
  • Eclipse Compare
  • Ephorus
  • ExamDiff Pro
  • JEdit JDiff Plugin
  • JPlag
  • KDiff 3
  • MOSS
  • SIM_TEXT
  • SPLaT
  • Turnitin
  • WCopy Find
  • Yaplaf

2.1 The Text Collusion Test

There were 28 possible points awarded for the text collusion test. Only four systems had 20 points or more: TurnItIn, Ephorus, SPLaT, and JPlag. TurnItIn achieved a perfect score here. Ephorus managed 24.5 points, losing points because it was easily confused by substituted homoglyphs. SPLaT and JPlag, with 21 and 20 points respectively, round out this group. Eight systems managed more than 10 points: SIM_TEXT, Beyond Compare, WCopy Find, Eclipse Compare, ExamDiff Pro, CodeSuite, Code Compare, and KDiff 3, in rank order.

One of the major problems with looking for collusion is the explosion of the number of cases that need to be tested: each paper must be compared with all of the others in the group, so n papers require n·(n−1)/2 comparisons. The number of tests thus grows quadratically with the number of papers; for 100 papers that is already 4,950 pairs. Additionally, the reports have to be able to visualize the results so that candidates for collusion are quickly identified.
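This growth is easy to make concrete (a short Python sketch with hypothetical file names, purely for illustration): every submission is paired with every other one exactly once.

```python
from itertools import combinations

# 30 hypothetical submissions in one course
papers = [f"paper_{i:02d}.txt" for i in range(1, 31)]

# every paper must be compared with every other paper exactly once
pairs = list(combinations(papers, 2))

n = len(papers)
print(len(pairs), n * (n - 1) // 2)  # 435 435
```

Doubling the class size to 60 papers already requires 1,770 comparisons, which is why report visualization matters as much as raw detection.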

The identification of collusion in text should be a simple text parallel search on a closed set of candidates. This ought to be trivial for a good plagiarism detection system to do. However, there are many small problems that can result in false negatives. Some systems do not deal well with non-ASCII characters, others have problems with text that has been edited so that they are not identical copies of each other. And of course, translations and strongly edited texts, although sometimes easily identifiable by a teacher, are quite difficult to find.
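Such a text parallel search can be sketched as word n-gram overlap (a deliberately simplified Python illustration; none of the tested systems necessarily work this way internally). It also shows why edited copies yield lower scores, leading to false negatives:

```python
def ngrams(text, n=3):
    # Lowercased word n-grams; punctuation is stripped for simplicity.
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity(a, b, n=3):
    # Jaccard overlap of the two documents' word n-gram sets.
    ga, gb = ngrams(a, n), ngrams(b, n)
    return len(ga & gb) / len(ga | gb) if ga | gb else 0.0

original = "The quick brown fox jumps over the lazy dog near the river bank."
colluded = "The quick brown fox leaps over the lazy dog near the river bank."

# Changing a single word already removes three trigrams from the overlap:
print(round(similarity(original, colluded), 2))  # 0.57
print(similarity(original, original))            # 1.0
```

A synonym swapper that touches every few words, as in our test case 4, drives such a score toward zero even though a human reader still recognizes the common source.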

2.2 The Program Code Collusion Test

The results for the program code collusion test were quite different from the pure text test. Due to the number of tests conducted, there were 63 possible points. Only four systems had more than 40 points: JPlag, MOSS, SIM_TEXT, and SPLaT. CodeSuite trailed far behind with 27 points. TurnItIn only managed 18 points in this category, and Ephorus only 16.5 points. So for identifying program code collusion, only the top four systems can be considered even partially useful.

2.3 Results

Number | Test System | Test Date | Program code | Text | Sum
C12-04 | BOSS | System could not be installed.
C12-08 | ExamDiff Pro | 29.02.2012 | 9.5 | 13 | 22.5
C12-09 | Eclipse Compare | 22.03.2012 | 8 | 15 | 23
C12-10 | JEdit JDiff Plugin | 29.02.2012 | 14 | 6 | 20
C12-11 | Beyond Compare | 27.03.2012 | 16 | 17 | 33
C12-13 | Code Compare | 27.03.2012 | 4.5 | 11 | 15.5
C12-14 | WCopy Find | 27.03.2012 | 5 | 16 | 21
C12-15 | Yaplaf | System could not be encouraged to run.

3. Individual test results

In this chapter the individual tests for the systems are discussed in alphabetical order.

3.1 AntiCutAndPaste
This system is dedicated to finding source code in programs that has been reused — however, it does not achieve the stated goal. The results are very difficult to interpret, as they announce line numbers without explaining what exactly is similar at that position. These line numbers are not absolute, but relative to the starting point of the comparison, forcing the user to do a lot of calculations.

The system only does a textual comparison, and even that is not well done. A number of copies of code and text were not discovered at all. The default settings (and the default settings were always used in our tests) only react to plagiarism that covers at least 8 lines — much plagiarism can be found below that threshold — and an entire paragraph is counted as one line.

The reports are said to be storable, but they are stored as a PDF with “copy protection” and large, unintelligible letters written over the report. One report obtained for a Java text was, however, for a C++ file, apparently from a previous test someone else made! The places found were masked, so that one would have to purchase a license to see the results. An email was written to the company requesting access in order to complete the test, but all emails went unanswered.

The usability of the system is quite poor. Only one file can be uploaded at a time, so it takes a long time to upload a large set of papers. The system does not remember where in the folder tree one was uploading; one must click down to the directory for each file. Numerous errors were found, such as error messages referring to a line number larger than the last line in the file, or windows without exit buttons. It was also easy to fool — changing only one word in every 8th paragraph/line caused the system to not register any plagiarism at all.

In summary, this system is not useful for either text or program code collusion detection at university.

3.2 Beyond Compare
Beyond Compare offers a comparison of files or folders, but when it compares folders it only checks whether files of the same name are present; it does not compare the files themselves. Only two files can be compared at a time. The system accepts .java files, but only does a textual comparison of them. It was only able to deal with changed comments and inserted empty lines.

It is able to show the identical parts, the different parts, or the entire file. Since the system worked on a word level, it was able to get a bit more than half of the possible points.

In summary, this system is partially useful for detecting text collusion and not useful for detecting code collusion at university.

3.3 BOSS2
The BOSS Online Submission System is a course management tool developed by the Department of Computer Science at the University of Warwick. It requires a locally installed server on which it keeps copies of the submitted papers.

The system was installed on a local virtual server, with all of the tools necessary (MySQL, Tomcat, Java, mail server). All of the commands as given in the installation guide were used, but the system still would not start. It was possible to set up the database, but it was not possible to connect it with BOSS2, despite many hours of effort.

In summary, since it was not possible to install and test this system, it is not graded.

3.4 Code Compare
Code Compare also offers a comparison of files or folders, but when it compares folders it likewise only checks whether files of the same name are present; it does not compare the files themselves. Only two files can be compared at a time. The system accepts .java files, but only does a textual comparison of them. It attempts to sort out moved code, but is not successful: instead, it aligns the moved portion with code that has absolutely nothing to do with it.

It only performs a textual comparison, and is the worst system in the test on code comparisons. For text it will only accept .txt files, and was only able to reach 11/28 points. When one tries to print out the reports, it just prints out the files without any further markings.

In summary, this system is not useful for either text or program code collusion detection at university.

3.5 CodeSuite / Doc Mate
CodeSuite is a system that specifically targets source code plagiarism; DocMate is for comparing text documents with each other. There are eight different tools collected in CodeSuite that offer functionality specific to collusion detection. After a tool is selected, two directories must be chosen for the comparison; a database is prepared and then a report is generated. CodeMatch shows the matching parts of code, CodeDiff the differences. The system does not just do textual matching, but looks at the structure of the underlying code; comments, for example, are ignored. CodeSuite finds code that has been moved, but does not react to variable renaming or the insertion of dummy parameters.

DocMate also takes files that are in folders and compares each with the others. It only looks at words that are longer than three letters, and it ignores punctuation marks — but it replaces them with a newline character. It has some problems reading in files: it interpreted formatting information in a .doc file as content, which then matched the same formatting information in other documents. Files in .pdf format cannot be used at all.

In both categories the system still did not achieve even half of the possible points, so in summary the system is not suited for use at university.

3.6 Crot
Crot is a plugin for the Moodle learning management system. We tested it with version 2.2 of Moodle. It was very easy to install. It did not have as many parameters to set as MOSS, another Moodle plugin, did. As soon as the plugin is activated, the students see a warning message in the upload window that their work will be checked for plagiarism. When the files have been uploaded, the administrators can see all of the work handed in and the percentage of collusion. The percentages are, however, unclear, as even comparing a text with itself gives a value less than 100 %. And even when 100 % is given, not all of the text is marked.

Only the name of the user is seen, not the name of the uploaded file, similar to Turnitin. The report only has two colors, red and black, so especially moved text is hard to find. The reports are quite simple and cannot be stored or printed. It is also not possible to scroll in both windows at the same time, which can cause a teacher to get lost reading a long file.

In both categories the system did not achieve even half of the possible points, so in summary the system is not suited for use at university.

3.7 Eclipse Compare
This program can compare up to three files with each other if they are uploaded into an Eclipse project. This means that it can only deal with files that have endings Eclipse can import; neither .doc, .pdf, nor .rdf files can be uploaded. Newlines are ignored, and umlauts and special characters are not displayed properly.

The program is geared towards Java, but is unable to interpret program code correctly. Instead, it just produces a textual comparison. This explains the very low score of 8/63 points. A side-by-side comparison can only be viewed in Eclipse and cannot be printed from there. There are two options, Java Source Compare and Text Compare, but they both produce the same results. It does, however, mark first the lines with differences and then the individual words that have been changed.

In summary, this tool is not useful for the detection of collusion at university.

3.8 Ephorus
Ephorus is a Dutch system that can also be used for general plagiarism detection. A teacher can set up folders in the system and upload one or more files for examination. A .ZIP archive can also be uploaded and is expanded by the system. Since the system only seems to check new documents against old ones, the original files were uploaded first and then the copies. In general use, however, this order cannot be guaranteed, so it can be unclear who copied from whom.

There are a number of options that can be set, but the results cannot be opened in new tabs, so the overview and an individual document cannot be viewed at the same time. The detailed reports at least offer a side-by-side comparison of two documents, with the common text running over both columns.

The system will not accept .java files, so they have to be renamed .txt before they can be examined. Since the system is not geared to program code, it only reaches 16.5 out of 63 points on the code test. It is difficult to understand how the percentages are calculated.

The system performs much better on the text test. It accepts .pdf and .doc files and compares each with the others, and it is not confused by font changes. It achieved 21 out of 28 points on this test.

In summary, Ephorus is useful for finding collusion between text-based files. It does not perform well at all on comparing program code, so it is considered to only be occasionally useful overall.

3.9 ExamDiff Pro
This tool can only compare two files with each other, not an entire class set. One can choose two directories, but then it also only checks whether the file names in each directory are the same, not whether the contents of the files are the same.

The report is given as a side-by-side comparison, but the same color is used for syntax highlighting and for marking identical text. It is difficult to see where moved code has been identified, and the summaries are rather cryptic.

For the text test, a plugin converts .doc and .pdf files to .txt, but the conversion does not work well. Sometimes text is cut off; other times a paragraph is reduced to a single line. At least umlauts are recognized correctly. The system only does a textual comparison. It found some small changes, but had trouble with larger changes to the text.

In summary, this system is not suited for university use in detecting collusion.

3.10 JDiff Plugin for jEdit
jEdit is a free editor for Java programs that has many plugins available. One of these is JDiff, which is supposed to find the differences between two files. It is installed using the PluginManager. There is a split screen in which the plugin can be activated and the two files chosen for comparison. The plugin then colors lines deleted from the version declared the base version in red, changed lines in yellow, and added lines in green. It is also possible to automatically insert the changes from one file into the other.

The system only performs a text comparison, and only on .java files, and here multiple lines are often collapsed into one. The split windows can only be scrolled together. The reports do look good, although they cannot be stored, but the system itself finds almost nothing.

In summary, this system is not suited for university use in detecting collusion.

3.11 JPlag
JPlag, a system developed at the University of Karlsruhe in Germany, is the winner for recognizing program code collusion and in fourth place for the recognition of textual collusion. The files are uploaded using a Web Start client. It actually compiles the programs, so that it rejects outright any files that do not compile. It accepts programs in Java, C, C++, C#, and Scheme. It will not accept ZIP files, but will work on a directory that has all of the files in it as input. It can also handle nested directories.

One is then directed to an overview page that attempts to quantify the results — the more collusion, the more “#” characters are output. Groups of similar submissions are clustered together, and the side-by-side presentation makes a comparison easy. The reports are good, although the arrows are not easy to understand at first; the good use of color makes it easy to see which parts have been copied. However, the reports can only be stored as HTML files. There is no special printing version, and we were not able to find a good way of printing them so that an examination board could deal with a case.

It is not quite clear what the difference between the average frequency and the maximum frequency is. But the system was able to deal with renamed variables, insertion and deletion of comments or empty lines, and inserted braces. Code that was moved to another part of the code was only partially recognized.

For the text collusion test it had the problem that it could not handle .doc files, and thus missed quite a number of points for font changes and homoglyphs. Strangely, the first word in the text was never marked, although it was often part of the plagiarism. The system also has problems with umlauts, even when German is set as the language, and other special characters are displayed only as a ‘?’.

In summary, this is the best system to use for finding collusion in program code and one of the two systems determined to be useful for collusion detection.

3.12 KDiff 3
KDiff3 also offers a comparison of files or folders, but when it compares folders it likewise only checks whether files of the same name are present; it does not compare the files themselves. Only two files can be compared at a time. The system accepts .java files, but only does a textual comparison of them. It can also determine whether two files are binary equal.

It is able to show the identical parts, the different parts, or the entire file. To deal with umlauts, it simply deletes them. The report is printable, although often an entire paragraph is marked even though only one word differs. If multiple words are changed, the system is confused and reports nothing.

In summary, this system is not suited for university use in detecting collusion.

3.13 MOSS
MOSS (Measure of Software Similarity), developed by a research group at Stanford University, is a plagiarism detection system that is available as a plugin for the popular eLearning system Moodle. Development started in 1994, long before Moodle existed, with the intention of finding plagiarism in the programs handed in by students. MOSS is able to analyze programs in a wide variety of programming languages.

The plugin was installed on a fresh installation of Moodle version 2.2 on a virtual server. There is an option for enabling plagiarism detection on an assignment (other detection systems are also possible). Two user accounts were set up for submitting the originals and the copies. A base file containing the exercise text can also be submitted; this will not be regarded as plagiarism. After the students have submitted their work and the system synchronizes the files, the plagiarism detection is run and the results are presented to the teacher, giving an overview of the plagiarisms found.

Clicking on one of the plagiarisms gives a nice side-by-side presentation, with different parts of identical text shown in different colors. The only problem is that one cannot scroll simultaneously in both texts. Unfortunately, there is no printable report, it just cuts off at the end of the page. The system handily accepted .java files, but completely refused to work with .txt files, so the text test could not be run.

There is a commercial version of the software available from the company Similix that can be installed locally, and Stanford offers the use of Moss to anyone who registers with their site.

In summary, this is the second best system to use for finding collusion in program code, but is unfortunately not at all useful for text. Overall it is occasionally useful.

3.14 SIM_TEXT

SIM_TEXT is installed locally by downloading the C source code files from the home page and starting the program from a terminal [Alternatively, there is a version implemented in JavaScript that can be used in any browser without installation.]. Despite the excellent technical documentation, the program is not easy to understand for non-computer scientists. The results are, however, excellent.

The reports are difficult to read at first, as they are only in text with no markings. But one can easily use the reported line numbers to quickly identify and mark the similar passages in an editor or a printed version of the offending documents. The similar passages are not sorted, so one must at times jump around between the files that were compared. At one point it was discovered that the results were only the differences between the files, not the similarities, although it was unclear what was changed in order for this to happen.

This system also does not work with German umlauts [Handling them is possible, however: an online implementation at the VroniPlag Wiki, for example, does deal with umlauts properly.]. If individual words are changed too often in the text, the system will drop to a 0 % similarity measure. But the system in general does an excellent job of finding similarities and can certainly be integrated in other systems.

In summary, the system is difficult to install and use, but it produces good results for text collusion and useful results for program code collusion, making it one of the systems that are useful for detecting collusion.

3.15 SPLaT

The system SPLaT (Self-Plagiarism Tool) was developed at the University of Arizona. It is written in Java and can thus run on any platform that supports Java. The graphical interface is not what one would call pretty: very many parameters can be set or changed, and they are all crammed onto one screen.

The system converts known file types to text, and offers the possibility of installing a converter for other formats, such as PDF. It does not construct a syntax tree for the candidates but compares them textually. The reason this system performed better than other textual comparison tools is that it checks the documents in both directions, that is, which parts of A are in B and which parts of B are in A. For the test cases that were set up, at least one of the directions usually produced a complete match.
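The advantage of checking both directions can be illustrated with a toy containment measure (a Python sketch assuming matching on word sequences; the real tool differs in detail): when document A is wholly contained in a longer document B, the A→B direction still reports a complete match even though the symmetric overlap is diluted.

```python
def containment(candidate, reference, n=3):
    # Fraction of the candidate's word n-grams that also occur in the reference.
    def grams(text):
        words = text.lower().split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    c, r = grams(candidate), grams(reference)
    return len(c & r) / len(c) if c else 0.0

a = "public class Hello prints a greeting to the console"
b = "public class Hello prints a greeting to the console and then exits quietly"

# A is fully contained in B, but not vice versa; checking both directions
# reveals the containment that a single symmetric score would dilute.
print(containment(a, b))  # 1.0
print(containment(b, a))
```

A single symmetric similarity score would land between these two values and could fall below a detection threshold, which is why the bidirectional check paid off in our test cases.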

The system appears to analyze in minute detail. With other systems, changing just one word (for example, a variable name) would cause a larger portion to be skipped; SPLaT only skips the exact characters that differ and continues right after them. The price for this attention to detail is the time needed to check the papers, which grows dramatically with the number of papers. The system needed 10 minutes to check 89 files and then stopped calculating at 46 %.

As soon as the comparisons are finished, a window opens with the numbers and the file names. Clicking on a file name produces a side-by-side report that unfortunately only uses one color. The color can be changed, but it would have been better to have the individual fragments in different colors. There are many options for storing and printing the reports, although since there is only one color they can be difficult to read.

The system does, however, have problems with umlauts, making it difficult to use for German texts.

In summary, the system came in fourth for detecting code collusion and third for text collusion, so it is quite useful at university.

3.16 Turnitin
Turnitin, a system offered by the American company iParadigms, LLC, has three different roles defined: administrator, teacher, and student. A course needs to be set up by the administrator and assigned to a teacher. The teacher sets up a hand-in assignment within this course that requires a check mark for “originality report required”. The students are then given a code and they upload their work using this code, or the teacher can upload for them.

Students cannot, however, upload multiple files or a ZIP file, although the teacher is able to do so. This is a problem for programming assignments, as they often consist of multiple files. Turnitin will not accept files ending in .java, although .pdf, .doc, and .txt are allowed. The Java files had to be renamed to .txt, but two were still rejected because they contained fewer than 20 words (typical for the simple main class that starts a program). So the system is useless for this task.

There is no side-by-side report on screen. Parts of the text are marked in red, indicating that these portions appear in other sources. Clicking on such a passage opens a new window. Unfortunately, instead of giving the file name or the author, it just states “Submitted by the HTW Berlin on …”. This creates extra work in finding out who is involved if collusion is found. The reports can easily be stored, but again a side-by-side report is missing.

The system was given full points (28/28) for the text test cases, as all were identified as collusion, even though it was difficult to obtain the names of the papers involved. It was not bothered by changes in the text, different fonts, umlauts, or homoglyphs. For the program code, however, since no structural comparison is done, not quite a third of the points were reached (18/63).
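Turnitin evidently copes with homoglyphs, i.e. characters from other alphabets that look identical to Latin letters but differ in their byte encoding; a naive byte-for-byte comparison can be defeated by them. A minimal sketch of the normalization idea (the mapping table is our own and far from complete):

```python
# A few Cyrillic lookalikes mapped to their Latin twins; a real
# system would need a much larger table, covering other scripts too.
HOMOGLYPHS = str.maketrans({
    "а": "a", "е": "e", "о": "o",  # Cyrillic a, ie, o
    "р": "p", "с": "c", "х": "x",  # Cyrillic er, es, ha
})

def normalize(text: str) -> str:
    """Replace known lookalike characters before comparing texts."""
    return text.translate(HOMOGLYPHS)
```

A word typed with Cyrillic lookalikes displays identically to its Latin counterpart but fails a string comparison; after normalization the two compare equal.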

In summary, this system is well-equipped for finding text collusion in student papers, although it would be useful to make it easier for a teacher to see which papers are involved in the collusion without having to request the intervention of an administrator. But it is useless for finding code collusion. Overall it is occasionally useful.

3.17 WCopyfind
WCopyfind is one of the oldest and best-known collusion detection systems. It was previously tested in 2008, but without very convincing results. It is able to compare multiple files with each other at a time. Identical text is displayed in red, different text in black. In the code test it reached only 5 points, just barely above CodeCompare. As soon as text was moved, it skipped over the rest of the text.

For the text test it was slightly better, although it also had problems with texts in which multiple words were changed. The reports are difficult to read with only the changes marked, but it does produce a report for each pair of files checked.

In summary, the system is partially useful for finding text collusion and not useful for finding code collusion at university.

3.18 Yaplaf
This is a system from the Technical University in Vienna, Austria. The desktop version was downloaded as a ZIP file, but it could not be installed. On one Macintosh system (Lion), errors were thrown immediately on installation. On an older Macintosh (Snow Leopard), a GUI window did appear and asked the user to select a ZIP file with the texts to be compared. When this was done, errors were thrown and the system had to be killed manually.

In summary, it was not possible to install and test this system, so it is not graded.

4. Summary and Recommendations

In summary, it can be seen that it is easier to find collusion than it is to identify plagiarism in general. The reason is that collusion happens in a closed group: all the papers are available for investigation. General plagiarism detection involves many additional factors, as there are many possible sources on the Internet and offline. For finding collusion in text-only papers there are a few systems that do a good job. For finding collusion in program code, there are a few systems that are partially useful. However, for translations or heavily edited material, the systems are powerless to detect collusion. There are only three systems that are at least useful in one category and partially useful in the other: JPlag, SPLaT, and SIM_TEXT. Turnitin and Ephorus are useful for finding text collusion only, MOSS for finding program code collusion only.
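The closed-group property can be made concrete: every candidate source is already in hand, so a collusion check reduces to comparing all pairs of submissions. A rough sketch (Jaccard similarity over word sets and the 0.8 threshold are our own, illustrative choices):

```python
from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the word sets of two submissions."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def suspicious_pairs(submissions: dict, threshold: float = 0.8):
    """Compare every pair in the closed group; no Internet search
    is needed, unlike general plagiarism detection."""
    return [(x, y) for x, y in combinations(submissions, 2)
            if jaccard(submissions[x], submissions[y]) >= threshold]
```

For n submissions this means n(n-1)/2 comparisons, which is one reason running times rise steeply with the size of the group.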

The authors wish to thank the HTW Berlin research council for funding this research.



Link list

System and Report URL
Beyond Compare
Code Compare
Eclipse Compare
ExamDiff Pro
JEdit JDiff Plugin
KDiff3 http://kdiff3.sourceforge.net
WCopyfind