The authors of a paper were asked to explain alleged plagiarism in a submitted and subsequently accepted manuscript. The allegation was based on a Turnitin report showing that 28% of the submitted manuscript (Author B) matched a previously published paper (Author A), and that 37% of the published paper (Author A) matched the submitted manuscript (Author B).
The authors came back with an expert opinion which pointed out the need for human interpretation of a Turnitin score, and the fact that certain valid factors may lead to an elevated similarity level. These could include: the use of common, specialist language; the use of standardised language and layout for a paper in this field; very similar methods of operating and data collection in the two investigations; short word limits, which reduce the number of ways that the same things can be said; and common references in a very specialist area. When these areas of commonality are taken out, it was suggested, the two papers had very few similarities.
The expert suggested that one of the reasons that the comparison of Author A's paper with Author B's paper gave a higher figure than the comparison of Author B's paper with Author A's (37% vs 28%) was probably that there was a difference in length of almost 2000 words, so that a greater proportion of Author A's paper matched Author B's paper. Furthermore, a large number of similarities were generated by tables (which showed standard measures of the effectiveness of treatment), and there is no intellectual property in the format of a table.
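The expert's point about length can be checked with simple arithmetic. The sketch below assumes that a similarity score is just the number of matched words divided by the document's total word count, which is a simplification and not a description of how Turnitin actually computes its figures. The word counts are hypothetical, chosen only to show that the same matched text can plausibly produce two different percentages; they are not the real lengths of either paper.

```python
def similarity_percent(matched_words: int, total_words: int) -> float:
    """Percentage of a document's words that also appear in the other document."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 100.0 * matched_words / total_words


# Hypothetical figures for illustration only (not taken from the case).
matched = 2300         # words flagged as matching in both directions
shorter_paper = 6200   # assumed length of the shorter paper
longer_paper = 8200    # assumed length of the longer paper, 2000 words more

# The same matched text is a larger share of the shorter paper.
print(round(similarity_percent(matched, shorter_paper)))  # share of shorter paper
print(round(similarity_percent(matched, longer_paper)))   # share of longer paper
```

Under these assumptions the asymmetry is explained by length alone, which is why the two percentages say little about the nature of the overlap.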
Questions for COPE Council
- Is this still plagiarism?
- What does COPE think about the expert opinion provided by the authors?
- How should the journal proceed?
Advice on this case is from a small number of COPE Council Members. Most cases on the COPE website are presented to the COPE Forum where advice is offered by a wider group of COPE Members and COPE Council Members. Advice on individual cases is not formal COPE guidance.
The percentage of overlap between two papers is meaningless without detailed analysis of the manuscript. While established terms and shared forms of language can be expected to account for some level of similarity, over 20% is quite high. However, the only way to adjudicate properly is to read the articles with the similarities highlighted to see where they are the same.
Text similarity may be assessed by considering the following items:
- Was the source cited and discussed? If so, this is less serious but citation alone is not sufficient. Directly copied text should be explicitly quoted.
- Is the source by the same authors? This is less serious, although it might still constitute a breach of copyright or redundant publication.
- Is the reuse only/mainly in the methods, which are commonly reproduced in scientific studies? Note: copying in the methods can be a problem if the copying article passes off a method as the author's own; the originator of the method should be attributed in the text and the source should be cited.
- Is the overlap largely due to shared references? This is less of an indication of problematic copying and it is often useful to run the similarity check with references excluded. Note: if mainly the same references are cited in the same order (and perhaps with the same format), this can be a sign of plagiarism.
- Is the similarity in blocks or only in fragments? Copying blocks of text from one or a few sources is more serious than copying short parts of many sources, although the latter is still not good practice. Note: similarity that appears on the surface to be fragmentary can be due to the copying authors using paraphrasing and synonyms in an attempt to confound software checks.
- Is there any copying of results or discussion? Copying of results is evidence of fabrication and the authors' institution will need to be informed. Copying of the discussion is misleading, because authors cannot accurately discuss their own findings using the words of others.
- Was there copying of unique phrases only ever used in the source article? If so, this rules out the explanation that the similarity is due to stock phrasing or to both papers copying from a common source.
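The distinction between blocks and fragments in the list above can be illustrated with a small script. This is a minimal sketch using Python's standard `difflib` module, not the method used by Turnitin or any other similarity checker; the function name `shared_blocks` and the `min_words` threshold are hypothetical choices made for the example. It lists runs of consecutive words that two texts share, so that an editor can see whether the overlap sits in long blocks or in short scattered phrases. It does not replace reading the two articles side by side.

```python
from difflib import SequenceMatcher


def shared_blocks(text_a: str, text_b: str, min_words: int = 8) -> list[str]:
    """Return runs of at least `min_words` consecutive words found in both texts."""
    # Naive tokenisation: lowercase and split on whitespace (punctuation is kept).
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    # autojunk=False stops difflib from ignoring very frequent words in long texts.
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    blocks = []
    for match in matcher.get_matching_blocks():
        if match.size >= min_words:
            blocks.append(" ".join(words_a[match.a:match.a + match.size]))
    return blocks


# Invented sentences for illustration only.
source = ("the patients were randomly assigned to one of two treatment "
          "groups and followed for a year")
submission = ("in our study the patients were randomly assigned to one of "
              "two treatment groups before surgery")

for block in shared_blocks(source, submission):
    print(block)
```

Lowering `min_words` surfaces shorter fragments; a paraphrased passage would not be caught by exact word matching at all, which is one reason a human reading remains necessary.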
Regarding the authors’ 'expert' analysis, it is impossible to give it much weight without identifying details that would allow the journal to assess the expert's qualifications and any conflict of interest. In any event, the journal should do its own analysis of the results and, if necessary, call in its own expert to comment on the overlap. It is not unusual for a journal to send the material to a referee and to ask an expert to read the papers to understand how the overlap reflects on the originality and uniqueness of the results.
Journals can allow authors the opportunity to revise the paper to reduce text similarity and to properly discuss similar articles, but they are not obliged to do so. The journal does not need to establish a formal finding of plagiarism to decide not to publish an article. High similarity alone can be sufficient, especially if the source is not cited and discussed, but this should be decided by a side-by-side comparison of the text and not by relying on the percentage match. The decision on how to proceed will depend on the level and nature of the copying, evidence that this was done knowingly, and any evidence of attempts to conceal this.
Plagiarism detection software can be run at various points in the review process, and the editor should consider running it before manuscript acceptance to avoid other potential issues, such as having to rescind an acceptance. The editor may want to adjust the settings used, for example, to exclude references when running comparisons.