Case Discussion: Data fabrication in a rejected manuscript

Case Summary

Case 18-16

A journal rejected two manuscripts because of data fabrication, which was confirmed by an inspection of the original patient data. Unknown to the lead author, the co-authors had fabricated the dataset after recruiting only a few patients. The lead author was cleared of misconduct in an institutional investigation but later published one of the rejected manuscripts, along with other suspicious manuscripts, in other journals.

Question(s) for the COPE Forum

• Should other journals be warned about this case?

• Should anyone else be informed about this case?

Forum Advice

The Forum suggested that the journal editor contact the journal that published the similar manuscript, following the COPE guidelines titled “Sharing of Information Among Editors-in-Chief Regarding Possible Misconduct”. However, to avoid possible legal issues such as defamation, the editor should not express dissatisfaction with the lead author.

Case Discussion

This archived COPE Forum case is categorised under the COPE Core Practice of Data and Reproducibility, which states: “Journals should include policies on data availability and encourage the use of reporting guidelines and registration of clinical trials and other study designs according to standard practice in their discipline.” The contents of the case and the Core Practice might initially suggest relevance only to policies and processes in medical publishing. However, they apply to all disciplines that collect data and have implications for journals (and other scholarly publications such as books and conference proceedings), authors/researchers, and their institutions.

This case illustrates that a journal editor has the authority to request time-stamped original (raw or primary) data files, including associated data sources and records if practicable, in order to check that a submitted manuscript is based on sound evidence. Although data are commonly numerical, digital, or code, or converted into those formats, data sources can be textual, pictorial, or in any other media, and can arguably be any source used or referred to, including recordings/transcripts, artefacts, instrument printouts, and any cited published or unpublished material. By asking for access to files or materials underlying the scholarship of a manuscript, an editor is asking authors “to show their working”. The journal guidelines should make this policy clear, and authors should archive all findings and data files in case a journal editor or editor’s representative requests access during manuscript review or after publication.
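To make such archiving concrete, the minimal Python sketch below shows one way authors might record time-stamped checksums for their raw data files, so that files handed over during an inspection can be shown to match the archived originals byte for byte. This is only an illustration: the directory and manifest names are hypothetical, and COPE does not prescribe any particular tooling.

```python
import hashlib
import json
import time
from pathlib import Path

def write_manifest(data_dir, manifest_path="manifest.json"):
    """Record a SHA-256 checksum and last-modified timestamp for every file
    under data_dir, so that raw data files produced on request can later be
    verified against the archived originals."""
    manifest = {}
    for path in sorted(Path(data_dir).rglob("*")):
        if path.is_file():
            manifest[str(path)] = {
                "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
                "modified": time.strftime(
                    "%Y-%m-%dT%H:%M:%S", time.localtime(path.stat().st_mtime)
                ),
            }
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))
    return manifest

# Hypothetical usage: write_manifest("raw_study_data/")
```

Note that a local file date can be altered, so an externally verifiable timestamp (for example, from a dated deposit in a repository) is stronger evidence of provenance than the manifest alone.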

Because journals have limited resources to police every aspect of a manuscript, they must rely mainly on authors’ honesty. Some journals ask for specific declarations that data were obtained and used ethically and legally and that the authors themselves had full access to the data during analysis and manuscript preparation. If an editor or reviewer suspects that data in a submitted paper are too good to be true or that an illustration has been manipulated, raw data files or data sources may need to be inspected to verify the analyses and data integrity. Data inspections may also be needed if a reader or other whistleblower notifies an editor about suspicious data or illustrations in a published article. Journals may need to call on specialists to investigate or may invite the authors’ institutions to launch an investigation. Relevant COPE flowcharts are “What to Do if You Suspect Fabricated Data: (a) Suspected fabricated data in a submitted manuscript”, “What to Do if You Suspect Fabricated Data: (b) Suspected fabricated data in a published manuscript”, and “What to Do if You Suspect Image Manipulation in a Published Article”.
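As an illustration of the kind of statistical screening a specialist might run (this is not a method described in the case, and it is never proof of fraud on its own), the sketch below compares the leading digits of a set of reported values with the distribution expected under Benford’s law; the example numbers are invented.

```python
import math
from collections import Counter

def benford_deviation(values):
    """Compare the observed leading-digit distribution of the data with the
    distribution expected under Benford's law, returning the largest absolute
    per-digit deviation. A large deviation can flag, but never prove,
    problematic data; the test is only meaningful for datasets spanning
    several orders of magnitude."""
    # Scientific notation puts the leading digit first, e.g. 87 -> "8.7e+01"
    digits = [int(f"{abs(v):e}"[0]) for v in values if v != 0]
    if not digits:
        return 0.0
    counts = Counter(digits)
    return max(
        abs(counts.get(d, 0) / len(digits) - math.log10(1 + 1 / d))
        for d in range(1, 10)
    )

# Hypothetical usage with invented measurements:
# benford_deviation([123.4, 87.0, 19.2, 1045.0, 233.1, 6.7, 54.0])
```

Such screens only flag data for closer inspection of the raw files; legitimate datasets can deviate from Benford’s law, especially when values fall within a narrow range.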

Editors can reject a submitted manuscript outright in clear cases where data have been fabricated (made up) or falsified (altered or partially deleted), where there is a clear major error in data handling or analysis, or where authors are unable to produce the relevant data files or materials. This step helps maintain the integrity of the research and scholarly record. If the editor suspects that the data are fraudulent (eg, fabricated, falsified, plagiarised, or used without permission), then the authors’ institutions should also be informed, in neutral terms, as explained in “Cooperation Between Research Institutions and Journals on Research Integrity Cases: Guidance from the Committee on Publication Ethics (COPE)”. It is unclear in the presented case whether only the institution of the corresponding (lead) author was contacted or whether all authors were at the same institution. All authors and their institutions should be informed of the submission outcome and the reason for it. This step would ideally prompt institutional disciplinary investigations and internal improvements in research processes, data management and curation, and training in publication ethics.

The case also illustrates that if an editor notices the subsequent publication elsewhere of fraudulent data from a previously rejected paper, it is permissible for editors to contact each other to discuss concerns about data integrity. This exception to the confidentiality of peer review should be explained in journal guidelines, as outlined in “Sharing of Information Among Editors-in-Chief Regarding Possible Misconduct”. The editor of the published article need only be informed, in neutral terms, about the fraudulent dataset in the previously rejected paper, and can then contact the corresponding author and, if necessary, the institution, following the flowchart “What to Do if You Suspect Fabricated Data: (b) Suspected fabricated data in a published manuscript”. However, the editor who rejected the first manuscript must be confident that the same fraudulent dataset was used. If this were also true of the other published articles, then their editors could likewise be informed about the fraudulent dataset, as well as about the possible issue of redundant publication or salami slicing.

For manuscripts reporting human studies, a precaution that editors can take is to insist on study preregistration in a registry database before the research is conducted and written up. In that way, protocols and analyses (and, subsequently, participant recruitment/flow and basic results) that have been uploaded to a registry can be compared with those later reported in the manuscript. Study preregistration is required by some funders and many journals for prospective clinical trials, such as randomised controlled trials. Note that randomised controlled trials can be non-medical (for example, some educational or economic intervention studies) and not all registries are medical ones (see https://osf.io/registries). Observational studies of human participants can also be preregistered in a registry database.
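At its simplest, that comparison amounts to checking two lists of prespecified items against each other. The sketch below uses invented outcome names purely to illustrate how reported-but-unregistered outcomes (possible outcome switching) and registered-but-unreported outcomes (possible selective reporting) could be surfaced.

```python
# Outcomes listed in the (hypothetical) registry entry at preregistration:
registered = {
    "primary: mortality at 30 days",
    "secondary: length of hospital stay",
}

# Outcomes later reported in the submitted manuscript (also hypothetical):
reported = {
    "primary: mortality at 30 days",
    "secondary: quality of life at 90 days",
}

# Reported but never registered -> possible outcome switching:
print(sorted(reported - registered))

# Registered but never reported -> possible selective reporting:
print(sorted(registered - reported))
```

In practice, any discrepancies found this way would simply prompt the editor to ask the authors for an explanation, since registries also permit documented amendments.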

Study preregistration aims to help maintain the integrity of the research record by decreasing publication bias and increasing transparency and reproducibility. The same reasons underlie the option of “registered reports” (peer review and acceptance-in-principle of a protocol before the results are known) that is now being offered by journals in many different disciplines (see https://cos.io/rr/). Complete and transparent reporting can also be encouraged by the use of international reporting guidelines, which can apply to many study types and disciplines that involve human or animal research (see http://www.equator-network.org/).

Last but not least, increasing data availability to other researchers is a goal of a growing number of research funders, especially public funders, including those in the social sciences and humanities. As part of the open data, open access, and open research/science movements, funders and institutions commonly require researchers to formulate and implement data management plans to optimise the preservation and sharing of research data. Journals are also increasingly encouraging researchers to share primary data publicly for others to verify, reuse, and cite. Data description articles, data journals, and data repositories are venues that directly help researchers share data and metadata publicly. If any of the journals mentioned at the end of the presented case had linked to the fraudulent dataset in a repository or as a supplementary file, their editors would have to request that the lead author formally retract all uploaded copies of the dataset.

Although all journals should explain to authors that raw data can be requested by the journal at any time for inspection, not every journal requires the wider sharing of raw data, as shown by COPE-funded research on journal data-sharing policies across disciplines (presented at the 2017 Peer Review Congress). Indeed, different levels and conditions of data sharing are possible, as set out in the Transparency and Openness Promotion (TOP) guidelines. The extent, timing, and duration of data availability; the appropriate data repository and licensing; and the requirements for data citation and data availability statements in manuscripts can all depend on discipline, subdiscipline, data type, journal, and funder. As explained in a 2018 COPE webinar on creating and implementing research data policies, journal offices need to decide on, and make clear, their policies regarding data sharing, what data should be made available, and when and where data should be archived. For example, Scientific Data and PLOS journals mandate data sharing (with allowable exceptions) and comprehensively list recommended repositories by subject. The journal editor first mentioned in the presented case could readily detect similar cases of fraudulent data if the journal were to mandate data sharing and submission of a dataset along with the manuscript. However, the editor would then also have to establish a sustainable and reliable system for the peer review of data.

Trevor Lane on behalf of the COPE Education Subcommittee