Artificial intelligence: Lightning talk summary

New tools and directions in AI for scholarly publishing

January 2024 Lightning talk

In the first of our new Lightning Talks, Marie Soulière and Nishchay Shah spoke about new tools and directions in AI for scholarly publishing. Lightning Talks are short, informal events designed to introduce the audience to current topics or new guidance and give them the opportunity to ask questions. The event was chaired by COPE Chair Dan Kulp.

Marie (COPE Council Member and Head of Publication Ethics and Quality Assurance at Frontiers Open Access Publishing) and Nishchay (Business Head, Emerging Products and CTO at Cactus Communications) started by outlining the current state of AI in the scholarly publishing industry. Marie described the industry-wide increase in scientific fraud perpetrated by malicious actors who have access to increasingly sophisticated AI tools. Nishchay followed with a broad perspective on the state of AI that situated the challenges for publishers and editors within a wider landscape of emerging tools operating at differing scales and with multimodal capabilities.

Both speakers are familiar with the challenges AI brings for those seeking to balance high ethical standards with the free exchange of ideas. Marie introduced Frontiers’ in-house Artificial Intelligence Review Assistant (AIRA), which has been in use since 2016 to support human-led integrity checks on submissions. Cactus Communications’ tool is called Paperpal Preflight and, like AIRA, it runs a wide range of checks to verify author information, screen submissions for integrity flags, and monitor the review process for signs of conflicts of interest or manipulation. Both companies see their tools as powerful supports to human judgement because they can check submissions against a large body of data, spotting patterns not visible to the human eye. However, the tools do not replace human decision making; in fact, according to Nishchay, they give humans greater power to monitor the integrity of the scholarly record.

The speakers concluded by calling for everyone to keep learning from one another and to look out for new tools that can help them detect ethical problems in scholarly publishing. Better systems and agreements for sharing information ethically between publishers will facilitate this, and both COPE and STM are developing guidance and tools in this area. This will also benefit publishers who do not have access to their own or paid-for detection tools.

Questions from the audience

What training datasets do you use?

Both platforms were trained on internally curated datasets which varied according to each specific tool. The image tools were trained on images with and without manipulation, for example, and the language assessment software on text examples. While Frontiers uses data from their own repositories of previously submitted papers and internal evaluations, Cactus’ Preflight tool relies more on open access sources and business experience.

Some of the examples given showed checks being done after peer review. Why are they not carried out earlier in the process?

The checks carried out by AI are done at various stages: as part of initial verifications, as peer review is initiated, and as it continues. Sometimes flags are raised late in the process, for example if new data have been added to the training bank after the article entered the workflow. Deciding when to run checks is complicated, especially when a manuscript is also changing as it moves through review and amendment.

Do you get many false positive flags from the AI tools?

There is a high false positive rate in identifying potential integrity issues. However, since the tools are seen as one signal among many in creating a final composite score, this is not necessarily a problem. It is generally considered better to over-flag than to miss problematic content.
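
As a rough, hypothetical sketch of the idea described above (not the actual scoring logic of AIRA or Preflight, and with invented flag names, weights, and threshold), a composite-score approach might weight several independent signals and compare the total against a deliberately low threshold, so that borderline cases are over-flagged for human review rather than missed:

```python
# Hypothetical composite integrity score: each automated check contributes a
# weighted signal, and a low threshold favours over-flagging for human review.

FLAG_WEIGHTS = {
    "image_duplication": 0.4,        # illustrative weights only
    "tortured_phrases": 0.3,
    "citation_anomaly": 0.2,
    "author_identity_mismatch": 0.5,
}

REVIEW_THRESHOLD = 0.5  # illustrative value, not taken from either tool


def composite_score(flags: dict) -> float:
    """Sum the weights of all triggered flags."""
    return sum(w for name, w in FLAG_WEIGHTS.items() if flags.get(name))


def needs_human_review(flags: dict) -> bool:
    """One strong signal, or several weak ones, sends the submission to an editor."""
    return composite_score(flags) >= REVIEW_THRESHOLD


if __name__ == "__main__":
    submission_flags = {"tortured_phrases": True, "citation_anomaly": True}
    print(needs_human_review(submission_flags))  # True: 0.3 + 0.2 >= 0.5
```

In a design like this, false positives from any single check matter less, because no single flag decides the outcome; it only contributes to a score that a human then interprets.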

How should AI tools be disclosed? Is this likely to change as they become more universally used?

Acceptance of AI tools has risen rapidly, and there is now much more knowledge about responsible use. They are likely to become even more commonplace in the future, and consequently more akin to spelling and grammar tools in word processing software. However, generative AI tools like ChatGPT bring additional ethical concerns because they can be used in many different ways (e.g. to generate research hypotheses, to analyse data, to generate figures, to write discussion sections). Several consortia and organisations are currently working on the responsible use of AI, with the aim of producing consensus on where and how it can be used appropriately.

Is there technology available to detect AI-generated images?

Neither AIRA nor Cactus’ Preflight is currently able to detect AI-generated images, although research is ongoing. AI changes so quickly that tools in development rapidly lose viability, and there are also problems with finding enough training data. Both systems contain tools to detect image integrity problems, including duplication, manipulation, and rotation.

What do author verification tools look for?

It is often difficult to verify author identities because people may use non-institutional email addresses or submit under different name variations. AIRA uses Frontiers’ own submission data and external repositories including Google Scholar, PubPeer, and ORCID as part of its checks, which cover whether the author has published before, whether they are at the institution they claim to be, whether they have retracted articles, and whether they show suspicious patterns of submission.
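
As an illustrative sketch only (the data sources are named in the answer above, but the specific rules, field names, and thresholds below are assumptions, not AIRA's actual checks), author verification can be thought of as collecting simple signals about an author and listing the ones an editor should look at:

```python
# Hypothetical author-verification signals of the kind described above.
# Field names, rules, and thresholds are illustrative assumptions.

from dataclasses import dataclass, field

INSTITUTIONAL_DOMAINS = {"example-university.edu", "example-institute.ac.uk"}  # hypothetical list


@dataclass
class AuthorRecord:
    email: str
    claimed_institution: str
    known_affiliations: list = field(default_factory=list)
    prior_publications: int = 0
    retractions: int = 0
    submissions_last_30_days: int = 0


def verification_flags(author: AuthorRecord) -> list:
    """Return human-readable flags for an editor to review; none is proof of misconduct."""
    flags = []
    domain = author.email.rsplit("@", 1)[-1].lower()
    if domain not in INSTITUTIONAL_DOMAINS:
        flags.append("non-institutional email address")
    if author.claimed_institution not in author.known_affiliations:
        flags.append("claimed institution not found among known affiliations")
    if author.prior_publications == 0:
        flags.append("no prior publications on record")
    if author.retractions > 0:
        flags.append("previously retracted article(s)")
    if author.submissions_last_30_days > 5:
        flags.append("unusually high recent submission volume")
    return flags
```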

Can the data provided by AI help to improve the detection of integrity issues by humans?

There has not yet been a direct transfer of expertise in this way, although the teams involved are learning from the experience of using AI. When a large publisher tested Preflight, they found that it often confirmed their own impressions about whether integrity issues were present. Companies like Cactus and Frontiers tend to limit how many details they share about the patterns detected by their tools, in order to prevent malicious actors from learning how to circumvent them.

Can AI tools detect text that has been improved by AI?

It is generally agreed that it is impossible to reliably detect AI-generated text. At Frontiers, authors agree to declare the use of AI as part of their author agreements. In a broader sense, however, there is a debate to be had about how much it matters if AI tools have improved academic writing; indeed, it could be seen as simply increasing equity between writers who have English as a first language and those who do not.

Can AI tools detect fake papers used in review articles?

Preflight has the capability to detect issues like circular or self-citations as part of its tabulation of data problems. However, such checks are not part of the functionality of the tools at the disposal of most editors; those tools would need further development to reach this level of sophistication.
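
As a rough sketch of what such a check could look like (not Preflight's actual implementation; the data structures and examples below are assumptions), circular citations can be found by looking for pairs of papers that cite each other, and self-citations by looking for cited papers that share an author with the citing paper:

```python
# Hypothetical circular- and self-citation checks over a small citation graph.
# Real tools work from large bibliographic databases; this only shows the idea.

from itertools import combinations


def self_citations(authors: dict, cites: dict) -> list:
    """Pairs (citing, cited) that share at least one author."""
    return [
        (paper, ref)
        for paper, refs in cites.items()
        for ref in refs
        if authors.get(paper, set()) & authors.get(ref, set())
    ]


def circular_citations(cites: dict) -> list:
    """Pairs of papers that cite each other."""
    return [
        (a, b)
        for a, b in combinations(cites, 2)
        if b in cites.get(a, set()) and a in cites.get(b, set())
    ]


if __name__ == "__main__":
    authors = {"P1": {"Alice"}, "P2": {"Bob"}, "P3": {"Alice", "Carol"}}
    cites = {"P1": {"P3"}, "P2": {"P1"}, "P3": {"P2"}}
    print(self_citations(authors, cites))   # [('P1', 'P3')] -- P1 and P3 share an author
    print(circular_citations(cites))        # [] -- no two papers cite each other directly
```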
