Artificial intelligence and authorship

There has been a tweet circulating recently which gives instructions on how to remove a peanut butter sandwich from a video recorder, written in the style of Biblical verse. It’s very funny – at least until you realise that it was written by an AI bot. At that stage it becomes very clever, but it loses all the humour. It seems that the wit lies in the intelligent use of language: the self-conscious parody of a shared understanding of the form that is being mimicked. Once the author has been revealed to be a computer program, all this is lost; it is simply a tool applying rules it has learned.

This question of what being an author means is just one of the more pressing issues in the accelerating market of AI bots. The newest generation of these – the most discussed currently being OpenAI’s ChatGPT, the latest version of which was launched in November 2022 – are based on Large Language Models (LLMs). They are trained on vast banks of example text which enable them to determine, in a probabilistic sense, what words, sentence structures, topics and evidence are most likely to appear together in an answer to almost any given question. The internet has gone wild with people reporting their interactions with ChatGPT, from those challenging it to recommend the best AI writing bot (it was admirably diplomatic), to a rabbi who asked it to write a sermon for him. In January new registrations to the service had to be suspended because of the demand.

So far, the novelty means that most users are clear in declaring how they interacted with these bots. However, there are many purposes for which it may not suit an author to admit that their content is not entirely original or that ideas and articulation were outsourced to an AI. For those of us working in publication ethics this turns the issue from an interesting one about the nature of creativity and inspiration (which participants at the most recent COPE Council Retreat noted can legitimately come from many places: books, the internet, colleagues…) into one of authorship, intellectual property, and responsibility for content.


The companies which produce these AI machine-learning tools are very clear on both the legal and ethical standing of their products. Bloom states in its specifications that ‘using the model in high-stakes settings is out of scope…The model is not designed for critical decisions nor uses with any material consequences on an individual’s livelihood or wellbeing.’ These include situations where medical care, legal judgement, finance or the scoring of individuals are concerned: areas which are represented in the portfolios of many academic publishers. Bloom’s disclaimer goes on to say that ‘Indirect users should be made aware when the content they’re working with is created by the LLM.’ In fact, in January 2023 WAME released an early response to the use of LLMs in scholarly publishing which made precisely the same recommendation. ChatGPT also ‘recognises’ its own limitations, responding to one journalist that ‘There is no inherent ethical issue with using AI in research or writing, as long as the AI is used appropriately and ethically’. In another instance it returned a statement that it did not fulfil all the ICMJE authorship criteria. Both the WAME guidance and COPE’s own position statement concur: AI bots should not be permitted as authors since they have no legal standing and so cannot hold copyright, be sued, or sign off on a piece of research as original. Springer Nature and Taylor & Francis have both come out with similar statements, asking authors to specify the nature of any interactions with AI in methods or acknowledgement sections.

There are many ways in which AI tools can be a real boon for publishers, journals and academic authors (indeed, many are already being used). They can identify suitable peer reviewers and summarise content; they can tag metadata, identify duplicated images and gels, and create original but relevant cover art. They certainly have the potential to help with the types of cases brought to COPE’s member Forum where there is doubt over the originality or authenticity of images. Such tools may be of particular importance in detecting paper mill activity. Except that experience already shows that ChatGPT doesn’t always give the same answer to the same question twice.

This is because AI bots have no notion of reliability, replicability or ‘truth’: they simply return, from the range of facts and statements in their repository, one which makes probabilistic sense given what they have been trained on. Sometimes there is only one answer to a question, but in many cases there will be multiple possible responses, all of which are equally – in the bot’s terms – likely. This is how it can assert multiple different responses to the same question. It is a very human reaction to be offended or side-swiped by this. An AI doesn’t care whether the information it returns is ‘true’, only whether it is plausible. A colleague likens using AI to a scholar throwing out questions to a group of colleagues who are on holiday in a bar. They will try to be helpful, and they will respond with answers – but there is no way of telling how much alcohol they’ve had to drink. They might be sending a response which is startlingly insightful and accurate – or it could be drunken nonsense wrapped up in the language of scholarly research. As Bloom’s specifications state, ‘The model outputs content that appears factual but may not be correct.’
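To see why the same question can come back with different answers, here is a deliberately toy sketch – the words and probabilities are invented for illustration, and nothing here reflects how ChatGPT or any real model is actually built. At each step the ‘model’ samples the next word from a probability distribution, so repeated runs of the same prompt can wander down different, equally plausible paths.

```python
import random

# A toy "language model": for each two-word context, the probability of the
# next word. The vocabulary and numbers are invented purely for illustration;
# real LLMs learn distributions like this over enormous vocabularies and contexts.
NEXT_WORD = {
    ("peer", "review"): {"is": 0.5, "can": 0.3, "often": 0.2},
    ("review", "is"): {"essential.": 0.6, "imperfect.": 0.4},
    ("review", "can"): {"fail.": 0.5, "help.": 0.5},
    ("review", "often"): {"works.": 0.7, "lags.": 0.3},
}

def continue_text(prompt, steps=2):
    """Extend the prompt by sampling each next word from the model's distribution."""
    words = list(prompt)
    for _ in range(steps):
        context = tuple(words[-2:])
        candidates = NEXT_WORD.get(context)
        if not candidates:
            break
        choices, weights = zip(*candidates.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

# The same prompt can come back with different, equally 'plausible' completions,
# because each run samples afresh rather than looking up a single fixed fact.
for _ in range(3):
    print(continue_text(("peer", "review")))
```

The point of the sketch is simply that plausibility, not truth, drives the output: every completion the toy model produces is well-formed, and none of them is checked against anything.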


These issues are likely to present differently in different disciplines. An AI attempting a maths proof may produce something which looks highly plausible, but it is unlikely (without more refinement at least) to be error-free. It is a relatively simple task for a peer reviewer to spot the issue – or (ironically) for another technological tool, like a theorem checker, to do so. However, the point is still that editors and publishers will need to rely even more on author responsibility and good peer review to pick such problems up. And as a recent blog article has suggested, we cannot always rely on peer review to identify flaws that are based on results rather than method. Given the somewhat ‘generic, hedged, inoffensive’ responses so far generated by ChatGPT, according to one contributor to Scholarly Kitchen, it will be interesting to see whether it performs more plausibly as an author in the more subjective arts, humanities, and social sciences subjects. Several users have pointed to examples where ChatGPT has misattributed or fabricated citations, suggesting that its training data leans towards a fairly free and easy attitude to plagiarism. On the other hand, as it learns on more refined sets of data it may become more creative: COPE Council member Matt Hodgkinson has already found an example where Bloom returned the apparently novel and very human assertion that one of the potential uses for machine learning in peer review was ‘predicting “meh” from noteworthy’. While it went on to amble off-topic, that one phrase certainly suggests that it is in touch with a certain zeitgeist.

‘Fluent but flaky’ as the bots might be for now, we are, essentially, faced with a computer program which might pass the Turing test. Some will rejoice in this, the technological innovation it represents, and the time it frees up for human authors to engage in more scholarship (or complete more peer reviews). However, other people will find it deeply unsettling, and not only because it suggests that credit for work done is no longer the moral cornerstone we all hope for. While many of us are used to websites and subscription services suggesting books and TV programs based on what we’ve read or watched before, few of us like the idea that our creativity and analytical power can be matched by a piece of software – or that we can be duped into thinking we are engaging with a human when we are not. On the other hand, we could treat LLMs as a tool to spark greater human creativity and achievement. In 2016 DeepMind’s AlphaGo came up with a move in the notoriously difficult game of Go which was so unlikely that it looked like an error. In fact it was a winning move (move 37, to be precise) – but one never spotted by humans before. The end result was that the bot made human players better. It’s also worth noting that AlphaGo had no idea it was a stunningly novel way to play. Whether that is reassuring or alarming will depend on your attitude to AI assistance for humans.


So what does all this mean for publication ethics and for that question of authorship? All of us are working to ensure that the scholarly record can be trusted; that it is accountable and gives credit for work that is reported responsibly and with regard for ethical practices. We want to promote an industry where publishers, editors, scholars and research institutions all support the trustworthy dissemination of research that is reported fairly. In that world, definitions like authorship matter. In the cases our members bring to us a lot hinges on what we mean by terms like ‘submitted’, ‘published’ and ‘authored’. It matters who (and what) an author is, and whether they can answer for the ethics and trustworthiness of their work. A bot – however well trained, and with whatever degree of clarity is brought by distance from the messy, human experience of research, planning and writing – cannot understand what it writes. Put simply, it cannot be responsible. As we saw already, the bots have been trained to say this explicitly.

At this moment in time AI looks like an amazing tool – when used ethically – for certain purposes. It is very likely already one which can’t be ignored. But there is a whole host of wider considerations which need to be thought through carefully on how and when it should be used in the scholarly literature, and that’s not even touching on the issues of potential bias and unsavoury material in its training data, which will, in turn, affect what it produces. It’s even possible that the AI-detection tools which are already in development by STM, Turnitin and the like (referenced here) could in the future be used to train the bots to write more authentically human language (let us hope that they also train it in ethical practices). But AI as legitimate authors? The world of publication ethics is already turning resolutely against that idea, and it is easy to see why. We are already starting to see that AI is a dancing bear – fascinating, but ultimately with no idea of the feat that it is performing. The more we explore, the more we may feel that, as far as scholarly work is concerned, it’s not even that good a dancer.

This article was written by a human.

Alysa Levene, COPE Operations Manager

With thanks to Rich Savage for creative input
