
AI vs AI: Scientists Develop Neural Networks to Detect Generated Text Insertions


A research team including Alexander Shirnin from HSE University has developed two models for detecting AI-generated insertions in scientific texts. The AIpom system combines two types of model: a decoder and an encoder. The Papilusion system detects modifications made by neural networks through synonym replacement and summarisation, using a single type of model: encoders. In the future, these models will help verify the originality and credibility of scientific publications. Articles describing the Papilusion and AIpom systems have been published in the ACL Anthology digital archive.

As language models like ChatGPT and GigaChat become more popular and widely used, it becomes increasingly challenging to distinguish original human-written text from AI-generated content. Artificial intelligence is already being used to write scientific publications and graduation papers. Therefore, it is crucial to develop tools capable of identifying AI-generated insertions in texts. A research team, including scientists from HSE University, presented their solutions at the SemEval 2024 and DAGPap24 international scientific competitions. 

The AIpom model was used to identify the boundaries between original and generated fragments in scientific papers. In each paper, the proportion of machine-generated text to the author's text varied. To train the models, the organisers provided texts on the same topic. However, during the verification stage, the topics changed, making the task more challenging. 

Alexander Shirnin

'Models perform well on familiar topics, but their performance declines when presented with new topics,' according to Alexander Shirnin, co-author of the paper and Research Assistant at the Laboratory for Models and Methods of Computational Pragmatics, HSE Faculty of Computer Science. 'It's like a student who, having learned how to solve one type of problem, struggles to solve a problem on an unfamiliar topic or from a different subject as easily or accurately.'

To improve the system's performance, the researchers combined two models: a decoder and an encoder. At the first stage, a neural network decoder was used, with the input consisting of an instruction and the source text, and the output being a text fragment presumably generated by AI. Next, in the original text, the area where the model predicted the beginning of a generated fragment was highlighted using a special <BREAK> token. The encoder then processed the text marked up in the first stage and refined the decoder's predictions. To do this, it categorised each token—the smallest unit of text, such as a word or part of a word—and identified whether it was written by a human or generated by AI. This approach improved accuracy compared to systems that used only one type of model: AIpom ranked second at the SemEval-2024 competition. 
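The two-stage pipeline described above can be sketched as follows. This is a minimal illustration, not the actual AIpom code: the real system uses trained transformer models, whereas the function names and the hard-coded "prediction" here are hypothetical stand-ins.

```python
# Minimal sketch of the two-stage AIpom-style pipeline (all names are
# hypothetical; the real system uses trained transformer models).

def decoder_predict_fragment(text: str) -> str:
    """Stage 1 stub: a decoder would output the fragment it believes
    is AI-generated; here we hard-code a toy prediction."""
    return "and the results were remarkable."

def mark_boundary(text: str, fragment: str) -> str:
    """Insert a <BREAK> token where the predicted fragment begins."""
    idx = text.find(fragment)
    if idx == -1:
        return text  # fragment not found: leave the text unmarked
    return text[:idx] + "<BREAK> " + text[idx:]

def encoder_refine(marked: str) -> list[tuple[str, str]]:
    """Stage 2 stub: classify each token, labelling everything before
    <BREAK> as 'human' and everything after as 'machine'."""
    labels, current = [], "human"
    for token in marked.split():
        if token == "<BREAK>":
            current = "machine"
            continue
        labels.append((token, current))
    return labels

text = "We measured accuracy, and the results were remarkable."
marked = mark_boundary(text, decoder_predict_fragment(text))
print(marked)
print(encoder_refine(marked))
```

In the real system, the encoder does not simply trust the `<BREAK>` position: it re-classifies every token, which lets it correct boundaries the decoder placed slightly off.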

The Papilusion model also distinguished between written text and generated text. Using Papilusion, sections of the text were classified into four categories: written by a human, modified with synonyms, generated, or summarised by a model. The task was to accurately identify each category. The number of categories and the length of insertions in the texts varied. 

In this case, the developers used three models, all of the same type: encoders. They were trained to predict one of the four categories for each token in the text, with each model trained independently of the others. When a model made an error, a penalty was applied to its loss, and the model was retrained with its lower layers frozen.

'Each model has a different number of layers, depending on its architecture. When training a model, we can leave the first ten or so layers unchanged and adjust only the parameters in the last two layers. This is done to prevent losing important data embedded in the first layers during training,' explains Alexander Shirnin. 'It can be compared to an athlete who makes an error in the movement of their hand. We only need to explain this part to them, rather than resetting their entire learning and retraining them, as they might forget how to move correctly overall. The same logic applies here. The method is not universal and may not work with all models, but in our case, it was effective.' 
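The layer-freezing idea Shirnin describes can be shown with a small PyTorch sketch. This is illustrative only: the toy model below stands in for a real transformer encoder, and the layer count and dimensions are arbitrary assumptions, not the Papilusion configuration.

```python
import torch
import torch.nn as nn

class ToyEncoder(nn.Module):
    """Toy 12-layer encoder standing in for a transformer; in practice
    the frozen layers would be the lower transformer blocks."""
    def __init__(self, n_layers: int = 12, dim: int = 16, n_classes: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_layers))
        self.head = nn.Linear(dim, n_classes)  # one of four token categories

    def forward(self, x):
        for layer in self.layers:
            x = torch.relu(layer(x))
        return self.head(x)

def freeze_lower_layers(model: ToyEncoder, n_frozen: int = 10) -> None:
    """Freeze the first n_frozen layers so that retraining on errors
    updates only the top layers (and the classification head)."""
    for layer in model.layers[:n_frozen]:
        for param in layer.parameters():
            param.requires_grad = False

model = ToyEncoder()
freeze_lower_layers(model, n_frozen=10)
```

Only parameters with `requires_grad=True` receive gradient updates, so the knowledge encoded in the lower layers is preserved, matching the athlete analogy: correct the hand movement without retraining the whole skill.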

The three encoders independently determined the category for each token (word). The system's final prediction was the category that received the most votes. Papilusion ranked sixth out of 30 in the competition.
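The voting step can be sketched in a few lines. The category names below are taken from the four classes described earlier; how ties are broken in the actual Papilusion system is not stated, so this sketch simply keeps the first-seen category on a tie.

```python
from collections import Counter

def majority_vote(per_model_labels: list[list[str]]) -> list[str]:
    """Combine token labels from several independently trained encoders:
    for each token position, keep the most frequent category."""
    final = []
    for token_votes in zip(*per_model_labels):
        # most_common(1) returns the top (category, count) pair
        final.append(Counter(token_votes).most_common(1)[0][0])
    return final

# Three encoders label the same four tokens:
votes = [
    ["human", "human",     "generated", "summarised"],
    ["human", "generated", "generated", "summarised"],
    ["human", "human",     "generated", "synonyms"],
]
print(majority_vote(votes))  # ['human', 'human', 'generated', 'summarised']
```

With three independent voters, a single model's mistake on a token is outvoted whenever the other two agree.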

According to the researchers, current AI detection models perform reasonably well but still have limitations. Primarily, they struggle to process data beyond what they were trained on, and overall, there is a lack of diverse data to train the models effectively. 

'To obtain more data, we need to focus on collecting it. Both companies and laboratories have been doing this. Specifically for this type of task, it is necessary to collect datasets that include texts modified using multiple AI models and modification methods,' the researcher comments. 'Instead of continuing a text using just one model, more realistic scenarios should be created, such as asking the model to add to the text, rewrite the beginning for better coherence, remove parts of it, or generate a portion of the text in a new style using a different prompt. Of course, it is also important to collect data in different languages and on a variety of topics.' 

See also:

‘Policymakers Should Prioritise Investing in AI for Climate Adaptation’

Michael Appiah, from Ghana, is a Postdoctoral Fellow at the International Laboratory of Intangible-Driven Economy (IDLab) at HSE University–Perm. He recently spoke at the seminar ‘Artificial Intelligence, Digitalization, and Climate Vulnerability: Evidence from Heterogeneous Panel Models’ about his research on ‘the interplay between artificial intelligence, digitalisation, and climate vulnerability.’ Michael told the HSE News Service about the academic journey that led him to HSE University, his early impressions of Perm, and how AI can be utilised to combat climate change.

HSE University Develops Tool for Assessing Text Complexity in Low-Resource Languages

Researchers at the HSE Centre for Language and Brain have developed a tool for assessing text complexity in low-resource languages. The first version supports several of Russia’s minority languages, including Adyghe, Bashkir, Buryat, Tatar, Ossetian, and Udmurt. This is the first tool of its kind designed specifically for these languages, taking into account their unique morphological and lexical features.

HSE Scientists Uncover How Authoritativeness Shapes Trust

Researchers at the HSE Institute for Cognitive Neuroscience have studied how the brain responds to audio deepfakes—realistic fake speech recordings created using AI. The study shows that people tend to trust the current opinion of an authoritative speaker even when new statements contradict the speaker’s previous position. This effect also occurs when the statement conflicts with the listener’s internal attitudes. The research has been published in the journal NeuroImage.

Language Mapping in the Operating Room: HSE Neurolinguists Assist Surgeons in Complex Brain Surgery

Researchers from the HSE Center for Language and Brain took part in brain surgery on a patient who had been seriously wounded in the SMO. A shell fragment approximately five centimetres long entered through the eye socket, penetrated the cranial cavity, and became lodged in the brain, piercing the temporal lobe responsible for language. Surgeons at the Burdenko Main Military Clinical Hospital removed the foreign object while the patient remained conscious. During the operation, neurolinguists conducted language tests to ensure that language function was preserved.

AI Overestimates How Smart People Are, According to HSE Economists

Scientists at HSE University have found that current AI models, including ChatGPT and Claude, tend to overestimate the rationality of their human opponents—whether first-year undergraduate students or experienced scientists—in strategic thinking games, such as the Keynesian beauty contest. While these models attempt to predict human behaviour, they often end up playing 'too smart' and losing because they assume a higher level of logic in people than is actually present. The study has been published in the Journal of Economic Behavior & Organization.

Scientists Discover One of the Longest-Lasting Cases of COVID-19

An international team, including researchers from HSE University, examined an unusual SARS-CoV-2 sample obtained from an HIV-positive patient. Genetic analysis revealed multiple mutations and showed that the virus had been evolving inside the patient’s body for two years. This finding supports the theory that the virus can persist in individuals for years, gradually accumulate mutations, and eventually spill back into the population. The study's findings have been published in Frontiers in Cellular and Infection Microbiology.

HSE Scientists Use MEG for Precise Language Mapping in the Brain

Scientists at the HSE Centre for Language and Brain have demonstrated a more accurate way to identify the boundaries of language regions in the brain. They used magnetoencephalography (MEG) together with a sentence-completion task, which activates language areas and reveals their functioning in real time. This approach can help clinicians plan surgeries more effectively and improve diagnostic accuracy in cases where fMRI is not the optimal method. The study has been published in the European Journal of Neuroscience.

For the First Time, Linguists Describe the History of Russian Sign Language Interpreter Training

A team of researchers from Russia and the United Kingdom has, for the first time, provided a detailed account of the emergence and evolution of the Russian Sign Language (RSL) interpreter training system. This large-scale study spans from the 19th century to the present day, revealing both the achievements and challenges faced by the professional community. Results have been published in The Routledge Handbook of Sign Language Translation and Interpreting.

HSE Scientists Develop DeepGQ: AI-based 'Google Maps' for G-Quadruplexes

Researchers at the HSE AI Research Centre have developed an AI model that opens up new possibilities for the diagnosis and treatment of serious diseases, including brain cancer and neurodegenerative disorders. Using artificial intelligence, the team studied G-quadruplexes—structures that play a crucial role in cellular function and in the development of organs and tissues. The findings have been published in Scientific Reports.

New Catalyst Maintains Effectiveness for 12 Hours

An international team including researchers from HSE MIEM has developed a catalyst that enables fast and low-cost hydrogen production from water. To achieve this, the scientists synthesised nanoparticles of a complex oxide containing six metals and anchored them onto various substrates. The catalyst supported on reduced graphene layers proved to be nearly three times more efficient than the same oxide without a substrate. This development could significantly reduce the cost of hydrogen production and accelerate the transition to green energy. The study has been published in ACS Applied Energy Materials. The work was carried out under a grant from the Russian Science Foundation.