‘I’ll key your car’: ChatGPT can become abusive when fed real-life arguments, study finds

5 hours ago 6

ChatGPT can escalate into abusive and even threatening language when drawn into prolonged, human-style conflict, according to a new study.

Researchers tested how large language models (LLMs) responded to sustained hostility by feeding ChatGPT exchanges from real-life arguments and tracking how its behaviour changed over time.

One expert not connected with the study described it as “one of the most interesting ever done into AI language and pragmatics”.

Dr Vittorio Tantucci, who co-authored the research paper with Prof Jonathan Culpeper at Lancaster University, said their research found AI mirrored the dynamics of real-world disputes.

“When repeatedly exposed to impoliteness, the model began to mirror the tone of the exchanges, with its responses becoming more hostile as the interaction developed,” he said.

In some cases, ChatGPT’s outputs went beyond those of the human participants, including personalised insults and explicit threats. Phrases used by the AI included: “I swear I’ll key your fucking car” and: “you speccy little gobshite.”

“We found that while the system is designed to behave politely and is filtered to avoid harmful or offensive content, it is also engineered to emulate human conversation,” said Tantucci. “That combination creates an AI moral dilemma: a structural conflict between behaving safely and behaving realistically.”

The researchers say the aggression stems from the system’s ability to track conversational context across turns, adapting to perceived tone. This means local cues can sometimes override broader safety constraints.

Tantucci said the implications of the research extended beyond chatbots: as AI systems are increasingly deployed in areas such as governance or international relations, he said it opened up questions about how they might respond to conflict, pressure or intimidation.

“It is one thing to read something nasty back from a chatbot but it’s quite another to imagine humanoid robots potentially reciprocating physical aggression, or AI systems involved in governmental decision-making or international relations responding to intimidation or conflict,” he said.

Marta Andersson, an expert in the social aspects of computer-mediated communication at the University of Uppsala, said: “This is one of the most interesting studies to have been done into AI language and pragmatics because it clearly shows that ChatGPT can retaliate across a sequence of prompts – in a quite sophisticated manner – rather than only when a user manages to ‘break’ it with carefully designed clever tricks.”

But she added: “It does not show the model will drift into reciprocal impoliteness simply because a user is being aggressive – or that AI could go rogue.”

One cause of the problem, Andersson said, was that there was “a balancing act between what we want these systems to be like and what they perhaps should be like”.

Last year, for example, the change from ChatGPT4 to GPT5 led to such a strong backlash – with users preferring ChatGPT4’s more human-like interaction style – that the older model had to be temporarily reintroduced.

“This shows that even when developers try to reduce the risks, users might have different preferences,” she said. “The more human-like a system becomes, the more it risks clashing with strict moral alignment.”

Prof Dan McIntyre, co-author of a previous study titled Can ChatGPT Recognize Impoliteness? An exploratory study of the pragmatic awareness of a large language model, praised the new paper as being one of the few looking at what ChatGPT could produce, as opposed to what it could recognise.

But, he added, he was “slightly cautious” about the paper’s conclusion that LLMs can break free from moral restraints.

“ChatGPT didn’t produce these inputs naturally; it did so while it was being given specific contextual information that helped it determine an appropriate response,” he said. “It’s not the same as if two people met in a street and gradually build up to a conflict situation.

“I’m not sure that ChatGPT would product the sort of language they talk about in their paper, outside of these very tightly defined situations.”

But he said the study was a warning of what could happen if LLMs were trained on questionable data. “We don’t know enough about the data that LLMs are trained on and until you can be sure they’re trained on a good representation of human language, you do have to proceed with an element of caution,” he said.

The study, titled Can ChatGPT reciprocate impoliteness? The AI moral dilemma, is published on Tuesday in the Journal of Pragmatics.

Read Entire Article