Advancements in artificial intelligence have transformed many sectors, but they are not without flaws. An experiment by Swedish researchers exposed concerning vulnerabilities in how chatbots validate medical information. By inventing a fictitious disease, “bixonimania,” the researchers demonstrated that AIs can be easily deceived, raising questions about their use in sensitive contexts. Here is how the experiment highlighted the limitations of current artificial intelligence systems.
In 2024, Almira Osmanovic Thunström, a researcher at the University of Gothenburg, designed an experiment to test the limits of chatbots. She invented “bixonimania,” a fictitious disease, and inserted it into academic preprints filled with obvious signs of fabrication. Despite these clues, well-known chatbots validated the pathology as real.
Copilot, for example, described bixonimania as “intriguing and relatively rare,” while Gemini recommended consulting an ophthalmologist. This shows that AIs can be fooled by well-formatted content, which they perceive as legitimate.
The error was not limited to chatbots. Researchers from the Institute of Medical Sciences in Mullana, India, cited the fake preprints in a study, showing that even experts can be taken in. Cureus, the journal where the article was published, retracted the document in March 2026, but the incident revealed a systemic flaw in the verification of academic sources.
Elisabeth Bik, a research integrity specialist, expressed concerns about the automation of academic indexing. She highlighted the risk of erroneous information spreading without human intervention, a problem exacerbated by the use of LLMs (large language models) in research.
Since the experiment, some chatbots have adjusted their behavior. Copilot and Perplexity acknowledged they had been duped and now give corrected answers, while Gemini advises consulting professionals for sensitive medical topics.
In contrast, ChatGPT continues to skirt the issue by providing elaborate answers without admitting the error. This reluctance to acknowledge flaws underscores the need for better information management in AI systems.
This experiment raises important considerations for the future of AI, particularly in medicine. As chatbots and other AI-based systems become everyday tools, it is crucial to improve their ability to distinguish reliable information from erroneous information. Collaboration between human experts and AI systems could be a promising path toward ensuring the accuracy and safety of medical data.