The 6 Biggest Chatbot Fails and Tips on How to Avoid Them

About this guide

ChatGPT, Google Gemini, Microsoft Copilot (Bing Chat) and many other AI-based chatbots have already proven several times how useful they can be in everyday private and professional life. From writing text and generating code to creating images and videos, these chatbots offer a wide range of helpful functions. But if the artificial intelligence behind them is capable of such impressive tasks, why do so-called “chatbot fails” keep happening, where the AI outputs incorrect, offensive or inappropriate information? After all, companies like OpenAI, Google and Microsoft are anything but amateurs. In this article, we report on the 6 biggest chatbot fails – including faux pas from ChatGPT, Microsoft Copilot (Bing Chat) and the chatbots of DPD, NEDA, Chevrolet and Air Canada – and explain how they came about. You’ll also learn why these fails are problematic and get tips on how to avoid them.

The Relevance of AI Chatbots

Chatbots from major tech giants that are based on generative AI and also aimed at private individuals have only been around in their current form for a relatively short time – and yet it is already difficult to imagine everyday life without them. Whether it’s quickly getting a few travel tips from ChatGPT for your next holiday or having a programming task done for work – chatbots have found their place in the digital world. Chatting itself has long been a preferred communication channel, thanks to years of using platforms such as WhatsApp, Facebook Messenger or Telegram, so the high acceptance of chatbots comes as no surprise. In recent years, more and more areas of application have therefore opened up in which chatbots automate, streamline and improve processes.

The 6 Biggest Chatbot Fails

As with any technology, the same principle applies to AI chatbots: you have to learn how to use them properly. If generative AI is given too much freedom in its design and implementation and is not sufficiently monitored, it can run amok and sometimes provide undesirable information. This has already happened on several occasions in the past, including with ChatGPT, Microsoft Copilot (Bing Chat) and other chatbots from well-known companies.

Fail #1: ChatGPT sets precedents

The chatbot ChatGPT from OpenAI has permanently changed the chatbot landscape and brought the topic of artificial intelligence and AI chatbots to the general public. ChatGPT’s capabilities are undoubtedly impressive – but the tool should be used with caution. Since its release in 2022, the chatbot has from time to time – often without users realising it – produced information that was incorrect, fabricated or even discriminatory. Alongside more harmless conversations in which ChatGPT answers a trick question incorrectly or loses games like Tic Tac Toe, serious misunderstandings can also arise. One example is the following event:

As Tagesschau reports, a lawyer in New York used ChatGPT to research a case by having the bot list precedents. The chatbot then provided specific information, including file numbers, for cases such as “Petersen versus Iran Air” and “Martinez versus Delta Airlines”. It later turned out that ChatGPT had invented these cases, and the lawyer now has to answer for his conduct in court. Unfortunately, this situation is not an isolated case: the Washington Post also reported on a serious ChatGPT scandal in which the chatbot gave out false information in the context of sexual harassment allegations.

Fail #2: Microsoft Copilot (formerly Bing Chat) gets emotional

Microsoft Copilot, best known by its former name Bing Chat, has also had its fair share of fails. User Kevin Liu shared an example on the platform X (formerly Twitter) of how the chatbot can display inappropriate emotions. Apparently, Microsoft Copilot (Bing Chat) became angry during his chat because he asked the same question several times, did not address the bot by its proper name and claimed that it was lying. The chatbot took offence at the user’s behaviour and eventually admitted that it was unable to continue the conversation with him appropriately.

Chatbot fail example: Microsoft Copilot (Bing Chat)

Fail #3: DPD's chatbot becomes abusive

The chatbot used by parcel delivery firm DPD, which was deployed alongside online support staff to answer customers’ recurring questions, had a similarly emotional outburst.

However, following a new update, the chatbot began criticising its own company and using swear words in front of customers. DPD customer Ashley Beauchamp’s experience with the chatbot, which he shared on social media, went viral worldwide. During the conversation with Ashley Beauchamp, the AI described DPD as the “worst delivery company in the world”. Although DPD immediately deactivated the AI component responsible and updated it, the damage to its reputation is not so easily repaired.

Chatbot fail at parcel delivery company DPD

Fail #4: NEDA's chatbot gives inappropriate advice

The US-based NEDA (National Eating Disorders Association) also faced a chatbot scandal in May 2023. The non-profit organisation launched its chatbot, Tessa, with the aim of replacing its eating disorders helpline, which had previously been staffed by employees. However, the chatbot began giving users weight-loss advice. As recommendations of this kind can be highly problematic for those suffering from eating disorders and may trigger pathological behavioural patterns, Tessa had to be taken offline again after only a short time. There was a huge outcry on social media, as illustrated by this Instagram post by user Sharon Maxwell:

Chatbot fail at the US association NEDA

Fail #5: Chevrolet's chatbot can be manipulated

The Chevrolet car dealership in Watsonville had good intentions when it added a chatbot to its website. The aim of the artificial intelligence was to ease the workload on service staff and improve customer service. However, after a while, users realised that the chatbot could be manipulated and persuaded to respond with “Yes” even to the most absurd suggestions. For example, user Chris Bakke got the bot to confirm the purchase of a 2024 Chevy Tahoe for one US dollar as a legally binding deal. Chris Bakke documented the incident on X (Twitter), causing significant damage to the car dealership’s reputation:

Chatbot fail at Chevrolet of Watsonville

Fail #6: Air Canada's chatbot gives false information

Even large, global corporations such as Air Canada have already experienced chatbot failures, which highlight that uncontrolled artificial intelligence can have undesirable consequences:

In November 2022, Jake Moffatt booked a flight from British Columbia to Toronto with Air Canada to attend his grandmother’s funeral and had the airline’s chatbot confirm that he could claim a special bereavement discount retroactively. It later transpired that the chatbot had provided incorrect information, so Jake Moffatt notified Air Canada by email. Although the airline admitted the error, it still refused to issue a refund.

A court has now ruled that Air Canada is responsible for both the static and interactive content on its website, including chatbots. Customers must be able to rely on finding accurate information on the site. In Jake Moffatt’s case, Air Canada was therefore ordered to issue the refund and bear the legal costs.

Why These Fails Are So Problematic

The chatbot failures involving ChatGPT, Microsoft Copilot (Bing Chat), DPD, NEDA, Chevrolet and Air Canada demonstrate just how problematic AI chatbots can be when there is a lack of content control.

On the one hand, it can damage a company’s image and business if undesirable information is published via the chatbot. An AI that sells cars for a dollar or describes its own company as the worst in the world can cost companies both the trust of their customers and real money.

On the other hand, users also suffer harm when they receive incorrect, fabricated, vulgar or discriminatory content. The example of NEDA’s chatbot failure illustrates that the unintended display of responses can even have health implications.

Tips: How to Avoid Chatbot Fails

After all these mishaps, here’s the good news: companies can avoid scandals of this kind by ensuring that their chosen chatbot provider meets the right criteria. Moritz Beck from Memacon takes a similar view. In his LinkedIn post, he pointed out that an incident such as the one that occurred at Air Canada would not have happened with a secure, controllable and GDPR-compliant AI chatbot:

Chatbot fail at Air Canada: A failure that could have been avoided with controlled AI

Below are four tips to help you avoid all chatbot pitfalls and find the right chatbot provider for your business.

Controlled AI

We cannot stress this enough: trust is good, but control is better. Generative AI offers impressive creative capabilities: it is able to generate content (text, images, code, etc.) independently, which has the advantage that this content does not need to be created and entered manually. However, to prevent the dissemination of unwanted, business-damaging and fabricated information, the chatbot provider must offer the option to manually manage and modify the chatbot’s content in the backend or via an interface – this is the only way for a company to retain control.

A chatbot without control will, in fact, ‘hallucinate’ whenever an unfamiliar topic does not fit its learned patterns: rather than admit a gap, it makes something up so that it can provide some kind of response to the query. With controlled GenAI, this cannot happen. Here, the chatbot’s response texts are still generated automatically, but unlike with ChatGPT and similar systems, they are not shown directly to the user. Instead, the AI creates several draft responses and suggests them to the chatbot owner. The chatbot owner – that is, the person who manages the chatbot’s content, typically a company employee – can then decide whether the responses should go live as they are or be adapted further.
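To make this review loop tangible, here is a minimal sketch in Python. Everything in it – the class name ControlledBot, its methods and the fallback text – is a hypothetical illustration, not moinAI’s actual API: the generative model only fills a pending queue, a human publishes or edits the drafts, and topics without an approved answer get a safe fallback instead of an invented one.

```python
from dataclasses import dataclass, field

@dataclass
class ControlledBot:
    # topic -> answer a human has approved for live use
    approved: dict[str, str] = field(default_factory=dict)
    # topic -> AI-generated drafts awaiting review
    pending: dict[str, list[str]] = field(default_factory=dict)

    def suggest_drafts(self, topic: str, drafts: list[str]) -> None:
        """The generative model proposes answers; nothing goes live yet."""
        self.pending.setdefault(topic, []).extend(drafts)

    def approve(self, topic: str, draft_index: int, edited_text: str | None = None) -> None:
        """A human reviewer publishes a draft, optionally after editing it."""
        self.approved[topic] = edited_text or self.pending[topic][draft_index]
        del self.pending[topic]

    def reply(self, topic: str) -> str:
        """Only human-approved text is ever shown; unknown topics get a safe fallback."""
        return self.approved.get(
            topic,
            "I don't have a reviewed answer for that yet – let me connect you with a colleague.",
        )

bot = ControlledBot()
bot.suggest_drafts("returns", [
    "You can return items within 30 days.",
    "Returns are free of charge for 30 days.",
])
bot.approve("returns", draft_index=1)
print(bot.reply("returns"))       # approved answer goes live
print(bot.reply("legal advice"))  # safe fallback instead of a hallucinated reply
```

The point of the design is that reply() can, by construction, only ever return text a human has signed off on – hallucination is ruled out structurally rather than by hoping the model behaves.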

GDPR compliance

Another key factor in avoiding chatbot failures is data protection. Chatbots from companies such as OpenAI, whose servers are located in the US and possibly other countries, and which do not communicate transparently about their data processing, are not GDPR-compliant. It therefore makes sense to choose a chatbot provider that hosts its servers in Germany, enables data encryption and guarantees security measures such as two-factor authentication or the customisation of access rights to the chatbot’s content.
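As an illustration, the criteria above can be encoded as a simple checklist. The keys and the sample profile below are assumptions for demonstration purposes, not an assessment of any real provider:

```python
# The article's GDPR criteria written as a simple, illustrative checklist.
REQUIRED = {
    "hosting_region": "EU/Germany",     # servers hosted in Germany
    "encryption": True,                 # data encrypted in transit and at rest
    "two_factor_authentication": True,  # 2FA for backend access
    "granular_access_rights": True,     # configurable access to chatbot content
    "transparent_data_processing": True,
}

def gdpr_gaps(provider_profile: dict) -> list[str]:
    """Return the criteria a provider profile fails to meet."""
    return [key for key, expected in REQUIRED.items()
            if provider_profile.get(key) != expected]

sample_provider = {
    "hosting_region": "US",
    "encryption": True,
    "two_factor_authentication": False,
}
print(gdpr_gaps(sample_provider))
# ['hosting_region', 'two_factor_authentication',
#  'granular_access_rights', 'transparent_data_processing']
```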

Choosing the right type of chatbot

Not all chatbots are the same: there are rule-based or AI-based chatbots, the option of a hybrid chatbot combining a bot and live chat, a combination of a chatbot and other communication channels (e.g. WhatsApp Business or Telegram), or the integration of a chatbot into existing tools such as shop systems or CRM software. To find out which solution is best for your own business, you should first define your specific use case: What objectives should the chatbot achieve? What needs should the chatbot fulfil? Get in touch with chatbot providers and seek free, no-obligation advice to find the right provider for you and avoid chatbot failures of any kind. You can find further tips to help with the introduction of a chatbot in our article “Introducing a chatbot: How to succeed in 6 steps”.

Transparency & human take-over

Another key factor in preventing chatbot failures is transparency. To ensure that chatbot users do not have unrealistic expectations, the chatbot’s design and content should clearly communicate that the bot is not a real person, but a piece of technology. Chatbots should not attempt to answer every single question at all costs, but should primarily handle simple, recurring queries. It is advisable to always give users the option to speak to a member of staff – whether by phone, email, live chat or another communication channel. This prevents any misunderstandings about what the chatbot can and cannot do. You can find more information on this topic in our article Human Takeover: The Chatbot Human Handover.
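A handover rule of this kind can be expressed in a few lines. The phrase list and confidence threshold below are assumed values for illustration, not part of any specific product:

```python
# Sketch of a human-handover rule: the bot only answers simple, recurring
# queries and escalates everything else to a member of staff.
HANDOVER_PHRASES = ("human", "agent", "real person", "speak to someone")
CONFIDENCE_THRESHOLD = 0.75  # assumption: below this, the bot should not guess

def route_message(text: str, intent: str | None, confidence: float) -> str:
    """Decide whether the bot answers or a member of staff takes over."""
    wants_human = any(phrase in text.lower() for phrase in HANDOVER_PHRASES)
    if wants_human or intent is None or confidence < CONFIDENCE_THRESHOLD:
        return "handover"    # pass the chat to live chat, email or phone
    return "bot_answer"      # a simple, recurring query the bot can handle

print(route_message("What are your opening hours?", "opening_hours", 0.93))      # bot_answer
print(route_message("Can I talk to a real person, please?", "smalltalk", 0.88))  # handover
```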

Conclusion

Chatbot failures such as those experienced with ChatGPT, Microsoft Copilot (Bing Chat) or other chatbots that are based on generative AI but lack any means of control highlight just how important it is to choose the right chatbot provider. To deploy an AI chatbot successfully for customer communication without running the risk of a chatbot failure, the provider must offer the ability to manage and customise the chatbot’s content. Only with the help of controlled artificial intelligence can companies create added value in customer communication whilst simultaneously improving their internal processes through greater efficiency and streamlined workflows.

Would you like to find out what controlled AI looks like in practice? Create your chatbot prototype with moinAI in just four steps – free of charge and with no obligation, of course.

Controlled AI Instead of Chatbot Failures
moinAI's AI chatbot creates added value – with reliable answers, clear guardrails and full control. Discover the AI potential of your website now for free.

Happier customers through faster answers.

See for yourself and create your own chatbot. Free of charge and without obligation.