Large Language Models (LLMs) such as ChatGPT use vast amounts of text data to learn how to generate human-like text. You can think of this as a model studying for a big test. It crams in as much knowledge from books, articles, and web pages as it can. But, unlike us, it doesn't actually understand what it's learning. It doesn't have thoughts or experiences. Instead, it learns many complex patterns.
When you ask it a question, it uses all the information it has learned to come up with a response. It generates this response based on probability. The model selects the words and phrases that it calculates are most likely to follow the previous ones. Most of the time this gives incredibly accurate results.
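To make the idea of choosing words by probability concrete, here is a minimal sketch in Python. The word list and the probabilities are invented for illustration and do not come from a real model; the function names are hypothetical too. A real LLM does the same kind of thing over tens of thousands of possible tokens.

```python
import random

# Hypothetical probabilities for the word that follows "The capital of France is".
# These numbers are invented for illustration, not taken from a real model.
next_word_probs = {"Paris": 0.90, "Lyon": 0.06, "London": 0.04}


def most_likely_word(probs):
    """Return the single most probable next word (greedy selection)."""
    return max(probs, key=probs.get)


def sample_word(probs, rng):
    """Draw a next word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable
print(most_likely_word(next_word_probs))
print(sample_word(next_word_probs, rng))
```

Greedy selection always returns the top-ranked word, while weighted sampling will occasionally return one of the less likely words. In this toy example that is one way a fluent but wrong answer can appear.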
Sometimes, models can generate information that seems plausible but isn't true; this is a hallucination. The LLM starts making up facts that sound believable but are not actually correct.
LLMs have become so good that we can let our guard down; we have come to rely on them too much. It becomes very easy to copy and paste without checking and miss a hallucination.
When we use LLMs as part of our services, our customers expect our responses to be true. This is especially critical in education.
Although there is no watertight way to avoid hallucinations altogether, we can manage the risk. All technology and all employees get it wrong sometimes. If we put enough safeguards in place, the risk becomes so low that the benefits of our service outweigh the risk by a large factor.
1. Make sure the students know they are working with an LLM. We never pretend LLM answers are from a human.
2. Provide reference links to relevant material so students can easily cross-check answers.
3. We provide the LLM with lots of relevant material to work with along with the student's questions. LLMs are much less likely to hallucinate when they have the context they need.
4. Through prompt engineering we instruct the LLMs we use to be very conservative.
5. LLMs use probability to predict answers. We control the parameters of this probability to minimise the risk of hallucinations.
6. Human in the loop. Tutello combines humans and AI; human tutors flag and correct hallucinations.
7. Data analysis. We analyse answers to flag potential hallucinations and improve our system.
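Point 5 can be illustrated with temperature, one widely used sampling parameter. This is only a sketch: the list above does not say which parameters are tuned, so the choice of temperature here is an assumption, and the scores and function name are hypothetical. Dividing the model's raw scores by a temperature below 1 concentrates probability on the top-ranked word, which makes the output more predictable.

```python
import math


def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores.values()]
    biggest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - biggest) for s in scaled]
    total = sum(exps)
    return {word: e / total for word, e in zip(scores, exps)}


# Hypothetical raw scores (logits) for three candidate words.
scores = {"Paris": 4.0, "Lyon": 2.0, "London": 1.0}

for t in (1.0, 0.2):
    probs = softmax_with_temperature(scores, t)
    print(t, {w: round(p, 3) for w, p in probs.items()})
```

At the lower temperature nearly all of the probability lands on the top-ranked word, so the less likely alternatives are rarely chosen. A low temperature does not make the top-ranked word correct, which is why it is used alongside the other safeguards rather than instead of them.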
If you would like to discuss hallucinations and how we manage them, please get in touch.