The era of Large Language Models (LLMs) has transformed the landscape of artificial intelligence, especially with the emergence of ChatGPT, which has amazed users with its conversational and memory capabilities.
However, despite being trained on vast corpora of data, these models struggle to answer questions about specific domains or recent events, because their knowledge is fixed at training time.
This is where the concept of Retrieval Augmented Generation (RAG) comes into play.
Fine-tuning and RAG are two powerful techniques for specializing an LLM, but they operate differently. Fine-tuning adjusts the model's parameters directly to make it more specialized, whereas RAG leaves the model untouched and instead enriches each question posed to the LLM with relevant context retrieved from external documents. RAG therefore overcomes the limitations mentioned above without any retraining.
RAG combines two main steps: retrieval and text generation. Relevant passages are retrieved from a document base and added to the question before it is sent to the LLM, which increases the accuracy and relevance of its responses.
Consider a concrete example illustrating how RAG works. In the diagram below, a user asks a question about our company, Aqsone. Since the LLM was not trained on data related to the company, it either cannot answer or produces a hallucinated response.
In the second diagram, RAG is employed. The system is given a set of relevant documents to search for information related to the question. Using the similarity between the question and the documents, it identifies the document(s) likely to contain the elements needed to answer (Retrieval). The question, enriched with these contextual elements, is then submitted to the LLM, which uses them to generate the answer (Generation).
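To make the retrieval and generation steps more concrete, here is a minimal, self-contained Python sketch. The document texts, the word-overlap similarity measure, and the helper names (`similarity`, `retrieve`, `build_prompt`) are illustrative assumptions, not Aqsone's actual implementation; a production system would use an embedding model and a vector database instead of word counts.

```python
from collections import Counter
import math

# Toy document base; in practice these would be the company's internal documents.
documents = [
    "Aqsone is a data science consulting company.",
    "The leave request procedure is described in the HR handbook.",
    "VPN access requires the corporate certificate to be installed.",
]

def similarity(a: str, b: str) -> float:
    """Cosine similarity over word counts, standing in for embedding similarity."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, k: int = 1) -> list[str]:
    """Retrieval: return the k documents most similar to the question."""
    return sorted(documents, key=lambda d: similarity(question, d), reverse=True)[:k]

def build_prompt(question: str, context: list[str]) -> str:
    """Generation input: the question augmented with the retrieved context."""
    return ("Context:\n" + "\n".join(context)
            + f"\n\nQuestion: {question}\nAnswer using only the context above.")

question = "What does Aqsone do?"
print(build_prompt(question, retrieve(question)))  # this augmented prompt is sent to the LLM
```

The key design point is that the LLM itself is never modified: specialization comes entirely from the context placed in the prompt at query time.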
To delve further into the technique behind RAG, consider the following diagram and steps:
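Most RAG pipelines follow the same sequence of preparation and querying steps: split the documents into chunks, embed each chunk, store the vectors in an index, and search that index at query time. The sketch below assumes this standard sequence, which may differ in detail from the diagram; the `embed` function is a toy placeholder for a real embedding model.

```python
import math
from dataclasses import dataclass

def embed(text: str) -> list[float]:
    """Toy placeholder: a real pipeline would call an embedding model here."""
    vec = [0.0] * 64
    for word in text.lower().split():
        vec[hash(word) % 64] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

@dataclass
class Chunk:
    text: str
    vector: list[float]

# Step 1: split each document into fixed-size chunks (40 words here for brevity).
def chunk_documents(docs: list[str], size: int = 40) -> list[str]:
    chunks = []
    for doc in docs:
        words = doc.split()
        chunks += [" ".join(words[i:i + size]) for i in range(0, len(words), size)]
    return chunks

# Step 2: embed every chunk and store it; a plain list stands in for a vector database.
def build_index(docs: list[str]) -> list[Chunk]:
    return [Chunk(text, embed(text)) for text in chunk_documents(docs)]

# Step 3: at query time, embed the question and return the most similar chunks;
# they are then injected into the prompt before calling the LLM (step 4, generation).
def search(index: list[Chunk], question: str, k: int = 2) -> list[str]:
    query_vec = embed(question)
    ranked = sorted(index, key=lambda c: cosine(c.vector, query_vec), reverse=True)
    return [c.text for c in ranked[:k]]

index = build_index(["Aqsone supports its clients on data science and AI projects."])
print(search(index, "What does Aqsone do?", k=1))
```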
RAG can be optimized using various advanced techniques, such as:
Retrieval Augmented Generation (RAG) has powerful applications across various domains. Here are a few concrete examples where RAG can provide significant added value:
Description: IT support chatbots are designed to answer frequently asked questions and resolve user issues, drawing on a knowledge base built from past questions, answers, and tickets.
Example: An employee attempts to connect to a secure network but fails. They ask the IT support chatbot what to do. The RAG system searches past tickets and FAQs, finds a documented solution for this type of problem, and provides clear steps to resolve the issue.
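To illustrate this scenario in code, here is a minimal self-contained sketch; the ticket and FAQ texts and the naive keyword-matching retrieval are invented for the example (a real system would use the embedding-based retrieval described earlier).

```python
# Invented examples of past tickets and FAQ entries forming the chatbot's knowledge base.
knowledge_base = [
    "Ticket #512: user unable to connect to the secure network. Solution: reinstall the corporate certificate.",
    "FAQ: to reset a password, open the self-service portal and follow the 'Forgot password' link.",
]

question = "I cannot connect to the secure network, what should I do?"

# Retrieval: pick the entry sharing the most words with the question (naive keyword match).
best = max(knowledge_base, key=lambda doc: sum(w in doc.lower() for w in question.lower().split()))

# Generation input: the retrieved ticket is placed in the prompt before calling the LLM.
prompt = f"Context:\n{best}\n\nQuestion: {question}\nAnswer using only the context above."
print(prompt)
```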
Description: Onboarding chatbots assist new employees by providing the information and resources needed for their integration into the company, based on internal documentation.
Example: A new employee asks how to submit a leave request. The chatbot retrieves the relevant segments from the HR documentation that describe the leave request procedure and provides a detailed response with step-by-step instructions.
Discover our use case for the HR Assistant Chatbot.
Description: Maintenance support chatbots help maintenance teams by providing rapid diagnostics for equipment malfunctions, drawing on sensor data and the history of similar past failures.
Example: A technician observes abnormal vibrations in a machine and consults the chatbot, which identifies a potential bearing issue based on past cases. The chatbot then provides diagnostic steps and repair recommendations, helping to resolve the problem more efficiently.
Description: Contract analysis chatbots help users examine and understand contract clauses, and can also flag potentially unfair clauses.
Example: A user submits a contract to the chatbot and asks whether any clauses might be unfair. The system analyzes the document, flags specific clauses such as excessive late-payment penalties or unusual termination conditions, and explains the risks associated with them.
Retrieval Augmented Generation represents a significant advancement in the field of AI, offering effective solutions to the limitations of traditional LLMs. By combining retrieval and generation, RAG enhances the accuracy, relevance, and specialization of responses, making interactions with AI systems more useful and pertinent.