A strategic project within Area 3 on the application of Large Language Models (LLMs) in an industrial context has produced a first prototype, which was presented at the 8th Pro²Future Partner Conference on September 13, 2023.
The application, AERIALL, extends the knowledge of LLMs, which are trained on large amounts of general text, with specific, customized documents. Users upload documents and then ask questions about their content through a chat interface. The generated answers include references to the pages of the document where the relevant text passages were found. This approach, called Retrieval-Augmented Text Generation, aims to reduce hallucination and to make answers verifiable, so that their correctness can be checked against the source.
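The retrieval step described above can be illustrated with a minimal sketch. It is not AERIALL's implementation: the function names, the sample pages, and the prompt wording are hypothetical, and plain word-overlap cosine similarity stands in for the semantic embeddings a real system would use. The call to the language model itself is left out; the sketch only shows how pages are selected and how page references can be carried along with the answer context.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(pages, question, k=2):
    """Return the numbers of up to k pages that best match the question.

    `pages` maps a page number to the text of that page (assumed format).
    Pages that share no word with the question are never returned.
    """
    query = Counter(tokenize(question))
    scored = []
    for page_no, text in pages.items():
        score = cosine(Counter(tokenize(text)), query)
        if score > 0:
            scored.append((score, page_no))
    # Highest score first; ties are broken by the lower page number.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [page_no for _, page_no in scored[:k]]


def build_prompt(pages, question, k=2):
    """Build a prompt that quotes the retrieved pages with their numbers.

    Returns the prompt text and the list of page numbers used, so the
    answer can be shown together with its source references.
    """
    hits = retrieve(pages, question, k)
    context = "\n".join(f"[page {n}] {pages[n]}" for n in hits)
    prompt = (
        "Answer the question using only the context below and cite the "
        "page numbers you relied on.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    return prompt, hits


if __name__ == "__main__":
    # Hypothetical sample document, one entry per page.
    sample_pages = {
        1: "The pump must be serviced every six months.",
        2: "Safety goggles are required in the welding area.",
        3: "Invoices are archived for seven years.",
    }
    prompt, sources = build_prompt(sample_pages, "How often must the pump be serviced?")
    print(prompt)
    print("Sources: pages", sources)
```

Returning the page numbers alongside the prompt is what allows a chat interface to display source references next to each answer, independently of what the model itself writes.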
Another advantage over well-known commercial LLM services such as ChatGPT is that the language model we adapted was released as part of an open-source project and can be operated entirely locally on self-provided infrastructure. This prevents the often unwanted sharing of sensitive documents and chat logs with third parties.
The range of possible applications is broad: wherever text documents are involved, AERIALL can serve as a supporting tool. Information within organizations is widely distributed, and searching for relevant content is often a challenge. AERIALL enables semantic search in documents, providing efficient access to extensive sources of information.
The prospects for further development of this concept are promising. Currently, we see topics such as Prompt Recommendation and Reinforcement Learning from Human Feedback (RLHF) as potential next steps to further improve the application.
A video demonstrating the functionality and some examples can be found on our YouTube channel.