It's About Time: Incorporating Temporality in Retrieval Augmented Language Models
The web serves as a global repository of knowledge, used by billions of people to search for information. Ensuring that users receive the most relevant and up-to-date information, especially in the presence of multiple versions of web content from different time points, remains a critical challenge for information retrieval. This challenge has recently been compounded by the increased use of question answering tools trained on Wikipedia or web content and powered by large language models (LLMs) [13], which have been found to make up information (hallucinate) and have also been shown to struggle with the temporal dimensions of information. Even Retriever Augmented Language Models (RALMs), which incorporate a document database to reduce LLM hallucination, are unable to handle temporal queries correctly. As a result, a RALM responding to a query such as "Who won the Wimbledon Championship?" retrieves document passages related to Wimbledon but cannot differentiate between them based on how recent they are.
Introduction. The web serves as an ever-expanding reservoir of real-world knowledge, with textual documents constituting a significant fraction of its content. Information also changes over time, leading to updates to existing documents or the addition of new ones. As a result, multiple versions of the same information, from various time frames, co-exist and accumulate over time. A major challenge in information retrieval is ensuring that users get access to the most relevant and up-to-date knowledge at any time. This challenge has recently been compounded by the increased use of question answering tools powered by large language models (LLMs), which have gained popularity as a result of the release of ChatGPT [13]. LLMs have been shown to absorb and serve immense quantities of information from textual data [14]. This information is typically derived from a static snapshot of a large number of documents scraped from the web at a specific point in time. However, real-world information changes continuously, frequently on a daily, hourly or even real-time basis.
Discussion / Conclusion. In this study, we introduced and evaluated TempRALM, a Retriever Augmented Language Model (RALM) augmented with temporal awareness. Unlike conventional RALM approaches that rely solely on semantic similarity, TempRALM considers both semantic and temporal relevance when selecting documents to pass to its Large Language Model (LLM) in response to a given query. Our results indicate an improvement in performance of up to 74% compared to the Atlas-large model, even when multiple versions of documents (from different time points) are present in the document index. Notably, we achieve this without model pre-training, without replacing the document index with an updated index, and without adding any other computationally intensive elements. We plan to explore a number of avenues for building on the work presented in this paper, such as implementing and evaluating different learning strategies for the parameters of our temporal relevance function, and exploring the interplay between the retriever and the LLM.
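To make the idea of combining semantic and temporal relevance concrete, the following is a minimal illustrative sketch and not the TempRALM implementation. Everything in it is an assumption made for illustration: the `Passage` type, the linear penalty in `temporal_score`, the `scale_days` and `weight` parameters, the additive combination, and the rule that passages dated after the query are excluded. The semantic scores are hard-coded stand-ins for whatever a dense retriever would produce.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Passage:
    """Hypothetical retrieved passage with a timestamp and a retriever score."""
    text: str
    timestamp: date
    semantic_score: float  # assumed to come from a semantic retriever


def temporal_score(query_date: date, doc_date: date, scale_days: float = 365.0) -> float:
    """Assumed temporal relevance: 0 for a same-day passage, lower as the gap grows.

    Passages dated after the query are treated as unavailable (-inf).
    """
    gap_days = (query_date - doc_date).days
    if gap_days < 0:
        return float("-inf")
    return -gap_days / scale_days


def rank(passages, query_date, weight=1.0, top_k=2):
    """Rank passages by semantic score plus a weighted temporal score (assumed form)."""
    scored = []
    for p in passages:
        t = temporal_score(query_date, p.timestamp)
        if t == float("-inf"):
            continue  # skip passages from after the query date
        scored.append((p.semantic_score + weight * t, p))
    # Sort on the combined score only, highest first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]


# Four versions of a "Wimbledon final" passage with near-identical semantic scores.
passages = [
    Passage("Wimbledon 2019 final report", date(2019, 7, 14), 0.90),
    Passage("Wimbledon 2021 final report", date(2021, 7, 11), 0.88),
    Passage("Wimbledon 2022 final report", date(2022, 7, 10), 0.91),
    Passage("Wimbledon 2023 final report", date(2023, 7, 16), 0.89),
]

ranked = rank(passages, query_date=date(2022, 12, 1))
for p in ranked:
    print(p.text)
```

With semantic scores this close together, the temporal term decides the ordering: for a query dated 1 December 2022, the 2022 passage ranks first and the 2023 passage is excluded. Setting `weight=0.0` recovers purely semantic ranking, which is the behaviour the text above attributes to conventional RALMs.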