The number of generative AI solutions available to e-discovery professionals has exploded, and each uses large language models (LLMs) in a different way. From information retrieval and summarization to fact development and document analysis, it can be challenging to keep up with an ever-growing list of generative AI offerings. Yet it's important for teams to understand which solutions will have the most impact on their work so they can determine where to invest time and resources.
Throughout the last year, two particular implementations of generative AI have emerged and gained traction, offering great promise to the e-discovery industry.
- Document-by-document review: A “bottom-up” approach using generative AI for document review where documents are analyzed one at a time, with an individualized response or decision generated for each document.
- Chatbot: A “top-down” approach using generative AI to look at sets of documents to analyze information and generate a response to a question or task. It often includes analysis or summarization across multiple documents simultaneously.
Each approach leverages generative AI differently to analyze data. Let’s look at these two different uses and compare their applications, benefits, and limitations to better understand how and when they should be used by legal professionals.
How They Work
Bottom-Up: Document-by-Document Review
The bottom-up approach leverages LLMs to analyze, classify, and return results for each individual document selected, much like a human document reviewer would do.
The user of these applications can select a single document or hundreds of thousands of documents to analyze. No matter how many documents are selected by the user, the text of each document is presented to the LLM along with the prompt (a set of instructions). The AI then analyzes the text of one document at a time, returns a response, then moves on to the next, repeating that process for each document. Using this method, the AI’s response is based only on the text of that document. It does not remember, compare, or summarize information from other documents in creating its response.
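The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: `call_llm` is a hypothetical stand-in for a real LLM API call, stubbed here with a trivial keyword rule so the flow is runnable. The point is the structure: each document is sent with the prompt, analyzed on its own, and decided independently of every other document.

```python
# Minimal sketch of a bottom-up, document-by-document review loop.
# `call_llm` is a hypothetical stub standing in for a real LLM API call.

PROMPT = "Classify this document as RESPONSIVE or NON-RESPONSIVE."

def call_llm(prompt: str, document_text: str) -> str:
    # Hypothetical stub: a real implementation would call an LLM API here.
    # It sees only the prompt and this one document's text.
    return "RESPONSIVE" if "contract" in document_text.lower() else "NON-RESPONSIVE"

def review_documents(documents: dict[str, str]) -> dict[str, str]:
    """Analyze one document at a time; each decision uses only that document."""
    decisions = {}
    for doc_id, text in documents.items():
        decisions[doc_id] = call_llm(PROMPT, text)  # no memory of other documents
    return decisions

docs = {
    "DOC-001": "Attached is the signed contract for Company ABC.",
    "DOC-002": "Lunch plans for Friday?",
}
print(review_documents(docs))
```

Because every call is independent, the work parallelizes naturally, but the number of LLM calls grows one-for-one with the number of documents, which is the scalability trade-off discussed below.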
As you can see, this method most closely resembles the e-discovery document review processes used today. So, it's not surprising that the bottom-up approach is already being used in e-discovery for tasks such as:
- Classifying or coding documents as responsive or non-responsive
- Determining whether a document contains information about a particular issue
- Identifying the existence of privileged information within a document
- Extracting events and facts from a document
- Summarizing a document
Top-Down: Chatbots
Conversely, the top-down approach leverages LLMs to analyze a group of documents simultaneously. Unlike the bottom-up approach, which looks at every document individually, the top-down approach generates a response by looking across a set of documents.
LLMs have a limit on how much information they can analyze at one time, so to allow top-down solutions to analyze data sets containing more than a few dozen documents, most top-down applications add a step to first use a search and retrieval process to find a subset of documents that are likely relevant to a task, question, or search. Several different technologies are used for this search and retrieval process. Some applications use simple key terms derived from the prompt to find the subset of relevant documents. Others use a more complex technology like vector embedding.1 Once the subset of documents most likely to help the AI complete the given task is identified, the text from that subset is sent to the LLM so that it can analyze the information and generate a response.
So, what does this look like in practice? It’s most commonly a chatbot-style interface where a user can ask a question or instruct the LLM to perform a task, such as “find me examples of John Doe talking about the Company ABC to his colleagues.” The application then searches the data set to find the subset of documents that it determines most likely contains information about John Doe talking about the Company with others, and then it sends the text of those documents to be analyzed by the LLM. The application uses the LLM to prepare and send a response back to the user within the chatbot interface.
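The two-step flow above can be sketched as follows. This is a simplified illustration, not any product's actual pipeline: retrieval here is naive keyword overlap (real tools may use vector embeddings instead), and `call_llm` is a hypothetical stub for an LLM API call.

```python
# Sketch of the top-down, two-step flow: (1) search and retrieval narrows the
# data set to a subset, (2) only that subset is sent to the LLM in one pass.
# Keyword-overlap retrieval and the `call_llm` stub are illustrative stand-ins.

def retrieve_subset(documents: dict[str, str], query: str) -> dict[str, str]:
    """Step 1 (search & retrieval): keep documents sharing a term with the query."""
    terms = set(query.lower().split())
    return {doc_id: text for doc_id, text in documents.items()
            if terms & set(text.lower().split())}

def call_llm(question: str, context: str) -> str:
    # Hypothetical stub; a real call would send the question plus the
    # retrieved context to an LLM and return its generated answer.
    return f"(stubbed answer drawn from {len(context)} characters of context)"

def answer_question(documents: dict[str, str], question: str):
    subset = retrieve_subset(documents, question)   # step 1: narrow the pool
    context = "\n\n".join(subset.values())
    response = call_llm(question, context)          # step 2: one LLM pass over the subset
    return response, sorted(subset)                 # cited doc IDs let the user spot-check

docs = {
    "DOC-001": "John Doe emailed colleagues about Company ABC pricing.",
    "DOC-002": "Reminder: the fire drill is at noon.",
}
response, cited = answer_question(docs, "What did John Doe say about Company ABC?")
print(cited)
```

Note that anything the retrieval step misses in step 1 is never seen by the LLM at all, which is the completeness concern raised in the comparison below.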
How They Compare
Now that we've gone over how these different approaches work, let's see how they compare in terms of performance and how that impacts the scenarios where each should be used.
Scalability and Efficiency
Bottom-up / Doc-by-Doc Review: Given its individual, unit-based nature, this approach may become time-consuming and resource-intensive as the volume of documents grows. It guarantees that every document is evaluated, but because it must analyze each document individually, the computational resources it requires scale with the size of the data set.
Top-down / Chatbot: This method can work very quickly. By first using a search and retrieval method to select a subset of documents most likely to support the response needed, this approach can rapidly narrow down the pool of data the LLM needs to analyze and significantly speed up getting an answer. However, keep in mind that the LLM is only analyzing a small subset of the documents.
Accuracy and Verifiability
Bottom-up / Doc-by-Doc Review: The strength of this method lies in its transparency and verifiability. Since each document is analyzed individually, users can directly verify the accuracy of the analysis by looking at the response for each document themselves or by using a validation technique across a random sample of documents to measure the accuracy of the responses.2
Accuracy can also be supported by requiring the LLM to identify an excerpt or citation—the exact text within the document that its response is based on—which allows the user to quickly confirm that the text identified is consistent with their own analysis.
An additional benefit of requiring the LLM to provide an excerpt or citation supporting its response is that it can be used to detect and prevent potential "hallucinations" (inaccurate or fabricated information) by confirming that the text of the excerpt or citation identified by the AI actually appears within the document.
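That citation check can be automated with a simple verbatim-match test. The sketch below is one possible way to do it, not a description of any particular product: if the AI's quoted excerpt cannot be found in the source document, the response is flagged for human review. Whitespace and case are normalized so line-wrapping differences don't cause false misses.

```python
import re

def excerpt_is_verbatim(excerpt: str, document_text: str) -> bool:
    """Return True only if the excerpt actually appears in the document.

    Collapses runs of whitespace and lowercases both strings so that
    formatting differences alone don't make a genuine quote look fabricated.
    """
    normalize = lambda s: re.sub(r"\s+", " ", s).strip().lower()
    return normalize(excerpt) in normalize(document_text)

doc_text = "Per our call, the Company ABC merger closes on June 1."
print(excerpt_is_verbatim("Company ABC merger closes on June 1", doc_text))  # quote checks out
print(excerpt_is_verbatim("the merger was cancelled", doc_text))             # flag as possible hallucination
```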
Top-down / Chatbot: While powerful for quick searches and responses across large data sets, this method often provides little insight beyond which subset of documents was selected for analysis. It also does not give the user an easy way to verify that the application has returned a complete and accurate response. The top-down approach's more complex, two-step process makes both the application and the user's work of verifying results more difficult.
Because search and retrieval methods identify and send only a subset of documents to the LLM for analysis, questions can arise around completeness: important material may be missed if the user's prompt is unclear, or if the retrieval step fails to surface all unique, potentially relevant documents. The user often has no way of knowing whether the search and retrieval process found the best or most complete set of documents for the task from within the larger data set.
These solutions provide minimal control and often no way to see which documents or information were excluded (and therefore never analyzed by the AI). While the response can contain links or citations to the documents it did use, it is difficult for the user to confirm the completeness and accuracy of the response because the user often doesn't know which documents or information were left out of the analysis. Further, verifying that all potentially relevant information was included would require a document-by-document review of the entire data set.
So What’s the Answer? It Depends.
Both approaches offer distinct advantages and face unique challenges. When determining when and where it makes sense to use each, it comes down to the problem your team is tackling and the tradeoffs that you want to make in that scenario.
For example, bottom-up, document-by-document analysis aligns most closely with, and is most useful as a replacement for, the document review methods already familiar and in use in e-discovery today. This includes use cases such as reviewing documents for production, reviewing documents for privilege, or identifying all the important facts in a matter. The approach stands out for its thoroughness and defensibility, making it invaluable in situations where every document's content must be accounted for and where the responses need to be verified. It's ideal for scenarios where accuracy metrics such as precision and recall need to be calculated, and where documentation of review practices is required for compliance or litigation purposes.
On the other hand, top-down chatbot solutions are best suited for exploratory searches or preliminary data sifting, where the goal is to quickly identify and analyze relevant documents within a vast data set. Their efficient data navigation and quick insights provide a strategic advantage in rapidly pinpointing and summarizing relevant information. However, this comes at the cost of potential gaps in completeness and challenges in verifying the accuracy of the results. They are particularly useful in early case assessments or in answering specific questions about the contents of documents, where speed is of the essence and accuracy and thoroughness are less of a concern.
For legal professionals, the choice between these methodologies should be informed by the specific needs and purpose of their use case while also balancing the demands for speed, accuracy, completeness, and defensibility. As generative AI continues to evolve, staying on top of new developments and understanding the risks and benefits of each approach will be crucial in leveraging this powerful technology effectively in the legal domain.
Interested in learning more about Relativity aiR for Review's generative AI document review solution? Sign up for our upcoming AI Advantage webinar, Aiming for Prompt Perfection? Level Up with aiR for Review, where we'll discuss best practices for drafting prompt criteria to get the best results.
1 A vector embedding database is an index that identifies and groups data and documents with similar content together. In this context, the vector embedding index helps the generative AI solution find and evaluate the subset of documents that are most likely to contain the information needed to respond to the prompt instructions without going over the LLM’s analytical size limits.
2 The e-discovery industry has been effectively using a process to validate the accuracy of document review for many years. The process involves reviewing a sample of documents from the larger data set to confirm the accuracy of the coding decisions applied to them, then extrapolating a level of confidence from that sample to the larger data set as a whole. There are many resources that reference this validation process, including a paper recently published by The Sedona Conference on the use of the validation process for generative AI document review.
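The arithmetic behind that extrapolation can be illustrated briefly. This is a hedged sketch only: it uses a simple normal-approximation confidence interval, while real validation protocols (such as those discussed in industry guidance) involve more careful sampling design.

```python
import math

def accuracy_interval(agree: int, sample_size: int, z: float = 1.96):
    """Point estimate and approximate 95% confidence interval for review
    accuracy, based on a random validation sample.

    `agree` is the number of sampled documents where a human reviewer
    agreed with the AI's coding decision. Uses the normal approximation
    to the binomial proportion; z = 1.96 corresponds to ~95% confidence.
    """
    p = agree / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, max(0.0, p - margin), min(1.0, p + margin)

# Illustrative numbers: reviewers agreed with the AI on 475 of 500 sampled documents.
point, low, high = accuracy_interval(475, 500)
print(f"accuracy ~ {point:.1%}, 95% CI [{low:.1%}, {high:.1%}]")
```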
Chris Haley is an experienced e-discovery and legal technology leader with an extensive 25+ year career in the field. His current role at Relativity includes consulting with e-discovery and legal technology professionals on ways to leverage artificial intelligence in their workflows to improve results and key value drivers for their services.