By Rob Hellewell

Published on Wed, February 24, 2021


Identifying attorney-client privilege is one of the most costly and time-consuming processes in ediscovery. Since the dawn of workplace email, responding to discovery requests has meant legal teams spending countless hours searching through millions of documents to pinpoint attorney-client and other privileged information and protect it from production to opposing parties. As technology has improved, legal professionals have gained more tools to help with this process, but it still often entails costly human review of massive numbers of documents.


What if there were a better way? Recently, I had the opportunity to gather a panel of ediscovery experts to discuss how advances in AI and analytics technology now allow attorneys to identify privilege more efficiently and accurately than previously possible. Below, I have summarized our discussion and outlined how legal teams can leverage advanced AI technology to reinvent the model for detecting attorney-client privilege.

Current Methods of Privilege Identification Result in Over-Identification

Currently, the search for privileged information relies on a hodgepodge of different technologies and workflows. Unfortunately, none of them is a magic bullet, and all have their own drawbacks. Some of these methods include:

  • Privilege Search Terms: The foundational block of most privilege reviews involves using common privilege search terms (“legal,” “attorney,” etc.) and known attorney names to identify documents that may be privileged, and then having a review team painstakingly re-review those documents to see if they do, in fact, contain privileged information.
  • Complex Queries or Scripts: This method builds on the search term method by sorting the potentially privileged document population into ‘tiers’ for prioritized privilege review. It sometimes uses search term frequency to weigh the perceived risk that a document is privileged.
  • Technology Assisted Review (TAR): The latest iteration of privilege identification methodologies uses the TAR process to further rank potentially privileged populations for prioritized review, allowing legal teams to cut off review once the statistical likelihood of a document containing privileged information reaches a certain percentage.
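
To make the first two methods concrete, here is a minimal Python sketch of a search term screen that sorts hits into tiers by term frequency. It is an illustration only, not any vendor's actual workflow: the term list, the attorney name, the tier labels, and the cutoff of three hits are all assumptions made up for this example.

```python
import re
from collections import Counter

# Hypothetical screening terms and attorney names, for illustration only.
PRIVILEGE_TERMS = {"legal", "attorney", "counsel", "privileged"}
ATTORNEY_NAMES = {"jane doe"}

# Assumed cutoff: documents with this many hits go to the top tier.
TIER_1_MIN_HITS = 3


def term_hits(text):
    """Count how often each screening term or attorney name appears."""
    lowered = text.lower()
    words = re.findall(r"[a-z]+", lowered)
    counts = Counter(w for w in words if w in PRIVILEGE_TERMS)
    for name in ATTORNEY_NAMES:
        occurrences = lowered.count(name)  # simple substring match
        if occurrences:
            counts[name] = occurrences
    return counts


def tier(text):
    """Assign a review tier based only on total hit frequency."""
    total = sum(term_hits(text).values())
    if total == 0:
        return "no hit"
    if total >= TIER_1_MIN_HITS:
        return "tier 1"
    return "tier 2"
```

Note that a sentence such as "The legal drinking age is a legal matter for legal scholars." lands in the top tier under this sketch even though no legal advice is involved, which is exactly the kind of over-identification that forces a second human review.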

Even applied together, these methodologies are only slightly more accurate than a basic privilege search term screen. TAR, for example, may flag 1 out of every 4 documents as privileged, instead of the 1 out of every 5 typically identified by common privilege search term screens. As a result, review teams are still forced to re-review massive numbers of documents for privilege.

The current methods tend to over-identify privilege for two very important reasons: (1) they rely on a “bag of words” approach to privilege classification, which removes all context from the communication; (2) they cannot leverage non-text document features, like metadata, to evaluate patterns within the documents that often provide key contextual insights indicating a privileged communication.
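
The loss of context in a “bag of words” representation is easy to demonstrate. The short Python sketch below uses two made-up sentences and a deliberately simple tokenizer (both are assumptions for illustration) to show that word counts alone cannot tell who is asking whom for advice.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Reduce a text to unordered word counts, discarding word order."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


# Two hypothetical sentences with different meanings but identical words.
a = "The attorney asked the manager for advice."
b = "The manager asked the attorney for advice."

# The sentences differ, yet their bags of words are indistinguishable.
same_bag = bag_of_words(a) == bag_of_words(b)
print(same_bag)
```

Only the second sentence describes someone seeking advice from an attorney, yet a classifier that sees nothing but these counts has to treat both sentences the same way.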

How Can Advances in AI Technology Improve Privilege Identification Methods?

Advances in AI technology over the last two years can now make privilege classification more effective in a few different ways:

  • Leveraging Past Work Product: Newer technology can pull in and analyze the privilege coding that was applied on previous reviews, without disrupting the current review process. This helps reduce the amount of attorney review needed from the start, as the analytics technology can use this past work product rather than training a model from scratch based on review work in the current matter. Often companies have tens or even hundreds of thousands of prior privilege calls sitting in inactive or archived databases that can be leveraged to train a privilege model. This approach additionally allows legal teams to immediately eliminate documents that were identified as privileged in previous reviews.
  • Analyzing More Than Text: Newer technology is also more effective because it can now analyze more than just the simple text of a document. It can also analyze patterns in metadata and other properties of documents, like participants, participant accounts, and domain names. For example, documents with a large number of participants are much less likely to contain information protected by attorney-client privilege, and newer technology can immediately de-prioritize these documents for privilege review.
  • Taking Context into Account: Newer technology also has the ability to perform a more complicated analysis of text through algorithms that can better assess the context of a document. For example, Natural Language Processing (NLP) can much more effectively understand context within documents than methods that focus more on simple term frequency. Analyzing for context is critical in identifying privilege, particularly when an attorney may just be generally discussing business issues vs. when an attorney is specifically providing legal advice.
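
As a rough illustration of how prior privilege calls and metadata signals can feed into prioritization, here is a small Python sketch. It uses hand-written rules where a real system would use a trained model, and everything in it is an assumption for this example: the Document fields, the law firm domain, the participant cutoff of 10, and the queue labels.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    participants: list  # email addresses of sender and recipients
    content_hash: str   # fingerprint used to match prior reviews


# Hypothetical values, chosen only for illustration.
LAW_FIRM_DOMAINS = {"lawfirm.example"}
MAX_PARTICIPANTS = 10


def triage(doc, prior_privileged_hashes):
    """Route a document using past privilege calls and metadata only."""
    # Past work product: the same document was coded privileged before.
    if doc.content_hash in prior_privileged_hashes:
        return "previously privileged"
    # Widely distributed messages are less likely to be privileged.
    if len(doc.participants) > MAX_PARTICIPANTS:
        return "deprioritize"
    # Domain names: look for a known outside-counsel domain.
    domains = {addr.rsplit("@", 1)[-1].lower() for addr in doc.participants}
    if domains & LAW_FIRM_DOMAINS:
        return "prioritize"
    return "standard"
```

For example, triage(Document("D1", ["a@corp.example", "b@lawfirm.example"], "h1"), set()) routes the document to the priority queue because of the outside-counsel domain. None of these signals requires reading the text, which is why they complement, rather than replace, the contextual text analysis described above.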

Benefits of Leveraging Advances in AI and Analytics in Privilege Reviews

Leveraging the advances in AI outlined above to identify privilege means that legal teams can have more confidence in the accuracy of their privilege screening and review process. This technology also makes it much easier to assemble privilege logs and apply privilege redactions, both because it increases efficiency and accuracy and because it can better analyze metadata and context. This in turn helps with privilege log document descriptions and justifications and with ensuring consistency. But by far the biggest gain is the ability to significantly reduce the costly and time-intensive manual review and re-review required of legal teams using older search term and TAR methodologies.


Leveraging advances in AI and analytics technology enables review teams to identify privileged information more accurately and efficiently. This in turn allows for a more consistent work product, more efficient reviews, and ultimately, lower ediscovery costs.

If you’re interested in learning more about AI and analytics advancements, check out my other articles on how this technology can also help detect personal information within large datasets, as well as how to build a business case for AI and win over AI naysayers within your organization.

To discuss this topic more or to learn how we can help you make an apples-to-apples comparison, feel free to reach out to me at

About the Author
Rob Hellewell

Vice President

At Lighthouse, Rob counsels the world’s top corporations and law firms in ediscovery and leveraging analytics, data science, and technology to extract critical insights from data. Rob’s expertise includes applying analytics and developing data-driven solutions to reduce risk in compliance and legal matters.

Rob’s education and professional experience combine data analytics and legal expertise. He received his M.S. in Business Analytics from New York University, where his research focused on applying text mining, metadata, and sentiment analysis to detect legal risk in unstructured data sources. He received his J.D. from Brigham Young University’s J. Reuben Clark Law School. Rob previously practiced in the Antitrust and Trade Regulation group of Skadden, Arps, Slate, Meagher & Flom, where he counseled clients in connection with complex litigation and regulatory investigations from the Federal Trade Commission, Department of Justice, Securities and Exchange Commission, and other state and federal agencies.