Generative AI in Legal Industry: Opportunities, Risks & Strategies

Businesses are increasingly embracing Generative AI capabilities. The legal industry, in particular, presents numerous opportunities for innovation by leveraging Generative AI (Gen AI) effectively. This is particularly evident in text-heavy work, since legal professionals routinely sift through large volumes of documents, along with other unstructured content such as images and videos. Additionally, Gen AI provides predictive analysis capabilities that can enhance insights in legal matters.

Generative AI can be applied in several areas of the legal industry, including:

Document Creation: Gen AI can assist in drafting documents such as wills, probate filings, and business contracts while adhering to prescribed formats and presentation standards, including those for legal citations.

Legal Research & Knowledge Retrieval: Generative AI can summarize extensive text and enable the creation of work products related to statutes (including industrial law) and regulations while adhering to standards, maintaining accuracy, and ensuring relevance.

Enhancing Matter Management: Predictive analysis and simulations can be employed in various practice areas to understand patterns in revenue generation, discover new opportunities, accelerate billing, and improve cash flow.

Case Litigation Management: Gen AI-led strategies can increase the likelihood of favorable outcomes when supported by accurate inferences from case law, court decisions, and obiter dicta.

Chatbots: AI can personalize the user experience in legal operations and customer interactions.

In the Legal Context, What Types of Data Sources are Relevant?

Data used by legal firms can be categorized into three main groups:

Internal (Enterprise) Data: This is the legal firm’s own enterprise data, covering both internal operations and client matters. This data serves as essential input and is often used in prompts to generate content tailored to specific business objectives. It is typically combined with the two categories below.

Reliable External Data (including public domain data): Online resources have made case law information increasingly accessible. Government websites offer free access to case decisions and texts. Additionally, reputable websites with research articles fall into this category. Other external data sources may be available through industry bodies and consortiums, purchased under agreements that govern how the data may be used. This category encompasses data related to industrial law, specific regulations, and standards across various industries.

Less Reliable Public Domain Data: Not all data from web sources can be considered reliable. For instance, internet-based news information may lack accuracy and could be biased. Opinions found online can be especially misleading, as they spread through social media channels and get distorted.

What Are the Risks Associated with External Data?

In a widely reported incident, a lawyer was found to have cited fictitious cases generated by ChatGPT in court. This highlights the risks of relying solely on Gen AI engines for output. Not only may external data used in the pre-training of LLMs (Large Language Models) be inaccurate, but wrong inferences may also be drawn even when the data is accurate.

Moreover, the data provided in prompts can become accessible to the general public or, even worse, could be reverse engineered to extract Personally Identifiable Information (PII) that is then used unethically, such as for fraudulent activities. Additionally, concerns related to intellectual property and copyright infringement can arise with the output produced by Generative AI engines.

What Choices are Available for Large Language Models (LLMs)?

Proprietary or Closed-Source LLMs: Proprietary models like ChatGPT can provide legal firms access to more online resources and dedicated support. Also, the owning entity has complete control over the development and maintenance of the model, which can lead to consistent quality and security. Conversely, users have no control over which training data is used and limited ability to fine-tune the model to meet their needs. Additionally, the costs of utilizing the API can often be high.
Open-Source LLMs: Opting for open-source models offers the advantage of collaborating with a community of like-minded developers, resulting in more scope for customization and better transparency. However, it’s important to note that open-source models may have dependencies. Not all aspects of data and access security may be fully addressed, which can be a concern for legal enterprises.
Private LLMs: These are LLMs developed exclusively for and by the enterprise. They offer the benefit of enhanced security and the ability to leverage enterprise-specific knowledge.

It’s worth noting that the first two options may raise concerns about accuracy and data protection, as some models explicitly state that user inputs may be utilized further for model improvements. On the other hand, the third option allows for building greater safeguards for the enterprise, including addressing specific operational needs, even though it may come with higher costs.

What Considerations Should Be Taken into Account for Private LLMs?

First and foremost, the legal content fed into these models, whether as training data or as context for generated output, must originate exclusively from the enterprise’s most reliable public and other external sources. It should be continuously maintained via incremental updates and kept curated, especially as new sources are integrated. This diligence is essential to prevent the inclusion of duplicated or distorted information that could lead to problematic “hallucinations.” Once this information is integrated, it should not flow back out across the enterprise’s security perimeter.

Secondly, data privacy and security remain significant concerns, even when LLMs operate within the enterprise’s perimeter. Authorized users within the enterprise may inadvertently or maliciously introduce sensitive legal data, which underscores the need for robust governance over data, models, and user inputs (prompts).
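
One small piece of such governance over prompts could be a redaction step that runs before a prompt reaches any model. The sketch below is illustrative only: the function name redact_prompt and the three regular expressions are assumptions made for this example, they cover just a few obvious formats (email addresses, US-style phone numbers, and US Social Security numbers), and a real control would need far broader coverage and review.

```python
from __future__ import annotations

import re

# Hypothetical, deliberately narrow patterns; real PII detection needs much more.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact_prompt(prompt: str) -> tuple[str, dict[str, int]]:
    """Replace each match with a [LABEL] placeholder and count the redactions."""
    counts: dict[str, int] = {}
    for label, pattern in PII_PATTERNS.items():
        prompt, n = pattern.subn(f"[{label}]", prompt)
        if n:
            counts[label] = n
    return prompt, counts
```

The returned counts can be written to an audit log, so that governance teams see how often sensitive values are being typed into prompts without storing the values themselves.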

Thirdly, ongoing training of LLMs is crucial as internal and external data volumes continue to expand. Keeping these models up to date with the latest domain knowledge, whether internal or external, is vital for ensuring their value in handling specialized tasks.

Lastly, validating the output of LLMs is a formidable challenge and a substantial cost consideration. Assertions, inferences, or data generated by Gen AI must be rigorously checked for accuracy. While the output may be accurate for numerous scenarios, a single error could expose the legal firm to significant liability. Comprehensive explanations for each scenario, supported by data utilized and the trail of user decisions taken, are necessary to comply with regulatory and ethical standards, as well as the internal compliance requirements of the legal firm.
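
Part of this checking can be automated before a human reviews the draft. The sketch below assumes two things that are not part of any standard: that prompts instruct the model to wrap every authority in a [CITE: ...] tag, and that the firm maintains a set of citations it has already verified (trusted_index). The names ValidationRecord and validate_citations are hypothetical, and the case names used with it would be placeholders.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Assumed tagging convention imposed through the prompt, not a standard format.
CITE_TAG = re.compile(r"\[CITE:\s*([^\]]+?)\s*\]")


@dataclass
class ValidationRecord:
    """Audit-trail entry for one generated draft."""
    verified: list[str]
    unverified: list[str]
    checked_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    @property
    def has_unverified_citations(self) -> bool:
        return bool(self.unverified)


def validate_citations(draft: str, trusted_index: set[str]) -> ValidationRecord:
    """Split the tagged citations in a draft into verified and unverified lists."""
    cited = [c.strip() for c in CITE_TAG.findall(draft)]
    verified = [c for c in cited if c in trusted_index]
    unverified = [c for c in cited if c not in trusted_index]
    return ValidationRecord(verified, unverified)
```

A match against the index only shows that the authority exists; whether it actually supports the proposition it is cited for still has to be judged by a legal professional.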

Open-source LLMs can serve as a solid foundation for kickstarting model development within the enterprise through customization. However, this necessitates a thorough analysis of the model algorithms and an extensive evaluation of the output they produce.

A Strategy to Mitigate Business Risks and Costs

Data Governance from the Start

The primary principle governing data handling within a legal firm’s data operations should be to only bring in accurate data, and to safeguard and maintain that data once it’s in the system. This approach extends to the use of public APIs for external data, bearing in mind that some external data sources retain the content of queries. It’s equally essential to maintain a diverse data collection and consistently exclude biased data through regular audits.

Data within the enterprise can be managed and governed at different levels. Simplistically, three levels can be identified to establish appropriate boundaries for effective data and model governance:

Firm Level: This encompasses data shared organization-wide.
Practice Level: It includes firm-level and practice-specific data distributed to teams and individual practitioners within the practice, including partners and employees.
Team or Individual (Legal Professional) Level: This level includes data from the previous levels and data exclusive to the specific team working on a particular work product.

Data governance principles within legal firms should align with this hierarchy while promoting decentralization and democratization, allowing “domain owners” to capture, process, organize, and manage the data they generate (with all principles of a data mesh applied comprehensively). However, data generated at a lower level should only traverse upward if it adheres to client requirements, data protection, privacy, and intellectual property (IP) considerations. This includes data input at the team level (such as for prompt engineering), AI-generated output, and any other inferences drawn from utilized data, such as for RAG (Retrieval-Augmented Generation).
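
The three levels and the rule for upward movement can be sketched as a small access model. Everything named here (Level, DataAsset, can_read, promote, and the three clearance flags) is an assumption made for illustration, and the sketch ignores which specific practice or team an asset belongs to.

```python
from dataclasses import dataclass, replace
from enum import IntEnum


class Level(IntEnum):
    TEAM = 1
    PRACTICE = 2
    FIRM = 3


@dataclass(frozen=True)
class DataAsset:
    name: str
    level: Level
    client_consent: bool = False   # client requirements permit wider sharing
    privacy_cleared: bool = False  # data protection and privacy review passed
    ip_cleared: bool = False       # intellectual property review passed


def can_read(reader_scope: Level, asset: DataAsset) -> bool:
    """A reader sees data at their own level and at every broader level."""
    return asset.level >= reader_scope


def promote(asset: DataAsset) -> DataAsset:
    """Move an asset up one level, but only if every check has passed."""
    if asset.level == Level.FIRM:
        raise ValueError("asset is already at firm level")
    if not (asset.client_consent and asset.privacy_cleared and asset.ip_cleared):
        raise PermissionError(f"{asset.name!r} has not cleared all checks")
    return replace(asset, level=Level(asset.level + 1))
```

Under this model a team-level research memo is invisible at practice scope until it has been explicitly cleared and promoted, which mirrors the one-way, conditional flow described above. Prompts, generated output, and RAG inferences could be wrapped in the same DataAsset type.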

Ownership, usage, and modification of LLMs should be subject to similar principles. Legal experts working on specific work products should avoid sharing those products, and any insights or key points derived from them, beyond their team unless the same conditions are met. When different teams collaborate to fine-tune LLMs for the firm’s goals, the implications of data, content, and algorithm sharing must be carefully assessed to align objectives, as what might be a null hypothesis for one team could be an alternative hypothesis for another.

Creating and Retraining LLMs

Once foundational training at the firm level is completed, models can be segregated and explicitly developed for practice sub-domains (e.g., Corporate Law, Real-Estate Law, Family Law, etc.) as needed, refining them over time. Compared with training a separate model from scratch for each practice area, this approach is less resource-intensive and more cost-effective. Chaining LLMs in sequence, from the highest level to the lowest level, can also be applied.

An essential question is how frequently LLMs should be trained. The answer depends on whether the underlying data upon which the model trains has undergone significant changes. Factors to consider include:

Content such as case law, court decisions, research articles, and practice-specific reference material may grow more slowly than business data in core industries.
Laws and statutes for compliance also change slowly, often with large volumes of text written with extensive thought.
Argumentation strategies in litigation may vary; however, the technicalities of case arguments can be expected to vary only slightly.

As a result, retraining may be required less frequently. However, given the high sensitivity of client and case data in legal firms (and intricate policies implemented operationally), model specialization and segregation remain imperative.
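
One way to make "significant changes" measurable is to compare content fingerprints of the corpus used in the last training run with those of the current corpus. This is a minimal sketch under stated assumptions: the function names are hypothetical, each document is identified by a stable ID, and the 20% threshold is an arbitrary placeholder that a firm would need to set for itself.

```python
from __future__ import annotations

import hashlib


def fingerprint(text: str) -> str:
    """Content hash used to detect that a document's text has changed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_fraction(trained_on: dict[str, str], current: dict[str, str]) -> float:
    """Share of the current corpus that is new or altered since the last run.

    Both arguments map a document ID to that document's fingerprint.
    """
    if not current:
        return 0.0
    changed = sum(
        1 for doc_id, fp in current.items() if trained_on.get(doc_id) != fp
    )
    return changed / len(current)


def should_retrain(
    trained_on: dict[str, str],
    current: dict[str, str],
    threshold: float = 0.2,  # assumed cut-off, not a recommended value
) -> bool:
    return changed_fraction(trained_on, current) >= threshold
```

Running the check separately per practice-level model keeps the decision aligned with the segregation described above: a surge of new material in one practice area need not trigger retraining elsewhere.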

Adding RAG as a layer on top of the primary LLM could yield results that come close to those of fine-tuning it. However, it’s important to note that RAG primarily concentrates on content retrieval and doesn’t delve into the nuances of legal domain-specific taxonomy, patterns, and language intricacies.

In this approach, the generative aspect of the RAG system can be fine-tuned and optimized independently. This can be achieved by incorporating additional enterprise document databases, tailoring them to the specific legal context, and establishing a user feedback loop. Moreover, this approach tends to be computationally less intensive.
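
The retrieval half of RAG can be illustrated without any external libraries. The sketch below is a toy under stated assumptions: it ranks documents by word-count cosine similarity, whereas production systems typically use learned embeddings and a vector store, and the function names and prompt wording are invented for this example.

```python
from __future__ import annotations

import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercased word counts; a crude stand-in for an embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: dict[str, str], k: int = 2) -> list[str]:
    """Return the IDs of the k documents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(
        documents, key=lambda d: cosine(q, tokenize(documents[d])), reverse=True
    )
    return ranked[:k]


def build_prompt(query: str, documents: dict[str, str], k: int = 2) -> str:
    """Assemble the prompt that would be passed to the generative model."""
    context = "\n".join(f"[{d}] {documents[d]}" for d in retrieve(query, documents, k))
    return (
        "Answer using only the sources below and cite them by ID.\n"
        f"Sources:\n{context}\n"
        f"Question: {query}"
    )
```

Because the document store sits outside the model, adding a new enterprise database only changes what retrieve can see; the model's weights are untouched, which is why this route tends to be computationally lighter than fine-tuning.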

Prompt Engineering vs. Fine-Tuning

Prompt engineering does not involve retraining the model, although it may appear that way from the user’s perspective. Thus, separate teams can use the same underlying model for a specific purpose (e.g., using AI for legal research) at the legal practice level without altering the model. Users should fully comprehend and internalize this distinction when using prompts.

Prioritizing prompt engineering to achieve more accurate output for specific legal use cases, rather than frequent retraining or model overhaul, may represent a more pragmatic approach for the legal industry. Like iteratively refining SQL queries, prompt engineering enhances the specificity of questions, especially with In-Context Learning (ICL) methodologies. User feedback on prompt results can be logged to improve retraining in later stages.
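
In-context learning and feedback logging can be sketched together in a few lines. The names Example, few_shot_prompt, and log_feedback, the Q/A layout, and the JSON Lines log format are all assumptions chosen for illustration; no particular model API is implied.

```python
from __future__ import annotations

import json
from dataclasses import dataclass


@dataclass
class Example:
    question: str
    answer: str


def few_shot_prompt(task: str, examples: list[Example], question: str) -> str:
    """Assemble an in-context-learning prompt; the model itself is not changed."""
    shots = "\n\n".join(f"Q: {e.question}\nA: {e.answer}" for e in examples)
    return f"{task}\n\n{shots}\n\nQ: {question}\nA:"


def log_feedback(path: str, prompt: str, output: str, rating: int) -> None:
    """Append one JSON line per rated response for use in later retraining."""
    record = {"prompt": prompt, "output": output, "rating": rating}
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
```

Two teams can pass different task descriptions and examples to few_shot_prompt while calling the same underlying model, and the accumulated log becomes a candidate dataset for a later retraining round.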

Addressing Ethical Concerns

Like any AI technology, generative AI requires human oversight in the tasks it performs. This oversight is essential to ensure that accuracy and the weight of facts in decision-making do not override human judgment. Think of the robot in the movie I, Robot, which had to decide between saving a drowning adult (played by Will Smith) with a 45% chance of survival and a drowning child with only an 11% chance, and chose the former purely on the numbers. Inferences and court judgments in the legal realm hold consequences that are just as significant, affecting entire segments of the population.

Cost Considerations

At this stage, private LLMs are cost-prohibitive for most enterprises. To justify their implementation, clear financial business cases must consider both the costs of developing the models and validating their outputs. However, as with emerging technologies, development costs can be expected to decrease over time – along with raw computing costs and the costs for ensuring data security and privacy with advanced architectures and tools.

Conclusion: “Gen AI for the Enterprise” is the Solution for the Legal Industry

With a well-defined context and robust data governance, Gen AI has the potential to revolutionize legal operations and litigation. In essence, Gen AI involves developing and refining algorithms that provide a competitive edge for a legal firm.

The objective should be to strike a balance between risk and reward, recognizing the importance of human oversight. The intent should be not just to enable RLHF (Reinforcement Learning from Human Feedback), but also to ensure that critical decisions lie only in human hands. While the initial costs may seem high, it is essential to maintain LLMs explicitly tailored for the legal firm. This ensures that the benefits of Generative AI for the legal industry are harnessed without compromising on ethics or incurring too much business risk.