Using LLMs to Discover Legal Factors
Recently, large language models (LLMs) have been applied to automatically annotate legal case texts from particular legal domains in terms of factors drawn from pre-existing factor lists. In this paper, we describe and assess a methodology for employing LLMs to discover factors in case texts without using a pre-existing factor list.
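To illustrate the prior annotation setting, the following is a minimal sketch, not the pipeline of any cited work: it asks an LLM, via the OpenAI chat API, whether each factor on a given list applies to a case text. The factor list, prompt wording, and model name are placeholder assumptions.

```python
# Minimal sketch (illustrative, not any cited system) of annotating a case
# text against a pre-existing factor list with an LLM.
import json
from openai import OpenAI

client = OpenAI()

# Hypothetical factors in the style of trade-secret misappropriation factors.
FACTORS = {
    "F1": "Plaintiff disclosed its information to outsiders.",
    "F2": "Defendant used materials subject to a confidentiality agreement.",
}

def annotate(opinion_text: str) -> dict[str, bool]:
    """Ask the model which listed factors apply to a case opinion."""
    factor_block = "\n".join(f"{fid}: {desc}" for fid, desc in FACTORS.items())
    prompt = (
        "For each factor below, state whether it applies to the case text.\n"
        f"Factors:\n{factor_block}\n\nCase text:\n{opinion_text}\n\n"
        'Reply with JSON mapping factor IDs to true/false, e.g. {"F1": true}.'
    )
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content)
```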
Our method takes raw court opinions as input and produces a set of factors and associated definitions. We evaluate the extent to which an LLM can identify factors from scratch in cases from a legal domain where the LLM has no apparent access to a pre-existing list of factors or their definitions. We demonstrate that a semi-automated approach, with a human in the loop, produces factor representations that can predict case outcomes with moderate success, if not yet as well as expert-defined factors can. In the absence of predefined factors from courts or legislative bodies, legal scholars manually analyze hundreds of cases to identify factors, a process that is highly time-consuming and costly. Our methodology could enable a more efficient process for identifying factor representations of cases in a legal domain.
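To make the outcome-prediction evaluation concrete, here is a minimal sketch under our own assumptions, not the paper's actual code: each annotated case is encoded as a binary vector over the discovered factor set, and a simple cross-validated classifier measures how well those vectors predict outcomes. The annotation format and the outcome coding (1 = plaintiff wins) are illustrative assumptions.

```python
# Minimal sketch of testing discovered factors for predictive value.
# Assumes `annotations` is a list of {factor_id: bool} dicts, one per case,
# produced by LLM calls like the one sketched above.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def to_matrix(annotations: list[dict[str, bool]], factor_ids: list[str]) -> np.ndarray:
    """One row per case, one 0/1 column per discovered factor."""
    return np.array([[int(a.get(f, False)) for f in factor_ids] for a in annotations])

def outcome_predictivity(annotations, outcomes, factor_ids) -> float:
    """Cross-validated accuracy of outcomes predicted from factor vectors."""
    X = to_matrix(annotations, factor_ids)
    y = np.array(outcomes)  # assumed coding: 1 = plaintiff wins, 0 = defendant wins
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, X, y, cv=5).mean()
```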
Researchers read cases and identified the sentences from which one could infer that the facts associated with a factor had occurred in a case, and that indicated a court decided the case as it did because of, or in spite of, that factor's presence.
Automated factor annotation methods could enable researchers who perform empirical legal studies or build computational models of legal argument to automatically classify factors in much larger numbers of cases, potentially increasing the accuracy and scope of their work.