Legal Arguments with AI (Part I): Foundations
This is Part I of a three-part series. In Part II, we’ll put AI to work unpacking a live Supreme Court case. Part III will go further, exploring how the technology itself can keep improving to meet the challenges of real legal reasoning.
The law consists of rules and standards. Rules are bright lines: clear, predictable, rigid. Standards are open-ended: flexible, messy, demanding judgment and analogy. Large language models ("LLMs") can mimic rule-based logic with some success, but standards expose a deeper weakness: models can imitate comparisons without truly weighing what matters.
Sep 23, 2025

Intro
Legal arguments are strange beasts.
On the one hand, they are more tightly structured than almost any other form of writing. They unfold like sequenced proofs: establish one proposition, then the next, each step necessary to carry the argument forward. At the micro level, too, their form is exacting. A lawyer must line up facts against the elements of a test, moving from premise to conclusion with rigor (the car was traveling 80 mph in a 60 mph zone, therefore it was speeding).
On the other hand, the subject matter of law resists — perhaps inevitably — reduction to pure codification. Sooner rather than later, one meets not a bright-line rule but an open-ended standard: “reckless,” “willful,” “reasonable.” These terms invite interpretation and judgment, with an entire body of case law to "assist" with these exercises. Legal reasoning in this setting remains exacting, but it is far from mechanical. Lawyers must draw out the right comparisons from precedent case law and make the subtle distinctions necessary for complete and compelling argumentation.
Rules vs. Standards in Law
Law works through a blend of rules and standards, and understanding the difference is essential for anyone thinking about how legal reasoning actually functions.
Rules are fixed in advance. They provide bright-line guidance: the speed limit is 60 mph. Because they’re clear and predictable, rules allow people to plan and give courts an easy benchmark for deciding cases. They’re most effective when applied to recurring, predictable situations where the upfront cost of defining the rule is worth it.
Standards, by contrast, are more open-ended. They’re applied after the fact, leaving decision-makers to judge whether behavior met a vague threshold like “due care” or “reasonableness.” Standards save lawmakers the cost of defining every possible scenario ahead of time, but they shift interpretive power to judges, juries, or regulators.
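To make the contrast concrete, here is a minimal Python sketch (an illustration, not a legal engine). The rule collapses into a single predicate; encoding the standard immediately forces invented weights and cutoffs, which is precisely the judgment the code cannot supply. The factor names, weights, and cutoff below are assumptions made up for the example.

```python
# A minimal sketch, not a real legal engine: a bright-line rule reduces to a
# mechanical predicate, while a standard has no single threshold to check.

SPEED_LIMIT_MPH = 60  # the rule, fixed in advance

def violates_rule(speed_mph: float) -> bool:
    """Rule: one fact, one threshold, one answer."""
    return speed_mph > SPEED_LIMIT_MPH

def meets_due_care(facts: dict) -> bool:
    """Standard: any encoding like the weights below is an invented
    assumption; in practice the weighing happens after the fact, by
    judges or juries guided by precedent."""
    factors = {  # hypothetical factors and weights, chosen for illustration
        "visibility_poor": 0.4,
        "pedestrians_nearby": 0.4,
        "driver_distracted": 0.5,
    }
    risk = sum(weight for name, weight in factors.items() if facts.get(name))
    return risk < 0.5  # the cutoff itself is a judgment call, not a given

print(violates_rule(80))  # True: 80 mph in a 60 mph zone, mechanically
print(meets_due_care({"visibility_poor": True, "driver_distracted": True}))  # False
```

The asymmetry is the point: the rule's threshold comes from the statute, while every number in the standard's function comes from the programmer.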
The trade-off is simple:
Rules promote certainty but can be rigid.
Standards allow flexibility but can create unpredictability.
In practice, legal systems rely on both. Traffic laws are filled with clear numerical rules, while tort law leans heavily on standards. Statutes written as standards often solidify into rules as courts refine them through precedent. Conversely, rigid statutory rules can soften into flexible, case-by-case standards as courts interpret them or carve out exceptions.
For lawyers, the craft lies in moving seamlessly between the two — applying rules when they fit, and interpreting or elaborating standards when cases demand nuance. This duality is one of the reasons legal reasoning is such a distinctive skill: it requires both technical precision and analogical judgment.
LLMs and the Challenge of Rules + Standards
The genius of LLMs lies in their ability to predict the next word in a sequence. This deceptively simple mechanism allows them to mirror natural language and present human knowledge with remarkable speed and fluency. But when it comes to legal reasoning, especially the application of rules and standards, that strength encounters limits.
Rules.
LLMs still struggle with formal, step-by-step reasoning. Lawyers working with rules often proceed like mathematicians: apply a test, move through its prongs, check each fact against each element. That kind of structured, deductive process doesn’t always emerge naturally from word prediction. Progress has been made: larger models approximate deductive steps more reliably, and “chain-of-thought” prompting can encourage them to reason in steps and check their work. That’s why, when you give a model a reasoning-heavy question, you’ll often see it pause and produce a more deliberate, multi-step answer. The results here are mixed but improving.
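For instance, a chain-of-thought prompt for a rule-based test might look like the sketch below. Here `call_llm` is a placeholder rather than a real API, and the four elements are a simplified, hypothetical version of a speeding test; the structure of the prompt, not the client, is the point.

```python
# Sketch of chain-of-thought prompting for a rule-based legal test.
# `call_llm` is a placeholder for whatever completion client is in use.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("swap in a real model client here")

ELEMENTS = [
    "1. Was the driver operating a vehicle on a public road?",
    "2. What was the posted speed limit?",
    "3. What speed was the vehicle traveling?",
    "4. Did the measured speed exceed the posted limit?",
]

prompt = (
    "Apply the speeding rule step by step. For each element, state the "
    "relevant fact and say whether the element is satisfied before moving on.\n"
    + "\n".join(ELEMENTS)
    + "\n\nFacts: the car was traveling 80 mph in a posted 60 mph zone.\n"
    "Work through every element in order, then state your conclusion."
)

# answer = call_llm(prompt)  # uncomment once a real client is wired in
```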
Standards.
Standards pose a deeper challenge. Unlike rules, they are not mechanical tests. To apply a standard like “recklessness” or “due care,” one must draw analogies to past cases, weigh competing considerations, and interpret context. This isn’t just formal logic — it’s analogical reasoning, sensitivity to nuance, and judgment about what counts as “similar enough” to be persuasive. These are precisely the moves lawyers make when distinguishing fact patterns or extending precedent. For LLMs, which rely on statistical prediction rather than genuine conceptual grasp, this is a far harder task. The model can mimic the surface structure of analogies, but it lacks an internal sense of relevance, proportion, or weight.
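One way to see the gap: retrieval systems often surface “similar” precedents by embedding proximity. In the toy example below (made-up vectors, not real model embeddings), a cosine score ranks a superficially similar case above the legally controlling one; proximity in text space imitates comparison without weighing what matters.

```python
# Toy illustration of surface similarity vs. legal relevance. The vectors
# below are made up; nothing in the cosine score knows which facts carry
# legal weight.

import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = np.array([0.9, 0.1, 0.3])  # the new fact pattern, as an embedding
precedents = {
    "Case A (similar wording, different holding)": np.array([0.88, 0.12, 0.31]),
    "Case B (different wording, controlling rule)": np.array([0.20, 0.90, 0.40]),
}

for name, vec in precedents.items():
    print(f"{name}: {cosine(query, vec):.2f}")
# Case A scores ~1.00 and Case B ~0.41, yet Case B may be the analogy a
# lawyer would reach for. Ranking by proximity is mimicry, not judgment.
```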
This is why advances in scale and prompting, while helpful for rule-based reasoning, may not fully solve the problem of standards. Applying standards is less about marching through a fixed test and more about making creative yet precise comparisons across cases.