Judgment Day coming for AI algorithms, as shown by two recent reports
By Michael Smith
Nearly 40 years ago the legal philosopher Ronald Dworkin postulated the heroic Judge Hercules, an idealised judge with superhuman intelligence and unlimited time. Two impressive recent studies suggest the ideal remains human.
Twelve British judges, five of them from the UK Supreme Court, recently addressed an unsettling new Herculean colleague: artificial intelligence.
Polls suggest that “judges, judicial support staff, prosecutors, and lawyers around the globe have started to use chatbots … to draft … judicial decisions, and elaborate arguments”. The judges’ conversations — gathered in a quietly momentous study presented at CHIWORK 2025 — are a rare glimpse into the judicial psyche on the eve of the machine-learning age. The tone is wary, thoughtful, sagacious, occasionally amused — and almost uniformly sceptical.
The study, co-authored by researchers from Harvard, Toulouse and Maynooth University, reports judicial perceptions of how the integration of AI into judicial systems might transform the way judges and legal professionals work. It asks a simple, existential question of lawyers: what, if anything, should AI be allowed to do in a court of law?
The judges’ answers are refreshingly human. The paper highlights striking enthusiasm for AI’s efficiency gains alongside deep concern for justice’s human dimension.
As background, it is important to note that, in contrast to the US Supreme Court, UK Supreme Court judges typically write their judgments themselves. One participant stated, “There’s no question that anything in a judgment that I hand down will be written by anyone other than me. My judicial assistant will do research for me and maybe give me an analysis of cases, but I will then go to the cases. Similarly, the judicial assistant might produce a chronology, but I will go to the individual documents when I’m writing the judgment”.
Judges spoke with understated urgency about what separates mechanical logic from legal judgment: not just facts and rules, nor a matter of pure logic, but practical reasoning, empathy and, above all, moral responsibility. “You’re given the job, not just for intellectual ability, it’s the judgment that you can see that logic is taking you in a direction that you shouldn’t be going and you need a practical, humane result”.
Several participants noted that in some types of cases, a human judge is vital to providing emotional and psychological closure, and a sense of “dignity”. You can’t, said another group, “underestimate the catharsis that there is in a trial and the importance of that for peaceful dispute resolution so that the person who loses can say…I understand why I lost”.
At the highest levels of the judiciary, writing a judgment isn’t clerical. These judges speak of drafting with the care of novelists, choosing words like tools, shaping arguments to make not just a ruling, but a record of reason. One noted: “When we come out of a case, we all meet together and discuss what we think about it and why. We can’t have a room of robots doing that”.
AI, for all its emotionless bluster, doesn’t yet understand the difference between language and thought. Of AI’s ability to articulate the reasoning behind a decision, “AI isn’t really undertaking that process”, one judge noted dryly. And it is prone to a particular sort of error that humans are not, namely, the invention of fictitious legal authorities.
Still, the judges aren’t Luddites. They know their courts are clogged, their paperwork Sisyphean, and their resources exhausted. And so they eye the machines with guarded interest. AI could probably help with bulk administrative drudgery: it could summarise documents, flag inconsistencies, or draft plain-English versions of decisions for the public or, where appropriate, for children. In high-volume courts, initial drafting of judgments and summaries of the background were considered a potential boost to efficiency, and better proofreading would also be helpful. There is also potential for “small claims” and some other types of cases to be fully resolved through AI, with a possible trade-off between efficiency and quality of judgment.
Sentencing was identified as an area AI could support: it could analyse the relevant background information, precedents and additional considerations, and make a recommendation. There could be similar support for deciding what would qualify as a fair amount in settlement agreements (personal injury, for example).
Many other “boring” bulk administrative tasks were likewise identified as areas in which AI could be beneficial.
Legal research, and the summarisation of cases, disclosures and bundles of documents, attracted much discussion. “It might even increase access to justice”, one remarked, “at least for the sorts of cases that never make it before a judge anyway”.
Context is important. One participant noted: “It’s one thing to have [a] cheap and cheerful AI tool to resolve a £500 dispute over a second-hand car sales contract. It’s quite another if somebody’s being sent to prison or somebody’s having their children taken away from them and put into care”. For AI to step into the role of decision-maker on questions of liberty, custody, or guilt or innocence would be too much. Even if the machine were right, it would be wrong. “We want a decision, as a matter of principle, made by a human being”. That insistence is not about sentimentality — it’s about legitimacy. Law, after all, is not only a system of rules. It’s a theatre of authority. Strip away the human element, and you may win efficiency — but you lose the drama, the dignity, and perhaps the consent of the governed under the separation of powers.
The Economist has built a bot to predict how the US Supreme Court will rule
This is not an idle worry. Across the Atlantic, a less cautious experiment is underway. The Economist, with uncharacteristic glee, has built a bot, SCOTUSbot, to predict how the nine justices of the US Supreme Court will rule in upcoming cases. The newspaper fed it legal briefs and oral-argument transcripts, and ran each case through the machine ten times to produce vote counts, explanations, and mock majority opinions. If the justices faithfully follow legal principles, an AI aware of all the precedents ought to predict their votes fairly reliably; if politics drives some decisions, the patterns would be less clear. And the eerie part is that it’s often right. Or close enough to tilt some wigs.
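The Economist has not published SCOTUSbot’s code, but the pipeline it describes (briefs and argument transcripts in, ten sampled predictions out, aggregated into a vote count) is simple to sketch. The Python below is a minimal illustration under those assumptions, not the actual bot: query_model is a hypothetical stand-in for whatever model the newspaper used, and the prompt and JSON answer format are invented for the example.

```python
import collections
import json

N_RUNS = 10  # The Economist reports running each case through the machine ten times


def query_model(prompt: str) -> str:
    """Hypothetical LLM call; swap in any chat-completion API here.

    Assumed to return a JSON string such as
    {"votes_for_petitioner": 6, "rationale": "..."}.
    """
    raise NotImplementedError("plug in a model provider")


def predict_case(briefs: str, oral_argument: str) -> collections.Counter:
    """Sample the model repeatedly and tally the predicted vote splits."""
    prompt = (
        "Predict how the nine justices of the US Supreme Court will rule.\n"
        f"Briefs:\n{briefs}\n\nOral argument transcript:\n{oral_argument}\n\n"
        'Answer as JSON: {"votes_for_petitioner": <0-9>, "rationale": "..."}'
    )
    outcomes = collections.Counter()
    for _ in range(N_RUNS):  # repeated sampling exposes the model's uncertainty
        reply = json.loads(query_model(prompt))
        outcomes[reply["votes_for_petitioner"]] += 1
    return outcomes  # e.g. Counter({6: 7, 5: 2, 9: 1}): a modal 6-3 split
```

Aggregating repeated runs is what turns a stochastic text generator into something resembling a forecast: the modal split becomes the prediction, and the spread of the tallies serves as a rough confidence signal.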
Its main problems are moments of uncertainty and a far-from-ideal susceptibility to blowing a fuse. Occasionally it provides analysis of the wrong case. In one case it predicted a 6-3 ideological split that never came, though the justices it wrongly pegged as dissenters did write separate concurring opinions. In another, deprived of oral arguments, it became “unhinged”. Like a clever student bluffing an exam, SCOTUSbot knows the tone and the jargon but can’t quite be trusted with the thesis. Yet. Tantalisingly, it was unequivocal about how Trump v United States, last year’s presidential-immunity case, should have come out: 7-2 or 9-0, it insisted, against Mr Trump.
What’s striking is how both stories — one a sober qualitative study, the other a journalistic provocation—converge on the same insight: AI might mimic the form of legal reasoning, but it doesn’t yet grasp the substance. It may anticipate outcomes, but it doesn’t feel the weight of decision. It can predict, but it cannot judge.
There’s also the question of how humans behave in the presence of the machine. UK judges voiced concern not just about AI getting things wrong, but about human users becoming lazy, credulous, or dependent. If junior judges begin relying on AI to find precedent or draft arguments, do they ever learn to do it themselves? The risk is not just error — it’s decadence.
Other concerns included:
• Reliability: current AI output is not dependable enough for legal information, and reliability and trust came up frequently;
• Language: the work of judges demands precise and careful use of language;
• Privacy: a concern, though not one specific to AI;
• Bias: AI bias needs to be understood;
• De-skilling: reliance on AI could erode judicial skills.
One judge put it with quiet cynicism (judges’ energy could never be in question): “You’d rely on AI’s output without [ever] going to the source documents”. Another, more colourfully, warned of losing the “satisfaction of the chase”, the intellectual pleasure of hunting down a footnote or teasing out a contradiction.
Both the CHIWORK study and SCOTUSbot’s American parlour trick encourage scepticism, though the unsentimental Economist’s centres largely on teething problems, whereas the British judges’ runs more philosophical: is law itself something that should be done by a machine? The answer, for now, seems to depend on what you think justice is. If it is just the application of rules to facts, the machines will soon outperform human judges. If it is a public performance of reasoning, responsibility and restraint, then AI should remain a hidden Hercules at best.
And if we forget that distinction, we may soon find ourselves judged not by our peers, but by an algorithm that may, now or in the future, be trained on insidious or dubious instincts.
Sources: Erin Solovey, Brian Flanagan and Daniel Chen, “Interacting with AI at Work: Perceptions and Opportunities from the UK Judiciary”, CHIWORK ’25, Amsterdam, June 2025, http://users.nber.org/~dlchen/papers/Interacting_With_AI_at_Work_CHIWORK.pdf (to be published 23 June 2025); “Can AI Predict Supreme Court Rulings?”, The Economist, 4 June 2025.