AI’s Arrival in Dutch Courts

Artificial intelligence tools are making waves across the legal sector. ChatGPT in particular is attracting the most attention, with debates still ongoing about its potential role in courtrooms: from judges using it to draft opinions, to lawyers relying on AI-generated arguments, to parties submitting ChatGPT-produced evidence. As a report on Verfassungsblog suggests, the “robot judge” is already here, with Colombian judges using ChatGPT to write complete verdicts. In the UK, appeal judge Lord Justice Birss described ChatGPT as “jolly useful” for providing summaries of an area of law. Meanwhile, in China, AI is embraced in courtrooms, with the “Little Sage” (小智) system handling entire small claims proceedings.

In the Netherlands, ChatGPT made its judicial entrance when a lower court judge caused controversy by relying on information provided by the chatbot in a neighbour dispute over solar panels. The incident triggered significant discussion among Dutch lawyers about the impact of AI on litigants’ rights, including the right to be heard and party autonomy. That ruling is not the only one, however: several other Dutch judgments have also mentioned or discussed ChatGPT.

In this blog post, I will look at all six published Dutch verdicts referencing ChatGPT use—either by the judge or by litigants—and explore whether there is any common ground in how AI is approached. I will also sketch the EU-law context surrounding AI use in courts, and consider what these six rulings imply for current efforts by the Dutch judiciary to regulate the use of AI.

ChatGPT in Court: Promises and Pitfalls

Before delving into the specific judgments, it’s helpful to understand why ChatGPT is drawing so much attention in court.

Legal applications of AI are not new. For decades, the field of AI and Law has researched the possibilities of so-called ‘expert systems’, based on logical models representing human legal reasoning, to replace certain elements of legal decision-making. Such systems are currently deployed on a large scale in, for example, social security and tax administration. More recently, however, a data-driven approach has revolutionized legal AI. By combining large datasets (Big Data) with machine learning techniques, AI systems can learn from statistical correlations to make predictions. This enables them, for instance, to predict the risk of recidivism or to use previous case law to forecast outcomes in new cases, as sketched in the example below.
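
To make the data-driven principle concrete, here is a minimal, purely illustrative sketch (using Python and scikit-learn on invented case features, none of which appear in the judgments discussed here) of how such a system “learns” statistical correlations from past cases and turns them into a prediction for a new one:

```python
# Purely illustrative: a toy "outcome predictor" trained on invented case features.
# Real systems use far richer data; this only shows the statistical principle.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(seed=42)

# Hypothetical features of 200 past cases:
# [claim size, prior convictions, years since offence] (all normalised)
X_past = rng.normal(size=(200, 3))

# Hypothetical outcomes of those cases (1 = claim upheld, 0 = claim rejected),
# generated so that they correlate with the first two features
y_past = (X_past[:, 0] + 0.5 * X_past[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

# "Learning" here means nothing more than fitting a statistical model to past data
model = LogisticRegression().fit(X_past, y_past)

# Forecast for a new, unseen case: a probability derived purely from past correlations
new_case = [[1.2, 0.0, -0.3]]
print(f"Estimated probability that the claim is upheld: {model.predict_proba(new_case)[0, 1]:.2f}")
```

The point of the sketch is that the “prediction” is nothing more than a correlation found in past data; the system has no grasp of the underlying legal reasoning.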

Large Language Models (LLMs) such as ChatGPT follow similar principles: they are trained on massive, internet-scraped textual datasets and deploy machine learning and natural language processing to predict, in essence, the most probable next word in a sentence. ChatGPT can instantly generate responses to complex questions, draft documents, and summarize vast amounts of legal text. As a remarkable byproduct, it appears to perform quite well at legal tasks such as research assistance, and has even proven capable of passing first-year American law exams.
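
To illustrate what “predicting the most probable next word” means in practice, the following is a minimal sketch using the openly available GPT-2 model via the Hugging Face transformers library (an assumption made purely for illustration; ChatGPT itself is proprietary and accessible only through OpenAI’s interface):

```python
# Purely illustrative: inspecting next-token probabilities with an open model (GPT-2).
# ChatGPT works on the same principle at far larger scale, with additional fine-tuning.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The average lifespan of a solar panel is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence length, vocabulary size)

# Probability distribution over the next token, given everything that came before
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)

for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r}: {prob.item():.3f}")
```

Whatever fluent answer a chatbot produces is assembled token by token from such probability distributions, which helps explain both its versatility and its tendency to produce plausible-sounding but false statements.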

Yet these possibilities come with risks. ChatGPT can produce inaccurate answers (so-called “hallucinations”) and lacks real-time access to private legal databases. Research by Dahl et al. demonstrated that ChatGPT-4 generated erroneous legal information or sources in 43% of its responses. In a now-infamous incident, a New York lawyer was reprimanded after submitting a brief in which ChatGPT had cited non-existent case law. Furthermore, the technology is akin to a black box: due to the complex nature of neural networks and the vast scale of training data, it is often difficult—if not impossible—to trace how specific outputs are generated. Lastly, bias can arise from incomplete or selective training data, leading to stereotypical or prejudiced output: over- or underrepresentation in the input data affects the system’s results (garbage in, garbage out).

In spite of these significant caveats, the following Dutch judgments show how AI is increasingly making its appearance in courtrooms, potentially shaping judicial discourse and practice. First, the use of ChatGPT by a judge is discussed, followed by the cases in which litigants used the chatbot.

ChatGPT in Action: From the Bench to the Bar

A.    Judicial Use Cases

1.     Gelderland Court (ECLI:NL:RBGEL:2024:3636)

In this neighbour dispute over rooftop construction and diminished output from solar panels, the court in Gelderland used the chatbot’s estimates to approximate damages. It held:

“The district court, with the assistance of ChatGPT, estimates the average lifespan of solar panels installed in 2009 at 25 to 30 years; it therefore puts that lifespan here at 27.5 years … Why it does not adopt the amount of € 13.963,20 proposed by the claimant in the main action has been sufficiently explained above and in footnotes 4–7.” (at 5.7)

The judge, again relying on ChatGPT, also held that insulation material thrown off the roof was no longer usable, thus awarding damages and cleanup costs. (at 6.8)

B.    Litigant Use Cases

2.     The Hague Court of Appeal (ECLI:NL:GHDHA:2024:711)

In this appellate tax case concerning the official valuation of a hotel, the Court of Appeal in The Hague addressed arguments derived from ChatGPT. The appellant had introduced AI-generated text to contest the assessed value, but the court found the reference unpersuasive. It held:

“The arguments put forward by the interested party that were derived from ChatGPT do not alter [the] conclusion, particularly because it is unclear which prompt was entered into ChatGPT.” (at 5.5)

3.     The Hague Court of Appeal (ECLI:NL:GHDHA:2024:1771)

In a tax dispute regarding the registration tax (BPM) on an imported Ferrari, the Court of Appeal in The Hague rejected the taxpayer’s reliance on ChatGPT to identify suitable comparable vehicles. The claimant had asked ChatGPT to list vehicles that shared a similar economic context and competitive position with the Ferrari in question, leading to a selection of ten luxury cars. The court explicitly dismissed this approach, considering that while AI might group vehicles based on general economic context, this method does not reflect what an average consumer (a human) would consider genuinely comparable. (at 5.1.4)

4.     District Court of The Hague (ECLI:NL:RBDHA:2024:18167)

In this asylum dispute, the claimant (an alleged Moroccan Hirak activist) argued that returning to Morocco would expose him to persecution because authorities there routinely monitor protestors abroad. As proof of this state surveillance, his attorney cited a ChatGPT response. The court dismissed the argument as unfounded:

“That the claimant’s representative refers at the hearing to a response from ChatGPT as evidence is deemed insufficient by the court. First, because the claimant has not submitted the question posed nor ChatGPT’s answer. Moreover, the attorney admitted at the hearing that ChatGPT also did not provide any source references for the answer to the question posed.” (at 10.1)

5.     Amsterdam District Court (ECLI:NL:RBAMS:2025:326)

In this European Arrest Warrant case, the defence submitted a Polish-language report about prison conditions in Tarnów, hoping to prove systemic human rights violations. However, at the hearing, counsel acknowledged using ChatGPT to translate the report into Dutch, and the court found that insufficient. Lacking an official translation or a version from the issuing organization itself, the court ruled it could not verify the authenticity and reliability of the AI-generated translation. As a result, the ChatGPT-based evidence was dismissed and no further questions were posed to the Polish authorities concerning the Tarnów prison. (at 5)

6.     Council of State (ECLI:NL:RVS:2025:335)

In a dispute over a compensation claim for reduced property value, the claimant (Aleto) tried to show that the expert’s use of certain comparable properties was flawed and submitted late-filed material derived from ChatGPT. The Council of State ruled:

“During the hearing, Aleto explained that it obtained this information through ChatGPT. Aleto did not submit the prompt. Furthermore, the ChatGPT-generated information did not provide any references for the answer given. The information also states that, for an accurate analysis, it would be advisable to consult a valuer with specific knowledge of industrial sites in the northern Limburg region.” (at 9.2)

Double Standards: AI-Use by the Court vs. the Litigants

How to make sense of these rulings? Let’s start with the ruling in which the judge used ChatGPT to estimate damages, which generated by far the most debate. Critics argue that the judge essentially introduced evidence that the parties had not discussed, thereby running counter to fundamental principles of adversarial proceedings, such as the right to be heard (audi alteram partem), and to the requirement under Dutch civil law that judges base their decisions only on facts introduced by the parties or on so-called “facts of general knowledge” (Article 149 of the Dutch Code of Civil Procedure).

A comparison has been drawn to prior case law involving judges who independently searched the internet (the so-called “Googling judge”). As a crucial criterion, the Dutch Supreme Court has ruled that parties must, in principle, be given the opportunity to express their views on internet-sourced evidence. However, others are less critical of the judge’s ChatGPT-use, pointing out that judges have considerable discretion in estimating damages under Article 6:97 of the Dutch Civil Code.

Looking more closely at the way the judge used ChatGPT in this case, it remains unclear whether the parties were afforded the opportunity to contest the AI-generated results. Nor does the ruling specify what prompts were entered or the precise answers ChatGPT produced. That stands in stark contrast to the Colombian judge’s use of ChatGPT, described in Juan David Gutiérrez’s Verfassungsblog post, where the court fully transcribed the chatbot’s responses, comprising 29% of the judgment’s text.

It also contrasts, somewhat ironically, with the judicial attitude toward litigants’ own ChatGPT submissions. In fact, Dutch judges have cited three reasons for rejecting ChatGPT-generated evidence:

  1. It is unclear which prompt was entered into ChatGPT;

  2. The claimant has not provided ChatGPT’s answers;

  3. The ChatGPT-generated text did not cite sources.

In other cases, ChatGPT use was dismissed because ChatGPT’s views were seen as incomparable to those of the average consumer, or because the translation was deemed unreliable.

Returning to the Dutch judge’s use of ChatGPT, it seems that it suffers from the very same shortcomings courts have identified as grounds for rejecting ChatGPT-based evidence when introduced by the parties. Such double standards, in my view, point to an urgent need to develop consistent guidelines.

Emerging Guidelines for Judicial AI-Use?

Although new guidance documents on AI use by judges and lawyers are emerging in jurisdictions such as the EU, the UK, New Zealand, and the Flemish Region, they rarely spell out explicit requirements for introducing AI-generated content in legal proceedings, and instead emphasize general principles such as acknowledging limitations of AI and maintaining attorney-client privilege. By contrast, the Dutch case law thus far suggests at least three elements that might shape best practices: (1) ensuring the prompt is disclosed, (2) requiring that ChatGPT’s complete answers are shared, and (3) demanding proper references.

While such requirements align with widely recognized principles of transparency and reliability, they alone may not suffice. The Dutch reaction to the judge’s use of AI reflects deeper concerns about fair trial guarantees and the importance of human oversight, which bring the matter into the domain of constitutional importance. Consequently, when judges employ LLMs to assist in decision-making, they should be vigilant as to the impact on parties’ rights to be heard, to submit evidence, and to challenge evidence.

Additionally, it is important to consider how the requirements of the EU AI Act come into play when LLMs such as ChatGPT are used by judges and litigants. Annex III of the AI Act qualifies certain judicial applications of AI as ‘high risk’, triggering the AI Act’s most stringent obligations, such as the requirement for deployers to conduct a Fundamental Rights Impact Assessment (Article 27), as well as obligations regarding, inter alia, data governance (Article 10), human oversight (Article 14), and transparency (Article 13). These obligations come into play with regard to:

“AI systems intended to be used by a judicial authority or on their behalf to assist a judicial authority in researching and interpreting facts and the law and in applying the law to a concrete set of facts, or to be used in a similar way in alternative dispute resolution” (Annex III(8)(a)).

As the corresponding recital notes, this high-risk qualification serves to “address the risks of potential biases, errors and opacity” (Rec. 61).

However, the classification becomes problematic for AI systems that serve more than strictly legal purposes, such as LLMs. LLMs like ChatGPT fall under the umbrella of general-purpose AI (GPAI): highly capable and powerful models that can be adapted to a wide variety of tasks. In practice, the very same LLM might be employed to increase efficiency in organizational processes (in principle a low-risk application), yet also be used to assist judicial authorities in “researching and interpreting facts and the law”. The line between high-risk and lower-risk use can therefore be thin, even though it matters greatly for the applicable obligations.

Moreover, even if AI use in court does not meet the criteria for high-risk classification, certain general provisions still apply, such as Article 4 on AI literacy. The AI Act’s rules for AI systems that generate synthetic content, including text (Article 50(2) AI Act), further complicate matters by requiring that such output be marked as artificially generated in a machine-readable format. It remains unclear how this transparency obligation would operate in, for example, the ad hoc use of ChatGPT by judges and litigants. In that regard, the obligation is directed primarily at providers, who must implement technical solutions, rather than at the users themselves.

Lastly, with regard to privacy and data security, the EU AI Act merely refers to the general framework laid down by the General Data Protection Regulation ((EU) 2016/679), the ePrivacy Directive (2002/58/EC), and the Law Enforcement Directive ((EU) 2016/680). However, the data-driven approach inherent to machine-learning systems like ChatGPT—involving the massive processing of (personal) data—opens up a Pandora’s box of novel privacy risks.

As Sartor et al. emphasize in their study for the European Parliament, this approach “may lead […] to a massive collection of personal data about individuals, to the detriment of privacy.” Yet, from a privacy point of view, the AI Act does not itself lay down a framework to mitigate these risks, and it remains questionable whether existing regulations suffice. In particular, precisely how the GDPR applies to AI-driven applications remains subject to ongoing guidance by data protection authorities. It goes without saying that, in sensitive contexts such as judicial settings, ensuring compliance with the GDPR’s requirements is vital for preserving public trust.

Prospects for a Dutch Judicial AI-Strategy

On a concluding note, the jurisprudence discussed above carries direct implications for the Dutch judiciary’s forthcoming AI strategy. The Council for the Judiciary’s 2024 Annual Plan mentions the development of such a strategy by mid-2024, yet no document had been published by the end of that year. In a letter to the Dutch Senate, the State Secretary for Digitalization and Kingdom Relations stated that the judiciary intends to present its AI strategy in early 2025; however, no such strategy has been made public to date.

What is clear from the rulings so far is that AI and LLMs are increasingly finding their way into courtrooms. Yet, the current situation is far from ideal. It reflects a fragmented patchwork of double standards and legal uncertainty concerning a technology that intersects directly with constitutional guarantees such as the right to a fair trial, including the right to be heard and equality of arms.

In light of this, it seems vital that any overarching AI policy be accompanied by clear and practical guidelines for judicial AI use. Based on the developments reviewed in this post, three elements appear especially urgent:

1.     Establishing appropriate preconditions for AI use by both judges and litigants (thus avoiding double standards while safeguarding parties’ fundamental rights and due process);

2.     Introducing a clear risk categorization in line with the EU AI Act (benefiting from lower-risk applications while exercising caution with high-risk ones); and

3.     Ensuring robust data privacy and security.

Although it may be tempting to adopt AI tools simply because they are “jolly useful,” disregarding these principles jeopardizes the trust and legitimacy needed for a responsible integration of AI within the judicial system. If AI is to become part of the future of the courts, then the time to lay down the rules is now.

D.G. (Daan) Kingma holds an LLB in International and European Law from the University of Groningen and a master’s in Asian Studies from Leiden University (both cum laude). He is currently pursuing an LLM in legal research at the University of Groningen, focusing on the intersection of technology, privacy, and EU digital regulation.
