
Europe’s ambition to lead in trustworthy AI sits uneasily beside one of its oldest privacy rules: data minimisation. Like its sibling principle, purpose limitation, it was drafted in an analogue world — when computer memory was expensive and data collection was clumsy.
In the era of machine learning, the rule that personal data must be "adequate, relevant and limited to what is necessary" (GDPR Article 5(1)(c)) is colliding with the realities of modern innovation. In Opinion 28/2024, the European Data Protection Board (EDPB) reaffirmed that the data minimisation principle applies even in the context of AI development and deployment. Yet AI thrives on more, not less: more variety, more volume, more iteration. If Europe insists on training tomorrow's AI with the data philosophy of yesterday, it risks building the world's most ethical technology that no one actually uses.
Data minimisation was born out of a perfectly rational fear: the rise of computerised government databases in the 1960s and 70s. Early data protection laws, from Sweden’s 1973 Data Act to the OECD Guidelines (1980) and the Council of Europe’s Convention 108 (1981), all shared the same goal — stop bureaucracies and corporations from gathering and storing excessive amounts of personal information. Back then, more data meant more danger. Each record carried risk; storage was costly; and state surveillance, not statistical insight, was the primary threat. The regulatory philosophy was simplistic and moral: collect only what you need, for as long as you need it.
By the time the GDPR took effect in 2018, the digital world had turned that logic upside down. Storage was cheap, computing was fast, and data had become the raw material of innovation. Yet Europe kept the same guiding principle, barely touched since 1980.
The problem is not that minimisation is wrong; it is that it is anachronistic. The data minimisation principle assumes that the value of data is known in advance and that collecting "too much" creates risk without benefit. But in AI, the relationship is reversed: we often do not know which data will matter until after the model learns from it. AI models need diverse data, and limiting collection too tightly risks producing biased, inaccurate, or fragile systems: precisely the outcomes Europe says it wants to avoid if it is to reap the benefits of innovation.
The EDPB, however, maintains a strict line: the data minimisation principle still applies.
Three structural features of the digital era make the minimisation rule increasingly unworkable: storage is cheap, computing power is abundant, and the value of any given piece of data is rarely known at the moment of collection.
The irony is sharp: a rule designed to protect individuals from data misuse now risks depriving them of the benefits of responsible data use, from better healthcare diagnostics to climate modelling to more capable AI systems.
In short, Europe’s devotion to small data may be producing small results.
In the global AI race, the consequences are visible: most of the leading AI models are built in the United States and China, and Europe's advantage in regulation has become its disadvantage in innovation.
Europe does not need to abandon privacy to stay competitive; it needs to modernise how it interprets its principles. Its approach to AI development should demand not minimal data but responsible data.
Pragmatic reforms could align data minimisation with digital reality: risk-based interpretations that shift the focus from data quantity to risk mitigation; dynamic proportionality tests that allow broader data use when clear societal benefits and public interests are at stake; and a move from compliance checklists to outcome-based accountability, giving AI developers flexibility whilst preserving public trust.
These changes would align the GDPR’s spirit — protecting individuals — with its new context: enabling responsible AI that benefits them.
The principle of data minimisation clearly made sense in the fledgling days of digital technology, but the technologies we now seek to regulate have changed beyond recognition. Europe is trying to train modern AI under rules written for another era. Insistence on ever-smaller datasets in the name of privacy might deliver the cleanest compliance record, but at the cost of innovation.
Less data may mean more virtue — but it also means less progress.
17 December 2025