Why big data could sink Europe’s ‘right to be forgotten’

Europe’s proposed ‘right to be forgotten’ has been the subject of intense debate, with many people arguing it’s simply not practical in the age of the internet for any data to be reliably expunged from history.

Well, add another voice to that mix. The European Network and Information Security Agency (ENISA) has published its assessment of the proposals, and the tone is sceptical to say the least. And, interestingly, one of the biggest problems ENISA has found has to do with big data.

The European Commission’s proposals define the sort of data that has to be erased (if the data subject asks for it) in more than one way. Without wishing to get into a comparison of the various sections’ wording, here’s what ENISA has to say:

“[The definitions] leave to interpretation whether [personal data] includes information that can be used to identify a person with high probability but not with certainty… Neither is it clear whether it includes information that identifies a person not uniquely, but as a member of a more or less small set of individuals, such as a family.”

Here comes the kicker:

“A related question is how aggregated and derived forms of information (e.g. statistics) should be affected when some of the raw data from which statistics are derived are forgotten. Removing forgotten information from all aggregated or derived forms may present a significant technical challenge. On the other hand, not removing such information from aggregated forms is risky, because it may be possible to infer the forgotten raw information by correlating different aggregated forms.”

That’s a pretty big problem. Once data has been aggregated and crunched by analytics software, there is no guarantee that the process can’t be reverse-engineered, particularly when someone correlates different sets of derived data. But getting the forgotten data back out of every aggregate is, well, a challenge.
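
To make ENISA’s point concrete, here is a minimal sketch of how a “forgotten” value might be inferred from two aggregates. Everything in it is hypothetical and not taken from the report: the names, the salary figures and the helper functions are assumptions made purely for illustration. It assumes a service publishes a count and a mean before and after one person’s record is erased, and that someone keeps a copy of the older figures.

```python
# Hypothetical raw data: salaries for a small group (illustrative values only).
salaries = {"ana": 41000, "ben": 52000, "cleo": 47000, "dev": 39000, "eva": 61000}


def publish_stats(records):
    """Return the kind of aggregate a service might publish: count and mean."""
    values = list(records.values())
    return {"count": len(values), "mean": sum(values) / len(values)}


def forget(records, person):
    """Erase one person's raw record, leaving the rest untouched."""
    return {name: value for name, value in records.items() if name != person}


def infer_forgotten_value(before, after):
    """Recover the erased value by comparing the two published aggregates.

    count * mean gives the total for each snapshot, and the difference
    between the totals is the value that was removed.
    """
    total_before = before["mean"] * before["count"]
    total_after = after["mean"] * after["count"]
    return total_before - total_after


stats_before = publish_stats(salaries)                  # published earlier, copied by someone
stats_after = publish_stats(forget(salaries, "eva"))    # published after the erasure

recovered = infer_forgotten_value(stats_before, stats_after)
print(recovered)
```

In this toy case the raw record is gone, yet the old and new statistics together give it straight back. Real datasets are messier, but the sketch shows why leaving aggregates untouched is risky, and why tracking down every derived figure is the hard part.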

This isn’t the only problem ENISA’s identified. Here’s a tl;dr rundown of the report’s other comments and questions:

  • When you have a photo on a social network that features multiple people, “who gets to decide if and when the photo should be forgotten?”
  • The internet is not a closed system, and it spans multiple jurisdictions. “Enforcing the right to be forgotten is impossible in an open, global system, in general.”
  • “Unauthorized copying of information by human observers is ultimately impossible to prevent by technical means.”
  • You could try DRMing all data, but these things can be bypassed, and people wouldn’t like it.
  • So what could work? “A possible partial solution may be a legal mandate aimed at making it difficult to find expired personal data, for instance, by requiring search engines to exclude expired personal data from their search results.”

ENISA is too politically savvy to just come out and say that the right to be forgotten is doomed, but it comes pretty close.

Is it doomed? Quite possibly – which is a pity, in some ways, as it would be nice to avoid an inexorable slide into a world where people lose control over their own history.

As ENISA points out in its report, a fundamental problem with European laws such as this proposed revision to the data protection directive is that they need to be broad enough to be interpreted by member states in ways that fit with their national principles. Technical solutions don’t do ‘broad’. They need ‘specific’.

And with problems such as those highlighted by ENISA, good luck to the European Commission with nailing down those specifics in a way that pleases everyone.


GigaOM