Root Causes 350: Public Certificates and the GDPR Right to Be Forgotten
GDPR provides a "right to be forgotten," whereby individuals can demand the removal of PII from IT systems. This can run directly contrary to the transparency and permanence built into the DNA of public PKI systems. We explore this conundrum.
- Original Broadcast Date: December 21, 2023
Episode Transcript
Lightly edited for flow and brevity.
-
Tim Callan
We want to talk about the right to be forgotten, the GDPR right to be forgotten and digital certificates.
So, first of all, a little definition for everybody. We all know, of course, what GDPR is. It’s the sweeping personal privacy law, the set of regulations inside the European Union that has changed many people’s computing lives in many ways as their online experiences have changed as a consequence of GDPR. One of the things GDPR contains is what is called a right to be forgotten, which means you can contact somebody who holds PII about you and demand that your PII basically be erased. Right? That’s why I call it a right to be forgotten. You can have your information removed from those services or databases so that it doesn’t sit there in perpetuity, and it’s one of the rights you have under GDPR.

What’s interesting about the right to be forgotten is where it rubs up against the world of digital identity, which in many ways has its own permanence, transparency, and public nature that can sometimes be mathematically irreconcilable with the right to be forgotten, and it’s interesting to see how those two things bump up against each other.
-
Jason Soroko
Boy, is it ever interesting. You know, if you purchase a publicly-trusted certificate – note that word, publicly-trusted – a couple of things happen. First of all, the certificate exists. A digital file exists, and there is obviously recordkeeping on the part of the Certificate Authority and - -
-
Tim Callan
Which is required by the way. Recordkeeping that you must do and if you don’t do it according to certain rules and don’t retain it for a certain period of time, in principle that can get you kicked out of browser root store programs. But go on.
-
Jason Soroko
And the other thing, Tim, is that by the same set of rules in our industry, your publicly-trusted certificate is literally, you know, carved into digital stone, if you will, for all the world to see inside the CT logs. Right? The Certificate Transparency logs.
-
Tim Callan
Yeah. And certificate transparency logs are themselves immutable. That is part of the architecture of those CT logs. They were designed that way from the beginning on purpose.
-
Jason Soroko
They can’t change even if you want them to change. They won’t change. Mistakes that are made, even by a CA, are there forever. That’s just the point of them.
-
Tim Callan
The word transparency is right in there. Right? And that’s what it’s all about. It’s to make all of this available for public display. So this presents a bit of a conundrum when somebody wants to exercise their GDPR right to be forgotten. First of all, once anything is in a CT log, it’s there. Like, it’s there. Even if you revoke the certificate, the record of the certificate is there, and short of abolishing the log, there is literally nothing you can do about it. You would honestly have to destroy the entire log and all the information it contains for all the other certs in there, and somehow make sure there aren’t any copies of the log anywhere in the world, to erase that information. It simply can’t be done.
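(A quick aside for readers who want to see this for themselves: the Python sketch below uses the standard RFC 6962 log API to pull the signed tree head and a handful of raw entries from a CT log. The log URL is a placeholder rather than any specific log; any log from a public log list would work the same way. The point is simply that once an entry is logged, anyone in the world can retrieve it.)

```python
# Minimal sketch: fetch the signed tree head and a few raw entries from a
# public Certificate Transparency log using the RFC 6962 HTTP API.
# LOG_URL is a placeholder -- substitute any log from a public log list.
import json
import urllib.request

LOG_URL = "https://ct.example-log.test"  # placeholder, not a real log


def get_json(path):
    with urllib.request.urlopen(LOG_URL + path) as resp:
        return json.load(resp)


# The signed tree head commits the log operator to everything logged so far.
sth = get_json("/ct/v1/get-sth")
print("tree size:", sth["tree_size"])
print("root hash:", sth["sha256_root_hash"])

# Anyone can page through the raw entries; each one is a logged (pre)certificate.
entries = get_json("/ct/v1/get-entries?start=0&end=4")
print("fetched", len(entries["entries"]), "entries from the beginning of the log")
```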
-
Jason Soroko
Can’t be done. You know, this almost reminds me of a conversation we had in a previous podcast, Tim, where the Australian government wanted to basically create some kind of a hole in end-to-end encryption, and I think it was the Prime Minister at the time saying that, you know, the laws of physics mean nothing.
-
Tim Callan
This may be a law of mathematics but it’s not the law of the land.
-
Jason Soroko
Not the law of Australia, exactly.
-
Tim Callan
It is. If it’s the law of mathematics, it is the law of mathematics even in Australia. And so the CT logs are one problem. Public certificates more generally are also a problem. If you put your information into a public certificate and put it out there in the world, again, the certificate itself is unalterable. It can be revoked, but I can’t go change the content of your certificate after I’ve given it to you. That can’t be altered.
-
Jason Soroko
Tim, you and I just recorded a podcast, which will be published soon, on the nature of Merkle trees. Please refer to that podcast, because we go right into the technical reasoning for why the whole point of the hash chain of a Merkle tree is that it cannot be altered. It is a truly, mathematically one-way data structure.
-
Tim Callan
Absolutely. And by design, once again, by design, the reason CT logs are based on the Merkle tree strategy is in part because of its unalterability.
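(Another aside: the Python sketch below is a toy illustration of the Merkle-root idea, not the exact RFC 6962 construction. It shows why a Merkle-tree log is effectively unalterable: changing or redacting any recorded entry changes the root hash that observers already hold, so the edit is immediately detectable.)

```python
# Toy illustration of why a Merkle-tree log is effectively unalterable:
# any change to any leaf changes the root hash that observers already hold.
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Compute a simple Merkle root over a list of byte strings."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:          # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


certs = [b"cert for alice.example", b"cert for bob.example", b"cert for internal.corp.example"]
root_before = merkle_root(certs)

# "Forgetting" one entry -- even replacing it with a redacted placeholder --
# produces a different root, so every saved copy of the old root exposes the edit.
certs[2] = b"<redacted>"
root_after = merkle_root(certs)
print(root_before.hex())
print(root_after.hex())
print("roots match:", root_before == root_after)  # False
```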
-
Jason Soroko
Tim, I want to call out Root Causes Episode 306 – Certificate Transparency Logs and Privacy – where you and I talked a lot about these problems. CT logs, I believe in them. They are a great idea. Look, we belong to an industry where that’s just gospel now, and that’s the end of the story, and I believe in CT logs. I really and truly do.

However, for those of you who want to be forgotten, who want publicly available information forgotten, and for those of you who have other concerns, such as, hey, the information in my publicly-trusted certificate is proprietary and I don’t want it to leak out, like information about your internal intranet sites that might give competitors a hint, or heck, I just set up a domain and I don’t want people to know I set it up, well, I’m sorry, folks. CT logs exist, and that means there is going to be a lack of privacy due to the public nature of CT logs.
-
Tim Callan
And we’ve talked about this before. There’s this confusion, I think, that some people have: whenever you do anything using a public certificate, the word public is right there. By its nature, you are publishing information for the entire world to see. That is the very act you are going through. So, if I could use an analogy, it would be like saying, I’m going to take out an advertisement in the New York Times, and then, after the newspaper has been printed and distributed around the world, call them up and say I want you to redact that information. You can’t.
-
Jason Soroko
You can't.
-
Tim Callan
You cannot. It’s permanent. It’s in the world, and there’s no going back, and that’s what happens as soon as you publish something actively. And the other reason that analogy is apt is that it starts with a proactive act of publishing by me. I take out the ad. I take the action. I provide the information. I spend my own money and I follow through, and at the end of the day, it gets published by a publisher. That is exactly what a public CA is. It’s just done with very specific rules and cryptographic and mathematical constraints around how you do it, but at the end of the day, that’s what a CA is doing. So, for somebody to reach out to a CA proactively, provide that information, take action to authenticate that the information is correct, like DCV, spend their money, and then, after it’s published, turn around and insist on a right to be forgotten is just fundamentally nonsensical. Right?
-
Jason Soroko
It is nonsensical. You know, Tim, this is important for CAs to talk about, because, you know, that’s why we podcasted about CT logs and privacy, and that’s why we are having this podcast. It’s to inform you that when you are buying or utilizing a publicly-trusted certificate – exactly what you just said – that publicness is truly public, and it isn’t coming back. That information is never coming back to you. You are not in control of that information anymore. It just needs to be part of the awareness around publicly-trusted certificates, and that’s it.
-
Tim Callan
And then there’s another layer to this, which is that there’s a whole family of tools, and of course the one that’s dear to us is crt.sh, created by our own Rob Stradling. But there are other tools, too, that use CT logs in a variety of ways. In addition to consumers reaching out to public CAs and attempting to exercise a right to be forgotten on the information in their certs, we’ve also seen examples of consumers reaching out to the people who provide these tools and insisting on a right to be forgotten, and again, that seems to be a fundamental misunderstanding of what’s going on here. That information is in the CT logs. The only thing the tools do is interpret the information that is in the CT logs. The tools don’t contain this information. They are lenses.

So, if we use my earlier analogy, this is like going to the library that holds the microfiche of the New York Times, or the actual issues in its stacks, and insisting on a right to be forgotten there. All we do is show you what’s already there in the other publication. We are literally just a window, a lens. And if you are a tool like crt.sh, that’s what you are: a window onto information that exists elsewhere. So if crt.sh, or any tool, were to somehow enforce a right to be forgotten, the way that would have to work is that the tool would actually have to block that information, which, ironically, is the opposite of forgetting it. The tool would have to store the very information you are insisting it forget, because that would be the only way to prevent it from displaying where you as the consumer could see it.
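(One more aside: the Python sketch below shows what a lens like crt.sh looks like in practice. It queries crt.sh for certificates matching a domain and prints the names it finds. The domain is a placeholder, and the query parameters and field names reflect the public interface as it is commonly used, so treat them as assumptions rather than a documented API contract.)

```python
# Minimal sketch: query crt.sh (a read-only lens over CT log data) for
# certificates matching a domain. The domain is a placeholder, and the query
# parameters / field names reflect the public interface as commonly used.
import json
import urllib.parse
import urllib.request

domain = "example.com"  # placeholder
url = "https://crt.sh/?" + urllib.parse.urlencode({"q": "%." + domain, "output": "json"})

with urllib.request.urlopen(url) as resp:
    records = json.load(resp)

# Every row here is simply a reflection of what is already in the public CT logs.
for rec in records[:10]:
    print(rec.get("not_before"), rec.get("name_value"))
```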
-
Jason Soroko
Geez, Tim. Wait until those same people find out that all of their Bitcoin and cryptocurrency transactions are also publicly available information.
-
Tim Callan
Are also indelibly available until the end of time. Yes! Exactly. And so it’s maybe understandable why ordinary consumers would get confused about this kind of thing, especially if they don’t understand the inner workings of a tool like crt.sh. But at the same time, if we go back to the earlier analogy, where you deliberately put an ad in the newspaper, everybody would laugh you out of the room if you came back the next day and insisted that the newspaper had to somehow remove all those ads. And yet that’s the same situation with public CAs, and people make this claim with perfectly straight faces on a pretty regular basis.
-
Jason Soroko
Man, I tell ya, it just amazes me that you could look at a tool – and I love the way you put it – which is a lens to data and think that, well, the data is in this thing and whoever it is that offered this lens must be responsible for or has some sort of - -
-
Tim Callan
Has an obligation under European law to remove it, or I’m gonna pursue them for a GDPR violation. And it just kind of shows a fundamental misunderstanding of what’s going on with all of this stuff at an information theory level, at a computer science level, at a mathematical level, at a cryptographic level. In all those senses, if you understood them, you would understand why this request is just literally impossible.
-
Jason Soroko
It really shows me, Tim, that an understanding of distributed systems and distributed architectures really has not sunk into a lot of people’s minds, and as that comes more and more into the world – because this is the way everything is going – goodness gracious – I just wonder, you know - - OK, this is totally cynical and I shouldn’t even say it, but it’s almost like people are gonna start to think it’s black magic rather than an actual - - it is a literal computer system. It’s just done very differently. But if you are thinking of everything as being very proprietary and singular and isolated and monolithic, then the way that things are built today must look like some sort of black magic.
-
Tim Callan
And this touches another one of my hobby horses, something we’ve talked about a lot in very recent episodes, which is people who don’t understand technology, but who have a certain amount of legal power because of their function in government, attempting to enforce rules on technology. I absolutely fear the day when the European Union declares that, you know, a public CA is violating GDPR because it isn’t willing to redact information from unalterable public certificates or unalterable CT logs. I can completely see that happening, or I can see the European Union declaring that somebody who maintains a CT log somehow has an obligation to do this.

Fortunately, I am not aware of that occurring, but I don’t put it out of the realm of possibility. It’s this laws of mathematics vs. laws of Australia thing again. We’ve seen that frequently people in governmental positions who do not have the technical knowledge to make this kind of decision also don’t have the wisdom to recognize that, and they try to make these decisions anyway, and this is where a lot of trouble comes from.
-
Jason Soroko
Tim, I reject your internet. I’m gonna have my own internet. Oh, by the way, we podcasted on that one time.