Root Causes 104: 21 PKI Pitfalls to Avoid
Our hosts often discuss the idea of errors in PKI implementations and the potential negative consequences for organizations. In this episode they categorize twenty-one PKI pitfalls to avoid according to five main categories of error: certificate problems, deployment problems, systemic security problems, governance problems, and visibility problems. Join us for a crisp description of these twenty-one pitfalls so you can be on the lookout for them.
- Original Broadcast Date: July 6, 2020
Episode Transcript
Lightly edited for flow and brevity.
-
Tim Callan
So, we have a real interesting topic today. Twenty-one – we didn't think it would be this many – but 21 PKI pitfalls to avoid.
-
Jason Soroko
Yeah. So, PKI pitfalls, we came to this topic Tim thinking it would be kind of important because when you are thinking about your PKI implementation there is a lot of ways to do things incorrectly and we kind of brainstormed what are those things you gotta watch out for.
-
Tim Callan
And you know what? We say this a lot. We say, "hey, don’t make rookie mistakes." Don't make it be PKI amateur hour, but a listener pointed out to me that we don't explain what those mistakes are. So, we are like, hey, let's do that. So, you and I made a list. As I said, it is 21 items long. Don't freak out listeners. A lot of them are related to each other and then we broke them into five main categories. So, what if we list off the categories first so people have a roadmap for where we are going and then we'll jump into it.
-
Jason Soroko
Yeah. Let's do that Tim.
-
Tim Callan
Alright. So, the first category is certificate problems, which is problems with the actual certificates you issue. The second one is deployment problems. Problems with the way that you deploy those certificates. The third one is security problems. Problems with the overall security of the system. Fourth is governance problems. Problems with how you set your rules and compliance and things. Lastly, we have visibility problems. Problems where you can't see and understand what's going on and how that gets you in trouble. So, with that, certificate problems. Number one, weak keys.
-
Jason Soroko
This is something we podcasted on. In fact, I think that will be a theme of this list, Tim: we have actually listed off a lot of these in the past in various podcasts. In fact, weak keys is something that I can see coming from, you know, a Linux administrator who is being asked to set up an OpenSSL PKI and just chooses some defaults.
-
Tim Callan
Yeah.
-
Jason Soroko
Or setting up other PKI systems and just choosing defaults and not really thinking hard about the certificate definition and what is appropriate for their implementation.
-
Tim Callan
Do you think there is a sense of somebody deliberately trying to choose smaller keys because they think that there is a performance requirement or a performance benefit to doing so?
-
Jason Soroko
I can't get into the mind of that person. I don't see why they would do that. If that person really wanted to choose a smaller key size or key digest size, I could see them having already researched the fact that you can use smaller key sizes on things like ECC vs. RSA.
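The ECC vs. RSA point can be made concrete with a small lookup. The sketch below is only an illustration: the strength figures follow the commonly cited comparable-strength table in NIST SP 800-57 Part 1, while the function names and the 112-bit floor are assumptions chosen for this example, not something stated in the episode.

```python
# Approximate comparable security strengths in bits, following the table in
# NIST SP 800-57 Part 1. Each key is the smallest key size that reaches the
# strength on the right. Treat these as rough guidance, not exact values.
RSA_STRENGTH = {1024: 80, 2048: 112, 3072: 128, 7680: 192, 15360: 256}
ECC_STRENGTH = {160: 80, 224: 112, 256: 128, 384: 192, 512: 256}


def security_strength(algorithm: str, key_bits: int) -> int:
    """Return the approximate strength of a key, or 0 if it is below the table."""
    tables = {"RSA": RSA_STRENGTH, "ECC": ECC_STRENGTH}
    table = tables[algorithm.upper()]
    reached = [threshold for threshold in table if threshold <= key_bits]
    return table[max(reached)] if reached else 0


def is_weak_key(algorithm: str, key_bits: int, minimum_strength: int = 112) -> bool:
    """Flag keys below a policy floor (the 112-bit default is an assumption)."""
    return security_strength(algorithm, key_bits) < minimum_strength


print(is_weak_key("RSA", 1024))
print(security_strength("ECC", 256), security_strength("RSA", 3072))
```

The lookup shows why a 256-bit ECC key is not a "small key" in the sense of this pitfall: it sits in the same strength band as a 3072-bit RSA key.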
-
Tim Callan
Sure. Sure. That's a great way to put that. Alright. So, number two of certificate problems. Certificates where the term is too long. Certificates that basically are sticking around longer than they really should be.
-
Jason Soroko
That is a common one in the early days of IoT.
-
Tim Callan
Right.
-
Jason Soroko
So therefore, there was such a thing as fire-and-forget certificates, just because devices were so constrained and there were no technologies, or limited technologies, at the time to be able to do renewals, and revocation on closed-off networks is very difficult. So a lot of people at the time chose these very, very long certificate life spans. With something like an automobile, you know, perhaps back in the earliest days you might say, well, I think the car is gonna last 15 years, why don't I make it a 15-year cert.
-
Tim Callan
Or double it just to be safe. Make it a 30-year cert. Right. Exactly.
-
Jason Soroko
Yeah. Yeah. And so, therefore, yeah, there have been some long ones. Tim, in your experience with public trust has that come up in public trust?
-
Tim Callan
Well, public trust is of course regulated. What's happened in the world of public trust is there's been this push/pull dynamic about the length of certificates. So, if you go all the way back to the beginning, they were one-month – SSL certs were one-year long. They were all just one year. That's all there was, and then people started to say, well, what about a longer cert, and how about two years, three years, and they got up to the point where you could buy like a ten-year SSL cert, which I think definitely is excessive. That's a lot of what the CA/Browser Forum has done: it's provided a venue for the industry to work on standards like that, and what we've seen is the acceptable duration of certs has gotten shorter and shorter and shorter and shorter. It's down to one year now for us to sell, coming up in just a couple months, and could get shorter still. And, so, that's an important part of this whole thing obviously. In the case of public certs, you're capped. You can only buy what's allowed, but you could also imagine the same questions occurring for your private TLS certs. So if I'm running these certs in my own system I might ask myself, well gee, do I really feel ok issuing a five-year certificate when in the public world they've brought it down to much, much, much less than that?
-
Jason Soroko
And, in private trust, of course, we've talked about concepts such as DevOps, which has an even shorter life span perhaps than certain kinds of IoT device leaf certificates. We've gone down to suggesting two hours is sufficient for certain kinds of tasks. So therefore, anybody who is configuring their CA for that and allowing their leaf certificates to last a period of days or weeks or months or perhaps even years is probably making the length of viability for that certificate too long.
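One way to picture the "too long for the use case" check is as a per-use-case ceiling on the validity period. The following is a minimal sketch under stated assumptions: the use-case names and every ceiling in `MAX_LIFETIME` are hypothetical policy values picked for illustration (the two-hour figure echoes the DevOps suggestion above), and `lifetime_violation` is an invented helper name.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical policy ceilings per use case. Every value here is an assumption
# made for illustration; a real policy would come from your own CP/CPS.
MAX_LIFETIME = {
    "devops_task": timedelta(hours=2),
    "public_tls": timedelta(days=398),
    "private_tls": timedelta(days=365),
}


def lifetime_violation(not_before: datetime, not_after: datetime, use_case: str):
    """Return how far the validity period exceeds the ceiling, or None if it fits."""
    excess = (not_after - not_before) - MAX_LIFETIME[use_case]
    return excess if excess > timedelta(0) else None


issued = datetime(2020, 7, 6, tzinfo=timezone.utc)
# A 30-day leaf certificate issued for a short-lived DevOps task is flagged.
print(lifetime_violation(issued, issued + timedelta(days=30), "devops_task"))
```

Returning the excess rather than a plain true/false makes it easier to rank the worst offenders first when reviewing an inventory.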
-
Tim Callan
And of course, one of the consequences of that is for crypto agility, which is a theme we hit over and over, and with that let's move on to the next one, which is outdated cryptographic algorithms, and a great example of that would be SHA-1.
-
Jason Soroko
Isn't it amazing, Tim? But again, I think as we saw in a recent podcast, groups such as the people who are involved with the OpenSSH initiative are taking that away as a default, and that seems to be perhaps the number one way to avoid this problem. First of all, if there is a default setting in the PKI system that you are setting up, just question it. Am I choosing it just because it's the default, or am I choosing it because I know why I'm choosing it?
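Questioning a default can be as simple as checking the signature hash against a deny list before a certificate profile goes live. This is a rough sketch, not a complete policy: the deny list contents and the function name `check_signature_hash` are assumptions for illustration, and only Python's standard `hashlib` is used to look up digest sizes.

```python
import hashlib

# Hash functions this hypothetical policy refuses for certificate signatures.
DISALLOWED_HASHES = {"md5", "sha1"}


def check_signature_hash(name: str) -> dict:
    """Normalize a hash name such as 'SHA-1' and report its size and policy status."""
    normalized = name.lower().replace("-", "")
    digest_bits = hashlib.new(normalized).digest_size * 8
    return {
        "hash": normalized,
        "digest_bits": digest_bits,
        "allowed": normalized not in DISALLOWED_HASHES,
    }


print(check_signature_hash("SHA-1"))
print(check_signature_hash("SHA-256"))
```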
-
Tim Callan
Right.
-
Jason Soroko
I would say that would be the number one reason.
-
Tim Callan
Yeah. If I don't know enough to answer that question maybe I should learn a little more or get someone onto the project who does, right? Number four for certificate problems – stapling.
-
Jason Soroko
Well sure. I suppose here we are talking about OCSP?
-
Tim Callan
Yeah.
-
Jason Soroko
So, in terms of stapling, you know, how that can be set up correctly or incorrectly. I'm sure there's ways in both directions. I would say though that in terms of whether you are a web browser – sorry – if you are a web server administrator, for example. Your choice of accepting stapling or not I think that this is an area where if you are gonna be in the business of serving a public website you should learn about what you are accepting and not accepting and how you are configured to be able to speed up the process of revocation checking. I think that's very, very important.
-
Tim Callan
Yeah. And stapling obviously, there are people who do it and it has its proponents, but it leaves you in a very severe crypto agility deficit. Right? And if you need to respond quickly to events it really hurts your ability to do that. So, it's definitely – is it an always no-no? Maybe not but it certainly is a dangerous practice, and you have to know what you are doing.
-
Jason Soroko
So many of these, Tim, aren’t they? They come around to that theme of just choosing a default or choosing something because you heard of it in the past. You are not quite sure why you are doing it. Studying further or calling an expert is typically the way around a lot of these pitfalls.
-
Tim Callan
Ok. So, moving on. Deployment problems. The first three of these are going to be very similar to each other, so I'll start with the first one. Reuse of the same certificate across multiple servers, devices, etc. So we see this all the time.
-
Jason Soroko
Wow. This one is so common. It's so common. This one needs an asterisk and an underline and everything else. Get out your markers. Yeah. So, for those of you in the IoT device manufacturing business, for heaven's sake, please do not use the same leaf certificate on every single device.
-
Tim Callan
Right.
-
Jason Soroko
In fact, I'll even go further and say why in the world do we have default username and passwords. Why are we still having a problem of the same symmetric token? It just goes on and on. It's not just an asymmetric PKI certificate, this problem is rampant amongst every form of authenticator.
-
Tim Callan
Yeah. Agreed. Absolutely. And why do people do it? Well, I think number one is they just want to save some money and number two is I think they want to save some effort. It's easier. You can be more intellectually lazy and get away with it.
-
Jason Soroko
If your device – if you are in the business of making copies of a thing then it's so much easier to make everything absolutely identical. What we are talking about is the identity, the digital identity of these things have to be unique. This is where the rubber hits the road here.
-
Tim Callan
Yeah. And of course, as I just said, the key is if you can get away with it. The problem with that practice, of course, is that if one cert is bad, they all are bad. And with that let's turn to number six on the list, which is overextension of the use of wildcards. So, we know wildcards are very useful for being able to handle multiple domains on a single server or on a single cluster. We do see people trying to just paint their entire install with one wildcard. Right?
-
Jason Soroko
I've seen it. I've seen it. If things go bad, things go very bad for you.
-
Tim Callan
Right. And that's the same thing as using the same cert. It's sort of an anything fails, everything fails kind of scenario. You've got the same thing with overuse of the wildcard. So real anecdote, we all remember we've done podcasts in the past where we talked about revocation and forced revocation and the rules of revocation and malware and things like that. So as a standard practice we get reports of phishing sites and we confirm that they are really phishing sites and if they are, we revoke the cert because that's against our CPS. And it's against our EULA, right? And so, at one time in the past we got a report of a phishing site. We checked it out. Yep, indeed, it's a phishing site. We revoked the cert and then next thing you know somebody is complaining to us on Twitter and they are some hoster. It's not like one of the big names you know, but it's some hosting company and they say, "Hey Sectigo, you just took down 300 customers or something of mine." And the response to that is why are you sticking 300 of your customers who control their own content and who put up their own content and who know nothing about each other on the same wildcard? Cause as soon as you do that, if one of them needs to be revoked or if one of them has a problem, they all go down. It's an unacceptably fragile system.
-
Jason Soroko
The hoster example is a perfect example but for large enterprises you can have the same mistake being made.
-
Tim Callan
Alright. So, then the same thing, overuse of multi-domain certificates. You basically do the same thing, but you use the SAN field to stuff too much in your certificates and you find yourself in the exact same situation.
-
Jason Soroko
Yeah. We see CDNs that have upwards of 400 domains within a single multi-domain OV certificate. You can look at those on the CT logs, it's quite fascinating.
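The "anything fails, everything fails" risk of shared, wildcard, and overstuffed multi-domain certificates can be estimated by counting how many names ride on each certificate. The sketch below assumes a hypothetical inventory that maps a certificate label to the hostnames it serves; the labels, hostnames, and the 25-name threshold are all invented for illustration.

```python
def blast_radius(inventory: dict) -> dict:
    """Map each certificate to the number of distinct names that fail if it is revoked."""
    return {cert_id: len(set(names)) for cert_id, names in inventory.items()}


def overloaded_certificates(inventory: dict, max_names: int = 25) -> list:
    """List certificates that cover more names than the (assumed) policy threshold."""
    radius = blast_radius(inventory)
    return sorted(cert_id for cert_id, count in radius.items() if count > max_names)


# Hypothetical inventory: one wildcard shared by 300 unrelated customers,
# and one certificate dedicated to a single site.
inventory = {
    "wildcard-shared": [f"customer{i}.hoster.example" for i in range(300)],
    "single-site": ["www.shop.example"],
}
print(overloaded_certificates(inventory))
```

A count like this does not say a large certificate is wrong in every case, only that a single revocation or expiry would take all of those names down together.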
-
Tim Callan
And then you say, is that really safe? Right? Number eight. Failure to automate certificate renewal. We talk about this in, gee, like 50% of our episodes.
-
Jason Soroko
Yeah. It kind of comes up in almost every talk. So, in other words, it's great to have these certificates. It's great. But especially in the public realm if you are depending on a human Linux administrator for example to be able to renew those certificates, you are probably in for a lot of pain because there have just been so many examples. Tim and I are now almost exhausted talking about all the examples of failure to automate this kind of thing.
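The core of automated renewal is a scheduled check that does not depend on anyone remembering a date. As a minimal sketch, assuming a rule of "renew once the final third of the lifetime begins" (that rule and the name `renewal_due` are illustrative choices, not a standard), the decision could look like this:

```python
from datetime import datetime, timedelta, timezone


def renewal_due(not_before: datetime, not_after: datetime, now: datetime,
                divisor: int = 3) -> bool:
    """Return True once the remaining validity is at most lifetime / divisor."""
    lifetime = not_after - not_before
    remaining = not_after - now
    return remaining <= lifetime / divisor


issued = datetime(2020, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(days=90)
print(renewal_due(issued, expires, issued + timedelta(days=10)))
print(renewal_due(issued, expires, issued + timedelta(days=61)))
```

In practice a job such as this would run on a schedule and hand off to whatever issuance mechanism is in use; the check itself is the part that removes the human from the loop.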
-
Tim Callan
Exactly. To learn more, listen to just about any episode of this podcast.
-
Jason Soroko
Right.
-
Tim Callan
Number nine. Maintaining – what do I want to say? Unnecessary private CAs. Unnecessary redundant private CAs. So how this goes is somebody thinks they need a CA and they stand up a CA, and then at another time another individual at a different place in the organization thinks they need a CA and they stand up a CA instead of using the CA the company already has. You wind up with different private CAs sitting out there, and they all need to be maintained and they all need to be monitored, and the same machines don't necessarily have trust to the same CAs if the root didn't get installed, and it just makes things more fragile and it adds work for no good reason.
-
Jason Soroko
Right Tim. I would say though on the flip side there's also organizations who probably didn’t think through, hey, I'm going to sell off part of my business down the road potentially.
-
Tim Callan
Right. Yeah.
-
Jason Soroko
And therefore, I need that product line to have its own CA so that they can cleanly be severed from me, and we can all be happy.
-
Tim Callan
I love it Jay. So maybe it's not maintaining the right CAs to reflect your organizational needs, right?
-
Jason Soroko
That's it. And think of even potential future needs that you might not have ever considered. This is where talking to experts about the normal forms of trust models for various kinds of businesses, this is where the experience, you know, 20-30 years of experience that we have in the industry, we've seen all these cases and we've seen where people must tear their hair out later. This is a pitfall where you probably want to talk to an expert for some advice.
-
Tim Callan
Cool. And then ten is very similar to nine. Unnecessarily maintaining Microsoft CA. So once upon a time Microsoft CA was kind of the only game in town for a private CA, and it didn't matter because it worked for all your Windows machines. As you and I have talked about, it is a rare enterprise now that can get all their private CA needs handled with Microsoft CA, and yet we see people then thinking, ok, I'm gonna have two. I'm gonna have a Microsoft CA for the Windows stack and I'm gonna have a private CA for everything else, and under those circumstances it may be that that Windows CA, that Microsoft CA, is not necessary and that you could just be using one for your full set of machines.
-
Jason Soroko
If we were still living in the Steve Ballmer world and Bill Gates era of Microsoft, I can see how people got caught up in the idea that everything will always be Microsoft stack forever, forever, forever. Not even Microsoft says that now. And so, therefore, moving away from Microsoft CA is probably a better bet for you.
-
Tim Callan
Yeah. And moving away also is very practical. So, the other reason you might be maintaining an unnecessary Microsoft CA is because you did earlier. Right? It's been around. It's entrenched and we do this with clients all the time. Like you really ought to be able to pull that stuff off and stick it on your private CA and it's not that big of a deal and it certainly isn't a technical barrier.
Alright. Security problems. So don't worry, there's only two of these. We are up to number 11. Number 11, not properly protecting your private keys.
-
Jason Soroko
Right. That's probably a problem regardless of public or private trust. And that stems all the way from the provisioning process. In other words, when you are receiving your certificate, that whole process of how it is transmitted to you needs to be secure. And that's why, Tim, we've really embraced a lot of public standards in how to do that, because a lot of these problems have been solved.
-
Tim Callan
Yeah.
-
Jason Soroko
Once you have the key material, once you have your certificate in IoT it's a pretty big deal. Therefore we have secure elements. This is why we talk about secure elements all the time. On mobile devices there are enclaves. On laptops there are TPMs. There's usually a good place to put your certificate.
-
Tim Callan
Cool. Number 12 – failure to apply patches and respond to zero days. So one of the problems of running your own CA, your own PKI is that you are not necessarily fully focused on responding to things in real time as they need to be responded to. You know, people have day jobs. People have vacations. People have things like that and as a result there can be gaps. There can be a lag or there could be things that just plain get missed and that itself can introduce vulnerabilities.
-
Jason Soroko
Sure, Tim. And, don’t forget, we have seen things like Heartbleed for public implementations of OpenSSL. Just make sure you can patch your stuff.
-
Tim Callan
Right. And that you do patch your stuff and you know what you need to patch. Find out what patches you need and get them.
Alright. Governance problems. Number 13. So just at a high level, number 13 is lack of governance at all. Lack of a policy. So, it's the Wild West. People just get to do whatever they want and there's no rules and there's no guidance.
-
Jason Soroko
I am seeing that pitfall kind of rearing its ugly head now in some of the latest usages of PKI, namely DevOps, Tim.
-
Tim Callan
Sure. Great example.
-
Jason Soroko
Where OpenSSL CAs are just being stood up by Linux administrators just to get their Kubernetes cluster working and then the CSO has no visibility to it. It's quite amazing.
-
Tim Callan
It doesn't even know it exists. And also, and I would say this is still true today, Jay. Certainly, it's been true in the years leading up to now is in the world of IoT. No governance. No real rules. Right? An individual product team just kind of makes a choice and nobody knows what choice they are making. There are no guidelines on what choice to make and there's no consistency. Other people might make a different choice and that's just how it gets done.
-
Jason Soroko
Yeah. Typically, in the more modern or newer use cases for PKI where it's the Wild West periphery. People aren't quite sure how to do it so they just do it some way.
-
Tim Callan
Yeah. They just do something, and they don't necessarily even report on what they did. They might not document it, etc.
Alright. So, number 14 – Certificate practices or shall I say CPS problems. So, there are a variety of ways that your CPS can be bad.
-
Jason Soroko
Yeah. Absolutely. And we don’t have the time to go into all of them. Perhaps, Tim, that's a whole podcast unto itself.
-
Tim Callan
It probably is, right. We could rattle a few off. It's writing it badly. It's just taking somebody else's CPS and universally replacing the name and crossing your fingers in hope. Yeah.
-
Jason Soroko
You know what, Tim? It almost goes together. Don’t roll your own crypto and don't write your own CP and CPS.
-
Tim Callan
There you go. I love it. I love it. Number 15 – Failure to do revocation checking.
-
Jason Soroko
Yeah. You know, in some cases in some forms of IoT there may not be a good reason to do revocation. I would say in just about every other scenario you want to do revocation checking. And we've had a whole podcast on the problems of revocation checking.
-
Tim Callan
Yes.
-
Jason Soroko
Especially in the public realm but on the other hand, if you are not checking your certs, if you are not revoking your certs you had better have a good reason and you had better also have a shorter certificate life span than you think you need.
-
Tim Callan
Yeah.
-
Jason Soroko
That's, again, perhaps another whole podcast that we could really dig into those details.
-
Tim Callan
Yeah, but unless it's a mighty short certificate life span like a couple of hours you talked about earlier, probably still there are scenarios where you need to be able to revoke a cert. What really matters is that you need to be able to revoke a cert.
Alright, number 16. Sixteen and 17 I'm gonna go together cause they're flips of the same thing. Which is, choosing public certs when private would serve you better or choose private certs when public would serve you better. So thoughts on that Jason?
-
Jason Soroko
Tim, here's a thought. I've had customers in the field come to my team and they've asked us – simply because they know the browser analogy with certificates. They know that hey I need to be publicly trusted with my browser because the browser is gonna raise an error or raise some sort of fuss if it's a self-signed certificate of some kind. Therefore, I must need a publicly trusted certificate, right? When what the customer really needed was to have a signed certificate that was signed by a top-level hierarchy that was trusted by a third-party CA such as a Sectigo, for example. And therefore, those certificates would be trusted in a form of walled garden of trust and therefore a publicly trusted certificate was not necessary and that's an example.
-
Tim Callan
Yeah. And the private certs give you more control and more flexibility and more visibility in that scenario, so you are better off. And yet on the opposite side, right, you could imagine that your walled garden isn't fully walled, isn't as walled as you think it is, and you want to give access to other parties or you want to be part of a larger ecosystem, where attempting to do what you just described might get you in a situation where other people have accessibility problems or where there are compatibility problems with other systems. In that scenario you might just say go full-on public and you won't have that problem.
-
Jason Soroko
And here's a sub example of that. If the configuration of whatever it is you are doing requires a browser and human being touching that browser you are probably gonna want a public cert.
-
Tim Callan
Right. Cause there is no way they are all gonna get it right. Absolutely zero chance of all of them getting it right.
-
Jason Soroko
Yes.
-
Tim Callan
So that's a good thing to think about. Right? And people don't really think about that that deeply. They just kind of pick something and maybe later it hurts them.
Alright. Visibility problems. We are up to number 18. We are almost there, folks. Don’t despair.
Eighteen – allowing rogue certificates to operate in your environment without taking them under management.
-
Jason Soroko
I'll give you a public and private example, Tim.
-
Tim Callan
Go.
-
Jason Soroko
The public example of that is hey, you know, my company is big and I allow my Linux administrators at any given time to buy whatever flavor of SSL certificate they want and I don't know what I have implemented.
-
Tim Callan
Right.
-
Jason Soroko
Number two, in the private realm, hey how about getting privately issued certificates from goodness-knows-where from what CA into my ATM machines, into my point-of-sale systems and perhaps into some other scanning device and I have absolutely no idea what sort of certificate policies are being used for each of those. And by the way, I'm also now setting up a DevOps operation for some web applications and I have absolutely no idea that there is even a CA in there even though there is.
-
Tim Callan
Right.
-
Jason Soroko
I guarantee there are some organizations in the world that have everything I just said as a problem.
-
Tim Callan
Correct. And so, what are your responses to that? I think it's two things. Number one is this is where CAA can be really useful. Like in the world of SSL certs, which is a big part of this, you put CAA in place. You put some parameters around what can be done and when some yay-hoo just tries to go out and buy a cert and stick it on then maybe they can't. Right. And then they will say hey I can't get this cert issued and it'll send them to you, and you know what they are doing.
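To make the CAA idea concrete: a CA looks up the domain's CAA records before issuing and refuses if it is not named there. The sketch below captures only the basic "issue" rule from RFC 8659 and deliberately leaves out wildcard handling (`issuewild`), critical flags, and climbing the DNS tree. The record tuples, the domain `ca.example`, and the function name are hypothetical.

```python
def ca_may_issue(caa_records, ca_domain: str) -> bool:
    """Decide whether a CA may issue, given (flags, tag, value) CAA record tuples.

    Simplified: only the "issue" property is considered.
    """
    issue_values = [value for _flags, tag, value in caa_records
                    if tag.lower() == "issue"]
    if not issue_values:
        # No "issue" property present: CAA places no restriction on issuance.
        return True
    # Drop any parameters after ";" and compare issuer domain names.
    # A bare ";" yields an empty name, which authorizes no CA at all.
    permitted = {value.split(";", 1)[0].strip().lower() for value in issue_values}
    return ca_domain.lower() in permitted


records = [
    (0, "issue", "ca.example"),
    (0, "iodef", "mailto:security@corp.example"),
]
print(ca_may_issue(records, "ca.example"))
print(ca_may_issue(records, "other-ca.example"))
```

This is the mechanism behind the scenario described above: the person buying from an unapproved CA gets a refusal, and that refusal is what routes them back to the team that owns the policy.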
The second one that's more universal and helpful in this regard is certificate discovery. You go out. You crawl your network. You find the certs. Now you know what they are. Now you don't get surprised by expirations or certificate problems like we had in the first one or lack of automated renewal like we had in the second section. Like all that stuff gets easier and more addressable once you know what those things are.
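A very small version of certificate discovery can be sketched with Python's standard `ssl` module: connect to each endpoint, read the certificate it presents, and sort what you find by expiration. This is an illustrative sketch only, not how any particular discovery product works. The endpoint names in the sample data are made up, and a real crawl would also have to handle endpoints whose certificates do not validate against the system trust store, such as privately issued ones.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_peer_cert(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Return the server certificate as the dict produced by getpeercert().

    Uses the default context, so a certificate that fails validation raises
    an ssl.SSLError instead of being returned.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def expiry(cert: dict) -> datetime:
    """Convert the certificate's notAfter string into an aware UTC datetime."""
    seconds = ssl.cert_time_to_seconds(cert["notAfter"])
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def expiring_soonest(found: dict) -> list:
    """Order discovered endpoints by expiration date, soonest first."""
    return sorted(found, key=lambda endpoint: expiry(found[endpoint]))


# Sample data shaped like getpeercert() output, so the report logic can be
# tried without touching the network.
found = {
    "intranet.corp.example:443": {"notAfter": "Jan  5 09:34:43 2030 GMT"},
    "pos-gateway.corp.example:8443": {"notAfter": "Jun  1 00:00:00 2025 GMT"},
}
print(expiring_soonest(found))
```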
-
Jason Soroko
And Tim, number three, I would say for CSOs, if you just heard all of that and you are wondering if that applies to you, just keep in mind that with the PKI vendors that are out there today, namely us, the concept of a single pane of glass for all of your certificate issuance and management is a real possibility right now, today, and it is something you should look at.
-
Tim Callan
Got it. Number 19. This will be really quick; it's the same thing but for a CA. So allowing rogue CAs to operate in your environment. And I think we touched on that. So there's the certs, where people issue certs or buy certs and stick them up, and there's the CAs, where people stick up an entire CA, like in the DevOps scenario. And in either case, allowing those to operate on their own without governance is fraught with risk.
-
Jason Soroko
Yeah Tim. I guess the only thing I would add to that is it's a real benefit to be able to set up an experimental CA for quick learning purposes, but those things should be torn down and should be brought into a wider governance.
-
Tim Callan
Right. And should be sandboxed and all that.
-
Jason Soroko
Oh yeah. Everything.
-
Tim Callan
Alright. Number 20 – Future use case planning. You already touched on this one, Jay, which is we see a lot of people kind of making the decision that's gonna get them through the end of the day rather than saying what is my business going to look like in a year or in five years and making the decisions that will enable that.
-
Jason Soroko
Yeah. That's everything from as small as what's my key digest size all the way to what's my ultimate trust model five/ten years from now. Or what could it potentially be and how do I set up things properly? This is where calling in an expert to think it through with you, that's where the benefit of that really comes in.
-
Tim Callan
And granted, you can't, you know, who is it who said predictions are hard especially when they are about the future. That was Yogi Berra, I think. So granted you can't know everything that's going to happen, but you can at least think it through and at least account for the likely outcomes.
-
Jason Soroko
Yeah, Tim.
-
Tim Callan
And then number 21 is staying current on cryptography, which of course becomes very important now that we are looking at the coming advent of quantum-safe, post-quantum cryptography. So, to be able to say I know what's happening with crypto, I know what's needed, and I'm prepared to adjust as it needs to adjust is very important, and if you are not doing that, it's gonna be a lot of jeopardy.
-
Jason Soroko
Yeah, and I hate to say it but this also comes into the realm of work with a PKI vendor that can future proof you. In other words, leave that problem to the experts. And if you are leaving it to yourself, you are probably gonna end up in trouble because time does move on.
-
Tim Callan
Alright. Well, that's it, folks. A super-fast 21 PKI pitfalls to avoid. We have gone into depth on some of these in the past. We will go into depth on more of these in the future. But there's your nice handy list for you. So, Jason?
-
Jason Soroko
Yeah. No, thanks Tim.
-
Tim Callan
Thanks. Thank you, listeners. This has been Root Causes.