Root Causes 64: What Is Digital Identity?

Tim Callan

How are you doing today, Jay?

Jason Soroko

Doing fantastic, Tim, and you know, sometimes I gotta ask the question, is this really Tim I’m speaking to here?

Tim Callan

Exactly. That’s directly relevant to what we are discussing today. Today, we are going to talk about identity.

Jason Soroko

Yeah. Identity. It’s funny. It’s one of those terms or concepts that people almost take for granted. They don’t think too hard about it.

Tim Callan

Right.

Jason Soroko

I know who you are. I can tell from your voice. I can tell from your mannerisms and not only that, but you are somebody that I trust and therefore, if you tell me x is y, I will probably believe that. How does that all happen? Well, in a human sense, it’s complex, isn’t it?

Tim Callan

Yeah. Absolutely. And, you know, it’s funny, we have these, I think these basic offline intuitive understandings of some of these words – identity, security, things like that that are almost a perfect overlap on the Venn diagram with what we mean in the computer science sense but maybe not a perfect overlap. Right? So, you know, you’d say this is a perfect example. You’d say what’s your identity? Well, for a human, it’s easier in a way because there is a big mass of cells. Right? There’s a bunch of cells with skin on the outside and we call that the human. But when we get into the world of computer science, you know, your car. How many identities does your car have? And so, this is where I think it gets a little different and maybe a little more abstract.

Jason Soroko

It really starts to get abstract very quickly and interesting. You know, in a human sense there’s all kinds of transactional things that we do daily. For example, you know, I have a bank account and therefore, I have an identity to a bank. Well, how did that happen? Well, there had to be an onboarding procedure or a provisioning procedure at some point in time. I had to go into the bank probably. I needed to be able to show them various forms of government-issued identification and those things needed to be able to crosscheck each other. Well, even that had to be - - because of a previous trusted procedure where my government itself probably issued a driver’s license, birth certificate and whatever else I needed to do to be able to be part of that, you know, a passport, etc. How are all those things provisioned? It’s interesting how those are things that I would then possess at some point in time to be able to assert my human identity to do something like a financial transaction. Right, Tim?

Tim Callan

We would call those credentials.

Jason Soroko

Yeah. Absolutely. And, in fact, you know, I log into various digital platforms daily and those themselves came from a provisioning procedure where I initiated, hey, I am who I am and once that was proven to a certain point, I was then handed a credential with privileges to be able to log in with something like a username and password and multi-factor authentication, etc., etc., right?

Tim Callan

So, again, we kind of get it although even then I think we can get into - - you get a little corner casey even with people, right? Like if somebody has, I don’t know, a major personality shift, we can have a nice philosophical debate about whether they are the same person or not. Um, but, you know, we take this world and we kind of bring it over to the world of computers and in the case of computers, you know, what we are really trying to understand is, is this digital entity that is attempting to communicate with some other digital entity, what is that? Right? And is it on a known list of trusted entities or is it not and is there another way to attach an identity after the fact?

Jason Soroko

Yeah, because ultimately what is going on, Tim, is if I’m able to, if I’m a bad guy and I’m trying to impersonate you, right? Impersonate your identity, especially if you have some level of credential that’s useful to me then in a digital sense then I am able to, if I’m able to copy your identity; if I’m able to impersonate your identity, I am then able to perform similar types of tasks that you have credentials to do. That becomes very scary if you have access to very sensitive information or even a nightmare scenario for a lot of organizations, which is, let’s say you are an IT domain administrator, for example. If I can somehow take your credential and walk around with it, I’m essentially you. So, therefore, that’s about protecting your identity. And I’ll give you an example of that. This is from a previous podcast. You know, we’ve been talking about SSH. We’ve been talking about Microsoft Windows hashes. As an example, if I’m able to steal somehow your Active Directory hash, I essentially can walk around as you and assume your identity without a lot of fuss going on, on the network. The network will not complain too much if I possess your hash. So, isn’t this interesting, Tim? We have so many ways to represent yourself, which is, from the credential. A lot of which can sometimes be a symmetric example of a credential. A username and password. Which again, is just a shared secret that can be copied or stolen.
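
A minimal sketch of the shared-secret point, assuming a hypothetical server that checks an HMAC response to a challenge. The names `SHARED_SECRET`, `respond`, and `server_accepts` are illustrative only and do not describe how Active Directory works internally. The idea it shows is simply that whoever holds a copy of the secret looks the same to the verifier as the legitimate owner.

```python
import hashlib
import hmac
import secrets

# Hypothetical shared secret, stored on BOTH the client and the server.
SHARED_SECRET = secrets.token_bytes(32)


def respond(secret: bytes, challenge: bytes) -> bytes:
    """Prove knowledge of the secret by keying an HMAC over the challenge."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()


def server_accepts(challenge: bytes, response: bytes) -> bool:
    """The server recomputes the HMAC with its own copy of the secret."""
    expected = respond(SHARED_SECRET, challenge)
    return hmac.compare_digest(expected, response)


challenge = secrets.token_bytes(16)

# The legitimate user answers the challenge.
legit_ok = server_accepts(challenge, respond(SHARED_SECRET, challenge))

# An attacker who copied the secret answers identically.
stolen_copy = bytes(SHARED_SECRET)
attacker_ok = server_accepts(challenge, respond(stolen_copy, challenge))

# Someone without the secret is rejected.
guess_ok = server_accepts(challenge, respond(b"wrong secret", challenge))

print(legit_ok, attacker_ok, guess_ok)
```

The verifier has no way to tell the first two cases apart, which is why a copied shared secret amounts to a copied identity.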

Tim Callan

Yeah. That’s what you know? Right? If you do what you are, what you have, what you know, that’s a what you know.

Jason Soroko

And on the other hand, you might also be logging into a system with a PKI-based asymmetric certificate, which is something that you might possess. Something that you have. And it might be protected in a secure element and my ability to socially engineer you, my ability to somehow compromise your end point and try to steal that becomes vastly more difficult. I think what’s interesting here, Tim, we are talking about the concept of identity and when you think of it as you are protecting your identity regardless of the underlying credential type, it helps you to understand why protecting that credential, which represents your identity, is so important.

Tim Callan

Yeah. Well, and so much of that is, and you used the example of if I took over your AD identity, so trust – in a computer science world and probably in an offline world as well for the most part – trust comes from identity and if I have the credentials and I am believed to have admin access, then at that point, I get those abilities. You know, we’ve talked in the past about, you know, regulating and controlling things. Different identities get different levels, so if I’m not on the computer science team or if I’m not in the developer team, I don’t get access to the source code, right, that’s where trust comes from identity. We identify you are a member of the developer team; therefore, we are gonna give you these privileges. Or, I think of an offline world, when I pick up my son from daycare what I do is I get my iris scanned and that opens the door for me and then I can go to the room where the teacher knows me by my face and she lets me walk away with my son and that, again the trust, the privileges of putting my son in the car and going home are coming from me identifying myself as an individual who has those privileges.

Jason Soroko

That’s right, Tim. At some point, a decision is made to be able to, let’s say, you know, you are given privileges to become a domain administrator, you are given privileges to be recognized as, hey, you are the parent of, you know, Student So-and-So. Like these are all, at some point a decision is made to grant you those privileges and therefore, once those privileges are granted your identity becomes quite important and valuable.

Tim Callan

Yeah. And so, you know, this is part of why it matters and so, you know, it’s interesting, we talked again, about the old adage, right? What you know. We talked about what you know. That’s a password. By the way, I have that. I have to put a PIN in the system. Right? That’s what I know. Then I get my iris scanned. That’s what I am. Right? And then I’m allowed to go through the door. So, I don’t have a what you have in that scenario. I’m not using my phone or my dongle but there are plenty of other scenarios where we are using that as well. If we start talking about machine identity which is probably what we want to be getting to, I’m not sure, these divisions kind of break down, right? Like what’s the difference between what you know and what you have when you are a headless IoT device?

Jason Soroko

This is where, you know, what you have, what you know, what you are becomes a lot more complicated is when we are talking about an IoT device, a smaller computer, perhaps a constrained computer. This is interesting because at the advent of IoT, for example, we saw a ton of schemes for credentializing these devices that essentially were just good old-fashioned shared secrets.

Tim Callan

Right.

Jason Soroko

We saw the Mirai Botnet, for example, take advantage of the fact that a lot of these devices were using hard-coded usernames and passwords or symmetric tokens, which are essentially just shared secrets, and these things were embedded in the device and they were quite easy to steal and yet that was the credential for the device. These were, you know, what was it? Something the device knew? Something the device was? Well, it kind of showed that it didn’t fit. That model didn’t fit. These older fashioned credential types, shared secrets that work with human beings, really that’s a very poor form of identity assertion for devices. I think where we are going with this, where I’d like to take this, Tim, is, you know, one thing a device can do well is to possess something and if it can possess something in such a way that’s extremely difficult to copy, to steal because of a compromise of an endpoint, that’s kind of the ideal credential type.

Tim Callan

So, what’s an example?

Jason Soroko

So, an example could be, let’s just try and break down what the properties of that thing really should be.

Tim Callan

Ok.

Jason Soroko

Something it should possess. So, it should be a file. Right? A binary file as an example. It should be not copied in two or more places. Therefore, it should be asymmetric.

Tim Callan

Right.

Jason Soroko

And it should be able to be stored essentially as a secret because this is what we are talking about. It is essentially a secret. The secret should be stored in such a place where it’s not part of the normal file system of the device itself. It’s an isolated place and whenever the assertion mechanism occurs, that secret never needs to be exposed.

Tim Callan

So, this is a unique public private key pair with a TPM?

Jason Soroko

I think that’s one of your ideal scenarios, yes.

Tim Callan

Right. Ok. I mean are there others?

Jason Soroko

You know, what’s interesting, first, how was that generated? How was that secret generated? Quite often for devices it might have been offboard generated by a certificate authority.

Tim Callan

So, they just crank out a bunch? They have a way of coming up with an unpredictable seed number and then they just make the number of certificates they need to make and put them on the system.

Jason Soroko

Right. But putting them onto the devices is essentially a chicken and egg problem that is solved quite often at the point of manufacture of the device when the device is known and trusted and in safe hands. Right?

Tim Callan

Uh-huh.

Jason Soroko

Because quite often that credential might need to be provisioned to the device at a non-trusted place or time.

Tim Callan

Right.

Jason Soroko

So, in other words, it’s just the same problem as, you know, when I as a human being walk into a bank there needs to be a provisioning procedure. Right? It’s the same thing with a device. I can - - we know we can let that device possess something but at what point do you provision that thing that it will possess?

Tim Callan

Right. So, if I’m the original manufacturer of the chip and you are prepared to trust that I am honest and secure then I could manufacture and provision certificates at that point and trust emanates from there. On the other hand, if I’m putting these things out into the supply chain then you’ve got to look at your supply chain and understand how and when this is being provisioned such that we can still trust that it’s honest and secure.

Jason Soroko

Yeah. That’s right, Tim. So, I think with devices, especially IoT devices as an example, the provisioning of these things it really needs to be thought out carefully in terms of the ecosystem of the device itself. In other words, can we take advantage of the manufacturers, you know, the supply chain of the device itself? In other words, even the fully formed device itself, can it be provisioned with an identity at a subset of the device. Perhaps the chipset at an earlier form of the supply chain. Right?

Tim Callan

Yeah.

Jason Soroko

So, you know, can it be a transport key? Can it be a shared secret at an earlier point and then hook itself into a more established identity down the road. I think what’s interesting here and we could talk about all the different scenarios for devices because very rarely is it ever identically the same. I think what’s interesting from an identity standpoint though is the fact that the concept of identity remains regardless of the underlying technology whether it’s the provisioning technology or the form of credential itself, which may be in the form of an x.509 certificate that we are all quite used to, or it could be in other forms as well. Either way, it’s an identity.

Tim Callan

Yeah. Now one of the interesting things I think though is identity - - so, again, with humans it’s kind of simple. You say there’s one human being, right? One head, ten toes, a bunch of stuff in between and we are gonna call that one human and we are gonna give it one identity. There is not really a concept of saying that, you know, my kidney has to have its own identity but if you get into the computing world, that’s not true at all. You know, I mentioned earlier your car doesn’t have one identity. Like in the offline world, my car has one identity. It has a VIN. It has a license plate number. Right. It has one identity but in the online world, every individual component of that car is its own digital actor essentially and therefore it needs to have its own separate identity. Right? If you look at a complex system, if you look at a commercial jet, that’s not one identity. That’s a whole ecosystem.

Jason Soroko

Yeah. That is exactly right. An automobile is a perfect example because most modern automobiles have let’s just say a lot more than just one or two on-board computers. Sometimes they’ll have up into the level of 100 or more and each one of these things has their own distinct role within the action of an automobile – engine systems, braking systems, even the gateway units themselves. All these different electronic control units each needing their own identity, their own way to assert themselves and perhaps as part of that, it’s not just the what you have. Each one of them might potentially possess an x.509 certificate to assert its identity but they might also want to be able to identify what they are and what the nature of they are quite important because a braking system ECU should have a limited scope of privileges within the automotive network compared to say a gateway.

Tim Callan

Right. And like there’s no good reason why the entertainment system should ever be applying the brakes. Right?

Jason Soroko

Yeah. That’s absolutely 100% correct. So, things like Manufacturer Usage Descriptions and other schemes have been talked about. Unfortunately, I don’t think that those concepts have been really baked out completely within most IoT security platforms but it’s a very important consideration with respect to identity and security.

Tim Callan

It sure seems like there is a lot of potential there and this is where, again, if we go talk about the humans, you know, we have this all the time. Right? I have an identity inside of the enterprise and that identity gives me privileges. If I am identified as a member of dev team as we talked about earlier, now I’m given access to the source code. If I’m identified as a member of the HR team maybe I’m given access to employee’s information but not the other way around. The HR person doesn’t get the source code and the developer doesn’t get the employee information. Right? And so, you could see the same thing working in the car. If you are the ABS, you can do all things to the brakes, but you can’t do anything to the steering.

Jason Soroko

Once again, the good ‘ol principle of least privilege becomes a powerful mechanism.
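
As a rough sketch of least privilege applied to the car example, assume each component’s identity maps to a role and each role to a short allow-list. The role names and action strings below are hypothetical and do not come from any automotive standard.

```python
# Hypothetical roles and actions for an in-vehicle network; names are illustrative.
ALLOWED_ACTIONS: dict[str, frozenset[str]] = {
    "abs": frozenset({"brakes.apply", "brakes.release"}),
    "infotainment": frozenset({"media.play", "media.volume"}),
    "gateway": frozenset({"network.route"}),
}


def is_authorized(role: str, action: str) -> bool:
    """Deny by default: unknown roles and unlisted actions are refused."""
    return action in ALLOWED_ACTIONS.get(role, frozenset())


print(is_authorized("abs", "brakes.apply"))           # True
print(is_authorized("abs", "steering.turn"))          # False
print(is_authorized("infotainment", "brakes.apply"))  # False
```

The deny-by-default lookup is the design choice that matters here: a component gets only what its role explicitly lists.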

Tim Callan

Absolutely. Least privilege here for sure. Yeah. And, you know, you mentioned something to me in a different conversation and I just want to recall, bring you back to, Jay, because I thought it was so interesting. You talked about systems where we look for unique flaws in the silicon of that IoT device in order to give it a truly unique key.

Jason Soroko

Yeah. Well, the thing is, right, you and I coming from the PKI business, we quite often speak in things such as x.509 certificate terms because it’s very, very ubiquitous but we also talk on this podcast quite a bit about SSH keys, which are not necessarily x.509 certificates and then there is also a concept of, alright, well how about a device which has very, very low entropy. It cannot do key generation on its own so what can I do with it? Where can I get the secret from to be able to properly uniquely identify this device and one interesting scheme that people have come up with is, hey, let’s look at the unique pattern of flaws within the silicon of the device and then generate a private key. What is the equivalent of a private key based off that?
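
A simplified sketch of the silicon-flaw idea, under loud assumptions: the physical measurement is simulated with random bytes, it is assumed to read back identically every time, and HMAC-SHA256 is used as a plain key-derivation step. Real measurements of this kind are noisy and need error correction before a stable key can be derived, which this sketch leaves out. The function name `derive_device_key` and the context label are hypothetical.

```python
import hashlib
import hmac
import secrets


def derive_device_key(puf_response: bytes, context: bytes) -> bytes:
    """Derive a 32-byte key from a (simulated) silicon fingerprint.

    HMAC-SHA256 is used here as a simple key-derivation step; it is an
    assumption of this sketch, not a description of any vendor's scheme.
    """
    return hmac.new(puf_response, context, hashlib.sha256).digest()


# Simulated measurements: real ones would come from the hardware itself.
chip_a = secrets.token_bytes(64)
chip_b = secrets.token_bytes(64)

key_a_first_boot = derive_device_key(chip_a, b"device-identity-v1")
key_a_next_boot = derive_device_key(chip_a, b"device-identity-v1")
key_b = derive_device_key(chip_b, b"device-identity-v1")

print(key_a_first_boot == key_a_next_boot)  # same chip, same key
print(key_a_first_boot == key_b)            # different chip, different key
```

The key never has to be stored: it can be regenerated from the device’s physical properties whenever it is needed.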

Tim Callan

And that’s cool because that’s the what you are, right? So, I think earlier on I made a statement to say that, you know, this what you are, what you have, what you know thing doesn’t really work in the world of, you know, a lot of the computing world but in this case it does. Right? The what you are is the silicon there and that is a very good analogy to what I mentioned earlier about, you know, I have a unique iris or I have a unique fingerprint on my left thumb that identifies me uniquely based on my physical person. That’s identifying a device uniquely based on its physical properties.

Jason Soroko

Yes. And the term that I think we used earlier, Tim, that’s so important here is, it’s part of an ecosystem and that ecosystem includes the fact that there was a procedure to measure this uniqueness. There was a procedure at some point in the supply chain to measure the uniqueness of the pattern of whatever it is in the silicon to generate this key and then there has to be an ecosystem of things to be able to go off and read that key. So, you know, there is obviously therefore some kind of provisioning procedure as well as, you know, software to go off and be able to take advantage of it and all these things constitute an ecosystem, which I think identity alone is almost an impossible concept – especially in digital security. I think that’s one of the takeaways here, Tim, is regardless of the identity assertion scheme that we want to employ, whether it’s human or a device, there must be an ecosystem around it in order to be able to support it.

Tim Callan

Yeah. Absolutely. And, you know, that goes back to offline examples as well, right? A license plate number in and of itself isn’t valuable. What’s valuable is that they are all unique and you can look them up and things along those lines. Right? You know, if I write down the license plate number - - if somebody has a hit and run and I write down the license plate number, the point is there is an ecosystem that can track that back to an individual who owns that car and that’s the ecosystem. Right? And same thing here. These things exist inside of this larger system where they are used correctly to get the effects we want.

Jason Soroko

So, Tim, I think one of the conclusions that I have from this is, you know, in the human world, right, you know, we still live in a world of something you know, which is essentially a shared secret password along with something that you have, which could be some form of MFA, right? Second-factor authentication. That kind of works in the human world and it’s been working for a while. It could also just be something you have which is, you know, possessing something like a client certificate as an example.

Tim Callan

Yeah.

Jason Soroko

When we get into the machine world a lot of the human problems of what is the, you know, something you are, something that you know, those things could be quite difficult for a computer but for a computer being able to allow it to possess something that’s a good strong secret still works and in today’s world, that just happens to be an x.509 certificate in many, many cases.

Tim Callan

Or a key pair, right? Almost always a certificate but a certificate or a key pair. But let me abstract this one step further for you, Jay. So, we are still talking about devices and servers and these kinds of physical things but one of the changes in computing in the last ten years is this decoupling of process from metal. Right? So, you’ve got public cloud, you’ve got containerized environments, microservices, and so now what you have is you have loads, you have workstreams that are running that aren’t necessarily connected to any physical piece of silicon and they need identity, too.

Jason Soroko

Which is exactly why - - this is exactly why you need an ecosystem to be able to support the provisioning of those abstractions such as say a Docker container, right?

Tim Callan

Right.

Jason Soroko

And the way to provision that is through its container orchestration engine of choice, such as Kubernetes, as an example. Those would typically be supported by an ecosystem of a certificate authority which is offline generating with good amounts of entropy key pairs and then put into the form of an x.509 certificate and provisioned in a secure manner into that system of containers and orchestration engines. That is how we do it today. In other words, it comes down to the identity of these things is what you have because these things have and possess this credential which happens to be in the form of an asymmetric key pair.

Tim Callan

Now, there still could be a what you are component as you were talking about earlier, which is to say, if I’m supposed to perform a specific operation, my code that’s existing in this container is supposed to perform a specific operation. I am firewalled, for want of a better term, from doing other things inside of that network. So, even hypothetically, if a bad guy managed to inject their bad code into my ecosystem there would be limits on what they would be able to do based on the permissions that I ought to have?

Jason Soroko

Yeah, Tim. This is something we haven’t talked about a lot on this podcast but there is a really big difference between authentication and authorization.

Tim Callan

Sure.

Jason Soroko

So, in other words, I may be able, you know, I may be able to authenticate myself into your active directory system or ERP or CRM or whatever it happens to be that you are offering to me as a service, your API, but when you originally provisioned me, you should have also provisioned me with a specific set of privileges which can then be tested against, you know, those policies can be tested against perhaps a database of those things or perhaps it’s even written directly within my assertion to be able to say, you know, I am the braking system; therefore, if you happen to measure the fact that a braking system is talking to engine management why is that happening?
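
The split between authentication and authorization, with the privileges written directly into the assertion, might be sketched like this. It assumes a hypothetical issuer that signs a name and role at provisioning time. HMAC stands in for the signature only to keep the sketch free of dependencies; a real deployment would more likely rely on a CA-signed certificate. `ISSUER_KEY`, `POLICY`, and the role names are illustrative.

```python
import hashlib
import hmac
import json
from typing import Optional

# Hypothetical issuer key. A real deployment would more likely use a CA
# signature over a certificate; HMAC keeps this sketch dependency-free.
ISSUER_KEY = b"demo-issuer-key"

# Hypothetical policy: which roles may talk to which targets.
POLICY = {"braking": {"wheel-sensors"}, "gateway": {"braking", "engine"}}


def issue_assertion(name: str, role: str) -> dict:
    """Provisioning step: bind a name and a role, then sign the pair."""
    claims = json.dumps({"name": name, "role": role}, sort_keys=True)
    tag = hmac.new(ISSUER_KEY, claims.encode(), hashlib.sha256).hexdigest()
    return {"claims": claims, "tag": tag}


def authenticate(assertion: dict) -> Optional[dict]:
    """Authentication: is this assertion genuine? Returns the claims if so."""
    expected = hmac.new(
        ISSUER_KEY, assertion["claims"].encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, assertion["tag"]):
        return None
    return json.loads(assertion["claims"])


def authorize(claims: dict, target: str) -> bool:
    """Authorization: is this authenticated role allowed to reach the target?"""
    return target in POLICY.get(claims["role"], set())


brake_ecu = issue_assertion("ecu-17", "braking")
claims = authenticate(brake_ecu)

print(claims is not None)                  # authenticated
print(authorize(claims, "wheel-sensors"))  # within its privileges
print(authorize(claims, "engine"))         # authenticated, but not authorized
```

The last line is the case described above: the braking system proves who it is, yet a request to engine management is still refused.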

Tim Callan

You know, and you used an important word there, Jay, which is you said “should.” I would be interested to know, and I have no idea how to find this out, probably this is something that can’t be found out. I would be interested to know how often that really happens. I am suspicious that most of the time people who are developing these systems are lazy about that and that they probably aren’t building those kinds of controls.

Jason Soroko

Well, Tim, you know, in the device world it’s at least talked about. I said earlier, you know, Manufacturer Usage Descriptions and other schemes of authorization and policies are out there. To further what you just said, I’d like to know how many Active Directory systems in the world, you know, what’s the percentage breakdown of people who are using GPOs, which are essentially policies, correctly so that there is segregation of credentials between say your crown jewels of your enterprise versus, you know, let’s say your HR department, your finance department and other things that have their own sets of crown jewels, how much GPO segregation there is from a policy standpoint.

Tim Callan

Exactly. Or going back to our earlier example, if I have a process that’s supposed to do a specific thing, I take dataset A and I manipulate it and I give a command to this other, you know, this other process over here. If malicious code got injected into that part of the system, would it be able to issue commands to different processes or would those other processes reject that and say no, I won’t take that from you because you are not authorized to tell me. And, again, that’s where I suspect that a lot of the time or most of the time people probably aren’t taking that extra step.

Jason Soroko

And sometimes when they do take the step, Tim, they take it - - because they are having to reverse engineer the system, it can become very difficult because let’s say you and I were IT administrators and we just invested in a great big fancy log event system. You know, a SIEM, for example.

Tim Callan

Sure.

Jason Soroko

You and I might look in the logs for, hey, why in the world did the print spooler from the fourth floor happen to log into an HR system? Perhaps we could detect that condition because of credentials that we know have a specific usage or expected usage that might be doing things that they shouldn’t be doing. I think quite often, you know, if you were to walk the vendor hall at RSA, for example, Tim, you might see people who are trying to do this, but they are trying to do it in the reverse, you know, in the reverse order. And what I mean by reverse order is they are trying to reverse engineer the network. That’s fine. I think there’s usages for that if you are looking for something specific. However, it would be way better if we could control it at the actual level of identity.
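
The after-the-fact detection described here could look something like the following sketch, assuming login events arrive as simple records and each credential has a known set of expected targets. The credential names, the event format, and `unexpected_logins` are all hypothetical and are not taken from any particular SIEM product.

```python
# Hypothetical expected usage per credential; all names are illustrative.
EXPECTED_TARGETS = {
    "print-spooler-4f": {"print-server"},
    "hr-clerk": {"hr-system"},
}

# Hypothetical login events, as a log pipeline might present them.
EVENTS = [
    {"credential": "print-spooler-4f", "target": "print-server"},
    {"credential": "print-spooler-4f", "target": "hr-system"},
    {"credential": "hr-clerk", "target": "hr-system"},
]


def unexpected_logins(events: list[dict]) -> list[dict]:
    """Return events whose target is outside the credential's expected set."""
    return [
        e for e in events
        if e["target"] not in EXPECTED_TARGETS.get(e["credential"], set())
    ]


for event in unexpected_logins(EVENTS):
    print(event["credential"], "->", event["target"])
```

This only reports the login after it has happened, which is the limitation being contrasted with controlling access at the level of identity.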

Tim Callan

Yeah. And we all know why this is hard, right? You’ve got legacy systems that were in place before these concepts were. You’ve got different groups in different parts of the enterprise working on things. You’ve got ecosystems that extend beyond your corporate borders where you have partners and suppliers and customers that are all kind of inside of this world and you can’t necessarily control their behavior as closely as you can control your own behavior. You have skunkworks projects that suddenly gain importance. You have the political nature of the enterprise where, you know, this executive doesn’t want to take orders from that executive and all that stuff makes this concept very, very hard. Those human organizational reasons make this concept very hard; but even then, we could get as far as we can get that’s pragmatically realistic and I’m not even sure we are always trying to act.

Jason Soroko

That is 100% true, Tim. That’s the reality of the world. I would say the takeaway perhaps from this podcast is that, you know, identity is an incredibly important topic and concept, and it involves a lot more things than just the form of credential. Quite often, you and I are talking about certificates, which is great, but there’s all kinds of forms of assertion. There are all kinds of forms also even of authorization and how you enforce those policies to create security. Identity is all these things.

Tim Callan

Yeah, and it’s super contextual, right? We throw out this word and you and I talk about identity every podcast whether we use the word or not. Right? We are always talking about it as an underlying assumption but it’s very contextual and you must really understand the specific circumstances and nuances of what it is in order to make sure that you are thinking about these ideas correctly and you are implementing correctly.

Jason Soroko

Yeah. And you know, Tim, that whole idea of legacy, it will be with us forever and unfortunately, it even has carried over into much more modern concepts that you’ve talked about, which is things such as containers and dev ops and IoT and all these things that are new. Old concepts of how to credentialize, how to create policies, you know, much older forms that were carried over from human-based credentialling have carried over into where they shouldn’t have. And that’s something that we are going to have to live with for a while, but you and I are trying to put out the message for better ways of thinking about this.

Tim Callan

Yeah. Absolutely. And to some degree, we’ve got the easy job, right, because someone else has to go and say that’s all well and good but how do I implement this in this real-world sticky, difficult situation. But, you know, again, thank you. This is great. We could probably go on for another hour on this topic, but I think this was really good in terms of just sort of laying out, look, this is what we mean, and these are the underpinnings that we are thinking about when we take it up a level, when we are talking about PKI systems and trust models and certificate standards and all of the things that we usually talk about. At the end of the day, all that stuff is there to serve what we discussed today.

Jason Soroko

Yeah. That’s exactly right, Tim. Never forget that none of this stuff works without an ecosystem. Just like when you went in to get a loan at the bank or a bank account, there was an ecosystem there. It’s no different in digital identity concepts as well.

Tim Callan

Alright. Well wonderful. Great talk and I’m glad we covered this one today, Jay, and as always, I enjoy talking to you. Thank you very much.

Jason Soroko

Thank you, Tim. Wait until we start getting into issues of trust and how trust is established. That’s a whole other topic.

Tim Callan

We need to do that one as well. We’ll save that but we’ll do that in a future podcast for sure. Thank you everybody. This is Root Causes.