Root Causes 314: AI-based Deepfakes in Real Crimes
We have spoken in previous episodes about the potential for deepfakes in real-world crimes. In this episode we discuss a variety of real-world attacks in which deepfakes have played a role. These include fake kidnapping, "sextortion," and a range of spear phishing attacks and social media scams.
- Original Broadcast Date: July 5, 2023
Episode Transcript
Lightly edited for flow and brevity.
-
Tim Callan
We've been talking a lot recently about deep fakes and AI, and their potential to be disruptive to how things work in our society, because you can't know what's a real piece of captured content and what's not, and the ways that this might present problems. And you and I were using words like might and may, and what we wanted to do in this episode was talk about real world examples of where this is really going on. We actually have a list of actual criminal activities, with sources, that are fundamentally driven by the ability to use AI-based deep fakes.
-
Jason Soroko
Yeah, Tim. Some of these are recent, some of them older. The point being that these technologies have been around for a long time, but the democratization of them is now making this more common, and it's definitely worth talking about, Tim, because it really comes down to the concept of identity, which is something that's central to this podcast.
-
Tim Callan
And we've talked in the past about how in general technology moves down this pyramid of user capability. It starts with extremely specific, bespoke engineering projects, and then it moves more and more into the mainstream. And we're seeing this here. What you're describing with democratization is exactly right. It's now at the point where it's in the hands of ordinary online criminals, who are using it in scams that in some ways are very similar to scams we're already used to. Some of them are kind of new and unique, but they're all using these tools that are readily available.
So okay. The first one - you found this article, Jay. This is the Evening Standard, June 15, 2023. The author is Andrew Williams and the headline reads, AI Clones Child's Voice in Kidnapping Scam. The gist of this is that this woman testified in front of Congress recently that in January of this year she received a phone call from what purported to be kidnappers of her teenager, and essentially they demanded $50,000 to return what she thought was her kidnapped child. And here's a quote. At one point she heard her daughter, and her daughter says, “Mom, these bad men have me. Help me. Help me.” And the wrinkle in all of this is that the child was not harmed in any way. The child was off skiing, was happy, thought nothing was wrong. This entire thing was a scam based on a deep fake of the girl's voice.
-
Jason Soroko
Yes, unfortunately, it works, doesn't it? When I was first looking at that article, there was a link to an interview with the mother. And one of the keys to understanding all this, for me, was that she had been asked, at any point did you have doubts that it was your daughter, and the mother's response was immediate. She said, I never doubted for a moment that this was my daughter speaking to me, crying out for help. And that led, of course, to the rest of the story, as tragic as it is. And I think, Tim, this is the crux of this for us with respect to identity and trust, because obviously these voice fakes are more than sufficient to make people act based on that trust. Preying on a mother like that - I can't think of anything much more cruel. But there it is. It's happening. It's a big topic that we really need to bring up.
-
Tim Callan
Yeah and if you think about the context of it too, like if I handed you two audio files of me talking and said one of these is deep fake and one of these is real, figure out which one is which. You could sit and scrutinize them and maybe you'd come up with a theory about which one is real. But if you just get a phone call, you're going about your day, and you get a phone call. And somebody says we have your child, and you hear your child saying Mom, help me, help me. You're not going to do that. You're going to go straight to the conclusion that any parent would go to, which is that this is really my child's voice.
-
Jason Soroko
You got it. Urgency - those of you who have studied social engineering know urgency is a major component of how it works, and also of how you can detect that there's a problem and maybe should have suspicions. It goes both ways. But in this case it's not somebody expecting social engineering within a corporate setting. It's a mother receiving this message from what purported to be her daughter. Urgency, in that case, worked to the attacker's benefit, and it's definitely a part of it.
-
Tim Callan
And breaking context. This is the other thing that we talk about a lot - I talk about a lot - which is if you're used to expecting something skeevy in an incoming email, then if they do it a different way, like they text you, maybe you'll fall for it, or they do it a different way, like they approach you on social media, maybe you'll fall for it. And when we talked about Tim's big phishing adventure, way back in the day, people who wouldn't necessarily give away their details if they received an inbound email were giving them away because they were getting an inbound LinkedIn message. And I think this is the same thing here. You take this and you put it in another context. If I would be suspicious of an email that said, I have kidnapped your loved one, I wouldn't necessarily be suspicious of a phone call with what sounded like my loved one's voice on it.
-
Jason Soroko
It takes it to a whole new level of trust. We've now spent a lot of time with technologies that we've been trained on, we've been taught about, we've had experience with. You know, the Nigerian scam. Even people who don't work in a corporate environment have seen things like the Nigerian scam, and people can be hip to it. But in this case, I think people are not yet hip to the idea that you're going to be hearing your own family's voice and it's not going to be real. I think a lot of people, Tim, if they see on social media a celebrity or the President say something that's completely out of character, they might go, oh, I think I've heard of that. That's called a deep fake. People might not be quite as fooled because the urgency isn't there for them. It's just like, oh, look at the crazy thing that this person said. Well, I don't think that's real. I think people have started to catch on to some of that, but a family member in trouble where you have to react in the moment - that's a whole different deal.
-
Tim Callan
And I think maybe another part of this is, like you said, with a celebrity or a politician or something, you start to think, well, a lot of resources could go into this. Maybe there are paid actors who are imitating them. Maybe there's some kind of real fancy computer science, and you imagine somebody sitting at a big monitor with lots of wavy sound lines, adjusting dials and stuff. But if you start to say, well, nobody could do this with me, or my loved one, or my child, because we're not famous - to your point, the bar to get over this has gotten so low, and you and I just spoke in a recent episode about how lots of people have lots of audio out there. If you're YouTubing or TikToking, you might be producing a lot of audio, and that's the starting point. That's the audio that people use.
-
Jason Soroko
And, Tim, I think that in this case, the way that you read out that article shows that the amount of audio that needed to be generated to socially engineer the mother was very minimal. It's not like what we tried to do, which was to have a synthesized voice speak an entire podcast. That's quite complicated and there's a lot to it. In other words, they probably did not need a lot of the daughter's voice to be able to - -
-
Tim Callan
Good point. They needed like 10 words. They needed a credible 10-word clip that anyone was going to believe. And maybe they gave it a lot of tries. Maybe they didn't like one. They rearranged the words. They tried another one. Maybe they went until they got something that was good. They have the chance to do that. They have the opportunity to do that. And that's another thing to think about with deep fakes - it's not necessarily a real-time, real-world kind of conversation. Someone can sit and craft this, take as much time as they need, and when they're ready, they've just got their recording or their image or their video, and they just plain put it out there.
-
Jason Soroko
Exactly. With the advent of social media, even for people who are not YouTube creators or podcasters, there's definitely more than enough public exposure of our voices to the world. Tim, I've been talking for years and years about things like the biometrics of eyes and fingerprints. They're definitely not secrets. All you need is a somewhat high resolution image of someone's face and that's more than enough information to capture their eyes and even their fingerprints. And voice might be one of the easiest ones to capture.
-
Tim Callan
Well especially since we put it out there so much. Like we're not taking a lot of photographs of our fingertips but we're recording our voices a lot.
-
Jason Soroko
We are recording our voices a lot, and increasingly so, and making them available. And again, it's not just celebrities and politicians and podcasters. Anybody now who is putting things out there publicly is at risk. And I think that pretty much includes just about everybody. I'm not saying anybody who isn't is a Luddite. I am saying, though, that it's such a large proportion of the population now, especially the younger population, that goodness, you're out there, whether you know it or not.
-
Tim Callan
Yeah, and also, with any of the schemes we're talking about today, which we'll get into a little more, it's not that you have to target one specific individual. What you have to do is find an individual who is targetable. So you don't need 100% of the populace to have this material available. All you need is a victim who has this material available, and that makes the bar for this incredibly low as well.
-
Jason Soroko
You got it, Tim. It wasn't this specific girl, this specific mother that they were going after, this girl obviously happened to have a voice to be able to train upon and a mother to call. That's really all that needed to happen. And they found it.
-
Tim Callan
Exactly. And there may have been many, many thousands of other potential victims that they could have found if they didn't find this one, because they might have a very big pool to fish in.
So number two. This is a June 5, 2023, public service announcement from the FBI, and I'll give you the headline. Again, you can search for it yourself, but the headline reads, Malicious Actors Manipulating Photos and Videos to Create Explicit Content and Sextortion Schemes. The gist here is a warning from the FBI that people are taking available public content, which is all very G-rated, manipulating it into deep fakes that are not, and using those deep fakes to do things like extort money from somebody who doesn't want that content displayed, or to harass people, or various other things to the detriment of the person who gets deep faked.
-
Jason Soroko
Some things get to the point where it's like, do we even have to talk about it? But I think we do. And in this case, Tim, let's talk about, again, the democratization of technology and the progression that we've seen of this particular problem. It used to be you had to have some Photoshop skills and some software, and it looked a bit crude, perhaps. It still had its purposes - its negative purposes - and probably hurt people even back in the day. But now you have the ability to blend, the ability to not have to do manual, complicated graphic design work, the democratization not only to do it, but to do it at a quality level where, once again, you could be fooled, or it's harmful enough to the victim that it just makes me shrink away from it. In reality, here's just another example of these technologies coming along and hurting people, and I think you're going to hear more about this, unfortunately.
-
Tim Callan
And what was interesting also here is if we go back to the headline, manipulating photos and videos, so I think we're all kind of confident in our knowledge of the idea that someone could take Photoshop, and make all kinds of crazy stuff look real. But this warning says that the same thing is happening with video, which again, takes things to a whole new level in terms of the complexity of the problem and this is the kind of thing that AI makes possible.
-
Jason Soroko
Tim, you combine video with voice, and we were just talking about voices. You combine video and voice, and I think you're dead on in saying people are not expecting that level of deep fake of themselves at this point in time and yet we are now there.
-
Tim Callan
I mean, we've got an FBI announcement. They're not talking about a thing that isn't happening, they're talking about a thing that they know is. So the third one on the list - I want to reference a report from McAfee that came out in May of this year, and the title reads Beware The Artificial Imposter, A McAfee Cybersecurity Artificial Intelligence Report. I don't know that you need to look at this report in particular because, when you really look into it, mostly what it's about is survey results from when they surveyed ordinary consumers about what they think of deep fakes. So, you know, 67% of people say they don't think they would know the difference between a deep fake voice and a real voice. Well, I don't know that those people know enough to answer that question. I'm not sure that's particularly relevant. But what it does do is bring us around to the idea of all of the traditional kinds of scams that we're all used to through email, or subsequently things like social media or texting, now being applied to what appears to be some kind of audio message or video message. It could be a voicemail or a recorded Zoom message or something along those lines.
So let's just run through the kinds of scams we're talking about. We're talking about business email compromise. In the past it would have been, I get an email from what appears to be my CFO's personal email account saying, we need you to wire this amount of money to this bank account by Monday morning or the lights are gonna go out. And the helpful mid-level finance employee does it, trying to be helpful, and it turns out that those $5,000 weren't really going to the landlord. They were going to a criminal. Well, we all get trained not to fall for that, but now all of a sudden, if I get a voicemail in my mailbox that sounds like my CFO's voice saying the exact same thing, maybe I'll fall for that. So it's taking these things into the new context.
Or another example would be what they used to call the Western Union attack way back in the day, which was, I get an email from my friend, or what seems to be from my friend, saying, hey, I'm traveling abroad. I just got robbed. I don't have any money or anything. I don't know how to get home. If you can wire $500 to this Western Union office, I'll pick it up and I'll pay you back when I get home. And you want to help your friend out and you go, oh, you bet. He's good for it. She's good for it. They're not going to steal $500 from me. Well, they're not the ones who are stealing $500. It's someone else. That one, you can say, okay, we all learned not to fall for that. We don't fall for the email. We don't fall for the text. But what if it seems to be a voice message from what sounds like my friend?
Or another example of a very similar thing would be the good old-fashioned gift card scam, where again something that seems to come from my boss or someone high up in my company says, look, we've got some really disgruntled customers and I want to make it right by them. Can you just do me a favor, buy a $300 gift card on your favorite online shopping store, send it to this account, and put it through expenses. And we know - we've all been educated and trained - don't fall for that if you get that text. Don't fall for that if you get that email. But once again, if I get what appears to be that voicemail, do I fall for it? And this is now happening. This is something you're seeing, where these venerable, well understood attacks that have been used by criminals, in some cases for decades, are moving into a new medium because of the new technology, and because it's breaking the context, they are getting new victims.
-
Jason Soroko
That's a really good point, Tim. Everything you've ever learned - hey, if somebody starts asking you for gift cards and is in a real hurry, you should start to be suspicious - applies even if it's the person's voice in a voicemail or speaking to you directly on the phone. I think all the old cans of worms that we've mostly learned from can get reopened again with voice deep fakes. There's no question.
-
Tim Callan
Yep. Yep. And a couple principles of security. One is that all the old attacks are still potential attacks if they're not guarded against. And this is a good example of that. The fact that people have been doing something for 20 years doesn't matter if it's effective today. And then the other principle that's attached to that is that social engineering will never die. Social engineering is always a profitable attack vector, and we see that going on in this story as well.
-
Jason Soroko
Tim, I wonder how long it is before - and maybe it's already happened - you start to get Zoom calls or MS Teams calls that include video and voice, a quick message from a superior who you know is busy and doesn't have a lot of time, and they give you a quick directive to do X, Y and Zed, and you're like, oh, I better go off and do it.
-
Tim Callan
Absolutely. Something shaky. You kind of see their face and says, hey, this is whoever. I'm in the airport. Sorry about this. I gotta get on my plane right now. I need you to do the following thing for me. I'm just gonna say it, grab a pen, write this down. Oh, they're calling my plane. I gotta go. Bye. I'm gonna be in the air for hours and hours. I'll call you in 10 hours. I hope you can have it done by then. Click. I could see that absolutely working.
-
Jason Soroko
And it even discourages you psychologically from double checking with the person, because you don't want to disturb them. I can't reach them - they must be in the air. Absolutely. So it's got every element to fool you and to get you to feel the urgency of the action. And man, social engineering - these are going to be some powerful tools the bad guys have against us now, and we've got to be aware of them.
-
Tim Callan
So Jay, I'm gonna make a prediction. And this isn't the first time you and I have made this prediction, and I don't think it'll be terribly controversial, but nonetheless, I'm going to predict this is the new normal, and we all need to learn that what in the past we would have accepted as canonical, produced, evidentiary content is now potentially entirely manufactured.
-
Jason Soroko
100%. Folks, this is what this podcast is about, because we're in the business of trust and identity. And yes, we have competitors, and yes, we have colleagues and partners, but we want to speak to everybody about this and make you aware. Tim and I get to see it because this is the world we live in. We get to experiment with it.
-
Tim Callan
This is our job.
-
Jason Soroko
It's part of our job. But with what Tim and I are seeing right now, we're pulling the chain on the horn, because we need to alert people that the game has changed here.
-
Tim Callan
And part of the point of this podcast when you and I talked about wanting to do this topic today, was to make it clear that we're not in the world of deep science, we're in the world of real world attacks that actually happen against individuals to their detriment.
-
Jason Soroko
True. This is not the quantum apocalypse level of scientific work that we often talk about on this podcast. This is democratized to the point where Tim and I can pull off this stuff without a lot of effort. That's why we did the podcast we did recently - to show that it is democratized to the point where we can do it with minimal effort.
-
Tim Callan
And that was so little effort. Like, imagine, imagine what that would have looked like if we had the time, which we just didn't, to try to do a better job of it because I think we probably could have. And one day, I think we probably will.
-
Jason Soroko
Tim, I bet you there's folks in Hollywood right now who really do know how to do it, and really do have the right tools who could make you doubt yourself. Like, my goodness, what could be produced right now.
-
Tim Callan
So right. So, a few people have shared some examples with me on this. One of them, of course, is in one of the recent Star Wars movies, there were scenes with Grand Moff Tarkin, who looked exactly like the Grand Moff Tarkin back from the first Star Wars. But of course, that actor, if he's alive, looks nothing like that today. And that was all AI-based. And we all said, ooh, wow, that's really impressive. But again, you imagine this going on on a big giant supercomputer in a big studio with professionals who do this all the time. Now we're seeing this moving into a much more generic general kind of set of circumstances and context.
-
Jason Soroko
I believe James Earl Jones talked about a contract he might have - and maybe I'm speaking out of turn, because I don't know it - but I think his contract says they can use his voice, generated by AI, into the future.
-
Tim Callan
Okay. Sure. Why not? Absolutely.
And then the other example I heard of was, apparently, somebody was doing a movie about the life of Anthony Bourdain and they wanted him narrating a bunch of his own writings, because he had a lot of writings. He wrote a lot of books, and there are a lot of things that are Anthony Bourdain's words but weren't captured in his voice. And so - I believe you and I talked about this in the past, Jay - someone used AI to allow Anthony Bourdain to read his own words in his own movie about himself, even though he wasn't alive to do it.
Now, those are good examples. James Earl Jones being able to use his voice even after he's not able to use it. Everybody being able to enjoy a character from a Star Wars movie that they all remember fondly. Everybody being able to hear Anthony Bourdain saying his own words in his own voice. Those are positive examples, and it goes back to the other theme that you and I touch on a lot, which is that when we use our advances in computers to make things better - which is why we do them - they do make things better, and a lot of things get better. And in the process of those things getting better, vulnerabilities and exploits open up. It winds up being both, and it's always both.
-
Jason Soroko
Tim, exactly. I'll let you in on a little something. The tool that we used for our recent podcasts where we synthesized our own voices - the real purpose for it is post-production. When you're trying to mix a podcast, for example, it gives you the ability to add a few words or even just a single word to - -
-
Tim Callan
To fix something. I flubbed it and I want to just go back and have it be a little smoother.
-
Jason Soroko
That's the legitimate intention for why you'd want to train an AI on your voice and then have it generate something. So there are positive, legitimate purposes, and people go, hey, that's pretty cool, thank you for giving me that capability. On the other hand, just like fire, it can also burn you if it's used for bad.
-
Tim Callan
Exactly. So anyway, I'm glad we did this, because you and I have been talking about a lot of what-ifs - you can do this, you can do that - but I like us talking not so much about what you can do as what people are doing.
-
Jason Soroko
People are. People are, and it's real. And I'll tell you, I invite you to take a look at that Evening Standard article about that mother. It's heart wrenching. And that should really make you think - hopefully it's not an eventuality that all of us get attacked that way - because, and I know, Tim, probably you as well, I get a lot of fake phone calls all day long. It's terrible.
-
Tim Callan
I get fake phone calls all day long.
-
Jason Soroko
And if it raises up to this level of game by the bad guys, oh my god. Just beware, folks. That's what it's about. Beware.
-
Tim Callan
Beware. And maybe there needs to be some deep thinking at a bigger level about what we all can do procedurally, societally, to mitigate or fight back against this problem. I don't know the answer. That's probably something you and I will continue to discuss.
-
Jason Soroko
Yes, it's going to be a topic you're going to hear more and more about from us, for sure.