Root Causes 310: Another AI Episode
In this episode we continue to explore the capabilities of AI to replicate known people in deep fakes with AI-generated content.
- Original Broadcast Date: June 13, 2023
Episode Transcript
Lightly edited for flow and brevity.
-
Tim Callan
Hello and welcome to Root Causes, a podcast about digital identities and all things PKI. I’m Tim Callan, Chief of Customer Experience at Sectigo and joining me today is our Vice President of Product, Jason Soroko. How are you today, Jay?
Ok. So, that was not a bad try. Hey, Jay, how are you doing?
-
Jason Soroko
Hey there, Tim. I’m hoping you heard a little bit of that clip there.
-
Tim Callan
So, I don’t think we probably fooled anybody. That wasn’t really me but, you know, this is a topic you and I have been talking about and I think you were the mastermind behind this one, Jason. You sort of said, I wonder if we could get an AI to do our podcast for us?
-
Jason Soroko
I think everybody is asking that question. Let’s go back. Let’s talk about the discussion you and I had. And this is going back. This is going back. We have previous podcasts on artificial intelligence, ChatGPT and other things that, Tim, you are gonna help to inform us on.
-
Tim Callan
Real quick. We started that dialogue with our Episode 276 called ChatGPT and Identity Reputation. Definitely worth going and looking at. We also did our 279, ChatGPT Watermarking, and one more, um, that I’m not seeing, but we had at least one more on the ChatGPT topic that came I think after that.
-
Jason Soroko
That was on malware.
-
Tim Callan
Malware. Yes. Can ChatGPT Write Malware was another one of our episodes; it’s our Episode 285. Sorry about that. Can ChatGPT Write Malware. But then, of course, I’m also gonna take you all the way back to our Episode 198 which is Deep Voice Fakes, because in 198 we talked about this idea that something we’ve considered to be reliable, which is a voice or for that matter, a video, really today isn’t reliable anymore. And then, so the obvious question was what happens if you collide those two things with each other and, I’m gonna give you 98% of the credit for this because you did the heavy lifting. I just showed up and talked but I think we’ve got something that wouldn’t pass for the beginning of our podcast quite yet but honestly, it wasn’t a bad try considering that it was a first try and we actually invested very little in this.
-
Jason Soroko
That I think, Tim, is the main point. We used very, very reasonably low priced cloud-based software and this was our first attempt. We are showing our dirty hands on it.
-
Tim Callan
Took one run at it, didn’t go back and try again, didn’t figure out what we could do to smooth it out. Exactly. That was literally the first try.
-
Jason Soroko
You got it. And that did include a round of voice training. So there was, I forget how long it took, maybe 10, 15, 20 minutes where I know I was asked to speak a lot of different standardized text and I think you did that as well.
-
Tim Callan
They gave us like, it was like 50 sentences that you had to read.
So, it didn’t take that long. And of course one of the things when you listen to the clip is in some of it – the voice is pretty good. I think it’s the inflection that doesn’t go very well. I don’t remember what it was but I was asked to read these sentences and would say whatever I’d said, the bluebird is the fastest and, there is a big difference between saying the bluebird is the fastest and the bluebird is the fastest and the bluebird is the fastest. And I didn’t know how I was supposed to inflect these things and surely, those decisions affected what the AI did with the text we gave it.
-
Jason Soroko
Certainly, Tim. It certainly does. And let’s put it this way. Fifty sentences was enough to get you what you heard. And how many sentences have we spoken on our 300+ podcasts so far?
-
Tim Callan
Oh geez! Oh man! I mean just from this podcast, just from this podcast you could get 100 hours of audio. Let alone all kinds of other things that you and I do. We do interviews on TV and other people’s web programs and recorded webinars. I mean probably without a lot of effort you could individually get 100 hours of your voice or my voice.
-
Jason Soroko
So, if you were working with an AI that trained in a more sophisticated way and you were able to put context onto various inflections that we do naturally and you were to train the AI with 100 hours of voice then you are dealing with a situation where I think we mentioned this on our previous podcast but they had Anthony Bourdain. You know, words he had written apparently and they voiced it using an artificial voice on a documentary about him. And it was utterly convincing. It was just like he would speak.
-
Tim Callan
Absolutely. And again, so part of the point behind this was how trivially little went into this. Because the other thing of course is that paragraph, Jay. We didn’t mention that part. We didn’t write that paragraph either. That paragraph was also written by an AI.
-
Jason Soroko
In fact, Tim, I’ll back up and tell you exactly what I did one afternoon – and this is based on a conversation you and I had. So, full credit to everybody for coming up with cool ideas but the thing is, the idea was, the first idea was let’s let the AI – and in fact, it was ChatGPT 3.5 at the time – we asked it to write a script for a podcast. And when the first script came out, we wanted it to have a two-person podcast so we did inform it in the prompt that we wanted that. But then here's the interesting point, Tim. You got to see the prompt that I wrote that led to that introduction. So, that is natural language instruction that led to a wording that is pretty much identical to what you normally speak at the beginning of the podcast.
-
Tim Callan
There are subtle things we could change. Like I always say a PKI and security podcast and it just said a podcast. But you just go edit the prompt and you just put a sentence at the end that says introduce it as a PKI and security podcast. Do that and it’s just right. Like one of the things that you had in your prompt that I liked was you said – there’s a couple things. One is you said, Tim, will often address Jason as Jay.
That’s in there and it happens. Then another thing you said was something like Tim starts the podcast by saying, “How are you doing today, Jason.” And that’s in there as well. And so you just write it. You just write English and you just told it to do those things and it did.
-
Jason Soroko
I think a lot of people who are plugged into AI right now and play around with ChatGPT and Google Bard and all the others, I think a lot of people, a lot of people have been messing with it but a lot of people are still in beginner mode. Let’s be honest. And what I showed you with those prompts was kind of like Grade 1. Grade 1. Where you can’t be shy to really tell the AI what you want and once you realize you have full freedom to truly express what you want in natural language, that’s when you start seeing the real power of the AI and it gave us – what I asked for was a podcast on steganography. And, in fact, Tim, I have another clip coming up shortly where you asked me a question in the podcast – it’s so hilarious. How do we refer to this?
-
Tim Callan
“You” ask “me”.
-
Jason Soroko
Virtual you asked virtual me a question and then I answer it. My voice sounds even a little more robotic than yours in this next clip. We’ll evaluate that in a moment when we play it but what you are gonna hear was not written by us, not spoken by us at all. And this was first attempt, as you’ve said.
-
Tim Callan
The only guidance was steganography. That was it.
-
Jason Soroko
That’s it. That is all. So, I’ll tell you what? Let’s get rid of the suspense. I’ll play this next clip for you, Tim, and I’ll be right back.
-
Tim Callan
Alright. I’m listening.
-
[AI audio clip playing]
-
Tim Callan
Can you start by explaining what steganography is?
-
Jason Soroko
Sure, Tim. Steganography is a method of hiding information within another piece of information, like a picture or a text file. The idea is to keep the existence of the hidden message secret and only reveal it to someone who knows how to access it.
-
[end of AI audio clip]
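[Editor's illustration: the definition in the clip can be made concrete with a minimal sketch of one common approach, least-significant-bit hiding. This is not something discussed or used in the episode. The function names `hide` and `reveal`, the 2-byte length header, and the use of a plain byte string as the cover (standing in for image pixel data) are all assumptions made for the example.]

```python
def hide(cover: bytes, message: bytes) -> bytes:
    """Hide `message` in the least significant bit of each cover byte.

    Hypothetical layout: a 2-byte big-endian length header, then the message.
    """
    payload = len(message).to_bytes(2, "big") + message
    # Flatten the payload into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the cover byte, then set it to the payload bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)


def reveal(stego: bytes) -> bytes:
    """Recover a message written by `hide`."""

    def read_bytes(start: int, count: int) -> bytes:
        # Rebuild `count` bytes from the low bits beginning at index `start`.
        out = bytearray()
        for b in range(count):
            value = 0
            for i in range(8):
                value = (value << 1) | (stego[start + b * 8 + i] & 1)
            out.append(value)
        return bytes(out)

    length = int.from_bytes(read_bytes(0, 2), "big")
    return read_bytes(16, length)


cover = bytes(range(256))   # stand-in for real pixel data
secret = b"PKI"
stego = hide(cover, secret)
recovered = reveal(stego)
```

[Each cover byte changes by at most 1, which is why this kind of change is hard to notice in an image, while someone who knows the layout can read the message back out.]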
-
Tim Callan
Again, it has that computer inflection quality that we heard on the OK Computer album from Radiohead. A lot of people remember that. It goes all the way back to Steve Jobs standing on stage with the Macintosh and it starts talking. It has kind of that, a little bit flat and clipped quality. And that repetitive inflection quality. There’s more variety in a live human talking. But again, absolutely, with a little bit of love and maybe even some higher end software, but I think more than that, it’s just effort and elbow grease, people get around that problem.
-
Jason Soroko
They do get around that problem and what you are hearing there is a lot of interpolation by the AI because it only has 50 sentences of my speech that it was trained on. If it was trained on 100 hours’ worth from the podcast and more, I’m gonna say there are AIs out there right now that can make it indistinguishable from me. And you.
-
Tim Callan
Absolutely. And for any famous person, like how much audio is available in the world of Joe Biden or Keanu Reeves. Just a vast quantity. So, for those celebrities, you’ve got essentially an infinite pool. But even for a non-celebrity. Like you and I just talked about. An amateur podcaster is producing an awful lot of voice, an amateur TikToker is producing an awful lot of voice.
-
Jason Soroko
I would say there are tens of thousands of people who have produced podcasts, YouTube videos, content creators.
-
Tim Callan
An amateur YouTuber is producing an awful lot of voice.
-
Jason Soroko
Huge amounts of voice and look, 50 sentences was enough to do what you heard. The amount that the average content creator on YouTube or any podcast platform does is way more than enough for any sophisticated AI to be able to very beautifully reproduce their voice.
-
Tim Callan
And so, the other thing that people aren’t getting the full experience of and maybe sometime we’ll be able to accomplish this is the actual transcript on the steganography podcast that we got was great. I read it and I was like, man, I wish I had made this podcast. It was terrific. It was smart. It was informative, at a good pace. It was great.
-
Jason Soroko
We are sparing you, the audience, from listening to the whole thing because I actually did. Maybe let me add another point here, Tim. I did ask the AI to print our voices right across the entire written podcast script and it outputted that and so there is an audio file that I have that has our voices right across the entire script but you read the script just like I did and, Tim, it was great. It really, really was.
-
Tim Callan
It was very convincing. Like, if it had just shown up as a transcript of our podcast and I didn’t know it wasn’t, or apart from saying, gee, I don’t remember making that, I would have read it and thought, yep, ok. It was completely convincing.
-
Jason Soroko
I tell you what, I’m gonna take some of the credit for some of this and I’ll tell you why. It’s because some of it had to do with the way the prompt was written, Tim. If you remember.
Because there were some elements in the prompt where the original undoctored script that came out wasn’t as great and the reason is because it was very AI generic text in very kind of choppy sentences. And so what I asked was make a mix of long and short sentences and kind of make it at the University level. I forget exactly how I worded it but the prompt included those instructions and sure enough, the AI gave us exactly what we asked for.
-
Tim Callan
And this is something you see happens a lot and I think this makes great sense is using these AI tools at the prompt level - which isn’t the only way you can use AI tools obviously - but at the prompt level, it seems like you can, it’s very efficient to iterate and zero in.
You ask for something, you get your result back in seconds. You look at it, say eh, not quite what I wanted. You change your prompt. You hit it again. You get your result back in seconds. You say, closer, but you change it and you do it again and, you and I talked about, I guess it was maybe about a year ago, someone who had won a fine art contest at the State Fair with a completely AI-generated, prompt-generated piece, and this guy had iterated and iterated and iterated. Like this is an artist whose medium is prompt writing. And had worked really hard on these things and wrote many prompts and really worked on nuanced ways to get exactly what he was looking for and so, that is work and it is intellectual and it is a skill and it is something you can do well or you can do poorly and your point is exactly right. A lot of the reason this thing was so great was because of your good prompt writing. And at the same time, a lot of it is so great because AI just is kind of mind-blowingly cool. And it was both.
-
Jason Soroko
Yes. So, keep in mind, Tim, there’s a few things to keep in mind on what you just said, which is remember, this was first crack at a little bit of what I would call Grade 1 prompt writing. Compared to the real pros that are out there. And it turned out great. So imagine very sophisticated AI focused on voice and with a prompt writer who is Grade 2. Or Grade 10 or University level. So, being really, really pro.
-
Tim Callan
And people write books on these things. You could buy books on AI prompt writing – how-to manuals. I don’t know if they are good. I haven’t read one but maybe I should. Maybe I will.
-
Jason Soroko
I think everybody is gonna have a few books on those in the near future. This is, I think, worth saying – a lot of people think, oh, geez, AI is gonna take away my job. Well, it will take some jobs but I will tell you, if you are a good prompt writer and you learn how to interact with a computer in a natural language setting, my goodness, your effectiveness as a computer programmer, as a copywriter in a marketing department, it’s just another super powerful tool in your belt, Tim.
-
Tim Callan
Absolutely. And I know a bunch of people who are presently using AI tools, mostly ChatGPT today, to make themselves more efficient in their jobs and it seems to fall into two main camps.
Number one is think of it as first draft. You say to ChatGPT, you tell it what you want and it gets you started. And it gives you something. It’s a good starting spot and you say, great, you saved me a lot of work. Now I’m gonna take this and massage it and nuance it and take it from here.
The other thing I’ve heard a lot of people doing is using it as kind of a basic research assistant. Tell me about such and such. Give me 2,000 words on this topic and then you use that to quickly get up to speed on a topic, learn the things you need to learn more about, dig in and sort of get a basic framework for what you are doing. And in both of those ways, it seems like an obvious productivity enhancer that is not in any way a job taker awayer.
-
Jason Soroko
Oh no. Not even close. Look, lawyers, doctors, people who work in marketing departments, heck, you and I sometimes. Sometimes you just end up with a mental block and it’s like, hey, ChatGPT, what’s a way to frame this idea, and boom, there it is.
-
Tim Callan
Tell me about such and such. Absolutely.
-
Jason Soroko
And what I would say to anybody messing with the tools is be explicit in your prompt. Learn the prompt. Go off and buy one of those books. You’ll be shocked at what you are gonna learn about what the real pros do to make these AIs sing and dance. And that’s especially true, Tim, you heard now AI that has resulted in text, AI that has resulted in audio, but there is also AI that results in images and, there’s Dall-E. And in fact, very recently, I’ve been playing around quite a bit with Midjourney and I am just mind-boggled with what is ultimately – people are going to throw stones at me for saying this – it is plagiarism plus math equals amazing result.
-
Tim Callan
[laughing] That’s funny. But drill down one step on that, Jay. Click through that one. What do you mean by plagiarism plus math?
-
Jason Soroko
Because of this. I was messing around with a friend. We were making a joke over a text message and, she had some stresses going on in her life and I was just trying to make her laugh. And, sometimes you send an emoji, sometimes you send an animated gif and rather than doing that, some kind of stock standard meme, I said, you know what? This topic requires a funny picture. So, I went over to Midjourney and after about four or five prompts, I’m like, man, that is the most amazing image that kind of sums up the situation she is in and I sent it over to her. She was just floored. Like what’s that?
-
Tim Callan
That’s so smart. I never thought of that as a use of Midjourney is to make your own little homemade unique memes for your specific situation. But, of course. Why not?! What a great use of the tool. Absolutely.
-
Jason Soroko
Absolutely. And so to answer your question, there was an image of a human face within the image that was created and it was so like, oh, this comes from an image that’s on the internet somewhere. This person probably exists. Now obviously, the AI modified it somewhat – skin tone, maybe the features of the face were made a little more symmetric but all those things are just math on top of – it’s just a statistical model applied to existing images. So, that’s my drill down, Tim.
-
Tim Callan
Cause it all starts from somewhere real. I know at one point you were fooling around, I don’t remember if it was Midjourney or not but you were sending me pictures of famous celebrities and they’d be wearing a propeller cap or riding a surfboard.
-
Jason Soroko
That was Dall-E.
-
Tim Callan
Dall-E. And, obviously, that’s starting from the images of the famous people. And that’s starting from real images that are really out there that are available and then it takes those and modifies those in a consistent way with the real images.
-
Jason Soroko
Sure. You could ask, “Picture Tim Callan in the style of Botticelli.”
And it might make you end up looking like Venus floating up from the sea in exactly that sort of painting style. And that’s just statistics. It really is just statistics working in the background and so to me it is plagiarism plus math equals what you are getting here. And it’s pretty cool.
-
Tim Callan
It is pretty cool. So, it is a huge enabler. At the same time, this goes back to some of the points I think we are making when we go back to the deep voice fakes, which is there are certain things that we used to consider to be, I don’t know if the right word is canonical but there was a time where if I had heard a recording of my mother’s voice I said, oh my mother said that. And now, I can’t know that anymore.
-
Jason Soroko
No. You can’t at all.
-
Tim Callan
And it causes us to have to think differently about what we consider digital files to indicate.
-
Jason Soroko
Tim, let’s push that a little further. I mean if you are in the music industry – let’s say you are a music industry executive and you are paying these young acts big bucks to come up with music and it’s a pain in the butt because quite often, they’re off on a beach somewhere rather than writing their music and all of a sudden, you now have an AI that you can natural language prompt and out will come very well-crafted music that is unique, that would pass copyright laws. In fact, it’s mathematically even measured to be unique against a lot of other music and totally unique and sellable.
So, Tim, I have recordings of my mother’s voice as well that are canonical even in my own head and I’ll tell ya, not just a person’s voice but also music, words you read almost in any marketing right now, I gotta tell ya, it’s probably – there’s some digital hand in there somewhere, even if an artist – pick an artist of your choice right now, a big pop artist – people talked about things like lip syncing and that sort of thing, but I would say even the usage of tools that help to pitch shift your voice, it’s quite common to use that, but now you are gonna start to even have digitally generated beats, digitally generated rhythms, melodies, and even possibly sections of voice. So, that’s maybe the beauty of art in the sense that, heck, Vermeer, the Dutch painter, was using camera obscura way, way back in the day and people still to this day say, well, is that art? It’s more like paint by number. Well, no. It’s still art because there’s a lot of subjectivity to it. So, don’t blame the tools. It’s how open-hearted you are and open-minded you are about using these new tools in your art and in your creative pursuits. So, I’m not worried about it that much but just keep in mind, I think the digital touch in a lot of what we are gonna consume in the future is definitely gonna be there.
-
Tim Callan
For sure. And again, to touch on another thing that has been a theme in the small number of podcasts we’ve done on this but a very consistent theme is the pace at which this stuff is moving. And I think that’s one of the things that’s interesting is we go back in time a year ago, we all knew about AI but we didn’t think about it the way we think about it today.
-
Jason Soroko
Well, Tim, you and I are doing voiceovers of ourselves, like, you know? We are 300 podcasts in and we don’t even have to be here.
-
Tim Callan
Exactly. And so I just wonder what conversation are you and I gonna be having in six months or a year?
-
Jason Soroko
I hope, Tim, what we are gonna do is we are gonna redo this podcast with an AI that’s more sophisticated than the one that we used and the inflections are gonna be better.
-
Tim Callan
I’m ready to replace myself. I want to just hit go and I’ll just crank out a bunch of brilliant episodes that are much better than what you and I produce. That’s what I’m going for. What do you think?
-
Jason Soroko
I think we should be just like Daft Punk and wear these helmets and, in fact, the beauty of that is we can have two guys that look like us wearing the helmets and we are not even there.
-
Tim Callan
We can tour. It’s awesome. I love it.
-
Jason Soroko
There you go, Tim.
-
Tim Callan
But seriously, I mean we laugh but I like you. I agree with you. I wonder if you and I can get to the point where we can put up a podcast and no one will know it was AI created?
-
Jason Soroko
You know what? Let’s consider that a goal and we’ll see how long it takes. Is it six months? Is it nine months? Is it a year?
-
Tim Callan
And maybe we can progress. Maybe we can report on our progress? And come back. But then here’s the other point I’m gonna make on this and I kind of hit this already but you and I have these very, very, very full day jobs and we are squeezing this kind of stuff, this podcast, and also this AI podcast into tiny little bits of time here and there when we can make a few minutes. Imagine if this was our full time job?
-
Jason Soroko
This is the thing. I think there’s people right now coming right out of University – and even younger than that – people who are fully dedicated, who have studied it and do nothing but this, who are true pros. That’s why I called my little prompt writing exercise Grade 1 because it was better than just the simple prompts that everybody does but it was literally Grade 1 and right now there’s 100 layers of complexity above that.
-
Tim Callan
And there are people out there who are operating at the college educated level for sure. So, ok. We’re gonna stay on this. I guarantee you 100%, listeners, that this is not the last you are gonna hear of this as Jason and I slice out little bits of time and continue on this really cool interesting journey that I don’t know where it’s gonna wind up.
-
Jason Soroko
I can’t wait but thank you for indulging me, Tim, and thank you listeners for having a sense of humor about listening to that intro that was a digital version of Tim, which was just great.