Deepfake, Text-to-Speech, and AI Voice Changers

We were looking for a way to add a little anonymity to our videos so co-workers wouldn't find our alter egos doing our crazy passion Maker projects, so we did some research on deepfakes. It basically shattered our view of the movie industry. Any person can be de-aged or made to look more fit in minutes - even in real time! This blog covers the free offerings of this very cool tech.

-Sean J. Miller 1/7/2024

So, yes, our image of the movie industry was absolutely shattered.  After wading through GitHub, we came across a critical few repositories that could easily de-age Indiana Jones in minutes for free. We are convinced Tom Cruise in Maverick was taken back at least 10 years in a lot of shots. One can see how this could easily be leveraged to give someone six-pack abs or any hairstyle.  I even transformed into the Joker in real time - this is NOT makeup, but rather a deepfaked face from the Joker:

 

What is really wild is that there aren't more YouTube channels based on cranking out funny fakes.  Most of the GitHub repositories themselves have under 500 stars, too.  It blows me away how little coder interest is out there.  I thought it would be hundreds of thousands. They get little collaboration, too.  All of this is a great opportunity to get on board as a programmer, which I jumped on. At the least, there are always spelling errors and batch file install problems that could use a hand, and I'm happy to fix them in exchange for the great work these folks give away for free as open source.

So let's look at our favorite repositories that will meet our Maker project needs:

DeepFaceLab

This is the most popular.  It is the one fueling the deepfake YouTube channels. There are people who have already captured the faces of popular figures such as politicians and actors.  You can, in real time, use a personal video or your webcam to map their faces over yours.  That's right - real time!  You can even use your face as the feed to alter images of them.  So, they become your puppet.  It's frightening to see, and it's surprising there aren't fake videos of Biden and Putin all over X.com.

DeepFace

Applio RVC

This is for doing some very cool voice work.  You can take any voice, with just 10 seconds or so of audio, and create a model to apply over your own audio track.  You just need to do your best imitation of the person so the timing of the speech matches how they would say it.  The model handles their accent and voice quality remarkably well.  A similar technique was reportedly applied to James Earl Jones in the Obi-Wan Kenobi series - making him, at 90+ years old, sound exactly as he did circa 1980.

 

TTS Generation WebUI

This one is great for generating text-to-speech.  So, if you'd like narration that you can control, it does a great job.  However, it can also split a song's vocals, drums, guitar, and other parts into separate tracks.  You can make your own karaoke songs or isolate a ripping guitar solo to slow down and practice.  We used this tech to make Arthur Morgan of Red Dead Redemption sing "A Boy Named Sue".

 

Wunjo

This is my go-to when my wife asks to see if she'd look good in a new haircut.  It is super easy and fast.

It takes the best of deepfake and AI voice changer tech and applies basic model assumptions behind the scenes.  In turn, the user doesn't have to know the workflows normally required for quality productions.  It works best for videos where the face is talking directly to the camera, such as the how-to videos that we make.  In four clicks, you can transform a video and have a near-perfect deepfake.  You can then go after the AI voice changer separately.  The source face can simply be a video shot with an iPhone moving all around the head, so the model is built from all angles.

The video above was made by taking an iPhone around 19-year-old, 160 lb Connor and deepfaking him onto my 235 lb body.  Basically, we aged him up to see what he'll look like after 30 years of pizza eating.

What is also cool is that it will take a headshot and apply a voice-over - even if the pic has a closed mouth.  It animates the head to make it look like a video.  Crazy awesome!

  

 

The deepface tech behind these repositories is based on the Python package called DeepFace. DeepFace is a lightweight face recognition and facial attribute analysis framework developed by a Turkish data scientist named Sefik Ilkin Serengil. The package wraps state-of-the-art models such as VGG-Face, Google FaceNet, OpenFace, Facebook DeepFace, DeepID, ArcFace, Dlib, and SFace. You can install the package via pip or conda. It provides functionality such as facial recognition, facial attribute analysis (age, gender, emotion, and race), and facial embeddings.
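For the curious, here is a minimal Python sketch of those three features. The image paths and the describe_faces and cosine_distance helpers are our own placeholders, not part of the package. The return shapes noted in the comments are what recent releases of the package give back; older versions may differ, so treat this as a starting point rather than a recipe.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two embedding vectors (0.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def describe_faces(img1_path, img2_path, model_name="ArcFace"):
    """Hypothetical helper: runs recognition, attribute analysis, and embeddings."""
    # Imported here so cosine_distance works even without the package installed.
    from deepface import DeepFace  # pip install deepface

    # Facial recognition: do the two photos show the same person?
    verification = DeepFace.verify(
        img1_path=img1_path, img2_path=img2_path, model_name=model_name
    )

    # Facial attribute analysis (assumed: one result per detected face).
    attributes = DeepFace.analyze(
        img_path=img1_path, actions=["age", "gender", "emotion", "race"]
    )

    # Facial embeddings (assumed: a list of dicts with an "embedding" key).
    emb1 = DeepFace.represent(img_path=img1_path, model_name=model_name)[0]["embedding"]
    emb2 = DeepFace.represent(img_path=img2_path, model_name=model_name)[0]["embedding"]

    return {
        "same_person": verification["verified"],
        "attributes": attributes,
        "embedding_distance": cosine_distance(emb1, emb2),
    }


# Example call with placeholder file names:
# report = describe_faces("me.jpg", "joker.jpg")
```

The first run downloads the model weights, so expect it to take a while before anything prints.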

AI has definitely brought a new age to video and sound, but also to how humans research and create. As we explore this and ChatGPT, we realize our human capability is already augmented by at least 25% in the current state - and probably more, if we just had the ideas and time to implement all that we could achieve.  We can't wait to see what the next several years will bring.

 

-Sean J. Miller
