AI Voice-to-Voice cloning: the good, the bad, the easy.

curtified · Mar 19, 2023

I have been diving into AI voice-to-voice lately. I got a trial of Respeecher and was quite surprised to see how a vocal take of mine can be re-sung by 100 different modeled voices: male, female, old, young. (I will post examples later.)

I'm wondering if anyone knows of any cheaper or free options. The current solutions are either at a high price point or have been bought out by big record labels. I found one OK solution on GitHub, but I'm looking for others.

ReSpeecher - 200 a month. Can train your own voices. Only a handful can do accurate pitch tracking.
Supertone - Bought out by the Korean K-pop conglomerate HYBE. No public version that I know of.
Voice Cloning Toolbox - Free. Semi-hard install. Isn't as accurate as the consumer versions.

ReSpeecher 1:

ReSpeecher 2:

Supertone 1:

Supertone 2:

Voice Cloning toolbox:

roku · Mar 19, 2023

i've been looking into this a little bit lately. i'd like to start making my own "movie" samples and other such audio for use in my tracks.

a friend turned me on to elevenlabs. he trained it to sound like terence mckenna and sure enough... it sounded just like him. they have a pretty inexpensive lower-tier subscription.

https://beta.elevenlabs.io

Moogerfooger · Mar 19, 2023

Wait for Diff-SVC to go public w/ the libraries again. That’s the best one out there in my opinion… Plenty of Ariana Grande and Mariah Carey YouTube examples to search for lol. Bamber Leany made a 1000ft tutorial a year ago, but you gotta dive deep to make good shit.

For example, here’s Ariana singing Sabrina Carpenter’s “Nonsense”

Last edited: Mar 19, 2023

Legotron · Mar 19, 2023

That Respeecher looks so tempting... Sad that it is so ridiculously priced. Definitely gonna keep my eyes on Diff-SVC, thanks for the heads up @Moogerfooger

curtified · Mar 20, 2023

roku said: ↑

i've been looking into this a little bit lately. i'd like to start making my own "movie" samples and other such audio for use in my tracks.

a friend turned me on to elevenlabs. he trained it to sound like terence mckenna and sure enough... it sounded just like him. they have a pretty inexpensive lower-tier subscription.

https://beta.elevenlabs.io

Thanks for sharing!! They are good! But they are text-to-speech, not voice-to-voice. They are one of the best right now for training.

curtified · Mar 20, 2023

Moogerfooger said: ↑

Wait for Diff-SVC to go public w/ the libraries again. That’s the best one out there in my opinion… Plenty of Ariana Grande and Mariah Carey YouTube examples to search for lol. Bamber Leany made a 1000ft tutorial a year ago, but you gotta dive deep to make good shit.

For example, here’s Ariana singing Sabrina Carpenter’s “Nonsense”

Yesss!! This is what I like to see! Thanks!

Dan Fuerth · Mar 20, 2023

Imagine the hilarious songs you can make with this

https://www.pcmag.com/news/microsofts-ai-program-can-clone-your-voice-from-a-3-second-audio-clip

Here is the GitHub page (note: it's not available yet)

https://valle-demo.github.io/

This is going to be fun.....

I am thinking of

Bill Clinton's:

"Under the Desk"
"Smells Like Intern Spirit"
"If I Had That Dress"

Oh man the possibilities are endless.

Last edited: Mar 20, 2023

curtified · Mar 20, 2023

Dan Fuerth said: ↑

Imagine the hilarious songs you can make with this

https://www.pcmag.com/news/microsofts-ai-program-can-clone-your-voice-from-a-3-second-audio-clip

Here is the GitHub page (note: it's not available yet)

https://valle-demo.github.io/

This is going to be fun.....

I am thinking of

Bill Clinton's:

"Under the Desk"
"Smells Like Intern Spirit"
"If I Had That Dress"

Oh man the possibilities are endless.

I hate how all of these are just papers and websites. Part of me feels like it's fake. But once they open the floodgates I can't wait to try it out!

Xupito · Mar 20, 2023

Dan Fuerth said: ↑

"Smells Like Intern Spirit"

aitken · Mar 20, 2023

How long till we have an R2R version of a 2025 VST plugin that turns any vocal audio into any voice, from Frank Sinatra, Elvis Presley or Maria Callas to Freddie Mercury, George Michael, whoever you like, and combines them all together if you want: the tune of this but the accent of that... the day will come

JMOUTTON · Mar 20, 2023

Interesting.

Milli Vanilli, future style. It won't be too different from the pop songs of today, since there is so much pitch correction anyway. There is some humor in it: singers dropped their bandmates to go solo, saying "I am the voice, I am the talent, I can just record your take and replace you forever"... and now that same box of transistors that replaced the musicians is going to replace the singers.

The problem is there is no accounting for taste: by the time you think you've got fashion figured out the cool kids have moved on, and there are always going to be lifers.

curtified · Mar 21, 2023

aitken said: ↑

How long till we have an R2R version of a 2025 VST plugin that turns any vocal audio into any voice, from Frank Sinatra, Elvis Presley or Maria Callas to Freddie Mercury, George Michael, whoever you like, and combines them all together if you want: the tune of this but the accent of that... the day will come

Saving this for the future, because I think you actually predicted it spot on.

curtified · Mar 21, 2023

JMOUTTON said: ↑

Interesting.

Milli Vanilli, future style. It won't be too different from the pop songs of today, since there is so much pitch correction anyway. There is some humor in it: singers dropped their bandmates to go solo, saying "I am the voice, I am the talent, I can just record your take and replace you forever"... and now that same box of transistors that replaced the musicians is going to replace the singers.

The problem is there is no accounting for taste: by the time you think you've got fashion figured out the cool kids have moved on, and there are always going to be lifers.

I think you nailed it. This will be the new autotune. It got frowned on at first but then created a sound of its own. But as with Microsoft's VALL-E, it's getting easier and easier to sound real.

jordan77 · Apr 17, 2023

Oh wow, with the media talking about this more and more, I'm really interested in what we'll have in the years (and months) to come.

Moogerfooger · Apr 17, 2023

Something in my gut tells me this AI revolution & cloning people's voices won't be anything like the sample wars. I'm almost positive lawyers are working 24/7 with the big record labels to front-run the initial influx of hyper-realistic trained clones... Maybe artists will print their tracks with some sort of digital watermark to prove it's them? Maybe additional copyright laws need to be created in order to protect not just famous people, but possibly YOU? We all know the Amber Heard & Johnny Depp trial used audio & video footage as a major source of evidence. Imagine someone with an axe to grind deepfaking audio conversations with YOU, & you're stuck with zero proof it isn't you saying those words. Ask yourself, do you want to live in a world where an already narcissistic generation of youth will be able to do such a thing out of spite?

I don't mean to derail this thread, as I'm also very interested in using AI as a tool for audio production. But lately I've come to a sort of realization that this whole thing is going to get out of hand fast. Not decades away but years & possibly months away. In the meantime I'm good using Synthesizer V for general help laying down demos. Diff-SVC still has its notebooks closed to the public, but by the time that opens back up something better might already have been released... Sorry for the rant lol. Cheers y'all

xorome · Apr 17, 2023

"AI Abduction: Arizona mom Jennifer DeStefano claims that scammers used AI to clone her daughter Brie's voice, so they could demand a $1 million ransom."
"Chai AI chatbot 'talked young dad-of-two into suicide', devastated wife claims."
"AI deepfakes purporting to show Trump arrest take over Twitter."

I hate that soon we won't have a single sense left we can trust.

Person sitting quietly in your Zoom meeting? Might not be who they say they are. Can't trust your eyes.
Person calling you saying they're your mom? Can't trust your ears.
Did my friend train a model on our chat history or why do his texts seem slightly off? Better not trust your brain.

phenomboy · Apr 17, 2023

I'm using Dreamtonics' Synthesizer V and it's super good. I'm planning to train models on vocals from Splice so I can humanize my Synth V outputs

shake_puig · Apr 17, 2023

phenomboy said: ↑

I'm using Dreamtonics' Synthesizer V and it's super good. I'm planning to train models on vocals from Splice so I can humanize my Synth V outputs

Synthesizer V is really starting to blur the line for me. It's really good and sounds much better than Vocaloid. Maybe it's the over-processing on modern pop vocals that makes it harder to detect, but the quality is getting close to human-like.

phenomboy · Apr 17, 2023

shake_puig said: ↑

Synthesizer V is really starting to blur the line for me. It's really good and sounds much better than Vocaloid. Maybe it's the over-processing on modern pop vocals that makes it harder to detect, but the quality is getting close to human-like.

Definitely closer, and if you get the fine-tuning right with proper phonemes, timing and breaths you can make it indistinguishable from a human. I mean, listen to this:

jordan77 · Apr 17, 2023

^
Just at 05:06, the words "if.. you" are very noticeable, but agreed that this software is huge
