AI Voice-to-Voice cloning: the good, the bad, the easy.

Discussion in 'Software' started by curtified, Mar 19, 2023.

  1. curtified

    curtified Rock Star

    Joined:
    Feb 3, 2015
    Messages:
    675
    Likes Received:
    453
    I have been diving into AI voice-to-voice lately. I got a trial of ReSpeecher and was quite surprised to see how a vocal take I have can be sung by 100 different male, female, old, young, modeled voices. (I will post examples later)

    I'm wondering if anyone has any cheaper or free options they know of. The current solutions are at a high price point or have been bought out by big record labels. I found one OK solution on GitHub, but I'm looking for others.

    ReSpeecher - 200 a month. Can train your own voices. Only a handful can do accurate pitch tracking.
    Supertone - Bought out by Korean K-pop conglomerate HYBE. No public version that I know of.
    Voice Cloning Toolbox - Free. Semi-hard install. Isn't as accurate as the consumer versions.


    ReSpeecher 1:


    ReSpeecher 2:


    Supertone 1:


    Supertone 2:


    Voice Cloning toolbox:
     
  3. roku

    roku Noisemaker

    Joined:
    Jan 9, 2015
    Messages:
    43
    Likes Received:
    6
    I've been looking into this a little bit lately. I'd like to start making my own "movie" samples and other such audio for use in my tracks.

    I had a friend turn me on to ElevenLabs. He trained it to sound like Terence McKenna and sure enough... it sounded just like him. They have a pretty inexpensive lower-tier subscription.

    https://beta.elevenlabs.io
     
  4. Moogerfooger

    Moogerfooger Audiosexual

    Joined:
    Jun 11, 2016
    Messages:
    1,447
    Likes Received:
    781
    Wait for Diff-SVC to go public w/the libraries again. That’s the best one out there in my opinion… Plenty of Ariana Grande and Mariah Carey YouTube examples to search for lol. Bamber Leany made a 1000ft tutorial a year ago, but you gotta dive deep to make good shit.

    For example, here’s Ariana singing Sabrina Carpenter’s “Nonsense”

     
    Last edited: Mar 19, 2023
  5. Legotron

    Legotron Audiosexual

    Joined:
    Apr 24, 2017
    Messages:
    2,201
    Likes Received:
    2,135
    Location:
    Hyperborea
    That ReSpeecher looks so tempting.. Sad that it is so ridiculously priced. Definitely gonna keep my eyes on Diff-SVC, thanks for the heads up @Moogerfooger
     
  6. curtified

    curtified Rock Star

    Joined:
    Feb 3, 2015
    Messages:
    675
    Likes Received:
    453
    Thanks for sharing!! They are good! But they are text-to-speech, not voice-to-voice. They are one of the best right now for training.
     
  7. curtified

    curtified Rock Star

    Joined:
    Feb 3, 2015
    Messages:
    675
    Likes Received:
    453
    Yesss!! This is what I like to see! Thanks!
     
  8. Dan Fuerth

    Dan Fuerth Kapellmeister

    Joined:
    Nov 2, 2017
    Messages:
    106
    Likes Received:
    50
  9. curtified

    curtified Rock Star

    Joined:
    Feb 3, 2015
    Messages:
    675
    Likes Received:
    453
    I hate how all of these are just papers and websites. Part of me feels like it's fake. But once they open the floodgates, I can't wait to try it out!
     
  10. Xupito

    Xupito Audiosexual

    Joined:
    Jan 21, 2012
    Messages:
    7,294
    Likes Received:
    4,030
    Location:
    Europe
    :rofl::rofl::rofl::hahaha::hahaha:
     
  11. aitken

    aitken Ultrasonic

    Joined:
    Apr 26, 2022
    Messages:
    72
    Likes Received:
    33
    How long till we have an R2R version of a 2025 VST plugin that turns any vocal audio into any voice, from Frank Sinatra, Elvis Presley or Maria Callas to Freddie Mercury, George Michael, whoever you like, and combines them all together if you want: have the tune of this but the accent of that... The day will come.
     
  12. JMOUTTON

    JMOUTTON Audiosexual

    Joined:
    Jan 10, 2016
    Messages:
    1,099
    Likes Received:
    909
    Location:
    Virginia
    Interesting.

    Milli Vanilli, future style. It will not be too much different from the pop songs of today, since there is so much pitch correction anyway. There is some humor in it, as singers dropped their bandmates to form solo groups saying "I am the voice, I am the talent, I can just record your take and replace you forever"... Now that same box of transistors that replaced the musicians is going to replace the singers.

    The problem is there is no accounting for taste; by the time you think you've got fashion figured out, the cool kids have moved on, and there are always going to be lifers.
     
  13. curtified

    curtified Rock Star

    Joined:
    Feb 3, 2015
    Messages:
    675
    Likes Received:
    453
    Saving this for the future, because I think you actually predicted it spot on.
     
  14. curtified

    curtified Rock Star

    Joined:
    Feb 3, 2015
    Messages:
    675
    Likes Received:
    453
    I think you nailed it. This will be the new autotune. It got frowned on at first but then created a sound of its own. But like Microsoft VALL-E, it's getting easier and easier to sound real.
     
  15. jordan77

    jordan77 Ultrasonic

    Joined:
    Dec 18, 2016
    Messages:
    70
    Likes Received:
    38
    Oh wow, with the media talking about this more and more, I'm really interested in what we'll have in the years (and months) to come.
     
  16. Moogerfooger

    Moogerfooger Audiosexual

    Joined:
    Jun 11, 2016
    Messages:
    1,447
    Likes Received:
    781
    Something in my gut tells me this AI revolution & cloning people's voices won't be anything like the sample wars. I'm almost positive lawyers are working 24/7 with the big record labels to front run the initial influx of hyper-realistic trained clones... Maybe artists will print their tracks with some sort of digital watermark to prove it's them? Maybe additional copyright laws need to be created in order to protect not just famous people, but possibly YOU? We all know the Amber Heard & Johnny Depp trial used audio & video footage as a major source of evidence. Imagine someone with an axe to grind deepfaking audio conversations with YOU, & you're stuck with zero proof it isn't you saying those words. Ask yourself, do you want to live in a world where an already narcissistic generation of youth will be able to do such a thing out of spite?

    I don't mean to derail this thread, as I'm also very interested in using AI as a tool for audio production. But lately I've come to a sort of realization that this whole thing is going to get out of hand fast. Not decades away but years & possibly months away. In the meantime I'm good using Synthesizer V for general help laying down demos. Diff-SVC still has their notebooks closed to the public, but by the time that opens back up, something better might already have been released... Sorry for the rant lol, Cheers y'all
     
  17. xorome

    xorome Audiosexual

    Joined:
    Sep 28, 2021
    Messages:
    1,176
    Likes Received:
    860
    "AI Abduction: Arizona mom Jennifer DeStefano claims that scammers used AI to clone her daughter Brie's voice, so they could demand a $1 million ransom."
    "Chai AI chatbot 'talked young dad-of-two into suicide', devastated wife claims."
    "AI deepfakes purporting to show Trump arrest take over Twitter."

    I hate that soon we won't have a single sense left we can trust.

    Person sitting quietly in your Zoom meeting? Might not be who they say they are. Can't trust your eyes.
    Person calling you saying they're your mom? Can't trust your ears.
    Did my friend train a model on our chat history or why do his texts seem slightly off? Better not trust your brain.
     
  18. phenomboy

    phenomboy Producer

    Joined:
    Aug 25, 2011
    Messages:
    575
    Likes Received:
    84
    Location:
    Nantes
    I'm using Dreamtonics' Synthesizer V and it's super good. I'm planning to train models on vocals from Splice so I can humanize my Synth V outputs.
     
  19. shake_puig

    shake_puig Producer

    Joined:
    Jun 15, 2011
    Messages:
    257
    Likes Received:
    75
    Synthesizer V is really starting to blur the line for me. It's really good and sounds much better than Vocaloid. Maybe it's the over-processing on modern pop vocals that makes it harder to detect, but the quality is getting close to "human"-like.
     
  20. phenomboy

    phenomboy Producer

    Joined:
    Aug 25, 2011
    Messages:
    575
    Likes Received:
    84
    Location:
    Nantes
    Definitely closer, and if you get the fine-tuning right with proper phonemes, timing and breaths, you can make it indistinguishable from a human. I mean, listen to this:

     
  21. jordan77

    jordan77 Ultrasonic

    Joined:
    Dec 18, 2016
    Messages:
    70
    Likes Received:
    38
    ^
    Just at 05:06, the words "if.. you" are very noticeable, but agreed that this software is huge :bow:
     