UVR5 the Best AI stem separation algo?

Discussion in 'Software' started by curtified, Feb 27, 2023.

  1. Legotron

    Legotron Audiosexual

    Joined:
    Apr 24, 2017
    Messages:
    1,923
    Likes Received:
    1,853
    Location:
    Hyperborea
    Do you use a specific sample for separation tests? I mean, is there some standard song/track that is difficult to separate? Also, any tips on drums-only separation are welcome.
     
  2. jarredou

    jarredou Producer

    Joined:
    Jan 25, 2017
    Messages:
    157
    Likes Received:
    97
    One song is not enough. The MVSEP people have made two datasets to evaluate the different algos: https://mvsep.com/quality_checker/
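
To illustrate why scoring over many tracks is more telling than one song, here is a minimal sketch that averages a quality score across several (reference, estimate) pairs. It assumes the score is signal-to-distortion ratio (SDR), a common metric for source separation; whether the linked checker computes it exactly this way is an assumption. The function names `sdr` and `mean_sdr` are illustrative, and plain Python lists stand in for audio buffers.

```python
import math


def sdr(reference, estimate, eps=1e-9):
    """Signal-to-distortion ratio in dB for one stem (higher is better).

    `reference` is the true stem, `estimate` is the separated stem;
    both are equal-length sequences of float samples.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_power = sum(r * r for r in reference)
    error_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps avoids division by zero for a perfect estimate or a silent stem.
    return 10.0 * math.log10((signal_power + eps) / (error_power + eps))


def mean_sdr(pairs):
    """Average SDR over many (reference, estimate) pairs, one per track."""
    scores = [sdr(ref, est) for ref, est in pairs]
    if not scores:
        raise ValueError("need at least one track to evaluate")
    return sum(scores) / len(scores)
```

A single track can flatter or punish an algorithm by accident, so the averaged score over a whole dataset is the more useful number for comparing models.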
     
  3. Rodger

    Rodger Rock Star

    Joined:
    Oct 29, 2022
    Messages:
    138
    Likes Received:
    403
  4. curtified

    curtified Rock Star

    Joined:
    Feb 3, 2015
    Messages:
    627
    Likes Received:
    426
    Oh, give UVR5 a try; you can mix and match algos. It also runs locally, so you can work offline or batch convert.
     
  5. curtified

    curtified Rock Star

    Since we're on the stem separation topic: has anyone found anything that can pull specific sounds out of stems, like kicks or snares? Is the AI there yet? I know it will be soon.
     
  6. Martel

    Martel Platinum Record

    Joined:
    Jan 8, 2023
    Messages:
    398
    Likes Received:
    176
    Run UVR5 on the drums with htdemucs_ft alone, then import the result into RipX. But you'll end up with full tracks, when you can actually cherry-pick the "best kicks" or "best snare" from the UVR5 htdemucs_ft result.

    I'd recommend cherry-picking from the htdemucs_ft result, as there is no need for a full 4-minute, 32-bit float kick line. Get those kicks and counter kicks and you should be golden.
     
  7. jarredou

    jarredou Producer

    It should be possible to train an MDX model with a custom "drums" dataset, with the full drum mix as the mixture and snare, kick, toms (hi-hat?) as individual stems.

    I've already seen an MDX model trained to remove reverb from vocals, and it was not bad!
     
  8. curtified

    curtified Rock Star

    I run htdemucs_ft on drums and it's amazing! I import the result into Ableton and extract only the sections I need. However, sometimes there are still some cymbals or percussive sounds that overlap. Although this is not a major issue, I am still interested in exploring other available options such as RipX.

    Regarding the MDX custom dataset, I have been thinking about the possibility of using an approach similar to that of Loopcloud and other sample organization software, which can identify individual instruments quite well. It would be interesting to see if this logic could be applied to an MDX dataset, allowing for the extraction of specific sounds, similar to how Loopcloud can tag sounds.
     
  9. jarredou

    jarredou Producer

  10. Martel

    Martel Platinum Record

    Which is why I specifically detailed the need to cherry-pick your kicks and snares. You'll find at least a dozen of those per track. Then you'll need to EQ them anyway to make them fit your specific production. That should be enough to get you in trouble in sampling-rights territory, as it will be very recognizable.
     
  11. jarredou

    jarredou Producer

    OK, someone has already made a Demucs model to separate drum elements! :D
    https://github.com/inagoy/drumsep
    The Google Colab works if you download the model manually and put it in the model folder (the function in the code that should do this is broken).

    No info about the dataset used to train the model...

    EDIT: Dataset info from the dev:
    It's trained with 7 hours of drum tracks that I made using sample-based drum software like Addictive Drums, trying to get as many different-sounding drums as I could. As everything was controlled with MIDI, I could export the isolated bodies: kick, snare, toms (all on one track), and cymbals (including hi-hat). So every dataset example is composed of kick, snare, toms, cymbals, and the mixture (the sum of all of them).
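
The dataset layout described by the dev (isolated stems plus a mixture that is their sum) can be sketched roughly as below. This is only an illustration of the idea, not the drumsep training code: the function name `make_training_example` is hypothetical, and plain Python lists stand in for audio buffers that would normally be loaded from WAV files.

```python
def make_training_example(stems):
    """Build one dataset example from isolated drum stems.

    `stems` maps a stem name (e.g. "kick") to a list of float samples.
    The returned dict holds the same stems plus a "mixture" entry that is
    their sample-wise sum, as described in the post above.
    """
    if not stems:
        raise ValueError("need at least one stem")
    lengths = {len(samples) for samples in stems.values()}
    if len(lengths) != 1:
        raise ValueError("all stems must have the same number of samples")
    # zip(*...) walks all stems in lockstep, one sample position at a time.
    mixture = [sum(frame) for frame in zip(*stems.values())]
    return {**stems, "mixture": mixture}


example = make_training_example({
    "kick":    [0.5, 0.0, 0.0, 0.0],
    "snare":   [0.0, 0.0, 0.25, 0.0],
    "toms":    [0.0, 0.125, 0.0, 0.0],
    "cymbals": [0.125, 0.125, 0.125, 0.125],
})
```

Because the mixture is an exact sum of the stems, a model trained on such examples always has a ground-truth target for each drum element.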
     
    Last edited: Mar 2, 2023
  12. jarredou

    jarredou Producer

    There's also the "Zero Shot" algo, which may also be able to separate drum elements. It works with two audio files: one is the song you want to extract something from, and the second is a reference for the stem you want to extract (for example, a short bass recording if you want to extract the bass from the song). It can work with any source, so it should be possible to feed it drum elements to extract them from a full drum mix. There's an instance running on Replicate: https://replicate.com/retrocirce/zero_shot_audio_source_separation
     
    Last edited: Mar 2, 2023
  13. Legotron

    Legotron Audiosexual

  14. realdannys

    realdannys Noisemaker

    Joined:
    Oct 24, 2016
    Messages:
    14
    Likes Received:
    5
    Interesting - had a quick play around with this last night. I'd been using RipX for everything. RipX gives a quicker, more balanced result, but if you want a cleaner vocal, then whacking an ensemble on and setting it to all the vocal algos definitely seems to yield better results (though it will be processing for much, much longer).
     
  15. ArticStorm

    ArticStorm Moderator Staff Member

    Joined:
    Jun 7, 2011
    Messages:
    7,203
    Likes Received:
    3,490
    Location:
    AudioSexPro
    This would fry my notebook's CPU and integrated GPU. Otherwise I would.
     
  16. jarredou

    jarredou Producer

    I think UVR's ensemble mode only merges the WAV outputs from different models. If I'm right, you can use different algos on mvsep.com and then merge the WAV files in your DAW (and even balance them as you want). The only downside is that you can't tweak the model parameters (shifts, overlap...).
     
    Last edited: Mar 2, 2023
  17. curtified

    curtified Rock Star

    Let me know if you figure this out. It currently merges in real time while it is rendering, so the only way I can think of doing it is to have the files from mvsep renamed and ready to replace the files generated by UVR before the process is over.
     
  18. jarredou

    jarredou Producer

    So... it doesn't merge the audio while rendering. It renders each model's output one by one, and once all models are done, it merges them with a chosen algo (max/min, min/max, avg/avg; the last, "average", one seems to always give better and more stable results).

    And I've just realised that UVR 5.5.0 in fact has a built-in "manual ensemble" feature in the "audiotools" process method, so you can extract stems with mvsep.com (or any other site) and then "ensemble" them locally with UVR. It's fast and no CPU/GPU explosion! ;)
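
To make the "render each model, then merge" idea concrete, here is a rough sketch of merging already-rendered outputs of the same stem. It is a simplification under loud assumptions: it combines samples one by one in the time domain, whereas UVR's real max/min modes may well operate on spectrograms, so treat this as an illustration and not as UVR's code. The function name `ensemble` and the mode strings are hypothetical, and Python lists stand in for loaded WAV data.

```python
def ensemble(outputs, mode="avg"):
    """Merge several renders of the same stem, sample by sample.

    `outputs` is a list of equal-length sample lists, one per model.
    mode "avg" takes the mean, "max" keeps the sample with the largest
    absolute value, "min" keeps the one with the smallest absolute value.
    """
    if not outputs:
        raise ValueError("need at least one model output")
    if len({len(samples) for samples in outputs}) != 1:
        raise ValueError("all outputs must have the same number of samples")
    frames = zip(*outputs)  # one tuple per sample position, across models
    if mode == "avg":
        return [sum(frame) / len(frame) for frame in frames]
    if mode == "max":
        return [max(frame, key=abs) for frame in frames]
    if mode == "min":
        return [min(frame, key=abs) for frame in frames]
    raise ValueError(f"unknown mode: {mode!r}")


model_a = [0.5, -0.25, 0.0]
model_b = [0.25, -0.75, 0.5]
averaged = ensemble([model_a, model_b], mode="avg")
```

Averaging tends to cancel errors that differ between models while keeping what they agree on, which fits the observation above that the "average" setting gives the most stable results.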
     
    Last edited: Mar 8, 2023
  19. jarredou

    jarredou Producer

    Mind-blowing side note: you can also train and use Demucs to REMOVE DISTORTION from recordings. Research paper and sound examples are here (the code and models are not public (yet?), unfortunately): https://joimort.github.io/distortionremoval/
     
  20. curtified

    curtified Rock Star

    Holy shit!! Thank you!! A manual ensemble is what we needed!!
     