UVR5 the Best AI stem separation algo?

Discussion in 'Software' started by curtified, Feb 27, 2023.

  1. jarredou

    jarredou Guest

arXiv is an open-access platform for academic papers.

The one from Gladia Research Group will be published on their GitHub space; there's a link to it on their demo site. As it's a diffusion algo, it does not extract data from the audio mixture but recreates each stem (with more or less accuracy and artefacts), the same way diffusion networks create images from noise (with more or less accuracy and artefacts!)
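The "recreate rather than extract" point can be sketched in a few lines of Python. This is a toy illustration of the general reverse-diffusion idea only, not Gladia's actual method; `denoise_step` is a hypothetical stand-in for a trained, mixture-conditioned network:

```python
import numpy as np

def denoise_step(x, mixture, t):
    # A real model would predict the noise to remove at step t,
    # conditioned on the mixture. Here we just nudge x toward the
    # mixture so the loop is runnable.
    return x + 0.1 * (mixture - x)

def generate_stem(mixture, steps=50, seed=0):
    rng = np.random.default_rng(seed)
    # The key property: we start from pure noise, not from the mix,
    # and synthesize the stem step by step.
    x = rng.standard_normal(mixture.shape)
    for t in reversed(range(steps)):
        x = denoise_step(x, mixture, t)
    return x

mix = np.sin(np.linspace(0, 100, 44100))  # 1 s of fake "audio"
stem = generate_stem(mix)
```

Because the output is synthesized rather than subtracted out of the mix, the result can sound clean but is never guaranteed to be sample-accurate, which is exactly the accuracy/artefact trade-off mentioned above.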

For the ByteDance MDX23 one, the winner is supposed to open-source their algo, so it should be available (and their algo from the 2021 competition was made public after that contest too)

    EDIT - the official rules:
[image: screenshot of the official competition rules]


So in the worst case, we'll be able to use it with their model, but they may not publish the training code, making 'community models' like we have with MDX/Demucs/UVR impossible
     
    Last edited by a moderator: May 9, 2023
  2. Martel

    Martel Platinum Record

    Joined:
    Jan 8, 2023
    Messages:
    398
    Likes Received:
    176
In any case, this is really exciting. I love every single bit of it. We've come a long way from that iZotope thing.

I tried asking RipX when they would be updating their separation model and if they were using Demucs 4, and they told me they could not answer that question (I guess for legal reasons!?)

I was trying to follow the board setups in UVR5 for an ensemble and realized that a few models from the chain were missing, like MDX-Net 496, Demucs mmi and some Kim Vocal model (not the 1) I couldn't download or locate in UVR5.
     
  3. jarredou

    jarredou Guest

In UVR, "Demucs-mmi" is "hdemucs-mmi" (it's Demucs 3 retrained with the 800 additional songs used to train Demucs 4)

    "MDX-net 496" is "UVR-MDX-NET Main" (I'm not 100% sure, but 99%)

And "Kim Vocal 2" is not public yet, as it was used for the MDX23 competition; Kimberley should release it soon now that the comp is over.
     
    Last edited by a moderator: May 9, 2023
  4. Martel

    Martel Platinum Record

    Joined:
    Jan 8, 2023
    Messages:
    398
    Likes Received:
    176

    You are simply awesome !!! :mates:

Thank you so much for sharing all of that info.
     
  5. jarredou

    jarredou Guest

It's only the MDX23 algo that is premium access (and maybe future ones); all the others are free, so it's still useful for people who can't run UVR locally (and it's easier to use than a Google colab).

X-minus is a good option too; among all the other paid options, it's run by one of the contributing devs of UVR, who manages the site and also trains new models that are accessible to premium users (weeks before being included into UVR, most of the time)
     
    Last edited by a moderator: May 10, 2023
  6. Martel

    Martel Platinum Record

    Joined:
    Jan 8, 2023
    Messages:
    398
    Likes Received:
    176
I highly suggest anyone who can to encourage him/them. The difference from the UVR ensemble is negligible; the processing time is obviously a lot longer, but the result is pretty much the same to my ears. Maybe a tiny bit fuller for MVSEP, but that's splitting hairs.
     
  7. Zephor

    Zephor Noisemaker

    Joined:
    Aug 22, 2018
    Messages:
    15
    Likes Received:
    3
    Can someone give me the TLDR of this thread and suggest the best AI stem separation algo?
     
  8. jarredou

    jarredou Guest

Here is the leaderboard with scores for all stems (and the models/algos used); you can sort results by clicking on the column titles: https://mvsep.com/quality_checker/leaderboard2.php?sort=bass

Note that things should evolve in the next days/weeks, as there was a "demixing competition" that is over now. The winner and some other participants will release their algos & models. The winner should be ByteDance, and their scores are higher than those of any other algos/models currently available. Some other new source separation methods will also be published at the end of the month (based on diffusion networks).
     
  9. jarredou

    jarredou Guest

  10. Martel

    Martel Platinum Record

    Joined:
    Jan 8, 2023
    Messages:
    398
    Likes Received:
    176
  11. phenomboy

    phenomboy Producer

    Joined:
    Aug 25, 2011
    Messages:
    575
    Likes Received:
    83
    Location:
    Nantes
Care to share another invite?
     
  12. jarredou

    jarredou Guest

  13. jarredou

    jarredou Guest

  14. jarredou

    jarredou Guest

  15. R.S

    R.S Producer

    Joined:
    Oct 11, 2017
    Messages:
    152
    Likes Received:
    88
    When running run.bat I get:

    C:\MVSep-MDX23.zip>"./Miniconda/python.exe" gui.py
    GPU use: 0
    Use device: cuda:0
    Model path: C:\MVSep-MDX23.zip/models/Kim_Vocal_1.onnx
    Device: cuda:0 Chunk size: 1000000
    Model path: C:\MVSep-MDX23.zip/models/Kim_Inst.onnx
    Device: cuda:0 Chunk size: 1000000
    Go for: C:/AI/DEMO/Demo/Elton John/SOURCE - Goodbye Yellow Brick Road.mp3
    LLVM ERROR: Symbol not found: __svml_cosf8_ha


    Any thoughts??
     
  16. jarredou

    jarredou Guest

Better to ask for support on MVSep's GitHub space or in the MVSep Discord channel (invite posted above)
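That said, `__svml_cosf8_ha` is an Intel SVML (vector math) symbol, and this exact "LLVM ERROR: Symbol not found" message is a known numba/llvmlite issue on some setups. One hedged workaround, assuming numba is the culprit here, is to disable SVML before numba is first imported, e.g. at the very top of gui.py:

```python
import os

# Assumption: the crash comes from numba trying to use Intel SVML
# intrinsics that the bundled LLVM can't resolve. NUMBA_DISABLE_INTEL_SVML
# is a real numba environment variable; it must be set before numba
# is imported anywhere in the process.
os.environ["NUMBA_DISABLE_INTEL_SVML"] = "1"
```

No promises it's the cause in this case, which is why the GitHub/Discord route above is still the better bet.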
     
  17. Martel

    Martel Platinum Record

    Joined:
    Jan 8, 2023
    Messages:
    398
    Likes Received:
    176
  18. lxfsn

    lxfsn Platinum Record

    Joined:
    Sep 8, 2021
    Messages:
    344
    Likes Received:
    259
The model requires 11 GB of video memory, otherwise it will default to CPU. Edit the bat file and run it with

    Code:
    "./Miniconda/python.exe" gui.py --chunk_size 200000 --single_onnx
    This is how I managed to make it work here
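For context on what `--chunk_size` controls: the separator runs the model over fixed-size windows of the waveform, so peak memory scales with the chunk size rather than the song length. A minimal sketch of the idea (`process_chunk` is a hypothetical stand-in for the actual ONNX inference):

```python
import numpy as np

def process_chunk(chunk):
    # Placeholder for model inference on one window; the real app
    # would run the ONNX session here.
    return chunk * 0.5

def separate(audio, chunk_size=200_000):
    # Process the waveform window by window; only one chunk is ever
    # resident in (V)RAM at a time, so peak memory tracks chunk_size.
    out = np.empty_like(audio)
    for start in range(0, len(audio), chunk_size):
        end = min(start + chunk_size, len(audio))
        out[start:end] = process_chunk(audio[start:end])
    return out

song = np.ones(1_000_000, dtype=np.float32)  # ~23 s at 44.1 kHz
result = separate(song, chunk_size=200_000)
```

Lowering the chunk size trades more iterations (slower) for a smaller memory peak, which is why it helps cards with less than 11 GB.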
     
  19. Martel

    Martel Platinum Record

    Joined:
    Jan 8, 2023
    Messages:
    398
    Likes Received:
    176
    That didn't work for me sadly. It still crashed at 20%
     
  20. lxfsn

    lxfsn Platinum Record

    Joined:
    Sep 8, 2021
    Messages:
    344
    Likes Received:
    259
Remove the chunk_size edit from the bat file (but keep --single_onnx) and try this: in the inference.py file, line 244, set a lower chunk size as shown below. Also feed it a 10-second-long audio file first, just to make sure it works. A full audio file will then take about 2 hours. After that, maybe increase the chunk size until your computer starts crashing.

    I'm afraid with this new wave of AI tools we need to upgrade to 16GB video cards. ffs

    Code:
            if device == 'cpu':
                # chunk_size = 200000000
                chunk_size = 1000000
PS: this setting works in my case on a 32 GB DDR system. It constantly takes 30 GB of memory. If you have less memory, you'll need to reduce the chunk size even more
     
    Last edited: May 13, 2023