UVR5 the Best AI stem separation algo?

Discussion in 'Software' started by curtified, Feb 27, 2023.

  1. jarredou

    jarredou Guest

arXiv is an open-access platform for academic papers.

The one from Gladia Research Group will be published on their GitHub space; there's a link to it on their demo site. As it's a diffusion algo, it does not extract data from the audio mixture but recreates each stem (with more or less accuracy and artefacts), the same way diffusion networks create images from noise (with more or less accuracy and artefacts!)
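The "recreate rather than extract" point can be sketched in a few lines of Python. This is a toy illustration of the general reverse-diffusion idea only, not Gladia's actual method; `denoise_step` is a hypothetical stand-in for a trained, mixture-conditioned network:

```python
import numpy as np

def denoise_step(x, mixture, t):
    # A real model would predict the noise to remove at step t,
    # conditioned on the mixture. Here we just nudge x toward the
    # mixture so the loop is runnable.
    return x + 0.1 * (mixture - x)

def generate_stem(mixture, steps=50, seed=0):
    rng = np.random.default_rng(seed)
    # The key property: we start from pure noise, not from the mix,
    # and synthesize the stem step by step.
    x = rng.standard_normal(mixture.shape)
    for t in reversed(range(steps)):
        x = denoise_step(x, mixture, t)
    return x

mix = np.sin(np.linspace(0, 100, 44100))  # 1 s of fake "audio"
stem = generate_stem(mix)
```

Because the output is synthesized rather than subtracted out of the mix, the result can sound clean but is never guaranteed to be sample-accurate, which is exactly the accuracy/artefact trade-off mentioned above.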

For the ByteDance MDX23 one, the winner is supposed to open-source their algo, so it should be available (and their algo from the 2021 competition was made public after that contest too)

    EDIT - the official rules:
[image: screenshot of the official competition rules]


So in the worst case, we'll be able to use it with their model, but they may not publish the training code, making 'community models' like we have with MDX/Demucs/UVR impossible
     
    Last edited by a moderator: May 9, 2023
  2. Martel

    Martel Platinum Record

    Joined:
    Jan 8, 2023
    Messages:
    398
    Likes Received:
    176
In any case, this is really exciting. I love every single bit of it. We've come a long way from that iZotope thing.

I tried asking RipX when they would be updating their separation model and if they were using Demucs 4, and they told me they could not answer that question (I guess for legal reasons!?)

I was trying to follow the board setups in UVR5 for an ensemble and realized that a few models from the chain were missing, like MDX-Net 496, Demucs mmi and some Kim Vocal model (not the 1) I couldn't download or locate in UVR5.
     
  3. jarredou

    jarredou Guest

In UVR, "Demucs-mmi" is "hdemucs-mmi" (it's Demucs 3 retrained with the 800 additional songs used to train Demucs 4)

    "MDX-net 496" is "UVR-MDX-NET Main" (I'm not 100% sure, but 99%)

And "Kim Vocal 2" is not public yet, as it was used for the MDX23 competition; Kimberley should release it soon now that the comp is over.
     
    Last edited by a moderator: May 9, 2023
  4. Martel

    Martel Platinum Record

    Joined:
    Jan 8, 2023
    Messages:
    398
    Likes Received:
    176

    You are simply awesome !!! :mates:

Thank you so much for sharing all of that info.
     
  5. jarredou

    jarredou Guest

It's only the MDX23 algo that is premium access (and maybe future ones); all the others are free, so it's still useful for people who can't run UVR locally (and it's easier to use than a Google colab).

X-minus is a good option too; among all the other paid options, it's run by one of the contributing devs of UVR, who manages the site and also trains new models that are accessible to premium users (weeks before being included into UVR, most of the time)
     
    Last edited by a moderator: May 10, 2023
  6. Martel

    Martel Platinum Record

    Joined:
    Jan 8, 2023
    Messages:
    398
    Likes Received:
    176
I highly suggest anyone who can to encourage him/them. The difference from the UVR ensemble is negligible; the processing time is obviously a lot longer, but the result is pretty much the same to my ears. Maybe a tiny bit fuller for MVSEP, but that's splitting hairs.
     
  7. Zephor

    Zephor Noisemaker

    Joined:
    Aug 22, 2018
    Messages:
    15
    Likes Received:
    3
    Can someone give me the TLDR of this thread and suggest the best AI stem separation algo?
     
  8. jarredou

    jarredou Guest

Here is the leaderboard with scores for all stems (and the models/algos used); you can sort results by clicking on the column titles: https://mvsep.com/quality_checker/leaderboard2.php?sort=bass

Note that things should evolve in the next days/weeks, as there was a "demixing competition" that is over now. The winner and some other participants will release their algos & models. The winner should be ByteDance, and their scores are higher than those of any other algos/models currently available. Some other new source separation methods will also be published at the end of the month (based on diffusion networks).
     
  9. jarredou

    jarredou Guest

  10. Martel

    Martel Platinum Record

    Joined:
    Jan 8, 2023
    Messages:
    398
    Likes Received:
    176
  11. phenomboy

    phenomboy Producer

    Joined:
    Aug 25, 2011
    Messages:
    575
    Likes Received:
    83
    Location:
    Nantes
Care to share another invite?
     
  12. jarredou

    jarredou Guest

  13. jarredou

    jarredou Guest

  14. jarredou

    jarredou Guest

  15. R.S

    R.S Producer

    Joined:
    Oct 11, 2017
    Messages:
    152
    Likes Received:
    88
    When running run.bat I get:

    C:\MVSep-MDX23.zip>"./Miniconda/python.exe" gui.py
    GPU use: 0
    Use device: cuda:0
    Model path: C:\MVSep-MDX23.zip/models/Kim_Vocal_1.onnx
    Device: cuda:0 Chunk size: 1000000
    Model path: C:\MVSep-MDX23.zip/models/Kim_Inst.onnx
    Device: cuda:0 Chunk size: 1000000
    Go for: C:/AI/DEMO/Demo/Elton John/SOURCE - Goodbye Yellow Brick Road.mp3
    LLVM ERROR: Symbol not found: __svml_cosf8_ha


    Any thoughts??
     
  16. jarredou

    jarredou Guest

Better to ask for support on MVSep's GitHub space or in the MVSep Discord channel (invite posted above)
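That said, `__svml_cosf8_ha` is an Intel SVML (vector math) symbol, and this exact "LLVM ERROR: Symbol not found" message is a known numba/llvmlite issue on some setups. One hedged workaround, assuming numba is the culprit here, is to disable SVML before numba is first imported, e.g. at the very top of gui.py:

```python
import os

# Assumption: the crash comes from numba trying to use Intel SVML
# intrinsics that the bundled LLVM can't resolve. NUMBA_DISABLE_INTEL_SVML
# is a real numba environment variable; it must be set before numba
# is imported anywhere in the process.
os.environ["NUMBA_DISABLE_INTEL_SVML"] = "1"
```

No promises it's the cause in this case, which is why the GitHub/Discord route above is still the better bet.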
     
  17. Martel

    Martel Platinum Record

    Joined:
    Jan 8, 2023
    Messages:
    398
    Likes Received:
    176
  18. lxfsn

    lxfsn Platinum Record

    Joined:
    Sep 8, 2021
    Messages:
    344
    Likes Received:
    259
The model requires 11 GB of video memory, otherwise it will default to CPU. Edit the bat file and run it with

    Code:
    "./Miniconda/python.exe" gui.py --chunk_size 200000 --single_onnx
    This is how I managed to make it work here
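For context on what `--chunk_size` controls: the separator runs the model over fixed-size windows of the waveform, so peak memory scales with the chunk size rather than the song length. A minimal sketch of the idea (`process_chunk` is a hypothetical stand-in for the actual ONNX inference):

```python
import numpy as np

def process_chunk(chunk):
    # Placeholder for model inference on one window; the real app
    # would run the ONNX session here.
    return chunk * 0.5

def separate(audio, chunk_size=200_000):
    # Process the waveform window by window; only one chunk is ever
    # resident in (V)RAM at a time, so peak memory tracks chunk_size.
    out = np.empty_like(audio)
    for start in range(0, len(audio), chunk_size):
        end = min(start + chunk_size, len(audio))
        out[start:end] = process_chunk(audio[start:end])
    return out

song = np.ones(1_000_000, dtype=np.float32)  # ~23 s at 44.1 kHz
result = separate(song, chunk_size=200_000)
```

Lowering the chunk size trades more iterations (slower) for a smaller memory peak, which is why it helps cards with less than 11 GB.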
     
  19. Martel

    Martel Platinum Record

    Joined:
    Jan 8, 2023
    Messages:
    398
    Likes Received:
    176
    That didn't work for me sadly. It still crashed at 20%
     
  20. lxfsn

    lxfsn Platinum Record

    Joined:
    Sep 8, 2021
    Messages:
    344
    Likes Received:
    259
Remove the chunk_size edit from the bat file (but keep --single_onnx) and try this: in the inference.py file, line 244, set a lower chunk size as shown below. Also feed it a 10-second-long audio file first, just to make sure it works. A full audio file will then take about 2 hours. After that, maybe increase the chunk size until your computer starts crashing.

    I'm afraid with this new wave of AI tools we need to upgrade to 16GB video cards. ffs

    Code:
            if device == 'cpu':
                # chunk_size = 200000000
                chunk_size = 1000000
PS: this setting works in my case on a 32 GB DDR system. It constantly takes 30 GB of memory. If you have less memory, you'll need to reduce the chunk size even more
     
    Last edited: May 13, 2023