Any AI gurus, coders, trainers here?

Legotron · Jul 14, 2022

A concept keeps popping into my mind from time to time: could an AI be taught to perfectly convert drum audio to MIDI data, simply by training it on the audio together with a manually drawn MIDI file (covering all the different hits) and feeding both to the AI?
There's already very interesting stuff going on and lots of tools available, but my knowledge of AI is very limited. The few tutorials I have tried to watch are very technical and aimed more at hardcore coders.

Any links, info, forums and knowledge are very welcome.

Thanks L

Mister Grimm · Jul 14, 2022

The concept you are talking about already exists. Superior Drummer 3 can convert drum audio files (WAV, MP3, etc.) to MIDI. It's not always 100% accurate, but most of the time it works great.

Legotron · Jul 14, 2022

But SD3 doesn't use any AI. What I mean is: could the conversion be trained to work better with AI, and even work from a full music file? Those are usually better done manually.

clone · Jul 14, 2022

Logic DMD (Drum Machine Designer) does the same thing, but after it slices, it loads each slice into a sampler instance and then puts all those sampler instances on their own audio channels within one Summing Stack/group.

Hit detection is simple enough that it doesn't really need "AI": it just looks at +/- amplitudes to determine the initial slices, which can be adjusted and fine-tuned after that. And of course, it renders a MIDI playback file, just like ReCycle.
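
To make the amplitude idea concrete, here is a minimal sketch in plain Python. It is not how Logic, ReCycle or any other product actually does it; the function names, the frame size, the threshold and the synthetic test signal are all assumptions picked for illustration.

```python
import math

def detect_hits(samples, sample_rate, frame_size=256, threshold=0.1, min_gap_s=0.05):
    """Return hit times in seconds: the start of each frame whose peak
    amplitude rises through `threshold` after a quieter frame."""
    hits = []
    last_hit = -math.inf
    was_above = False
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        peak = max(abs(s) for s in frame)  # +/- amplitude, sign ignored
        is_above = peak >= threshold
        t = start / sample_rate
        # Only a quiet-to-loud transition counts, and not too soon after the last hit.
        if is_above and not was_above and t - last_hit >= min_gap_s:
            hits.append(t)
            last_hit = t
        was_above = is_above
    return hits

def synth_hit(sample_rate, length_s=0.1, freq=200.0):
    """Hypothetical drum hit: a sine burst with a fast exponential decay."""
    n = int(sample_rate * length_s)
    return [math.exp(-40.0 * i / sample_rate) * math.sin(2 * math.pi * freq * i / sample_rate)
            for i in range(n)]

# Two seconds of silence with three synthetic hits mixed in.
sample_rate = 8000
true_times = (0.25, 0.75, 1.5)
signal = [0.0] * (2 * sample_rate)
for hit_time in true_times:
    offset = int(hit_time * sample_rate)
    for i, value in enumerate(synth_hit(sample_rate)):
        signal[offset + i] += value

hits = detect_hits(signal, sample_rate)
print(hits)
```

Each reported time is the start of the frame where the level jumped, so it can be early by up to one frame (256 samples here). Real drum recordings with bleed and ghost notes are where a fixed threshold starts to need the manual fine-tuning mentioned above.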

Last edited: Jul 14, 2022

Ryck · Jul 14, 2022

As I understand AI, and many other things besides MIDI and audio conversion, I believe that technology lags behind what a human being can imagine.
That is, someone may have in mind how they could do it, but then the current state of AI would prevent them from getting there.

It's like saying, "I imagine being here and teleporting to another place in one second." In my mind the idea is there, but technology still can't do it.

The same thing happens with programs that separate a song into tracks: currently I think it can only be done for vocals, drums, bass, piano, and "other".

Something else can come into play: certain processes would need a very powerful CPU.

For example, if you make an EQ match of a drum loop, you get a single EQ curve for the whole loop. In reality the EQ changes as the rhythm or melody changes, so you would have to analyze each change and then apply it to your EQ. I think this would consume all the CPU.

Maybe I'm talking nonsense; this is just my understanding. Someone correct me if I'm wrong.

recycle · Jul 14, 2022

clone said:

Logic DMD (Drum Machine Designer) does the same thing, but after it slices, it loads each slice into a sampler instance and then puts all those sampler instances on their own audio channels within one Summing Stack/group.

Hit detection is simple enough that it doesn't really need "AI": it just looks at +/- amplitudes to determine the initial slices, which can be adjusted and fine-tuned after that. And of course, it renders a MIDI playback file, just like ReCycle.

Yes, Ableton also has an audio-to-MIDI drum function. I don't think there is a big technological effort behind this feature; calling it AI is too much.
One thing I do hope AI will make possible is a feature that could de-master, de-maximize and re-dynamize a signal.
The point is that I can't stand this loudness anxiety anymore: any song produced in the last 20 years has its dynamics mutilated. Using the intensity of sound for expressiveness no longer seems to exist in music. When I listen to any recent song, my output LEDs are lit at full scale, and after one minute my ears bleed.
We all know that this mixing/mastering technique is a marketing strategy (read: loudness war) and certainly not down to the artist.
What would restore dignity to all music is an AI plugin that separates the tracks, rebuilds the transients and volume of each one, and reconstructs the song as if it had never passed through any compressor/limiter.
Who knows, maybe it already exists and I'm not aware of it.

suckajim · Jul 14, 2022

@Legotron
Yes, it's possible to train an AI the way you mentioned.
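
As a rough illustration of what "feeding both to the AI" could mean, here is a minimal sketch that turns hand-drawn note events into per-frame targets a model could be trained against. Everything in it is an assumption for illustration: the three-drum label set, the 10 ms frame length, the event format and the function name. A real setup would read the events from a MIDI file and pair each frame with audio features.

```python
# Hypothetical label set; a real kit would have more classes.
DRUMS = ["kick", "snare", "hihat"]

def frame_labels(events_ms, duration_ms, frame_ms=10):
    """Turn (time_ms, drum_name) events into one 0/1 vector per audio frame."""
    n_frames = duration_ms // frame_ms
    labels = [[0] * len(DRUMS) for _ in range(n_frames)]
    for time_ms, drum in events_ms:
        frame = time_ms // frame_ms
        if 0 <= frame < n_frames:
            labels[frame][DRUMS.index(drum)] = 1
    return labels

# Made-up one-second annotation, as if drawn by hand in a piano roll.
midi_events = [
    (0, "kick"), (0, "hihat"),
    (250, "hihat"),
    (500, "snare"), (500, "hihat"),
    (750, "hihat"),
]

targets = frame_labels(midi_events, duration_ms=1000)
print(targets[0], targets[50])
```

Each row of `targets` lines up with one 10 ms slice of the audio, so the training pairs would be (audio features of slice i, `targets[i]`). Most rows are all zeros, which is one reason this kind of training is harder than it looks.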

Will it produce a perfect MIDI sequence? No, not at all. It's not that easy. Complex AI tasks are run on GPUs, not CPUs. But even if you had access to all the world's supercomputers, it wouldn't help you create a perfect MIDI sequence. It's not a matter of "CPU power". The challenge is to find an appropriate model or algorithm for your task, and each one can be tweaked in endless ways: how big should your artificial neural network be (bigger isn't necessarily better), how many neurons go on the first layer, the second layer, and so on. And that covers only some aspects of neural networks; many other strategies exist as well.

It usually takes years of research to find a model that's good enough to fulfill your specific task, and then that model is only suitable for that task. If you change the task, e.g. you want to extract the melody from audio files, you have to start from scratch again. Your previous model simply can't do that.
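
To give a feel for the network-size question, here is a small sketch that counts the trainable parameters of a plain fully connected network for two layer layouts. The layouts, the 128 input features and the 3 output classes are made-up numbers, and real drum transcription models are usually not simple fully connected stacks.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network,
    e.g. [128, 64, 3] means 128 inputs, one hidden layer of 64, 3 outputs."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Two hypothetical candidates for the same task.
candidates = {
    "small": [128, 64, 3],
    "large": [128, 256, 256, 3],
}

sizes = {name: count_parameters(layers) for name, layers in candidates.items()}
for name, n_params in sizes.items():
    print(name, n_params)
```

The second layout has more than ten times as many numbers to fit from the same training data, and nothing guarantees it transcribes better. Every such choice has to be tested by training and measuring.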

There are many music recognition challenges where you can participate and win prizes if you contribute something useful.

Usually you have some training data and you try to build a model that can transform that data into MIDI. When you have finished training your model, you come to the conclusion: damn, the MIDI is recognized pretty badly. Then you start tweaking your model and analyzing which parameters lead to better results. Using AI doesn't mean your results will be perfect. It's also possible that your model's results stay bad and never get good enough. Then you can try different approaches.
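
The tweak-and-measure loop can be sketched with a toy example. The "model" below is just a threshold on a made-up amplitude envelope, not a trained network, and the envelope values, reference hit times, tolerance and candidate thresholds are all assumptions. The point is only the workflow: run, score against the hand-made reference, change a parameter, repeat.

```python
def f_measure(detected, reference, tolerance=0.05):
    """F1 score; a detection counts once if it lies within `tolerance`
    seconds of a reference hit that has not been matched yet."""
    unmatched = list(reference)
    true_pos = 0
    for d in detected:
        match = next((r for r in unmatched if abs(r - d) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)
            true_pos += 1
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(detected)
    recall = true_pos / len(reference)
    return 2 * precision * recall / (precision + recall)

def detect(envelope, frame_s, threshold):
    """Toy 'model': report a hit whenever the envelope rises through threshold."""
    hits = []
    previous = 0.0
    for i, level in enumerate(envelope):
        if level >= threshold and previous < threshold:
            hits.append(i * frame_s)
        previous = level
    return hits

# Made-up envelope, one value per 0.1 s: loud hit, ghost note, loud hit, then bleed.
envelope = [0.0, 0.9, 0.3, 0.05, 0.4, 0.1, 0.02, 0.8, 0.2, 0.05, 0.15, 0.0]
reference = [0.1, 0.4, 0.7]  # hand-annotated hit times in seconds

scores = {t: f_measure(detect(envelope, 0.1, t), reference) for t in (0.1, 0.3, 0.6)}
best_threshold = max(scores, key=scores.get)
print(scores, best_threshold)
```

Too low a threshold picks up the bleed as an extra hit, too high a threshold misses the ghost note. With a real model there are many such parameters, and every combination means another training run.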

It took decades (?) for machines to be able to recognize and classify pictures better than human experts do.

Conclusion: AI is not a toolbox that you can use to create perfect results in a short period of time. AI is mathematics. Most of the time you are trying to figure out which models or parameters to use. And of course, you are waiting until your model has been trained on a GPU/CPU and is ready to be used. Training might take hours, days or weeks for a single model/parameter combination.

Last edited: Jul 14, 2022
