
General questions/Training guitar effect removal #124

Open
anguzo opened this issue Jan 23, 2025 · 1 comment

anguzo commented Jan 23, 2025

Hi

I have the following questions that someone more familiar with the music source separation scene might be able to answer:

  1. Considering there is a dereverb model by anvuew for vocals, would it be possible to train, in a similar way, a model to remove effects from guitar? I know about RemFX, but I want to try roformers for this task in addition to demucs.
  2. If I train such a model, should the dataset (type 1) consist of "noeffect", "effect", and "mixture", with "mixture" identical to "effect"?
  3. For roformers, can I set target_instrument so that training prioritizes the loss metric for that instrument?
  4. For demucs, as I understand it so far, it optimizes both stems (or their average), which is not ideal. How can I make demucs work for this task?
  5. Posted models mostly report SDR as a metric, but for metric_for_scheduler, should I first try aura_mrstft or si_sdr? Optimizing for SDR seems deceptive, since a high SDR does not necessarily mean the result holds up in a listening test.
  6. Is there value in training a model at a 16 kHz sampling rate? Most models are trained at 44.1 kHz, which is of course computationally expensive. Would it make sense to train at a lower sampling rate first, for example to tune training parameters?
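To make the SDR-vs-SI-SDR concern in question 5 concrete, here is a minimal NumPy sketch (the function names are mine, not this repo's implementation) showing why plain SDR can be deceptive: it penalizes a simple gain mismatch even when the separation itself is perfect, while SI-SDR projects the estimate onto the reference first and ignores scale.

```python
import numpy as np

def sdr(est, ref, eps=1e-8):
    """Plain SDR: reference power over error power, in dB."""
    noise = ref - est
    return 10 * np.log10((np.dot(ref, ref) + eps) / (np.dot(noise, noise) + eps))

def si_sdr(est, ref, eps=1e-8):
    """Scale-invariant SDR: project the estimate onto the reference first."""
    ref = ref - ref.mean()
    est = est - est.mean()
    alpha = np.dot(est, ref) / (np.dot(ref, ref) + eps)
    target = alpha * ref            # best-scaled copy of the reference
    noise = est - target
    return 10 * np.log10((np.dot(target, target) + eps) / (np.dot(noise, noise) + eps))

ref = np.sin(np.linspace(0, 100, 44100))  # 1 s of a toy "clean" signal
est = 2.0 * ref                           # perfect separation, wrong gain
print(sdr(est, ref))                      # ~0 dB: looks like a bad estimate
print(si_sdr(est, ref))                   # very high: scale difference ignored
```

Neither number alone tells you how the output sounds, which is why listening tests still matter; but SI-SDR at least won't punish a model for a benign gain offset.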

I am a junior researcher with very little experience in audio ML, so I hope these questions are appropriate.
Also, while this might be extensive, this project would benefit from a more general Wiki/FAQ, as it seems to incorporate, if not all, then many of the most relevant pieces of modern research (models, metrics, augmentation implementations) in the field.

Thank you in advance, and thank you for the project.

ZFTurbo (Owner) commented Jan 23, 2025

  1. I think it's possible. You need to have pairs of "clean guitar" + "with effects" recordings.
  2. As training data you only need "noeffect/clean guitar" and "effect". You don't need a mixture for training (only for validation).
  3. Yes, you can set a priority, but in my opinion optimizing both instruments at the same time gives the same result.
  4. I didn't test demucs with a non-null target_instrument, so I can't say.
  5. Usually all metrics grow simultaneously. You can use whichever you prefer.
  6. We don't actually use the sample rate; we operate with "chunk_size". If you use the same "chunk_size" for a model, it will have the same complexity. If you convert data from 44.1 kHz to 16 kHz, you can lower the chunk size and reduce the model size. I didn't try this, because everyone wants to use models on high-quality data. For debugging, it's possible.
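A quick back-of-the-envelope illustration of point 6 (the specific numbers below are made up for illustration, not this repo's defaults): chunk_size counts samples, so after resampling the data to a lower rate, a proportionally smaller chunk covers the same duration of audio.

```python
# Hypothetical numbers: chunk_size is measured in samples, so the same
# audio duration needs proportionally fewer samples at a lower sample rate.
sr_full, sr_low = 44100, 16000
chunk_full = 485100                  # ~11 s of audio at 44.1 kHz (illustrative)
duration = chunk_full / sr_full      # seconds of audio per chunk
chunk_low = int(duration * sr_low)   # same duration at 16 kHz
print(duration, chunk_low)           # 11.0 seconds -> 176000 samples
```

Shrinking chunk_size this way is what actually reduces the per-step compute, which is why it could be a reasonable lever for fast debugging runs before committing to full-rate training.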
