torchiva.separation should be torchiva.separate #4

Open
StuartIanNaylor opened this issue Jan 29, 2023 · 3 comments

Comments

@StuartIanNaylor

StuartIanNaylor commented Jan 29, 2023

The README gives:

python -m torchiva.separation INPUT OUTPUT

but the command that works is:

python -m torchiva.separate ./examples/samples/mix_reverb/103-1240-0003_1235-135887-0017.wav output.wav

Just wondering whether you might be planning to, or would be willing to (for those of us who are not so talented), export a quantised model to, say, TFLite or ONNX for realtime processing with, say, a LADSPA plugin.

I was just testing with noise rather than double talk and it works great. As is, it runs at approx. 2.3x realtime on an RK3588, but I am wondering how light the load could be if it were quantised and run from a C++ LADSPA plugin.
There is an absence of open-source BSS libs on Linux, so fingers crossed, but it is likely beyond what I am capable of, so I thought I would ask.
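For anyone wanting to reproduce a "times realtime" figure like the one above, here is a minimal sketch of how such a number is usually computed: audio duration divided by wall-clock processing time. The helper names `realtime_speedup` and `timed`, and the `separate_fn` placeholder in the usage comment, are hypothetical and not part of torchiva.

```python
import time


def realtime_speedup(audio_seconds, processing_seconds):
    """Return how many times faster than realtime the processing ran."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical usage: `separate_fn` stands in for whatever runs the model.
# _, elapsed = timed(separate_fn, mixture)
# print(realtime_speedup(num_samples / sample_rate, elapsed))
```

A value above 1 means the separation keeps up with a live stream on that hardware; a single run is noisy, so averaging over a few runs is probably worthwhile.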

@fakufaku
Owner

@StuartIanNaylor Thanks for reporting the typo in the README! Silly of me... 😆

I did not have plans for a quantized model, but that is actually a great idea! I will look into it, but I can't promise any timeline... I'll add it to the repo if I manage to get it to work. I am interested in trying out ONNX, for example, but do not have any experience with it. If you have some and would like to help out, that would also be greatly appreciated.
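As background on what "quantized" means here, the following is a minimal, library-free sketch of symmetric per-tensor int8 quantization of a list of weights. It only illustrates the general idea; it is not how torchiva, ONNX, or TFLite implement quantization, and the function names are made up for this example.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Fall back to a scale of 1.0 when all weights are zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Each recovered weight differs from the original by at most about scale / 2.
```

The stored integers take a quarter of the space of 32-bit floats, and integer arithmetic is often cheaper on small ARM boards, which is where the hoped-for load reduction would come from.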

I am also quite interested to hear about experience with data in the wild! I would be super curious to see your results if you would like to share something 🤗

@StuartIanNaylor
Author

Yeah, I can probably help by pointing you to https://github.com/usefulsensors/openai-whisper, as nyadla-sys has created various examples of exports to ONNX and then TFLite there, so there are plenty of templates for the conversion, and nyadla-sys would probably be a great source of info.

As for noise, I tested it with https://drive.google.com/file/d/1N90DtDjcm-ejbUpqMHsJAkKYIxoy5iMG/view?usp=share_link
and it did a great job, producing https://drive.google.com/file/d/1V8Frsa3H7eCx3YuNRQnm09WlC5dMKZwG/view?usp=share_link

Ignore the glitches due to clipping, as that test sample should have had AGC applied.

I have been trying to find a good low-load 2-channel BSS / dereverb for 2 years now. Ultimately it should be targeted-speaker BSS, as I guess that with double talk the target channel will be random, but I'll tackle that one later.

I will be watching and testing, so just ask, but nyadla-sys is the man, and the examples are in the repo above, as I said.

@StuartIanNaylor
Author

@fakufaku See https://github.com/wenet-e2e/wesignal, as it might be of interest given what is submitted there.
