torchiva.separation should be torchiva.separate #4

StuartIanNaylor · 2023-01-29T00:32:11Z

python -m torchiva.separation INPUT OUTPUT

python -m torchiva.separate ./examples/samples/mix_reverb/103-1240-0003_1235-135887-0017.wav output.wav

Just wondering whether you guys are planning to, or would be willing to (for those of us not so talented), export a quantised model to, say, TFLite or ONNX for realtime processing with, say, a LADSPA plugin.

I was just testing with noise rather than double talk and it works great. As is, it runs at approx. 2.3x realtime on an RK3588, but I am wondering how light the load could be with a quantised model and a C++ LADSPA plugin?
There is an absence of open-source BSS libs on Linux, so fingers crossed, but this is likely beyond what I am capable of, so I thought I would ask.
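
For anyone wanting to compare numbers, here is a minimal sketch of how a realtime figure like the one above could be measured. It assumes "2.3x realtime" means the audio duration divided by the wall-clock processing time; that reading is an assumption, not something stated in the thread. The helper names (`wav_duration_seconds`, `realtime_factor`, `time_call`) are hypothetical and not part of torchiva.

```python
import time
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def realtime_factor(audio_seconds, processing_seconds):
    """Speed relative to realtime; values above 1 mean faster than realtime.

    Assumed definition: audio duration / wall-clock processing time.
    """
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def time_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

Under this definition, a 10 s clip separated in 4.35 s gives `realtime_factor(10.0, 4.35)`, which is roughly 2.3. Wrapping whatever separation call you use in `time_call` and passing the elapsed time together with `wav_duration_seconds(...)` of the input file gives a comparable figure on other hardware.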

fakufaku · 2023-01-30T05:01:51Z

@StuartIanNaylor Thanks for reporting the typo in the README! Silly of me... 😆

I did not have plans for a quantized model, but that is actually a great idea! I will look into it, but I can't promise any timeline... I'll add it to the repo if I manage to get it to work. I am interested in trying out ONNX, for example, but do not have any experience with it. If you have some and would like to help out, that would also be greatly appreciated.

I am also quite interested to hear about experience with data in the wild! I would be super curious to see your results if you would like to share something 🤗

StuartIanNaylor · 2023-01-30T05:44:00Z

Yeah, I can probably help by pointing you to https://github.com/usefulsensors/openai-whisper, as nyadla-sys has created various examples of exporting to ONNX and then TFLite. There are plenty of templates for the conversion there, and nyadla-sys would probably be a great source of info.

As for noise I tested it with https://drive.google.com/file/d/1N90DtDjcm-ejbUpqMHsJAkKYIxoy5iMG/view?usp=share_link
which did a great job and produced https://drive.google.com/file/d/1V8Frsa3H7eCx3YuNRQnm09WlC5dMKZwG/view?usp=share_link

Ignore the glitches due to clipping, as that is the test sample, which should have had AGC.

I have been trying to find a good low-load 2-channel BSS / dereverb for 2 years now. Ultimately it should be targeted-speaker BSS, as I guess that with double talk the target channel will be random, but I will tackle that one later.

I will be watching and testing, so just ask, but nyadla-sys is the man, and the examples are, as said, in the above repo.

StuartIanNaylor · 2023-06-02T11:42:08Z

@fakufaku https://github.com/wenet-e2e/wesignal might be of interest, given what is submitted there.

fakufaku added a commit that referenced this issue Jan 30, 2023

Corrects typo in the command name for the pre-trained model in README…

83f600e

… (issue #4)
