This repository has been archived by the owner on Feb 8, 2025. It is now read-only.

Extreme background noises on almost all generations and mixed speakers #99

Open
AkumaNoTsubasa opened this issue Mar 11, 2024 · 0 comments

Comments

@AkumaNoTsubasa

Hello,

I really love that I have finally found a UI for Suno Bark. It makes generating things on the fly much easier; my Python knowledge is so bare-bones that I am happy just to get a line of text spoken. But I have some major issues:

  1. About 80% of all the text I generate has massive background noise or is just noise.
  2. Multiple times, whether I use plain input or SSML with only a single speaker defined, the generation has ended up switching between 2-5 voices.
  3. The chosen model often respects only the language of the premade Suno voices but not the actual chosen speaker. I often get the female voice even though I chose a male one.
  4. The length of the generation is random. It often generates 3-8 seconds of silence at the beginning and sometimes also 3-4 seconds in the middle of a line of text. It seems to try to keep the sound files at 10-15 seconds in length.

I am using an AMD Ryzen 7 5800X 8-Core Processor @ 3.80 GHz and a 3070 Ti.
