Solving "TypeError: __init__() got an unexpected keyword argument 'max_iter' #165
…x_iter' " Issue

Thank you for providing me with an interesting project.
While exploring Riffusion through the Streamlit app, I discovered an **issue** and a **solution** for it, so I came here to share the solution.

TL;DR
> pin `torchaudio` version to `2.0.1`
> same solution as riffusion#158

---

# What is the Issue🤔

![image](https://github.com/riffusion/riffusion/assets/43560917/bdcb46a6-0915-4888-95b7-4ec97b3db682)

`TypeError: __init__() got an unexpected keyword argument 'max_iter'`

# When does it occur⏰

While running '**Text to Audio**' in the Riffusion Playground.
Converting text into a spectrogram image works very well, but the step that **extracts audio from the converted image** fails.

# What is the cause of the issue

The cause is an update of torchaudio in PyTorch. It is believed to be a structural change in **torchaudio.transforms.InverseMelScale**.

![difference between InverseMelScale in torchaudio 2.0.1 and 2.1.1](https://github.com/riffusion/riffusion/assets/43560917/89154faf-5a50-4b8a-af35-0aca40722022)

`riffusion-inference\riffusion\spectrogram_converter.py", line 87, in __init__` passes keyword arguments named '**max_iter**', '**tolerance_loss**', '**tolerance_change**', '**sgdargs**'

```
# https://pytorch.org/audio/stable/generated/torchaudio.transforms.InverseMelScale.html
self.inverse_mel_scaler = torchaudio.transforms.InverseMelScale(
    n_stft=params.n_fft // 2 + 1,
    n_mels=params.num_frequencies,
    sample_rate=params.sample_rate,
    f_min=params.min_frequency,
    f_max=params.max_frequency,
    max_iter=params.max_mel_iters,
    tolerance_loss=1e-5,
    tolerance_change=1e-8,
    sgdargs=None,
    norm=params.mel_scale_norm,
    mel_scale=params.mel_scale_type,
).to(self.device)
```

> 📑 A good article to refer to
> < [InverseMelScale - Torchaudio 2.0.1 documentation](https://pytorch.org/audio/2.0.1/generated/torchaudio.transforms.InverseMelScale.html?highlight=inversemelscale#torchaudio.transforms.InverseMelScale)
>
> < [InverseMelScale - Torchaudio 2.1.1 documentation](https://pytorch.org/audio/2.1.1/generated/torchaudio.transforms.InverseMelScale.html?highlight=inverse#torchaudio.transforms.InverseMelScale)
>

# How to solve the issue☑

We can solve this issue by installing torchaudio v2.0.x.

1. Please remove the existing torchaudio, torchvision and torch.
2. Please revise `requirements.txt` as below to install PyTorch 2.0.1.

   Before:
   ```
   ...
   torch
   torchaudio
   torchvision
   ...
   ```
   After:
   ```
   ...
   torch==2.0.1
   torchaudio==2.0.2
   torchvision==0.15.2
   ...
   ```
3. If you want to use CUDA, install PyTorch and torchaudio with CUDA support before running `python -m pip install -r requirements.txt`.
   > It is recommended to install PyTorch 2.0.1, because we pinned the version of torchaudio to 2.0.x.
   > < [PyTorch 2.0.1 Install Guide](https://pytorch.org/get-started/previous-versions/#v201)
   >
4. Run `python -m pip install -r requirements.txt`.

# Result:

It works very well

---

Thank you for reading this long issue.
I hope you all have a great and peaceful Christmas season. Merry Christmas🎄
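To make the cause described above concrete, here is a minimal sketch of why a call that passes `max_iter` fails once the constructor stops accepting it. This is not the fix proposed in this PR (the PR pins versions instead). `OldStyleScale`, `NewStyleScale` and `filter_supported_kwargs` are hypothetical names used only for illustration; they are stand-ins, not torchaudio code.

```python
import inspect


def filter_supported_kwargs(factory, kwargs):
    """Return only the entries of `kwargs` that `factory` accepts by name."""
    params = inspect.signature(factory).parameters
    # A callable that declares **kwargs accepts every keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {name: value for name, value in kwargs.items() if name in params}


# Hypothetical stand-ins for an older and a newer constructor signature.
class OldStyleScale:
    def __init__(self, n_stft, n_mels=128, max_iter=100000, tolerance_loss=1e-5):
        self.max_iter = max_iter


class NewStyleScale:
    def __init__(self, n_stft, n_mels=128):
        self.n_stft = n_stft


requested = {"n_stft": 201, "n_mels": 64, "max_iter": 200, "tolerance_loss": 1e-5}

# The older signature accepts everything; the newer one loses two keywords.
old_kwargs = filter_supported_kwargs(OldStyleScale, requested)
new_kwargs = filter_supported_kwargs(NewStyleScale, requested)

OldStyleScale(**old_kwargs)
NewStyleScale(**new_kwargs)  # passing `requested` directly would raise TypeError
```

Silently dropping keywords changes behaviour (the iteration limit is no longer applied), which is one reason pinning the versions, as this PR does, is the more conservative choice.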
As of now (GMT+9, December 15th, 2023, 13:00), the stable version of torchaudio is 2.1.2, and its InverseMelScale also doesn't have these keyword arguments (see < InverseMelScale - Torchaudio 2.1.2 documentation >). Therefore, pinning the torchaudio version to 2.0.1 could be a simple solution.
Now the stable version of torchaudio is 2.1.2, but the structure of the InverseMelScale class is the same as in 2.1.1, so I changed the link to the InverseMelScale documentation.
I just deleted it related to: But it would be awesome to set the versions to fixed numbers in the requirements.txt file.
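As a rough way to confirm that an environment actually ended up on the pinned release line, a check along the following lines could be run after installation. This is only a sketch using the standard library; `is_pinned_series` and `check_installed` are hypothetical helper names, and the `"2.0"` default simply mirrors the 2.0.x pin discussed in this PR.

```python
import re
from importlib import metadata


def is_pinned_series(version, series="2.0"):
    """True if `version` (e.g. '2.0.2+cu118') belongs to the `series` release line."""
    # Drop a local build suffix such as '+cu118' before comparing.
    public = version.split("+", 1)[0]
    match = re.match(r"(\d+)\.(\d+)", public)
    if match is None:
        return False
    return f"{match.group(1)}.{match.group(2)}" == series


def check_installed(package="torchaudio", series="2.0"):
    """Return (installed_version, ok); the version is None if the package is missing."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None, False
    return installed, is_pinned_series(installed, series)


if __name__ == "__main__":
    # Reports the installed torchaudio version and whether it is on the 2.0.x line.
    print(check_installed())
```

Comparing only the major and minor components keeps CUDA-specific builds (with a `+cu...` suffix) from being treated as a different release.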