Solving "TypeError: __init__() got an unexpected keyword argument 'max_iter'" #165

J4BEZ commented Dec 14, 2023

Thank you for providing us with an interesting project.

While exploring Riffusion through the Streamlit app, I ran into an issue and found a solution for it, so I came here to share it.

TL;DR

Pin torchaudio to 2.0.x (2.0.2 in requirements.txt, alongside torch 2.0.1).
This is the same solution as #158 and #166.

My apologies to gu-ma: I only noticed gu-ma's pull request after I had finished writing this one.
If this PR is of any help, I would ask you to review gu-ma's PR as well, not just mine, for the good of the open-source ecosystem.
Hope you have a merry Christmas🎄


What is the Issue🤔

TypeError: __init__() got an unexpected keyword argument 'max_iter'

When does it occur⏰

It occurs while running 'Text to Audio' in the Riffusion Playground.
Converting the text into a spectrogram image works fine, but the error is raised when audio is extracted from the generated image.

What is the cause of the issue❔

The cause appears to be an update of torchaudio: the constructor of torchaudio.transforms.InverseMelScale changed between torchaudio 2.0.1 and 2.1.1.
(Image: difference between InverseMelScale in torchaudio 2.0.1 and 2.1.1)

The call at riffusion-inference\riffusion\spectrogram_converter.py, line 87, in __init__
passes the keyword arguments 'max_iter', 'tolerance_loss', 'tolerance_change' and 'sgdargs', which the newer InverseMelScale does not accept:

        # https://pytorch.org/audio/stable/generated/torchaudio.transforms.InverseMelScale.html
        self.inverse_mel_scaler = torchaudio.transforms.InverseMelScale(
            n_stft=params.n_fft // 2 + 1,
            n_mels=params.num_frequencies,
            sample_rate=params.sample_rate,
            f_min=params.min_frequency,
            f_max=params.max_frequency,
            max_iter=params.max_mel_iters,
            tolerance_loss=1e-5,
            tolerance_change=1e-8,
            sgdargs=None,
            norm=params.mel_scale_norm,
            mel_scale=params.mel_scale_type,
        ).to(self.device)
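
To illustrate why this call fails, here is a minimal sketch that reproduces the same kind of TypeError without installing torchaudio. The class `NewStyleInverseMelScale` is a hypothetical stand-in, not the real torchaudio class; it only assumes that the newer constructor no longer declares `max_iter`.

```python
# Hypothetical stand-in for a constructor that no longer declares `max_iter`.
# It is NOT the real torchaudio.transforms.InverseMelScale.
class NewStyleInverseMelScale:
    def __init__(self, n_stft, n_mels=128, sample_rate=16000):
        self.n_stft = n_stft
        self.n_mels = n_mels
        self.sample_rate = sample_rate


message = None
try:
    # Passing a keyword the constructor does not declare raises TypeError.
    NewStyleInverseMelScale(n_stft=257, n_mels=64, max_iter=200)
except TypeError as exc:
    message = str(exc)

print(message)
```

The exact prefix of the message depends on the Python version, but it ends with "got an unexpected keyword argument 'max_iter'", which matches the error reported above.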

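📑 Useful references
< InverseMelScale - Torchaudio 2.0.1 documentation: https://pytorch.org/audio/2.0.1/generated/torchaudio.transforms.InverseMelScale.html?highlight=inversemelscale#torchaudio.transforms.InverseMelScale >
< InverseMelScale - Torchaudio 2.1.1 documentation: https://pytorch.org/audio/2.1.1/generated/torchaudio.transforms.InverseMelScale.html?highlight=inverse#torchaudio.transforms.InverseMelScale >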

How to solve the issue☑

We can solve this issue by installing torchaudio 2.0.x.

  1. Remove the existing torchaudio, torchvision and torch packages.

  2. Revise requirements.txt as below to install PyTorch 2.0.1.

Before:

...
torch
torchaudio
torchvision
...

After:

...
torch==2.0.1
torchaudio==2.0.2
torchvision==0.15.2
...
  3. If you want to use CUDA, install PyTorch and torchaudio with CUDA support before running python -m pip install -r requirements.txt.

Installing PyTorch 2.0.1 is recommended because torchaudio is pinned to 2.0.x.
< PyTorch 2.0.1 Install Guide: https://pytorch.org/get-started/previous-versions/#v201 >

  4. Run python -m pip install -r requirements.txt.
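
After the steps above, the installed versions can be checked against the pins. This is a minimal sketch, assuming the three pins from the revised requirements.txt; the helper names `installed_version` and `find_mismatches` are made up for illustration and are not part of Riffusion.

```python
from importlib import metadata

# Pins taken from the revised requirements.txt above.
PINS = {"torch": "2.0.1", "torchaudio": "2.0.2", "torchvision": "0.15.2"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for package, expected in pins.items():
        found = lookup(package)
        # CUDA builds may carry a local suffix such as "+cu118"; ignore it.
        base = found.split("+")[0] if found else None
        if base != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for package, (expected, found) in find_mismatches(PINS).items():
        print(f"{package}: expected {expected}, found {found}")
```

If the script prints nothing, all three packages match the pinned versions. The `lookup` parameter exists only so the check can be exercised without the packages installed.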

Result:

image
It works very well


Thank you for reading this long issue and for making a great project.
I hope you all have a great and peaceful Christmas season. Merry Christmas🎄

@J4BEZ J4BEZ changed the title Solving "TypeError: __init__() got an unexpected keyword argument 'ma… Solving "TypeError: __init__() got an unexpected keyword argument 'max_iter' Dec 14, 2023

J4BEZ commented Dec 15, 2023

As of now (GMT+9, December 15th, 2023, 13:00), the stable version of torchaudio is 2.1.2, and its torchaudio.transforms.InverseMelScale does not accept the keyword arguments 'max_iter', 'tolerance_loss', 'tolerance_change' or 'sgdargs' either.

InverseMelScale - Torchaudio 2.1.2

< InverseMelScale - Torchaudio 2.1.2 documentation >

Therefore, a simple solution is to pin torchaudio to 2.0.x
until the project is refactored to be compatible with torchaudio 2.1.2.
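
One possible direction for such a refactor, offered only as a sketch, is to drop keyword arguments that the installed constructor does not declare. The helper `supported_kwargs` and the class `NewStyleScaler` are hypothetical names for illustration; `NewStyleScaler` is a stand-in, not the real torchaudio class, and whether the audio quality stays equivalent without the dropped arguments is not verified here.

```python
import inspect


def supported_kwargs(cls, kwargs):
    """Keep only the keyword arguments that cls.__init__ declares."""
    params = inspect.signature(cls.__init__).parameters
    return {name: value for name, value in kwargs.items() if name in params}


# Hypothetical stand-in for a constructor without the removed arguments.
class NewStyleScaler:
    def __init__(self, n_stft, n_mels=128, sample_rate=16000):
        self.n_stft = n_stft
        self.n_mels = n_mels
        self.sample_rate = sample_rate


# Arguments in the style of the call in spectrogram_converter.py
# (the values here are made up for the example).
wanted = dict(
    n_stft=257,
    n_mels=64,
    sample_rate=44100,
    max_iter=200,
    tolerance_loss=1e-5,
    tolerance_change=1e-8,
    sgdargs=None,
)

accepted = supported_kwargs(NewStyleScaler, wanted)
scaler = NewStyleScaler(**accepted)
print(sorted(accepted))
```

With torchaudio installed, the same helper could be pointed at torchaudio.transforms.InverseMelScale instead of the stand-in. Silently dropping arguments changes how the inversion is configured, so pinning the version remains the lower-risk fix in the meantime.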

The stable version of torchaudio is now 2.1.2,
but the structure of the InverseMelScale class used here is the same as in 2.0.1,
so I changed the documentation link for InverseMelScale.
@CyberT33N

I just deleted it related to:

But it would be awesome to pin the versions to fixed numbers in the requirements.txt file.
