Add __sklearn_tags__ method to all transformers to be compatible with sklearn pipeline #831
Comments
Just ran into this issue as well. A temporary fix that worked for me was to downgrade scikit-learn to version 1.5.2:
pip install "scikit-learn==1.5.2" |
sklearn v1.6 also works. It seems like __sklearn_tags__ was introduced in
that version, but the method was not yet made a requirement.
|
Mh... I am not sure about it. The following code does not work for me with … I have not checked with other feature-engine transformers. Perhaps …
from feature_engine.encoding import WoEEncoder
import numpy as np
import pandas as pd
X = np.array([["dog"] * 20 + ["cat"] * 30 + ["snake"] * 38], dtype=object).T
y = [0] * 15 + [1] * 5 + [0] * 15 + [1] * 15 + [0] * 20 + [1] * 18
X = pd.DataFrame({"col": X[:, 0]})
y = pd.Series(y, name="y")
X = X.sample(frac=1, random_state=42).reset_index(drop=True)
enc = WoEEncoder()
enc.fit(X, y).transform(X) |
Downgrading doesn't fix it for me, for some reason. Please fix! |
Hi all, thanks for raising the issue. It is the typical end-of-year scikit-learn release, with breaking changes that inevitably break feature-engine and give me a lot of headaches. I am on it! |
Hey @solegalli, I was looking into the issue as well, let me know if you need some help. From what I understand, the changes needed to support this new version are pretty major, in essence:
It might be interesting to see how other similar libraries like scikit-lego solved the issue (koaning/scikit-lego#726) |
Hi @ClaudioSalvatoreArcidiacono Thank you! That's very useful. I already started changing the inheritance order and adding the sklearn tags to the classes in #833.
Our transformers fail many of the tests because of the design of feature-engine. So, the main task is to pass the tests correctly to check_estimator and parametrize_with_checks. sklearn does not permit checks in the init, but we do. I figured that it's more intuitive to fail directly when setting up the class than after fit. Not sure that was the correct decision, but we've made it, and for now, sticking to it is the simplest. So that test is known to fail.
I've updated up to the encoding module. If you want to add to the PR by updating some of the remaining modules, please do. I am going from top to bottom, so if you give it a start, maybe go from bottom to top so we don't duplicate work. Cheers! |
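For readers wondering why the inheritance order comes up at all, here is a minimal sketch of the general Python mechanism, assuming the tags are assembled cooperatively through super() calls. The classes Base, Mixin, MixinFirst and BaseFirst are hypothetical stand-ins written for this illustration; they are not scikit-learn or feature-engine classes, and the real tag objects are richer than a dict.

```python
class Base:
    """Hypothetical stand-in for a base estimator: returns default tags."""

    def tags(self):
        # The base class ends the chain; it does not call super().
        return {"kind": None}


class Mixin:
    """Hypothetical stand-in for a mixin that refines the default tags."""

    def tags(self):
        # Delegate to the next class in the MRO, then adjust the result.
        t = super().tags()
        t["kind"] = "transformer"
        return t


class MixinFirst(Mixin, Base):
    """Mixin listed before the base: Mixin.tags runs and wraps Base.tags."""


class BaseFirst(Base, Mixin):
    """Base listed first: Base.tags runs alone and Mixin.tags is never reached."""


print(MixinFirst().tags())
print(BaseFirst().tags())
```

With the mixin listed first, its refinement is applied on top of the base defaults; with the base listed first, the refinement is silently skipped.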
I forgot to mention that some of the tags in more_tags are for our own tests, so we can't implement this function:
because some tags are not relevant for sklearn. |
I am aware of that, which is why I added the statement if hasattr(tags, key): so that a tag which is not an attribute of the tags object is not assigned. This should cover the feature-engine specific tags. |
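The hasattr guard described in the comment above could look roughly like the following sketch. DemoTags, apply_known_tags and the "variables" key are hypothetical names made up for this example; DemoTags is a simplified stand-in and not scikit-learn's actual tags class, and the function elided earlier in the thread may differ in detail.

```python
from dataclasses import dataclass


@dataclass
class DemoTags:
    """Hypothetical stand-in for the tags object (not scikit-learn's class)."""

    requires_fit: bool = True
    non_deterministic: bool = False


def apply_known_tags(tags, more_tags):
    """Copy entries from more_tags onto tags, skipping unknown keys."""
    for key, value in more_tags.items():
        # Only assign tags that the tags object already defines, so
        # library-specific entries are ignored rather than attached.
        if hasattr(tags, key):
            setattr(tags, key, value)
    return tags


# "variables" plays the role of a library-specific tag (hypothetical).
legacy_tags = {"non_deterministic": True, "variables": "numerical"}
tags = apply_known_tags(DemoTags(), legacy_tags)
print(tags)
```

The known tag is overwritten, the default for requires_fit is kept, and the library-specific key never reaches the tags object.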
Describe the bug
In scikit-learn v1.7, all transformers/estimators used in a pipeline must have a __sklearn_tags__ method. __sklearn_tags__ is used to provide metadata about the estimator, such as whether it supports multi-output, requires fitted parameters, etc.
For more information, go to this page of the scikit-learn docs and search for "sklearn_tags". It's located under the "Estimator Tags" section.
To Reproduce
Use any feature-engine transformer in a sklearn pipeline.
Expected behavior
Proper execution of sklearn pipeline that uses feature-engine transformers.
Screenshots
Here's the warning that's raised: