As discussed with @bellet, it would be useful to have a sort of TupleTransformer object that would take a regular scikit-learn Transformer as an __init__ argument (so it would be a MetaEstimator), and that would fit/transform on tuples using the given Transformer (instead of on the dataset of points),
i.e. it would deduplicate the points inside the tuples, fit the transformer on the resulting dataset, and be able to transform the tuples. This would allow using it in a pipeline like:
It could also be useful in some cases to have a way to use metric learning algorithms to transform tuples, for instance through a transform_tuples method.
There may be other options too; this issue is meant to discuss them.
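To make the transform_tuples idea concrete, here is a rough sketch of one way it could look. Everything in it is an assumption made for illustration: TransformTuplesMixin and ToyLinearMetric are hypothetical names, tuples are assumed to be a 3D array of shape (n_tuples, tuple_size, n_features), and the toy learner simply applies a fixed linear map in place of a real fitted metric learner.

```python
import numpy as np


class TransformTuplesMixin:
    """Hypothetical mixin: adds transform_tuples to any object exposing transform(X)."""

    def transform_tuples(self, tuples):
        tuples = np.asarray(tuples)
        n_tuples, tuple_size, n_features = tuples.shape
        # Flatten the tuples into a 2D array of points, embed them, then
        # restore the (n_tuples, tuple_size, n_embedded_features) layout.
        embedded = self.transform(tuples.reshape(-1, n_features))
        return embedded.reshape(n_tuples, tuple_size, -1)


class ToyLinearMetric(TransformTuplesMixin):
    """Stand-in for a fitted metric learner (assumption): a fixed linear map L."""

    def __init__(self, L):
        self.L = np.asarray(L)

    def transform(self, X):
        # Map each point x to L x.
        return np.asarray(X) @ self.L.T


learner = ToyLinearMetric(L=[[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
pairs = np.arange(12, dtype=float).reshape(2, 2, 3)  # 2 pairs of 3D points
embedded_pairs = learner.transform_tuples(pairs)
print(embedded_pairs.shape)  # (2, 2, 2)
```

The mixin only relies on the existing transform method, so under these assumptions it would not need to know anything about how the metric was learned.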
Can you explain a little more about the inputs and outputs of the TupleTransformer? It seems like it would need access to the label information at some point, but I'm not familiar enough with the MetaEstimator API to see how that would work.
I think the TupleTransformer would simply take tuples as input, internally turn them into a plain unlabeled dataset X (by collecting all points involved in the tuples), and feed this as input to whatever regular unsupervised transformer was given at init?
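A minimal sketch of that behavior, under stated assumptions: tuples are passed as a 3D array of shape (n_tuples, tuple_size, n_features), duplicate points are dropped with np.unique, and the class below is only an illustration of the proposal, not an existing API. PCA is used as the example unsupervised transformer.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.decomposition import PCA


class TupleTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical meta-estimator wrapping an unsupervised transformer."""

    def __init__(self, transformer):
        self.transformer = transformer

    def fit(self, tuples, y=None):
        tuples = np.asarray(tuples)
        n_features = tuples.shape[2]
        # Collect all points involved in the tuples and drop duplicates.
        points = np.unique(tuples.reshape(-1, n_features), axis=0)
        # Fit a clone so the transformer given at init is left untouched.
        self.transformer_ = clone(self.transformer).fit(points)
        return self

    def transform(self, tuples):
        tuples = np.asarray(tuples)
        n_tuples, tuple_size, n_features = tuples.shape
        flat = self.transformer_.transform(tuples.reshape(-1, n_features))
        # Restore the tuple structure around the transformed points.
        return flat.reshape(n_tuples, tuple_size, -1)


rng = np.random.RandomState(0)
X = rng.randn(10, 5)                         # 10 points in 5 dimensions
pairs = X[rng.randint(0, 10, size=(20, 2))]  # 20 pairs, shape (20, 2, 5)
reduced = TupleTransformer(PCA(n_components=2)).fit_transform(pairs)
print(reduced.shape)  # (20, 2, 2)
```

The labels argument y is accepted only for pipeline compatibility and is ignored here, which matches the restriction to unsupervised transformers.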
We won't be able to use any label information (e.g., similar/dissimilar labels for pairs), since such labels are not defined at the level of individual points. So only unsupervised transformers should be allowed (e.g., PCA, but not LDA).