-
Notifications
You must be signed in to change notification settings - Fork 232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
2. [MRG] Bilinear similarity #329
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mvargas33 for getting this started! Here are some comments.
You should also ensure the compatibility with pairs, triplets and quadruplet classifiers to make sure we can then build actual bilinear similarity learning algorithms. Probably many existing tests for these will just naturally pass with bilinear similarity, while others may need to be adapted |
@bellet @perimosocordiae @wdevazelhes @terrytangyuan There needs to be a broader discussion regarding adding BilinearSimilarity and Bilinear Similarity Learners such as OASIS. By definition bilinear similarity works in the opposite direction as a metric such as the Mahalanobis. Given two vectors Regarding the API, it means that There are several paths that can be followed to deal with this:
This is also a concern for For example, if we choose 2) and not change The decision made now, might imply many things for the future, moreover if there comes more bilinear learners implementations in the future, or if there are other similarities apart from bilinear and learners for those other similarities. My humble opinion: I believe its more important to keep the global semantic of Ps: By "semantic" I mean the meaning of what the function is intended to do. For instance, the current semantic of |
@mvargas33 I agree with you, IMHO I think it would be good to reuse the code and existing classes/methods as much as possible:
(Note: also regarding the name NB: Probably the
|
I like the idea of Even if this looks like an overkill right now, it guarantees that the code will be scalable and easily extendible in the future. For instance, imagine there are 5 bilinear similarity learners at some point, How this deals with the solution of -1 multiplier + warning? For me that sounds more like a patch. Or what if there's support for data-structure similarities in the future, like tree-similarities or string-similarities. This adds more similarity types that could be easily handled with I also like the idea of renaming |
@mvargas33 That sounds good ! Yes that's right, having two general functions like this would make the code quite generic and extensible |
Hi, thanks @mvargas33 and @wdevazelhes for kicking off the discussion. Here are some comments:
Would be great to hear @perimosocordiae and @terrytangyuan's take on this! |
@bellet Yes I agree the deprecation sounds good, to force users to change their code, and I agree one general method that every learner would implement (like similarity, which respects more the meaning of a "score" indeed) may be enough for having generic code (for us, and for our users), and then a method like
NB: this is also another scikit-learn doc that may be useful: https://scikit-learn.org/stable/modules/metrics.html#metrics |
@wdevazelhes the thing is that a bilinear similarity is not guaranteed to yield a data transformation (and thus be a valid kernel). This is only true when the matrix is PSD, which is often not enforced by the similarity learning algorithms to avoid the computational cost with the projection into the PSD cone. So basically, a bilinear metric learner will not always be a transformer or valid kernel (and in practice, may often not be). We could still implement @mvargas33 at this point of the discussion I think we all agree that it is a good idea to:
So perhaps you can go ahead with the above changes, and we can continue the discussion a bit regarding what to do for bilinear similarity (do we want to have specific things for when the matrix is actually PSD). @terrytangyuan @perimosocordiae your opinion is most welcome! |
3e6adb7
to
68eeda9
Compare
…s B weakly learners. Tests refactor.
@bellet @perimosocordiae @wdevazelhes @terrytangyuan This branch is ready for a code review. It would be cool if can spare some time to look at it. Here's a summary of the changes:
And the summary for the tests:
And after digging deep in the tests these days, the parametrization list mentioned at #136 is already live for most tests. But parametrization for I think that parametrize Best! 😁 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a review for the documentation part.
Generally we need to change a bit the vocabulary to accommodate both Mahalanobis and bilinear similarities. As common in the literature I propose to use the term "metric" as a generic term (encompassing both distances and similarities).
In addition to the changes mentioned below, some changes are needed in the beginning of "What is metric learning?":
Sec 1
- Many approaches in machine learning require a measure of distance (or similarity) between data points.
- choose a standard distance metric --> choose a standard metric
- Distance metric learning (or simply, metric learning) --> Metric learning
- task-specific distance metrics --> task-specific metrics
- The learned distance metric --> the learned metric
Sec 1.1:
- the goal in this setting is to learn a distance metric --> a metric
- the goal is to learn a distance metric --> a metric
- find the parameters of a distance function --> of a metric function
Please do spellcheck to avoid typos!
Yes, tests that are common to Mahalanobis and Bilinear linears (and thus most likely to any metric learner) should not be duplicated and placed in |
All observations were resolved for Will run the spellchecker soon as well as moving the common tests to |
… to avoid code duplication, trade-off with isinstance(). Added SCML basis that got lost in the list.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nitpicks on the doc. One additional thing to update is in introduction.rst, in the part:
Strictly speaking, Mahalanobis distances are "pseudo-metrics": they satisfy
three of theproperties of a metric <https://en.wikipedia.org/wiki/Metric_ (mathematics)>
_ (non-negativity, symmetry, triangle inequality) but not
necessarily the identity of indiscernibles.
For consistency of vocabulary, change "pseudo-metrics" into "pseudo-distances" and "properties of a metric" by "properties of a distance"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some small comments on the implementation of the class, but otherwise looks good! I still need to review the test part, which is the big chunk
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's finally the review of the test part, with important additional comments on components_
name for bilinear similarities.
Could you also verify that the new class BilinearMixin
is included in the doc in the same way as MahalanobisMixin
?
"""" | ||
Tests that pair_score() and get_metric() give consistent results. | ||
In both cases, the results must match for the same input. | ||
Tests it for 'n_pairs' sampled from 'n' d-dimentional arrays. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you should use the backquote ` instead of '
ids=ids_metric_learners_b) | ||
def test_check_correctness_similarity(estimator, build_dataset): | ||
""" | ||
Tests the correctness of the results made from socre_paris(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: socre_paris
assert_array_almost_equal(dists, [96, 120]) | ||
|
||
|
||
# Note: This test needs to be `hardcoded` as the similarity martix must |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: martix
|
||
|
||
# Note: This test needs to be `hardcoded` as the similarity martix must | ||
# be symmetric. Running on all Bilinear learners will throw an error as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bilinear --> bilinear
ids=ids_metric_learners_b) | ||
def test_pair_score_finite(estimator, build_dataset): | ||
""" | ||
Checks for 'n' pair_score() of 'd' dimentions, that all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above about quotes
classifiers_b = [(RandomBilinearLearner(), build_classification), | ||
(IdentityBilinearLearner(), build_classification)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could add those as regressors as well
return self | ||
|
||
|
||
class MockQuadrpletsIdentityBilinearLearner(BilinearMixin, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: Quadrplets
supervised_learners_b = classifiers_b | ||
ids_supervised_learners_b = ids_classifiers_b |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and add them here
metric_learners_pipeline = pairs_learners_m + pairs_learners_b + \ | ||
supervised_learners_m + supervised_learners_b | ||
ids_metric_learners_pipeline = ids_pairs_learners_m + ids_pairs_learners_b +\ | ||
ids_supervised_learners_m + \ | ||
ids_supervised_learners_b |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot the rationale for this list. What is the criterion for inclusion?
My first thought would be that it needs to implement the transform
method. In which case bilinear learners should not be part of this. But at the same time we include pair learners and not others (although they can all implement transform as long as they are Mahalanobis)
Are you able to clarify? And if so, add a comment to explain this explicitly
metric_learn/base_metric.py
Outdated
|
||
Attributes | ||
---------- | ||
components_ : `numpy.ndarray`, shape=(n_components, n_features) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two important comments:
- The shape is incorrect: it will always be (n_features, n_features)
- I am not sure we should name this
components_
because this is typically used to denote a projection matrix used to transform the data (as in PCA, or Mahalanobis learners). I thinkbilinear_matrix_
would be more appropriate. Not sure if having a different name would be at odds with the generality of some tests?
@wdevazelhes @perimosocordiae @terrytangyuan it would be great if one of you can review this PR as well in the next couple of weeks (@mvargas33 will be off for ~2 weeks - so he can finalize the PR once he is back) |
I resolved merge conflicts, but there are still a bunch of review items to address here. |
Hi!
In the aim of implementing OASIS algorithm, I open this PR to discuss the implementation of the BilinearMixin class. Which is a requirement for the algorithm.
I've made some testing but in a file outside the test folder, as I am not familiar with the correct way to test things out. These are basic testing for bilinear similarity with random and handmade toy examples.
As the matrix W in the bilinear similarity is not guaranteed to be symmetric nor PSD, I did not extend from the class MetricTransformer, and did not implemented the
transform()
method.Also, for
score_pairs()
I propose two implementations, and I am not sure which one perms better.Note: along the way this fixes #160