Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding KNN scorer explanation #1899

Merged
merged 4 commits into from
Nov 22, 2023
Merged

Adding KNN scorer explanation #1899

merged 4 commits into from
Nov 22, 2023

Conversation

metonymic-smokey
Copy link
Member

This PR adds a score Explanation to the KNN scorer, along with a unit test.
Also, contains a minor refactor of scoring related code.

@abhinavdangeti
Copy link
Member

p.s. the check failures aren't related to this PR.

--- FAIL: TestBytesWritten (0.30s)
Error:     index_test.go:309: expected bytes written is 37767, got 10
--- FAIL: TestBytesRead (0.36s)
Error:     index_test.go:405: expected bytes read for query string 32349, got 18138
--- FAIL: TestBytesReadStored (0.28s)
Error:     index_test.go:584: expected the bytes read stat to be around 25928, got 7999

It's possible @Thejas-bhat is already looking into these. But, if not yet - @metonymic-smokey would you work with Thejas in determining the cause here or the need to update the unit test expectations for bytes read/written with index sections.

@metonymic-smokey metonymic-smokey force-pushed the knn_scorer branch 2 times, most recently from f386d79 to efaeacb Compare November 20, 2023 09:26
@metonymic-smokey metonymic-smokey force-pushed the knn_scorer branch 2 times, most recently from 3b9f58d to 6475b83 Compare November 22, 2023 04:43
@metonymic-smokey
Copy link
Member Author

@abhinavdangeti pls review this PR instead.
Also added a UT for score = 0 with euclidean distance, since you last reviewed.

Sample explanation:
{
  "value": 0.125,
  "message": "weight(desc:query Vector^1.000000 in one), product of:",
  "children": [
    {
      "value": 0.5,
      "message": "queryWeight(desc:query Vector^1.000000), product of:",
      "children": [
        {
          "value": 1,
          "message": "boost"
        },
        {
          "value": 0.5,
          "message": "queryNorm"
        }
      ]
    },
    {
      "value": 0.25,
      "message": "fieldWeight(desc in doc one), score of:",
      "children": [
        {
          "value": 0.25,
          "message": "vector(field(desc:one) with similarity_metric(dot_product)=0.250000"
        }
      ]
    }
  ]
}
@metonymic-smokey
Copy link
Member Author

metonymic-smokey commented Nov 22, 2023

So in the case where we set the max score to a large float if score = 0 for euclidean distance, this is what the explanation ends up looking like:

"value": 1.7976931348623157e+308,
  "message": "fieldWeight(desc in doc one), score of:",
  "children": [
    {
      "value": 1.7976931348623157e+308,
      "message": "vector(field(desc:one) with similarity_metric(l2_norm)=179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000"
    }
  ]
} 

I'm concerned about the large number present in the child message - can't truncate the first part so changed it to use scientific notation everywhere instead, just motivated by this specific case.
Let me know if you have any thoughts, thanks!

}

// if the query weight isn't 1, multiply
if sqs.queryWeight != 1.0 && score != maxKNNScore {
score = score * sqs.queryWeight
if sqs.options.Explain {
childExplanations := make([]*search.Explanation, 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we're double allocating unnecessarily above and below, this should be within an if and else.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@metonymic-smokey address this separate please.

Copy link
Member

@abhinavdangeti abhinavdangeti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

..

@abhinavdangeti abhinavdangeti merged commit 835f042 into unstable Nov 22, 2023
0 of 9 checks passed
@abhinavdangeti abhinavdangeti deleted the knn_scorer branch November 22, 2023 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants