feat: Support Int8Vector in go #38990

Merged

Contributor

@cydrain cydrain commented Jan 3, 2025

Issue: #38666

@sre-ci-robot sre-ci-robot added the size/XXL, area/compilation, area/dependency, area/test, and sig/testing labels Jan 3, 2025
Contributor

mergify bot commented Jan 3, 2025

@cydrain

Invalid PR Title Format Detected

Your PR submission does not adhere to our required standards. To ensure clarity and consistency, please meet the following criteria:

  1. Title Format: The PR title must begin with one of these prefixes:
  • feat: for introducing a new feature.
  • fix: for bug fixes.
  • enhance: for improvements to existing functionality.
  • test: for adding tests to existing functionality.
  • doc: for modifying documentation.
  • auto: for pull requests opened by a bot.
  2. Description Requirement: The PR must include a non-empty description, detailing the changes and their impact.

Required Title Structure:

[Type]: [Description of the PR]

Where Type is one of feat, fix, enhance, test or doc.

Example:

enhance: improve search performance significantly 

Please review and update your PR to comply with these guidelines.

@cydrain cydrain changed the title from "feature: Support Int8_Vector" to "feat: Support Int8_Vector" Jan 3, 2025
@mergify mergify bot added the kind/feature label and removed the do-not-merge/invalid-pr-format label Jan 3, 2025
Contributor

mergify bot commented Jan 3, 2025

@cydrain Please reference the related issue in the body of your pull request (e.g. “issue: #”).

Contributor

mergify bot commented Jan 3, 2025

@cydrain E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Contributor

mergify bot commented Jan 3, 2025

@cydrain go-sdk check failed, comment rerun go-sdk can trigger the job again.


codecov bot commented Jan 3, 2025

Codecov Report

Attention: Patch coverage is 54.90196% with 230 lines in your changes missing coverage. Please review.

Project coverage is 81.07%. Comparing base (84f8047) to head (99e7d7f).
Report is 49 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/storage/insert_data.go | 45.45% | 29 Missing and 1 partial ⚠️ |
| internal/util/testutil/test_util.go | 0.00% | 30 Missing ⚠️ |
| pkg/util/testutils/gen_data.go | 18.91% | 30 Missing ⚠️ |
| internal/core/src/indexbuilder/index_c.cpp | 0.00% | 18 Missing ⚠️ |
| internal/proxy/task_index.go | 7.69% | 11 Missing and 1 partial ⚠️ |
| internal/storage/payload_reader.go | 55.55% | 8 Missing and 4 partials ⚠️ |
| internal/storage/payload_writer.go | 61.29% | 10 Missing and 2 partials ⚠️ |
| pkg/util/funcutil/placeholdergroup.go | 36.84% | 12 Missing ⚠️ |
| internal/proxy/validate_util.go | 64.51% | 11 Missing ⚠️ |
| internal/util/typeutil/storage.go | 0.00% | 11 Missing ⚠️ |
... and 13 more
Additional details and impacted files

@@             Coverage Diff             @@
##           master   #38990       +/-   ##
===========================================
+ Coverage   69.53%   81.07%   +11.53%     
===========================================
  Files         296     1395     +1099     
  Lines       26604   197978   +171374     
===========================================
+ Hits        18499   160507   +142008     
- Misses       8105    31828    +23723     
- Partials        0     5643     +5643     
| Components | Coverage Δ |
|---|---|
| Client | 79.50% <0.00%> (∅) |
| Core | 69.61% <20.00%> (+0.07%) ⬆️ |
| Go | 82.99% <57.71%> (∅) |

| Files with missing lines | Coverage Δ |
|---|---|
| internal/core/src/index/Utils.cpp | 40.90% <ø> (ø) |
| internal/core/src/index/VectorMemIndex.cpp | 64.80% <ø> (ø) |
| internal/core/src/indexbuilder/IndexFactory.h | 96.15% <100.00%> (+0.15%) ⬆️ |
| internal/util/indexcgowrapper/dataset.go | 87.28% <100.00%> (ø) |
| internal/util/indexcgowrapper/index.go | 61.61% <100.00%> (ø) |
| pkg/util/paramtable/autoindex_param.go | 88.50% <100.00%> (ø) |
| pkg/util/typeutil/convension.go | 100.00% <100.00%> (ø) |
| pkg/util/typeutil/gen_empty_field_data.go | 91.90% <100.00%> (ø) |
| internal/core/src/index/Utils.h | 80.00% <50.00%> (-8.89%) ⬇️ |
| internal/util/clustering/clustering.go | 63.33% <0.00%> (ø) |
... and 21 more

... and 1091 files with indirect coverage changes

@cydrain cydrain force-pushed the caiyd_38666_support_int8_vector branch 2 times, most recently from e411dfc to f617d0a Compare January 6, 2025 03:08
@cydrain cydrain changed the title from "feat: Support Int8_Vector" to "feat: Support Int8Vector in go" Jan 6, 2025
Contributor

mergify bot commented Jan 6, 2025

@cydrain go-sdk check failed, comment rerun go-sdk can trigger the job again.

@cydrain cydrain force-pushed the caiyd_38666_support_int8_vector branch from f617d0a to 41fcd1b Compare January 6, 2025 09:19
Contributor

mergify bot commented Jan 6, 2025

@cydrain go-sdk check failed, comment rerun go-sdk can trigger the job again.

@cydrain cydrain force-pushed the caiyd_38666_support_int8_vector branch from 41fcd1b to 58a24f8 Compare January 7, 2025 06:37
Contributor

mergify bot commented Jan 7, 2025

@cydrain cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Contributor

mergify bot commented Jan 7, 2025

@cydrain go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Jan 7, 2025

@cydrain E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@cydrain cydrain force-pushed the caiyd_38666_support_int8_vector branch from 58a24f8 to 84c8e46 Compare January 7, 2025 09:07
@mergify mergify bot added the ci-passed label Jan 7, 2025
Member

@liliu-z liliu-z left a comment

Quick question: once this is checked in, can users create an int8 index? I guess yes, via the RESTful API?
If so, is it already supported in Knowhere?

}

inline bool
IsIntVectorMetricType(const MetricType& metric_type) {
Member

It is a bit odd to have this function.
Let's change the metric checks to distinguish between vector and binary vector instead.

Contributor Author

@cydrain cydrain Jan 10, 2025

I don't quite follow your point.
FloatVector and Int8Vector support different metric types, so I use separate APIs to handle them.
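
Editorial sketch of the point above: if each vector type accepts its own set of metric types, validation can be keyed by the vector type. This is a minimal Go illustration, not code from the PR. The `VectorType` enum, the `checkMetric` helper, and in particular the metric sets listed per type are assumptions made for the example; the actual sets are defined by the PR and by Knowhere.

```go
package main

import "fmt"

// VectorType is a hypothetical stand-in for the schema's vector data types.
type VectorType int

const (
	FloatVector VectorType = iota
	Int8Vector
)

// supportedMetrics is an ASSUMED mapping used only for illustration.
// The real per-type metric sets are not stated in this conversation.
var supportedMetrics = map[VectorType][]string{
	FloatVector: {"L2", "IP", "COSINE", "BM25"},
	Int8Vector:  {"L2", "IP", "COSINE"},
}

// checkMetric returns nil when metric is allowed for the given vector type.
func checkMetric(vt VectorType, metric string) error {
	allowed, ok := supportedMetrics[vt]
	if !ok {
		return fmt.Errorf("unknown vector type %d", vt)
	}
	for _, m := range allowed {
		if m == metric {
			return nil
		}
	}
	return fmt.Errorf("metric %q is not supported for vector type %d", metric, vt)
}

func main() {
	fmt.Println(checkMetric(FloatVector, "BM25"))
	fmt.Println(checkMetric(Int8Vector, "BM25"))
}
```

Keeping one lookup per vector type, as the author describes, means a new type only needs a new entry rather than changes to an existing check.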

@@ -450,6 +450,29 @@ BuildSparseFloatVecIndex(CIndex index,
return status;
}

CStatus
BuildInt8VecIndex(CIndex index, int64_t int8_value_num, const int8_t* vectors) {
Member

I know we are just following the existing pattern, but it would be great if we could merge these Build functions, since they differ only slightly.
This is a non-blocking comment.

Contributor Author

@cydrain cydrain Jan 10, 2025

That's because this is C code; we cannot use templates here, since they are only supported in C++.
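
Editorial sketch for the Go side of this call: the `BuildInt8VecIndex` signature quoted above takes a flat `const int8_t*` buffer plus a count named `int8_value_num`. Assuming that count is the total number of int8 values (rows times dimension), a caller would first flatten its vectors into one contiguous slice. The helper name `flattenInt8` is hypothetical, and this is not the PR's wrapper code.

```go
package main

import "fmt"

// flattenInt8 concatenates equal-length int8 vectors into one contiguous
// buffer, rejecting any row whose length differs from dim.
func flattenInt8(vectors [][]int8, dim int) ([]int8, error) {
	if dim <= 0 {
		return nil, fmt.Errorf("dim must be positive, got %d", dim)
	}
	flat := make([]int8, 0, len(vectors)*dim)
	for i, v := range vectors {
		if len(v) != dim {
			return nil, fmt.Errorf("vector %d has length %d, want %d", i, len(v), dim)
		}
		flat = append(flat, v...)
	}
	return flat, nil
}

func main() {
	flat, err := flattenInt8([][]int8{{1, -2, 3}, {4, 5, -6}}, 3)
	if err != nil {
		panic(err)
	}
	// len(flat) is the value count a C-style builder would receive
	// alongside a pointer to the first element (assumption, see above).
	fmt.Println(len(flat), flat)
}
```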

Comment on lines 259 to 268
} else {
// override float/int vector index params by autoindex
for k, v := range Params.AutoIndexConfig.IndexParams.GetAsJSONMap() {
indexParamsMap[k] = v
}
Member

I suggest making the condition explicit here to match the style of the other branches.
Also, return an error when execution falls through to the else block.

Contributor Author

updated
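
Editorial sketch of the review suggestion in this thread: name every supported case explicitly and return an error from the final else, instead of letting the else silently handle "everything remaining". The `vectorKind` enum, the `autoIndexParams` struct, and `overrideWithAutoIndex` are hypothetical names for illustration; the branches that precede the quoted else block are not shown in this conversation, so the cases below are assumptions.

```go
package main

import "fmt"

// vectorKind is a hypothetical classification of vector field types.
type vectorKind int

const (
	kindSparseFloat vectorKind = iota
	kindBinary
	kindFloat
	kindInt8
)

// autoIndexParams holds assumed per-kind autoindex settings.
type autoIndexParams struct {
	sparse, binary, dense map[string]string
}

// overrideWithAutoIndex copies the autoindex params for kind into dst.
// Every supported kind is matched explicitly; anything else is an error.
func overrideWithAutoIndex(dst map[string]string, kind vectorKind, cfg autoIndexParams) error {
	var src map[string]string
	if kind == kindSparseFloat {
		src = cfg.sparse
	} else if kind == kindBinary {
		src = cfg.binary
	} else if kind == kindFloat || kind == kindInt8 {
		// float/int vector index params share one autoindex config here
		src = cfg.dense
	} else {
		return fmt.Errorf("unsupported vector kind %d", kind)
	}
	for k, v := range src {
		dst[k] = v
	}
	return nil
}

func main() {
	cfg := autoIndexParams{dense: map[string]string{"index_type": "EXAMPLE"}}
	params := map[string]string{}
	fmt.Println(overrideWithAutoIndex(params, kindInt8, cfg), params)
	fmt.Println(overrideWithAutoIndex(params, vectorKind(42), cfg))
}
```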

Comment on lines 263 to 264
VectorDefaultMetricType = metric.COSINE
SparseFloatVectorDefaultMetricType = metric.IP
Member

I know you are trying to find a common name covering both int and float vectors. I think DenseNumericVector is not a bad choice. You can also apply it everywhere else.

Contributor Author

Changed to use FloatVectorDefaultMetricType and IntVectorDefaultMetricType separately.
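
Editorial sketch of the resolution above: one default-metric constant per vector family, selected by field type. The constant names `FloatVectorDefaultMetricType`, `IntVectorDefaultMetricType`, and `SparseFloatVectorDefaultMetricType` come from this thread. Their values here are assumptions: COSINE for float is inferred from the quoted `VectorDefaultMetricType`, IP for sparse is quoted, and the int8 default is a guess. The string field-type names and `defaultMetricFor` are hypothetical.

```go
package main

import "fmt"

const (
	FloatVectorDefaultMetricType       = "COSINE" // inferred from the quoted VectorDefaultMetricType
	IntVectorDefaultMetricType         = "COSINE" // ASSUMPTION: value not stated in the thread
	SparseFloatVectorDefaultMetricType = "IP"     // as quoted in the review snippet
)

// defaultMetricFor returns the default metric for a field type name.
func defaultMetricFor(fieldType string) (string, error) {
	switch fieldType {
	case "FloatVector":
		return FloatVectorDefaultMetricType, nil
	case "Int8Vector":
		return IntVectorDefaultMetricType, nil
	case "SparseFloatVector":
		return SparseFloatVectorDefaultMetricType, nil
	default:
		return "", fmt.Errorf("no default metric for field type %q", fieldType)
	}
}

func main() {
	for _, ft := range []string{"FloatVector", "Int8Vector", "SparseFloatVector", "VarChar"} {
		m, err := defaultMetricFor(ft)
		fmt.Println(ft, m, err)
	}
}
```

Separate constants let the int8 default diverge from the float default later without touching callers, which is the trade-off against a single shared name such as DenseNumericVector.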

@cydrain cydrain force-pushed the caiyd_38666_support_int8_vector branch from 84c8e46 to 99e7d7f Compare January 13, 2025 11:34
@mergify mergify bot removed the ci-passed label Jan 13, 2025
Contributor

mergify bot commented Jan 13, 2025

@cydrain E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Contributor Author

cydrain commented Jan 13, 2025

/run-cpu-e2e

Contributor

mergify bot commented Jan 13, 2025

@cydrain E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Contributor Author

cydrain commented Jan 14, 2025

/run-cpu-e2e

Contributor Author

cydrain commented Jan 14, 2025

/run-cpu-e2e

@mergify mergify bot added the ci-passed label Jan 14, 2025
Member

@liliu-z liliu-z left a comment

Checking this in to unblock the follow-up work.

/approve
/lgtm

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cydrain, liliu-z

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot sre-ci-robot merged commit 5bf1b2b into milvus-io:master Jan 14, 2025
19 of 20 checks passed
@cydrain cydrain deleted the caiyd_38666_support_int8_vector branch January 15, 2025 02:26
gifi-siby pushed a commit to gifi-siby/milvus that referenced this pull request Jan 16, 2025
Labels
approved, area/compilation, area/dependency, area/test, ci-passed, dco-passed, kind/feature, lgtm, sig/testing, size/XXL

3 participants