[Bug]: can't build index GPU_CAGRA #38650
Comments
GPU_CAGRA with L2 metric works fine either
Release 2.5.0 failed too.
/assign @Presburger
@Dong148 Hi, CAGRA supports IP. Could you tell me the total amount of data? You can set cache_dataset_on_device to false, which will increase the memory usage.
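For reference, the suggestion above could be tried with index parameters like the following. This is only a sketch: the helper name `cagra_index_params` is made up for illustration, and the other values are copied from the parameters reported in this issue.

```python
def cagra_index_params(cache_dataset_on_device: bool) -> dict:
    """Build GPU_CAGRA index parameters mirroring the values reported in this issue.

    Hypothetical helper, not part of any Milvus SDK.
    """
    return {
        "index_type": "GPU_CAGRA",
        "metric_type": "IP",
        "params": {
            "intermediate_graph_degree": 64,
            "graph_degree": 32,
            # The issue passes this flag as the string "true"/"false",
            # so the same string form is kept here.
            "cache_dataset_on_device": "true" if cache_dataset_on_device else "false",
        },
    }


# Same parameters as in the report, but with dataset caching turned off.
index_params = cagra_index_params(cache_dataset_on_device=False)
print(index_params["params"]["cache_dataset_on_device"])  # prints: false
```

With pymilvus, a dict like this would typically be passed as `collection.create_index(field_name="vector", index_params=index_params)`, where `vector` is the field name that appears in the log.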
1M entries with 4096-dim vectors only.
@Dong148 Have there been any changes to the GPU configuration corresponding to milvus.yml? For 1 million data points with 4096 dimensions, your 4 GPUs with approximately 80GB of RAM should generally be sufficient to handle it.
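A rough back-of-the-envelope check of the data size discussed above. It assumes float32 vectors (4 bytes per value) and 4-byte neighbor IDs in the final graph; both are assumptions, and the real GPU footprint during the build is likely higher because of intermediate structures.

```python
GIB = 1024 ** 3


def raw_vector_bytes(num_rows: int, dim: int, bytes_per_value: int = 4) -> int:
    """Size of the raw vectors, assuming float32 values by default."""
    return num_rows * dim * bytes_per_value


def graph_bytes(num_rows: int, degree: int, bytes_per_id: int = 4) -> int:
    """Size of a fixed-degree neighbor graph, assuming 4-byte neighbor IDs."""
    return num_rows * degree * bytes_per_id


vectors = raw_vector_bytes(1_000_000, 4096)  # 1M rows, 4096 dims
graph = graph_bytes(1_000_000, 32)           # graph_degree = 32

print(f"raw vectors: {vectors / GIB:.2f} GiB")  # raw vectors: 15.26 GiB
print(f"graph:       {graph / GIB:.2f} GiB")    # graph:       0.12 GiB
```

Under these assumptions the raw dataset is about 15 GiB in total, before any build-time overhead.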
Is there an existing issue for this?
Environment
Current Behavior
When creating a GPU_CAGRA index on a 4096-dim float vector field, the Python SDK call keeps blocking, and the Milvus log shows this failure:
failed to build index, raft inner error ...... VectorMemIndex.cpp:276: segcore error[segcoreCode=2004]
Expected Behavior
GPU_IVF_FLAT works fine
Steps To Reproduce
No response
Milvus Log
[2024/12/23 03:36:06.395 +00:00] [INFO] [indexnode/task_index.go:307] ["debug create index"] [clusterID=by-dev] [buildID=454799776879359244] [collection=454799776879358794] [segmentID=454799776879358832] [currentIndexVersion=6] [buildIndexParams="clusterID:"by-dev" buildID:454799776879359244 collectionID:454799776879358794 partitionID:454799776879358795 segmentID:454799776879358832 index_version:136 current_index_version:6 num_rows:1024 dim:4096 index_file_prefix:"files/index_files" insert_files:"files/insert_log/454799776879358794/454799776879358795/454799776879358832/101/454799776879358836" field_schema:{fieldID:101 name:"vector" data_type:FloatVector type_params:{key:"dim" value:"4096"}} storage_config:{address:"minio:9000" access_keyID:"minioadmin" secret_access_key:"minioadmin" bucket_name:"a-bucket" root_path:"files" storage_type:"remote" cloud_provider:"aws" request_timeout_ms:10000 sslCACert:"/path/to/public.crt"} index_params:{key:"dim" value:"4096"} index_params:{key:"cache_dataset_on_device" value:"true"} index_params:{key:"index_type" value:"GPU_CAGRA"} index_params:{key:"build_dram_budget_gb" value:"124.059021"} index_params:{key:"num_build_thread" value:"80"} index_params:{key:"metric_type" value:"IP"} index_params:{key:"intermediate_graph_degree" value:"64"} index_params:{key:"graph_degree" value:"32"} index_params:{key:"vec_field_size_gb" value:"0.000000"} type_params:{key:"dim" value:"4096"}"]
[2024/12/23 03:36:06.396 +00:00] [INFO] [datacoord/task_index.go:316] ["query task index info successfully"] [taskID=454799776879359244] ["result state"=InProgress] [failReason=]
[2024/12/23 03:36:06.493 +00:00] [WARN] [indexcgowrapper/helper.go:71] ["failed to create index, C Runtime Exception: => failed to build index, raft inner error at /workspace/source/internal/core/src/index/VectorMemIndex.cpp:276\n\n"]
[2024/12/23 03:36:06.494 +00:00] [WARN] [indexnode/task_index.go:314] ["failed to build index"] [clusterID=by-dev] [buildID=454799776879359244] [collection=454799776879358794] [segmentID=454799776879358832] [currentIndexVersion=6] [error="failed to create index, C Runtime Exception: => failed to build index, raft inner error at /workspace/source/internal/core/src/index/VectorMemIndex.cpp:276\n\n: segcore error[segcoreCode=2004]"] [errorVerbose="failed to create index, C Runtime Exception: => failed to build index, raft inner error at /workspace/source/internal/core/src/index/VectorMemIndex.cpp:276: segcore error[segcoreCode=2004]\n(1) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/pkg/util/merr.WrapErrSegcore\n | \t/workspace/source/pkg/util/merr/utils.go:1006\n | github.com/milvus-io/milvus/internal/util/indexcgowrapper.HandleCStatus\n | \t/workspace/source/internal/util/indexcgowrapper/helper.go:78\n | github.com/milvus-io/milvus/internal/util/indexcgowrapper.CreateIndex\n | \t/workspace/source/internal/util/indexcgowrapper/index.go:111\n | github.com/milvus-io/milvus/internal/indexnode.(*indexBuildTask).Execute\n | \t/workspace/source/internal/indexnode/task_index.go:309\n | github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).processTask.func1\n | \t/workspace/source/internal/indexnode/task_scheduler.go:222\n | github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).processTask\n | \t/workspace/source/internal/indexnode/task_scheduler.go:235\n | github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).indexBuildLoop.func1\n | \t/workspace/source/internal/indexnode/task_scheduler.go:262\n | runtime.goexit\n | \t/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1695\nWraps: (2) failed to create index, C Runtime Exception: => failed to build index, raft inner error at /workspace/source/internal/core/src/index/VectorMemIndex.cpp:276\nWraps: (3) segcore error[segcoreCode=2004]\nError types: (1) 
*withstack.withStack (2) *errutil.withPrefix (3) merr.milvusError"]
[2024/12/23 03:36:06.494 +00:00] [WARN] [indexnode/task_scheduler.go:236] ["process task failed"] [error="failed to create index, C Runtime Exception: => failed to build index, raft inner error at /workspace/source/internal/core/src/index/VectorMemIndex.cpp:276\n\n: segcore error[segcoreCode=2004]"] [errorVerbose="failed to create index, C Runtime Exception: => failed to build index, raft inner error at /workspace/source/internal/core/src/index/VectorMemIndex.cpp:276: segcore error[segcoreCode=2004]\n(1) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/pkg/util/merr.WrapErrSegcore\n | \t/workspace/source/pkg/util/merr/utils.go:1006\n | github.com/milvus-io/milvus/internal/util/indexcgowrapper.HandleCStatus\n | \t/workspace/source/internal/util/indexcgowrapper/helper.go:78\n | github.com/milvus-io/milvus/internal/util/indexcgowrapper.CreateIndex\n | \t/workspace/source/internal/util/indexcgowrapper/index.go:111\n | github.com/milvus-io/milvus/internal/indexnode.(*indexBuildTask).Execute\n | \t/workspace/source/internal/indexnode/task_index.go:309\n | github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).processTask.func1\n | \t/workspace/source/internal/indexnode/task_scheduler.go:222\n | github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).processTask\n | \t/workspace/source/internal/indexnode/task_scheduler.go:235\n | github.com/milvus-io/milvus/internal/indexnode.(*TaskScheduler).indexBuildLoop.func1\n | \t/workspace/source/internal/indexnode/task_scheduler.go:262\n | runtime.goexit\n | \t/root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1695\nWraps: (2) failed to create index, C Runtime Exception: => failed to build index, raft inner error at /workspace/source/internal/core/src/index/VectorMemIndex.cpp:276\nWraps: (3) segcore error[segcoreCode=2004]\nError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) merr.milvusError"]
Anything else?
{
    "index_type": "GPU_CAGRA",
    "metric_type": "IP",
    "params": {
        "intermediate_graph_degree": 64,
        "graph_degree": 32,
        "cache_dataset_on_device": "true"
    }
}
Both have the same problem.