Indexing scheduling bottleneck (Best Practices to Handle 50k partitions Per Cluster (Compaction/Indexing cannot keep up): continued) #38998

xiaobingxia-at · 2025-01-04T01:43:05Z

xiaobingxia-at
Jan 4, 2025

Context: #38996

https://github.com/milvus-io/milvus/blob/master/internal/datacoord/task_scheduler.go#L239-L263
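With a large number of small partitions, a large number of indexing tasks are generated, and we found that the index tasks can never catch up. Based on the indexing schedule logic linked above, I suspect there may be a problem.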

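Every second:

It takes all the keys (task IDs) from tasks (a map).
It iterates over these task IDs one by one and assigns each task to a slot on some index node. If no index node is found, it breaks out of the loop. If one is found, it updates the task state to In Progress.

In the next second:

It again takes all the keys (task IDs) from tasks (a map).
It iterates over these task IDs one by one and checks their state. If a task is In Progress, it checks whether the task has finished. If it has, it drops the index task.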

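But this logic assumes that the existing index slots can absorb all of the tasks. If they cannot, will the indexing process be blocked forever?

For example, say I have 3 index node slots and I have assigned 3 index tasks, 1, 2, and 3, to them.

On my next pass, the task map holds 6 index tasks: 1, 2, 3, 4, 5, 6. Suppose the iteration order happens to be 4, 5, 6, 1, 2, 3 (because I am iterating over the keys of a task map, not a queue that guarantees order). For 4, 5, 6 it tries to assign an index node slot, finds none, and exits the loop. The next pass repeats the same thing. So index tasks 1, 2, 3 are stuck forever in their last state, and no other index task ever gets a chance to run.

Is this analysis correct? Should the tasks in the task scheduler be changed from a map to a queue?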
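
For reference, Go does not specify the iteration order of a map, so ranging over the task map directly can visit IDs in any order. A minimal sketch of one way to get a stable order without switching to a queue is to collect the keys and sort them before the loop. The map value type and the helper name `sortedTaskIDs` are hypothetical and not taken from the Milvus code, and the sketch assumes smaller task IDs are older tasks.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedTaskIDs returns the keys of a task map in ascending order.
// Hypothetical helper: it assumes task IDs grow over time, so ascending
// order means oldest task first.
func sortedTaskIDs(tasks map[int64]string) []int64 {
	ids := make([]int64, 0, len(tasks))
	for id := range tasks { // map iteration order is unspecified in Go
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	return ids
}

func main() {
	tasks := map[int64]string{
		4: "pending", 5: "pending", 6: "pending",
		1: "in progress", 2: "in progress", 3: "in progress",
	}
	fmt.Println(sortedTaskIDs(tasks)) // prints [1 2 3 4 5 6]
}
```

With a sorted key slice, the in-progress tasks 1, 2, 3 from the example are always visited before the pending tasks 4, 5, 6.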

xiaofan-luan · 2025-01-04T17:13:06Z

xiaofan-luan
Jan 4, 2025
Maintainer

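If a task is not assigned, it waits for the next round to be assigned.
For tasks that have been assigned, the code here looks fine. Execution is dispatched from a goroutine, which is effectively synchronous execution. If that synchronous execution fails, the segment remains in the unindexed state and will be scheduled again.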

12 replies

xiaobingxia-at Jan 6, 2025
Author

Note: although this task scheduler lives under datacoord, it handles index tasks, not compaction tasks.

xiaobingxia-at Jan 6, 2025
Author

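One more point: when the task scheduler assigns an index node to an index task, it presumably sends an RPC call to the index nodes one by one to ask. If I have 100 index nodes, assigning a single index task may take up to 100 queries. This would seriously hurt the efficiency of index task assignment. With 10m partitions and index tasks in the millions, the index tasks could never all be completed. For compaction task assignment, the coord side presumably records each data node's available slots, so it knows which data node to assign to without calling the data nodes one by one. So compaction tasks have no bottleneck on the assignment side, but index task assignment does.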
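
To make the cost difference concrete, here is a rough sketch that counts simulated RPCs when each assignment probes nodes one by one, next to a coordinator-side slot table that needs no probing. It models only the behavior as I understand it from the description above; `probeAssign`, `slotTable`, and `totalProbeRPCs` are hypothetical names and not Milvus APIs.

```go
package main

import "fmt"

// probeAssign asks nodes one by one (one simulated RPC each) until it finds
// a free slot. It returns the chosen node index (or -1) and the RPC count.
func probeAssign(freeSlots []int) (node, rpcs int) {
	for i := range freeSlots {
		rpcs++ // one simulated round trip per node asked
		if freeSlots[i] > 0 {
			freeSlots[i]--
			return i, rpcs
		}
	}
	return -1, rpcs
}

// slotTable is a hypothetical coordinator-side record of free slots per node.
type slotTable struct{ free []int }

// assign picks a node from the local table; no RPC is needed to choose one.
func (t *slotTable) assign() int {
	for i := range t.free {
		if t.free[i] > 0 {
			t.free[i]--
			return i
		}
	}
	return -1
}

// totalProbeRPCs assigns the given number of tasks by probing and returns
// the total number of simulated RPCs spent on probing.
func totalProbeRPCs(freeSlots []int, tasks int) int {
	total := 0
	for t := 0; t < tasks; t++ {
		_, n := probeAssign(freeSlots)
		total += n
	}
	return total
}

func main() {
	free := make([]int, 100)
	free[99] = 5 // worst case: only the last of 100 nodes has free slots
	fmt.Println(totalProbeRPCs(free, 5)) // prints 500

	table := &slotTable{free: []int{0, 2}}
	fmt.Println(table.assign()) // prints 1, chosen without any probe
}
```

In this worst case each assignment costs one probe per node, so the probing cost grows with tasks times nodes, while the table lookup stays local to the coordinator.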

xiaofan-luan Jan 6, 2025
Maintainer

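> One more point: when the task scheduler assigns an index node to an index task, it presumably sends an RPC call to the index nodes one by one to ask. If I have 100 index nodes, assigning a single index task may take up to 100 queries. This would seriously hurt the efficiency of index task assignment. With 10m partitions and index tasks in the millions, the index tasks could never all be completed. For compaction task assignment, the coord side presumably records each data node's available slots, so it knows which data node to assign to without calling the data nodes one by one. So compaction tasks have no bottleneck on the assignment side, but index task assignment does.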

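First, this will not get stuck, because the task IDs are sorted here.

Second, whether a task can be assigned depends on the index node's state, while whether a task gets cleaned up is decided by datacoord. As soon as a task finishes on the index node, a new one can be assigned, so there is no deadlock here.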
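
A toy model of this second point, as a sketch only: slots belong to the node and are released when the node finishes the work, independently of when the scheduler cleans up the task entry. All types and functions below are hypothetical and simplified. It replays the 4, 5, 6, 1, 2, 3 order from the question and shows that the loop still makes progress under this assumption.

```go
package main

import "fmt"

type state int

const (
	pending state = iota
	inProgress
)

// node models an index node. A slot is released when the node finishes the
// work, not when the scheduler drops the task.
type node struct {
	free    int
	running map[int64]bool
}

func (n *node) tryAssign(id int64) bool {
	if n.free == 0 {
		return false
	}
	n.free--
	n.running[id] = true
	return true
}

// finishAll simulates the node completing everything it is running.
func (n *node) finishAll() {
	for id := range n.running {
		delete(n.running, id)
		n.free++
	}
}

// tick visits tasks in the given order and stops at the first pending task
// that cannot get a slot, mirroring the break described in the question.
func tick(order []int64, tasks map[int64]state, n *node) {
	for _, id := range order {
		st, ok := tasks[id]
		if !ok {
			continue // already dropped
		}
		switch st {
		case pending:
			if !n.tryAssign(id) {
				return
			}
			tasks[id] = inProgress
		case inProgress:
			if !n.running[id] { // node reports the task as finished
				delete(tasks, id)
			}
		}
	}
}

// simulate replays the scenario and returns how many tasks remain.
func simulate() int {
	n := &node{free: 3, running: map[int64]bool{}}
	tasks := map[int64]state{1: pending, 2: pending, 3: pending}
	tick([]int64{1, 2, 3}, tasks, n) // 1, 2, 3 take all three slots
	tasks[4], tasks[5], tasks[6] = pending, pending, pending

	bad := []int64{4, 5, 6, 1, 2, 3}
	tick(bad, tasks, n) // task 4 finds no slot, the loop stops early
	n.finishAll()       // the node finishes 1, 2, 3 and frees its slots
	tick(bad, tasks, n) // 4, 5, 6 are assigned, then 1, 2, 3 are dropped
	n.finishAll()
	tick(bad, tasks, n) // 4, 5, 6 are dropped
	return len(tasks)
}

func main() {
	fmt.Println(simulate()) // prints 0
}
```

The early exit delays cleanup of tasks 1, 2, 3 by a round in this model, but it does not block them, because the slot check consults the node and not the task map.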

xiaofan-luan Jan 6, 2025
Maintainer

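Querying the status on every pass is something that can be optimized. Previously, index tasks all ran on the order of minutes. This part can be improved.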

xiaobingxia-at Jan 6, 2025
Author

@xiaofan-luan I may not be very familiar with Go. Which line in the code above sorts the task IDs?

xiaofan-luan · 2025-01-06T01:37:52Z

xiaofan-luan
Jan 6, 2025
Maintainer

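All tasks that have not started executing are in queueTasks.
Those that are already executing are in executingTasks.
Those that have finished executing are in cleaningTasks.

queueTasks is written to by compactionPlanHandler.schedule.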

1 reply

xiaobingxia-at Jan 6, 2025
Author

This thread is about the scheduling of indexing tasks, not compaction tasks. :)
