Unable to determine pod name #77
The nested task error: Additionally, there are another 25 error messages we're not seeing, which could be useful for determining the root cause.
I just ran into this too. I asked for 6 workers, and the failure seemed to happen on the 6th: I got 5 "worker is up" log messages before it failed, and no other log messages. Partial stacktrace:
TaskFailedException
Stacktrace:
[1] wait
@ ./task.jl:334 [inlined]
[2] addprocs_locked(manager::K8sClusterManager; kwargs::Base.Pairs{Symbol, String, Tuple{Symbol}, NamedTuple{(:exeflags,), Tuple{String}}})
@ Distributed /usr/local/julia/share/julia/stdlib/v1.7/Distributed/src/cluster.jl:504
[3] addprocs(manager::K8sClusterManager; kwargs::Base.Pairs{Symbol, String, Tuple{Symbol}, NamedTuple{(:exeflags,), Tuple{String}}})
@ Distributed /usr/local/julia/share/julia/stdlib/v1.7/Distributed/src/cluster.jl:447
[truncated]
nested task error: TaskFailedException
nested task error: Unable to determine the pod name from: ""
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] create_pod(manifest::DataStructures.DefaultOrderedDict{String, Any, typeof(K8sClusterManagers.rdict)})
@ K8sClusterManagers ~/.julia/packages/K8sClusterManagers/PIZ9P/src/pod.jl:66
[3] macro expansion
@ ~/.julia/packages/K8sClusterManagers/PIZ9P/src/native_driver.jl:103 [inlined]
[4] (::K8sClusterManagers.var"#17#18"{K8sClusterManager, Vector{WorkerConfig}, Condition})()
@ K8sClusterManagers ./task.jl:423
Stacktrace:
[1] sync_end(c::Channel{Any})
@ Base ./task.jl:381
[2] macro expansion
@ ./task.jl:400 [inlined]
[3] launch(manager::K8sClusterManager, params::Dict{Symbol, Any}, launched::Vector{WorkerConfig}, c::Condition)
@ K8sClusterManagers ~/.julia/packages/K8sClusterManagers/PIZ9P/src/native_driver.jl:101
[4] (::Distributed.var"#39#42"{K8sClusterManager, Condition, Vector{WorkerConfig}, Dict{Symbol, Any}})()
@ Distributed ./task.jl:423
On K8sClusterManagers v0.1.3.
@kolia reported this issue with [email protected]: