Tacho: Tacho doesn't declare dependency on rocblas properly #13636

Closed
masterleinad opened this issue Nov 27, 2024 · 5 comments
Labels: pkg: ShyLU, type: bug (The primary issue is a bug in Trilinos code or tests)

Comments

@masterleinad
Contributor

Bug Report

Configuring Trilinos with HIP support but without TPL_ROCBLAS compiles, but at runtime I'm seeing

symbol lookup error: /tmp/trilinos-install/lib64/libtacho.so.16: undefined symbol: rocblas_create_handle

which indicates that Tacho is using rocblas without declaring a dependency on it. E.g.,

    #if defined(KOKKOS_ENABLE_HIP)
        if (!_is_rocblas_created) {
          _status = rocblas_create_handle(&_handle_blas);
          checkDeviceBlasStatus("rocblasCreate");
          _is_rocblas_created = true;
          _handle_lapack = _handle_blas;
        }
    #endif
looks suspicious.
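
For illustration, here is a minimal sketch of guarding the handle creation on both the backend macro and a TPL macro, so that code calling rocBLAS is only compiled when the library is also declared as a dependency. `EXAMPLE_HAVE_ROCBLAS`, `EXAMPLE_USE_ROCBLAS` and `DeviceBlasContext` are hypothetical names invented for this sketch. They are not taken from the Tacho sources, and this is not necessarily how the issue was fixed.

```cpp
#include <cassert>

// EXAMPLE_HAVE_ROCBLAS is a hypothetical macro standing in for whatever the
// build system defines when the rocBLAS TPL is enabled.
#if defined(KOKKOS_ENABLE_HIP) && defined(EXAMPLE_HAVE_ROCBLAS)
#include <rocblas/rocblas.h>
#define EXAMPLE_USE_ROCBLAS 1
#else
#define EXAMPLE_USE_ROCBLAS 0
#endif

// Hypothetical wrapper around the device BLAS handle.
class DeviceBlasContext {
public:
  // True only if rocBLAS calls were compiled into this translation unit.
  static constexpr bool uses_rocblas() { return EXAMPLE_USE_ROCBLAS != 0; }

  // Creates the handle on first use. Returns false when rocBLAS support was
  // not compiled in or when handle creation fails.
  bool create() {
#if EXAMPLE_USE_ROCBLAS
    if (!is_created_) {
      if (rocblas_create_handle(&handle_) != rocblas_status_success)
        return false;
      is_created_ = true;
    }
    assert(handle_ != nullptr);
    return true;
#else
    return false;
#endif
  }

#if EXAMPLE_USE_ROCBLAS
  ~DeviceBlasContext() {
    if (is_created_) rocblas_destroy_handle(handle_);
  }

private:
  rocblas_handle handle_ = nullptr;
  bool is_created_ = false;
#endif
};
```

With this kind of guard, a HIP build without the TPL has no reference to `rocblas_create_handle` at all, so it cannot fail later with an undefined symbol at runtime. Whether such a build should fall back silently or be rejected at configure time is a separate design decision.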

@masterleinad masterleinad added the type: bug The primary issue is a bug in Trilinos code or tests label Nov 27, 2024
@masterleinad
Contributor Author

Note that even after fixing the link problem by manually finding and linking rocblas, I am observing runtime failures (possibly because some places in the package check for the ROC* TPLs while others check only for KOKKOS_ENABLE_HIP). It seems that Tacho should really require rocblas, rocthrust and rocsolver when the HIP backend is enabled.

@iyamazaki
Contributor

Thank you, @masterleinad. Just to make sure I understand, when you enable ROCBLAS, ROCSPARSE, and ROCSOLVER, the issue with running Tacho goes away? And it sounds like these are available for including and linking for compiling and building, but the symbols cannot be resolved at runtime with the dynamically-linked libraries?

@masterleinad
Contributor Author

> Just to make sure I understand, when you enable ROCBLAS, ROCSPARSE, and ROCSOLVER, the issue with running Tacho goes away?

Yes, the issue goes away when those TPLs are enabled (since that implicitly also enables ShyLU_NodeTacho_ENABLE_ROC* and the respective preprocessor variables get defined).

> And it sounds like these are available for including and linking for compiling and building, but the symbols cannot be resolved at runtime with the dynamically-linked libraries?

Yes, the rocblas API seems to be available without enabling the corresponding TPL; the include directories that come with the compiler appear to be sufficient. The symbols are missing when not explicitly linking against the rocblas library.
Again, I tried just linking against it in the downstream application but ran into memory access issues on the GPU that only got resolved after also enabling the TPLs in Trilinos (which makes manually linking against those TPLs downstream unnecessary).
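
As a diagnostic aid, the following is a minimal sketch of checking at program startup whether a symbol such as `rocblas_create_handle` can be resolved in the running process, instead of waiting for the first call to fail. It assumes a POSIX system with `dlfcn.h` and a dynamically linked executable (older glibc versions may need `-ldl`). `symbol_is_resolvable` is a hypothetical helper name, not part of Trilinos or ROCm.

```cpp
#include <cassert>
#include <dlfcn.h>

// Returns true if `name` resolves to a non-null address in the global symbol
// scope of the running process (the executable plus its loaded libraries).
inline bool symbol_is_resolvable(const char* name) {
  assert(name != nullptr);
  dlerror();  // clear any stale error state
  void* addr = dlsym(RTLD_DEFAULT, name);
  return addr != nullptr && dlerror() == nullptr;
}
```

A downstream application could call `symbol_is_resolvable("rocblas_create_handle")` early and print a clear message if it returns false. This only detects the missing-symbol case; it would not catch the GPU memory access issues described above.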

@iyamazaki
Contributor

Thank you @masterleinad ! We created a PR, and hope it looks okay.

@iyamazaki
Contributor

@masterleinad. I'll close this issue. Please let me know if you encounter additional issues!!
