Version 2024.10.29 #201

Merged
merged 3 commits into from
Oct 30, 2024
62 changes: 36 additions & 26 deletions CHANGELOG.md
@@ -1,7 +1,17 @@
+2024.10.29
+- Fixed norm to correctly propagate NaN and Inf values.
+- Fixed matrix generators to 1-based i,j indices, to match Matlab.
+- Added new matrix generators (minij, hilb, ..., gcdmat).
+- Require MPI in CMake/Makefile. The non-MPI build was always broken.
+- Improved GitHub continuous testing.
+- Refactored norm test code.
+- Refactored ScaLAPACK wrappers with enums, namespace.
+- Replaced Fortran steqr2 with C++ steqr.
+
2024.05.31
-- Add shared library version (ABI version 1.0.0)
-- Update enum parameters to have `to_string`, `from_string`;
-deprecate `<enum>2str`, `str2<enum>`
+- Added shared library version (ABI version 1.0.0)
+- Updated enum parameters to have `to_string`, `from_string`;
+deprecated `<enum>2str`, `str2<enum>`
- Changed methods to enums; renamed some values and deprecated old values
- Added "all vectors" case to SVD
- Fixed SVD for slightly tall case (m > n but not m >> n)
@@ -18,29 +28,29 @@
- Added internal timers to testers; use `tester --timer-level 2`

2023.11.05
-- Fix variable block sizes
-- Fix tau in LQ tester
-- Update examples for Users Guide
-- Fix CUDA sync in Frobenius norm
-- Add random butterfly transform (RBT) solver
-- Use `blas_int` in scalapack wrappers, towards supporting int64
-- Fix Cholesky QR test with well-conditioned matrix
-- Add info check in LU for singular matrix
-- Fix SVD tester for all vectors
-- Use multi-threaded Intel MKL to improve eig and svd
-- Add arbitrary batch regions in `set`
-- Add timers in `gesv`, `posv`, `gels`, `heev`, `svd`
-- Improve support for 2D GPU grids and lambda constructors
-- Fix ROCm complex for ROCm 5.6
-- Merge Cholesky potrf Host and Device implementations
-- Remove tile life from QR, LQ, add routines
-- Fix test matrix generation
-- Cleanup MOSI, move to Tile class
-- Add zerocol test matrix variant
-- Fix receive count
-- Use GPU-to-GPU copies
-- Fix `tileMB`, `tileNb`
-- Improve LU left pivoting for target device
+- Fixed variable block sizes
+- Fixed tau in LQ tester
+- Updated examples for Users Guide
+- Fixed CUDA sync in Frobenius norm
+- Added random butterfly transform (RBT) solver
+- Used `blas_int` in scalapack wrappers, towards supporting int64
+- Fixed Cholesky QR test with well-conditioned matrix
+- Added info check in LU for singular matrix
+- Fixed SVD tester for all vectors
+- Used multi-threaded Intel MKL to improve eig and svd
+- Added arbitrary batch regions in `set`
+- Added timers in `gesv`, `posv`, `gels`, `heev`, `svd`
+- Improved support for 2D GPU grids and lambda constructors
+- Fixed ROCm complex for ROCm 5.6
+- Merged Cholesky potrf Host and Device implementations
+- Removed tile life from QR, LQ, add routines
+- Fixed test matrix generation
+- Cleaned up MOSI, moved to Tile class
+- Added zerocol test matrix variant
+- Fixed receive count
+- Used GPU-to-GPU copies
+- Fixed `tileMB`, `tileNb`
+- Improved LU left pivoting for target device

2023.08.25
- Added oneMKL/SYCL support
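
Two of the 2024.10.29 entries in this changelog (NaN/Inf propagation in norms, and 1-based i,j indices in matrix generators) can be illustrated with a minimal Python sketch. This is not SLATE code: `max_norm`, `minij`, and `hilb` below are hypothetical helpers that assume the usual Matlab definitions A(i,j) = min(i,j) and A(i,j) = 1/(i+j-1) with i, j starting at 1.

```python
import math

def max_norm(values):
    """Max-abs norm that returns NaN if any entry is NaN (illustrative only)."""
    result = 0.0
    for v in values:
        a = abs(v)
        # A plain `a > result` comparison is False for NaN and would
        # silently skip it, so NaN is checked explicitly.
        if math.isnan(a):
            return math.nan
        if a > result:
            result = a
    return result

def minij(n):
    """A(i, j) = min(i, j) with 1-based i, j (assumed Matlab convention)."""
    return [[min(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]

def hilb(n):
    """A(i, j) = 1 / (i + j - 1) with 1-based i, j (assumed Matlab convention)."""
    return [[1.0 / (i + j - 1) for j in range(1, n + 1)] for i in range(1, n + 1)]
```

With 0-based indices the same formulas would give a different matrix (for example a zero first row and column for minij), which is presumably why the changelog calls out the index base.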
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -16,7 +16,7 @@ cmake_minimum_required( VERSION 3.18 )

project(
slate
-VERSION 2024.05.31
+VERSION 2024.10.29
LANGUAGES CXX Fortran
)

2 changes: 1 addition & 1 deletion GNUmakefile
@@ -1335,7 +1335,7 @@ LDFLAGS_clean = ${filter-out -fPIC -L./%, ${LDFLAGS}}

.PHONY: ${pkg}
${pkg}:
-perl -pe "s'#VERSION'2024.05.31'; \
+perl -pe "s'#VERSION'2024.10.29'; \
s'#PREFIX'${abs_prefix}'; \
s'#CXX\b'${CXX}'; \
s'#CXXFLAGS'${CXXFLAGS_clean}'; \
2 changes: 1 addition & 1 deletion docs/doxygen/doxyfile.conf
@@ -40,7 +40,7 @@ PROJECT_NAME = "SLATE"
# could be handy for archiving the generated documentation or if some version
# control system is used.

-PROJECT_NUMBER = "2024.05.31"
+PROJECT_NUMBER = "2024.10.29"

# Using the PROJECT_BRIEF tag one can provide an optional one line description
# for a project that appears at the top of each page and should give viewer a
4 changes: 2 additions & 2 deletions include/slate/slate.hh
@@ -26,8 +26,8 @@
namespace slate {

// Version is updated by make_release.py; DO NOT EDIT.
-// Version 2024.05.31
-#define SLATE_VERSION 20240531
+// Version 2024.10.29
+#define SLATE_VERSION 20241029

int version();
const char* id();
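
The `SLATE_VERSION` macro above appears to pack the release date as a single yyyymmdd integer, which makes versions comparable with ordinary integer comparison. A minimal Python sketch of decoding and comparing such a value follows; `split_version` and `at_least` are hypothetical helper names for illustration, not part of SLATE.

```python
def split_version(version):
    """Split a yyyymmdd integer such as 20241029 into (year, month, day)."""
    year, rest = divmod(version, 10000)
    month, day = divmod(rest, 100)
    return year, month, day

def at_least(version, year, month, day):
    """True if `version` (yyyymmdd integer) is on or after the given release date."""
    return version >= year * 10000 + month * 100 + day
```

Under this assumed encoding, `split_version(20241029)` gives `(2024, 10, 29)`, and a check against the previous release, `at_least(20241029, 2024, 5, 31)`, is true.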
2 changes: 1 addition & 1 deletion lapackpp
Submodule lapackpp updated 385 files
5 changes: 3 additions & 2 deletions test/idle_gpus.py
@@ -48,15 +48,16 @@
if (s):
gpu = s.group( 1 )

-# If using >= 10 MiB or 5% utilization, assume it is not idle.
+# If using >= 50% memory or utilization, mark it as not idle.
+# This allows some sharing of GPUs with other users.
# Typically idle is 1 MiB and 0% utilization.
# Docker can't see processes in section 2.
s = re.search( '^\| +N/A +\d+C +\w+ +\d+W +/ +\d+W *\| +(\d+)MiB +/ +(\d+)MiB *\| +(\d+)%', line )
if (s):
used_mem = int( s.group( 1 ) )
total_mem = int( s.group( 2 ) )
percent = int( s.group( 3 ) )
-if (used_mem >= 10 or percent >= 5):
+if (used_mem >= 0.5*total_mem or percent >= 50):
gpus[ gpu ] = 0
else:
# Match process lines:
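
The new idle test in this hunk can be tried in isolation with a short sketch. The regular expression is the one from the diff (written as a raw string); the function name `is_idle` and the sample `nvidia-smi` line in the usage note are made up for illustration and are not part of `idle_gpus.py`.

```python
import re

# Pattern copied from the hunk above; groups are used MiB, total MiB, utilization %.
USAGE_RE = re.compile(
    r'^\| +N/A +\d+C +\w+ +\d+W +/ +\d+W *\| +(\d+)MiB +/ +(\d+)MiB *\| +(\d+)%'
)

def is_idle(line):
    """Return True/False for a GPU usage line, or None if the line does not match.

    Hypothetical helper: applies the new rule from the diff, where a GPU is
    marked not idle if it uses >= 50% of its memory or >= 50% utilization.
    """
    s = USAGE_RE.search(line)
    if not s:
        return None
    used_mem = int(s.group(1))
    total_mem = int(s.group(2))
    percent = int(s.group(3))
    return not (used_mem >= 0.5 * total_mem or percent >= 50)
```

For example, a hypothetical line such as `| N/A   35C    P0    44W / 300W |    100MiB / 16384MiB |     20%      Default |` counts as idle under the new rule, whereas the old thresholds (10 MiB or 5%) would have marked that GPU as busy.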
2 changes: 1 addition & 1 deletion tools/release.py
@@ -219,7 +219,7 @@ def make( project, version_h, version_c ):

# Build Doxygen docs. Create dummy 'make.inc' to avoid 'make config'.
open( 'make.inc', mode='a' ).close()
-myrun( 'make docs blas=openblas' )
+myrun( 'make docs blas=openblas mpi=1' )
os.unlink( 'make.inc' )

os.chdir( '..' )