heev: document and check for square MPI grid and lower triangular matrix #204

Merged: 3 commits, Dec 28, 2024
104 changes: 88 additions & 16 deletions docs/doxygen/DoxygenLayout.xml
@@ -1,26 +1,44 @@
<?xml version="1.0" encoding="UTF-8"?>
<doxygenlayout version="1.0">
<!-- Generated by doxygen 1.8.11 -->
<!-- Generated by doxygen 1.12.0 -->
<!-- Navigation index tabs for HTML output -->
<navindex>
<tab type="mainpage" visible="yes" title=""/>
<tab type="pages" visible="yes" title="" intro=""/>

<!-- SLATE: change "Modules" to "Routines" -->
<tab type="modules" visible="yes" title="Routines" intro=""/>
<!-- SLATE: change "Topics" to "Routines" -->
<tab type="topics" visible="yes" title="Routines" intro=""/>

<tab type="modules" visible="yes" title="" intro="">
<tab type="modulelist" visible="yes" title="" intro=""/>
<tab type="modulemembers" visible="yes" title="" intro=""/>
</tab>
<tab type="namespaces" visible="yes" title="">
<tab type="namespacelist" visible="yes" title="" intro=""/>
<tab type="namespacemembers" visible="yes" title="" intro=""/>
</tab>

<!-- SLATE: show hierarchy and members -->
<!-- <tab type="classes" visible="no" title=""> -->
<tab type="classlist" visible="no" title="" intro=""/>
<tab type="classindex" visible="no" title=""/>
<tab type="concepts" visible="yes" title="">
</tab>
<tab type="interfaces" visible="yes" title="">
<tab type="interfacelist" visible="yes" title="" intro=""/>
<tab type="interfaceindex" visible="$ALPHABETICAL_INDEX" title=""/>
<tab type="interfacehierarchy" visible="yes" title="" intro=""/>
</tab>
<tab type="classes" visible="yes" title="">
<tab type="classlist" visible="yes" title="" intro=""/>
<tab type="classindex" visible="$ALPHABETICAL_INDEX" title=""/>
<tab type="hierarchy" visible="yes" title="" intro=""/>
<tab type="classmembers" visible="yes" title="" intro=""/>
<!-- </tab> -->

</tab>
<tab type="structs" visible="yes" title="">
<tab type="structlist" visible="yes" title="" intro=""/>
<tab type="structindex" visible="$ALPHABETICAL_INDEX" title=""/>
</tab>
<tab type="exceptions" visible="yes" title="">
<tab type="exceptionlist" visible="yes" title="" intro=""/>
<tab type="exceptionindex" visible="$ALPHABETICAL_INDEX" title=""/>
<tab type="exceptionhierarchy" visible="yes" title="" intro=""/>
</tab>
<tab type="files" visible="yes" title="">
<tab type="filelist" visible="yes" title="" intro=""/>
<tab type="globals" visible="yes" title="" intro=""/>
@@ -31,9 +49,9 @@
<!-- Layout definition for a class page -->
<class>
<briefdescription visible="yes"/>
<includes visible="$SHOW_INCLUDE_FILES"/>
<inheritancegraph visible="$CLASS_GRAPH"/>
<collaborationgraph visible="$COLLABORATION_GRAPH"/>
<includes visible="$SHOW_HEADERFILE"/>
<inheritancegraph visible="yes"/>
<collaborationgraph visible="yes"/>
<memberdecl>
<nestedclasses visible="yes" title=""/>
<publictypes title=""/>
@@ -93,66 +111,99 @@
<memberdecl>
<nestednamespaces visible="yes" title=""/>
<constantgroups visible="yes" title=""/>
<interfaces visible="yes" title=""/>
<classes visible="yes" title=""/>
<concepts visible="yes" title=""/>
<structs visible="yes" title=""/>
<exceptions visible="yes" title=""/>
<typedefs title=""/>
<sequences title=""/>
<dictionaries title=""/>
<enums title=""/>
<functions title=""/>
<variables title=""/>
<properties title=""/>
<membergroups visible="yes"/>
</memberdecl>
<detaileddescription title=""/>
<memberdef>
<inlineclasses title=""/>
<typedefs title=""/>
<sequences title=""/>
<dictionaries title=""/>
<enums title=""/>
<functions title=""/>
<variables title=""/>
<properties title=""/>
</memberdef>
<authorsection visible="yes"/>
</namespace>

<!-- Layout definition for a concept page -->
<concept>
<briefdescription visible="yes"/>
<includes visible="$SHOW_HEADERFILE"/>
<definition visible="yes" title=""/>
<detaileddescription title=""/>
<authorsection visible="yes"/>
</concept>

<!-- Layout definition for a file page -->
<file>
<briefdescription visible="yes"/>
<includes visible="$SHOW_INCLUDE_FILES"/>
<includegraph visible="$INCLUDE_GRAPH"/>
<includedbygraph visible="$INCLUDED_BY_GRAPH"/>
<includegraph visible="yes"/>
<includedbygraph visible="yes"/>
<sourcelink visible="yes"/>
<memberdecl>
<interfaces visible="yes" title=""/>
<classes visible="yes" title=""/>
<structs visible="yes" title=""/>
<exceptions visible="yes" title=""/>
<namespaces visible="yes" title=""/>
<concepts visible="yes" title=""/>
<constantgroups visible="yes" title=""/>
<defines title=""/>
<typedefs title=""/>
<sequences title=""/>
<dictionaries title=""/>
<enums title=""/>
<functions title=""/>
<variables title=""/>
<properties title=""/>
<membergroups visible="yes"/>
</memberdecl>
<detaileddescription title=""/>
<memberdef>
<inlineclasses title=""/>
<defines title=""/>
<typedefs title=""/>
<sequences title=""/>
<dictionaries title=""/>
<enums title=""/>
<functions title=""/>
<variables title=""/>
<properties title=""/>
</memberdef>
<authorsection/>
</file>

<!-- Layout definition for a group page -->
<group>
<briefdescription visible="yes"/>
<groupgraph visible="$GROUP_GRAPHS"/>
<groupgraph visible="yes"/>
<memberdecl>
<nestedgroups visible="yes" title=""/>
<modules visible="yes" title=""/>
<dirs visible="yes" title=""/>
<files visible="yes" title=""/>
<namespaces visible="yes" title=""/>
<concepts visible="yes" title=""/>
<classes visible="yes" title=""/>
<defines title=""/>
<typedefs title=""/>
<sequences title=""/>
<dictionaries title=""/>
<enums title=""/>
<enumvalues title=""/>
<functions title=""/>
@@ -172,6 +223,8 @@
<inlineclasses title=""/>
<defines title=""/>
<typedefs title=""/>
<sequences title=""/>
<dictionaries title=""/>
<enums title=""/>
<enumvalues title=""/>
<functions title=""/>
@@ -187,6 +240,25 @@
<authorsection visible="yes"/>
</group>

<!-- Layout definition for a C++20 module page -->
<module>
<briefdescription visible="yes"/>
<exportedmodules visible="yes"/>
<memberdecl>
<concepts visible="yes" title=""/>
<classes visible="yes" title=""/>
<enums title=""/>
<typedefs title=""/>
<functions title=""/>
<variables title=""/>
<membergroups title=""/>
</memberdecl>
<detaileddescription title=""/>
<memberdecl>
<files visible="yes"/>
</memberdecl>
</module>

<!-- Layout definition for a directory page -->
<directory>
<briefdescription visible="yes"/>
4 changes: 2 additions & 2 deletions src/ge2tb.cc
@@ -14,7 +14,7 @@ namespace slate {
namespace impl {

//------------------------------------------------------------------------------
/// Distributed parallel reduction to band for 3-stage SVD.
/// Distributed parallel reduction to band for 2-stage SVD.
/// Generic implementation for any target.
/// Panel computed on host using Host OpenMP task.
///
@@ -469,7 +469,7 @@ void ge2tb(
} // namespace impl

//------------------------------------------------------------------------------
/// Distributed parallel reduction to band for 3-stage SVD.
/// Distributed parallel reduction to band for 2-stage SVD.
///
/// Reduces an m-by-n matrix $A$ to band form using unitary transformations.
/// The factorization has the form
6 changes: 3 additions & 3 deletions src/he2hb.cc
@@ -13,7 +13,7 @@ namespace slate {
namespace impl {

//------------------------------------------------------------------------------
/// Distributed parallel reduction to band for 3-stage Hermitian eigenvalue
/// Distributed parallel reduction to band for 2-stage Hermitian eigenvalue
/// decomposition.
/// Generic implementation for any target.
/// Panel computed on host using Host OpenMP task.
@@ -33,7 +33,7 @@ void he2hb(
using real_t = blas::real_type<scalar_t>;
using blas::real;

assert( A.uplo() == Uplo::Lower ); // for now
slate_assert( A.uplo() == Uplo::Lower ); // for now

// Constants
const scalar_t zero = 0.0;
@@ -629,7 +629,7 @@ void he2hb(
} // namespace impl

//------------------------------------------------------------------------------
/// Distributed parallel reduction to band for 3-stage SVD.
/// Distributed parallel reduction to band for 2-stage SVD.
///
/// Reduces an n-by-n Hermitian matrix $A$ to band form using unitary
/// transformations. The factorization has the form
41 changes: 31 additions & 10 deletions src/heev.cc
@@ -12,29 +12,41 @@
namespace slate {

//------------------------------------------------------------------------------
/// Distributed parallel Hermitian matrix eigen decomposition.
/// heev Computes all eigenvalues and, optionally, eigenvectors of a
/// Hermitian matrix A. The matrix A is preliminary reduced to
/// Distributed parallel Hermitian matrix eigen decomposition,
/// \[
/// A = Z \Lambda Z^H.
/// \]
/// Computes all eigenvalues and, optionally, eigenvectors of a
/// Hermitian matrix $A$. The matrix $A$ is preliminary reduced to
/// tridiagonal form using a two-stage approach:
/// First stage: reduction to band tridiagonal form (see he2hb);
/// Second stage: reduction from band to tridiagonal form (see hb2st).
/// @see he2hb First stage: reduction to band tridiagonal form.
/// @see hb2st Second stage: reduction from band to tridiagonal form.
///
/// #### Restrictions ####
///
/// Currently requires a **lower triangular** storage Hermitian matrix.
///
/// Currently requires a **square MPI process grid** ($p \times p$).
/// This is because it applies the same QR factorization on the
/// left ($p$ block-rows) and the right ($p$ block-cols), with a size $p$
/// reduction tree. We hope to eventually remove this restriction.
///
//------------------------------------------------------------------------------
/// @tparam scalar_t
/// One of float, double, std::complex<float>, std::complex<double>.
//------------------------------------------------------------------------------
/// @param[in] A
/// On entry, the n-by-n Hermitian matrix $A$.
/// On entry, the $n \times n$ Hermitian matrix $A$.
/// On exit, contents are destroyed.
///
/// @param[out] Lambda
/// The vector Lambda of length n.
/// The vector Lambda of length $n$.
/// If successful, the eigenvalues in ascending order.
///
/// @param[out] Z
/// On entry, if Z is empty, does not compute eigenvectors.
/// Otherwise, the n-by-n matrix $Z$ to store eigenvectors.
/// On exit, orthonormal eigenvectors of the matrix A.
/// On entry, if $Z$ is empty, does not compute eigenvectors.
/// Otherwise, the $n \times n$ matrix $Z$ to store eigenvectors.
/// On exit, orthonormal eigenvectors of the matrix $A$.
///
/// @param[in] opts
/// Additional options, as map of name = value pairs. Possible options:
@@ -80,6 +92,15 @@ void heev(
MethodEig method = get_option( opts, Option::MethodEig, MethodEig::DC );
Target target = get_option( opts, Option::Target, Target::HostTask );

// Currently he2hb requires lower triangular matrix.
slate_assert( A.uplo() == Uplo::Lower );

// Currently requires square process grid.
GridOrder grid_order;
int nprow, npcol, myrow, mycol;
A.gridinfo( &grid_order, &nprow, &npcol, &myrow, &mycol );
slate_assert( nprow == npcol );

// Scale matrix to allowable range, if necessary.
real_t Anorm = norm( Norm::Max, A );
real_t alpha = 1.0;
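
Because heev now asserts on a non-square grid, a caller may want to choose or validate the grid dimensions before constructing the matrix. The following is a minimal sketch under stated assumptions: `is_square_grid` and `largest_square_grid_dim` are hypothetical helper names that are not part of SLATE, and how any leftover MPI ranks are handled is left to the application.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not SLATE API): true if an nprow-by-npcol process
// grid satisfies the square-grid requirement that heev currently asserts.
inline bool is_square_grid( int nprow, int npcol )
{
    return nprow > 0 && nprow == npcol;
}

// Hypothetical helper (not SLATE API): largest p such that p*p <= nprocs,
// i.e., the dimension of the largest square grid that fits in nprocs ranks.
inline int largest_square_grid_dim( int nprocs )
{
    assert( nprocs >= 1 );
    int p = static_cast<int>( std::sqrt( static_cast<double>( nprocs ) ) );
    // Correct for possible floating-point rounding in sqrt.
    while (p * p > nprocs)
        --p;
    while ((p + 1) * (p + 1) <= nprocs)
        ++p;
    return p;
}
```

With 6 ranks, for example, `largest_square_grid_dim( 6 )` gives 2, so a 2-by-2 grid would pass the new check while a 2-by-3 grid would trip `slate_assert( nprow == npcol )`.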
10 changes: 6 additions & 4 deletions test/test.cc
@@ -23,6 +23,9 @@ using testsweeper::ansi_bold;
using testsweeper::ansi_red;
using testsweeper::ansi_normal;

using testsweeper::no_check;
using testsweeper::skipped;

using blas::Layout, blas::Layout_help;
using blas::Side, blas::Side_help;
using blas::Uplo, blas::Uplo_help;
@@ -507,10 +510,9 @@ Params::Params():
ref_gbytes( "ref gbyte/s", 12, 3, PT_Out, no_data, 0, 0, "reference Gbyte/s rate" ),
ref_iters ( "ref iters", 5, PT_Out, 0, 0, 0, "reference iterations to solution" ),

// default -1 means "no check"
// name, w, type, default, min, max, help
okay ( "status", 6, PT_Out, -1, 0, 0, "success indicator" ),
msg ( "", 1, PT_Out, "", "error message" )
// name, w, type, default, min, max, help
okay ( "status", 6, PT_Out, no_check, 0, 0, "success indicator" ),
msg ( "", 1, PT_Out, "", "error message" )
{
// set header different than command line prefix
lookahead.name("la", "lookahead");