Need subspace diagonalization with parallel #5480

Open
14 tasks
pxlxingliang opened this issue Nov 14, 2024 · 1 comment · May be fixed by #5549

Comments

@pxlxingliang
Collaborator

Background

Currently, the subspace diagonalization in dav is performed by LAPACK on a single core. For large systems, the dimension of this subspace can reach several hundred, so the diagonalization can be effectively accelerated by running it in parallel.

QE provides the same feature, which is enabled by setting the value of nd on the command line: https://www.quantum-espresso.org/Doc/user_guide/node20.html
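
For reference, the single-core baseline described above amounts to a dense generalized eigenproblem H x = e S x. The following is only a minimal NumPy sketch of that serial solve, not the ABACUS implementation; the function name `solve_generalized_serial` and the test sizes are hypothetical, and the Cholesky reduction shown is the textbook approach rather than a confirmed detail of the dav code.

```python
import numpy as np

def solve_generalized_serial(H, S, nband):
    """Solve H x = e S x for the lowest `nband` eigenpairs on one core.

    Hypothetical helper: reduces the problem to standard form with a
    Cholesky factor of S, then calls a dense symmetric eigensolver.
    """
    L = np.linalg.cholesky(S)            # S = L L^H
    Linv = np.linalg.inv(L)              # acceptable for a small subspace
    C = Linv @ H @ Linv.conj().T         # standard-form matrix L^-1 H L^-H
    evals, Y = np.linalg.eigh(C)         # eigenvalues in ascending order
    X = Linv.conj().T @ Y                # back-transform: x = L^-H y
    return evals[:nband], X[:, :nband]

# Small random test problem (sizes chosen for illustration only).
rng = np.random.default_rng(0)
ndim, nband = 64, 50
A = rng.standard_normal((ndim, ndim))
H = (A + A.T) / 2                        # symmetric H
B = rng.standard_normal((ndim, ndim))
S = B @ B.T + ndim * np.eye(ndim)        # symmetric positive-definite S

evals, X = solve_generalized_serial(H, S, nband)
residual = np.linalg.norm(H @ X - (S @ X) * evals)
print(f"lowest eigenvalue: {evals[0]:.6f}, residual norm: {residual:.2e}")
```

The whole solve runs on one process, which is the cost that becomes noticeable once the subspace dimension grows to several hundred.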

Describe the solution you'd like

I will implement a function that divides the H and S matrices into 2D blocks and then calls ELPA or ScaLAPACK to perform the diagonalization in parallel.
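
To illustrate what dividing a matrix into 2D blocks usually means for ScaLAPACK-style solvers, here is a minimal sketch of a 2D block-cyclic layout in plain NumPy, with no MPI involved. It assumes a square block size `nb` and a process grid of `nprow` x `npcol`; the helpers `local_indices`, `distribute`, and `gather` are hypothetical names, and `numroc` only mirrors the counting logic of ScaLAPACK's NUMROC with source process 0. The actual function in the linked pull request may be organized differently.

```python
import numpy as np

def numroc(n, nb, iproc, nprocs):
    """Number of rows (or columns) owned by process `iproc`.

    Same counting logic as ScaLAPACK's NUMROC with source process 0.
    """
    nblocks = n // nb
    num = (nblocks // nprocs) * nb
    extra = nblocks % nprocs
    if iproc < extra:
        num += nb
    elif iproc == extra:
        num += n % nb
    return num

def local_indices(n, nb, iproc, nprocs):
    """Global indices owned by `iproc` in a 1D block-cyclic layout."""
    idx = np.arange(n)
    return idx[(idx // nb) % nprocs == iproc]

def distribute(M, nb, nprow, npcol):
    """Split M into the local pieces each grid process would hold."""
    local = {}
    for pr in range(nprow):
        rows = local_indices(M.shape[0], nb, pr, nprow)
        for pc in range(npcol):
            cols = local_indices(M.shape[1], nb, pc, npcol)
            local[(pr, pc)] = M[np.ix_(rows, cols)]
    return local

def gather(local, shape, nb, nprow, npcol):
    """Reassemble the global matrix from the local pieces."""
    M = np.empty(shape, dtype=next(iter(local.values())).dtype)
    for (pr, pc), block in local.items():
        rows = local_indices(shape[0], nb, pr, nprow)
        cols = local_indices(shape[1], nb, pc, npcol)
        M[np.ix_(rows, cols)] = block
    return M

# Example: a 100 x 100 matrix, block size 32, on a 2 x 2 grid.
ndim, nb, nprow, npcol = 100, 32, 2, 2
H = np.arange(ndim * ndim, dtype=float).reshape(ndim, ndim)
local = distribute(H, nb, nprow, npcol)
for key in sorted(local):
    print(key, local[key].shape)
restored = gather(local, H.shape, nb, nprow, npcol)
```

Each process ends up with a dense local matrix whose shape is given by `numroc`, which is the form that parallel dense eigensolvers of this family expect as input.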

Task list only for developers

  • Notice possible changes of behavior
  • Explain the changes of codes in core modules of ESolver, HSolver, ElecState, Hamilt, Operator or Psi

Notice Possible Changes of Behavior (Reminder only for developers)

No response

Notice any changes of core modules (Reminder only for developers)

No response

Notice Possible Changes of Core Modules (Reminder only for developers)

No response

Additional Context

No response

Task list for Issue attackers (only for developers)

  • Review and understand the proposed feature and its importance.
  • Research on the existing solutions and relevant research articles/resources.
  • Discuss with the team to evaluate the feasibility of implementing the feature.
  • Create a design document outlining the proposed solution and implementation details.
  • Get feedback from the team on the design document.
  • Develop the feature following the agreed design.
  • Write unit tests and integration tests for the feature.
  • Update the documentation to include the new feature.
  • Perform code review and address any issues.
  • Merge the feature into the main branch.
  • Monitor for any issues or bugs reported by users after the feature is released.
  • Address any issues or bugs reported by users and continuously improve the feature.
@pxlxingliang pxlxingliang self-assigned this Nov 14, 2024
@mohanchen mohanchen added the Diago Issues related to diagonalizaiton methods label Nov 14, 2024
@pxlxingliang
Collaborator Author

pxlxingliang commented Nov 21, 2024

I have conducted a comparative analysis of the computational efficiency of solving generalized eigenvalue problems using ELPA, ScaLAPACK, and LAPACK for matrices of varying dimensions. The results indicate that for small matrices, specifically those with a bandwidth of less than 100, LAPACK demonstrates superior efficiency. However, as the matrix size increases, the efficiency of ELPA and ScaLAPACK becomes more pronounced.

Furthermore, the block size is a significant factor affecting efficiency. For ELPA, optimal performance is achieved with block sizes of either 16 or 32.

Speedup of ELPA/ScaLAPACK relative to LAPACK for matrices of varying dimensions, with different numbers of cores and block sizes.
Each row corresponds to a number of parallel cores, and each column to a block size. The two values in each cell are the speedups of ELPA and ScaLAPACK, respectively, relative to LAPACK. A "--" marks a configuration with no reported value.
For each case, 10 random H/S matrices are generated and solved 10 times.
The test code: https://github.com/deepmodeling/abacus-develop/pull/5549/files#diff-4cfdb3bd4f00aee2894decd88dc0691e059bb6810b1888b32a8bd3c6e48b78f2R326

# ndim=64, nband=50
            1          4         16         20         32         50         64
4   0.56/0.17  0.91/0.39  1.00/0.58  0.95/0.60  1.00/0.71         --         --
8   0.53/0.15  0.82/0.35  0.94/0.55         --         --         --         --
16  0.54/0.15  0.77/0.33  0.85/0.51         --         --         --         --

# ndim=100, nband=50
            1          4         16         20         32         50         64
4   0.52/0.14  0.89/0.37  1.00/0.55  0.97/0.57  0.92/0.60  0.91/0.71         --
8   0.52/0.13  0.83/0.34  0.94/0.54  0.93/0.58         --         --         --
16  0.53/0.12  0.82/0.33  0.89/0.48  0.87/0.53         --         --         --

# ndim=100, nband=80
            1          4         16         20         32         50         64
4   0.71/0.19  1.21/0.49  1.36/0.72  1.35/0.78  1.27/0.80  1.34/0.94         --
8   0.73/0.18  1.17/0.47  1.29/0.74  1.31/0.79  1.32/0.81         --         --
16  0.74/0.17  1.12/0.45  1.18/0.67  1.22/0.74  1.23/0.77         --         --

# ndim=200, nband=50
            1          4         16         20         32         50         64
4   0.43/0.09  0.92/0.35  1.02/0.58  1.08/0.60  1.00/0.64  0.95/0.68  0.81/0.68
8   0.54/0.10  1.01/0.34  1.08/0.56  1.10/0.61  1.12/0.65  0.95/0.69         --
16  0.62/0.10  1.03/0.33  1.15/0.55  1.13/0.55  1.08/0.57  0.99/0.68         --

# ndim=200, nband=100
            1          4         16         20         32         50         64
4   0.66/0.13  1.31/0.46  1.46/0.75  1.48/0.77  1.42/0.80  1.41/0.86  1.14/0.85
8   0.82/0.11  1.41/0.36  1.62/0.60  1.54/0.65  1.54/0.66  1.34/0.69         --
16  0.89/0.14  1.50/0.47  1.67/0.73  1.48/0.73  1.52/0.79  1.34/0.88         --

# ndim=200, nband=160
            1          4         16         20         32         50         64
4   0.92/0.13  1.77/0.43  1.99/0.68  2.08/0.71  1.93/0.73  1.85/0.80  1.54/0.76
8   1.17/0.23  2.06/0.71  2.34/1.16  2.26/1.23  2.17/1.26  2.01/1.34         --
16  1.27/0.22  2.08/0.70  2.27/1.06  2.20/1.12  2.02/1.13  1.94/1.28         --

# ndim=300, nband=240
            1          4         16         20         32         50         64
4   1.16/0.25  2.22/0.88  2.47/1.39  2.37/1.43  2.46/1.47  2.24/1.51  2.12/1.55
8   1.60/0.29  2.82/0.97  3.20/1.57  3.15/1.65  3.17/1.73  2.64/1.76  2.43/1.78
16  1.84/0.16  3.14/0.56  3.46/0.87  3.31/0.89  3.28/0.94  2.60/0.97  2.55/1.05

# ndim=400, nband=320
            1          4         16         20         32         50         64
4   1.39/0.29  2.63/1.08  2.94/1.73  2.88/1.76  2.93/1.81  2.71/1.81  2.40/1.78
8   2.12/0.25  3.55/0.82  4.00/1.33  3.97/1.38  3.96/1.46  3.60/1.46  3.01/1.42
16  2.52/0.18  4.51/0.68  4.97/1.11  4.84/1.14  4.67/1.16  4.12/1.22  3.28/1.21

# ndim=500, nband=400
           16         20         32         50         64        128
4   3.36/1.96  3.28/2.02  3.39/2.07  3.14/2.06  2.96/2.03  2.26/1.95
8   4.89/1.32  4.71/1.34  4.80/1.36  4.34/1.38  4.15/1.39  2.72/1.22
16  6.18/1.39  5.87/1.40  6.07/1.45  5.09/1.48  4.66/1.55  2.89/1.54
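
The measurement procedure described above (random H/S matrices, repeated solves, speedup relative to a reference solver) can be sketched roughly as follows. This is not the linked test code: the helper names `random_hs`, `mean_solve_time`, and `speedup` are hypothetical, the way the 10 matrices and 10 solves are combined is an assumption, and the stand-in `reference_solver` is a serial NumPy routine rather than LAPACK, ELPA, or ScaLAPACK called directly.

```python
import time
import numpy as np

def random_hs(ndim, rng):
    """Random symmetric H and symmetric positive-definite S (assumed test input)."""
    A = rng.standard_normal((ndim, ndim))
    B = rng.standard_normal((ndim, ndim))
    H = (A + A.T) / 2
    S = B @ B.T / ndim + np.eye(ndim)
    return H, S

def reference_solver(H, S):
    """Serial stand-in solver: Cholesky reduction plus dense eigensolve."""
    L = np.linalg.cholesky(S)
    Linv = np.linalg.inv(L)
    return np.linalg.eigh(Linv @ H @ Linv.T)[0]

def mean_solve_time(solver, ndim, n_matrices=10, n_repeats=10, seed=0):
    """Average wall time per solve over several matrices and repeats.

    Assumption: each of the `n_matrices` random pairs is solved
    `n_repeats` times and all solves are averaged together.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_matrices):
        H, S = random_hs(ndim, rng)
        start = time.perf_counter()
        for _ in range(n_repeats):
            solver(H, S)
        total += time.perf_counter() - start
    return total / (n_matrices * n_repeats)

def speedup(t_reference, t_candidate):
    """Values above 1 mean the candidate is faster than the reference."""
    return t_reference / t_candidate

t_ref = mean_solve_time(reference_solver, ndim=64, n_matrices=2, n_repeats=2)
print(f"mean reference time: {t_ref * 1e3:.3f} ms")
# A parallel solver wrapped with the same (H, S) signature would be timed
# the same way, and its speedup reported as speedup(t_ref, t_parallel).
```

With this convention, a table entry such as 0.56 means the parallel solver took longer than the reference, while 6.18 means it was roughly six times faster.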

Below are the ELPA/ScaLAPACK solve times, in ms, on larger matrices for different numbers of cores and block sizes.

# ndim=600, nband=500
               16             32             64            128
4   357.33/595.67  348.67/570.67  397.00/579.33  482.00/580.33
8   251.67/854.00  239.67/805.33  274.00/804.33  394.33/868.33
16  193.00/818.33  181.67/775.33  233.67/738.67  367.33/641.67

# ndim=800, nband=600
                16              32              64             128
4   667.00/1188.67  651.33/1125.67  731.00/1148.00  962.33/1216.00
8   447.33/1556.67  436.33/1481.33  516.00/1516.00  714.33/1639.67
16  337.67/1461.33  325.33/1394.00  394.33/1374.00  642.00/1404.00

# ndim=1000, nband=800
                 16               32               64              128
4   1150.00/2295.33  1163.67/2240.00  1286.67/2278.33  1607.33/2336.33
8    770.33/2767.00   763.67/2686.00   857.67/2779.67  1098.00/2957.33
16   544.67/2559.00   542.33/2474.00   612.00/2428.00   928.33/2387.33

# ndim=1200, nband=1000
                 16               32               64              128
4   1878.33/3905.33  1853.00/3731.33  2052.33/3772.33  2542.33/3962.00
8   1203.00/4494.00  1171.33/4352.33  1296.00/4452.33  1625.67/4744.67
16   831.67/4086.67   818.67/3938.67   923.67/3925.00   923.67/3925.00
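
As a worked reading of these timings, the small sketch below parses one row of the last table (ndim=1200, nband=1000, 16 cores) and computes how much faster ELPA is than ScaLAPACK at each block size. The helper `parse_row` is a hypothetical name introduced only for this illustration; the numbers are copied from the row as posted.

```python
def parse_row(line):
    """Parse 'cores  elpa/scalapack ...' into (cores, [(elpa_ms, scalapack_ms), ...])."""
    fields = line.split()
    cores = int(fields[0])
    pairs = [tuple(float(v) for v in cell.split("/")) for cell in fields[1:]]
    return cores, pairs

block_sizes = [16, 32, 64, 128]
row = "16   831.67/4086.67   818.67/3938.67   923.67/3925.00   923.67/3925.00"

cores, pairs = parse_row(row)

# Ratio of ScaLAPACK time to ELPA time for each block size.
ratios = {nb: scal / elpa for nb, (elpa, scal) in zip(block_sizes, pairs)}

# Block size with the lowest ELPA time in this row.
best_nb = min(zip(block_sizes, pairs), key=lambda item: item[1][0])[0]

for nb in block_sizes:
    print(f"cores={cores} block={nb}: ScaLAPACK/ELPA = {ratios[nb]:.2f}")
print("fastest ELPA block size in this row:", best_nb)
```

For this row the lowest ELPA time is at block size 32, where ELPA is a little under five times faster than ScaLAPACK.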
