OutOfMemoryError #25

Closed
newplay opened this issue Jan 27, 2023 · 9 comments
@newplay

newplay commented Jan 27, 2023

Hi, when I try to run step 5 with task = [5] and calculate only one k-point (the 1st), I use the inference parameters below:

############################################
[basic]
OLP_dir = /home/zjlin/tbg_Deep_ex/work_dir/olp/TBG_1.05/
work_dir = /home/zjlin/tbg_Deep_ex/work_dir/inference/TBG_1.05_mpi/
structure_file_name = POSCAR
interface = openmx
task = [5]
sparse_calc_config = /home/zjlin/tbg_Deep_ex/work_dir/inference/TBG_1.05_mpi/band_1.json
trained_model_dir = /home/zjlin/tbg_Deep_ex/work_dir/trained_model/2022-12-21_21-30-31
restore_blocks_py = False

[interpreter]
julia_interpreter = /home/zjlin/julia-1.5.4/bin/julia

[graph]
radius = 9.0
create_from_DFT = True
#############################################

and band.json:

#############################################

{
"calc_job": "band",
"which_k": 1,
"fermi_level": -3.8624886706842148,
"lowest_band": -0.3,
"max_iter": 300,
"num_band": 125,
"k_data": ["10 0.000 0.000 0.000 0.500 0.000 0.000 G M ","10 0.500 0.000 0.000 0.333333333 0.333333333 0.000 M K","10 0.333333333 0.333333333 0.000 0.000 0.000 0.000 K G"]
}

##############################################

But I got an error:

##############################################

[ Info: read h5
[ Info: construct Hamiltonian and overlap matrix in the real space
ERROR: LoadError: OutOfMemoryError()
Stacktrace:
[1] Array at ./boot.jl:424 [inlined]
[2] Array at ./boot.jl:432 [inlined]
[3] zeros at ./array.jl:525 [inlined]
[4] zeros(::Type{Complex{Float64}}, ::Int64, ::Int64) at ./array.jl:521
[5] top-level scope at /home/zjlin/anaconda3/envs/pytorch/lib/python3.9/site-packages/deeph/inference/dense_calc.jl:121
[6] include(::Function, ::Module, ::String) at ./Base.jl:380
[7] include(::Module, ::String) at ./Base.jl:368
[8] exec_options(::Base.JLOptions) at ./client.jl:296
[9] _start() at ./client.jl:506
in expression starting at /home/zjlin/anaconda3/envs/pytorch/lib/python3.9/site-packages/deeph/inference/dense_calc.jl:105
Traceback (most recent call last):
  File "/home/zjlin/anaconda3/envs/pytorch/bin/deeph-inference", line 8, in <module>
    sys.exit(main())
  File "/home/zjlin/anaconda3/envs/pytorch/lib/python3.9/site-packages/deeph/scripts/inference.py", line 131, in main
    assert capture_output.returncode == 0
AssertionError

###############################################
Here's my question:
When I calculate TBG with a 1.05° twist angle, there are 11908 atoms. How much memory do we need to prepare?
Best regards

@mzjb
Owner

mzjb commented Jan 27, 2023

By default, DeepH-pack uses a dense matrix to compute the eigenvalues. One should use a sparse matrix to perform the calculation for large-scale materials like TBG at a 1.05° twist angle. Please set

[basic]
dense_calc = False

in your inference parameters.
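Applied to the configuration from the original post, the updated [basic] section would look like this (a sketch; only the dense_calc line is new, the other keys are unchanged):

```ini
[basic]
OLP_dir = /home/zjlin/tbg_Deep_ex/work_dir/olp/TBG_1.05/
work_dir = /home/zjlin/tbg_Deep_ex/work_dir/inference/TBG_1.05_mpi/
structure_file_name = POSCAR
interface = openmx
task = [5]
sparse_calc_config = /home/zjlin/tbg_Deep_ex/work_dir/inference/TBG_1.05_mpi/band_1.json
trained_model_dir = /home/zjlin/tbg_Deep_ex/work_dir/trained_model/2022-12-21_21-30-31
restore_blocks_py = False
dense_calc = False
```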

By the way, I just updated a parallel version of the sparse calculation script, which is X (X ≈ the number of CPU cores) times faster than the original one. One should update DeepH-pack and install Pardiso.jl and LinearMaps.jl by following the updated README to use it.

One needs about 80 GB of memory to calculate 50 bands of TBG with 11908 atoms.
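For context on why the dense route runs out of memory, here is a rough back-of-the-envelope estimate (a sketch; the 13 orbitals per atom assumes an OpenMX s2p2d1 carbon basis, which may differ from your setup):

```python
# Rough estimate of the dense-matrix memory footprint for TBG at 1.05 deg.
n_atoms = 11908
orbitals_per_atom = 13          # assumption: OpenMX s2p2d1 basis for carbon
n = n_atoms * orbitals_per_atom # Hamiltonian dimension

bytes_per_entry = 16            # one Complex{Float64} entry
gib_per_matrix = n**2 * bytes_per_entry / 2**30

print(f"matrix dimension: {n}")
print(f"one dense matrix: {gib_per_matrix:.0f} GiB")  # H and S each need this much
```

Under these assumptions a single dense matrix is already several hundred GiB, which explains the OutOfMemoryError from `zeros(Complex{Float64}, n, n)`; the sparse path avoids materializing it.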

@newplay
Author

newplay commented Jan 27, 2023

@mzjb Thank you for your response, I will try as soon as possible.

@newplay
Author

newplay commented Jan 28, 2023

> By the way, I just updated a parallel version of the sparse calculation script, which is X (X ≈ the number of CPU cores) times faster than the original one. One should update DeepH-pack and install Pardiso.jl and LinearMaps.jl by following the updated README to use it.

Hello, I have a question about the parallel version of the sparse calculation: How can I utilize all of the CPU cores (such as 64 cores)? Are there any parameters in the .ini file that I can adjust to achieve this?

@mzjb
Owner

mzjb commented Jan 28, 2023

> How can I utilize all of the CPU cores (such as 64 cores)? Are there any parameters in the .ini file that I can adjust to achieve this?

Use

set_nprocs!(ps, 64)

after line 45 of this file to set the number of threads to 64. I found that the default value is already the number of CPU cores when I use Intel oneAPI MKL (https://github.com/JuliaSparse/Pardiso.jl#mkl-pardiso-1).

@newplay
Author

newplay commented Feb 3, 2023

Hi, I have found that Julia 1.5.4 does not support Pardiso.jl version 0.5.4; the function fix_iparm! will not be executed. This issue was resolved by updating Julia to version 1.8.5, but I cannot confirm whether the program you wrote supports the syntax of 1.8.5. Please take note of this issue.

@mzjb
Owner

mzjb commented Feb 3, 2023

Thank you for your reminder. I am actually using Julia 1.6.6, and I forgot to update the README.


@newplay
Author

newplay commented Feb 21, 2023

Hi, after changing the keyword to

[basic]
dense_calc = False

I have a new question: the resulting band structure contains numerous sawtooth-shaped bands, which is evidently incorrect. I suspect that the Hamiltonian matrix may not be as sparse as I originally thought, or perhaps my radius is set too large. Specifically, my radius is currently set to 9.
Can you give me some advice?

@aaaashanghai
Contributor

aaaashanghai commented Feb 21, 2023

> The resulting band structure appears to contain numerous sawtooth-shaped bands, which is evidently incorrect. I suspect that the Hamiltonian matrix may not be as sparse as I originally thought, or perhaps my radius is set too large.

Hi there, thank you for raising this issue. I'm another developer of DeepH (Zechen Tang) and am responding to your question.
I believe there is no major error in your calculation. The sawtooth-shaped bands come from an incorrect ordering of the bands. In dense_calc mode, all eigenvalues are calculated and therefore indexed correctly. In the sparse diagonalization scheme, however, only a few eigenvalues near the Fermi level are calculated, which are in general not all bands, so these bands are very likely labeled with incorrect indices.
If you're using matplotlib.pyplot.plot to plot the band diagram, the eigenvalues with the same "index" at different k-points are treated as the same band and joined into a line. Incorrect indices then show up as a sawtooth pattern.
Here are two ways to solve this issue:

  1. Use a scatter plot (matplotlib.pyplot.scatter) instead of a line plot. This way you can see the shape of the bands without being bothered by the sawtooth.
  2. For gapped systems, you can manually assign an energy level inside the gap and sort all VBMs and CBMs by their distance to this energy level. This gives a correct "index" for all calculated eigenvalues and results in a correct line plot.

Unfortunately, the second approach would involve some coding that depends on how you organize your band eigenvalue output, and we don't have a general script for this. I would recommend trying the first approach instead.
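The second approach can be sketched roughly as follows (a minimal illustration, not DeepH-pack code: the data layout and the name e_gap are hypothetical, and a gapped system is assumed):

```python
import numpy as np

# Hypothetical sparse-solver output: one unordered eigenvalue array per k-point.
rng = np.random.default_rng(0)
true_bands = np.array([-1.2, -0.4, 0.5, 1.1])
eigenvalues = [rng.permutation(true_bands) for _ in range(3)]

e_gap = 0.0  # an energy chosen manually inside the gap

def reindex(ev, e_gap):
    """Sort one k-point's eigenvalues so that index i refers to the same
    band at every k-point: valence states (below e_gap) first, ordered
    upward toward the gap, then conduction states ordered away from it."""
    ev = np.asarray(ev)
    vbm = np.sort(ev[ev < e_gap])   # ..., VBM-1, VBM
    cbm = np.sort(ev[ev >= e_gap])  # CBM, CBM+1, ...
    return np.concatenate([vbm, cbm])

bands = np.array([reindex(ev, e_gap) for ev in eigenvalues])
# bands[:, i] is now a single band and can be drawn with plt.plot.
print(bands)
```

This only works if every k-point yields the same number of states on each side of e_gap; otherwise the scatter-plot approach is the safer choice.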
We sincerely appreciate you bringing up this issue. If you have any further questions, we would be more than happy to provide additional support.

@newplay
Author

newplay commented Feb 21, 2023

@aaaashanghai Thank you very much for your response, it completely dispelled my doubts!

@mzjb mzjb closed this as completed Mar 16, 2023