-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
merge gpu branch onto main #21
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only real concern is with all of the syncthreads, but maybe they're all needed? When I get time, I'll try to run everything on NCAR's supercomputers
! Free the interface and the solver | ||
deallocate(camp_interface) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this cause a crash if it's left in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At some point it produced a crash, but seems now is not the case. I will leave it as before.
! Free the interface and the solver | ||
deallocate(camp_interface) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above, does this cause a crash?
camp_mpi_pack_size_real_array(this%init_state, l_comm) | ||
camp_mpi_pack_size_real_array(this%init_state_cell, l_comm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
above both init_state
and init_state_cell
were initialized. Why wouldn't both be sent over mpi? Perhaps I just need to stare at this PR some more. I guess I don't understand the difference between the two yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because both arrays are initialized to 1.0 at line 800:
! Set the activity coefficients to 1.0 as default
do i_name = 1, size(unique_names)
i_state_elem = rep%spec_state_id(unique_names(i_name)%string)
this%init_state(i_state_elem + i_cell * this%size_state_per_cell) = &
real(1.0d0, kind=dp)
this%init_state_cell(i_state_elem) = &
this%init_state(i_state_elem + i_cell * this%size_state_per_cell)
end do
I created this "init_state_cell" to accelerate MPI communications, since I only send a cell instead of all the cells.
I believe "init_state" can be simplified quite well, but for the moment this patch is enough for me.
call camp_mpi_unpack_real_array(buffer, pos, this%init_state_cell, l_comm) | ||
|
||
allocate(this%init_state(this%size_state_per_cell * this%n_cells)) | ||
do i_cell = 0, this%n_cells - 1 | ||
do i_state_elem = 1, this%size_state_per_cell | ||
this%init_state(i_state_elem + i_cell * this%size_state_per_cell)=& | ||
this%init_state_cell(i_state_elem) | ||
end do | ||
end do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this answer my question. Does this imply that init_state
is not initialized before the broadcast?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not always. For example, for "camp/data/CAMP_v1_paper/camp_monarch_interface" it is only initialized for MPI rank=0. Then, most of the variables are sent to the rest of the processes. I detect only 1 variable only present in rank=0, which is unique_names (i.e. the names of the species), but it may be more.
Summarizing, "init_state" is initialized at rank=0, then sent to the other ranks
src/camp_solver.c
Outdated
//if(i_cell==0){ | ||
//print_double(md->grid_cell_env,CAMP_NUM_ENV_PARAM_,"env689"); | ||
//print_double(md->grid_cell_state,n_state_var,"state688"); | ||
//double *yp = N_VGetArrayPointer(sd->y); | ||
//print_double(yp,md->n_per_cell_dep_var,"y660"); | ||
//} | ||
//print_double(md->grid_cell_env,CAMP_NUM_ENV_PARAM_,"env689"); | ||
//double *yp = N_VGetArrayPointer(sd->y)+i_cell*md->n_per_cell_dep_var; | ||
//print_double(yp,md->n_per_cell_dep_var,"y660"); | ||
//print_double(md->grid_cell_state,md->n_per_cell_state_var,"state688"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
//if(i_cell==0){ | |
//print_double(md->grid_cell_env,CAMP_NUM_ENV_PARAM_,"env689"); | |
//print_double(md->grid_cell_state,n_state_var,"state688"); | |
//double *yp = N_VGetArrayPointer(sd->y); | |
//print_double(yp,md->n_per_cell_dep_var,"y660"); | |
//} | |
//print_double(md->grid_cell_env,CAMP_NUM_ENV_PARAM_,"env689"); | |
//double *yp = N_VGetArrayPointer(sd->y)+i_cell*md->n_per_cell_dep_var; | |
//print_double(yp,md->n_per_cell_dep_var,"y660"); | |
//print_double(md->grid_cell_state,md->n_per_cell_state_var,"state688"); |
If this is no longer needed, please delete, or wrap in an ifdef to check for a debug print mode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/cuda/cvode_cuda.cu
Outdated
//print_double(atmp1,86,"atmp1766"); | ||
int fflag=cudaDevicef(t_0 + t_j, atmp1, acorr,md,sc,&aux_flag); | ||
//print_double(acorr,86,"acorr721"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
//print_double(atmp1,86,"atmp1766"); | |
int fflag=cudaDevicef(t_0 + t_j, atmp1, acorr,md,sc,&aux_flag); | |
//print_double(acorr,86,"acorr721"); | |
int fflag=cudaDevicef(t_0 + t_j, atmp1, acorr,md,sc,&aux_flag); |
please remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/cuda/cvode_cuda.cu
Outdated
//print_double(md->dtempv,86,"dtempvN_VLinearSum1"); | ||
md->dtempv[i]=sc->cv_rl1*md->dzn[i+md->nrows]+md->cv_acor[i]; | ||
//print_double(md->dtempv,86,"dtempvN_VLinearSum2"); | ||
md->dtempv[i]=sc->cv_gamma*md->dftemp[i]-md->dtempv[i]; | ||
//print_double(md->dtempv,86,"dtempvcv_lsolve1"); | ||
solveBcgCudaDeviceCVODE(md, sc); | ||
__syncthreads(); | ||
#ifdef CAMP_PROFILE_DEVICE_FUNCTIONS | ||
if(threadIdx.x==0) sc->dtBCG += ((double)(int)(clock() - start))/(clock_khz*1000); | ||
#endif | ||
md->dtempv[i] = md->dx[i]; | ||
//print_double(md->dtempv,86,"dtempvcv_lsolve2"); | ||
__syncthreads(); | ||
cudaDeviceVWRMS_Norm_2(md->dx, md->dewt, &del, md->n_shr_empty); | ||
md->dftemp[i]=md->dcv_y[i]+md->dtempv[i]; | ||
__syncthreads(); | ||
//print_double(md->dcv_y,86,"dcv_y2994"); | ||
//print_double(md->dftemp,86,"cv_ftemplsolve"); | ||
int guessflag=CudaDeviceguess_helper(0., md->dftemp, | ||
md->dcv_y, md->dtempv, md->dtempv1,md->dtempv2, &aux_flag, md, sc | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
//print_double(md->dtempv,86,"dtempvN_VLinearSum1"); | |
md->dtempv[i]=sc->cv_rl1*md->dzn[i+md->nrows]+md->cv_acor[i]; | |
//print_double(md->dtempv,86,"dtempvN_VLinearSum2"); | |
md->dtempv[i]=sc->cv_gamma*md->dftemp[i]-md->dtempv[i]; | |
//print_double(md->dtempv,86,"dtempvcv_lsolve1"); | |
solveBcgCudaDeviceCVODE(md, sc); | |
__syncthreads(); | |
#ifdef CAMP_PROFILE_DEVICE_FUNCTIONS | |
if(threadIdx.x==0) sc->dtBCG += ((double)(int)(clock() - start))/(clock_khz*1000); | |
#endif | |
md->dtempv[i] = md->dx[i]; | |
//print_double(md->dtempv,86,"dtempvcv_lsolve2"); | |
__syncthreads(); | |
cudaDeviceVWRMS_Norm_2(md->dx, md->dewt, &del, md->n_shr_empty); | |
md->dftemp[i]=md->dcv_y[i]+md->dtempv[i]; | |
__syncthreads(); | |
//print_double(md->dcv_y,86,"dcv_y2994"); | |
//print_double(md->dftemp,86,"cv_ftemplsolve"); | |
int guessflag=CudaDeviceguess_helper(0., md->dftemp, | |
md->dcv_y, md->dtempv, md->dtempv1,md->dtempv2, &aux_flag, md, sc | |
); | |
md->dtempv[i]=sc->cv_rl1*md->dzn[i+md->nrows]+md->cv_acor[i]; | |
md->dtempv[i]=sc->cv_gamma*md->dftemp[i]-md->dtempv[i]; | |
solveBcgCudaDeviceCVODE(md, sc); | |
__syncthreads(); | |
#ifdef CAMP_PROFILE_DEVICE_FUNCTIONS | |
if(threadIdx.x==0) sc->dtBCG += ((double)(int)(clock() - start))/(clock_khz*1000); | |
#endif | |
md->dtempv[i] = md->dx[i]; | |
__syncthreads(); | |
cudaDeviceVWRMS_Norm_2(md->dx, md->dewt, &del, md->n_shr_empty); | |
md->dftemp[i]=md->dcv_y[i]+md->dtempv[i]; | |
__syncthreads(); | |
int guessflag=CudaDeviceguess_helper(0., md->dftemp, | |
md->dcv_y, md->dtempv, md->dtempv1,md->dtempv2, &aux_flag, md, sc | |
); |
please remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/cuda/cvode_cuda.cu
Outdated
//print_double(md->cv_acor,86,"cv_acor1060"); | ||
//print_double(md->dcv_y,86,"dcv_y1060"); | ||
if (m > 0) { | ||
sc->cv_crate = SUNMAX(0.3 * sc->cv_crate, del / delp); | ||
} | ||
dcon = del * SUNMIN(1.0, sc->cv_crate) / md->cv_tq[4+blockIdx.x*(NUM_TESTS + 1)]; | ||
flag_shr2[0]=0; | ||
__syncthreads(); | ||
if (dcon <= 1.) { | ||
//print_double(md->cv_acor,86,"cv_acor1505"); | ||
//print_double(md->dewt,86,"dewt1505"); | ||
cudaDeviceVWRMS_Norm_2(md->cv_acor, md->dewt, &sc->cv_acnrm, md->n_shr_empty); | ||
//print_double(&sc->cv_acnrm,1,"cv_acnrm1151"); | ||
__syncthreads(); | ||
sc->cv_jcur = 0; | ||
__syncthreads(); | ||
return CV_SUCCESS; | ||
} | ||
m++; | ||
if ((m == md->cv_maxcor) || ((m >= 2) && (del > RDIV * delp))) { | ||
if (!(sc->cv_jcur)) { | ||
return TRY_AGAIN; | ||
} else { | ||
return RHSFUNC_RECVR; | ||
} | ||
} | ||
delp = del; | ||
__syncthreads(); | ||
//print_double(md->dcv_y,86,"dcv_y1137"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
//print_double(md->cv_acor,86,"cv_acor1060"); | |
//print_double(md->dcv_y,86,"dcv_y1060"); | |
if (m > 0) { | |
sc->cv_crate = SUNMAX(0.3 * sc->cv_crate, del / delp); | |
} | |
dcon = del * SUNMIN(1.0, sc->cv_crate) / md->cv_tq[4+blockIdx.x*(NUM_TESTS + 1)]; | |
flag_shr2[0]=0; | |
__syncthreads(); | |
if (dcon <= 1.) { | |
//print_double(md->cv_acor,86,"cv_acor1505"); | |
//print_double(md->dewt,86,"dewt1505"); | |
cudaDeviceVWRMS_Norm_2(md->cv_acor, md->dewt, &sc->cv_acnrm, md->n_shr_empty); | |
//print_double(&sc->cv_acnrm,1,"cv_acnrm1151"); | |
__syncthreads(); | |
sc->cv_jcur = 0; | |
__syncthreads(); | |
return CV_SUCCESS; | |
} | |
m++; | |
if ((m == md->cv_maxcor) || ((m >= 2) && (del > RDIV * delp))) { | |
if (!(sc->cv_jcur)) { | |
return TRY_AGAIN; | |
} else { | |
return RHSFUNC_RECVR; | |
} | |
} | |
delp = del; | |
__syncthreads(); | |
//print_double(md->dcv_y,86,"dcv_y1137"); | |
if (m > 0) { | |
sc->cv_crate = SUNMAX(0.3 * sc->cv_crate, del / delp); | |
} | |
dcon = del * SUNMIN(1.0, sc->cv_crate) / md->cv_tq[4+blockIdx.x*(NUM_TESTS + 1)]; | |
flag_shr2[0]=0; | |
__syncthreads(); | |
if (dcon <= 1.) { | |
cudaDeviceVWRMS_Norm_2(md->cv_acor, md->dewt, &sc->cv_acnrm, md->n_shr_empty); | |
__syncthreads(); | |
sc->cv_jcur = 0; | |
__syncthreads(); | |
return CV_SUCCESS; | |
} | |
m++; | |
if ((m == md->cv_maxcor) || ((m >= 2) && (del > RDIV * delp))) { | |
if (!(sc->cv_jcur)) { | |
return TRY_AGAIN; | |
} else { | |
return RHSFUNC_RECVR; | |
} | |
} | |
delp = del; | |
__syncthreads(); |
please remove the prints
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, please remove all of the commented out prints, or wrap them in debug ifdefs.
Also, I feel like some of the __syncthreads
may be superfluous, but I could be very wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was able to remove 3 quarters of ___syncthreads
(from 200 calls to 50)
/** \brief Calculate the time derivative \f$f(t,y)\f$ for only some specific | ||
* types | ||
* | ||
* \param model_data Pointer to the model data | ||
* \param time_deriv TimeDerivative to use to build derivative array | ||
* \param time_step Current model time step (s) | ||
*/ | ||
#ifdef CAMP_USE_SUNDIALS | ||
void rxn_calc_deriv_specific_types(ModelData *model_data, | ||
TimeDerivative time_deriv, | ||
realtype time_step) { | ||
// Get the number of reactions | ||
int n_rxn = model_data->n_rxn; | ||
|
||
// Loop through the reactions advancing the rxn_data pointer each time | ||
for (int i_rxn = 0; i_rxn < n_rxn; i_rxn++) { | ||
// Get pointers to the reaction data | ||
int *rxn_int_data = | ||
&(model_data->rxn_int_data[model_data->rxn_int_indices[i_rxn]]); | ||
double *rxn_float_data = | ||
&(model_data->rxn_float_data[model_data->rxn_float_indices[i_rxn]]); | ||
double *rxn_env_data = | ||
&(model_data->grid_cell_rxn_env_data[model_data->rxn_env_idx[i_rxn]]); | ||
|
||
// Get the reaction type | ||
int rxn_type = *(rxn_int_data++); | ||
|
||
// Call the appropriate function | ||
switch (rxn_type) { | ||
case RXN_HL_PHASE_TRANSFER: | ||
rxn_HL_phase_transfer_calc_deriv_contrib(model_data, time_deriv, | ||
rxn_int_data, rxn_float_data, | ||
rxn_env_data, time_step); | ||
break; | ||
case RXN_SIMPOL_PHASE_TRANSFER: | ||
rxn_SIMPOL_phase_transfer_calc_deriv_contrib( | ||
model_data, time_deriv, rxn_int_data, rxn_float_data, rxn_env_data, | ||
time_step); | ||
break; | ||
} | ||
} | ||
} | ||
#endif | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is not use in any part of the code. It is some old code that I added time ago.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow, this is a lot of work! I was able to build and run the tests, and they pass (cpu only). I didn't look in detail at the cuda code, but the changes to the rest of the code look mostly ok as far as I can tell. I added some comments, but mostly minor things.
data/CAMP_v1_paper/README.md
Outdated
* Dawson, M. L., Guzman, C., Curtis, J. H., Acosta, M., Zhu, S., Dabdub, D., Conley, A., West, M., Riemer, N., and Jorba, O.: Chemistry Across Multiple Phases (CAMP) version 1.0: an integrated multiphase chemistry model, Geosci. Model Dev., 15, 3663–3689, https://doi.org/10.5194/gmd-15-3663-2022, 2022. | ||
* M. Dawson, C. Guzman, J. H. Curtis, M. Acosta, S. Zhu, D. Dabdub, | ||
A. Conley, M. West, N. Riemer, and O. Jorba (2021), | ||
Chemistry Across Multiple Phases (CAMP) version 1.0: An | ||
Integrated multi-phase chemistry model, in preparation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this overwrote the correct reference
@@ -79,8 +90,11 @@ program box_model | |||
camp_state%state_var( idx_O2 ) | |||
end do | |||
|
|||
deallocate( camp_core ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this need to be deleted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried that line in my last commit and is working fine, so It was fixed somewhere in the way. I will push that last commit with the rest of the changes.
I think it was failing because a missing " call camp_mpi_finalize( )" or something like that
@@ -108,7 +108,6 @@ program box_model | |||
camp_state%state_var( idx_O2 ) | |||
end do | |||
|
|||
deallocate( camp_core ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this need to be deleted?
@@ -170,7 +170,6 @@ program box_model | |||
#endif | |||
!! [output] | |||
|
|||
deallocate( camp_core ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this need to be deleted?
doc/references.bib
Outdated
@article{Tie2003, | ||
author = {Tie, Xuexi and Emmons, Louisa and Horowitz, Larry and Brasseur, Guy and Ridley, Brian and Atlas, Elliot and Stround, Craig and Hess, Peter and Klonecki, Andrzej and Madronich, Sasha and Talbot, Robert and Dibb, Jack}, | ||
title = {Effect of sulfate aerosol on tropospheric NOx and ozone budgets: Model simulations and TOPSE evidence}, | ||
journal = {Journal of Geophysical Research: Atmospheres}, | ||
volume = {108}, | ||
number = {D4}, | ||
pages = {}, | ||
keywords = {tropospheric aerosol, NOx, ozone}, | ||
doi = {https://doi.org/10.1029/2001JD001508}, | ||
url = {https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2001JD001508}, | ||
eprint = {https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1029/2001JD001508}, | ||
abstract = {The distributions of NOx and O3 are analyzed during TOPSE (Tropospheric Ozone Production about the Spring Equinox). In this study these data are compared with the calculations of a global chemical/transport model (Model for OZone And Related chemical Tracers (MOZART)). Specifically, the effect that hydrolysis of N2O5 on sulfate aerosols has on tropospheric NOx and O3 budgets is studied. The results show that without this heterogeneous reaction, the model significantly overestimates NOx concentrations at high latitudes of the Northern Hemisphere (NH) in winter and spring in comparison to the observations during TOPSE; with this reaction, modeled NOx concentrations are close to the measured values. This comparison provides evidence that the hydrolysis of N2O5 on sulfate aerosol plays an important role in controlling the tropospheric NOx and O3 budgets. The calculated reduction of NOx attributed to this reaction is 80 to 90\% in winter at high latitudes over North America. Because of the reduction of NOx, O3 concentrations are also decreased. The maximum O3 reduction occurs in spring although the maximum NOx reduction occurs in winter when photochemical O3 production is relatively low. The uncertainties related to uptake coefficient and aerosol loading in the model is analyzed. The analysis indicates that the changes in NOx due to these uncertainties are much smaller than the impact of hydrolysis of N2O5 on sulfate aerosol. The effect that hydrolysis of N2O5 on global NOx and O3 budgets are also assessed by the model. The results suggest that in the Northern Hemisphere, the average NOx budget decreases 50\% due to this reaction in winter and 5\% in summer. The average O3 budget is reduced by 8\% in winter and 6\% in summer. In the Southern Hemisphere (SH), the sulfate aerosol loading is significantly smaller than in the Northern Hemisphere. As a result, sulfate aerosol has little impact on NOx and O3 budgets of the Southern Hemisphere.}, | ||
year = {2003} | ||
} | ||
@article{Wennberg2018, | ||
author = {Wennberg, Paul O. and Bates, Kelvin H. and Crounse, John D. and Dodson, Leah G. and McVay, Renee C. and Mertens, Laura A. and Nguyen, Tran B. and Praske, Eric and Schwantes, Rebecca H. and Smarte, Matthew D. and St Clair, Jason M. and Teng, Alexander P. and Zhang, Xuan and Seinfeld, John H.}, | ||
title = {Gas-Phase Reactions of Isoprene and Its Major Oxidation Products}, | ||
journal = {Chemical Reviews}, | ||
volume = {118}, | ||
number = {7}, | ||
pages = {3337-3390}, | ||
year = {2018}, | ||
doi = {10.1021/acs.chemrev.7b00439}, | ||
note ={PMID: 29522327}, | ||
URL = {https://doi.org/10.1021/acs.chemrev.7b00439}, | ||
eprint = {https://doi.org/10.1021/acs.chemrev.7b00439} | ||
} | ||
@techreport{JPL15, | ||
author = {J. B. Burkholder, S. P. Sander, J. Abbatt, J. R. Barker, R. E. Huie, C. E. Kolb, M. J. Kurylo, V. L. Orkin, D. M. | ||
Wilmouth, and P. H. Wine}, | ||
title = {Chemical Kinetics and Photochemical Data for Use in Atmospheric Studies, Evaluation No. 18 JPL Publication 15-10}, | ||
institution = {Jet Propulsion Laboratory}, | ||
location = {Pasadena}, | ||
year = {2015}, | ||
url = {http://jpldataeval.jpl.nasa.gov} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these are still needed for some of the newer reaction types
!> A new MONARCH interface | ||
type(monarch_interface_t), pointer :: new_obj | ||
!> Path to the PartMC-camp configuration file list | ||
function constructor(camp_config_file, output_file_title, & |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why were the comments removed from this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was easier for me to understand the file. I wanted to just put the minimum and work from there.
!> Interface for the MONACH model and PartMC-camp | ||
module camp_monarch_interface | ||
!> Interface for the MONACH model and CAMP-camp | ||
module camp_monarch_interface_2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why was this renamed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that is not needed, I passed so much time with 2 camp_monarch_interface
that I get used to that and don't even think it can be renamed.
!print*,"MONARCH_conc381",MONARCH_conc(i,j,k,this%map_monarch_id(:)) | ||
!print*,"state_var421",this%camp_state%state_var(:) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove commented out code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
type(camp_monarch_interface_t), intent(inout) :: this | ||
if (associated(this%camp_core)) deallocate(this%camp_core) | ||
|
||
end subroutine finalize | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are these pointer deallocations removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At some point it failed, but it seems to be fixed now, so I will left it as before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the error: Some variables like "init_conc_camp_id" are set only for rank 0, so it fails when using 40 MPI processes. Expect a new commit, also with another fix for the nGPUs set
!> Number of time steps to integrate over | ||
integer, parameter :: NUM_TIME_STEP = 5 | ||
!> Index for water vapor in water_conc() | ||
integer(kind=i_kind), parameter :: NUM_EBI_SPEC = 72 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why were the comments removed from this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as I commented before: It was easier for me to understand the file. I wanted to just put the minimum and work from there.
Only some of them are needed. I didn't remove them because it shouldn't affect performance and are useful for debugging |
Pushed a new commit with the changes from your suggestions. |
Run the GPU version requires using CVODE from the GitHub repository (The one zipped is not updated). We plan to upload a "first_install.sh" to download all dependencies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @cguzman95 - I lost track of this, but it if you've addressed all the comments, I'm fine with merging it in. (Although it looks like some of Kyle's comments still need to be addressed)
Hi @mattldawson, yes, please merge it. I addressed Kyle'sc comments; the pending comments were about removing some prints, which I already did but forgot to comment on. |
Hello,
This pull request is an update of the GPU part, which has around 2 years of development.
There are minor changes on the CPU part related to quality-of-life improvements (save stats like execution time). Please check that it works as usual in your environments.
The GPU part has a lot of changes. Now most of the CVODE algorithm is translated to CUDA. There is a new folder related to compiling GPU (camp/compile/power9), where the files to compile and run are:
You may notice that for GPU I'm using a cvode version from https://github.com/mattldawson/cvode instead of the .tar version, because it has some time counters to compare both CPU and GPU execution times. Also, for the same reason, I'm using MPI to measure those timers, so to use the GPU you should have MPI.
Also, I added a new test to check the GPU. It requires Python and is tested for version Python/3.7.0. I usually develop running a very similar file located at test/monarch/TestMonarch.py.
I will be glad to hear any feedback.
Best Regards,
Christian