-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Producing DESI-Y1 Lya mocks with quickquasars. #578
Conversation
…ich type of raw mock is inputed
…f recieving a preprocessed catalog. Introduction of survey release module to produce Y1 mocks
…asars and implemented notebook to produce preprocessing mock catalogs to pass to quickquasars.
I won't be able to review this until Nov 24, but you can merge without me if others have approved it. I was a bit surprised to see the --raw-mock option in QQ. What does it do? Is it just to create a picca-readable quasar catalog to be used in the raw analyis? Because we don't want to create spectra files in the raw analysis... |
The
These are the reasons of why I say the |
Oh, I understand now. Thanks @HiramHerrera ! I thought these were related to the "raw analysis". I see now that you are just specifying which type of transmission files are being provided. |
Regarding this question:
@sbailey Since this is some sort of a template for the Y1 lya mock production, could we store it in |
bins = ((z - zmin)/(zmax - zmin) * len(zcenters) + 0.5).astype(np.int64) | ||
selection_z = np.random.uniform(size=z.size) < fraction[bins] | ||
|
||
np.random.set_state(rnd_state) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you restore the old random state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used this to restore the quickquasars state to what it would be if we don't downsample to match the redshift distribution i.e not passing the --dn_dzdm
flag. This is probably not needed since in either case the realizations would differ in number of objects and therefore the all the random calls done by the code would be affected. So I think It can be removed if requested.
mask_footprint=desimodel.footprint.is_point_in_desi(tiles,mock['RA'],mock['DEC']) | ||
mock=mock[mask_footprint] | ||
log.info(f"Keeping {sum(mask_footprint)} mock QSOs in footprint TILES") | ||
if release is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to write if release == "iron": ... else: raise NotImplementedError
log.info(f"Downsampling by NPASSES fraction in {release} release") | ||
# TODO: Implement this in desimodel instead of local path | ||
pixmap=Table.read('/global/cfs/cdirs/desi/users/hiramk/desi/quickquasars/sampling_tests/npass_pixmap.fits') | ||
mock_pixels = hp.ang2pix(1024, np.radians(90-mock['DEC']),np.radians(mock['RA']),nest=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you using nside=1024? That is very high resolution
mock_pixels = hp.ang2pix(1024, np.radians(90-mock['DEC']),np.radians(mock['RA']),nest=True) | ||
try: | ||
data_pixels = hp.ang2pix(1024,np.radians(90-self.data['TARGET_DEC']),np.radians(self.data['TARGET_RA']),nest=True) | ||
except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a good use of try-except. I recommend if 'TARGET_DEC
in self.data.dtype.names` or similar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, never use naked except!
mock_pass_counts = np.bincount(mock_passes,minlength=8) | ||
mock['NPASS'] = mock_passes | ||
downsampling=np.divide(data_pass_counts,mock_pass_counts,out=np.zeros(8),where=mock_pass_counts>0) | ||
rand = np.random.uniform(size=len(mock)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to have a seed here or leave a detailed comment why not.
filename=os.path.join(os.path.dirname(desisim.__file__),'data/dn_dzdM_EDR.fits') | ||
zcenters=fitsio.FITS(filename)['Z_CENTERS'][:] | ||
dz = zcenters[1]-zcenters[0] | ||
rmagcenters=fitsio.FITS(filename)['RMAG_CENTERS'][:] | ||
dn_dzdm=fitsio.FITS(filename)['dn_dzdm'][:,:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be written as
with fitsio.FITS(filename) as fts:
zcenters = fts['Z_CENTERS'].read()
rmagcenters = fts['RMAG_CENTERS'].read()
dn_dzdm = fts['dn_dzdm'].read()
dz = zcenters[1] - zcenters[0]
if self.data is None: | ||
raise ValueError("No data catalog was provided") | ||
dz = 0.1 | ||
zbins = np.arange(0,10,dz) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not recommended to use np.arange
with a non-integer step. Better use np.linspace
dz = 0.1 | ||
zbins = np.arange(0,10,dz) | ||
zcenters=0.5*(zbins[1:]+zbins[:-1]) | ||
rmagbins = np.arange(15,25,0.1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. Use np.linspace
for i,z_bin in enumerate(zcenters): | ||
w_z = (self.mockcatalog['Z'] > z_bin-0.5*dz) & (self.mockcatalog['Z'] <= z_bin+0.5*dz) | ||
if np.sum(w_z)==0: continue | ||
rand = np.random.uniform(size=np.sum(w_z)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seed?
mock=self.mockcatalog | ||
try: | ||
data_pixels = hp.ang2pix(1024,np.radians(90-self.data['TARGET_DEC']),np.radians(self.data['TARGET_RA']),nest=True) | ||
except: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bad use of try-except again. Use an if statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, Never use naked except!
random_variable = rv_discrete(values=(np.arange(1,len(pdf)+1),pdf)) | ||
is_pass = self.mockcatalog['NPASS'] == tile_pass | ||
exptime_mock[is_lya_mock&is_pass]=1000*random_variable.rvs(size=np.sum(is_pass&is_lya_mock)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you generating random variables? What is the seed?
if 'RMAG' in self.data.colnames: | ||
dn_dzdm=np.histogram2d(self.data['Z'],self.data['RMAG'],bins=(zbins,rmagbins))[0] | ||
elif 'FLUX_R' in self.data.colnames: | ||
dn_dzdm=np.histogram2d(self.data['Z'],22.5-2.5*np.log10(self.data['FLUX_R']),bins=(zbins,rmagbins))[0] | ||
else: | ||
raise ValueError("No magnitude information in data catalog") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Write
if 'RMAG' in self.data.colnames:
data_rmag = self.data
elif 'FLUX_R' in self.data.colnames:
data_rmag = 22.5-2.5*np.log10(self.data['FLUX_R'])
else:
raise KeyError("No magnitude information in data catalog")
dn_dzdm=np.histogram2d(self.data['Z'], data_rmag, bins=(zbins,rmagbins))[0]
dist = Table.read(os.path.join(os.path.dirname(desisim.__file__),'data/redshift_dist_chaussidon2022.ecsv')) | ||
dz=dist['z'][1]-dist['z'][0] | ||
factor=0.1/dz # Account for redshift bin width difference. | ||
zbins = np.arange(0,10.1,0.1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use np.linspace instead
w_z = (zcenters-0.5*dz > zmin) & (zcenters+0.5*dz <= zmax) | ||
dndz=dndz[w_z] | ||
zcenters=zcenters[w_z] | ||
zbins = np.arange(zmin,zmax+dz,dz,) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.linspace
dn_dzdr=fitsio.FITS(filename)[3][:,:] | ||
dndz=np.sum(dn_dzdr,axis=1) | ||
|
||
zcenters=fitsio.FITS(filename)[1][:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use a with statement. Read zcenters first.
with fitsio.FITS(filenames) as fts:
zcenters = fts[1].read()
dn_dzdt = fts[3].read()
try: | ||
pixels = hp.ang2pix(nside, np.radians(90-catalog['DEC']),np.radians(catalog['RA']),nest=True) | ||
except KeyError: | ||
pixels = hp.ang2pix(nside, np.radians(90-catalog['TARGET_DEC']),np.radians(catalog['TARGET_RA']),nest=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bad use of try-except again. Use an if statement
self.mockcatalog['TARGETID']=self.mockcatalog['MOCKID'] | ||
self.mockcatalog['Z']=self.mockcatalog['Z_QSO_RSD'] | ||
self.invert=invert | ||
np.random.seed(seed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah this is the seed. I recommend creating a member random engine rng_engine = np.random.default_rng(seed)
and calling it to generate the random numbers you need. default_rng
is recommended since numpy 1.17
@@ -433,7 +489,8 @@ def simulate_one_healpix(ifilename,args,model,obsconditions,decam_and_wise_filte | |||
DZ_FOG = DZ_FOG[indices] | |||
nqso = args.nmax | |||
|
|||
if args.dn_dzdm is not None: | |||
if args.dn_dzdm: | |||
# TODO: Deprecate this option. It will be faster to do this on preprocessing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can add warnings.warn("dn_dzdm will be deprecated. It is faster to do this in preprocessing", DeprecationWarning)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay. I went through the code and left comments for best practices and recommended functions. They are minor changes but if unfixed they can introduce critical bugs. For example, a naked except can be really bad.
Also, why is the npass file so large?
I am merging as is because this is the version used for KP6 mocks. We will then tag. BUT the comments can still be addressed by reopening this PR after we merged and tagged. |
The post-merge version with a docstring fix has been tagged as desisim 0.38.0. I also installed this at NERSC. |
This PR introduces the necessary tools to produce DESI-Y1 Lya mocks.
Here are the Major changes listed:
quickquasars
.quickquasars
.--raw-mock [saclay, london, ohio]
allows to generate Y5 mocks directly from quickquasars for the specified raw mock team.--dn_dzdm
was reworked to boolean flag, used when redshift and magnitude distributions are to be downsampled byquickquasars
directly.--year1-throughput
adds the Y1 throughput model to mock spectra including the 440 nm dip observed during Y1--from-catalog
allows to read from an input pre-processed mock catalog generated by the SurveyRelease instance.--metal-strengths
allows to change the default metal strengths used for the--metals
flag. This flag should be the same length and in the same order as the--metals
flag in order to work. Thelya_spectra.py
script was modified to make this flag work.An example quickquasars command to run DESI-Y1 mocks would be:
Important notes:
--year1-throughput
flag works if a file calleddesiY1.yaml
file exists withinspecsim/data/config
pointing to thespecpsf/psf-quicksim.fits
and filesthru-[brz]_y1measured.fits
which include the dip feature at 440 nm. We can take advantage of Simulate DESI spectra with the same output wavelength binning as the pipeline specsim#125 to include this file. However, if I'm correct, that PR changes the defaults of thedesi.yaml
file to those of DESI-Y1. I think it is important to keep thedesi.yaml
pointing to a throughput model without the 440 nm dip (default DESIMODEL) as we might want mocks without this feature./global/cfs/cdirs/desi/users/hiramk/desi/quickquasars/sampling_tests/npass_pixmap.fits
this file is large enough (1.1GB) to not be included in this PR. The code should work as long as the current path (my local cfdirs) is not modified. However, we should probably copy this file to somewhere else. Is there a recommended place to store it? MaybeDESIMODEL
?--dn_dzdm
and--raw-mock
flags as the SurveyRelease instance already does this and its probably more efficient to downsample and assign magnitudes in pre-processing than within quickquasars. I kept them since @andreicuceu DESI-Y5 mocks were produced using these flags.