Random trigger events missing from the skim for 73070 #342
Comments
Similar CSV files for the other run periods' REST files are linked below. There's a script to extract a column here; use it like this:
Something must have been wrong during the production or copying of this file. The converted random trigger file produced during the REST production is nearly 500x larger:
I did some digging into MCW's database, thanks to Thomas for hints. There are many runs for that run period with very few random trigger events. They're mostly in run numbers that are close together, plus a few now and then later on. Comparing these with the number of events in the trigger count histo from the REST monitoring files, there are far fewer than there should be.
Here are counts from the trigger monitoring histogram (filled all the time) and then the random trigger counts (saved only when the beam was on) for a selection of runs. Ignore the *.
73125 135744 752 *
I took a look at the earlier run periods to see what is normal. This is for a chunk of fall 2018. The columns are run, histogram_counts, beam_frac, histogram_counts x beam_frac, random_triggers_in_file
I've attached a file for each run period showing only the runs where the number of random triggers in the file is less than 0.7 x the expected number. The expected number is the number in the monitoring histogram x beam_current / beam_on_current. If beam_on_current is not in RCDB, it is assumed to be 1.5 x beam_current. The columns in the file are
spring17.txt
There are a few runs in the earlier run periods which have far fewer randoms than expected, and many from spring 2020. Also, quite a few of the spring 20 runs have beam_current > beam_on_current, which seems odd/wrong. If anyone else wants to have a play with this, the files that I used are in /work/halld/njarvis/randoms. The runs with missing beam_on_current are listed in this file:
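The selection rule described in this comment can be sketched roughly as follows. This is only an illustration, not the script that was actually used: the function names are made up here, and the beam current of 100 in the usage lines is a hypothetical value (with the 1.5 x fallback it cancels out of the result anyway). The histogram and random trigger counts are the ones quoted above for run 73125.

```python
from typing import Optional

DEFAULT_ON_FACTOR = 1.5  # fallback from the comment: beam_on_current = 1.5 x beam_current
THRESHOLD = 0.7          # flag runs with fewer than 0.7 x the expected randoms


def expected_randoms(histogram_counts: float,
                     beam_current: float,
                     beam_on_current: Optional[float] = None) -> float:
    """Expected randoms = histogram counts x beam_current / beam_on_current."""
    if beam_on_current is None:
        # beam_on_current missing from RCDB: use the assumed fallback
        beam_on_current = DEFAULT_ON_FACTOR * beam_current
    if beam_on_current <= 0:
        # no usable beam-on current, so no randoms are expected
        return 0.0
    return histogram_counts * beam_current / beam_on_current


def is_short(randoms_in_file: int,
             histogram_counts: float,
             beam_current: float,
             beam_on_current: Optional[float] = None) -> bool:
    """True if the file holds fewer than THRESHOLD x the expected randoms."""
    expected = expected_randoms(histogram_counts, beam_current, beam_on_current)
    return randoms_in_file < THRESHOLD * expected


# Run 73125: 135744 histogram counts, 752 randoms in the file.
# beam_current=100.0 is hypothetical; beam_on_current is treated as missing.
print(expected_randoms(135744, 100.0))
print(is_short(752, 135744, 100.0))
```

With the fallback, the expected count is simply the histogram count divided by 1.5, so a run is only flagged when its file holds less than about 47% of the histogram count.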
Thanks a lot for looking into the details of this, @nsjarvis! I looked into the 2019-11 runs, and indeed many of them have an issue with the beam fiducial table. I'll look more closely at this, figure out why it's happening, and put in some better checks to catch these problems earlier. We'll probably want to recreate the maps for these runs. As a reminder, the 2019-11 run period was the first one to implement this new beam fiducial method, so it looks like a few runs slipped past our QA as we worked out the bugs of this new technique.
For the earlier run periods, I will fix the runs with bad RCDB settings or other obvious issues. We should discuss what to do with the other runs. My proposal would be to set a limit below which we should try to recover more random triggers, for example 10k. For example, if we expect a run to have 500k random triggers when the beam is on, but only have 250k triggers for folding into simulations, we could clearly improve the situation, but it doesn't seem worth the non-trivial effort to do so, IMO.
For later run periods, we will want to double check things, but I think that the changes in our other procedures (i.e. properly calculating most of the info during REST production) lead to more consistent results than going back to recalculate these values.
One quick update - I was originally able to reproduce this issue of bad fiducial tables, but after adding and removing some debugging statements, the problem no longer shows up with my build. So perhaps we are victims of an over-optimizing compiler. I'll keep checking this and work on reproducing the random files.
I've fixed the RCDB entries for 2017-01.
Great, thanks.
@nsjarvis - I think that for some of the run periods (earlier than 2019-11), you were looking at the wrong set of random files. For example, 41488 is listed as having zero events, but if I look at all of the entries for run 41488:
MariaDB [gluex_mc]> select Run_Number,Tag,Path,Num_Events from Randoms where Run_Number=41488;;
+------------+-----------------------+----------------------------------------------------------------------------------------+------------+
| Run_Number | Tag                   | Path                                                                                   | Num_Events |
+------------+-----------------------+----------------------------------------------------------------------------------------+------------+
|      41488 | recon-2018_01-ver02   | /w/halld-scifs17exp/random_triggers/recon-2018_01-ver02/run041488_random.hddm          |          0 |
|      41488 | recon-2018_01-ver02   | /w/osgpool-sciwork18/halld/random_triggers/recon-2018_01-ver02/run041488_random.hddm   |    1183612 |
|      41488 | recon-2018_01-ver02   | /osgpool/halld/random_triggers/recon-2018_01-ver02/run041488_random.hddm               |    1183612 |
|      41488 | recon-2018_01-ver02.2 | /w/osgpool-sciwork18/halld/random_triggers/recon-2018_01-ver02.2/run041488_random.hddm |     881884 |
+------------+-----------------------+----------------------------------------------------------------------------------------+------------+
Only the first has zero events, and it corresponds to some ancient file. I don't know how MCWrapper chooses which file to use when there are multiple options... but in any case, for 2018-01, the correct tag to look at is "recon-2018_01-ver02.2".
Similarly, for 2018-08, all of the "missing" runs exist under the recon-2018_08-ver02.2 tag. They appear to be missing since they have a tag of "None" (whoops).
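To make the ambiguity concrete, here is a small sketch that looks up the event count for one run and one tag from rows like those above, and refuses to answer when entries with the same tag disagree. It is not how MCWrapper selects a file (that is unknown here); the `RandomsRow` type and the `event_count` helper are hypothetical names, and the rows are copied from the query output for run 41488.

```python
from collections import namedtuple
from typing import Iterable, Optional

# Hypothetical in-memory stand-in for rows of the Randoms table.
RandomsRow = namedtuple("RandomsRow", "run tag path num_events")

ROWS = [
    RandomsRow(41488, "recon-2018_01-ver02",
               "/w/halld-scifs17exp/random_triggers/recon-2018_01-ver02/run041488_random.hddm", 0),
    RandomsRow(41488, "recon-2018_01-ver02",
               "/w/osgpool-sciwork18/halld/random_triggers/recon-2018_01-ver02/run041488_random.hddm", 1183612),
    RandomsRow(41488, "recon-2018_01-ver02",
               "/osgpool/halld/random_triggers/recon-2018_01-ver02/run041488_random.hddm", 1183612),
    RandomsRow(41488, "recon-2018_01-ver02.2",
               "/w/osgpool-sciwork18/halld/random_triggers/recon-2018_01-ver02.2/run041488_random.hddm", 881884),
]


def event_count(rows: Iterable[RandomsRow], run: int, tag: str) -> Optional[int]:
    """Return the event count for (run, tag), or None if there is no entry.

    Raises ValueError when entries for the same run and tag report
    different counts, since picking one silently could hide a stale file.
    """
    counts = {r.num_events for r in rows if r.run == run and r.tag == tag}
    if not counts:
        return None
    if len(counts) > 1:
        raise ValueError(f"run {run}, tag {tag}: conflicting counts {sorted(counts)}")
    return counts.pop()


print(event_count(ROWS, 41488, "recon-2018_01-ver02.2"))
```

Asking for the older "recon-2018_01-ver02" tag raises an error in this sketch, because one of its three entries reports 0 events while the other two report 1183612.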
That's possible, I thought I looked at the most recent set. I'll dig up my notes. It's easy to run the script again anyway.
Thanks, yeah, there is a list on this page: https://halldweb.jlab.org/wiki/index.php/How_to_choose_software_versions_on_the_MC_submission_form
But these are the versions to check:
I'm going to rerun some of the problem 2019-11 runs - my current best guess is that there was some memory error that was overwriting the threshold used to determine whether the beam is on or not. Will use additional TLC this time.
I did use older recon tags in MCW, sorry. This is what I find using the correct tags, and after the beam_on_current upload for spring 17:
Thanks! I will look into filling in the missing files for spring/fall 18. Luckily these all seem to be very short runs. As for the runs missing ~50% of the events, I don't think these are so urgent, since there are still tens or hundreds of thousands of mix-in events. If we wanted to improve on this, it seems like a project for a student.
I copied runs 41173, 41386, 42182, 51426, 51172 to the correct locations under
41221 needs to have its fiducial map calculated.
The random trigger file at /cache/halld/gluex_simulations/random_triggers/recon-2019_11-ver01/run073070_random.hddm contains only 1882 events. Peter confirmed that that's what the MCWrapper database indicates for this run.
Even from the monitoring histogram, which only sees a fraction of the events, one can see that there should be at least 100 x more than this.
According to the monitoring plots made along with the REST production, there should be 670692 random triggers for that run.
It's possible that the random trigger files could be too short for other runs as well. I ran demon over the REST production files to count the randoms in the monitoring histogram. The plots are on this page and the CSV data file is here and also attached:
monitoring_data_2019-11_verREST1.csv