Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kleykamp additional truth changes #116

Merged
merged 32 commits into from
Jul 11, 2024
Merged

Conversation

jdkio
Copy link
Contributor

@jdkio jdkio commented May 31, 2024

Summary

  • Add additional truth info such as RecoTrackPrimaryParticleTrueTrackLength, which is true track length from reco track start to particle true end
  • Make fiducial volume config option
  • Add reco track true hit information - to compare with reco
  • True particle info - Now we save genie and geant particles. In the case of geant particles, we only save those that deposited energy inside the TMS. Genie particle information is saved regardless
  • Add Truth_Spill - saves info per spill instead of per slice. Useful since it only has truth info once instead of once per slice. And for pileup we'll need to decouple interaction-based truth info and slice-based truth info
  • Set reco track length to avg of U + V instead of CalculateTrackLength3d. See issue #114

Things still missing

  • Info from key points on outside edge of detectors. Only saving up to fiducial volume. Have IsInsideTMSMass
  • Associating true hit information with true particle information. Like to get more refined start/stop point in track. Reco track true hit information should be similar enough except in some edge cases. I suggest using true hit for start and then true particle end position for end.

Primary adjustments

Bugs:

Improvements:

Scripts:

jdkio added 28 commits April 24, 2024 15:43
…so add various vars related to whether interaction started in TMS, etc
…le info if primary or deposited energy in TMS
@jdkio jdkio requested a review from LiamOS May 31, 2024 03:58
@jdkio jdkio marked this pull request as draft May 31, 2024 14:43
@jdkio
Copy link
Contributor Author

jdkio commented May 31, 2024

Ran larger sample on grid and many jobs got held due to memory limits. Either from saving all particles, or there's a memory leak. Investigating

@LiamOS LiamOS requested a review from AsaNehm May 31, 2024 15:44
@AsaNehm
Copy link
Contributor

AsaNehm commented Jun 3, 2024

For the averaging of TrackLengthU and TrackLengthV you mean the reco tracks or the truth ones? For the reco this shouldn't be necessary, as that is basically already done by using RecoX instead of the x position of the hits which averages the position of hit pairs when calculating the track length
TrackLengthU and TrackLengthV would then also need to be extended to TrackLengthX to account for X layers as well which would get difficult for only very few X layers
So I strongly oppose this averaging
Edit: Ah sorry, you're averaging for the y position. This zig-zagging is a problem that indeed could and most likely would be solved with averaging. Only problem is still for geometries with X layers
On the other hand this most likely wouldn't solve the track length truth vs. reco problem entirely, as as far as I understand makes the zig-zagging the 'track' longer and should therefore have a higher density-weighted track length than the truth. But the opposite was the case so far. Would that be only the effect from the different truth track start?

@LiamOS
Copy link
Member

LiamOS commented Jul 1, 2024

At last Friday's meeting we discussed merging this as the new variables seem to be (at least mostly) working correctly.
Is this PR still up to date and nominally ready to merge?

@LiamOS
Copy link
Member

LiamOS commented Jul 1, 2024

Asa's been having some issues with comparisons between true and reco Z start and end positions, à la
RecoStartZ
while the X and Y distributions appear fine.

I dug into this today and it seems to come from the interactions between GetPositionAtZ() and GetPositionPoints() in TMS_TrueParticle.cpp. The positions points get within some mm/cm of the desired z value, but do not get arbitrarily close. The closest point seems to be returned, meaning the returned Z and the given Z are not the same.

This can be """fixed""" by keeping track of the signed distance from the returned point and the given Y, and adding this back onto the returned point: (sorry for image diff)
gitdiff

To do this more correctly I'd check the momentum at that Z, and then add in a distance*mom[0/1]/mom[2] component to correct X and Y to the desired point.

@jdkio jdkio marked this pull request as ready for review July 4, 2024 00:15
@jdkio
Copy link
Contributor Author

jdkio commented Jul 4, 2024

Unfortunately there are no good answers. The idea behind the variable was to measure the start and stop points of the particle. The best solution is probably to find the exact point it enters the front of the TMS and make that the point you save, not base it on the z of the reco track. Otherwise you're always within 1 plane of the right answer.

So we'd probably want to update the function to look for start and end points inside the TMS. If it can't find those, then find the point that we enter/exit the TMS, interpolating from the last point outside and the first point inside (and vice versa for exiting particles). So it's basically what you said, but based on the fiducial volume z (and x/y for side-entering particles) instead of reco z.

And then there's the issue of reco tracks that are made from a particle that scatters and makes a continuing track. While technically two particles, we can treat that as a single track for the purposes of energy reconstruction. But currently we'd get the wrong starting or ending position for the track. Maybe our studies should simply add a cut to remove those for now, instead of trying to fix our definition. One could detect them by looking at the primary particle responsible for the front/back of a reco track and noticing that they're different. For now, we could try removing any reco track that has a substantial amount of energy from a secondary particle

I don't see myself having any cycles to finish this in the near future so if you can pick up the mantle, that would be greatly appreciated!

I also added RecoTrackTrueHitPosition, which gives the true hits of the reco track. This at least lets you check the x and y positions, but it does have that issue of always being right within 1 plane of z since it's the same z position by definition

@LiamOS
Copy link
Member

LiamOS commented Jul 4, 2024

Thanks for the detailed reply. I'm fairly sure I understand what's going on for these variables now (and know mostly where we're failing).

For tweaking the variables to match the truth closer, this wasn't a significant issue in X and Y (for physics/geometry reasons) and the Z can be easily fixed. If @AsaNehm needs more than the Z fix it likely won't be until after the PDR.

One thing I did want to bring up for discussion was whether a generic RecoTrackTrueVar should live in the Reco_Tree or in Truth_Info. My T2K brain says they should go in the Reco_Tree, as these are still reconstruction dependent quantities to some degree.

@jdkio
Copy link
Contributor Author

jdkio commented Jul 5, 2024

Well they do have truth info, so that's why they're in truth_info. When we have real data, the truth tree can be turned off. If we have it in the reco tree, then we'd need to turn off those specific branches. And it increases the chance that someone might use them as reco quantities, not knowing that it's truth info. I think that's why we originally had the two separate trees

setup.sh Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully this is not to be confused with the existing setup file for the new OS on the fermilab machines. Did someone check this in detail?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was to make it backwards compatible with ProcessND.py, which expects setup.sh for when it's running on sl7. We should actually have setup.sh check its os and then run either os-specific script and then we'll have the best of both worlds

src/TMS_Constants.h Outdated Show resolved Hide resolved

//std::cout<<"N total: "<<nTotal<<", N Primary: "<<nPrimary<<", N Interesting: "<<nInteresting<<", N charged: "<<nCharged<<", N high P: "<<nHighMomentum<<", N charged and low P: "<<nChargedAndLowMomentum<<", n TMS_TruePrimaryParticles: "<<TMS_TruePrimaryParticles.size()<<std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be removed if not necessary or transformed into a #ifdef debug ... #endif

double GetTrackLength(std::vector<TVector3> nodes, bool ignore_y = false) {
//std::cout<<"Getting track length for "<<nodes.size()<<" nodes"<<std::endl;
double out = 0;
if (nodes.size() != 19) return out;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is oddly specific. Why 19? Add comment at end of line?

for (size_t i = 0; i < GetPositionPoints().size(); i++) {
double distance = abs(GetPositionPoints()[i].Z() - z);
if (distance <= max_z_dist) {
//std::cout << "pos: " << GetPositionPoints()[i].X() << ", "<< GetPositionPoints()[i].Y() << ", "<< GetPositionPoints()[i].Z() << " " << GetPositionPoints()[i].Z() - prev_z << std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take out or transform into #ifdef debug ... #endif

@@ -1074,7 +1202,7 @@ void TMS_TreeWriter::Fill(TMS_Event &event) {
nHitsIn3DTrack[itTrack] = (int) RecoTrack->Hits.size(); // Do we need to cast it? idk
// std::cout << "TreeWriter number of hits: " << nHitsIn3DTrack[itTrack] << std::endl;
RecoTrackEnergyRange[itTrack] = RecoTrack->EnergyRange;
RecoTrackLength[itTrack] = RecoTrack->Length;
RecoTrackLength[itTrack] = 0.5 * (TrackLengthU[itTrack] + TrackLengthV[itTrack]); // RecoTrack->Length;, 2d is better estimate than 3d because of y jumps
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a TODO for later after the Kalman filter works?

if (particle_info.energies.size() > 1) {
true_secondary_visible_energy = particle_info.energies[1];
true_secondary_particle_index = particle_info.indices[1];
//std::cout<<"checking for primary particle trackid"<<std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which part? It's saving the secondary particle info. The cout can be deleted though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's about the cout

@@ -1135,8 +1267,87 @@ void TMS_TreeWriter::Fill(TMS_Event &event) {
if (itTrack < __TMS_MAX_LINES__) {
setMomentum(RecoTrackPrimaryParticleTrueMomentumTrackStart[itTrack], tp.GetMomentumAtZ(start_z, max_z_distance));
setPosition(RecoTrackPrimaryParticleTruePositionTrackStart[itTrack], tp.GetPositionAtZ(start_z, max_z_distance));
//std::cout << "Setting tp shite: " << tp.GetPositionAtZ(start_z, max_z_distance).X() << " " << tp.GetPositionAtZ(start_z, max_z_distance).Y() << " " << tp.GetPositionAtZ(start_z, max_z_distance).Z() << ",\t" << start_z << " " << max_z_distance << std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like @LiamOS's stash made it in based on the blame

Copy link
Member

@LiamOS LiamOS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Builds and runs fine from my checks, I've also been using some of this code on the Kalman branch so I trust it.

Approving now, will begin merging.

Copy link
Contributor

@AsaNehm AsaNehm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from some cosmetics I also approve

@LiamOS LiamOS merged commit 0f4bb23 into main Jul 11, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants