DM-40513: Add first-draft metadetection PipelineTask. #22

TallJimbo · 2023-08-25T16:14:33Z

No description provided.

esheldon · 2023-08-25T17:01:37Z

python/lsst/drp/tasks/metadetection_shear.py

+              catalog for the patch, with schema equal to `object_schema`.
+        """
+        single_cell_tables: list[pa.Table] = []
+        for single_cell_coadds in zip(*patch_coadds, strict=True):


This is a loop over bands right? We will be making a combined coadd over bands for detection, but then process the bands simultaneously to get the shear info, fluxes etc.

This is a loop over individual cells, in which each single_cell_coadds object is a sequence of per-band single-cell coadds.

I figured we'd coadd across bands inside process_cell, one cell at a time. I was assuming there'd be no reason to coadd all cells across bands up front and then process both the multi-band coadd and the per-band coadds for each cell more later.

Ah, I see now there is a star on patch_coadds. So this means that patch_coadds is actuallly a sequence of sequences? Where can I find the definition of MultiCellCoadd?

Oops, I wrote this thinking that MultipleCellCoadd was iterable, but it isn't (you have to iterate over .cells) so this was actually wrong (just pushed a fix).

Here's the definition of MultipleCellCoadd: https://github.com/lsst/cell_coadds/blob/main/python/lsst/cell_coadds/_multiple_cell_coadd.py

And here's the definition of SingleCellCoadd: https://github.com/lsst/cell_coadds/blob/main/python/lsst/cell_coadds/_single_cell_coadd.py

Those both refer to ImagePlanes and CommonComponents from the same package.

Thanks.

Is there an existing way to coadd these individual band SingleCellCoadds or should we extract the masked image and do it ourselves?

How do I extract the image(s)?

(assuming we should not rely on hidden _outer attribute)

whoops, I see there is a property for outer

Go ahead and extract the outer masked image or individual outer single-plane images and do it yourself; it might make sense to move that into a method on SingleCellCoadd for reuse at some point, but we can look into that once you've defined the behavior you want for the combination.

esheldon · 2023-08-25T17:02:35Z

python/lsst/drp/tasks/metadetection_shear.py

+        # TODO: if we need to do any cell-overlap-region deduplication here
+        # (instead of purely in science analysis code), this is where'd it'd
+        # happen.
+        return Struct(


Does this struct have to match the schema returned by make_object_schema?

The object_catalog attribute of the struct needs to be a pyarrow table that matches that schema. The attributes of the Struct itself correspond to the cT.Output connections in MetadetectionShearConnections, of which there is only one right now.

esheldon · 2023-08-25T17:04:10Z

python/lsst/drp/tasks/metadetection_shear.py

+            Output object catalog for the cell, with schema equal to
+            `object_schema`.
+        """
+        rows: list[dict[str, Any]] = []


Must it be a list/dict or can it be a numpy array with fields? e.g. np.zeros(nrows, dtype=dtype)?

I'm pretty sure numpy array with fields can work too, because @erykoff has a function that can convert from that to pyarrow.Table we can import and use.

I do! What I don't have is a way of connecting that with the arrow schema metadata, but once we have an idea of what functionality we need I can add that.

esheldon · 2023-08-25T17:05:07Z

python/lsst/drp/tasks/metadetection_shear.py

+    # detection masks), add them here.
+
+    object_catalog = cT.Output(
+        "ShearObject",


Does this name matter? If so should this be more specific like "MDetShearObject" ?

It does matter, but for at least DR1 I think "ShearObject" makes sense as there will be only one shear catalog table and that's what we've put in the DPDD. @arunkannawadi may ultimately have been right that we should have put a more specific name in the DPDD, too, but I'm comfortable revisiting that for DR2+ if/when we decide to run another algorithm alongside.

I won't be surprised if Xiangchong Li has a code ready by then

MultipleCellCoadd isn't iterable, but its .cells attribute is a mapping.

The Input name is set correctly to cellCoadd temporarily commented object schema "cT.InitOutput" because this doesn't work yet temporarilyl set required_bands to r, need to learn how to set with config

Currently the real cell coadds don't have everything we need

The metadetect code expects MultiBandExposure

We currently run with a config restricting to r for real data, as other bands don't yet exist for the testbed Note we will want to process g separately, this is still TODO

currently the sims cannot generate the bright objects because the catalog is not available Also when using real data we need to pass in the info for bright objects to be masked

This has not been tested. It will definitely need config overrides to "config.connections.ref_cat" to work: - "cal_ref_cat" for DC2 data - "ps1_pv3_3pi_20170110" or "gaia_dr3_20230707" for real data Right now the reference catalog is being passed to run() as an lsst.afw.table.SimpleCatalog, which is probably not what anyone wants, but it's what we start off with. Calling .asAstropy() on it and passing that to run() would be one alternative, and calling .extract(<columns>) to get a dict of numpy array columns would be another.

esheldon reviewed Aug 25, 2023

View reviewed changes

TallJimbo and others added 15 commits October 19, 2023 16:16

Add first-draft shell of metadetection PipelineTask.

23ec7b9

Fix zip-iteration over cells.

da4a8fd

MultipleCellCoadd isn't iterable, but its .cells attribute is a mapping.

correct name "cellCoadd" and temp fixes

c7be25a

The Input name is set correctly to cellCoadd temporarily commented object schema "cT.InitOutput" because this doesn't work yet temporarilyl set required_bands to r, need to learn how to set with config

satisfy the flake8

500e06c

running a sim to get something we can process

78cc2b8

Currently the real cell coadds don't have everything we need

add more schema

b2bcb0b

add shear_bands

dd894bb

add code to convert cell coadd data type

b1b42e8

The metadetect code expects MultiBandExposure

a little cleanup

1bbdcb1

units on T

8a5a078

field doc

b8fa5d2

restore griz as default bands

fc83149

We currently run with a config restricting to r for real data, as other bands don't yet exist for the testbed Note we will want to process g separately, this is still TODO

add bright info for masking

1c0a815

currently the sims cannot generate the bright objects because the catalog is not available Also when using real data we need to pass in the info for bright objects to be masked

Reformat with black.

9abfa0a

TallJimbo force-pushed the tickets/DM-40513 branch from a25beb1 to bbe1965 Compare October 19, 2023 20:19

Fix flake8 errors.

0ded8ea

TallJimbo mentioned this pull request Oct 20, 2023

esheldon fork Tickets/dm 40513 #24

Draft

arunkannawadi mentioned this pull request Jan 9, 2024

DM:41142: Add metadetect as a dependency #42

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DM-40513: Add first-draft metadetection PipelineTask. #22

DM-40513: Add first-draft metadetection PipelineTask. #22

TallJimbo commented Aug 25, 2023

esheldon Aug 25, 2023

TallJimbo Aug 25, 2023 •

edited

Loading

esheldon Aug 25, 2023

TallJimbo Aug 25, 2023

esheldon Aug 25, 2023 •

edited

Loading

esheldon Aug 25, 2023 •

edited

Loading

esheldon Aug 25, 2023

TallJimbo Aug 25, 2023

esheldon Aug 25, 2023

TallJimbo Aug 25, 2023

esheldon Aug 25, 2023

TallJimbo Aug 25, 2023

erykoff Aug 25, 2023

esheldon Aug 25, 2023

TallJimbo Aug 25, 2023

esheldon Aug 25, 2023

DM-40513: Add first-draft metadetection PipelineTask. #22

Are you sure you want to change the base?

DM-40513: Add first-draft metadetection PipelineTask. #22

Conversation

TallJimbo commented Aug 25, 2023

Choose a reason for hiding this comment

TallJimbo Aug 25, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

esheldon Aug 25, 2023 • edited Loading

Choose a reason for hiding this comment

esheldon Aug 25, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

TallJimbo Aug 25, 2023 •

edited

Loading

esheldon Aug 25, 2023 •

edited

Loading

esheldon Aug 25, 2023 •

edited

Loading