idr0010-doil-dnadamage S-BIAD885 #641

Open
will-moore opened this issue Feb 22, 2023 · 29 comments

@will-moore
Member

idr0010-doil-dnadamage

@will-moore will-moore moved this to test convert in NGFF conversion Feb 22, 2023
@dominikl dominikl moved this from test convert to re-import test image in NGFF conversion Feb 27, 2023
@pwalczysko pwalczysko moved this from re-import test image to convert all data to NGFF in NGFF conversion Feb 27, 2023
@pwalczysko

Imported 1 ome.zarr plate into OMERO: http://localhost:1080/webclient/?show=plate-202

Time taken for the import: 1 hour

@dominikl
Member

Started conversion of full dataset on pilot-zarr2-dev.

@will-moore
Member Author

Installed minio client on pilot-zarr2-dev same as at #643 (comment)

I see that 62 / 148 plates have been converted so far (in ~22 hours), so this will need at least another day...
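
A rough way to put a number on the remaining time, as a minimal sketch. It assumes the conversion rate stays constant, which may not hold if plate sizes vary; the function name is purely illustrative.

```python
def remaining_hours(done: int, total: int, elapsed_hours: float) -> float:
    """Estimate hours left, assuming a constant per-plate conversion rate."""
    if done <= 0:
        raise ValueError("need at least one converted plate to estimate a rate")
    rate = done / elapsed_hours  # plates per hour so far
    return (total - done) / rate


# Figures from the comment above: 62 of 148 plates in ~22 hours.
print(f"{remaining_hours(62, 148, 22):.1f} hours remaining")
```

With these figures the estimate comes out at roughly 30 hours.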

@will-moore
Member Author

will-moore commented Apr 25, 2023

Started to copy some data over. Can copy the rest once done, but this allows me to start import etc...

$ cd /ngff
$ /home/wmoore/mc cp -r idr0010/ uk1s3/idr0010/zarr

https://hms-dbmi.github.io/vizarr/?source=https://uk1s3.embassy.ebi.ac.uk/idr0010/zarr/100-27.ome.zarr


@will-moore
Member Author

The copy above is showing some errors...

mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/20/0/2/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/J/30/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/21/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/K/8/0/0/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/21/0/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/21/0/0/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/0/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/K/8/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/K/8/0/1/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/21/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/0/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/0/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/K/8/0/1/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/21/0/2/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/22/0/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/K/8/0/2/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/K/8/0/2/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/24/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/1/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/1/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/K/8/0/2/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/24/0/0/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/24/0/2/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/24/0/2/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/2/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/1/0/2/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/14/0/2/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/24/0/2/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/0/2/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/15/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/0/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/0/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/135-21.ome.zarr/J/22/0/0/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/14/0/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/16/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/1/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/135-21.ome.zarr/J/25/0/2/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/136-19.ome.zarr/I/14/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/17/0/1/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/1/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/2/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/19/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/136-19.ome.zarr/J/27/0/0/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/19/0/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/136-19.ome.zarr/J/6/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/2/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/136-19.ome.zarr/K/18/0/0/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/22/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/22/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/137-11.ome.zarr/B/25/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/137-11.ome.zarr/B/26/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/23/0/1/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/0/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/137-11.ome.zarr/B/30/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/24/0/1/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/0/.zgroup`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/0/0/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/137-11.ome.zarr/B/5/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/26/0/2/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/31/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/0/0/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/137-11.ome.zarr/B/6/0/2/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/137-11.ome.zarr/C/10/0/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/0/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/0/0/0/1/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/0/1/.zarray`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/137-11.ome.zarr/C/13/0/0/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/6/0/2/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/138-27.ome.zarr/H/7/0/1/0/0/0/0/0`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/137-11.ome.zarr/C/18/0/.zgroup`. Object does not exist

But these objects do exist. e.g:

$ cat /data/ngff/idr0010/137-11.ome.zarr/C/18/0/.zgroup
{
  "zarr_format" : 2
}

However, they were not uploaded, which causes 404 and other errors: the plate won't display at https://hms-dbmi.github.io/vizarr/?source=https://uk1s3.embassy.ebi.ac.uk/idr0010/zarr/120.ome.zarr

This was fixed (no errors) with:

/home/wmoore/mc cp -r idr0010/120.ome.zarr/A/ uk1s3/idr0010/zarr/120.ome.zarr/A/

Repeated for others above.
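
To work out which plate/row directories need re-copying, the error log can be grouped by plate and row. This is only a sketch: it assumes the log lines have exactly the format shown above and that the local root is `/data/ngff/idr0010/`; the function and variable names are made up for illustration.

```python
import re
from collections import defaultdict

# Matches lines such as:
# mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/.zattrs`. Object does not exist
ERROR_RE = re.compile(r"Failed to copy `(?P<path>[^`]+)`")


def rows_to_recopy(log_text: str, root: str = "/data/ngff/idr0010/") -> dict:
    """Map each plate name to the sorted row directories that had failed copies."""
    failed = defaultdict(set)
    for match in ERROR_RE.finditer(log_text):
        path = match.group("path")
        if not path.startswith(root):
            continue
        parts = path[len(root):].split("/")
        if len(parts) >= 2:
            plate, row = parts[0], parts[1]
            failed[plate].add(row)
    return {plate: sorted(rows) for plate, rows in failed.items()}


sample = """\
mc: <ERROR> Failed to copy `/data/ngff/idr0010/120.ome.zarr/A/25/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/J/30/.zattrs`. Object does not exist
mc: <ERROR> Failed to copy `/data/ngff/idr0010/12-23.ome.zarr/K/8/0/1/.zarray`. Object does not exist
"""
for plate, rows in rows_to_recopy(sample).items():
    print(plate, rows)
```

Each plate/row pair then corresponds to one `mc cp -r` of that row directory, as in the command above.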

@dominikl
Member

Weird... at least it worked on the second copy attempt.

@will-moore
Member Author

We get one plate that isn't in the released study: 153-29.ome.zarr. See https://idr.openmicroscopy.org/webclient/?show=screen-1351

ls idr0010/
...
153-29.ome.zarr

Repeat the copy:

cd /ngff
/home/wmoore/mc cp -r idr0010/ uk1s3/idr0010/zarr

@will-moore will-moore moved this from convert all data to NGFF to upload data to s3 in NGFF conversion Apr 26, 2023
@will-moore will-moore moved this from upload data to s3 to create new Fileset to replace original Fileset in NGFF conversion Apr 27, 2023
@will-moore
Member Author

Last plate to be processed: https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr0010/zarr/99.ome.zarr

One more plate than in IDR (the extra one is 153-29.ome.zarr):

$ ./mc ls uk1s3/idr0010/zarr | wc
    149     745    7330

@will-moore
Member Author

@dominikl tried following #656 with idr0010 plates yesterday and got the same error that I got previously after updating symlinks:

#652 (comment)

NB: Plate import there takes 20 minutes.

Restarted import of all idr0010 plates just now to test again, on idr0125-pilot: http://localhost:1080/webclient/?show=screen-3202

@will-moore
Member Author

Try again: NB: Using ZarrReader updated on idr0125-pilot at #643 (comment)

Import a single plate without chunks...

2023-05-18 21:01:48,615 1640361    [l.Client-0] INFO   ormats.importer.cli.LoggingImportMonitor - OBJECTS_RETURNED Step: 5 of 5  Logfile: 50477341
2023-05-18 21:01:53,416 1645162    [l.Client-2] INFO   ormats.importer.cli.LoggingImportMonitor - IMPORT_DONE Imported file: /ngff/idr0010/1-23.ome.zarr/OME/METADATA.ome.xml
Other imported objects:
Fileset:5287122

==> Summary
2705 files uploaded, 1 fileset, 1 plate created, 384 images imported, 0 errors in 0:26:38.197

Symlinks...

$ python idr-utils/scripts/managed_repo_symlinks.py Fileset:5287122 /idr0010/zarr --report

Fileset: 5287122 /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2023-05/18/20-35-15.516/
Render Image 14834193
fs_contents ['1-23.ome.zarr']
Link from /data/OMERO/ManagedRepository/demo_2/Blitz-0-Ice.ThreadPool.Server-3/2023-05/18/20-35-15.516/1-23.ome.zarr to /idr0010/zarr/1-23.ome.zarr

Plate looks good! 👍
http://localhost:1080/webclient/?show=plate-10519

Fileset swap...

$ python idr-utils/scripts/swap_filesets.py Plate:4501 Plate:10519 /tmp/idr0010_1-23_filesetswap.sql --report
UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/Blitz-0-Ice.ThreadPool.Server-3/2023-05/18/20-35-15.516/1-23.ome.zarr/OME' where image in (select id from Image where fileset = 5287122);

$ PGPASSWORD=**** psql -U omero -d idr -h 192.168.10.102 -f /tmp/idr0010_1-23_filesetswap.sql 
UPDATE 384

@will-moore
Member Author

We are seeing a different Well ordering issue on this Plate:
After the Fileset swap, the Wells are now appearing as follows:
(The plate has 32 columns and 12 rows.)
The Wells are named A1 -> A32, then B1 -> B32, etc., but they are not laid out row by row: the sequence runs down the first column, then up the second column, down the third column, etc.
Here's a sample of which Wells are being displayed in the updated Plate:

 A1, A24, A25, B16, B17...
 A2, A23, A26, B15, B18...
 A3, A22, A27, B14, B19...
 A4, A21, A28, B13, B20...
......................................
A11, A14,  B3,  B6, B27...
A12, A13,  B4,  B5, B28...
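
The observed layout can be reproduced with a small sketch. It only models what is seen in the viewer (row-major Well names poured into a column-wise serpentine over a 12 x 32 grid); it is not a description of what ZarrReader does internally, and the function names are illustrative.

```python
from string import ascii_uppercase


def well_names(n_rows: int, n_cols: int) -> list:
    """Well names in row-major order: A1..A32, B1..B32, ..."""
    return [f"{ascii_uppercase[r]}{c + 1}" for r in range(n_rows) for c in range(n_cols)]


def serpentine_grid(n_rows: int = 12, n_cols: int = 32) -> list:
    """Place row-major names down column 0, up column 1, down column 2, ..."""
    grid = [[None] * n_cols for _ in range(n_rows)]
    for i, name in enumerate(well_names(n_rows, n_cols)):
        col, offset = divmod(i, n_rows)
        # Even columns fill top to bottom, odd columns bottom to top.
        row = offset if col % 2 == 0 else n_rows - 1 - offset
        grid[row][col] = name
    return grid


grid = serpentine_grid()
print(grid[0][:5])   # first displayed row
print(grid[11][:5])  # last displayed row
```

The first and last rows printed here match the sample above (A1, A24, A25, B16, B17 and A12, A13, B4, B5, B28).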

Screenshot 2023-05-19 at 15 50 14

cc @dgault

@will-moore will-moore added the bug label May 19, 2023
@will-moore will-moore moved this from create new Fileset to replace original Fileset to upload some data to s3 and test in NGFF conversion Jun 26, 2023
@will-moore
Member Author

Checked that https://hms-dbmi.github.io/vizarr/?source=https://uk1s3.embassy.ebi.ac.uk/idr0010/zarr/99.ome.zarr looks good compared with https://idr.openmicroscopy.org/webclient/?show=plate-5936, so the NGFF looks valid, even if we still have issues importing / viewing in OMERO.

So, good to go ahead with zip and upload to BioStudies without waiting on ZarrReader issues above....

@will-moore
Member Author

On pilot-zarr2-dev...

cd /data/ngff/idr0010
for i in */; do sudo zip -r "${i%/}.zip" "$i"; done
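
For reference, a Python sketch of roughly the same loop, using only the standard library. It assumes write permission in the parent directory (the shell version uses sudo) and that every plate directory ends in `.ome.zarr`; `zip_plates` is an illustrative name.

```python
import shutil
from pathlib import Path


def zip_plates(parent: Path) -> list:
    """Create <name>.zip next to each *.ome.zarr directory under `parent`."""
    parent = Path(parent).resolve()
    archives = []
    for plate_dir in sorted(parent.glob("*.ome.zarr")):
        if not plate_dir.is_dir():
            continue
        # base_dir keeps the plate directory as the top-level entry in the zip,
        # similar to `zip -r name.zip name/`.
        archive = shutil.make_archive(
            str(plate_dir), "zip", root_dir=str(parent), base_dir=plate_dir.name
        )
        archives.append(Path(archive))
    return archives
```

Calling `zip_plates(Path("/data/ngff/idr0010"))` would write `100-27.ome.zarr.zip` etc. alongside the plate directories.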

@will-moore
Member Author

Last time I checked zip creation after running zip command above, it was mostly complete...

[wmoore@pilot-zarr2-dev idr0010]$ ls
100-27.ome.zarr      113-34.ome.zarr.zip  127-44.ome.zarr      141-27.ome.zarr.zip  155-45.ome.zarr      2-60.ome.zarr.zip   40-37.ome.zarr      73-40.ome.zarr
100-27.ome.zarr.zip  114-58.ome.zarr      127-44.ome.zarr.zip  142-29.ome.zarr      155-45.ome.zarr.zip  26-41.ome.zarr      40-37.ome.zarr.zip  74-26.ome.zarr
101-24.ome.zarr      114-58.ome.zarr.zip  128-35.ome.zarr      142-29.ome.zarr.zip  156-42.ome.zarr      26-41.ome.zarr.zip  41-31.ome.zarr      75-33.ome.zarr
101-24.ome.zarr.zip  115-11.ome.zarr      128-35.ome.zarr.zip  143-29.ome.zarr      156-42.ome.zarr.zip  27-37.ome.zarr      41-31.ome.zarr.zip  76-45.ome.zarr
102.ome.zarr         115-11.ome.zarr.zip  129-58.ome.zarr      143-29.ome.zarr.zip  157-46.ome.zarr      27-37.ome.zarr.zip  42-44.ome.zarr      77-20.ome.zarr
102.ome.zarr.zip     116-25.ome.zarr      129-58.ome.zarr.zip  14-35.ome.zarr       157-46.ome.zarr.zip  28-43.ome.zarr      42-44.ome.zarr.zip  78-31.ome.zarr
10-34.ome.zarr       116-25.ome.zarr.zip  130-16.ome.zarr      14-35.ome.zarr.zip   158-10.ome.zarr      28-43.ome.zarr.zip  43-04.ome.zarr      79-39.ome.zarr
10-34.ome.zarr.zip   117-12.ome.zarr      130-16.ome.zarr.zip  144-21.ome.zarr      158-10.ome.zarr.zip  29-30.ome.zarr      4-36.ome.zarr       80-29.ome.zarr
103.ome.zarr         117-12.ome.zarr.zip  131-20.ome.zarr      144-21.ome.zarr.zip  159-13.ome.zarr      29-30.ome.zarr.zip  44-13.ome.zarr      8-10.ome.zarr
103.ome.zarr.zip     118-33.ome.zarr      131-20.ome.zarr.zip  145-20.ome.zarr      159-13.ome.zarr.zip  30-44.ome.zarr      45-31.ome.zarr      81-44.ome.zarr
104-13.ome.zarr      118-33.ome.zarr.zip  132-19.ome.zarr      145-20.ome.zarr.zip  16-45.ome.zarr       30-44.ome.zarr.zip  46-33.ome.zarr      82-14.ome.zarr
104-13.ome.zarr.zip  119-43.ome.zarr      132-19.ome.zarr.zip  146-35.ome.zarr      16-45.ome.zarr.zip   3-11.ome.zarr       47-35.ome.zarr      83-48.ome.zarr
105-12.ome.zarr      119-43.ome.zarr.zip  133-29.ome.zarr      146-35.ome.zarr.zip  17-43.ome.zarr       3-11.ome.zarr.zip   48-29.ome.zarr      84-33.ome.zarr
105-12.ome.zarr.zip  120.ome.zarr         133-29.ome.zarr.zip  147-09.ome.zarr      17-43.ome.zarr.zip   31-48.ome.zarr      49-06.ome.zarr      85.ome.zarr
106-13.ome.zarr      120.ome.zarr.zip     13-3.ome.zarr        147-09.ome.zarr.zip  18-18.ome.zarr       31-48.ome.zarr.zip  5-12.ome.zarr       86-31.ome.zarr
106-13.ome.zarr.zip  121-11.ome.zarr      13-3.ome.zarr.zip    148-48.ome.zarr      18-18.ome.zarr.zip   32-42.ome.zarr      5-12.ome.zarr.zip   87-48.ome.zarr
107.ome.zarr         121-11.ome.zarr.zip  134-34.ome.zarr      148-48.ome.zarr.zip  19-29.ome.zarr       32-42.ome.zarr.zip  60-30.ome.zarr      88-40.ome.zarr
107.ome.zarr.zip     12-23.ome.zarr       134-34.ome.zarr.zip  149-21.ome.zarr      19-29.ome.zarr.zip   33-46.ome.zarr      61-43.ome.zarr      89-16.ome.zarr
108-24.ome.zarr      12-23.ome.zarr.zip   135-21.ome.zarr      149-21.ome.zarr.zip  20-29.ome.zarr       33-46.ome.zarr.zip  6-14.ome.zarr       90-27.ome.zarr
108-24.ome.zarr.zip  122-42.ome.zarr      135-21.ome.zarr.zip  150-15.ome.zarr      20-29.ome.zarr.zip   34-30.ome.zarr      62-01.ome.zarr      91-29.ome.zarr
109-36.ome.zarr      122-42.ome.zarr.zip  136-19.ome.zarr      150-15.ome.zarr.zip  21-58.ome.zarr       34-30.ome.zarr.zip  63-27.ome.zarr      9-12.ome.zarr
109-36.ome.zarr.zip  123-18.ome.zarr      136-19.ome.zarr.zip  15-11.ome.zarr       21-58.ome.zarr.zip   35-18.ome.zarr      64-41.ome.zarr      92-44.ome.zarr
110-35.ome.zarr      123-18.ome.zarr.zip  137-11.ome.zarr      15-11.ome.zarr.zip   22-21.ome.zarr       35-18.ome.zarr.zip  65-12.ome.zarr      93-40.ome.zarr
110-35.ome.zarr.zip  1-23.ome.zarr        137-11.ome.zarr.zip  151-48.ome.zarr      22-21.ome.zarr.zip   36-28.ome.zarr      66-37.ome.zarr      94-05.ome.zarr
11-08.ome.zarr       1-23.ome.zarr.zip    138-27.ome.zarr      151-48.ome.zarr.zip  23-15.ome.zarr       36-28.ome.zarr.zip  67-44.ome.zarr      95-48.ome.zarr
11-08.ome.zarr.zip   124-44.ome.zarr      138-27.ome.zarr.zip  152-14.ome.zarr      23-15.ome.zarr.zip   37-34.ome.zarr      68-30.ome.zarr      96-14.ome.zarr
111-07.ome.zarr      124-44.ome.zarr.zip  139-29.ome.zarr      152-14.ome.zarr.zip  24-27.ome.zarr       37-34.ome.zarr.zip  69-49.ome.zarr      97.ome.zarr
111-07.ome.zarr.zip  125-27.ome.zarr      139-29.ome.zarr.zip  153-29.ome.zarr      24-27.ome.zarr.zip   38-21.ome.zarr      70-12.ome.zarr      98-23.ome.zarr
112-02.ome.zarr      125-27.ome.zarr.zip  140-35.ome.zarr      153-29.ome.zarr.zip  25-28.ome.zarr       38-21.ome.zarr.zip  71-19.ome.zarr      99.ome.zarr
112-02.ome.zarr.zip  126-12.ome.zarr      140-35.ome.zarr.zip  154-43.ome.zarr      25-28.ome.zarr.zip   39-44.ome.zarr      7-14.ome.zarr
113-34.ome.zarr      126-12.ome.zarr.zip  141-27.ome.zarr      154-43.ome.zarr.zip  2-60.ome.zarr        39-44.ome.zarr.zip  72-33.ome.zarr

But now it seems all of the idr0010 data has been moved or deleted?!

ssh pilot-zarr2-dev

[wmoore@pilot-zarr2-dev ~]$ cd /data/ngff/idr0010
-bash: cd: /data/ngff/idr0010: No such file or directory

$ ls /data/ngff/
idr0013  memo

@dominikl
Member

Sorry, accidentally deleted the zarrs on pilot-zarr2-dev... thought it was already submitted to BioStudies. I'll start the conversion again.

@dominikl
Member

Running

for i in `ls /data/idr-metadata/idr0010-doil-dnadamage/screenA/plates`; do echo $i; ~/bioformats2raw-0.7.0-SNAPSHOT/bin/bioformats2raw --memo-directory ../memo /data/idr-metadata/idr0010-doil-dnadamage/screenA/plates/$i ${i%.*}.ome.zarr; done

now (in /data/ngff/idr0010)
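
A sketch of how the same per-plate commands could be built in Python, e.g. to log or retry them individually. It only constructs argument lists and runs nothing; the `--memo-directory` flag and the output naming are taken from the loop above, while `conversion_commands` and its default arguments are illustrative.

```python
from pathlib import Path


def conversion_commands(plates_dir, bf2raw="bioformats2raw", memo_dir="../memo"):
    """Build one bioformats2raw command (as an argument list) per input file."""
    commands = []
    for src in sorted(Path(plates_dir).iterdir()):
        # Mirror the shell's ${i%.*}: drop only the last extension.
        out_name = f"{src.stem}.ome.zarr"
        commands.append([bf2raw, "--memo-directory", memo_dir, str(src), out_name])
    return commands
```

Each list could then be passed to `subprocess.run(cmd, check=True)` from the output directory.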

@dominikl dominikl self-assigned this Jul 12, 2023
@dominikl dominikl moved this from upload some data to s3 and test to BioStudies Submission in NGFF conversion Jul 13, 2023
@dominikl dominikl removed their assignment Jul 13, 2023
@dgault

dgault commented Jul 13, 2023

The ZarrReader PR ome/ZarrReader#53 has been updated to try and improve the reordering behaviour to hopefully solve the issue seen in #641 (comment)

@will-moore
Member Author

Just checking the sizes of zarr.zip on BioStudies, it looks like Plate 5-12.ome.zarr.zip is 465629 bytes, about 10x smaller than other plates.
Checking on IDR, that plate fails to render (so I guess we don't try to fix that now)?


@will-moore
Member Author

Checking counts of plates on https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0010 I see that there are 149 zips there but only 148 at https://idr.openmicroscopy.org/webclient/?show=screen-1351

Using this JS Code on the biostudies page to find the difference ...

let url = "https://idr.openmicroscopy.org/webclient/api/plates/?id=1351"
let idr_plates = await fetch(url).then(rsp => rsp.json());
let idr_names = idr_plates.plates.map(p => p.name);
let names = [];
[].forEach.call(document.querySelectorAll("div [role='row'] .ag-cell[col-id='name']"), function(div) {
  names.push(div.innerHTML.trim().replace(".ome.zarr.zip", ""));
});
names.forEach(n => {if (idr_names.indexOf(n) == -1) {console.log(n)}; });

Returns:

153-29

This Plate doesn't appear in IDR - maybe it got removed from the submission for some reason?
I assume we can simply delete this from the biostudies submission page.
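
The same comparison can be done offline once both name lists are saved, as a minimal sketch. The inputs below are tiny illustrative samples, not the full listings, and `unpublished_plates` is a made-up helper name.

```python
def unpublished_plates(zip_names, idr_names, suffix=".ome.zarr.zip"):
    """Return zip base names that have no matching plate name in IDR."""
    published = set(idr_names)
    extras = []
    for name in zip_names:
        base = name[: -len(suffix)] if name.endswith(suffix) else name
        if base not in published:
            extras.append(base)
    return extras


# Tiny illustrative inputs (not the full listings).
zips = ["1-23.ome.zarr.zip", "153-29.ome.zarr.zip", "99.ome.zarr.zip"]
idr = ["1-23", "99"]
print(unpublished_plates(zips, idr))
```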

@sbesson
Member

sbesson commented Aug 21, 2023

Specifically on #641 (comment). A few comments:

  • the plate loaded on IDR is stuck in an Import in progress state

  • trying to reimport the 5-12 plate in a testing environment fails during the second phase of the server-side import with

    2023-08-21 10:32:51,177 19008      [2-thread-1] INFO   ormats.importer.cli.LoggingImportMonitor - IMPORT_STARTED Logfile: 46134605
    2023-08-21 10:33:48,535 76366      [l.Client-0] INFO   ormats.importer.cli.LoggingImportMonitor - METADATA_IMPORTED Step: 1 of 5  Logfile: 46134605
    2023-08-21 10:33:55,721 83552      [l.Client-1] ERROR     ome.formats.importer.cli.ErrorHandler - INTERNAL_EXCEPTION: /uod/idr/metadata/idr0010-doil-dnadamage/screenA/plates/5-12.pattern
    java.lang.RuntimeException: Failure response on import!
    Category: ::omero::grid::ImportRequest
    Name: import-file-exception
    Parameters: {filename=demo_2/Blitz-0-Ice.ThreadPool.Server-26273/2023-08/21/10-32-40.039/metadata/idr0010-doil-dnadamage/screenA/plates/5-12.pattern, stacktrace=loci.formats.FormatException: Invalid tile size: x=0, y=0, w=696, h=520
      at loci.formats.FormatTools.checkTileSize(FormatTools.java:1025)
      at loci.formats.FormatTools.checkPlaneParameters(FormatTools.java:1001)
      at loci.formats.in.MinimalTiffReader.openBytes(MinimalTiffReader.java:289)
      at loci.formats.in.MetamorphReader.openBytes(MetamorphReader.java:286)
      at loci.formats.ImageReader.openBytes(ImageReader.java:467)
      at loci.formats.ReaderWrapper.openBytes(ReaderWrapper.java:348)
      at loci.formats.DimensionSwapper.openBytes(DimensionSwapper.java:249)
      at loci.formats.FileStitcher.openBytes(FileStitcher.java:493)
      at loci.formats.in.FilePatternReader.openBytes(FilePatternReader.java:144)
      at loci.formats.ImageReader.openBytes(ImageReader.java:467)
      at loci.formats.ChannelFiller.openBytes(ChannelFiller.java:156)
      at loci.formats.ChannelSeparator.openBytes(ChannelSeparator.java:229)
      at loci.formats.ReaderWrapper.openBytes(ReaderWrapper.java:348)
      at loci.formats.ReaderWrapper.openBytes(ReaderWrapper.java:348)
      at loci.formats.MinMaxCalculator.openBytes(MinMaxCalculator.java:269)
      at ome.services.blitz.repo.ManagedImportRequestI.parseDataByPlane(ManagedImportRequestI.java:872)
    ...
    
  • looking at the underlying binary files, there is indeed a dimension mismatch between the files for each channel:

    [sbesson@test114-omeroreadwrite 5-12]$ tiffinfo 0005-12\ 53BP1.stk  | grep Width
    TIFFReadDirectory: Warning, Unknown field with tag 317 (0x13d) encountered.
    TIFFReadDirectory: Warning, Unknown field with tag 33628 (0x835c) encountered.
    TIFFReadDirectory: Warning, Unknown field with tag 33629 (0x835d) encountered.
    TIFFReadDirectory: Warning, Unknown field with tag 33630 (0x835e) encountered.
    TIFFReadDirectory: Warning, Unknown field with tag 33631 (0x835f) encountered.
    TIFFFetchNormalTag: Warning, ASCII value for tag "ImageDescription" contains null byte in value; value incorrectly truncated during reading due to implementation limitations.
      Image Width: 695 Image Length: 520
    [sbesson@test114-omeroreadwrite 5-12]$ tiffinfo 0005-12\ Dapi.stk | grep Width
    TIFFReadDirectory: Warning, Unknown field with tag 317 (0x13d) encountered.
    TIFFReadDirectory: Warning, Unknown field with tag 33628 (0x835c) encountered.
    TIFFReadDirectory: Warning, Unknown field with tag 33629 (0x835d) encountered.
    TIFFReadDirectory: Warning, Unknown field with tag 33630 (0x835e) encountered.
    TIFFReadDirectory: Warning, Unknown field with tag 33631 (0x835f) encountered.
    TIFFFetchNormalTag: Warning, ASCII value for tag "ImageDescription" contains null byte in value; value incorrectly truncated during reading due to implementation limitations.
      Image Width: 696 Image Length: 520
    
  • note that all other plates in this study have fields of view of 696 x 520, so the problematic file is the 0005-12 53BP1.stk
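
A small sketch of the check described in the bullets above: given the (width, height) reported for each file, flag those that differ from the expected field size. The sizes are the ones reported by tiffinfo above; how they are collected (e.g. by parsing tiffinfo output) is left out, and the function name is illustrative.

```python
def mismatched_sizes(sizes, expected=(696, 520)):
    """Return the files whose (width, height) differs from the expected size."""
    return {name: size for name, size in sizes.items() if size != expected}


# Values as reported by tiffinfo for the two channel files of plate 5-12.
sizes = {
    "0005-12 53BP1.stk": (695, 520),
    "0005-12 Dapi.stk": (696, 520),
}
print(mismatched_sizes(sizes))
```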

It might be worth looking into the history of this submission to see if a workaround and/or an alternative binary file could be found with the correct dimensions in which case a reimport might be in scope.

Otherwise, a cleanup solution would be to de-annotate and delete this broken plate from production IDR in an upcoming release. /cc @francesw @will-moore

@will-moore
Member Author

Looked in /uod/idr/filesets/idr0010-doil-dnadamage/20150501-original/Restored\ GW\ screen/5-12/ and compared the tiffinfo of 0005-12\ 53BP1.stk with that of the corresponding file in screen 3-11, but found that this probably can't be used as a drop-in replacement. The biggest difference is in all the tiff tags (I don't actually know what these do)!

I don't have access to any of the historical discussion of this study. Was this on Trello (before Redmine)?

Thinking this is probably best to delete that plate 5-12 from IDR (and remove annotations from the table: https://idr.openmicroscopy.org/webclient/omero_table/14209182/?query=Plate-5894)

@will-moore
Member Author

Updating ZarrReader on idr0125-pilot to test plate layout (currently looks as on #641 (comment))

sudo -u omero-server -s
cd
wget https://merge-ci.openmicroscopy.org/jenkins/job/BIOFORMATS-build/lastBuild/default/artifact/bio-formats-build/ZarrReader/target/OMEZarrReader-0.3.2-SNAPSHOT-jar-with-dependencies.jar
rm OMEZarrReader.jar
mv OMEZarrReader-0.3.2-SNAPSHOT-jar-with-dependencies.jar OMEZarrReader.jar

rm /opt/omero/server/OMERO.server/lib/client/OMEZarrReader.jar
cp OMEZarrReader.jar /opt/omero/server/OMERO.server/lib/client/
rm /opt/omero/server/OMERO.server/lib/server/OMEZarrReader.jar
cp OMEZarrReader.jar /opt/omero/server/OMERO.server/lib/server/

sudo service omero-server restart

This has no effect on the Images that are shown for each Well in the plate above.
Original Plate can be seen at https://hms-dbmi.github.io/vizarr/?source=https://uk1s3.embassy.ebi.ac.uk/idr0010/zarr/1-23.ome.zarr

@will-moore
Member Author

will-moore commented Aug 31, 2023

Testing delete of Plate 5-12 and cleanup of annotations on idr0138-pilot, similar to idr0004 #637 (comment)

This first command took a very long time. Left running overnight!

omero metadata populate --context deletemap --report --wait 300 --batch 100 --localcfg '{"ns":["openmicroscopy.org/mapr/organism", "openmicroscopy.org/mapr/antibody", "openmicroscopy.org/mapr/gene", "openmicroscopy.org/mapr/cell_line", "openmicroscopy.org/mapr/phenotype", "openmicroscopy.org/mapr/sirna", "openmicroscopy.org/mapr/compound", "openmicroscopy.org/mapr/protein"], "typesToIgnore":["Annotation"]}' --cfg idr0010-screenA-bulkmap-config.yml Screen:1351

omero metadata populate --context deletemap --report --wait 300 --batch 100 --cfg idr0010-screenA-bulkmap-config.yml Screen:1351

python /uod/idr/metadata/idr-utils/scripts/annotate/clean_orphaned_maps.py
INFO:omero.util.Resources:Starting
INFO:omero.util.Resources:Starting
INFO:omero.util.Resources:Halted
INFO:root:Found 0 orphaned Organism maps
INFO:root:Found 0 orphaned Antibody maps
INFO:root:Found 634 orphaned Gene maps
INFO:root:Deleting 500 maps
INFO:root:Deleting 134 maps
INFO:root:Found 0 orphaned Cell Line maps
INFO:root:Found 2 orphaned Phenotype maps
INFO:root:Deleting 2 maps
INFO:root:Found 2 orphaned siRNA maps
INFO:root:Deleting 2 maps
INFO:root:Found 0 orphaned Compound maps
INFO:root:Found 0 orphaned Protein maps
INFO:root:Found 0 orphaned Notebook maps
INFO:root:Found 0 orphaned Study Info maps
INFO:root:Found 0 orphaned Study Components maps
INFO:omero.util.Resources:Halted

omero metadata deletebulkanns Screen:1351

Then we delete Plate...

$ omero delete Plate:5894 --report
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
omero.cmd.Delete2 Plate:5894 ok
Steps: 6
Elapsed time: 4.917 secs.
Flags: []
Deleted objects
  Detector:11795
  DetectorSettings:12295
  ImagingEnvironment:1368964-1369347
  Instrument:12245
  Laser:1545
  Objective:12045
  ObjectiveSettings:11795
  CommentAnnotation:2581005
  FilesetAnnotationLink:22345
  Channel:10014077-10014844
  Image:3086564-3086947
  LogicalChannel:42039,42040
  OriginalFile:14186393-14186396
  Pixels:3086564-3086947
  PlaneInfo:8256047-8256430
  Thumbnail:2895206-2895589
  Fileset:22545
  FilesetEntry:13995483-13995485
  FilesetJobLink:105221-105225
  IndexingJob:112189
  JobOriginalFileLink:35059
  MetadataImportJob:112186
  PixelDataJob:112187
  ThumbnailGenerationJob:112188
  UploadJob:112185
  Plate:5894
  ScreenPlateLink:6094
  Well:1291863-1292246
  WellSample:2890313-2890696
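
As a sanity check on the report, the ID ranges can be expanded into counts. This sketch assumes that `a-b` in the report denotes an inclusive, contiguous ID range and that commas separate individual IDs or ranges; the helper names are illustrative.

```python
def count_ids(spec: str) -> int:
    """Count IDs in a spec such as '3086564-3086947' or '42039,42040'."""
    total = 0
    for part in spec.split(","):
        if "-" in part:
            first, last = (int(x) for x in part.split("-"))
            total += last - first + 1  # assumed inclusive range
        else:
            total += 1
    return total


def summarize(report_lines) -> dict:
    """Map each object type to the number of deleted objects."""
    counts = {}
    for line in report_lines:
        obj_type, _, spec = line.strip().partition(":")
        if spec:
            counts[obj_type] = count_ids(spec)
    return counts


report = [
    "Image:3086564-3086947",
    "Channel:10014077-10014844",
    "LogicalChannel:42039,42040",
    "Well:1291863-1292246",
]
print(summarize(report))
```

Under that assumption this gives 384 Images and Wells and 768 Channels, which is consistent with a 384-well plate with 2 channels per image.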

@will-moore
Member Author

TODO (still testing on idr0138-pilot):

  • delete all rows from 5-12 from Table (annotation.csv)
  • re-annotate all the other plates

I'm wondering if this is really the best course of action, since we lose a lot of study results by deleting them from the OMERO.table. Keeping them would leave open the possibility that we fix the images in future (if a user wants to work with the data).

@will-moore
Member Author

Decision in IDR meeting on Monday was not to extend the NGFF work to include unrelated cleanup work.
We will simply leave Plate 5-12 as it is.

There is still a ZarrReader issue outstanding for idr0010, but the data is ready to be submitted to BioStudies...

@will-moore
Member Author

Deleted 5-12.ome.zarr.zip (invalid) and 153-29.ome.zarr.zip since this Plate isn't published in IDR.
Updated the idr0015_files.tsv accordingly.

@will-moore will-moore assigned francesw and unassigned will-moore Sep 6, 2023
@francesw francesw changed the title idr0010-doil-dnadamage to NGFF idr0010-doil-dnadamage S-BIAD885 Sep 6, 2023
@francesw francesw moved this from BioStudies Submission to Data on Embassy s3 in NGFF conversion Sep 11, 2023
@francesw francesw removed their assignment Sep 11, 2023
@will-moore will-moore moved this from Data on Embassy s3 to create new Filesets in idr-next in NGFF conversion Sep 14, 2023
@will-moore
Member Author

Testing mkngff workflow for ALL 147 Plates on idr-testing:omeroreadwrite. idr0010.csv at IDR/idr-utils@631808b

(venv3) bash-4.2$ for r in $(cat $IDRID.csv); do
>   biapath=$(echo $r | cut -d',' -f2)
>   uuid=$(echo $biapath | cut -d'/' -f2)
>   fsid=$(echo $r | cut -d',' -f3)
>   omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"
> done
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2016-10/14 // 04-18-15.445 for fileset 22563
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-10/14/04-18-15.445
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-10/14/04-18-15.445_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-10/14/04-18-15.445_mkngff/0046b0d0-f20b-4482-84b1-4b2b154865fd.zarr -> /bia-integrator-data/S-BIAD885/0046b0d0-f20b-4482-84b1-4b2b154865fd/0046b0d0-f20b-4482-84b1-4b2b154865fd.zarr
...

@will-moore
Member Author

....Got 142 (out of 147) sql scripts generated so far and cancelled it because I need to restart server - SECRET will be invalid in all these scripts.

In nearly 22 hours, 142 filesets were processed -> about 6.5 filesets an hour, roughly 9 mins each.

@will-moore
Member Author

On idr-testing, as omero-server user, exporting last bunch with

idr0010/67-44.ome.zarr,S-BIAD885/f5170a3f-aec7-4229-ab84-d19f592588cd,22554
idr0010/29-30.ome.zarr,S-BIAD885/f54fa4a2-851f-4a62-87c3-e401f0edfb4f,22523
idr0010/150-15.ome.zarr,S-BIAD885/f5511c21-e4d0-41a9-a419-396c8daa180c,20855
idr0010/43-04.ome.zarr,S-BIAD885/f67c947b-e88d-4124-905e-60cd570868d9,22539
idr0010/66-37.ome.zarr,S-BIAD885/f98d6f98-7434-41a7-a104-209536635967,22553
idr0010/77-20.ome.zarr,S-BIAD885/fc4f84a3-87f2-42b9-84c7-dba54604c57c,22564
(venv3) bash-4.2$ for r in $(cat $IDRID.csv); do
>   biapath=$(echo $r | cut -d',' -f2)
>   uuid=$(echo $biapath | cut -d'/' -f2)
>   fsid=$(echo $r | cut -d',' -f3)
>   omero mkngff sql $fsid "/bia-integrator-data/$biapath/$uuid.zarr" >> "$IDRID/$fsid.sql"
> done
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-10/14/02-08-12.416 for fileset: 22554
...
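
For clarity, here is what the `cut` calls in the loop extract from each CSV row, as a Python sketch. The row is one of those listed above; `mkngff_args` is an illustrative helper, not part of omero-mkngff.

```python
def mkngff_args(row: str, root: str = "/bia-integrator-data"):
    """Split one CSV row into (fileset_id, zarr_path), as the shell loop does."""
    _zarr_name, biapath, fsid = row.strip().split(",")
    uuid = biapath.split("/")[1]  # second path component, like `cut -d'/' -f2`
    return fsid, f"{root}/{biapath}/{uuid}.zarr"


row = "idr0010/67-44.ome.zarr,S-BIAD885/f5170a3f-aec7-4229-ab84-d19f592588cd,22554"
print(mkngff_args(row))
```

The two values correspond to the `$fsid` and zarr path arguments passed to `omero mkngff sql` in the loop.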

@will-moore will-moore moved this from check_pixels to pixels validated in NGFF conversion Nov 27, 2023
@will-moore will-moore removed the bug label Nov 27, 2023
@will-moore will-moore moved this from pixels validated to Round 2 - psql fileset IDs checked in NGFF conversion Mar 18, 2024
@will-moore will-moore moved this from Other issues (not studies) to NGFF studies in NGFF conversion May 21, 2024