diff --git a/.github/workflows/build-and-publish-docs.yml b/.github/workflows/build-and-publish-docs.yml
new file mode 100644
index 0000000..decb148
--- /dev/null
+++ b/.github/workflows/build-and-publish-docs.yml
@@ -0,0 +1,57 @@
+name: Build and Deploy Documentation
+
+on:
+  push:
+    branches:
+      - main
+      - qy/create-docs
+
+permissions:
+  contents: write
+
+jobs:
+  build-and-deploy:
+    runs-on: ubuntu-latest
+
+    steps:
+      # Checkout the repository
+      - name: Checkout Code
+        uses: actions/checkout@v4
+
+      # Configure Git credentials
+      - name: Configure Git Credentials
+        run: |
+          git config user.name "github-actions[bot]"
+          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
+
+      # Set up Python
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.x"
+
+      # Generate cache ID (ISO week number, so the cache rotates weekly)
+      - name: Set Cache ID
+        run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
+
+      # Cache Python dependencies
+      - name: Cache Python Dependencies
+        uses: actions/cache@v4
+        with:
+          key: mkdocs-material-${{ env.cache_id }}
+          path: .cache
+          restore-keys: |
+            mkdocs-material-
+
+      # Install MkDocs and plugins, deploy documentation
+      - name: Install Dependencies and Deploy Docs
+        run: |
+          pip install mkdocs-material \
+                      mkdocs-git-revision-date-localized-plugin \
+                      mkdocs-git-committers-plugin-2 \
+                      mkdocs-autorefs \
+                      "mkdocstrings[python]" \
+                      markdown-exec
+          mkdocs gh-deploy --force
+        env:
+          MKDOCS_GIT_COMMITTERS_APIKEY: ${{ secrets.MKDOCS_GIT_COMMITTERS_APIKEY }}
diff --git a/README.md b/README.md
index f4f3446..82c279f 100644
--- a/README.md
+++ b/README.md
@@ -1,107 +1,73 @@
-# Plant Nuclei Segmentation Pipelines
+# Nuclear Segmentation Pipelines
 
-This repository hosts the code and guides for the pipelines used in the paper [_A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context_](https://www.biorxiv.org/content/10.1101/2024.02.19.580954v1). It is structured in to four folders:
+![stardist_raw_and_segmentation](https://zenodo.org/records/8432366/files/stardist_raw_and_segmentation.jpg)
+
+The GoNuclear repository hosts the code and guides for the pipelines used in the paper [_A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context_](https://doi.org/10.1242/dev.202800). It is structured into four folders:
 
 - **stardist/** contains a 3D StarDist training and inference pipeline, `run-stardist`.
 - **plantseg/** contains configuration files for training and inference with PlantSeg.
 - **cellpose/** contains scripts for training and inference with Cellpose.
 - **evaluation/** contains modules for evaluating the segmentation results.
 
-## Table of Contents
-
-- [Tools and Workflows](#tools-and-workflows)
-  - [StarDist](#stardist)
-  - [PlantSeg](#plantseg)
-  - [Cellpose](#cellpose)
-- [Data](#data)
-  - [Training Data](#training-data)
-  - [Preparing Data for Inference](#preparing-data-for-inference)
-- [Cite](#cite)
-
-
-## Tools and Workflows
-
-### StarDist
-
-*See [`run-stardist`'s README.md](stardist/README.md) for more details.*
-
-This is one of the most important contribution of this repository. If your nuclei are more or less uniform in shape, please consider using the `run-stardist` pipeline in this repository. It generate separate and round instance segmentation masks for your nuclei images.
-
-- The code and tutorial for running StarDist inference is in the `stardist/` folder
-- The pretrained model is automatically downloaded during inference (also available at [BioImage.IO: StarDist Plant Nuclei 3D ResNet](https://bioimage.io/#/?id=10.5281%2Fzenodo.8421755))
-- An example of segmentation results is shown below.
-
-![stardist_raw_and_segmentation](https://zenodo.org/records/8432366/files/stardist_raw_and_segmentation.jpg)
-
-### PlantSeg
-
-*See [PlantSeg's README.md](plantseg/README.md) for more details.*
-
-If your nuclei have irregular shapes, please consider using the PlantSeg pipeline. It generates instance masks for your nuclei images regardless of their nucleus size and shape.
-
-- The code and tutorial for running PlantSeg inference is in the `plantseg/` folder
-- The pretrained model is automatically downloaded during inference (also available at [BioImage.IO: PlantSeg Plant Nuclei 3D UNet](https://bioimage.io/#/?id=10.5281%2Fzenodo.8401064))
-- An example of segmentation results is shown below.
-
-![plantseg_raw_and_gasp_segmentation](https://zenodo.org/records/10070349/files/plantseg_raw_and_gasp_segmentation.jpg)
-
-### Cellpose
-
-*See [Cellpose's README.md](cellpose/README.md) for more details.*
-
-- The guide for running Cellpose inference and training is in the `cellpose/` folder
-
-## Data
-
-### Training Data
-
-The training data is publicly available on [BioImage Archive](https://www.ebi.ac.uk/biostudies/BioImages/studies/S-BIAD1026).
-
-An example of the raw image:
-
-![raw](https://zenodo.org/records/10070349/files/plantseg_raw.jpg)
-
-Some key information about the training data is listed below:
-
-```python
-original_voxel_size = { # z, y, x
-    1135: [0.28371836501901143, 0.12678642066720086, 0.12678642066720086], # validation
-    1136: [0.2837183895131086, 0.12756971653115998, 0.12756971653115998], # training
-    1137: [0.2837183895131086, 0.1266211463645486, 0.1266211463645486 ], # training
-    1139: [0.2799036917562724, 0.12674335484590543, 0.12674335484590543], # training
-    1170: [0.27799632231404964, 0.12698523961670266, 0.12698522349145364], # training
-} # [0.2837, 0.1268, 0.1268] is taken as the median
-
-original_median_extents = { # z, y, x
-    1135: [16, 32, 33], # validation
-    1136: [16, 32, 32], # training
-    1137: [16, 32, 32], # training
-    1139: [16, 32, 33], # training
-    1170: [16, 29, 30], # training
-    'average':
-} # [16, 32, 32] is taken as the median
-```
+and are described in [**GoNuclear documentation** :book:](https://kreshuklab.github.io/go-nuclear/).
+
+## Data and Models
+
+Please go to [BioImage Archive S-BIAD1026](https://www.ebi.ac.uk/biostudies/BioImages/studies/S-BIAD1026) for the training data and models. I organised them in the following structure:
+
+```bash
+Training data
+├── 2d/
+│   ├── isotropic/
+│   │   ├── gold/
+│   │   └── initial/
+│   └── original/
+│       ├── gold/
+│       └── README.txt
+└── 3d_all_in_one/
+    ├── 1135.h5
+    ├── 1136.h5
+    ├── 1137.h5
+    ├── 1139.h5
+    └── 1170.h5
+
+Models
+├── cellpose/
+│   ├── cyto2_finetune/
+│   │   └── gold/
+│   ├── nuclei_finetune/
+│   │   ├── gold/
+│   │   └── initial/
+│   └── scratch_trained/
+│       └── gold/
+├── plantseg/
+│   └── 3dunet/
+│       ├── gold/
+│       ├── initial/
+│       ├── platinum/
+│       └── train_example.yml
+└── stardist/
+    ├── resnet/
+    │   ├── gold/
+    │   ├── initial/
+    │   └── platinum/
+    ├── train_example.yml
+    └── unet/
+        └── gold/
+```
-**Note for training Cellpose:** The best image form for training StarDist and PlantSeg models are the original forms, i.e. the linked dataset is the one that provide the best results. However, to train Cellpose which only takes 2D training data, the images are prepared to be 2D slices of the rescaled isotropic 3D images. The 2D slices includes all XY, XZ and YZ slices ordered randomly by a random prefix in the file name. The 2D slices are saved as TIFF files.
-
-### Preparing Data for Inference
-
-Both HDF5 files and TIFF files can be directly used for both `run-stardist` and `plant-seg` inference. Go to the respective folders's README.md for more details.
-
-## Cite
+## Citation
 
 If you find this work useful, please cite our paper and the respective tools' papers:
 
 ```bibtex
-@article {Vijayan2024.02.19.580954,
-    author = {Athul Vijayan and Tejasvinee Atul Mody and Qin Yu and Adrian Wolny and Lorenzo Cerrone and Soeren Strauss and Miltos Tsiantis and Richard S. Smith and Fred Hamprecht and Anna Kreshuk and Kay Schneitz},
-    title = {A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context},
-    elocation-id = {2024.02.19.580954},
-    year = {2024},
-    doi = {10.1101/2024.02.19.580954},
-    publisher = {Cold Spring Harbor Laboratory},
-    URL = {https://www.biorxiv.org/content/early/2024/02/21/2024.02.19.580954},
-    eprint = {https://www.biorxiv.org/content/early/2024/02/21/2024.02.19.580954.full.pdf},
-    journal = {bioRxiv}
+@article{vijayan2024deep,
+  title={A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context},
+  author={Vijayan, Athul and Mody, Tejasvinee Atul and Yu, Qin and Wolny, Adrian and Cerrone, Lorenzo and Strauss, Soeren and Tsiantis, Miltos and Smith, Richard S and Hamprecht, Fred A and Kreshuk, Anna and others},
+  journal={Development},
+  volume={151},
+  number={14},
+  year={2024},
+  publisher={The Company of Biologists}
 }
 ```
diff --git a/cellpose/README.md b/cellpose/README.md
index f699574..989e502 100644
--- a/cellpose/README.md
+++ b/cellpose/README.md
@@ -2,38 +2,44 @@
 
 This part of the repo concisely shows how to install, train and segment with Cellpose. In other words, it is a record of how Cellpose is used in this paper. Since my experiments show StarDist and PlantSeg have better 3D segmentation performance than Cellpose, this section is complete yet not extensive.
 
-- [Installation](#installation)
-  - [Install Miniconda](#install-miniconda)
-  - [Install `cellpose` using `pip`](#install-cellpose-using-pip)
-- [Segmentation](#segmentation)
-  - [Data Preparation](#data-preparation)
-  - [Segmentation Command](#segmentation-command)
-- [Training](#training)
-  - [Data Preparation](#data-preparation-1)
-  - [Training Command](#training-command)
-- [Cellpose Version and Code](#cellpose-version-and-code)
-
+* [Installation](#installation)
+  * [Install Miniconda](#install-miniconda)
+  * [Install `cellpose` using `pip`](#install-cellpose-using-pip)
+* [Segmentation](#segmentation)
+  * [Data Preparation](#data-preparation)
+  * [Segmentation Command](#segmentation-command)
+* [Training](#training)
+  * [Data Preparation](#data-preparation-1)
+  * [Training Command](#training-command)
+* [Cellpose Version and Code](#cellpose-version-and-code)
+* [Cite](#cite)
 
 ## Installation
 
 It is recommended to install this package in an environment managed by `conda`. We start the guide by installing Mini-`conda`.
 
 ### Install Miniconda
+
 The first step required to use the pipeline is installing Miniconda. If you already have a working Anaconda setup, you can go directly to the next step. Anaconda can be downloaded for all platforms from [here](https://www.anaconda.com/products/distribution). We suggest using Miniconda because it is lighter and installs fewer unnecessary packages. To download Miniconda, open a terminal and type:
+
 ```bash
 wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
 ```
 
 Then install by typing:
+
 ```bash
 bash ./Miniconda3-latest-Linux-x86_64.sh
 ```
+
 and follow the installation instructions. The Miniconda3-latest-Linux-x86_64.sh file can be safely deleted.
 
 ### Install `cellpose` using `pip`
+
 To create and activate a `conda` environment for `cellpose`, then install `cellpose` in it, run the following commands in the terminal:
+
 ```bash
 conda create --name cellpose python=3.8
 conda activate cellpose
@@ -41,6 +47,7 @@
 pip install cellpose
 ```
 
 If you have an NVIDIA GPU, follow these steps to make use of it:
+
 ```bash
 pip uninstall torch
 conda install pytorch==1.12.0 cudatoolkit=11.3 -c pytorch
@@ -50,13 +57,18 @@
 
 If you encounter errors or need more explanation, go to [Cellpose's official instructions](https://github.com/MouseLand/cellpose#instructions).
 
 ## Segmentation
 
+Although the PlantSeg and StarDist models from this study outperform the Cellpose models I trained, the Cellpose models are still available: one may find the gold models in [BioImage Archive S-BIAD1026](https://www.ebi.ac.uk/biostudies/BioImages/studies/S-BIAD1026), or one of them, [`philosophical-panda`, at the BioImage Model Zoo](https://bioimage.io/#/?tags=qin%20yu&id=philosophical-panda).
+
 ### Data Preparation
+
 Cellpose inference only segments TIFF images, not HDF5. However, it can take 3D volumes as input.
 
 ### Segmentation Command
 
 There are two ways of segmenting 3D images with Cellpose:
-- Segment 3D images slice by slice then stitch 2D segmentation results into 3D segmentation results. With this approach, the images doesn't have to be isotropic, as long as the XY planes have similar properties as the training data.
+
+* Segment 3D images slice by slice, then stitch the 2D segmentation results into a 3D segmentation. With this approach, the images don't have to be isotropic, as long as the XY planes have similar properties to the training data.
+
   ```bash
   cellpose \
     --pretrained_model PATH_TO_MODEL \
    --savedir PATH_TO_OUTPUT_DIR \
    --dir PATH_TO_3D_TIFF_FOLDER \
    --diameter 26.5 \
    --verbose \
    --use_gpu \
    --stitch_threshold 0.9 \
    --chan 0 \
    --no_npy \
    --save_tif
   ```
-- Compute spatial flow of 3D images in all dimensions then segment the images in 3D directly. You may choose to rescale the images to be isotropic before segmentation, or specify the anisotropy to let Cellpose deal with the rescaling. Here I show the later.
+
+* Compute the spatial flow of 3D images in all dimensions, then segment the images in 3D directly. You may choose to rescale the images to be isotropic before segmentation, or specify the anisotropy to let Cellpose deal with the rescaling. Here I show the latter.
+
   ```bash
   cellpose \
    --pretrained_model PATH_TO_MODEL \
    --savedir PATH_TO_OUTPUT_DIR \
    --dir PATH_TO_3D_TIFF_FOLDER \
    --diameter 26.5 \
    --anisotropy 2.24 \
    --verbose \
    --use_gpu \
    --do_3D \
    --chan 0 \
    --no_npy \
    --save_tif
   ```
 
-
 ## Training
 
 ### Data Preparation
+
 Cellpose training only takes 2D images as input. To train on 3D images, we first need to split the 3D images into 2D images. Note that it is better to rescale the 3D images to isotropy before extracting the 2D training data.
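+
+A minimal sketch of this preparation step, assuming `h5py`, `scipy`, and `tifffile` are available; the file name, HDF5 key, and rescaling factor below are hypothetical placeholders, not the exact script used in the paper:
+
+```python
+import uuid
+
+import h5py
+import numpy as np
+import tifffile
+from scipy.ndimage import zoom
+
+with h5py.File("1136.h5", "r") as f:  # hypothetical file and dataset key
+    raw = f["raw/nuclei"][:]          # 3D stack, axes (Z, Y, X)
+
+# Rescale Z so that voxels are isotropic, e.g. 0.2837 / 0.1268 ≈ 2.24.
+raw_iso = zoom(raw, (2.24, 1.0, 1.0), order=1)
+
+# Export all XY, XZ and YZ slices; a random prefix shuffles their order.
+# Label volumes would be rescaled with order=0 and saved with a `_masks` suffix.
+for axis in range(3):
+    for i in range(raw_iso.shape[axis]):
+        plane = np.take(raw_iso, i, axis=axis)
+        tifffile.imwrite(f"{uuid.uuid4().hex[:8]}_ax{axis}_{i:04d}.tif", plane)
+```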
 
 ### Training Command
+
 An example training command is shown below, which is used in the paper. The parameters `--learning_rate 0.1` and `--weight_decay 0.0001` are recommended by the [Cellpose official documentation](https://cellpose.readthedocs.io/en/latest/train.html).
 
 ```bash
 cellpose --train --use_gpu \
     --dir PATH_TO_TRAINING_DATA \
     --pretrained_model nuclei \
     --learning_rate 0.1 \
     --weight_decay 0.0001 \
     --mask_filter _masks \
     --verbose
 ```
 
@@ -106,4 +121,32 @@
 ## Cellpose Version and Code
+
 See [Cellpose's GitHub page](https://github.com/MouseLand/cellpose) for the code. Cellpose v2.0.5 was used for training and inference in this paper.
+
+## Cite
+
+If you find the code/models/datasets useful, please cite our paper and Cellpose:
+
+```bibtex
+@article{vijayan2024deep,
+  title={A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context},
+  author={Vijayan, Athul and Mody, Tejasvinee Atul and Yu, Qin and Wolny, Adrian and Cerrone, Lorenzo and Strauss, Soeren and Tsiantis, Miltos and Smith, Richard S and Hamprecht, Fred A and Kreshuk, Anna and others},
+  journal={Development},
+  volume={151},
+  number={14},
+  year={2024},
+  publisher={The Company of Biologists}
+}
+
+@article{stringer2021cellpose,
+  title={Cellpose: a generalist algorithm for cellular segmentation},
+  author={Stringer, Carsen and Wang, Tim and Michaelos, Michalis and Pachitariu, Marius},
+  journal={Nature Methods},
+  volume={18},
+  number={1},
+  pages={100--106},
+  year={2021},
+  publisher={Nature Publishing Group US New York}
+}
+```
diff --git a/docs/chapters/cellpose/index.md b/docs/chapters/cellpose/index.md
new file mode 100644
index 0000000..1e2191a
--- /dev/null
+++ b/docs/chapters/cellpose/index.md
@@ -0,0 +1,142 @@
+# Use Cellpose: A Guide
+
+_This documentation page is a copy of the [GoNuclear-Cellpose README.md file](https://github.com/kreshuklab/go-nuclear/blob/main/cellpose/README.md)._
+
+This part of the repo concisely shows how to install, train and segment with Cellpose. In other words, it is a record of how Cellpose is used in this paper. Since my experiments show StarDist and PlantSeg have better 3D segmentation performance than Cellpose, this section is complete yet not extensive.
+
+## Installation
+
+It is recommended to install this package in an environment managed by `conda`. We start the guide by installing Mini-`conda`.
+
+### Install Miniconda
+
+The first step required to use the pipeline is installing Miniconda. If you already have a working Anaconda setup, you can go directly to the next step. Anaconda can be downloaded for all platforms from [here](https://www.anaconda.com/products/distribution). We suggest using Miniconda because it is lighter and installs fewer unnecessary packages.
+
+To download Miniconda, open a terminal and type:
+
+```bash
+wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
+```
+
+Then install by typing:
+
+```bash
+bash ./Miniconda3-latest-Linux-x86_64.sh
+```
+
+and follow the installation instructions. The Miniconda3-latest-Linux-x86_64.sh file can be safely deleted.
+
+### Install `cellpose` using `pip`
+
+To create and activate a `conda` environment for `cellpose`, then install `cellpose` in it, run the following commands in the terminal:
+
+```bash
+conda create --name cellpose python=3.8
+conda activate cellpose
+pip install cellpose
+```
+
+If you have an NVIDIA GPU, follow these steps to make use of it:
+
+```bash
+pip uninstall torch
+conda install pytorch==1.12.0 cudatoolkit=11.3 -c pytorch
+```
+
+If you encounter errors or need more explanation, go to [Cellpose's official instructions](https://github.com/MouseLand/cellpose#instructions).
+
+## Segmentation
+
+Although the PlantSeg and StarDist models from this study outperform the Cellpose models I trained, the Cellpose models are still available: one may find the gold models in [BioImage Archive S-BIAD1026](https://www.ebi.ac.uk/biostudies/BioImages/studies/S-BIAD1026), or one of them, [`philosophical-panda`, at the BioImage Model Zoo](https://bioimage.io/#/?tags=qin%20yu&id=philosophical-panda).
+
+### Data Preparation
+
+Cellpose inference only segments TIFF images, not HDF5. However, it can take 3D volumes as input.
+
+### Segmentation Command
+
+There are two ways of segmenting 3D images with Cellpose:
+
+- Segment 3D images slice by slice, then stitch the 2D segmentation results into a 3D segmentation. With this approach, the images don't have to be isotropic, as long as the XY planes have similar properties to the training data.
+
+  ```bash
+  cellpose \
+    --pretrained_model PATH_TO_MODEL \
+    --savedir PATH_TO_OUTPUT_DIR \
+    --dir PATH_TO_3D_TIFF_FOLDER \
+    --diameter 26.5 \
+    --verbose \
+    --use_gpu \
+    --stitch_threshold 0.9 \
+    --chan 0 \
+    --no_npy \
+    --save_tif
+  ```
+
+- Compute the spatial flow of 3D images in all dimensions, then segment the images in 3D directly. You may choose to rescale the images to be isotropic before segmentation, or specify the anisotropy to let Cellpose deal with the rescaling. Here I show the latter (the sketch after this list shows where the example `--anisotropy` value comes from).
+
+  ```bash
+  cellpose \
+    --pretrained_model PATH_TO_MODEL \
+    --savedir PATH_TO_OUTPUT_DIR \
+    --dir PATH_TO_3D_TIFF_FOLDER \
+    --diameter 26.5 \
+    --anisotropy 2.24 \
+    --verbose \
+    --use_gpu \
+    --do_3D \
+    --chan 0 \
+    --no_npy \
+    --save_tif
+  ```
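+
+As a back-of-the-envelope check of the `--anisotropy 2.24` above: it is simply the ratio of the axial (Z) to lateral (XY) voxel size. A small sketch using the median voxel size of the training data listed in this repository; replace the numbers with the voxel size of your own data:
+
+```python
+z, y, x = 0.2837, 0.1268, 0.1268  # median training voxel size (z, y, x), in µm
+anisotropy = z / x                # ratio passed to --anisotropy
+print(round(anisotropy, 2))       # 2.24
+```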
+
+## Training
+
+### Data Preparation
+
+Cellpose training only takes 2D images as input. To train on 3D images, we first need to split the 3D images into 2D images. Note that it is better to rescale the 3D images to isotropy before extracting the 2D training data.
+
+### Training Command
+
+An example training command is shown below, which is used in the paper. The parameters `--learning_rate 0.1` and `--weight_decay 0.0001` are recommended by the [Cellpose official documentation](https://cellpose.readthedocs.io/en/latest/train.html).
+
+```bash
+cellpose --train --use_gpu \
+    --dir PATH_TO_TRAINING_DATA \
+    --pretrained_model nuclei \
+    --learning_rate 0.1 \
+    --weight_decay 0.0001 \
+    --mask_filter _masks \
+    --verbose
+```
+
+## Cellpose Version and Code
+
+See [Cellpose's GitHub page](https://github.com/MouseLand/cellpose) for the code. Cellpose v2.0.5 was used for training and inference in this paper.
+
+## Cite
+
+If you find the code/models/datasets useful, please cite our paper and Cellpose:
+
+```bibtex
+@article{vijayan2024deep,
+  title={A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context},
+  author={Vijayan, Athul and Mody, Tejasvinee Atul and Yu, Qin and Wolny, Adrian and Cerrone, Lorenzo and Strauss, Soeren and Tsiantis, Miltos and Smith, Richard S and Hamprecht, Fred A and Kreshuk, Anna and others},
+  journal={Development},
+  volume={151},
+  number={14},
+  year={2024},
+  publisher={The Company of Biologists}
+}
+
+@article{stringer2021cellpose,
+  title={Cellpose: a generalist algorithm for cellular segmentation},
+  author={Stringer, Carsen and Wang, Tim and Michaelos, Michalis and Pachitariu, Marius},
+  journal={Nature Methods},
+  volume={18},
+  number={1},
+  pages={100--106},
+  year={2021},
+  publisher={Nature Publishing Group US New York}
+}
+```
diff --git a/docs/chapters/evaluation/index.md b/docs/chapters/evaluation/index.md
new file mode 100644
index 0000000..4e6c627
--- /dev/null
+++ b/docs/chapters/evaluation/index.md
@@ -0,0 +1,21 @@
+# Evaluation Module
+
+_This documentation page is a copy of the [GoNuclear-Evaluation README.md file](https://github.com/kreshuklab/go-nuclear/blob/main/evaluation/README.md)._
+
+This module contains the code for evaluating the performance of the trained models. It is an implementation of [the scoring metric of the 2018 Data Science Bowl](https://www.kaggle.com/code/stkbailey/step-by-step-explanation-of-scoring-metric).
+
+## Cite
+
+If you find this work useful, please cite our paper:
+
+```bibtex
+@article{vijayan2024deep,
+  title={A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context},
+  author={Vijayan, Athul and Mody, Tejasvinee Atul and Yu, Qin and Wolny, Adrian and Cerrone, Lorenzo and Strauss, Soeren and Tsiantis, Miltos and Smith, Richard S and Hamprecht, Fred A and Kreshuk, Anna and others},
+  journal={Development},
+  volume={151},
+  number={14},
+  year={2024},
+  publisher={The Company of Biologists}
+}
+```
diff --git a/docs/chapters/plantseg/index.md b/docs/chapters/plantseg/index.md
new file mode 100644
index 0000000..1616819
--- /dev/null
+++ b/docs/chapters/plantseg/index.md
@@ -0,0 +1,230 @@
+# Run PlantSeg: A Guide
+
+_This documentation page is a copy of the [GoNuclear-PlantSeg README.md file](https://github.com/kreshuklab/go-nuclear/blob/main/plantseg/README.md)._
+
+## Installation
+
+It is recommended to install this package with `mamba` (see below). If you don't have `mamba` installed, you can install it with `conda`. We start the guide by installing Mini-`conda`.
+
+### Install Miniconda
+
+The first step required to use the pipeline is installing Miniconda. If you already have a working Anaconda setup, you can go directly to the next step. Anaconda can be downloaded for all platforms from [here](https://www.anaconda.com/products/distribution). We suggest using Miniconda because it is lighter and installs fewer unnecessary packages.
+
+To download Miniconda, open a terminal and type:
+
+```bash
+wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
+```
+
+Then install by typing:
+
+```bash
+bash ./Miniconda3-latest-Linux-x86_64.sh
+```
+
+and follow the installation instructions. The Miniconda3-latest-Linux-x86_64.sh file can be safely deleted.
+
+### Install `plant-seg` using `mamba`
+
+The first step is to install `mamba`, which is an alternative to `conda`:
+
+```bash
+conda install -c conda-forge mamba
+```
+
+PlantSeg version >= v1.6.2 is required. If you have an NVIDIA GPU, install `plant-seg` using:
+
+```bash
+mamba create -n plant-seg -c pytorch -c nvidia -c conda-forge pytorch pytorch-cuda=12.1 pyqt lcerrone::plantseg
+```
+
+or, if you don't have an NVIDIA GPU, install `plant-seg` using:
+
+```bash
+mamba create -n plant-seg -c pytorch -c nvidia -c conda-forge pytorch cpuonly pyqt lcerrone::plantseg
+```
+
+## Inference
+
+### Example configuration file for both training and inference
+
+The original configuration file used for training the final UNet PlantSeg model published on BioImage.IO for wide applicability can be found at `plantseg/configs/config_train_final.yml`, which is a configuration file for [pytorch-3dunet](https://github.com/wolny/pytorch-3dunet), the core network of PlantSeg.
+
+An example config file for segmentation can be found at `plantseg/configs/config_pred_wide_applicability.yaml`. To modify it and use it for your own data, you need to change the `path` parameter:
+
+- `path`: path to the folder containing the images to be segmented, or to the single image to be segmented
+
+You may also need to change these parameters:
+
+- `preprocessing: factor`: a rescale factor to match the nucleus size of your data to the training data; not necessary, but it may help in specific cases (see the sketch after the configuration file below)
+- `cnn_prediction: patch`: the patch size should be smaller than the dimensions of your image and small enough to fit into GPU memory
+
+The full configuration file is shown below:
+
+```yaml
+# Contains the path to the directory or file to process
+path: PATH_TO_YOUR_DATA
+
+preprocessing:
+  # enable/disable preprocessing
+  state: True
+  # key for H5 or ZARR, can be set to null if only one key exists in each file
+  key: Null
+  # channel to use if input image has shape CZYX or CYX, otherwise set to null
+  channel: Null
+  # create a new sub folder where all results will be stored
+  save_directory: 'PreProcessing'
+  # rescaling the volume is essential for the generalization of the networks. The rescaling factor can be computed as the resolution
+  # of the volume at hand divided by the resolution of the dataset used in training. Be careful: if the difference is too large, check for a different model.
+  factor: [1.0, 1.0, 1.0]
+  # the order of the spline interpolation
+  order: 2
+  # cropping out areas of little interest can drastically improve the performance of plantseg.
+  # crop volume has to be input using the numpy slicing convention [b_z:e_z, b_x:e_x, b_y:e_y], where b_zxy is the
+  # first point of a bounding box and e_zxy is the second. eg: [:, 100:500, 400:900]
+  crop_volume: '[:,:,:]'
+  # optional: perform Gaussian smoothing or median filtering on the input.
+  filter:
+    # enable/disable filtering
+    state: False
+    # Accepted values: 'gaussian'/'median'
+    type: gaussian
+    # sigma (gaussian) or disc radius (median)
+    filter_param: 1.0
+
+cnn_prediction:
+  # enable/disable UNet prediction
+  state: True
+  # key for H5 or ZARR, can be set to null if only one key exists in each file; null is recommended if the previous step has state True
+  key: Null
+  # channel to use if input image has shape CZYX or CYX, otherwise set to null; null is recommended if the previous step has state True
+  channel: Null
+  # Trained model name, more info on available models and custom models in the PlantSeg documentation
+  model_name: 'PlantSeg_3Dnuc_platinum'
+  # If a CUDA-capable GPU is available and correctly set up, use "cuda"; if not, you can use "cpu" for CPU-only inference (slower)
+  device: 'cuda'
+  # (int or tuple) padding to be removed from each axis in a given patch in order to avoid checkerboard artifacts
+  patch_halo: [64, 64, 64]
+  # how many subprocesses to use for data loading
+  num_workers: 8
+  # patch size given to the network (adapt to fit in your GPU mem)
+  patch: [192, 256, 256]
+  # stride between patches will be computed as `stride_ratio * patch`
+  # recommended values are in range `[0.5, 0.75]` to make sure the patches have enough overlap to get smooth prediction maps
+  stride_ratio: 0.50
+  # If "True", forces downloading networks from the online repos
+  model_update: False
+
+cnn_postprocessing:
+  # enable/disable cnn post processing
+  state: True
+  # key for H5 or ZARR, can be set to null if only one key exists in each file; null is recommended if the previous step has state True
+  key: Null
+  # channel to use if input image has shape CZYX or CYX, otherwise set to null; null is recommended if the previous step has state True
+  channel: 1
+  # if True, convert the result to tiff
+  tiff: True
+  # rescaling factor
+  factor: [1, 1, 1]
+  # spline order for rescaling
+  order: 2
+
+segmentation:
+  # enable/disable segmentation
+  state: True
+  # key for H5 or ZARR, can be set to null if only one key exists in each file; null is recommended if the previous step has state True
+  key: 'predictions'
+  # channel to use if prediction has shape CZYX or CYX, otherwise set to null; null is recommended if the previous step has state True
+  channel: 1
+  # Name of the algorithm to use for inference. Options: MultiCut, MutexWS, GASP, DtWatershed
+  name: 'GASP'
+  # Segmentation-specific parameters here
+  # balance under-/over-segmentation; 0 - aim for undersegmentation, 1 - aim for oversegmentation. (Not active for DtWatershed)
+  beta: 0.5
+  # directory where to save the results
+  save_directory: 'GASP'
+  # enable/disable watershed
+  run_ws: True
+  # use 2D instead of 3D watershed
+  ws_2D: False
+  # probability maps threshold
+  ws_threshold: 0.4
+  # set the minimum superpixel size
+  ws_minsize: 50
+  # sigma for the gaussian smoothing of the distance transform
+  ws_sigma: 2.0
+  # sigma for the gaussian smoothing of boundary
+  ws_w_sigma: 0
+  # set the minimum segment size in the final segmentation. (Not active for DtWatershed)
+  post_minsize: 100
+
+segmentation_postprocessing:
+  # enable/disable segmentation post processing
+  state: True
+  # key for H5 or ZARR, can be set to null if only one key exists in each file; null is recommended if the previous step has state True
+  key: Null
+  # channel to use if input image has shape CZYX or CYX, otherwise set to null; null is recommended if the previous step has state True
+  channel: Null
+  # if True, convert the result to tiff
+  tiff: True
+  # rescaling factor
+  factor: [1, 1, 1]
+  # spline order for rescaling (keep 0 for segmentation post processing)
+  order: 0
+  # save raw input in the output segmentation h5 file
+  save_raw: False
+```
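+
+As noted in the `preprocessing` comments above, the rescaling `factor` is the resolution of your volume divided by the resolution of the training dataset. A minimal sketch, using the median training voxel size from this repository's dataset statistics and a hypothetical voxel size for your data:
+
+```python
+train_voxel_size = [0.2837, 0.1268, 0.1268]  # median training voxel size (z, y, x), in µm
+my_voxel_size = [0.5000, 0.2500, 0.2500]     # hypothetical voxel size of your data
+
+# resolution of the volume at hand divided by the training resolution
+factor = [m / t for m, t in zip(my_voxel_size, train_voxel_size)]
+print([round(f, 2) for f in factor])  # [1.76, 1.97, 1.97]
+```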
+
+### Prediction
+
+```shell
+plantseg --config CONFIG_PATH
+```
+
+where CONFIG_PATH is the path to the YAML configuration file. For example, if you want to use the model with the example configuration file `configs/config_pred_wide_applicability.yaml`:
+
+```shell
+cd ovules-instance-segmentation/plantseg/
+CUDA_VISIBLE_DEVICES=0 plantseg --config configs/config_pred_wide_applicability.yaml
+```
+
+### Specifying a Graphic Card (GPU)
+
+If you need to specify a graphics card, for example to use the No. 7 card (the eighth), do:
+
+```shell
+CUDA_VISIBLE_DEVICES=7 plantseg --config CONFIG_PATH
+```
+
+If you have only one graphics card, use `CUDA_VISIBLE_DEVICES=0` to select the first card (No. 0).
+
+## Cite
+
+If you find this work useful, please cite both papers:
+
+```bibtex
+@article{vijayan2024deep,
+  title={A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context},
+  author={Vijayan, Athul and Mody, Tejasvinee Atul and Yu, Qin and Wolny, Adrian and Cerrone, Lorenzo and Strauss, Soeren and Tsiantis, Miltos and Smith, Richard S and Hamprecht, Fred A and Kreshuk, Anna and others},
+  journal={Development},
+  volume={151},
+  number={14},
+  year={2024},
+  publisher={The Company of Biologists}
+}
+
+@article{wolny2020accurate,
+  title={Accurate and versatile 3D segmentation of plant tissues at cellular resolution},
+  author={Wolny, Adrian and Cerrone, Lorenzo and Vijayan, Athul and Tofanelli, Rachele and Barro, Amaya Vilches and Louveaux, Marion and Wenzl, Christian and Strauss, S{\"o}ren and Wilson-S{\'a}nchez, David and Lymbouridou, Rena and others},
+  journal={eLife},
+  volume={9},
+  pages={e57613},
+  year={2020},
+  publisher={eLife Sciences Publications Limited}
+}
+```
+
+## PlantSeg Version and Code
+
+See [PlantSeg's website](https://github.com/hci-unihd/plant-seg) for more details. PlantSeg v1.4.3 was used for testing, and PlantSeg v1.6.2 was released for this paper.
diff --git a/docs/chapters/stardist/index.md b/docs/chapters/stardist/index.md
new file mode 100644
index 0000000..b5c67d3
--- /dev/null
+++ b/docs/chapters/stardist/index.md
@@ -0,0 +1,205 @@
+# Run StarDist: A Guide and A Pipeline
+
+_This documentation page is a copy of the [`run-stardist` README.md file](https://github.com/kreshuklab/go-nuclear/blob/main/stardist/README.md)._
+
+![version](https://anaconda.org/qin-yu/run-stardist/badges/version.svg)
+![latest_release_date](https://anaconda.org/qin-yu/run-stardist/badges/latest_release_date.svg)
+![license](https://anaconda.org/qin-yu/run-stardist/badges/license.svg)
+![downloads](https://anaconda.org/qin-yu/run-stardist/badges/downloads.svg)
+
+A complete training and inference pipeline for 3D StarDist with an example on 3D biological (ovules) datasets. Please submit an issue if you encounter errors or if you have any questions or suggestions.
+
+## Models and Data
+
+A 3D nucleus segmentation model is available for download from BioImage.IO and ready to be used directly for segmenting your nuclei. The model is trained on a 3D confocal ovule dataset from _Arabidopsis thaliana_. StarDist v0.8.3 was used for the paper.
+
+### Use Pre-trained Model
+
+Model weights and related files can be found at [DOI 10.5281/zenodo.8421755](https://zenodo.org/doi/10.5281/zenodo.8421755). The programme downloads the model automatically for you to run inference on your images, as long as you specify `generic_plant_nuclei_3D` as the `model_name` in the configuration file.
+
+This is the only 3D StarDist model available on the BioImage Model Zoo at the moment. If you have another model, put its folder in your `PATH_TO_MODEL_DIR` and specify the folder name as `MY_MODEL_NAME` in the configuration file (see below). Then you can run `predict-stardist` to use the model for inference. For more information on inference, see the [Prediction](#prediction) section below.
+
+### Training data statistics and links
+
+The training data is publicly available at [BioImage Archive S-BIAD1026](https://www.ebi.ac.uk/biostudies/BioImages/studies/S-BIAD1026). Some key information about the training data is listed below:
+
+```python
+original_voxel_size = {  # z, y, x
+    1135: [0.28371836501901143, 0.12678642066720086, 0.12678642066720086],  # validation
+    1136: [0.2837183895131086, 0.12756971653115998, 0.12756971653115998],   # training
+    1137: [0.2837183895131086, 0.1266211463645486, 0.1266211463645486],     # training
+    1139: [0.2799036917562724, 0.12674335484590543, 0.12674335484590543],   # training
+    1170: [0.27799632231404964, 0.12698523961670266, 0.12698522349145364],  # training
+}  # [0.2837, 0.1268, 0.1268] is taken as the median
+
+original_median_extents = {  # z, y, x
+    1135: [16, 32, 33],  # validation
+    1136: [16, 32, 32],  # training
+    1137: [16, 32, 32],  # training
+    1139: [16, 32, 33],  # training
+    1170: [16, 29, 30],  # training
+}  # [16, 32, 32] is taken as the median
+```
+
+## Installation
+
+It is recommended to install this package with `mamba` (see below). If you don't have `mamba` installed, you can install it with `conda`. We start the guide by installing Mini-`conda`.
+
+### Install Miniconda
+
+The first step required to use the pipeline is installing Miniconda. If you already have a working Anaconda setup, you can go directly to the next step. Anaconda can be downloaded for all platforms from [here](https://www.anaconda.com/products/distribution). We suggest using Miniconda because it is lighter and installs fewer unnecessary packages.
+
+To download Miniconda, open a terminal and type:
+
+```bash
+wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
+```
+
+Then install by typing:
+
+```bash
+bash ./Miniconda3-latest-Linux-x86_64.sh
+```
+
+and follow the installation instructions. The Miniconda3-latest-Linux-x86_64.sh file can be safely deleted.
+
+### Install `run-stardist` using `mamba`
+
+The first step is to install `mamba`, which is an alternative to `conda`:
+
+```bash
+conda install -c conda-forge mamba
+```
+
+If you have an NVIDIA GPU, install `run-stardist` using:
+
+```bash
+mamba create -n run-stardist -c qin-yu -c conda-forge "python>=3.10" tensorflow stardist wandb "pydantic<2" run-stardist
+```
+
+or, if you don't have an NVIDIA GPU, install `run-stardist` using:
+
+```bash
+mamba create -n run-stardist -c qin-yu -c conda-forge "python>=3.10" tensorflow-cpu stardist wandb "pydantic<2" run-stardist
+```
+
+## Usage
+
+### Example configuration file for both training and inference
+
+The original configuration file used for training the final ResNet StarDist model published on BioImage.IO for wide applicability can be found at `stardist/configs/final_resnet_model_config.yml`; it can be used for both training and inference (note that the inference output is only used for illustration in this repository, because it segments the training data).
+
+The generic template is shown below. A configuration template with more guidelines can be found at `stardist/configs/train_and_infer.yml`.
+
+```yaml
+wandb:  # optional, remove this part if not using W&B
+  project: ovules-instance-segmentation
+  name: final-stardist-model
+
+data:
+  # Rescale outside StarDist
+  rescale_factor: Null
+
+  # Training (ignored in inference config file)
+  training:
+    - PATH_TO_INPUT_DIR_1 or PATH_TO_INPUT_FILE_1
+    - PATH_TO_INPUT_DIR_2 or PATH_TO_INPUT_FILE_2
+  validation:
+    - PATH_TO_INPUT_DIR_3 or PATH_TO_INPUT_FILE_3
+  raw_name: raw/noisy     # only required if HDF5
+  label_name: label/gold  # only required if HDF5
+
+  # Inference (ignored in training config file)
+  prediction:
+    - PATH_TO_INPUT_DIR_4 or PATH_TO_INPUT_FILE_4
+    - PATH_TO_INPUT_DIR_5 or PATH_TO_INPUT_FILE_5
+  format: tiff              # only 'hdf5' or 'tiff'
+  name: raw/nuclei          # dataset name of the raw image in HDF5 files, only required if format is `hdf5`
+  output_dir: MY_OUTPUT_DIR
+  output_dtype: uint16      # `uint8`, `uint16`, or `float32` are recommended
+  resize_to_original: True  # output should be of the same shape as the input
+  target_voxel_size: Null   # the desired voxel size to rescale to during inference, null if rescale factor is set
+  save_probability_map: True
+
+stardist:
+  model_dir: PATH_TO_MODEL_DIR  # set to `null` if model name is `generic_plant_nuclei_3D`
+  model_name: MY_MODEL_NAME     # set to `generic_plant_nuclei_3D` to use the built-in model
+  model_type: StarDist3D
+  model_config:  # model configuration should stay identical for training and inference
+    backbone: resnet
+    n_rays: 96
+    grid: [2, 4, 4]
+    use_gpu: False
+    n_channel_in: 1
+    patch_size: [96, 96, 96]  # multiple of 16 preferred
+    train_batch_size: 8
+    train_n_val_patches: 16
+    steps_per_epoch: 400
+    epochs: 1000
+
+augmenter:
+  name: default
+```
+
+### Training
+
+```shell
+train-stardist --config CONFIG_PATH
+```
+
+where CONFIG_PATH is the path to the YAML configuration file. For example, if you want to train the model with the example configuration file `configs/train_and_infer.yml`:
+
+```shell
+cd ovules-instance-segmentation/stardist/
+CUDA_VISIBLE_DEVICES=0 train-stardist --config configs/train_and_infer.yml
+```
+
+### Prediction
+
+```shell
+predict-stardist --config CONFIG_PATH
+```
+
+where CONFIG_PATH is the path to the YAML configuration file. For example, if you want to use the model with the example configuration file `configs/train_and_infer.yml`:
+
+```shell
+cd ovules-instance-segmentation/stardist/
+CUDA_VISIBLE_DEVICES=0 predict-stardist --config configs/train_and_infer.yml
+```
+
+**Preprocessing:** For the published [StarDist Plant Nuclei 3D ResNet](https://zenodo.org/doi/10.5281/zenodo.8421755), the median size of nuclei in the training data is `[16, 32, 32]`. To achieve the best segmentation results, the input 3D images should be rescaled so that your nucleus size in ZYX matches the training data. For example, if the median nucleus size of your data is `[32, 32, 32]`, then `rescale_factor` should be `[0.5, 1., 1.]`; if it is `[15, 33, 31]`, the data does not have to be rescaled. You may also choose to leave `rescale_factor` as `Null` and rescale your images with Fiji or other tools before running the pipeline. If `resize_to_original` is `True`, the output will have the original size of the input image.
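+
+A small sketch of that rescale-factor calculation, assuming you have measured the median nucleus extents of your own data (the numbers for `my_extents` below are hypothetical):
+
+```python
+train_extents = [16, 32, 32]  # median nucleus size (z, y, x) of the training data
+my_extents = [32, 32, 32]     # hypothetical median nucleus size of your data
+
+# rescale so that your median nucleus size matches the training data
+rescale_factor = [t / m for t, m in zip(train_extents, my_extents)]
+print(rescale_factor)  # [0.5, 1.0, 1.0]
+```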
+
+### Specifying a Graphic Card (GPU)
+
+If you need to specify a graphics card, for example to use the No. 7 card (the eighth), do:
+
+```shell
+CUDA_VISIBLE_DEVICES=7 predict-stardist --config CONFIG_PATH
+```
+
+If you have only one graphics card, use `CUDA_VISIBLE_DEVICES=0` to select the first card (No. 0).
+
+## Cite
+
+If you find the code/models/datasets useful, please cite our paper and StarDist:
+
+```bibtex
+@article{vijayan2024deep,
+  title={A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context},
+  author={Vijayan, Athul and Mody, Tejasvinee Atul and Yu, Qin and Wolny, Adrian and Cerrone, Lorenzo and Strauss, Soeren and Tsiantis, Miltos and Smith, Richard S and Hamprecht, Fred A and Kreshuk, Anna and others},
+  journal={Development},
+  volume={151},
+  number={14},
+  year={2024},
+  publisher={The Company of Biologists}
+}
+
+@inproceedings{weigert2020star,
+  title={Star-convex polyhedra for 3D object detection and segmentation in microscopy},
+  author={Weigert, Martin and Schmidt, Uwe and Haase, Robert and Sugawara, Ko and Myers, Gene},
+  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
+  pages={3666--3673},
+  year={2020}
+}
+```
diff --git a/docs/index.md b/docs/index.md
new file mode 100644
index 0000000..7cecaf1
--- /dev/null
+++ b/docs/index.md
@@ -0,0 +1,136 @@
+# Nuclear Segmentation Pipelines
+
+The GoNuclear repository hosts the code and guides for the pipelines used in the paper [_A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context_](https://doi.org/10.1242/dev.202800). It is structured into four folders:
+
+- **stardist/** contains a 3D StarDist training and inference pipeline, `run-stardist`.
+- **plantseg/** contains configuration files for training and inference with PlantSeg.
+- **cellpose/** contains scripts for training and inference with Cellpose.
+- **evaluation/** contains modules for evaluating the segmentation results.
+
+and are described in this documentation.
+
+## Tools and Workflows
+
+### StarDist
+
+_See [GoNuclear Documentation - `run-stardist`](chapters/stardist/index.md) for more details._
+
+This is one of the most important contributions of this repository. If your nuclei are more or less uniform in shape, please consider using the `run-stardist` pipeline in this repository. It generates separate, rounded instance segmentation masks for your nuclei images.
+
+- The code and tutorial for running StarDist inference are in the `stardist/` folder
+- The pretrained model is automatically downloaded during inference (also available at [BioImage.IO: StarDist Plant Nuclei 3D ResNet](https://bioimage.io/#/?id=10.5281%2Fzenodo.8421755))
+- An example of segmentation results is shown below.
+
+![stardist_raw_and_segmentation](https://zenodo.org/records/8432366/files/stardist_raw_and_segmentation.jpg)
+
+### PlantSeg
+
+_See [GoNuclear Documentation - PlantSeg](chapters/plantseg/index.md) for more details._
+
+If your nuclei have irregular shapes, please consider using the PlantSeg pipeline. It generates instance masks for your nuclei images regardless of nucleus size and shape.
+
+- The code and tutorial for running PlantSeg inference are in the `plantseg/` folder
+- The pretrained model is automatically downloaded during inference (also available at [BioImage.IO: PlantSeg Plant Nuclei 3D UNet](https://bioimage.io/#/?id=10.5281%2Fzenodo.8401064))
+- An example of segmentation results is shown below.
+
+![plantseg_raw_and_gasp_segmentation](https://zenodo.org/records/10070349/files/plantseg_raw_and_gasp_segmentation.jpg)
+
+### Cellpose
+
+_See [GoNuclear Documentation - Cellpose](chapters/cellpose/index.md) for more details._
+
+- The guide for running Cellpose inference and training is in the `cellpose/` folder
+
+## Data and Models
+
+### Training Data and Trained Models
+
+The training data is publicly available at [BioImage Archive S-BIAD1026](https://www.ebi.ac.uk/biostudies/BioImages/studies/S-BIAD1026). I organised the data and models in the following structure:
+
+```bash
+Training data
+├── 2d/
+│   ├── isotropic/
+│   │   ├── gold/
+│   │   └── initial/
+│   └── original/
+│       ├── gold/
+│       └── README.txt
+└── 3d_all_in_one/
+    ├── 1135.h5
+    ├── 1136.h5
+    ├── 1137.h5
+    ├── 1139.h5
+    └── 1170.h5
+
+Models
+├── cellpose/
+│   ├── cyto2_finetune/
+│   │   └── gold/
+│   ├── nuclei_finetune/
+│   │   ├── gold/
+│   │   └── initial/
+│   └── scratch_trained/
+│       └── gold/
+├── plantseg/
+│   └── 3dunet/
+│       ├── gold/
+│       ├── initial/
+│       ├── platinum/
+│       └── train_example.yml
+└── stardist/
+    ├── resnet/
+    │   ├── gold/
+    │   ├── initial/
+    │   └── platinum/
+    ├── train_example.yml
+    └── unet/
+        └── gold/
+```
+
+An example of the raw image:
+
+![raw](https://zenodo.org/records/10070349/files/plantseg_raw.jpg)
+
+Some key information about the training data is listed below:
+
+```python
+original_voxel_size = {  # z, y, x
+    1135: [0.28371836501901143, 0.12678642066720086, 0.12678642066720086],  # validation
+    1136: [0.2837183895131086, 0.12756971653115998, 0.12756971653115998],   # training
+    1137: [0.2837183895131086, 0.1266211463645486, 0.1266211463645486],     # training
+    1139: [0.2799036917562724, 0.12674335484590543, 0.12674335484590543],   # training
+    1170: [0.27799632231404964, 0.12698523961670266, 0.12698522349145364],  # training
+}  # [0.2837, 0.1268, 0.1268] is taken as the median
+
+original_median_extents = {  # z, y, x
+    1135: [16, 32, 33],  # validation
+    1136: [16, 32, 32],  # training
+    1137: [16, 32, 32],  # training
+    1139: [16, 32, 33],  # training
+    1170: [16, 29, 30],  # training
+}  # [16, 32, 32] is taken as the median
+```
+
+**Note for training Cellpose:** The best image form for training StarDist and PlantSeg models is the original form, i.e. the linked dataset is the one that provides the best results. However, to train Cellpose, which only takes 2D training data, the images are prepared as 2D slices of the rescaled isotropic 3D images. The 2D slices include all XY, XZ and YZ slices, ordered randomly by a random prefix in the file name. The 2D slices are saved as TIFF files and are provided along with the 3D images in the same [BioImage Archive S-BIAD1026](https://www.ebi.ac.uk/biostudies/BioImages/studies/S-BIAD1026) repository.
+
+### Preparing Data for Inference
+
+Both HDF5 files and TIFF files can be directly used for both `run-stardist` and `plant-seg` inference. Go to the respective GoNuclear documentation chapters for more details.
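+
+For example, converting between the two containers takes only a few lines. A minimal sketch, assuming `h5py` and `tifffile` are installed; the file names and dataset key below are hypothetical:
+
+```python
+import h5py
+import tifffile
+
+# TIFF -> HDF5 (the dataset key is your choice; the pipelines let you set it in the config)
+volume = tifffile.imread("nuclei.tif")
+with h5py.File("nuclei.h5", "w") as f:
+    f.create_dataset("raw/nuclei", data=volume, compression="gzip")
+
+# HDF5 -> TIFF
+with h5py.File("nuclei.h5", "r") as f:
+    tifffile.imwrite("nuclei_roundtrip.tif", f["raw/nuclei"][:])
+```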
+
+## Cite
+
+If you find this work useful, please cite our paper and the respective tools' papers:
+
+```bibtex
+@article{vijayan2024deep,
+  title={A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context},
+  author={Vijayan, Athul and Mody, Tejasvinee Atul and Yu, Qin and Wolny, Adrian and Cerrone, Lorenzo and Strauss, Soeren and Tsiantis, Miltos and Smith, Richard S and Hamprecht, Fred A and Kreshuk, Anna and others},
+  journal={Development},
+  volume={151},
+  number={14},
+  year={2024},
+  publisher={The Company of Biologists}
+}
+```
diff --git a/evaluation/README.md b/evaluation/README.md
index 3f0b296..656d90b 100644
--- a/evaluation/README.md
+++ b/evaluation/README.md
@@ -1,21 +1,19 @@
 # Evaluation Module
 
-This module contains the code for evaluating the performance of the trained models.
+This module contains the code for evaluating the performance of the trained models. It is an implementation of [the scoring metric of the 2018 Data Science Bowl](https://www.kaggle.com/code/stkbailey/step-by-step-explanation-of-scoring-metric).
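+
+In brief, the metric matches predicted and ground-truth instances at IoU thresholds from 0.50 to 0.95 and averages the resulting precision over the thresholds. A condensed sketch of the idea (not the exact API of this module), assuming dense integer label volumes with background 0:
+
+```python
+import numpy as np
+
+def dsb_score(gt: np.ndarray, pred: np.ndarray) -> float:
+    """Average precision over IoU thresholds 0.50:0.05:0.95, 2018-DSB style."""
+    gt_ids = [i for i in np.unique(gt) if i != 0]
+    pred_ids = [j for j in np.unique(pred) if j != 0]
+    # pairwise IoU between every ground-truth and predicted instance
+    iou = np.zeros((len(gt_ids), len(pred_ids)))
+    for a, i in enumerate(gt_ids):
+        for b, j in enumerate(pred_ids):
+            inter = np.sum((gt == i) & (pred == j))
+            union = np.sum((gt == i) | (pred == j))
+            iou[a, b] = inter / union if union else 0.0
+    precisions = []
+    for t in np.arange(0.5, 0.96, 0.05):
+        matched = iou > t  # IoU > 0.5 makes matches one-to-one automatically
+        tp = int(np.sum(matched.any(axis=1)))                   # matched ground truth
+        fn = len(gt_ids) - tp                                   # missed ground truth
+        fp = len(pred_ids) - int(np.sum(matched.any(axis=0)))   # spurious predictions
+        precisions.append(tp / (tp + fp + fn))
+    return float(np.mean(precisions))
+```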
 
 ## Cite
 
 If you find this work useful, please cite our paper:
 
 ```bibtex
-@article {Vijayan2024.02.19.580954,
-    author = {Athul Vijayan and Tejasvinee Atul Mody and Qin Yu and Adrian Wolny and Lorenzo Cerrone and Soeren Strauss and Miltos Tsiantis and Richard S. Smith and Fred Hamprecht and Anna Kreshuk and Kay Schneitz},
-    title = {A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context},
-    elocation-id = {2024.02.19.580954},
-    year = {2024},
-    doi = {10.1101/2024.02.19.580954},
-    publisher = {Cold Spring Harbor Laboratory},
-    URL = {https://www.biorxiv.org/content/early/2024/02/21/2024.02.19.580954},
-    eprint = {https://www.biorxiv.org/content/early/2024/02/21/2024.02.19.580954.full.pdf},
-    journal = {bioRxiv}
+@article{vijayan2024deep,
+  title={A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context},
+  author={Vijayan, Athul and Mody, Tejasvinee Atul and Yu, Qin and Wolny, Adrian and Cerrone, Lorenzo and Strauss, Soeren and Tsiantis, Miltos and Smith, Richard S and Hamprecht, Fred A and Kreshuk, Anna and others},
+  journal={Development},
+  volume={151},
+  number={14},
+  year={2024},
+  publisher={The Company of Biologists}
 }
 ```
diff --git a/mkdocs.yml b/mkdocs.yml
new file mode 100644
index 0000000..3d094c2
--- /dev/null
+++ b/mkdocs.yml
@@ -0,0 +1,105 @@
+site_name: GoNuclear
+site_url: https://kreshuklab.github.io/go-nuclear/
+site_description: Nuclear Segmentation Guides and Pipelines
+repo_name: kreshuklab/go-nuclear
+repo_url: https://github.com/kreshuklab/go-nuclear
+edit_uri: edit/main/docs/
+copyright: Copyright © 2023 - 2025 Qin Yu
+
+theme:
+  name: material
+  icon:
+    repo: fontawesome/brands/github
+  palette:
+    # Palette toggle for light mode
+    - scheme: default
+      primary: green
+      toggle:
+        icon: material/brightness-7
+        name: Switch to dark mode
+
+    # Palette toggle for dark mode
+    - scheme: slate
+      # primary: teal
+      accent: light-green
+      toggle:
+        icon: material/brightness-4
+        name: Switch to light mode
+  features:
+    - content.tooltips
+    - content.code.annotate
+    - navigation.instant
+    - navigation.instant.progress
+    - navigation.sections
+    - navigation.path
+    - navigation.indexes
+    - navigation.footer
+    - toc.follow
+    - search.suggest
+    - search.share
+
+extra:
+  social:
+    - icon: fontawesome/brands/github
+      link: https://github.com/kreshuklab/go-nuclear
+      name: GoNuclear on GitHub
+
+markdown_extensions:
+  - abbr
+  - attr_list
+  - md_in_html
+  - admonition
+  - pymdownx.extra
+  - pymdownx.details
+  - pymdownx.highlight
+  - pymdownx.superfences
+  - pymdownx.tabbed:
+      alternate_style: true
+  - pymdownx.emoji:
+      emoji_index: !!python/name:material.extensions.emoji.twemoji
+      emoji_generator: !!python/name:material.extensions.emoji.to_svg
+  - pymdownx.snippets:
+      base_path: docs/snippets
+      check_paths: true
+
+plugins:
+  - search
+  - autorefs
+  - markdown-exec
+  - mkdocstrings:
+      handlers:
+        python:
+          import:
+            - https://docs.python.org/3/objects.inv
+            - https://numpy.org/doc/stable/objects.inv
+          options:
+            heading_level: 3
+            docstring_style: google
+            show_source: true
+            show_signature_annotations: true
+            show_root_heading: true
+            show_root_full_path: true
+            show_bases: true
+            docstring_section_style: list
+  - git-revision-date-localized:
+      enable_creation_date: true
+  - git-committers:
+      repository: kreshuklab/go-nuclear
+      branch: main
+
+nav:
+  - Overview:
+      - index.md
+
+  - PlantSeg:
+      - chapters/plantseg/index.md
+
+  - StarDist:
+      - chapters/stardist/index.md
+
+  - Cellpose:
+      - chapters/cellpose/index.md
+
+  - Evaluation:
+      - chapters/evaluation/index.md
diff --git a/plantseg/README.md b/plantseg/README.md
index 9548e3e..58001db 100644
--- a/plantseg/README.md
+++ b/plantseg/README.md
@@ -1,14 +1,14 @@
 # Run PlantSeg: A Guide
 
-- [Installation](#installation)
-  - [Install Miniconda](#install-miniconda)
-  - [Install `plant-seg` using `mamba`](#install-plant-seg-using-mamba)
-- [Inference](#inference)
-  - [Example configuration file for both training and inference](#example-configuration-file-for-both-training-and-inference)
-  - [Prediction](#prediction)
-  - [Specifying a Graphic Card (GPU)](#specifying-a-graphic-card-gpu)
-- [Cite](#cite)
-- [PlantSeg Version and Code](#plantseg-version-and-code)
+* [Installation](#installation)
+  * [Install Miniconda](#install-miniconda)
+  * [Install `plant-seg` using `mamba`](#install-plant-seg-using-mamba)
+* [Inference](#inference)
+  * [Example configuration file for both training and inference](#example-configuration-file-for-both-training-and-inference)
+  * [Prediction](#prediction)
+  * [Specifying a Graphic Card (GPU)](#specifying-a-graphic-card-gpu)
+* [Cite](#cite)
+* [PlantSeg Version and Code](#plantseg-version-and-code)
 
 ## Installation
 
@@ -60,12 +60,12 @@
 
 An example config file for segmentation can be found at `plantseg/configs/config_pred_wide_applicability.yaml`. To modify it and use it for your own data, you need to change the `path` parameter:
 
-- `path`: path to the folder containing the images to be segmented or to the image to be segmented
+* `path`: path to the folder containing the images to be segmented, or to the single image to be segmented
 
 You may also need to change these parameters:
 
-- `preprocessing:factor`: a rescale factor to match the nucleus size of your data to the training data, not necessary but may help in specific cases
-- `cnn_prediction:patch`: patch size should be smaller than the dimension of your image, and smaller than the GPU memory
+* `preprocessing:factor`: a rescale factor to match the nucleus size of your data to the training data; not necessary, but it may help in specific cases
+* `cnn_prediction:patch`: the patch size should be smaller than the dimensions of your image and small enough to fit into GPU memory
 
 The full configuration file is shown below:
 
@@ -107,7 +107,7 @@
   key: Null
   # channel to use if input image has shape CZYX or CYX, otherwise set to null; null is recommended if the previous step has state True
   channel: Null
-  # Trained model name, more info on available models and custom models in the README
+  # Trained model name, more info on available models and custom models in the PlantSeg documentation
   model_name: 'PlantSeg_3Dnuc_platinum'
   # If a CUDA-capable GPU is available and correctly set up, use "cuda"; if not, you can use "cpu" for CPU-only inference (slower)
   device: 'cuda'
@@ -212,16 +212,14 @@
 
 If you find this work useful, please cite both papers:
 
 ```bibtex
-@article {Vijayan2024.02.19.580954,
-    author = {Athul Vijayan and Tejasvinee Atul Mody and Qin Yu and Adrian Wolny and Lorenzo Cerrone and Soeren Strauss and Miltos Tsiantis and Richard S. Smith and Fred Hamprecht and Anna Kreshuk and Kay Schneitz},
-    title = {A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context},
-    elocation-id = {2024.02.19.580954},
-    year = {2024},
-    doi = {10.1101/2024.02.19.580954},
-    publisher = {Cold Spring Harbor Laboratory},
-    URL = {https://www.biorxiv.org/content/early/2024/02/21/2024.02.19.580954},
-    eprint = {https://www.biorxiv.org/content/early/2024/02/21/2024.02.19.580954.full.pdf},
-    journal = {bioRxiv}
+@article{vijayan2024deep,
+  title={A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context},
+  author={Vijayan, Athul and Mody, Tejasvinee Atul and Yu, Qin and Wolny, Adrian and Cerrone, Lorenzo and Strauss, Soeren and Tsiantis, Miltos and Smith, Richard S and Hamprecht, Fred A and Kreshuk, Anna and others},
+  journal={Development},
+  volume={151},
+  number={14},
+  year={2024},
+  publisher={The Company of Biologists}
 }
 
 @article{wolny2020accurate,
diff --git a/stardist/README.md b/stardist/README.md
index 924a993..80bde7e 100644
--- a/stardist/README.md
+++ b/stardist/README.md
@@ -7,51 +7,32 @@
 
 A complete training and inference pipeline for 3D StarDist with an example on 3D biological (ovules) datasets. Please submit an issue if you encounter errors or if you have any questions or suggestions.
 
-- [Models and Data](#models-and-data)
-  - [Cite](#cite)
-  - [Use Pre-trained Model](#use-pre-trained-model)
-  - [Training data statistics and links](#training-data-statistics-and-links)
-- [Installation](#installation)
-  - [Install Miniconda](#install-miniconda)
-  - [Install `run-stardist` using `mamba`](#install-run-stardist-using-mamba)
-- [Usage](#usage)
-  - [Example configuration file for both training and inference](#example-configuration-file-for-both-training-and-inference)
-  - [Training](#training)
-  - [Prediction](#prediction)
-  - [Specifying a Graphic Card (GPU)](#specifying-a-graphic-card-gpu)
-
+* [Models and Data](#models-and-data)
+  * [Use Pre-trained Model](#use-pre-trained-model)
+  * [Training data statistics and links](#training-data-statistics-and-links)
+* [Installation](#installation)
+  * [Install Miniconda](#install-miniconda)
+  * [Install `run-stardist` using `mamba`](#install-run-stardist-using-mamba)
+* [Usage](#usage)
+  * [Example configuration file for both training and inference](#example-configuration-file-for-both-training-and-inference)
+  * [Training](#training)
+  * [Prediction](#prediction)
+  * [Specifying a Graphic Card (GPU)](#specifying-a-graphic-card-gpu)
+* [Cite](#cite)
 
 ## Models and Data
 
 A 3D nucleus segmentation model is available for download from BioImage.IO and ready to be used directly for segmenting your nuclei. The model is trained on a 3D confocal ovule dataset from *Arabidopsis thaliana*. StarDist v0.8.3 was used for the paper.
 
-### Cite
-
-If you find the code/models/datasets useful, please cite our paper:
-
-```bibtex
-@article {Vijayan2024.02.19.580954,
-    author = {Athul Vijayan and Tejasvinee Atul Mody and Qin Yu and Adrian Wolny and Lorenzo Cerrone and Soeren Strauss and Miltos Tsiantis and Richard S. Smith and Fred Hamprecht and Anna Kreshuk and Kay Schneitz},
-    title = {A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context},
-    elocation-id = {2024.02.19.580954},
-    year = {2024},
-    doi = {10.1101/2024.02.19.580954},
-    publisher = {Cold Spring Harbor Laboratory},
-    URL = {https://www.biorxiv.org/content/early/2024/02/21/2024.02.19.580954},
-    eprint = {https://www.biorxiv.org/content/early/2024/02/21/2024.02.19.580954.full.pdf},
-    journal = {bioRxiv}
-}
-```
-
 ### Use Pre-trained Model
 
-Model weights and related files can be found at: https://zenodo.org/doi/10.5281/zenodo.8421755. The programme downloads the model automatically for you to make inference on your images as long as you specify `generic_plant_nuclei_3D` as the `model_name` in the configuration file.
+Model weights and related files can be found at [DOI 10.5281/zenodo.8421755](https://zenodo.org/doi/10.5281/zenodo.8421755). The programme downloads the model automatically for you to run inference on your images, as long as you specify `generic_plant_nuclei_3D` as the `model_name` in the configuration file.
 
 This is the only 3D StarDist model available on the BioImage Model Zoo at the moment. If you have another model, put its folder in your `PATH_TO_MODEL_DIR` and specify the folder name as `MY_MODEL_NAME` in the configuration file (see below). Then you can run `predict-stardist` to use the model for inference. For more information on inference, see the [Prediction](#prediction) section below.
 
 ### Training data statistics and links
 
-The training data is publicly available on Zenodo at `[TODO](to be published after paper submission)`. Some key information about the training data is listed below:
+The training data is publicly available at [BioImage Archive S-BIAD1026](https://www.ebi.ac.uk/biostudies/BioImages/studies/S-BIAD1026). Some key information about the training data is listed below:
 
 ```python
 original_voxel_size = {  # z, y, x
@@ -91,6 +72,7 @@
 Then install by typing:
 
 ```bash
 bash ./Miniconda3-latest-Linux-x86_64.sh
 ```
+
 and follow the installation instructions. The Miniconda3-latest-Linux-x86_64.sh file can be safely deleted.
 
 ### Install `run-stardist` using `mamba`
@@ -208,3 +190,27 @@
 CUDA_VISIBLE_DEVICES=7 predict-stardist --config CONFIG_PATH
 ```
 
 If you have only one graphics card, use `CUDA_VISIBLE_DEVICES=0` to select the first card (No. 0).
+
+## Cite
+
+If you find the code/models/datasets useful, please cite our paper and StarDist:
+
+```bibtex
+@article{vijayan2024deep,
+  title={A deep learning-based toolkit for 3D nuclei segmentation and quantitative analysis in cellular and tissue context},
+  author={Vijayan, Athul and Mody, Tejasvinee Atul and Yu, Qin and Wolny, Adrian and Cerrone, Lorenzo and Strauss, Soeren and Tsiantis, Miltos and Smith, Richard S and Hamprecht, Fred A and Kreshuk, Anna and others},
+  journal={Development},
+  volume={151},
+  number={14},
+  year={2024},
+  publisher={The Company of Biologists}
+}
+
+@inproceedings{weigert2020star,
+  title={Star-convex polyhedra for 3D object detection and segmentation in microscopy},
+  author={Weigert, Martin and Schmidt, Uwe and Haase, Robert and Sugawara, Ko and Myers, Gene},
+  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
+  pages={3666--3673},
+  year={2020}
+}
+```