
In this competition, I create an algorithm to identify metastatic cancer in small image patches taken from larger digital pathology scans. You may view and download the official Pcam dataset from GitHub https://github.com/basveeling/pcam. The data is provided under the CC0 License, following the license of Camelyon16.


mitch-henderson/MetastaticCancerDetection-PcamAnalysis


DenseNet Convolutional Neural Network for Metastatic Cancer Identification in Images

RandomForest


Histopathologic Cancer Detection

This project detects metastatic cancer in small image patches taken from larger digital pathology scans. The dataset comes from Kaggle's Histopathologic Cancer Detection competition, which is built on the Pcam (PatchCamelyon) benchmark.

Data

The dataset contains 220,025 RGB histopathology patches of 96x96 pixels. Roughly 130,000 images are labeled negative (no cancer) and roughly 90,000 positive (contains cancer). The data is split into training and test sets.
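The class balance described above can be checked by tallying the labels file. The following is a minimal sketch that assumes a CSV with id and label columns (the layout of the Kaggle train_labels.csv); the function name count_labels is hypothetical and not part of this repository.

```python
import csv
import io
from collections import Counter


def count_labels(csv_file):
    """Count how many patches carry each label (0 = negative, 1 = positive)."""
    reader = csv.DictReader(csv_file)
    return Counter(int(row["label"]) for row in reader)


# Tiny in-memory stand-in for the real labels file (assumed columns: id, label).
sample = io.StringIO("id,label\na1,0\nb2,1\nc3,0\nd4,0\n")
counts = count_labels(sample)
total = sum(counts.values())
positive_share = counts[1] / total
print(counts[0], counts[1], f"{positive_share:.2f}")
```

With the real data, the in-memory sample would be replaced by a file handle such as open("train_labels.csv", newline="").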

Algorithm Toolset

The main tools and techniques used in this project:

Model

The base model is a DenseNet169 CNN pretrained on ImageNet. It is trained for 1 epoch on the training set with a learning rate of 0.01302280556410551 and a weight decay of 0.01. Data augmentation (rotations, flips, crops, and similar transforms) is used to expand the training set.
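The training setup above could be sketched with fastai roughly as follows. This is not the repository's train.py: it assumes the fastai v2 API, a train/ folder of .tif patches, a labels CSV with id and label columns, an 80/20 validation split, a batch size of 64, and illustrative augmentation settings. Only the architecture, epoch count, learning rate, and weight decay come from the description above.

```python
from pathlib import Path

# Hyperparameters quoted in the model description.
EPOCHS = 1
LEARNING_RATE = 0.01302280556410551
WEIGHT_DECAY = 0.01


def train(data_dir, labels_csv="train_labels.csv", out_path="model.pth"):
    """Fine-tune an ImageNet-pretrained DenseNet169 and save its weights."""
    # Imported lazily so the constants above can be read without the deep-learning stack.
    import pandas as pd
    import torch
    from fastai.vision.all import (
        ImageDataLoaders, RocAucBinary, accuracy, aug_transforms,
        densenet169, vision_learner,
    )

    data_dir = Path(data_dir)
    df = pd.read_csv(data_dir / labels_csv)  # assumed columns: id, label

    dls = ImageDataLoaders.from_df(
        df, path=data_dir, folder="train", suff=".tif",  # assumed file layout
        fn_col="id", label_col="label",
        valid_pct=0.2, seed=42,  # assumed 80/20 split
        # Assumed augmentation settings: flips in both directions plus rotations.
        batch_tfms=aug_transforms(flip_vert=True, max_rotate=45.0),
        bs=64,  # assumed batch size
    )
    learn = vision_learner(dls, densenet169, metrics=[accuracy, RocAucBinary()])
    learn.fit_one_cycle(EPOCHS, lr_max=LEARNING_RATE, wd=WEIGHT_DECAY)
    torch.save(learn.model.state_dict(), out_path)
    return learn
```

Calling train("path/to/data") would run the single epoch and write the weights to model.pth; the path is a placeholder.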

The final model achieves:

Accuracy: 92%
ROC AUC: 0.99

on the validation set.
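The two metrics measure different things: accuracy depends on a decision threshold, while ROC AUC only depends on how well positives are ranked above negatives, which is why AUC can sit well above accuracy. The dependency-free sketch below illustrates both on toy data; the function names and the 0.5 threshold are assumptions, and the numbers are not the project's results.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions that fall on the correct side of the threshold."""
    hits = sum((p >= threshold) == bool(t) for t, p in zip(y_true, y_prob))
    return hits / len(y_true)


def roc_auc(y_true, y_prob):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy scores: every positive outranks every negative, but one positive falls below 0.5.
labels = [0, 0, 0, 1, 1, 1]
scores = [0.05, 0.10, 0.30, 0.40, 0.80, 0.95]
acc = accuracy(labels, scores)
auc = roc_auc(labels, scores)
print(f"accuracy={acc:.3f} auc={auc:.3f}")
```

The pairwise loop is quadratic in the number of samples, so for a full validation set a library routine such as scikit-learn's roc_auc_score is the practical choice.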

Usage

The train.py script trains the model on the training set and saves it to model.pth.

The predict.py script loads the trained model and makes predictions on the test set. It saves the predictions to submission.csv in the format accepted for the Kaggle competition.
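As a rough illustration of the submission step, the sketch below writes predictions as id,label rows. The header is assumed to match the Kaggle Histopathologic Cancer Detection format, and write_submission is a hypothetical helper, not a function from predict.py. Whether the script writes probabilities or hard 0/1 labels is not stated; probabilities are assumed here.

```python
import csv
import io


def write_submission(ids, probabilities, out_file):
    """Write a header plus one `id,label` row per test patch."""
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["id", "label"])
    for patch_id, prob in zip(ids, probabilities):
        writer.writerow([patch_id, f"{prob:.6f}"])


# In-memory demonstration with made-up patch ids and scores.
buffer = io.StringIO()
write_submission(["patch_a", "patch_b"], [0.91, 0.02], buffer)
submission_text = buffer.getvalue()
print(submission_text)
```

To produce the actual file, the buffer would be replaced by open("submission.csv", "w", newline="").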

Installation

The code requires Python 3 and the following libraries:

matplotlib
opencv-python
pandas
scikit-learn
fastai
torchvision

The dependencies can be installed using:

pip install -r requirements.txt

References

The implementation is based on the following tutorial:

https://www.kaggle.com/code/awaisrauf/histopathologic-cancer-detection-with-fastai

License

The Pcam dataset is provided under the CC0 License, following the license of Camelyon16.

Questions/Comments:

Contact me on GitHub!
