add download.sh
Former-commit-id: 342c17b
chenkenbio committed Jun 2, 2023
1 parent ef92d9b commit 7fde40d
Showing 2 changed files with 26 additions and 7 deletions.
10 changes: 3 additions & 7 deletions README.md
@@ -22,11 +22,7 @@ See [official guide](https://huggingface.co/docs/transformers/model_doc/bert) fo

**Download SpliceBERT**

-- [SpliceBERT.1024nt.tar.gz](https://github.com/biomed-AI/SpliceBERT/releases/download/v0.1/SpliceBERT.1024nt.tar.gz)
-- [SpliceBERT.510nt.tar.gz](https://github.com/biomed-AI/SpliceBERT/releases/download/v0.1/SpliceBERT.510nt.tar.gz)
-- [SpliceBERT-human.510nt.tar.gz](https://github.com/biomed-AI/SpliceBERT/releases/download/v0.1/SpliceBERT-human.510nt.tar.gz)
-
-The model weights are also available at [zenodo](https://doi.org/10.5281/zenodo.7995778).
+The weights of SpliceBERT can be downloaded from [zenodo](https://doi.org/10.5281/zenodo.7995778): https://zenodo.org/record/7995778/files/models.tar.gz?download=1

**System requirements**

@@ -69,8 +65,8 @@ model = AutoModelForSequenceClassification.from_pretrained(SPLICEBERT_PATH, num_

## Reproduce the analysis in manuscript

-Before running the codes, run `bash setup.sh` in the `./examples` folder to compile the codes written in cython (`cython` is required).
-Then, run `bash download.sh` to fetch the data used in the analysis.
+Before running the codes, run `bash download.sh` to fetch the data used in the analysis.
+Then, run `bash setup.sh` in the `./examples` folder to compile the codes written in cython (`cython` is required).

The codes for analyzing SpliceBERT are available in [examples](./examples):
- [evolutionary conservation analysis](./examples/00-conservation) (related to Figure 1)
23 changes: 23 additions & 0 deletions download.sh
@@ -0,0 +1,23 @@
#!/bin/bash


echo "Downloading the data to ./examples/ ..."
wget -c -O ./examples/data.tar.gz "https://zenodo.org/record/7995778/files/data.tar.gz?download=1" && cd examples && tar -xzvf data.tar.gz && cd .. && echo "Done"

echo "Downloading the model weights ..."
wget -c -O models.tar.gz "https://zenodo.org/record/7995778/files/models.tar.gz?download=1" && tar -xzvf models.tar.gz && echo "Done"


## check dnabert
echo "Preparing the DNABERT weights ..."
test -d ./models/dnabert || mkdir -p ./models/dnabert
cd ./models/dnabert
for k in 3 4 5 6; do
    if [ ! -e "${k}-new-12w-0" ]; then
        if [ -e "${k}-new-12w-0.zip" ]; then
            unzip "${k}-new-12w-0.zip" && echo "unzip: ${k}-new-12w-0.zip -> ${k}-new-12w-0"
        else
            echo "NOTE: Users should manually download the weights of DNABERT${k} from https://github.com/jerryji1993/DNABERT and decompress it to ./models/dnabert/"
        fi
    fi
done
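
The DNABERT check at the end of `download.sh` prints one note per missing model but does not say afterwards which models are still absent. A minimal sketch of a read-only check follows. The function name `missing_dnabert` is hypothetical and not part of this commit. It assumes the same directory layout the script uses (`./models/dnabert/${k}-new-12w-0`, optionally as a `.zip` archive), and it neither downloads nor unzips anything.

```shell
#!/bin/bash
# Hypothetical helper (not part of the commit): list the DNABERT k-mer
# models that are still missing under a given directory.
missing_dnabert() {
    local dir="${1:-./models/dnabert}"
    local k name
    for k in 3 4 5 6; do
        name="${k}-new-12w-0"
        # A model counts as available if its directory or its zip archive exists.
        if [ ! -e "${dir}/${name}" ] && [ ! -e "${dir}/${name}.zip" ]; then
            echo "${name}"
        fi
    done
}

# Example: report what is left to download by hand.
missing="$(missing_dnabert ./models/dnabert | tr '\n' ' ')"
if [ -n "${missing}" ]; then
    echo "Missing DNABERT weights: ${missing}"
else
    echo "All DNABERT weights are present"
fi
```

Because the function only tests for files and never changes directory, it can be run before or after `download.sh` without side effects.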
