Compute Resources

Cyriac Kandoth edited this page Jul 29, 2015 · 1 revision

Clusters

saba2.cbio.mskcc.org is used by Computational Biology (cBio) and Clinical Bioinformatics. It uses GridEngine for job scheduling.

hal.cbio.mskcc.org is used by several labs at Computational Biology (cBio). It uses PBS/Torque for job scheduling.

luna.mskcc.org is the official CMO cluster. It uses Platform LSF for job scheduling. More details below.

The Luna Cluster

Reference: http://aji.cbio.mskcc.org/bic-hpc/luna/

Nodes:

The head node luna, or compute node s01, can be used for submitting jobs. Both have internet access.
Do not run work on the head node! If you prefer to work interactively at the command line, start a session on a compute node with bsub -Is bash.
24 s compute nodes - 32 cores - 384 GB RAM
26 u compute nodes - 32 cores - 256 GB RAM
2 t compute nodes - 64 cores - 1.5 TB RAM

LSF commands:

bsub -Is bash - start an interactive session on one of the nodes
bsub -J NAME - name your job so the admins can see friendlier reports
bsub -w $jobid - run your job only after another specific job finishes successfully (LSF treats a bare job ID as the condition done($jobid))
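
The -J and -w flags above can be combined to chain jobs. What follows is a minimal sketch, not an official workflow. It assumes bsub prints its usual "Job <ID> is submitted to queue <NAME>." line on submission. The helper names parse_job_id, submit_named and submit_after are hypothetical, as are the job names and scripts in the usage comment.

```shell
#!/usr/bin/env bash
# Extract the numeric job ID from the line bsub prints on submission,
# e.g. "Job <12345> is submitted to queue <normal>."
parse_job_id() {
    sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p'
}

# Submit a named job and print its job ID.
# Usage: submit_named NAME COMMAND [ARGS...]
submit_named() {
    local name=$1
    shift
    bsub -J "$name" "$@" | parse_job_id
}

# Submit a named job that starts only after job JOBID completes successfully.
# done(JOBID) is the explicit form of the bare job ID dependency.
# Usage: submit_after NAME JOBID COMMAND [ARGS...]
submit_after() {
    local name=$1 dep=$2
    shift 2
    bsub -J "$name" -w "done($dep)" "$@" | parse_job_id
}

# Usage (align.sh and call_variants.sh are placeholder scripts):
#   align_id=$(submit_named align ./align.sh)
#   submit_after callvars "$align_id" ./call_variants.sh
```

Capturing the job ID at submission time avoids looking it up later with bjobs. If the first job exits with an error, the done() condition is never met and the dependent job does not run.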

You'll find plenty of help online. Here is a nice cheat sheet.

Storage:

SOL ISI (Isilon array) - 1.5 to 2 PB
Each node has 800GB at /scratch for intermediate file storage
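
One way to use the node-local /scratch space for intermediate files is to give each job its own directory there and remove it when the job ends. The sketch below assumes /scratch is writable by your user. SCRATCH_ROOT, make_scratch_dir and run_in_scratch are hypothetical names introduced for illustration, not part of the cluster setup.

```shell
#!/usr/bin/env bash
# Root of node-local scratch space. /scratch comes from the list above;
# SCRATCH_ROOT is a hypothetical override for trying the sketch elsewhere.
SCRATCH_ROOT=${SCRATCH_ROOT:-/scratch}

# Create a private working directory under the scratch root and print its path.
make_scratch_dir() {
    mktemp -d "$SCRATCH_ROOT/${USER:-job}.XXXXXX"
}

# Run a command inside a fresh scratch directory, then remove the directory
# whether or not the command succeeded. Returns the command's exit status.
# Usage: run_in_scratch COMMAND [ARGS...]
run_in_scratch() {
    local dir status
    dir=$(make_scratch_dir) || return 1
    ( cd "$dir" && "$@" )
    status=$?
    rm -rf "$dir"
    return "$status"
}
```

Because the directory is deleted when the command returns, the command itself must copy anything worth keeping to shared storage before it exits.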

File System:

/home - 100GB limit - for scripts only, no huge files, frequent mirrored backup
/ifs/work - Fast disk, less space - for ongoing projects, ~10TB per lab
/ifs/res - Slow disk, more space - for long-term storage of sequence data
/ifs/archive - read only - GCL fastqs
/opt/common - binaries and popular third-party programs
/common/data - data, genome assemblies, GTFs, etc.
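
To stay under a size limit such as the 100GB one on /home, a directory's usage can be checked with du. This is a rough sketch: it assumes 1 GB = 1024 * 1024 KB, and the function names usage_kb, kb_exceeds_gb and check_limit are hypothetical. The limit actually enforced is whatever the administrators configure, which this script does not query.

```shell
#!/usr/bin/env bash
# Print the disk usage of a directory in kilobytes (1 KB = 1024 bytes).
usage_kb() {
    du -sk "$1" | cut -f1
}

# Succeed if USED_KB is larger than LIMIT_GB, assuming 1 GB = 1024 * 1024 KB.
# Usage: kb_exceeds_gb USED_KB LIMIT_GB
kb_exceeds_gb() {
    [ "$1" -gt $(( $2 * 1024 * 1024 )) ]
}

# Report whether a directory is over a size limit given in GB.
# Returns 0 if within the limit, 1 if over, 2 if usage could not be measured.
# Usage: check_limit DIR LIMIT_GB
check_limit() {
    local used
    used=$(usage_kb "$1")
    [ -n "$used" ] || { echo "ERROR: could not measure $1" >&2; return 2; }
    if kb_exceeds_gb "$used" "$2"; then
        echo "WARNING: $1 uses ${used} KB, over the ${2} GB limit"
        return 1
    fi
    echo "OK: $1 uses ${used} KB, within the ${2} GB limit"
}

# Usage:
#   check_limit "$HOME" 100
```

du walks the whole directory tree, so on a large home directory this can take a while and is better run inside a job than on the head node.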