---
output: github_document
---
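```{r, echo = FALSE}
# Read the package DESCRIPTION to fill in the version badges below
description <- readLines("DESCRIPTION")
# Minimal R version, taken from the "R (>= x.y.z)" entry
rvers <- stringr::str_match(grep("R \\(", description, value = TRUE), "[0-9]{1,4}\\.[0-9]{1,4}\\.[0-9]{1,4}")[1,1]
# Package version, with the field name and spaces stripped
version <- gsub(" ", "", gsub("Version:", "", grep("Version:", description, value = TRUE)))
```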
<!-- badges: start -->
[![lifecycle](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://tidyverse.org/lifecycle/#experimental)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/stackr)](http://cran.r-project.org/package=stackr)
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)
[![DOI](https://zenodo.org/badge/14548/thierrygosselin/stackr.svg)](https://zenodo.org/badge/latestdoi/14548/thierrygosselin/stackr)
[![packageversion](https://img.shields.io/badge/Package%20version-`r version`-orange.svg)](commits/master)
[![Last-changedate](https://img.shields.io/badge/last%20change-`r gsub('-', '--', Sys.Date())`-brightgreen.svg)](/commits/master)
[![minimal R version](https://img.shields.io/badge/R%3E%3D-`r rvers`-6666ff.svg)](https://cran.r-project.org/)
<!-- badges: end -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-"
)
```
# stackr: an R package to run the stacks software pipeline
This is the development page of **stackr**.
**What's the difference from running stacks directly in the terminal?**
Apart from running stacks within R, not much. A few small differences
speed up my RADseq workflow:
* The philosophy of working by project with pre-organized folders.
* Some important steps are **parallelized**.
* Have more than one sequencing chip/lane? This workflow will save you lots of time.
* **Technical replicates**, within or across chips/lanes, are managed uniquely.
* Noise reduction.
* Data normalization.
* **Nightmares because of a crashed computer/cluster/server?** stackr manages
the stacks unique integers (previously called SQL IDs) throughout the pipeline. This is
integrated from the start, making it a breeze to restart your pipeline after a crash!
* **Mismatch testing:** a *de novo* mismatch threshold series is integrated
inside `run_ustacks`, and stackr produces tables and figures automatically.
* **Catalog:** for projects with larger sample sizes, breaking down the catalog into
several separate *cstacks* steps makes the pipeline more robust if your
computer/cluster/server crashes.
* **Logs** generated by stacks are read and transferred into human-readable tables/tibbles,
which makes problems easier to detect.
* **Summaries** of the different stacks modules are generated automatically inside the stackr
pipeline, and are also available to users who didn't use stackr to run stacks.
* For me, all this = increased reproducibility.
**Who's it for?**
* It's currently developed with my own projects in mind.
* To help collaborators get the most out of stacks.
It's not for R or stacks beginners. stacks-related issues should be raised
on the [stacks Google group](https://groups.google.com/forum/?fromgroups#!forum/stacks-users).
## Installation
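To try out the development version of **stackr**, copy/paste the code below:
```r
# Install devtools only if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
# Install the development version of stackr from GitHub, then load it
devtools::install_github("thierrygosselin/stackr")
library(stackr)
```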
## Citation
To get the citation, inside R:
```r
citation("stackr")
```
Website with additional information: [http://thierrygosselin.github.io/stackr/](http://thierrygosselin.github.io/stackr/)
* [Computer setup and troubleshooting](https://thierrygosselin.github.io/radiator/articles/rad_genomics_computer_setup.html)
* [Vignettes](https://thierrygosselin.github.io/radiator/articles/index.html)
## Life cycle
stackr is maturing, but changes are inevitable in order to make the package
better. Argument names are very stable and follow stacks development closely.
* Philosophy, major changes and deprecated functions/arguments are documented in
the life cycle section of each function's documentation.
* The latest changes are documented in the [changelog, versions, new features and bug history](http://thierrygosselin.github.io/stackr/news/index.html)
* [issues](https://github.com/thierrygosselin/stackr/issues/new/choose) and [contributions](https://github.com/thierrygosselin/stackr/issues/new/choose)
## Stacks modules and RADseq typical workflow
The **stackr** package provides wrapper functions to run
[STACKS](http://catchenlab.life.illinois.edu/stacks/) *process_radtags*,
*ustacks*, *cstacks*, *sstacks*, *rxstacks* and *populations* inside R.
The flow chart below shows the stacks modules and the corresponding
stackr functions.
![](vignettes/stackr_workflow.png)