Commit

Pat's edits and moved the references.
Geoffrey Hannigan committed Oct 26, 2017
1 parent 1d60bea commit 853923e
Showing 2 changed files with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions doc/manuscript.md
@@ -69,10 +69,10 @@ Our study cohort consisted of 90 human subjects, 30 of whom had healthy colons,
Each extraction was performed with a blank buffer control to detect contaminants from reagents or other unintentional sources. Only one of the nine controls contained detectable DNA at a minimal concentration of 0.011 ng/µl, thus providing evidence of the enrichment and purification of VLP genomic DNA over potential contaminants **(Figure \ref{qualcontrol} A)**. As expected, these controls yielded few sequences and were almost entirely removed while rarefying the datasets to a common number of sequences **(Figure \ref{qualcontrol} B)**. The high quality phage and bacterial sequences were assembled into highly covered contigs longer than 1 kb **(Figure \ref{contigqc})**. Because contigs represent genome fragments, we further clustered related bacterial contigs into operational genomic units (OGUs) and viral contigs into operational viral units (OVUs) **(Figure \ref{contigqc} - \ref{clustercontigqc})** to approximate organismal units.

## Unaltered Diversity in Colorectal Cancer
- Microbiome and disease associations are often described as being of an altered diversity (i.e., "dysbiotic"). Therefore, we first evaluated the influence of colorectal cancer on virome OVU diversity. We evaluated differences in communities between disease states using the Shannon diversity, richness, and Bray-Curtis metrics. We observed no significant alterations in either Shannon diversity or richness in the diseased states as compared to the healthy state **(Figure \ref{betaogu} C-D)**. There was no statistically significant clustering of the disease groups (ANOSIM p-value = 0.4, **Figure \ref{betaogu}**). Notably, there was a significant difference between the few blank controls that remained after rarefying the data and the other study groups (ANOSIM p-value < 0.001, **Figure \ref{betaogunegative})**, further supporting the quality of the sample set. In summary, standard alpha and beta diversity metrics were insufficient for capturing virus community differences between disease states **(Figure \ref{betaogu})**. This is consistent with what has been observed when the same metrics were applied to 16S rRNA sequenced and metagenomic samples [@Zeller:2014ix; @Zackular:2014fba; @Baxter:2016dja] and points to the need for alternate approaches to detect the impact of colorectal cancer disease state on these communities.
+ Microbiome and disease associations are often described as being of an altered diversity (i.e., "dysbiotic"). Therefore, we first evaluated the influence of colorectal cancer on virome OVU diversity. We evaluated differences in communities between disease states using the Shannon diversity, richness, and Bray-Curtis metrics. We observed no significant alterations in either Shannon diversity or richness in the diseased states as compared to the healthy state **(Figure \ref{betaogu} C-D)**. There was no statistically significant clustering of the disease groups (ANOSIM p-value = 0.4, **Figure \ref{betaogu}**). Notably, there was a significant difference between the few blank controls that remained after rarefying the data and the other study groups (ANOSIM p-value < 0.001, **Figure \ref{betaogunegative})**, further supporting the quality of the sample set. In summary, standard alpha and beta diversity metrics were insufficient for capturing virus community differences between disease states **(Figure \ref{betaogu})**. This is consistent with what has been observed when the same metrics were applied to 16S rRNA gene sequences and metagenomic samples [@Zeller:2014ix; @Zackular:2014fba; @Baxter:2016dja] and points to the need for alternate approaches to detect the impact of colorectal cancer disease state on these communities.

## Virome Composition in Colorectal Cancer
- As opposed to the diversity metrics discussed above, OTU-based relative abundance profiles generated from 16S rRNA gene sequences are effective feature sets for classifying stool samples as originating from individuals with healthy, adenomatous, or cancerous colons [@Zackular:2014fba; @Baxter:2016dja]. The exceptional performance of bacteria in these classification models supports a role for bacteria in colorectal cancer. We built off of these findings by evaluating the ability of virus community signatures to classify stool samples and compared their performance to models built using bacterial community signatures.
+ As opposed to the diversity metrics discussed above, OTU-based relative abundance profiles generated from 16S rRNA gene sequences are effective for classifying stool samples as originating from individuals with healthy, adenomatous, or cancerous colons [@Zackular:2014fba; @Baxter:2016dja]. The exceptional performance of bacteria in these classification models supports a role for bacteria in colorectal cancer. We built off of these findings by evaluating the ability of virus community signatures to classify stool samples and compared their performance to models built using bacterial community signatures.

To identify the altered virus communities associated with colorectal cancer, we built and tested random forest models for classifying stool samples as belonging to individuals with either cancerous or healthy colons. We confirmed that our bacterial 16S rRNA gene model replicated the performance of the original report which used logit models instead of random forest models **(Figure \ref{predmodel} A)** [@Zackular:2014fba]. We then compared the bacterial OTU model to a model built using OVU relative abundances. The viral model performed as well as the bacterial model (corrected p-value = 0.7), with the viral and bacterial models achieving mean area under the curve (AUC) values of 0.767 and 0.772, respectively **(Figure \ref{predmodel} A - B)**. To evaluate the ability of both bacterial and viral biomarkers to classify samples, we built a combined model that used both bacterial and viral community data. The combined model did not yield a statistically significant performance improvement beyond the viral (corrected p-value = 0.1) and bacterial (corrected p-value = 0.2) models, yielding an AUC of 0.804 **(Figure \ref{predmodel} A - B)**.
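
Reviewer's note: the hunk above names three community metrics (richness, Shannon diversity, and Bray-Curtis dissimilarity) without defining them. The sketch below shows the standard textbook definitions in plain Python. It is only an illustration: the function names and the two count vectors are hypothetical, and nothing here is taken from the manuscript's actual analysis code, which this commit does not show.

```python
import math
from typing import Sequence


def richness(counts: Sequence[float]) -> int:
    """Number of units (e.g. OVUs) observed with a nonzero count."""
    return sum(1 for c in counts if c > 0)


def shannon(counts: Sequence[float]) -> float:
    """Shannon diversity H = -sum(p_i * ln(p_i)) over units with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity: sum(|a_i - b_i|) / sum(a_i + b_i)."""
    if len(a) != len(b):
        raise ValueError("samples must be counted over the same set of units")
    denom = sum(a) + sum(b)
    if denom == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denom


# Hypothetical count vectors over four units, already rarefied to the
# same depth (20 sequences each); these are made-up numbers.
sample_a = [10, 10, 0, 0]
sample_b = [5, 5, 5, 5]

print("richness:", richness(sample_a), richness(sample_b))
print("Shannon:", shannon(sample_a), shannon(sample_b))
print("Bray-Curtis:", bray_curtis(sample_a, sample_b))
```

Richness and Shannon diversity summarize a single sample (alpha diversity), while Bray-Curtis compares two samples (beta diversity); a test such as ANOSIM then operates on the matrix of pairwise dissimilarities.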

Binary file modified doc/manuscript.pdf
Binary file not shown.
