Genotype calling for an autotetraploid species without reference genome #19
-
Hi Joris,
I'm glad that you're finding polyRAD helpful! VCFs have columns for chromosome and position, so if that information is not available, polyRAD will not export to VCF. As I recall, with the Stacks import option the full tag sequences are not imported, just the variable nucleotides, so that's something to keep in mind as well when interpreting allele names.
The Structure file should have four rows for each individual, and one column for each locus. Each allele is assigned a number, and the most probable genotype is shown. So if you have 1,1,2,3, that means two copies of one allele, one copy of a different allele, and then one copy of a third allele.
Best,
Lindsay
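To make the Structure layout described above concrete, here is a minimal Python sketch (not part of polyRAD, and not its actual output) that counts allele copies for one tetraploid individual. The sample name, the second locus, and the exact column layout are assumptions made purely for illustration; a real exported file may contain additional columns.

```python
from collections import Counter

# Hypothetical Structure-style rows for ONE tetraploid individual:
# four rows (one per chromosome copy), then one column per locus.
# The name "ind1" and the values at the second locus are made up.
rows = [
    ["ind1", 1, 2],
    ["ind1", 1, 2],
    ["ind1", 2, 2],
    ["ind1", 3, 4],
]


def allele_copies(rows, locus_index):
    """Count how many copies of each allele number appear at one locus.

    locus_index is 0-based and skips the leading sample-name column.
    """
    return Counter(row[1 + locus_index] for row in rows)


# First locus reads 1,1,2,3 down the four rows.
print(dict(allele_copies(rows, 0)))
```

Reading the first locus down the four rows gives 1,1,2,3, which the function tallies as two copies of allele 1 and one copy each of alleles 2 and 3, matching the interpretation given above.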
-
Dear Lindsay,
First of all, we would like to thank you for polyRAD, a promising tool for population geneticists working with natural populations of non-model polyploid species.
We would like to use polyRAD to call genotypes for an autotetraploid species (106 individuals from 9 localities, genotyped with an nGBS protocol).
SNPs were called with Stacks (v2.41) in de novo mode, as no reference genome is available for this species. As far as we know, the populations display a significant level of genetic structure and low to moderate genetic diversity.
polyRAD seems very complete, and what we want to do is quite simple: import data from Stacks, infer the most likely genotypes, and export the results to feed other population genomics software.
The appropriate plan seems to be to follow what you showed in the video tutorial: (i) quality check and filtering, (ii) use the IteratePopStruct() function to infer genotypes, and (iii) export the results as a .vcf (or a Structure file).
If this is correct, we may later ask for quick feedback on the values we obtained at different steps. Before that, we would like to know whether it is possible to export RADdata objects (imported with the readStacks() function) as a VCF. We ran some tests and got the following error message:
Error in RADdata2VCF(mydataPopStruct, "test.vcf") :
Complete haplotype information not provided; unable to determine SNP positions. Use refgenome argument in VCF2RADdata.
We also tried Export_Structure(). It works, but we are not sure we understand the genotype values displayed in the file.
All the best,