Why does the core_clusters.txt file contain blank cells in genome columns (columns >=8)? #114

Open
jarrettlebov opened this issue Jun 10, 2020 · 0 comments

Comments

@jarrettlebov

Hi, I'm a relatively new user of PanOCT. While looking through the output files, I noticed that in core_clusters.txt the vast majority of rows have no blank cells in the genome columns (columns >= 8), but a minority of rows do. These empty cells always seem to fall in rows where the "attributes" column contains "FS", "FG-In", or "FN".
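In case it helps with reproducing this, here is a minimal sketch of how the affected rows can be located. It is not part of PanOCT. It assumes the file is plain tab-delimited text and that the genome columns start at column 8 (1-based), as described above. The function name `rows_with_blank_genome_cells` is just an illustrative helper.

```python
def rows_with_blank_genome_cells(lines, first_genome_col=8):
    """Return (row_number, [blank column numbers]) for each row that has
    at least one empty cell at or after first_genome_col (1-based).

    Assumption: the input is tab-delimited and every row is treated the
    same way, so a header row, if present, is checked like any other row.
    """
    hits = []
    for row_number, line in enumerate(lines, start=1):
        # Strip only the line terminator so trailing empty cells survive.
        cells = line.rstrip("\r\n").split("\t")
        blanks = [
            col
            for col, cell in enumerate(cells, start=1)
            if col >= first_genome_col and cell.strip() == ""
        ]
        if blanks:
            hits.append((row_number, blanks))
    return hits
```

To run it on the real file, open core_clusters.txt and pass the file handle as `lines`. The row numbers it returns can then be compared against the "attributes" column by eye.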

Given the description of core_clusters.txt in OUT_FILE_DESCRIPTIONS.txt ("core_clusters.txt: Tab delimited file. Lists only those clusters which have a representative from every genome in the analysis."), I don't understand how core_clusters.txt could contain any empty cells.

I apologize if this is a novice question. I can't seem to connect the dots from the documentation on this site, the original 2012 PanOCT paper, or the 2018 pangenome pipeline paper.

Any guidance would be greatly appreciated. I have attached my core_clusters.txt file for reference.

core_clusters.txt
