Data structure

Taxonomy:

This directory contains folders for each year of the Clements taxonomy.

Clements_2021
Clements_2022
Clements_2023

Each taxonomy_year directory contains:

OTT_crosswalk_YEAR.csv <- a CSV file (e.g. OTT_crosswalk_2021.csv) mapping species from the Clements taxonomy for that taxonomy year to OpenTreeTaxonomy (OTT) identifiers. It also maps each Clements species to its Avibase id and to the names for that taxon in the IOC World Bird List v14.1, Birdlife/HBW 8, and Howard and Moore v4 taxonomies.
Column names and contents
- TAXON_ORDER <- from Clements Checklist csv
- SPECIES_CODE <- from Clements Checklist csv
- TAXON_CONCEPT_ID <- Avibase id imported from Clements Checklist csv
- PRIMARY_COM_NAME <- from Clements Checklist csv
- SCI_NAME <- from Clements Checklist csv
- ORDER1 <- from Clements Checklist csv
- FAMILY <- from Clements Checklist csv
- ott_id <- id for that species in OTT
- ott_name <- name for that species in OTT
- ott_tax_sources <- sources for that taxon name in OTT
- ott_match_type <- how the link was made from the Clements name to the OTT id. May be 'canonical_match', 'synonym_match', 'new_taxon_addition', or 'NA' for no match, as well as some other idiosyncratic match types for hand corrections.
- H_M_name <- best match name or names in the Howard and Moore v4 taxonomy (if multiple names matched, they are given as a semicolon-separated list)
- H_M_match_type <- how the Avibase taxon concepts map between the Clements and Howard and Moore names. One of 'concepts_match', 'child_of', 'parent_of', 'overlaps', or 'missing'.
- Birdlife_name <- best match name or names in the Birdlife/HBW taxonomy (if multiple names matched, they are given as a semicolon-separated list)
- Birdlife_match_type <- how the Avibase taxon concepts map between the Clements and Birdlife names
- IOC_name <- best match name or names in the IOC World Bird List taxonomy (if multiple names matched, they are given as a semicolon-separated list)
- IOC_match_type <- how the Avibase taxon concepts map between the Clements and IOC names
A taxon addition file capturing a suggested placement for each species for which we do not yet have phylogenetic information.
A copy of the eBird taxonomy file for that year.
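
As a rough illustration of how the crosswalk columns described above might be used, the sketch below reads a crosswalk with Python's standard `csv` module, splits the semicolon-separated `H_M_name` field into a list, and selects the rows with no OTT match. The column names come from the list above; the sample rows, the species names in them, and the helper names `load_crosswalk` and `unmatched` are invented for illustration and are not part of this repository.

```python
import csv
import io

# Hypothetical two-row sample using a subset of the column names listed above.
# The values are invented for illustration, not taken from the real file.
SAMPLE = """SCI_NAME,ott_id,ott_match_type,H_M_name,H_M_match_type
Examplus unus,111,canonical_match,Examplus unus,concepts_match
Examplus duo,NA,NA,Examplus duo;Examplus tres,parent_of
"""


def load_crosswalk(handle):
    """Return crosswalk rows as dicts, with H_M_name split into a list."""
    rows = []
    for row in csv.DictReader(handle):
        # Multiple matched names are stored as a semicolon-separated list.
        row["H_M_name"] = [n.strip() for n in row["H_M_name"].split(";") if n.strip()]
        rows.append(row)
    return rows


def unmatched(rows):
    """Rows whose Clements name has no OTT match (ott_match_type is 'NA')."""
    return [r for r in rows if r["ott_match_type"] == "NA"]


rows = load_crosswalk(io.StringIO(SAMPLE))
print([r["SCI_NAME"] for r in unmatched(rows)])
```

To try this on a real crosswalk, replace the `io.StringIO(SAMPLE)` handle with an opened file such as OTT_crosswalk_2021.csv (opened with `newline=""`, as the `csv` module recommends). The same splitting step should apply to `Birdlife_name` and `IOC_name`.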

Trees:

This directory contains folders for each synth tree run, labelled by their version identifier (e.g. Aves_1.0)

The most recent tree is Aves_1.3, and its directory contains all of the files listed below. (Older tree folders have some but not all of these files, and each includes its own readme.)

Aves_1.3

For more details and code for how each of these files were generated see: https://github.com/McTavishLab/AvesTreeCode

dates_citations.txt <- input studies used to estimate dates
tree_citations.txt <- input studies used to estimate the phylogeny
all_node_ages.json <- a JSON file storing the age estimates for each internal node in the phylogeny-only tree, with metadata about which input study suggested each node age
OpenTree_synth <- folder containing the direct outputs of OpenTree Synthesis (this is identical between Aves 1.2 and Aves 1.3)
- This folder contains many low-level internal synthesis outputs. These files are described in detail in the index.html file in the OpenTree_synth folder.
A folder for each year of the Clements taxonomy, containing:
- phylo_only.tre <- OpenTree Id labeled tree including only tips with phylogenetic information
- phylo_only_clements_labels.tre <- Clements labeled tree including only tips with phylogenetic information
- phylo_only_clements_labels_ultrametric.tre <- ultrametric tree which is the input for the taxon addition step
- taxon_addition_treeset.tre <- set of 100 complete trees from 100 stochastic taxon addition replicates
- dated_rand_sample_clements.tre <- dated taxon addition tree cloud: 100 topologies from the taxon addition treeset, dated using a random selection from the age estimates for each dated node calibration. The first line is a header containing the text "trees", followed by one Newick tree per line.
- dated_mean_sample_clements.tre <- dated taxon addition tree cloud: 100 topologies from the taxon addition treeset, dated using the mean node age for each node calibration. The first line is a header containing the text "trees", followed by one Newick tree per line.
- MCC_clements.tre <- Maximum clade credibility tree including all taxa in this version of the taxonomy, summarized from the dated trees that use random selections for the node calibrations. Labelled with Clements taxonomy labels.
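
The dated tree cloud files described above have a header line followed by one Newick tree per line, so they can be split into individual tree strings without a phylogenetics library. The sketch below is one minimal way to do that. The helper name `read_tree_cloud` and the miniature sample trees are hypothetical, and the header check assumes the first line begins with the text "trees".

```python
import io

# Hypothetical miniature file in the layout described above: a "trees"
# header line, then one Newick tree per line. The trees are invented.
SAMPLE = "trees\n((A:1,B:1):1,C:2);\n((A:1,C:1):1,B:2);\n"


def read_tree_cloud(handle):
    """Return the Newick strings from a header-plus-one-tree-per-line file."""
    lines = [line.strip() for line in handle]
    # Assumption: the header is a first line that starts with "trees".
    if lines and lines[0].lower().startswith("trees"):
        lines = lines[1:]
    # Drop blank lines, keeping one Newick string per remaining line.
    return [line for line in lines if line]


trees = read_tree_cloud(io.StringIO(SAMPLE))
print(len(trees))
```

Each returned string can then be passed to whichever Newick parser you prefer. For a real file, replace the `io.StringIO(SAMPLE)` handle with an opened dated_rand_sample_clements.tre or dated_mean_sample_clements.tre.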
