README:
This directory contains folders for each year of the Clements taxonomy.
- Clements_2021
- Clements_2022
- Clements_2023
Each taxonomy_year directory contains:
- OTT_crosswalk_YEAR.csv <- a CSV file mapping species from the Clements taxonomy for that taxonomy year to OpenTreeTaxonomy (OTT) identifiers (e.g. OTT_crosswalk_2021.csv). It also maps each species in Clements to its Avibase id, and to the names for that taxon in the IOC World Bird List v14.1, Birdlife/HBW 8, and the Howard and Moore v4 taxonomy.
Column names and contents:
- TAXON_ORDER <- from Clements Checklist csv
- SPECIES_CODE <- from Clements Checklist csv
- TAXON_CONCEPT_ID <- Avibase id imported from Clements Checklist csv
- PRIMARY_COM_NAME <- from Clements Checklist csv
- SCI_NAME <- from Clements Checklist csv
- ORDER1 <- from Clements Checklist csv
- FAMILY <- from Clements Checklist csv
- ott_id <- id for that species in OTT
- ott_name <- name for that species in OTT
- ott_tax_sources <- sources for that taxon name in OTT
- ott_match_type <- how the link was made from the Clements name to the OTT id (may be 'canonical_match', 'synonym_match', 'new_taxon_addition', or 'NA' for no match, as well as some other idiosyncratic match types for hand corrections)
- H_M_name <- best match name or names in the Howard and Moore v4 taxonomy (if multiple names matched, they are a semicolon-separated list)
- H_M_match_type <- how the Avibase taxon concepts map between the Clements and Howard and Moore names. One of 'concepts_match', 'child_of', 'parent_of', 'overlaps', or 'missing'.
- Birdlife_name <- best match name or names in the Birdlife/HBW taxonomy (if multiple names matched, they are a semicolon-separated list)
- Birdlife_match_type <- how the Avibase taxon concepts map between the Clements and Birdlife names
- IOC_name <- best match name or names in the IOC World Bird List (if multiple names matched, they are a semicolon-separated list)
- IOC_match_type <- how the Avibase taxon concepts map between the Clements and IOC names
- a taxon addition file capturing a suggested placement for species for which we do not yet have phylogenetic information
- a copy of the eBird taxonomy file for that year.
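As an illustration of the crosswalk columns described above, the following is a minimal sketch that reads a crosswalk file with Python's standard csv module, keeps the species that have an OTT match, and splits a semicolon-separated name column. The function names and the sample rows are hypothetical and only for illustration. The column names come from the list above. Treating the literal string 'NA' (or an empty cell) in ott_match_type as "no match" is an assumption based on that description.

```python
import csv
import io


def read_crosswalk(handle):
    """Return the crosswalk rows as a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def has_ott_match(row):
    """True when the row was linked to an OTT id.

    Assumption: 'NA' or an empty cell in ott_match_type marks no match.
    """
    return row["ott_match_type"] not in ("NA", "")


def split_names(cell):
    """Split a semicolon-separated name cell into a list of names."""
    return [name.strip() for name in cell.split(";") if name.strip()]


# Made-up placeholder rows, not real data; real files have more columns.
SAMPLE = """SCI_NAME,ott_match_type,H_M_name,H_M_match_type
Genus alpha,canonical_match,Genus alpha,concepts_match
Genus beta,NA,Genus beta; Genus gamma,parent_of
"""

rows = read_crosswalk(io.StringIO(SAMPLE))
matched = [r["SCI_NAME"] for r in rows if has_ott_match(r)]
hm_names = {r["SCI_NAME"]: split_names(r["H_M_name"]) for r in rows}
print(matched)
print(hm_names["Genus beta"])
```

To read a real file, open it with open("OTT_crosswalk_2021.csv", newline="") and pass the file handle to read_crosswalk in place of the in-memory sample.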
This directory contains folders for each synth tree run, labelled by their version identifier (e.g. Aves_1.0)
The most recent tree is Aves_1.3, and that directory contains all of the files listed below. (Older tree folders have some but not all of these files, and each includes its own readme.)
Aves_1.3
For more details and code for how each of these files were generated see: https://github.com/McTavishLab/AvesTreeCode
- dates_citations.txt <- input studies used to estimate dates
- tree_citations.txt <- input studies used to estimate the phylogeny tree
- all_node_ages.json <- a JSON file storing the age estimates for each internal node in the phylogeny-only tree, with metadata about which input study suggested each node age.
- OpenTree_synth <- folder containing the direct outputs of OpenTree Synthesis (this is identical between Aves 1.2 and Aves 1.3)
- This folder contains many nitty-gritty internal synthesis outputs. These files are described in depth in the index.html file contained in the OpenTree_synth folder.
- A folder for each year of the Clements taxonomy, containing:
- phylo_only.tre <- OpenTree Id labeled tree including only tips with phylogenetic information
- phylo_only_clements_labels.tre <- Clements labeled tree including only tips with phylogenetic information
- phylo_only_clements_labels_ultrametric.tre <- ultrametric tree which is the input for the taxon addition step
- taxon_addition_treeset.tre <- set of 100 complete trees from 100 stochastic taxon addition replicates
- dated_rand_sample_clements.tre <- dated taxon addition tree cloud, 100 topologies from the taxon addition treeset using a random selection from the node age estimates for each dated node calibration. First line is a header line containing the text "trees", followed by one newick tree per line.
- dated_mean_sample_clements.tre <- dated taxon addition tree cloud, 100 topologies from the taxon addition treeset using the mean node age for each node calibration. First line is a header line containing the text "trees", followed by one newick tree per line.
- MCC_clements.tre <- Maximum clade credibility tree including all taxa in this version of the taxonomy, summarized from the dated trees using random selections for the node calibrations. Labelled with Clements taxonomy labels.
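The dated tree cloud files described above (dated_rand_sample_clements.tre and dated_mean_sample_clements.tre) start with a header line and then hold one newick tree per line. Below is a minimal sketch for pulling out the tree strings with plain Python, assuming the header is the first non-empty line and each tree sits on a single line. The function name read_tree_lines and the toy trees are hypothetical and only for illustration.

```python
import io


def read_tree_lines(handle):
    """Return the newick strings from a file with a one-line header.

    Assumes the first non-empty line is the header (the text "trees")
    and every later non-empty line holds exactly one newick tree.
    """
    lines = [line.strip() for line in handle if line.strip()]
    return lines[1:]  # drop the header line


# Toy input for illustration only, not real data.
SAMPLE = "trees\n((A:1,B:1):1,C:2);\n((A:1,C:1):1,B:2);\n"

trees = read_tree_lines(io.StringIO(SAMPLE))
print(len(trees))
```

Each returned string can then be handed to whichever newick parser you prefer. For a real file, pass an open file handle instead of the in-memory sample.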